Generating a composite table

A composite table stores data in a special format and has the extension .ctx. Let’s look at how to generate and use a composite table through an example:

 

A

B

1

=demo.query("select NAME, SURNAME, GENDER from employee")

 

2

=A1.select(GENDER=="M")

=A1\A2

3

=A2.(NAME).id()

=A3.len()

4

=B2.(NAME).id()

=A4.len()

5

=A1.(SURNAME).id()

=A5.len()

6

[Sales,Technology,R&D,Financial,Admin]

[0,0.5,0.75,0.9,0.97,1]

7

=to(100000).new(#:EID,A6(B6.pseg(rand())):Dept, if(rand()<0.5,"F","M"):Gender, if(Gender=="M",A3(rand(B3)+1), A4(rand(B4)+1))/" "/A5(rand(B5)+1):Name)

 

8

=A7.select(Dept=="Sales")

=A8.len()

9

2020-01-01

=workdays(A9,A9+365)

10

=B9.((a=string(~,"yyMMdd"), to(rand(1000)+ 500).new(a/string(~, "0000"):OID, B9.~:Date,A8(rand(B8)+1).EID:EID, rand(100)*10+200:Amount))).conj().sort(EID, OID)

=A10.groups(EID;count(~):OCount, sum(Amount):OAmount)

11

=file("D:/file/dw/employees.ctx")

=A11.create(#EID,Dept,Gender, Name)

12

 

>B11.append@i(A7.cursor())

13

=B11.attach(stable,OCount,OAmount)

>A13.append@i(B10.cursor())

14

=file("D:/file/dw/orders.ctx")

=A14.create@p(#EID, OCount, OAmount)

15

 

>B14.append(B10.cursor())

16

=B14.attach(otable,#OID,Date,Amount)

>A16.append(A10.cursor().new(EID, OID,Date,Amount))

17

>B11.close()

>B14.close()

A1 retrieves employee data from the demo database. A3 and A4 respectively get the names of male and female employees. A5 gets the employees’ surnames for generating names randomly.

A6 lists the available department names. B6 sets the cumulative percent of the employees in each department for the test data. For example, the percentage of employees in Sales department is 0.5, and that of employees in Finance department is 0.97-0.9=0.07.

A7 generates test data of 100,000 employees:

A8 selects data of the employees in Sales department to generate test order data:

B9 generates a sequence of workdays in the year of 2020 as the order dates in the test data. A10 generates random order data for each work day and sorts the data by EID and OID. Here’s A10’s result:

B10 calculates the number of orders and order amount for each salesperson:

With the data ready, let’s generate a composite table. In B11, f.create(C1,C2,…) function generates the first entity tableemployee – for the composite table file employees.ctx and defines all its fields, which are the same as the fields in A7. The entity table is completely independent and thus called the base table; there must be one and only one base table in a composite table. A field preceded by # is the entity table’s dimension. A dimension is a column in an entity table. The records in an entity table must be ordered by its dimension(s). Here one dimension, EID, is defined for the composite table’s (employees.ctx) base table. The base table has three other fields – Dept, Gender and Name. Data in a composite table is by default stored by column. To store data by row, use @r option in the create function. To perform data query over one or a few records in a composite table, store the table by row for better performance. But, a row-based composite table isn’t the best choice for performing a full table scan and it doesn’t support querying data via the multicursor. Data in a composite table is stored in a compressed way to reduce disk space use. You can use @u option with create() function to disable the compression action.

In B12, T.append(cs) function generates a cursor according to A7’s employee data and populates the cursor data into the composite table’s base table. In the T.append(cs) function, cursor cs and entity table T should have the same structure, which means they need to have same number of fields and same field names. Otherwise, errors will appear in data appending. The function uses @i option to append data to an entity table immediately. Without the option, the function cumulates a certain number of records before writing them in an entity table, or it writes data to a composite table when T'.close() function closes the table. When T.append() function is used to add data, composite table T should not be opened in multiple locations, otherwise data collision may occur.

In A13, T.attach(T',C1,…) function adds another entity table stable in the composite table. It has two fields – OCount (number of orders) and OAmount (order amount). B13 fill this entity table with data. One thing to note is that the seller table stable depends on the base table, or the entity table employee. It is called the attached table of a composite table’s base table; the base table, or the entity table employee as in here, is called stable’s parent table. Except for its own fields, an attached table must contain the parent table’s key field(s). So B10’s table should have the base table’s key EID. As a new key isn’t set for stable, both entity tables have the same dimension EID, which makes them homo-dimension tables. In a composite table, if the base table and its attached table are homo-dimension tables, records in the attached table correspond to certain records in the base table.

B14 defines fields for phyiscal table orders, or the base table in the composite table orders.ctx. EID (salespeople number) is set as the dimension. Here @p option is used to group data by segment by EID field at storage,ensuring that the order records of one salesperson are all retrieved into one segment.

B15 enters the order data into the orders.ctx’s base table.

A16 adds an entity table otable to compoiste table orders.ctx. B16 populates it with A10’s order data.

In compoiste table orders.ctx, otable depends on phyiscal table orders and thus is its attached table; orders is otable’s parent table. The parent table has one key EID, but the attached table has another key OID besides the parent table’s key. In this case, the base table and the attached table are parent-child relationship.

Now we know that a composite table can have more than one entity table. The relationship between multiple entity tables should be the parent-attached relationship. There must be one and only one base table. In line 17, the composite table is closed after its work is done. It is closed by closing the base table only. The close action can be omitted if no data is written to the composite table during computation.