Description:
Group a table sequence and then get the aggregate result cumulatively.
Syntax:
A.groups(x:F,…;y:G,…)
Note:
The function groups and aggregates table sequence A by expression x to generate a new table sequence with F,.. G… as the fields. Namely, during the traversal through members of A, they will be placed to the corresponding result set one by one while a result set is aggregated cumulatively. Compared with the method of first grouping and then aggregation represented by A.group(x:F,…;y:G,…) function, the function has a better performance.
Option:
@o |
Group records by comparing adjacent ones, which is equal to the merging operation, and the result set won’t be sorted. |
@n |
x gets assigned with group numbers which can be used to define the groups. @n and @o are mutually exclusive. |
@u |
Do not sort the result set by x. It doesn’t work with @o/@n. |
@i |
x is a Boolean expression. If the result of x is true, then start a new group. There is only one x. |
@m |
Use parallel algorithm to handle data-intensive or computation-intensive tasks; no definite order for the records in the result set; can’t be used with @o and @i options. |
@0 |
Discard the group over which the result of grouping expression x is null. |
@h |
Used over a grouped table with each group ordered to speed up grouping. |
@t |
Return an empty table sequence with data structure if the grouping and aggregate operation over the sequence returns null. |
@z(…;…) |
Split the sequence according to groups during parallel computation, and the multiple threads share a same result set. |
@e |
Return a table sequence consisting of results of computing function y; expression x is a field of sequence A and y is a function on A; the result of y must be one record of A and y only supports maxp, minp and top@1 when it is an aggregate function. |
Parameter:
A |
A sequence. |
x |
Grouping expression |
F |
Field name of the result table sequence. |
y |
y is the function with which A is traversed. When y is an aggregate function, it only supports sum/count/max/min/top/avg/iterate/icount/median/maxp/minp/concat/var. When the function work with iterate(x,a;Gi,…) function, the latter’s parameter Gi should be omitted. When y isn’t an aggregate function, perform calculation over only the first record in each group. |
G |
Summary field name in the result table sequence. |
Return value:
Post-grouping table sequence
Example:
A |
|
|
1 |
=demo.query("select * from SCORES where CLASS = 'Class one'") |
|
2 |
=A1.groups(STUDENTID:StudentID;sum(SCORE):TotalScore) |
Group by a single field. |
3 |
=demo.query("select * from SCORES") |
|
4 |
=A3.groups(CLASS:Class,STUDENTID:StudentID;sum(SCORE):TotalScore) |
group by multiple fields. |
5 |
=A3.groups@m(STUDENTID:StudentID;sum(SCORE):TotalScore) |
Use @m option to increase performance of big data handling. |
6 |
=A3.groups@o(STUDENTID:StudentID;sum(SCORE):TotalScore) |
Only compare and merge with the neighboring element, and the result set is not sorted. |
7 |
=demo.query("select * from STOCKRECORDS where STOCKID<'002242'") |
|
8 |
=A7.groups@n(if(STOCKID=="000062",1,2):StockID;sum(CLOSING):TotalPrice) |
The value of x is the group number. |
9 |
=demo.query("select * from EMPLOYEE") |
|
10 |
=A9.groups@u(STATE:State;count(STATE):TotalScore) |
Do not sort result set by the sorting field. |
11 |
=A9.groups@i(STATE=="California":IsCalifornia;count(STATE):count) |
Start a new group when STATE=="California". |
12 |
=A3.groups(CLASS:Class,STUDENTID:StudentID;iterate(~~*2,10): Score1) |
Perform iterate operation within each group. |
13 |
=file("D:\\Salesman.txt").import@t() |
|
14 |
=A13.groups@0(Gender:Gender;sum(Age):Total) |
Discard groups where Gender values are nulls. |
15 |
=file("D:/emp10.txt").import@t() |
For data file emp10.txt, every 10 records are ordered by DEPT.
|
16 |
=A15.groups@h(DEPT:dept;sum(SALARY):bouns) |
A15 is grouped and ordered by DEPT, for which @h option is used to speed up grouping.
|
17 |
=A1.groups(STUDENTID:StudentID;SUBJECT,sum(SCORE):SUMSCORE) |
Parameter y isn’t an aggregate function, so the function performs operation over the first record.
|
18 |
=demo.query("select * from SCORES where CLASS = 'Class three'") |
Return an empty table sequence. |
19 |
=A18.groups@t(STUDENTID:StudentID;sum(SCORE):TotalScore) |
Return an empty table sequence with the data structure.
|
A |
|
|
1 |
=demo.query("select * from SCORES") |
|
2 |
=A1.groups@z(STUDENTID:StudentID;sum(SCORE):TotalScore) |
Split A1’s sequence according to groups during parallel computation. |
A |
|
|
1 |
=demo.query("select EID,NAME,GENDER,DEPT,SALARY from employee") |
|
2 |
=A1.groups(DEPT;minp(SALARY)) |
Execute aggregate function minp() and return A3’s records. |
3 |
=A1.groups@e(DEPT;minp(SALARY)) |
Return a table sequence consisting of result records of computing minp(SALARY). |
Related function: