Documentation

esProc

Tutorial

Function Reference

User Reference

External Library Guide

Data File Tool Manual

DQL Tutorial

Cluster Server Manager Manual

SPL WIN Manual
YModel

User Reference

JSON-style Parameter Guide
ReportLite

User Reference
Official Website
http://doc.scudata.com:8888/WEB-INF/layout/application.jsp1

Getting Started

Installation and Basic Uses

Cell Editing

Basic Data Types

Constants

Types of Cells

Multilevel Parameters

Option Syntax

Processing Common Data

esProc JDBC
Programming Logic & Basic Operations

Branches

Loop Statements

Sequence and Table Sequence

Using Sequences

Operations on TSeqs and RSeqs

Updating Table Sequences

Parameters and Variables

Linear Algebra Functions

Loop Computations
Associative Operations

Creating a new or different TSeq

Grouping and Summarizing Data

Sorting and Ranking Data

Alignment Grouping and Enumeration Grouping

Associating Tables by the Foreign Key

Primary Key and Index

Time Key
Database Management

Using SQL

Database Configuration

Database Connection Management

Database Update

Calling Database Stored Procedures

Transaction Isolation Levels
Advanced Coding

Code Block

Creative Uses of Strings

Clearing Cell Values

Special Characters

Subroutines
Integration & Application

Command Line

Java Invocation

JDBC gateway

Deploying JDBC

Encryption

Deploying ODBC

HTTP service

Enterprise edition JDBC
External Memory Computations

The Concept of Cursor

Basic Uses of Cursor

Cursor-based Aggregate Operations

Merge and Join Operations on Cursors

Methods of External Memory Grouping for Large Result Sets

Method of External Memory Sorting

Text Files

Bin Files

Group Cursor

Cross-cellset Cursor

Channel
Parallel Computing

The Built-in Parallelism

8.2 Multithreading

8.3 Multicursor operations
Heterogeneous Data Sources

Excel files

JSON data

XML data

Simple SQL

User-defined Functions
Cluster Computing

The Server Cluster

Clustered files

Cluster computations
Composite Tables

Generating a composite table

Accessing a (multi-zone) composite table

Maintaining composite tables

In-memory tables

Cluster composite tables

Memory zone

Pseudo Table

DQL Tool
Charts

The Basics

Coordinate Systems and Transformation

Coordinate Axes

The Axis Element

The Dot Element

The Line Element

The Column Element

The Sector Element

Text Element & Background Element

Web Integration and Hyperlink

Using Ready-made Graphs

Previous | Next

Sequence and Table Sequence

Read（2464） Like（1） Label: sequence, table sequence, record sequence,

Sequence, table sequence (TSeq) and record sequence (RSeq) are the most common data types esProc uses. This section will explain their respective characteristics as well as relations between them.

2.3.1 Sequences-ordered generic sets

Collectivity

A sequence consists of multiple data items, which are called members of the sequence. A member can be any data type, such as string, integer, float and date, and is allowed to be null. A sequence has the general characteristics of a set, and can be manipulated by set operations. For example:

	A
1	[]
2	[5,6,7]
3	[red,blue,yellow]
4	=["blue","yellow","white"]
5	=A3^A4

In the above cellset, the values of A1, A2 and A3 are as follows:

The values of A1, A2 and A3 are all sequences. A1’s is an empty sequence. The members of A2’s sequence are integers, thus it is also called an integer sequence. A3’s sequence has string members.

In A4, an expression is used to generate a sequence. Different from the constant sequence in A3, the strings in the expression should be placed in double quotes. A5 computes the intersection of the sequence in A3 and the one in A4, with a result being a sequence too. The results of A4 and A5 are as follows:

Genericity

A sequence is a generic set, which is allowed to have members of various data types. A member can also be a sequence. For example:

	A	B
1	[]	=[1,date("2014-05-01")]
2	[5,6,7]	=["blue",[],[5,6,7]]
3	[red,blue,yellow]	=["blue",A1,A2]

In the cellset, members of the sequence in B1 are an integer and a date; members of B2’s sequence, which equals the expression in B3, include sequences. The results of B1, B2 and B3 are as follows:

You can click the sequence members, which are displayed in blue, and see the details.

Orderliness

Generally, a set is unordered, that is, two sets with same members arranged in different ways are equal. But a sequence is ordered, which means two sequences with same members arranged in different ways are not equal. For example:

	A	B
1	[Mike,Tom]	[Tom,Mike]
2	=A1==B1

A2 uses the expression =A1==B1 to check if the two sequences are equal. The result is false:

Orderliness is a universal feature of business data. For example, that Mike comes before Tom may mean that Mike performs better in school; and sorting sales amounts by month can clearly show their changing pattern. It is more convenient to use a sequence to perform order-related calculations. For example:

	A
1	[Mike,Tom]
2	=A1(2)
3	=A1.m(-1)
4	=A1.pos("Tom")
5	=A1.rvs()

A2 fetches the second member of the sequence. The operation can also be expressed as =A1.m(2). A3 fetches the last member of the sequence. A4 gets the ordinal number of member Tom. A5 reverses the sequence to generate a new one. The results of A2, A3, A4 and A5 are as follows:

Other order-related operations include insert operation, delete operation, modify operation, copy operation, compare operation, aggregate operations, getting a sub-sequence, sort operation, rank operation, set operations, mutual transformation of strings and sequences, etc.

An integer sequence is a sequence whose members are integers. There are a set of methods to access an integer sequence. For example:

	A
1	=to(2,5)

A1 generates a sequence as follows:

To generate a sequence beginning with 1, such as =to(1,5), it can be abbreviated to =to(5).

2.3.2 TSeqs-sequences with structure

esProc inherits the concept of data table in relational databases, and defines it as table sequence. Consistent with the relational database’s definition of the data table, every table sequence has its own data structure, which consists of fields. Members of a table sequence are called as records.

Two-dimensional structured data objects

A member of a sequence can be any data type, such as an ordinary type, a sequence and a record; while members of a table sequence must be records of the same structure. The following data object is a table sequence:

Because a table sequence is a two-dimensional structured data object, it is usually generated from SQL statements, text files, bin files, Excel files or an empty table sequence. The results of A1, A2 and A3 in the following cellset are table sequences:

	A
1	=file("Order_Books.txt").import@t()
2	=demo.query("select * from CITIES")
3	=create(OrderID,Client,SellerId,Amount,OrderDate)

A1 generates a table sequence, which is the one in the above, from a text file. A2 generates a table sequence using SQL, and C1 creates an empty table sequence by specifying the field names. The table sequences in A2 and A3 are as follows:

A great number of algorithms for structured data, such as data querying, data sorting, summing up values, finding average value, and merging duplicate records, can be performed on table sequences. For example:

	A
1	=file("Order_Books.txt").import@t()
2	=A1.select(Amount>=20000 && month(Date)==5)
3	=A1.sort(SalesID,-Date)
4	=A1.groups(SalesID, month(Date); round(sum(Amount),2), count(~))

A2 selects records whose Amount is greater than or equal to 20,000 and whose Date falls in the month of May:

A3 sorts the records by SalesID in ascending order and those with the same SalesID by Date in descending order:

A4 groups the records by SalesID and the month, and then sums up Amount and counts the orders in each group:

TSeqs-special sequences

A table sequence is still a sequence, so it has the sequence’s features of collectivity and orderliness, and can use sequence-related functions. A table sequence isn’t generic in the sense of a sequence because its members must be records of the same structure. But, the field values of the records can be generic data, which in this sense is another form of genericity. Thanks to these characteristics, a table sequence is conveniently usable in handling complicated computations, compared with the traditional programming languages.

For example, to compute the monthly growth rate of the sales amount, esProc takes advantage of the table sequence’s orderliness to write the following algorithm:

	A
1	=file("Order_Books.txt").import@t()
2	=A1.groups(month(Date); sum(decimal(string(Amount))):TotalAmount, count(~):Count)
3	=A2.derive(round(TotalAmount/TotalAmount[-1]-1,4): compValue)

A2 computes the total sales amount in each month:

A3 computes the final result:

This instance is based on the collectivity. Assume that in a certain business a big contract is one whose quantity is greater than 1,000 and an important contract is one whose amount is more than 10,000. Please find out: 1. Contracts that are both large and important; 2. The other contracts:

	A	B
1	=file("Order_Books.txt").import@t()
2	Big	=A1.select(Quantity>1000)
3	Importance	=A1.select(Amount>10000)
4	=B2^B3	=A1\A4

B2 selects the big contracts, and then B3 selects the important ones. A4 computes the intersection of the two kinds of contracts, which is the answer to question 1:

B4 answers question 2 by computing the difference of all orders and the result of question 1:

Note: In the above code, A1 contains a table sequence, while, B2, B3, A4 and B4 are all record sequences originating from the table sequence. The difference and relation between the table sequence and the record sequence will be explained in the following section.

2.3.3 RSeqs-the references of TSeq records

Obviously, if each step of computation performed on a table sequence generates a new table sequence, a great deal of memory space will be used. For instance, a table sequence, Order_Books, has 50,000 records, and 30,000 records are obtained by query. If a new table sequence is created, there would be 80,000 records in the memory. In fact, as the queried records are part of the original table sequence, it is unnecessary to create a table sequence anew. You could simply store their references in a certain type of data object. The data object that esProc uses to do this is a record sequence.

Transparency of RSeqs

Usually, it’s not necessary to differentiate between a record sequence and a table sequence, as there’re also uncessary to distinguish between references and physical data. For instance, operations like query, sort and intersection in the previous example can be performed on both record sequences and table sequences with the same syntax:

	A	B
1	=file("Order_Books.txt").import@t()	=A1.to()
2	=A1.select(Amount>=20000 && month(Date)==5)	=B1.select(Amount>=20000 && month(Date)==5)
3	=A1.sort(SalesID,-Date)	=B1.sort(SalesID,-Date)

Note: A1 is a table sequence, from which the record sequence in B1 originates. The results of A2, B2, A3 and B3 are all record sequences.

Set operations can be performed between record sequences, like the manipulation of orders in TSeqs - special sequences . But there isn’t any practical significance to perform set operations between table sequences, because members of different table sequences are different objects and their intersection is always empty.

When data structure changes, esProc will automatically create a new table sequence, like what happens in one of the examples in the above where groups function is used to group and aggregate data, or where new fields are added to a table sequence using derive function.

One-way influence from TSeqs to RSeqs

Different table sequences hold different physical records, so modifying a table sequence will not affect the other ones. But, a record sequence is the references of the records of a table sequence and uses the latter’s physical data as its source; any change of the source table sequence will affect the corresponding record sequence. For example:

	A
1	=file("Order_Books.txt").import@t()
2	=A1.select(SalesID==363)
3	>A1.modify(3, 1:SalesID)

Click on the toolbar to perform the stepwise, and you will see A1’s table sequence, where the third record’s SalesID is 363:

A2 selects from the table sequence all the records whose SalesID is 363 and creates a record sequence:

A3 modifies SalesID of the third record of A1’s table sequence by reassigning 1 to it. Then check the table sequence and you can see that the third record is modified:

And a check of A2’s record sequence shows that the corresponding record changes too, though it isn’t modified directly. This is because it references records of the table sequence as its own records:

If there are multiple record sequences that use the same table sequence as their source, the correponding records in all these record sequences will change accordingly.

But in many cases, this isn’t what a programmer wants to see. Take the above case as an example. Because A3’s modification to the table sequence happens after A2 selects out the desired data (In this case, the corresponding reference hasn’t changed), in the selected data in A2 there is a record whose SalesID isn’t 363. This might cause an error in business.

In order to avoid the error, you should finish modifying the table sequence before the record sequence is created. For example:

TSeq.modify(5, 1000: Amount) / Modify the Amount field of the record whose OrderID equals 5 into 1,000, i.e., less than 2,000.

RSeq =TSeq.select(Amount>2000) / The records whose OrderID equals 5 won’t appear in the RSeq.

	A
1	=file("Order_Books.txt").import@t()
2	>A1.modify(3, 1:SalesID)
3	=A1.select(SalesID==363)

First modify the original data in A2 and then select the desired data. In this way, the record sequence obtained in A3 is as follows:

This result is what the business requires. It’s only natural for the above computing order as long as the reference relation between a table sequence and a record sequence is understood.

It’s important to note that operations to modify the original table sequence, like modify function, are not accepted in a record sequence. The relation between the two is a safe one-way influence, giving users every confidence that they’ll get the right result with the use of record sequences and table sequences.