The Concept of Cursor

Read(3061) Label: cursor,

The concept of cursor is very important for the database. Cursor enables a more flexible manipulation of data, returning data from a table row by row. esProc supports many types of cursor, like database cursor, file cursor and in-memory record sequence cursor, to satisfy various needs in data fetching and processing. Here let’s learn about esProc cursor and its three types.

7.1.1 What is cursor

Using cursor on the database enables returning data in batches, without having to retrieve all of it into the memory at a time. A cursor is like a pointer, retrieving one row each time from the result set by moving its position. This mechanism of retrieving data in a row-wise manner is usually used to process the big data table, which cannot be entirely loaded into the memory that has the capacity limit.

Compared with directly returning the whole result, the database cursor is flexible enough to avoid the memory overflow but is relatively slow in handling data retrieval.

There are differences, however, between an esProc cursor and a database cursor:

1)  An esProc cursor is only used to fetch data but won’t modify a result set.

2)  An esProc cursor traverses data only once in forward direction, which is similar to the setting of TYPE_FORWARD_ONLY in JDBC.

3)  An esProc cursor can be created based on either the database table or the data file, or the in-memory record sequence.

Different types of esProc cursors have similar uses and can be used together to suit requirements.

7.1.2 Database cursor

db.cursor(sql) function can convert data returned from the database into the cursor:

 

A

1

=demo.cursor("select * from EMPLOYEE")

2

=A1.fetch(100)

3

=A1.fetch(100)

4

>A1.close()

A1 creates a database cursor. A2 fetches the first 100 records from it, as shown below:

The cs.fetch() funciton returns data fetched from the cursor as a table sequence.

A3 continues to fetch another 100 records from the result set in order as esProc allows a single traversal of data in a forward-only fashion:

As can be seen from the employee IDs, A3 fetches records from the 101th employee. This is merely an example of fetching data from cursor with the cs.fetch() function. For more about esProc cursor, refer to Basic Uses of Cursor.

If data in a cursor hasn’t been all fetched out when data fetching is over, use cs.close() to close the cursor. It is very important to close the cursor, particularly the database cursor, timely. Because a database cursor gets data from the database, its existence means the connection to the database remains, causing unnecessary memory usage and even the overrun of cursors allowed to be created by the database.

Similar to the db.query() function for data query, database cursor creation allows using parameters, too. For example:

 

A

1

=demo.cursor("select EID, NAME, SURNAME, GENDER, STATE from EMPLOYEE where GENDER=? and EID>?","F",100)

2

=A1.fetch()

In a SQL statement, parameters are represented by question marks (?) whose values are entered in order after the statement and separated by comma. A1’s cursor returns female employees whose EID is greater than 100. A2 returns all records in A1’s cursor:

The cs.fetch() function in A2 doesn’t specify the number of records to be returned, and thus returns all records in the cursor. When data in a cursor has all been fetched out, the cursor will close automatically with no need of calling cs.close().

7.1.3 External file cursor

In actual practice, particularly during big data processing, data is often generated from files rather than databases. Given this, esProc provides retrieving data from external structured files using the cursor. For example, the following text file PersonnelInfo.txt holds a large amount of personnel information:

In this file, each row holds a record and columns are separated by tabs. The cursor is used to retrieve data from this external file and return it also as a cursor. For example:

 

A

1

=file("PersonnelInfo.txt")

2

=A1.cursor@t()

3

>A2.skip(50000)

4

=A2.fetch(100)

5

>A2.close()

A2 creates a cursor based on the external file and uses @t option to retrieve the file’s first row as the column names of the result set. A3 skips the first 50,000 rows of A2’s cursor and then A4 fetches 100 rows:

As the database cursor, data is fetched from an external file cursor through a single forward-only traversal. Thus after A3 skips certain rows using cs.skip(), the data fetching in A4 begins from the 50,001th row. You can also specify certain fields of a file to create a cursor:

 

A

1

=file("PersonnelInfo.txt")

2

=A1.cursor@t(ID, Name, State, Gender)

3

>A2.skip(50000)

4

=A2.fetch(100)

5

>A2.close()

This one creates an external file cursor in A2 using specified fields. Fetch data from A4:

Note: It’s only after field names have been imported using @t option that they can be used to specify fields. Without the option, fields can only be specified directly by their positions, like =A1.cursor(#1, #2, #6, #3).

Further discussion about the external file cursor will be covered in Text Files.

Returning a pure table sequence from a cursor

The features explained in this section only appear in esProc Enterprise Edition. Users who use the other editions can just skip it.

Similar to db.query@v(sql), cs.fetch(), when querying data in a cursor, can also work with @v option to return a pure table sequence. For example:

 

A

B

1

=demo.cursor@v("select EID, NAME, BIRTHDAY from employee")

 

2

=A1.fetch(20)

=ifpure(A2)

3

=A1.fetch@x(20)

=ifpure(A3)

4

=file("PersonnelInfo.txt")

 

5

=A4.cursor@tv(ID, Name, State, Gender)

>A5.skip(90000)

6

=A5.fetch@x(100)

=ifpure(A6)

After a cursor is generated, each operation on it will return a pure table sequence. Here are results of A2, B2, A3 and B3:

 

 

A file cursor can also use @v option. For example, A6 and B6 get the following results: