Processing Common Data

To analyze data, you first need to import it. Most of the time the data comes from text files or databases, and esProc makes retrieving it quick and easy.

1.8.1 Text data

EID  NAME      SURNAME  GENDER  STATE       BIRTHDAY    HIREDATE    DEPT       SALARY
1    Rebecca   Moore    F       California  1984-09-28  2015-01-18  R&D        7000
2    Ashley    Wilson   F       New York    1990-05-28  2018-01-23  Finance    11000
3    Rachel    Johnson  F       New Mexico  1980-10-25  2020-10-09  Sales      9000
4    Emily     Smith    F       Texas       1995-01-14  2016-06-23  HR         7000
5    Ashley    Smith    F       Texas       1985-03-21  2014-06-08  R&D        16000
6    Matthew   Johnson  M       California  1994-05-16  2015-05-16  Sales      11000
7    Alexis    Smith    F       Illinois    1982-06-25  2012-06-24  Sales      9000
8    Megan     Wilson   F       California  1989-02-25  2014-02-26  Marketing  11000
9    Victoria  Davis    F       Texas       1993-10-15  2019-10-16  HR         3000

esProc uses the import function to import data from files:

     A
1    =file("employee.txt")
2    =A1.import@t()
3    =A1.import()

In A2, the import function uses the @t option to read the text file’s first line as the column names of the resulting table sequence. A2’s data is as follows:

Now let’s see what happens without the @t option. Here is the table sequence that A3 gets:

In addition, because the original column headers are read as a data row, the program automatically identifies every field value as a string.
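The difference between the two imports can be illustrated outside esProc with a small Python sketch. This is only an analogy, not esProc's implementation: import_text is a hypothetical helper, the tab-separated layout of the sample is an assumption, the placeholder column names _1, _2, ... are an assumption, and the rule "header text in the first row turns every column into strings" is a simplified model of the behavior described above.

```python
import csv
import io

# First two employee rows from the table above. Tab separation is an assumption.
SAMPLE = (
    "EID\tNAME\tSURNAME\tGENDER\tSTATE\tBIRTHDAY\tHIREDATE\tDEPT\tSALARY\n"
    "1\tRebecca\tMoore\tF\tCalifornia\t1984-09-28\t2015-01-18\tR&D\t7000\n"
    "2\tAshley\tWilson\tF\tNew York\t1990-05-28\t2018-01-23\tFinance\t11000\n"
)


def to_value(text):
    """Parse a field as an integer when possible; otherwise keep the string."""
    try:
        return int(text)
    except ValueError:
        return text


def import_text(text, header):
    """Hypothetical stand-in for import@t() (header=True) and import() (header=False)."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if header:
        # The first line supplies the column names and is removed from the data.
        names, data = rows[0], rows[1:]
        convert = to_value
    else:
        # Placeholder column names (assumed); the header line stays as data row one.
        names = [f"_{i}" for i in range(1, len(rows[0]) + 1)]
        data = rows
        # Assumed model: header text in row one makes every column a string column.
        convert = str
    return [{n: convert(v) for n, v in zip(names, row)} for row in data]


with_header = import_text(SAMPLE, header=True)
without_header = import_text(SAMPLE, header=False)
```

In this sketch, with_header holds two records whose SALARY values are integers, while without_header holds three records, the first of which is the header line itself, and every value remains a string.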

1.8.2 Database data

esProc can access various databases through JDBC. Choose Datasource Connection from the Tool menu to open the datasource manager:

Through the datasource manager you can connect to or disconnect from a datasource, and configure the database to connect to. demo is esProc’s built-in datasource; its database can be started by running esProc\bin\startDataBase.bat under the installation directory. Once the datasource is connected, esProc can access the database and fetch data using SQL:

     A
1    =demo.query("select * from CITIES")
2    $select * from CITIES

The query function executes the SQL command and returns the result set as a table sequence, as the code in A1 does. When the database is connected, a SQL statement can also be written directly after $, as A2 shows. A1 and A2 return the same result, shown below:

Instead of using the datasource manager, you can call the connect function to connect to a datasource. In this case, you should close the connection with the close function after the data has been retrieved from the database:

     A
1    =connect("demo")
2    =A1.query("select * from CITIES")
3    >A1.close()

With this method, A2 gets the same table sequence of city information.
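The connect, query, close sequence follows the usual lifecycle of an explicitly opened database connection. As a rough analogy only, the same pattern can be sketched in Python with the standard sqlite3 module. Everything here is an assumption made for illustration: the in-memory database stands in for the demo datasource, and the CITIES columns (CID, NAME) and the two rows are placeholders, not the contents of the real demo table.

```python
import sqlite3


def fetch_cities(conn):
    """Run the query and return the rows as a list of dict records."""
    cur = conn.execute("select * from CITIES")
    names = [col[0] for col in cur.description]
    return [dict(zip(names, row)) for row in cur.fetchall()]


# Step 1: open the connection (an in-memory database stands in for "demo").
conn = sqlite3.connect(":memory:")
try:
    # Placeholder table and rows, created only so the query has something to read.
    conn.execute("create table CITIES (CID integer, NAME text)")
    conn.executemany(
        "insert into CITIES values (?, ?)",
        [(1, "New York"), (2, "Los Angeles")],
    )
    # Step 2: fetch the data while the connection is open.
    cities = fetch_cities(conn)
finally:
    # Step 3: always release the connection, even if the query fails.
    conn.close()
```

The try/finally block plays the role of the explicit close call in A3: the connection is released whether or not the query succeeds.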