Advanced metadata design

Read(982) Label: advanced metadata, metadata,

The Semantic layer subsection has explained the process of creating the metadata file. Besides the basic configuarion, there are advanced configurations in the metadata, involving foreign key, logical table’s level function and subtables. The previous subsection illustrates how to use the advanced configurations, and this subsection will take a further step to inroduce the advanced metadata design.

Foreign key

We can define the foreign key as needed. Similar to the association between database tables, association is created between two DQL tables if a DQL table’s foreign key field points to the primary key of another DQL table.

Foreign key name: The foreign key name can be directly edited, and should be unique within the current table otherwise errors will happen during the foreign key reference. When a foreign key consists of only one field, usually an alias will not be specified for it but instead the field name is used directly. If a foreign key consists of multiple fields, we must use the Foreign key name when referencing a field of table referenced by the foreign key. In the above screenshot, fk1, fk2 and fk3 are all foreign key names and fk2 is selected.

Referenced table name: Name of the table pointed by the foreign key. When the referenced table has a unique primary key and the key does not act as the foreign key, the table is called dimension table (simply called dimension). In the above screenshot, the dimension table is Product.

Referenced field: The primary key of the table pointed by the foreign key. In the above screenshot, the referenced field is ProductID, which is Product table’s primary key.

Foreign key filed: Field values of the current table correspond to those of referenced field one by one. In the above screenshot, Inventory table’s foreign key field is ProdcutID.

In DQL, a foreign key field and a dimension field have equal positions. We can regard a foreign key field as a dimension field.

Any table’s primary key can be regarded as a dimension field. When a table’s primary key is also the foreign key, the primary key’s dimension is the dimension pointed by the foreign key. When a primary key isn’t a foreign key, its dimension is itself. For a non-primary key field, it becomes a dimension field only when it is used as the foreign key and the field’s dimension is the dimension pointed by the foreign key.

Example:

 

According to the above figure, ReturnedPmt table’s CustomerID field points to Customer table’s CustID field. The relationship between ReturnedPmt and Customer is many to one. Related conceptions are as follows:

Foreign key field: ReturnedPmt table’s CustomerID field.

Dimension table: Customer table

Referenced field: Customer table’s primary key, the CustID field

DQL allows referencing a field in the foreign key table as referencing an attribute. In SELECT CustomerID.CustName,Date,Amount FROM ReturnedPmt for instance, CustomerID.CustName references CustName field in Customer table pointed by CustomerID.

 

Tips:

1) In DQL, we can directly use a single-field foreign key through the syntax of foreign key field.foreign key table’s field name.

2) In DQL, we can directly use a composite foreign key through the syntax of foreign key name.foreign key table’s field name.

3) The reference syntax of foreign key name.foreign key table’s field name applies to both composite foreign keys and single-field foreign keys.

Logical table level function

A function whose parameter is a table’s dimension and that returns a different dimension from the original one is called a hierarchy function.

 

A logical table is a single-field table that does not physically exist within a pseudo table. There are two purposes of introducing the concept of logical tables:

One is to let users observe data on different levels of granularity by defining the hierarchical structure for a dimension.

Using the logical table and the level function together lets users define a date hierarchy like year, month and day and enables them to observe data on different levels of time granularity. We can also define a hierarchy like nation, province and city. Business requirements determine what type of hierarchy we need to create. Once such a hierarchy is defined, it can be used within the Metadata Editor throughout.

Generally, the number of tables (or logical tables) to be defined is the same as that of dimensions during hierarchy definition. Take the date hierarchy as an example, usually we just need to define three tables – year, month and day. If a table’s date field stores day data, associate it with the “day” table; if it contains month data, link it with the “month” table, and so on.

During metadata definition, a dimension definition will be automatically generated according to the related information in foreign key configurations when a foreign key association between tables is established. Then we can define computing relationships between dimensions on metadata’s Dim and level tab.

 

Example:

Define a level function on the date dimension. demo.glmd, for instance, defines the the levels of date data through three logical tables – Day, Year and YearMonth:

 

 

 

Note:

1) Use question mark (?) to represent the table containing the source field in a hierarchy expression. In the following instance, a value of “Year” dimension field is computed based on the month field in “YearMonth” dimension using expression int(?/100).

2) A level functioncomputes from the greater level of granularity to a less level. For instance, we can get years from days based on “Day” dimension but not vice versa.

3) A target dimension can be obtained from different source dimensions. A “Year” dimension, for instance, can be accumulated from days or from months/years. For the “Year” dimension, we need to define two hierarchy expressions to get its values from “Day” table’s date and from “YearMonth” table’s month/year.

4) Once hierarchical structures are defined, we can use one of them directly in DQL queries, as shown below:

 

ReturnedPmt table’s Date field is date type. In the table’s foreign key definition, we establish a foreign key association between ReturnedPmt table’s Date field and Day table’s Day field, as shown below:

This way we can get year and month data from Date field in DQL, like getting year in Date using Date#Year and year and month in Date using Date#YearMonth.

 

DQL: SELECT CustomerID,ID,Date,Date#YearMonth YearMonth,Date#Year Year,Amount,SellerID FROM ReturnedPmt

DQL: SELECT sum(Amount) Sum_Amount ON Year FROM ReturnedPmt by Date#Year

DQL: SELECT sum(Amount) Sum_Amount ON YearMonth FROM ReturnedPmt by Date#YearMonth

Example:

Define a level functionon an integer dimension. demo.glmd, for instance, defines the following hierarchy using Province table, Area table and City table:

 

Here are DQL query statatments and results:

DQL: SELECT CustID,CustName,Contact,ContactTitle,CityCode,CityCode#Province Province,CityCode#Area Area FROM Customer

DQL:

  SELECT count(CustID) CustNumber ON Area FROM Customer BY  CityCode#Area

DQL:

SELECT count(CustID) CustNumber ON Province FROM Customer BY  CityCode#Province

The other is to use a logical table as a database view. In this case logical tables can be treated as ordinary tables, for which we can define a series of properties like primary key and foreign key. The effect is the same as operations on real tables.

Tip:

With concepts of the logical table and the hierarchy function, users are able to observe data on a desired level and manipulate data more flexibly.

Subtable

In metadata, if table A has a composite primary key and when the foreign key is defined for part of the primary key fields, table A will be automatically listed as one of the sub tables of the primary table pointed by the above-mentioned foreign key. We can perform an aggregate query on this sub table’s field from the primary table. According to Customer’s sub table ReturnedPmt, for instance, we can perform an aggregate query on a field in the former while querying the latter’s fields.

A primary table is a table pointed by a foreign key consisting of part of the table’s primary key fields; a dimension table of the level function with a single-field primary key; a primary table’s primary table; a homo-dimension table’s primary table; or a primary table’s homo-dimension table.

Note: For a primary table and their sub table, their primary key values should correspond in order and both tables should be ordered by the primary key. The sub table’s foreign key should be linked to the primary table’s primary key field.

Adding a sub table to the metadata can simplify statements for querying them. For example:

 

CustID is the primary key of Customer table. In ReturnedPmt table, the primary key is made up CustomerID and ID; the foreign key CustomerID points to Customer table’s CustID.

Below are foreign key settings for ReturnedPmt table:

 

The sub table ReturnedPmt is automatically added to Customer table:

The syntax of aggregate queries on a sub table’s field from the primary table is different from performing queries on the primary table’s own fields. For example:

-  SELECT

-   CustName,

-   Customer@ReturnedPmt.sum(Amount) Total

-  FROM

-  Customer by CustName

Tips:

1) We can summarize a sub table’s field from the primary table on the condition that the former’s foreign key points to the latter’s primary key field.

2) Syntax of aggregate queries on a sub table’s field from the primary table:

T@S.f(F)  The aggregate expression on the general field F of table T’s sub table S. f is aggregate function count/avg/sum/max/min.

Pseudo fields

By setting up primary key and foreign key, we create association between tables. According to the foreign key, we can get information from Customer table and information of VIPCusotmer and other tables that share the same dimension as Customer; we can get information of City table through Employee table’s foreign key association; and so on.

A general foreign key refers to a table’s foreign key, a homo-dimension table’s foreign key or the foreign key of a general foreign-key-pointed table. Besides, we also treat the general foreign key’s level function as a kind of general foreign key.

A table’s fields, a homo-dimension table’s fields, fields of a general foreign-key-pointed table and a general field’s level function, we call all of them as general fields.

An expression defined through a general field is called a pseudo field, which is treated as a general field but isn’t used as the foreign key.

 

By setting pseudo fields in the metadata, we can rename the commonly used general fields and make the query statements over them more concise and convenient. For example:

Define a general field called VIPCustomer_End in ReturnedPmt as follows:

Use a pseudo field as we use an ordinary field. For instance, rather writing a DQL query as SELECT CustomerID.EndTime#YearMonth FROM ReturnedPmt, we write it as  SELECT VIPCustomer_End FROM ReturnedPmt.

 

Note: A pseudo field is implemented through macro replacement. No level function can be used any longer.