hbase_filter()

Description:

Filter data with the HBase-supplied filter.

Syntax:

hbase_filter(filterName,filterArg)

The number and types of the arguments that follow filterName depend on the filter, as listed below:

hbase_filter(ColumnCountGetFilter,n)

Get the first n column values from each row

hbase_filter(ColumnPaginationFilter,limit,columnOffset)

For each row, get at most limit column values, starting from the (columnOffset+1)th column

hbase_filter(ColumnPrefixFilter,str)

For each row, get the values of the columns whose names begin with str

hbase_filter(ColumnRangeFilter,minColumn,boolean,maxColumn,boolean)

For each row, get the values of the columns whose names fall between minColumn and maxColumn; the boolean that follows each bound indicates whether that end of the interval is closed (true, the bound is included) or open (false, the bound is excluded)

hbase_filter(DependentColumnFilter,family,qualifier)

Specify a reference column (family and qualifier) and keep only the cells whose timestamp matches that of a cell in the reference column

hbase_filter(FamilyFilter,ifequal,my-family)

Filter by column family; ifequal is a comparison operator such as “=” or “!=”, and my-family is the column family name to compare against

hbase_filter(FirstKeyOnlyFilter)

Get the first value from each row

hbase_filter(FirstKeyValueMatchingQualifiersFilter,column)

Get the first value from each row that satisfies the specified column qualifier

hbase_filter(FuzzyRowFilter,fuzzyKeysData)

Match rows against a fuzzy row key. The filter takes a list parameter (List<Pair<byte[], byte[]>>). Each pair in the list holds two byte arrays: the first is the row key pattern, and the second is a mask whose bytes are 0 or 1. 0 means the byte at the same position in the row key must match the pattern; 1 means the byte at the same position in the row key can be different.

hbase_filter(InclusiveStopFilter,stopRowKey)

Stop scanning once the specified row is reached; the stop row itself is included in the result

hbase_filter(KeyOnlyFilter)

Return only the key components of each cell, such as row key, column family name, column name and timestamp

hbase_filter(MultipleColumnPrefixFilter,prefixes1,prefixes2,...prefixesN)

For each row, get the values of the columns whose names begin with any of the specified prefixes

hbase_filter(PageFilter,n)

Pagination filter that limits the result to n rows

hbase_filter(PrefixFilter,str)

Get the rows whose row key begins with str

hbase_filter(QualifierFilter,cmp,cmpColumn)

Filter by column qualifier; cmp is a comparison operator such as equal (or =) and not equal (or !=); cmpColumn is the comparator holding the column qualifier to compare against.

hbase_filter(RowFilter,cmp,cmpRow)

Filter by row key; cmp is a comparison operator such as equal (or =) and not equal (or !=); cmpRow is the comparator holding the row key to compare against.

hbase_filter(ValueFilter,cmp,cmpValue)

Filter by value; cmp is a comparison operator such as equal (or =) and not equal (or !=); cmpValue is the comparator holding the value to compare against.

hbase_filter(SingleColumnValueExcludeFilter,family,columnName,ifequal,cmpValue) / hbase_filter(SingleColumnValueFilter,family,columnName,ifequal,cmpValue)

Filter rows by the value of a specified column; family is the column family name, columnName is the name of the column to test, ifequal is the comparison operator, and cmpValue is the comparator holding the value to compare against. The columnName column is excluded from the result with SingleColumnValueExcludeFilter and included with SingleColumnValueFilter.

hbase_filter(RandomRowFilter,n)

Filter rows randomly, keeping each row with a certain probability. Running the same RandomRowFilter over the same data set several times therefore returns different result sets. Parameter n is a float; all rows are filtered out if n<=0, and all rows are kept if n>=1.

hbase_filter(SkipFilter,hbase_filter(ValueFilter,ifequal,cmpValue))

A wrapper filter that filters out an entire row if any of its cells fails the wrapped filter; it works with ValueFilter.

hbase_filter(WhileMatchFilter,hbase_filter())

A wrapper filter that stops the scan as soon as the wrapped filter filters out a row or cell.

hbase_filter(TimestampsFilter,timeStamp,boolean)

Return only the values whose timestamps match the specified timestamp.
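The column-level semantics described above for ColumnPaginationFilter, ColumnRangeFilter and FuzzyRowFilter can be illustrated with a small Python model. This is only a sketch of the behavior described in this section, not HBase or esProc code. The helper names column_pagination, column_range and fuzzy_match are hypothetical, a row is modeled as a plain dict whose columns are taken in sorted name order, and fuzzy_match assumes that only the first len(pattern) bytes of the row key are compared.

```python
# Toy model of three filters from the Syntax list; NOT the HBase implementation.
# Assumption: a row is a dict {column name: value}, scanned in sorted column order.

def column_pagination(row, limit, column_offset):
    """Keep at most `limit` columns, starting from the (column_offset+1)th column."""
    names = sorted(row)
    kept = names[column_offset:column_offset + limit]
    return {name: row[name] for name in kept}


def column_range(row, min_column, min_inclusive, max_column, max_inclusive):
    """Keep columns whose names lie between the bounds; True closes a bound."""
    def in_range(name):
        above = name >= min_column if min_inclusive else name > min_column
        below = name <= max_column if max_inclusive else name < max_column
        return above and below
    return {name: row[name] for name in sorted(row) if in_range(name)}


def fuzzy_match(row_key, pattern, mask):
    """mask byte 0: the row key byte must equal the pattern byte; 1: any byte."""
    if len(row_key) < len(pattern):
        return False  # assumption: a shorter key cannot match the pattern
    return all(m == 1 or k == p for k, p, m in zip(row_key, pattern, mask))


# Hypothetical sample row, loosely modeled on the emp table used in the Example.
row = {"age": 30, "name": "Tom", "position": "dev", "tel": "13800"}

print(column_pagination(row, 2, 2))                    # {'position': 'dev', 'tel': '13800'}
print(column_range(row, "age", True, "name", False))   # {'age': 30}
print(fuzzy_match(b"abw1", b"row1", [1, 1, 0, 0]))     # True
print(fuzzy_match(b"row2", b"row1", [1, 1, 0, 0]))     # False
```

With limit 2 and columnOffset 2, the model skips the first two columns and returns the third and fourth. With the mask [1,1,0,0], only the last two bytes of the pattern have to match.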

Note:

This external library function (See External Library Guide) filters data with the HBase-supplied filter.

Parameter:

filterName

HBase filter name

filterArg

Filter parameter

Return value:

Filter handle

Example:

 

	A
1	=hbase_open("hdfs://192.168.0.8:9000")
2	=hbase_scan(A1,"emp")	All data of the emp table.
3	=hbase_filter("ColumnCountGetFilter",3)
4	=hbase_scan(A1,"emp";filter:A3)	rowkey is the row name and is not counted as a data column.
5	=hbase_filter("ColumnPaginationFilter",2,2)
6	=hbase_scan(A1,"emp";filter:A5)
7	=hbase_filter("ColumnPrefixFilter","na")
8	=hbase_scan(A1,"emp";filter:A7)
9	=hbase_filter("ColumnRangeFilter","age",true,"name",false)
10	=hbase_scan(A1,"emp";filter:A9)
11	=hbase_filter("DependentColumnFilter","family","age")
12	=hbase_scan(A1,"emp";filter:A11)
13	=hbase_filter("FamilyFilter","=",hbase_cmp@s("family"))
14	=hbase_scan(A1,"emp";filter:A13)
15	=hbase_filter("KeyOnlyFilter")
16	=hbase_scan(A1,"emp";filter:A15)
17	=hbase_filter("FirstKeyOnlyFilter")
18	=hbase_scan(A1,"emp";filter:A17)
19	=hbase_filter("FuzzyRowFilter","row1",[1,1,0,0])
20	=hbase_scan(A1,"emp";filter:A19)
21	=hbase_filter("InclusiveStopFilter","row3")
22	=hbase_scan(A1,"emp";filter:A21)
23	=hbase_filter("MultipleColumnPrefixFilter","na","tel","position")
24	=hbase_scan(A1,"emp";filter:A23)
25	=hbase_filter("PrefixFilter","ro")
26	=hbase_scan(A1,"emp";filter:A25)
27	=hbase_filter("QualifierFilter","eq",hbase_cmp@s("name"))
28	=hbase_scan(A1,"emp";filter:A27)
29	=hbase_filter("RowFilter","eq",hbase_cmp("row1"))
30	=hbase_scan(A1,"emp";filter:A29)
31	=hbase_filter("ValueFilter","=",hbase_cmp("C++"))
32	=hbase_scan(A1,"emp";filter:A31)
33	=hbase_filter("SingleColumnValueFilter","family","tel","eq",hbase_cmp@s("13"))
34	=hbase_scan(A1,"emp";filter:A33)
35	=hbase_filter("SingleColumnValueExcludeFilter","family","tel","eq",hbase_cmp@s("13"))
36	=hbase_scan(A1,"emp";filter:A35)
37	=hbase_filter("SkipFilter",hbase_filter("ValueFilter","=",hbase_cmp@s("aaa")))
38	=hbase_scan(A1,"emp";filter:A37)
39	=hbase_filter("FirstKeyValueMatchingQualifiersFilter","name")
40	=hbase_scan(A1,"emp";filter:A39)
41	=hbase_filter("TimestampsFilter",1488855959195,1488855959219,1488855959145,false)
42	=hbase_scan(A1,"emp";filter:A41)
43	=hbase_filter("RandomRowFilter",0.5)	Filter rows at random.
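The probabilistic behavior described for RandomRowFilter in the Syntax section can be sketched in Python. This is a toy model, not HBase or esProc code: the function name random_row_filter and the sample row keys are hypothetical, and the model simply keeps each row independently with probability n.

```python
import random


def random_row_filter(rows, chance, rng):
    """Keep each row independently with probability `chance` (toy model)."""
    if chance <= 0:
        return []            # n<=0: every row is filtered out
    if chance >= 1:
        return list(rows)    # n>=1: every row is kept
    return [r for r in rows if rng.random() < chance]


# Hypothetical row keys standing in for the rows of the emp table.
rows = [f"row{i}" for i in range(1, 11)]
rng = random.Random()

# Two runs over the same data usually return different subsets.
print(random_row_filter(rows, 0.5, rng))
print(random_row_filter(rows, 0.5, rng))
```

Because every row is kept or dropped independently, the size of the result also varies from run to run; a value of 0.5 does not guarantee exactly half of the rows.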

Related function:

hbase_cmp()

hbase_scan()