Description:
Filter data with the HBase-supplied filter.
Syntax:
hbase_filter(filterName,filterArg)
The number and types of the arguments depend on filterName, as listed below:
hbase_filter(ColumnCountGetFilter,n) | Get the first n values from each row |
hbase_filter(ColumnPaginationFilter,limit,columnOffset) | For each row, get limit field values, starting from the (columnOffset+1)th column |
hbase_filter(ColumnPrefixFilter,str) | For each row, get the field values whose column names start with str |
hbase_filter(ColumnRangeFilter,minColumn,boolean,maxColumn,boolean) | For each row, get the values whose column names fall between minColumn and maxColumn; each boolean states whether the bound before it is included (true, closed end) or excluded (false, open end) |
hbase_filter(DependentColumnFilter,family,qualifier) | Specify a reference column and filter out the values whose timestamp differs from that of the reference column |
hbase_filter(FamilyFilter,ifequal,my-family) | Filter by column family; my-family is the column family name to compare with, and ifequal is a comparison operator such as "=" or "!=" |
hbase_filter(FirstKeyOnlyFilter) | Get the first value from each row |
hbase_filter(FirstKeyValueMatchingQualifiersFilter,column) | From each row, get the first value whose column qualifier matches the specified column |
hbase_filter(FuzzyRowFilter,fuzzyKeysData) | Match row keys fuzzily. The parameter is a list of byte-array pairs (List<Pair<byte[],byte[]>>). In each pair, the first array is the row key format and the second is a mask of 0s and 1s: 0 means the byte at the same position in the row key must match, and 1 means the byte at the same position can be different. |
hbase_filter(InclusiveStopFilter,stopRowKey) | Stop scanning once the specified row is reached; the stop row itself is included in the result |
hbase_filter(KeyOnlyFilter) | Return only the key components, such as row key, column family name, column name and timestamp |
hbase_filter(MultipleColumnPrefixFilter,prefixes1,prefixes2,...prefixesN) | For each row, get the field values whose column names start with any of the specified prefixes |
hbase_filter(PageFilter,n) | Pagination filter that limits the result to n rows |
hbase_filter(PrefixFilter,str) | Get the rows whose row key starts with str |
hbase_filter(QualifierFilter,cmp,cmpColumn) | Filter by column qualifier; cmp is a comparison operator such as equal (or =) and not equal (or !=), and cmpColumn is the comparator that column qualifiers are tested against |
hbase_filter(RowFilter,cmp,cmpRow) | Filter by row key; cmp is a comparison operator such as equal (or =) and not equal (or !=), and cmpRow is the comparator that row keys are tested against |
hbase_filter(ValueFilter,cmp,cmpValue) | Filter by value; cmp is a comparison operator such as equal (or =) and not equal (or !=), and cmpValue is the comparator that values are tested against |
hbase_filter(SingleColumnValueExcludeFilter,family,columnName,ifequal,cmpValue) / hbase_filter(SingleColumnValueFilter,family,columnName,ifequal,cmpValue) | Filter rows by the value of a specified column; family is the column family name, columnName is the column to test, ifequal is a comparison operator, and cmpValue is the comparator that the column value is tested against. The columnName column is excluded from the result with SingleColumnValueExcludeFilter and included with SingleColumnValueFilter. |
hbase_filter(RandomRowFilter,n) | Filter rows randomly, keeping each row with a certain probability. Running the same RandomRowFilter over the same data set several times returns different result sets. Parameter n is a float; all rows are filtered out if n<=0, and all rows are kept if n>=1. |
hbase_filter(SkipFilter,hbase_filter(ValueFilter,ifequal,cmpValue)) | A wrapper filter that drops an entire row if any of its cells fails the wrapped filter; it works with ValueFilter. |
hbase_filter(WhileMatchFilter,hbase_filter()) | A wrapper filter that stops the scan as soon as the wrapped filter's condition is no longer met. |
hbase_filter(TimestampsFilter,timeStamp,boolean) | Return only the values whose timestamp matches the specified timestamp. |
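The 0/1 mask of FuzzyRowFilter is easy to misread, so here is a minimal Python sketch of the matching rule described above. It does not call HBase or esProc; the function name fuzzy_match and the sample row keys are hypothetical, and comparing only the leading bytes of a longer row key is an assumption of this sketch.

```python
from typing import Sequence


def fuzzy_match(row_key: bytes, key_format: bytes, mask: Sequence[int]) -> bool:
    """Return True if row_key matches key_format under the 0/1 mask."""
    if len(key_format) != len(mask):
        raise ValueError("key format and mask must have the same length")
    if len(row_key) < len(key_format):
        return False  # assumption: a key shorter than the format never matches
    # Mask 0: the byte at this position must equal the format byte.
    # Mask 1: the byte at this position may be anything.
    return all(m == 1 or row_key[i] == key_format[i] for i, m in enumerate(mask))


# Hypothetical row keys, tested with the arguments "row1" and [1,1,0,0].
rows = [b"row1", b"xyw1", b"row2", b"abw1"]
matched = [r for r in rows if fuzzy_match(r, b"row1", [1, 1, 0, 0])]
print(matched)  # [b'row1', b'xyw1', b'abw1']
```

With this mask the first two bytes are free and the last two must be "w1", which is why row2 is rejected.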
Note:
This external library function (See External Library Guide) filters data with the HBase-supplied filter.
Parameter:
filterName |
HBase filter name |
filterArg |
Filter parameter |
Return value:
Filter handle
Example:
| A | |
1 | =hbase_open("hdfs://192.168.0.8:9000") | |
2 | =hbase_scan(A1,"emp") | Get all data of the emp table. |
3 | =hbase_filter("ColumnCountGetFilter",3) | |
4 | =hbase_scan(A1,"emp";filter:A3) | rowkey is the row name and is not one of the data columns. |
5 | =hbase_filter("ColumnPaginationFilter",2,2) | |
6 | =hbase_scan(A1,"emp";filter:A5) | |
7 | =hbase_filter("ColumnPrefixFilter","na") | |
8 | =hbase_scan(A1,"emp";filter:A7) | |
9 | =hbase_filter("ColumnRangeFilter","age",true,"name",false) | |
10 | =hbase_scan(A1,"emp";filter:A9) | |
11 | =hbase_filter("DependentColumnFilter","family","age") | |
12 | =hbase_scan(A1,"emp";filter:A11) | |
13 | =hbase_filter("FamilyFilter","=",hbase_cmp@s("family")) | |
14 | =hbase_scan(A1,"emp";filter:A13) | |
15 | =hbase_filter("KeyOnlyFilter") | |
16 | =hbase_scan(A1,"emp";filter:A15) | |
17 | =hbase_filter("FirstKeyOnlyFilter") | |
18 | =hbase_scan(A1,"emp";filter:A17) | |
19 | =hbase_filter("FuzzyRowFilter","row1",[1,1,0,0]) | |
20 | =hbase_scan(A1,"emp";filter:A19) | |
21 | =hbase_filter("InclusiveStopFilter","row3") | |
22 | =hbase_scan(A1,"emp";filter:A21) | |
23 | =hbase_filter("MultipleColumnPrefixFilter","na","tel","position") | |
24 | =hbase_scan(A1,"emp";filter:A23) | |
25 | =hbase_filter("PrefixFilter","ro") | |
26 | =hbase_scan(A1,"emp";filter:A25) | |
27 | =hbase_filter("QualifierFilter","eq",hbase_cmp@s("name")) | |
28 | =hbase_scan(A1,"emp";filter:A27) | |
29 | =hbase_filter("RowFilter","eq",hbase_cmp("row1")) | |
30 | =hbase_scan(A1,"emp";filter:A29) | |
31 | =hbase_filter("ValueFilter","=",hbase_cmp("C++")) | |
32 | =hbase_scan(A1,"emp";filter:A31) | |
33 | =hbase_filter("SingleColumnValueFilter","family","tel","eq",hbase_cmp@s("13")) | |
34 | =hbase_scan(A1,"emp";filter:A33) | |
35 | =hbase_filter("SingleColumnValueExcludeFilter","family","tel","eq",hbase_cmp@s("13")) | |
36 | =hbase_scan(A1,"emp";filter:A35) | |
37 | =hbase_filter("SkipFilter",hbase_filter("ValueFilter","=",hbase_cmp@s("aaa"))) | |
38 | =hbase_scan(A1,"emp";filter:A37) | |
39 | =hbase_filter("FirstKeyValueMatchingQualifiersFilter","name") | |
40 | =hbase_scan(A1,"emp";filter:A39) | |
41 | =hbase_filter("TimestampsFilter",1488855959195,1488855959219,1488855959145,false) | |
42 | =hbase_scan(A1,"emp";filter:A41) | |
43 | =hbase_filter("RandomRowFilter",0.5) | Filter rows at random. |
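To make the column-level filters in cells A5 and A9 more concrete, the following Python sketch emulates ColumnPaginationFilter and ColumnRangeFilter on a single in-memory row. It is only an illustration of the semantics given in the Syntax section, not the esProc or HBase implementation; the helper names and the sample row are hypothetical, and the sketch assumes all columns belong to one column family and are ordered by name.

```python
def column_pagination(row: dict, limit: int, offset: int) -> dict:
    """Keep `limit` columns, starting from the (offset+1)th column in name order."""
    names = sorted(row)[offset:offset + limit]
    return {name: row[name] for name in names}


def column_range(row: dict, min_col: str, min_inclusive: bool,
                 max_col: str, max_inclusive: bool) -> dict:
    """Keep the columns whose names lie between min_col and max_col."""
    def in_range(name: str) -> bool:
        above = name >= min_col if min_inclusive else name > min_col
        below = name <= max_col if max_inclusive else name < max_col
        return above and below
    return {name: value for name, value in row.items() if in_range(name)}


# Hypothetical row of the emp table (values are made up for illustration).
row = {"age": "30", "name": "Tom", "position": "dev", "tel": "138"}

# Same arguments as cell A5: limit 2, columnOffset 2.
print(column_pagination(row, 2, 2))  # {'position': 'dev', 'tel': '138'}

# Same arguments as cell A9: "age" included, "name" excluded.
print(column_range(row, "age", True, "name", False))  # {'age': '30'}
```

The second call returns only age because the upper bound "name" is open (false), so the name column itself falls outside the range.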
Related function: