@stelcheck
Created October 30, 2012 10:00
HBase Stargate REST API Scanner Filter Examples

Stargate Scanner Filter Examples

Introduction

So yeah... there is no documentation for the HBase REST API regarding what a filter should look like.

So I installed Eclipse, got the library, and took some time to find some of the (seemingly) most useful filters you could use. I'm very green at anything regarding HBase, and I hope this will help anyone trying to get started with it.

What I discovered is that, basically, the attributes of the filter object follow the same naming as in the documentation. For this reason, I have made each filter name a clickable link to the corresponding HBase class documentation; check the constructor argument names, and you will have your attribute list (more or less).

Don't forget, values are Base64-encoded.
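
Judging by the example values in this gist (for instance, `cHJlZml4` decodes to `prefix`), the encoding is Base64. A minimal Python sketch for converting values in both directions; the helper names `encode_value` and `decode_value` are made up for illustration:

```python
import base64


def encode_value(text: str) -> str:
    """Base64-encode a plain string for use in a filter attribute."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_value(encoded: str) -> str:
    """Decode a Base64 filter attribute back to a plain string."""
    return base64.b64decode(encoded).decode("utf-8")


print(encode_value("prefix"))    # cHJlZml4
print(decode_value("Zmx1ZmZ5"))  # fluffy
```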

References:

{
  "type": "ColumnPrefixFilter",
  "value": "cHJlZml4"
}

{
  "type": "ColumnRangeFilter",
  "minColumn": "Zmx1ZmZ5",
  "minColumnInclusive": true,
  "maxColumn": "Zmx1ZmZ6",
  "maxColumnInclusive": false
}
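
To make the range semantics concrete: decoded, the bounds above are `fluffy` (inclusive) and `fluffz` (exclusive). Assuming column qualifiers are compared byte by byte in lexicographic order, that range covers every qualifier starting with `fluffy`. The sketch below only simulates that comparison in Python; `in_column_range` is a hypothetical helper, not part of HBase:

```python
import base64


def in_column_range(qualifier: bytes, flt: dict) -> bool:
    """Simulate a ColumnRangeFilter check using lexicographic byte comparison."""
    lo = base64.b64decode(flt["minColumn"])
    hi = base64.b64decode(flt["maxColumn"])
    above = qualifier >= lo if flt["minColumnInclusive"] else qualifier > lo
    below = qualifier <= hi if flt["maxColumnInclusive"] else qualifier < hi
    return above and below


column_range = {
    "type": "ColumnRangeFilter",
    "minColumn": "Zmx1ZmZ5",   # "fluffy"
    "minColumnInclusive": True,
    "maxColumn": "Zmx1ZmZ6",   # "fluffz"
    "maxColumnInclusive": False,
}

for name in (b"fluffy", b"fluffy_cat", b"fluffz", b"fluff"):
    print(name.decode(), in_column_range(name, column_range))
```

Only `fluffy` and `fluffy_cat` fall inside the range; `fluffz` hits the exclusive upper bound and `fluff` sorts before the lower bound.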

Could not generate an example, but I guess it should be pretty simple to test if it works just by intuitively plugging variables a certain way...

null

{
  "type": "FamilyFilter",
  "op": "EQUAL",
  "comparator": {
    "type": "BinaryComparator",
    "value": "dGVzdHJvdw\u003d\u003d"
  }
}

{
  "type": "FilterList",
  "op": "MUST_PASS_ALL",
  "filters": [
    {
      "type": "RowFilter",
      "op": "EQUAL",
      "comparator": {
        "type": "BinaryComparator",
        "value": "dGVzdHJvdw\u003d\u003d"
      }
    },
    {
      "type": "ColumnRangeFilter",
      "minColumn": "Zmx1ZmZ5",
      "minColumnInclusive": true,
      "maxColumn": "Zmx1ZmZ6",
      "maxColumnInclusive": false
    }
  ]
}
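
A nested filter like this is easier to build programmatically than by hand. The following is a sketch in Python, under the assumption that the JSON layout shown above is what the server expects; the helper names are invented for illustration. It also shows that `\u003d` is just the JSON escape for `=`, so `dGVzdHJvdw\u003d\u003d` and `dGVzdHJvdw==` are the same string once parsed:

```python
import base64
import json


def b64(text: str) -> str:
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def row_filter(row: str, op: str = "EQUAL") -> dict:
    """RowFilter with a BinaryComparator, mirroring the layout above."""
    return {
        "type": "RowFilter",
        "op": op,
        "comparator": {"type": "BinaryComparator", "value": b64(row)},
    }


def column_range(min_col: str, max_col: str) -> dict:
    """ColumnRangeFilter covering [min_col, max_col)."""
    return {
        "type": "ColumnRangeFilter",
        "minColumn": b64(min_col),
        "minColumnInclusive": True,
        "maxColumn": b64(max_col),
        "maxColumnInclusive": False,
    }


def filter_list(*filters: dict, op: str = "MUST_PASS_ALL") -> dict:
    """Combine filters; the HBase FilterList operators are MUST_PASS_ALL and MUST_PASS_ONE."""
    return {"type": "FilterList", "op": op, "filters": list(filters)}


combined = filter_list(row_filter("testrow"), column_range("fluffy", "fluffz"))
print(json.dumps(combined, indent=2))

# The escaped form used in this gist parses to the same value.
escaped = json.loads('"dGVzdHJvdw\\u003d\\u003d"')
print(escaped == combined["filters"][0]["comparator"]["value"])
```
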
{
  "type": "FirstKeyOnlyFilter"
}

{
  "type": "InclusiveStopFilter",
  "value": "cm93a2V5"
}

{
  "type": "MultipleColumnPrefixFilter",
  "prefixes": [
    "YWxwaGE\u003d",
    "YnJhdm8\u003d",
    "Y2hhcmxpZQ\u003d\u003d"
  ]
}

{
  "type": "PageFilter",
  "value": "10"
}

{
  "type": "PrefixFilter",
  "value": "cm93cHJlZml4"
}

{
  "type": "QualifierFilter",
  "op": "GREATER",
  "comparator": {
    "type": "BinaryComparator",
    "value": "cXVhbGlmaWVycHJlZml4"
  }
}

{
  "type": "RowFilter",
  "op": "EQUAL",
  "comparator": {
    "type": "BinaryComparator",
    "value": "dGVzdHJvdw\u003d\u003d"
  }
}

{
  "type": "SingleColumnValueFilter",
  "op": "EQUAL",
  "family": "ZmFtaWx5",
  "qualifier": "Y29sMQ\u003d\u003d",
  "latestVersion": true,
  "comparator": {
    "type": "BinaryComparator",
    "value": "MQ\u003d\u003d"
  }
}
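
Decoding an example is a quick way to check what it actually matches. A small Python sketch, with `unb64` being a helper of my own naming, applied to the SingleColumnValueFilter above:

```python
import base64
import json

# Raw string so that \u003d reaches the JSON parser untouched.
raw = r'''
{
  "type": "SingleColumnValueFilter",
  "op": "EQUAL",
  "family": "ZmFtaWx5",
  "qualifier": "Y29sMQ\u003d\u003d",
  "latestVersion": true,
  "comparator": {
    "type": "BinaryComparator",
    "value": "MQ\u003d\u003d"
  }
}
'''


def unb64(encoded: str) -> str:
    return base64.b64decode(encoded).decode("utf-8")


flt = json.loads(raw)
column = f'{unb64(flt["family"])}:{unb64(flt["qualifier"])}'
expected = unb64(flt["comparator"]["value"])
print(f'{column} {flt["op"]} {expected!r}')  # family:col1 EQUAL '1'
```
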
{
  "type": "TimestampsFilter",
  "timestamps": [
    "1351586939"
  ]
}
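
Putting it together: the comment thread below shows a scanner being created by sending a `<Scanner>` XML document with the filter JSON inside a `<filter>` element. A hedged Python sketch of building that body follows; the host, the table name, and the `scanner_body` helper are placeholders of mine, and the request is only constructed, never sent:

```python
import json
import urllib.request
import xml.etree.ElementTree as ET


def scanner_body(flt: dict, batch: int = 1000) -> bytes:
    """Wrap a filter dict in a <Scanner> XML document."""
    scanner = ET.Element("Scanner", batch=str(batch))
    # ElementTree escapes &, < and > in text, so the JSON is embedded safely.
    ET.SubElement(scanner, "filter").text = json.dumps(flt)
    return ET.tostring(scanner, encoding="unicode").encode("utf-8")


body = scanner_body({"type": "PageFilter", "value": "10"})
print(body.decode("utf-8"))

# Placeholder endpoint: adjust host, port, and table to your REST server.
request = urllib.request.Request(
    "http://localhost:20550/mytable/scanner/",
    data=body,
    method="PUT",
    headers={"Accept": "text/xml", "Content-Type": "text/xml"},
)
# urllib.request.urlopen(request) would send it; on success the new
# scanner's URL is expected in the Location response header.
```

Using `json.dumps` guarantees quoted keys and valid JSON inside the `<filter>` element, which removes one possible source of errors when debugging a failing request.
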
@stelcheck
Author

Oh wow, never noticed people were using that! ☘️

@heyitsvajid in HBase you shouldn't need a filter for that, just a start and end key should do... but it's been a long time, so I might be wrong.

@dksifoua

Is it possible to do a timestamp range filter?
I think the TimestampsFilter only retrieves data at the specified timestamps (tell me if that is not true).

@sauravbilung

Thank you! You made my day.

@support-fdx

How could I get data in reverse order?

@developer-ramesh

Hi,

How can I get the total number of records in HBase from Node.js?

Can anyone help me please?

Thanks in advance

@1518751112

Thank you, I had been looking for this for a long time.

@wox080xow

wox080xow commented Jul 20, 2022

I tried FirstKeyOnlyFilter and KeyOnlyFilter separately in HBase 1.2 (CDH 5.16), but neither worked; both returned "503 Service Unavailable".
Does anybody have any idea?

The following is the request and response:

wox080xow@lijiayus-MacBook-Pro tmp % curl -vi -X PUT \
  -H "Accept: text/xml" \
  -H "Content-Type: text/xml" \
  -d '<Scanner batch="1000"><filter>{type:"KeyOnlyFilter"}</filter></Scanner>' \
  "http://172.16.##.##:20550/HBASE_FDCLOT/scanner/"
*   Trying 172.16.##.##...
* TCP_NODELAY set
* Connected to 172.16.##.## (172.16.##.##) port 20550 (#0)
> PUT /HBASE_FDCLOT/scanner/ HTTP/1.1
> Host: 172.16.##.##:20550
> User-Agent: curl/7.64.1
> Accept: text/xml
> Content-Type: text/xml
> Content-Length: 71
>
* upload completely sent off: 71 out of 71 bytes
< HTTP/1.1 503 Service Unavailable
HTTP/1.1 503 Service Unavailable
< Content-Length: 13
Content-Length: 13
< Content-Type: text/plain
Content-Type: text/plain

<
Unavailable
* Connection #0 to host 172.16.##.## left intact
* Closing connection 0

The log of the HBase REST server:

2022-07-20 14:55:52,594 DEBUG org.mortbay.log: REQUEST /HBASE_FDCLOT/scanner/ on org.mortbay.jetty.HttpConnection@3453c65b
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: sessionManager=org.mortbay.jetty.servlet.HashSessionManager@27508c5d
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: session=null
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: servlet=com.sun.jersey.spi.container.servlet.ServletContainer-1693226694
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: chain=org.apache.hadoop.hbase.rest.filter.GzipFilter-888557915->com.sun.jersey.spi.container.servlet.ServletContainer-1693226694
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: servlet holder=com.sun.jersey.spi.container.servlet.ServletContainer-1693226694
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: call filter org.apache.hadoop.hbase.rest.filter.GzipFilter-888557915
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: call servlet com.sun.jersey.spi.container.servlet.ServletContainer-1693226694
2022-07-20 14:55:52,597 DEBUG org.apache.hadoop.hbase.rest.ScannerResource: PUT http://172.16.1.217:20550/HBASE_FDCLOT/scanner/
2022-07-20 14:55:52,690 DEBUG org.mortbay.log: RESPONSE /HBASE_FDCLOT/scanner/  503
2022-07-20 14:55:52,698 DEBUG org.mortbay.log: EOF
