
@stelcheck
Created October 30, 2012 10:00
HBase Stargate REST API Scanner Filter Examples

Stargate Scanner Filter Examples

Introduction

So yeah... there is no documentation for the HBase REST API on what a filter should look like.

So I installed Eclipse, got the library, and took some time to find some of the (seemingly) most useful filters you could use. I'm very green at anything regarding HBase, and I hope this will help anyone trying to get started with it.

What I discovered is that, basically, the attributes of the filter object follow the same naming as in the documentation. For this reason, I have linked each filter name to its HBase class documentation; check the constructor argument names, and you will have your attribute list (more or less).

Don't forget: values are Base64-encoded (for example, "cHJlZml4" is "prefix").
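The values in the examples decode as Base64 (for instance, "Zmx1ZmZ5" is "fluffy"), and `\u003d` is simply the JSON escape for `=`. Here is a minimal Python sketch for producing such values. The helper names `b64` and `column_prefix_filter` are made up for illustration and are not part of any HBase client.

```python
import base64
import json


def b64(text: str) -> str:
    """Base64-encode a UTF-8 string, as the filter JSON expects."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def column_prefix_filter(prefix: str) -> dict:
    """Build a ColumnPrefixFilter spec (hypothetical helper)."""
    return {"type": "ColumnPrefixFilter", "value": b64(prefix)}


# The "=" padding may appear as \u003d in serialized output; both forms
# parse to the same JSON string.
escaped = json.loads('"dGVzdHJvdw\\u003d\\u003d"')

print(json.dumps(column_prefix_filter("prefix"), indent=2))
print(escaped == b64("testrow"))
```

Note that not every value in the examples is encoded: the PageFilter and TimestampsFilter examples carry plain numeric strings.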

{
  "type": "ColumnPrefixFilter",
  "value": "cHJlZml4"
}
{
  "type": "ColumnRangeFilter",
  "minColumn": "Zmx1ZmZ5",
  "minColumnInclusive": true,
  "maxColumn": "Zmx1ZmZ6",
  "maxColumnInclusive": false
}

I could not generate an example for this one, but it should be simple to test: plug in attributes named after the constructor arguments and see whether it works.

null

{
  "type": "FamilyFilter",
  "op": "EQUAL",
  "comparator": {
    "type": "BinaryComparator",
    "value": "dGVzdHJvdw\u003d\u003d"
  }
}
{
  "type": "FilterList",
  "op": "MUST_PASS_ALL",
  "filters": [
    {
      "type": "RowFilter",
      "op": "EQUAL",
      "comparator": {
        "type": "BinaryComparator",
        "value": "dGVzdHJvdw\u003d\u003d"
      }
    },
    {
      "type": "ColumnRangeFilter",
      "minColumn": "Zmx1ZmZ5",
      "minColumnInclusive": true,
      "maxColumn": "Zmx1ZmZ6",
      "maxColumnInclusive": false
    }
  ]
}
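The FilterList above is how several filters get combined: MUST_PASS_ALL behaves like AND. HBase's `FilterList.Operator` also defines MUST_PASS_ONE (OR); the sketch below assumes the REST server accepts it in the `op` field the same way, which I have not verified. The helper names `b64` and `filter_list` are hypothetical.

```python
import base64
import json


def b64(text: str) -> str:
    """Base64-encode a UTF-8 string for use in a filter spec."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def filter_list(op: str, *filters: dict) -> dict:
    """Wrap filter specs in a FilterList (hypothetical helper).

    op is MUST_PASS_ALL (AND) or MUST_PASS_ONE (OR).
    """
    if op not in ("MUST_PASS_ALL", "MUST_PASS_ONE"):
        raise ValueError(f"unknown FilterList operator: {op}")
    return {"type": "FilterList", "op": op, "filters": list(filters)}


row = {
    "type": "RowFilter",
    "op": "EQUAL",
    "comparator": {"type": "BinaryComparator", "value": b64("testrow")},
}
prefix = {"type": "ColumnPrefixFilter", "value": b64("prefix")}

# OR: keep cells that match either filter.
either = filter_list("MUST_PASS_ONE", row, prefix)
# FilterLists nest, so AND and OR can be mixed.
nested = filter_list("MUST_PASS_ALL", either, {"type": "FirstKeyOnlyFilter"})

print(json.dumps(nested, indent=2))
```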
{
  "type": "FirstKeyOnlyFilter"
}
{
  "type": "InclusiveStopFilter",
  "value": "cm93a2V5"
}
{
  "type": "MultipleColumnPrefixFilter",
  "prefixes": [
    "YWxwaGE\u003d",
    "YnJhdm8\u003d",
    "Y2hhcmxpZQ\u003d\u003d"
  ]
}
{
  "type": "PageFilter",
  "value": "10"
}
{
  "type": "PrefixFilter",
  "value": "cm93cHJlZml4"
}
{
  "type": "QualifierFilter",
  "op": "GREATER",
  "comparator": {
    "type": "BinaryComparator",
    "value": "cXVhbGlmaWVycHJlZml4"
  }
}
{
  "type": "RowFilter",
  "op": "EQUAL",
  "comparator": {
    "type": "BinaryComparator",
    "value": "dGVzdHJvdw\u003d\u003d"
  }
}
{
  "type": "SingleColumnValueFilter",
  "op": "EQUAL",
  "family": "ZmFtaWx5",
  "qualifier": "Y29sMQ\u003d\u003d",
  "latestVersion": true,
  "comparator": {
    "type": "BinaryComparator",
    "value": "MQ\u003d\u003d"
  }
}
{
  "type": "TimestampsFilter",
  "timestamps": [
    "1351586939"
  ]
}
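To use one of these filters, the JSON goes inside the `<filter>` element of a `<Scanner>` XML document that is sent with PUT to `/<table>/scanner`, the same shape as the curl request further down this page. The following Python sketch only builds the request and does not send it. The host, port, table name, and the helper names `scanner_body` and `scanner_request` are placeholders.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET


def scanner_body(filter_spec: dict, batch: int = 100) -> bytes:
    """Build the <Scanner> XML body with the filter JSON as element text."""
    scanner = ET.Element("Scanner", {"batch": str(batch)})
    # ElementTree escapes &, < and > in the JSON text for us.
    ET.SubElement(scanner, "filter").text = json.dumps(filter_spec)
    return ET.tostring(scanner)


def scanner_request(base_url: str, table: str, filter_spec: dict,
                    batch: int = 100) -> urllib.request.Request:
    """Prepare (but do not send) the scanner-creation request."""
    return urllib.request.Request(
        f"{base_url}/{table}/scanner",
        data=scanner_body(filter_spec, batch),
        method="PUT",
        headers={"Content-Type": "text/xml", "Accept": "text/xml"},
    )


# Placeholder endpoint and table name.
req = scanner_request("http://localhost:8080", "mytable",
                      {"type": "FirstKeyOnlyFilter"})
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Passing `req` to `urllib.request.urlopen` would send it. As far as I know, the server answers with the new scanner's URL in the `Location` header, and results are then fetched with GET requests against that URL, but check this against your HBase version.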
@vandelay

vandelay commented May 3, 2016

Thank you

@turbopape

turbopape commented Jun 1, 2016

Great job! Many thanks!
You have to put these in an XML document, right?
<Scanner ...
<filter ...
the json
?
Also, is there a way to have the scanner return results in sets instead of one by one?

@riazraza

Great job! Thanks a lot.
But why is DependentColumnFilter null? We can use it through the REST client, right?

@rpwils

rpwils commented May 30, 2018

Do you know if there is a way to paginate a response? I wanted to use something like this, or even something with an unlimited response.

@sjschmid

How do you AND / OR several filters using the REST API? Can you give an example?

@hapiman

hapiman commented Aug 7, 2018

@sjschmid I ran into the same question. Have you solved it?

@heyitsvajid

Is it possible to search with a prefix value on the row key?
Similar to RowFilter, but matching on a prefix instead of using an EQUAL comparison?

@stelcheck
Author

Oh wow, never noticed people were using that! ☘️

@heyitsvajid in HBase you shouldn't need a filter for that, just a start and end key should do... but it's been a long time, so I might be wrong.

@dksifoua

Is it possible to do a timestamp range filter?
I think the TimestampsFilter only retrieves data at the specified timestamps (tell me if that is not true).

@sauravbilung

Thank you! You made my day.

@support-fdx

How can I get data in reverse order?

@developer-ramesh

Hi,

How can I get the total record count in HBase from Node.js?

Can anyone help me, please?

Thanks in advance

@1518751112

Thank you, I had been looking for this for a long time.

@wox080xow

wox080xow commented Jul 20, 2022

I tried both FirstKeyOnlyFilter and KeyOnlyFilter in HBase 1.2 (CDH 5.16), but neither worked: each returned "503 Service Unavailable".
Does anybody have any idea?

The following is the request and response:

wox080xow@lijiayus-MacBook-Pro tmp % curl -vi -X PUT \
  -H "Accept: text/xml" \
  -H "Content-Type: text/xml" \
  -d '<Scanner batch="1000"><filter>{type:"KeyOnlyFilter"}</filter></Scanner>' \
  "http://172.16.##.##:20550/HBASE_FDCLOT/scanner/"
*   Trying 172.16.##.##...
* TCP_NODELAY set
* Connected to 172.16.##.## (172.16.##.##) port 20550 (#0)
> PUT /HBASE_FDCLOT/scanner/ HTTP/1.1
> Host: 172.16.##.##:20550
> User-Agent: curl/7.64.1
> Accept: text/xml
> Content-Type: text/xml
> Content-Length: 71
>
* upload completely sent off: 71 out of 71 bytes
< HTTP/1.1 503 Service Unavailable
HTTP/1.1 503 Service Unavailable
< Content-Length: 13
Content-Length: 13
< Content-Type: text/plain
Content-Type: text/plain

<
Unavailable
* Connection #0 to host 172.16.##.## left intact
* Closing connection 0

The HBase REST server log:

2022-07-20 14:55:52,594 DEBUG org.mortbay.log: REQUEST /HBASE_FDCLOT/scanner/ on org.mortbay.jetty.HttpConnection@3453c65b
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: sessionManager=org.mortbay.jetty.servlet.HashSessionManager@27508c5d
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: session=null
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: servlet=com.sun.jersey.spi.container.servlet.ServletContainer-1693226694
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: chain=org.apache.hadoop.hbase.rest.filter.GzipFilter-888557915->com.sun.jersey.spi.container.servlet.ServletContainer-1693226694
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: servlet holder=com.sun.jersey.spi.container.servlet.ServletContainer-1693226694
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: call filter org.apache.hadoop.hbase.rest.filter.GzipFilter-888557915
2022-07-20 14:55:52,594 DEBUG org.mortbay.log: call servlet com.sun.jersey.spi.container.servlet.ServletContainer-1693226694
2022-07-20 14:55:52,597 DEBUG org.apache.hadoop.hbase.rest.ScannerResource: PUT http://172.16.1.217:20550/HBASE_FDCLOT/scanner/
2022-07-20 14:55:52,690 DEBUG org.mortbay.log: RESPONSE /HBASE_FDCLOT/scanner/  503
2022-07-20 14:55:52,698 DEBUG org.mortbay.log: EOF
