nsabharwal/gist:ba7d875e058add1b4843

## gistfile1.txt
Please see the following details on Apache Phoenix “sql skin for HBase” .

Phoenix

The following details are based on a test done in one of my lab environments. You can see that we can run sql, secondary indexes, explain plan, data load and bulk load by using phoenix.

Table definition

drop table if exists crime;

create table crime (

caseid varchar,

Date varchar,

block varchar,

description varchar,

sdesc varchar,

ldesc varchar,

arrest char(2),

domestic char(2),

lat float,

long float

constraint PK PRIMARY KEY (caseid)

);

Sql prompt can be launched from your desktop by using

sqlline.py <zookeeper server>:<zookeeper port>

0: jdbc:phoenix:lab> !describe CRIME

+————+————-+————+————-+————+————+————-+—————+—————-+—————-+————+————+—————+

| TABLE_CAT  | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE  | TYPE_NAME  | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX |  NULLABLE  | COLUMN_DEF | SQL_DATA_TYPE |

+————+————-+————+————-+————+————+————-+—————+—————-+—————-+————+————+—————+

| null       | null        | CRIME      | CASEID      | 12         | VARCHAR    | null        | null          | null           | null           | 0          | null       | null          |

| null       | null        | CRIME      | DATE        | 12         | VARCHAR    | null        | null          | null           | null           | 1          | null       | null          |

| null       | null        | CRIME      | BLOCK       | 12         | VARCHAR    | null        | null          | null           | null           | 1          | null       | null          |

| null       | null        | CRIME      | DESCRIPTION | 12         | VARCHAR    | null        | null          | null           | null           | 1          | null       | null          |

| null       | null        | CRIME      | SDESC       | 12         | VARCHAR    | null        | null          | null           | null           | 1          | null       | null          |

| null       | null        | CRIME      | LDESC       | 12         | VARCHAR    | null        | null          | null           | null           | 1          | null       | null          |

| null       | null        | CRIME      | ARREST      | 1          | CHAR       | 2           | null          | null           | null           | 1          | null       | null          |

| null       | null        | CRIME      | DOMESTIC    | 1          | CHAR       | 2           | null          | null           | null           | 1          | null       | null          |

| null       | null        | CRIME      | LAT         | 6          | FLOAT      | null        | null          | null           | null           | 1          | null       | null          |

| null       | null        | CRIME      | LONG        | 6          | FLOAT      | null        | null          | null           | null           | 1          | null       | null          |

+————+————-+————+————-+————+————+————-+—————+—————-+—————-+————+————+———————+

0: jdbc:phoenix:lab> select count(1) from CRIME where ARREST=’Y';

+————+

| COUNT(1) |

+————+

|1092172   |

+————+

1 row selected (9.964 seconds)

Secondary Indexes : http://phoenix.apache.org/secondary_indexing.html

0: jdbc:phoenix:lab> create index index_arrest on CRIME (arrest);

3,837,776 rows affected (95.428 seconds)

0: jdbc:phoenix:lab> select count(1) from CRIME where ARREST=’Y';

+————+

| COUNT(1) |

+————+

|1092172   |

+————+

1 row selected (2.169 seconds)

0: jdbc:phoenix:lab> select count(1) from CRIME where ARREST=’N';

+————+

| COUNT(1) |

+————+

|2745604   |

+————+

1 row selected (4.055 seconds)

explain select count(1) from CRIME where ARREST=’Y';

+————+

|   PLAN    |

+————+

| CLIENT PARALLEL 32-WAY RANGE SCAN OVER INDEX_ARREST [‘Y’] |

|     SERVER AGGREGATE INTO SINGLE ROW |

+————+

2 rows selected (0.03 seconds)

Statistics: http://phoenix.apache.org/update_statistics.html

Tuning : http://phoenix.apache.org/tuning.html

Infomarion: http://phoenix.apache.org/language/index.html

You can see table in HBASE

hbase(main):004:0> list

TABLE

CRIME

Data load

/usr/lib/phoenix/bin/psql.py -t CRIME lab:2181 crim4.csv

Bulk load  – Be careful about ignore-errors parameter

HADOOP_CLASSPATH=/usr/lib/hbase/lib/hbase-protocol.jar:/etc/hbase/conf hadoop jar /usr/lib/phoenix/phoenix-4.0.0.2.1.4.0-632-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool –table CRIME  –input /tmp/crim4.csv  -z lab:2181 –ignore-errors
	Please see the following details on Apache Phoenix “sql skin for HBase” .

	Phoenix

	The following details are based on a test done in one of my lab environments. You can see that we can run sql, secondary indexes, explain plan, data load and bulk load by using phoenix.

	Table definition

	drop table if exists crime;

	create table crime (

	caseid varchar,

	Date varchar,

	block varchar,

	description varchar,

	sdesc varchar,

	ldesc varchar,

	arrest char(2),

	domestic char(2),

	lat float,

	long float

	constraint PK PRIMARY KEY (caseid)

	);

	Sql prompt can be launched from your desktop by using

	sqlline.py <zookeeper server>:<zookeeper port>

	0: jdbc:phoenix:lab> !describe CRIME

	+————+————-+————+————-+————+————+————-+—————+—————-+—————-+————+————+—————+

	\| TABLE_CAT \| TABLE_SCHEM \| TABLE_NAME \| COLUMN_NAME \| DATA_TYPE \| TYPE_NAME \| COLUMN_SIZE \| BUFFER_LENGTH \| DECIMAL_DIGITS \| NUM_PREC_RADIX \| NULLABLE \| COLUMN_DEF \| SQL_DATA_TYPE \|

	+————+————-+————+————-+————+————+————-+—————+—————-+—————-+————+————+—————+

	\| null \| null \| CRIME \| CASEID \| 12 \| VARCHAR \| null \| null \| null \| null \| 0 \| null \| null \|

	\| null \| null \| CRIME \| DATE \| 12 \| VARCHAR \| null \| null \| null \| null \| 1 \| null \| null \|

	\| null \| null \| CRIME \| BLOCK \| 12 \| VARCHAR \| null \| null \| null \| null \| 1 \| null \| null \|

	\| null \| null \| CRIME \| DESCRIPTION \| 12 \| VARCHAR \| null \| null \| null \| null \| 1 \| null \| null \|

	\| null \| null \| CRIME \| SDESC \| 12 \| VARCHAR \| null \| null \| null \| null \| 1 \| null \| null \|

	\| null \| null \| CRIME \| LDESC \| 12 \| VARCHAR \| null \| null \| null \| null \| 1 \| null \| null \|

	\| null \| null \| CRIME \| ARREST \| 1 \| CHAR \| 2 \| null \| null \| null \| 1 \| null \| null \|

	\| null \| null \| CRIME \| DOMESTIC \| 1 \| CHAR \| 2 \| null \| null \| null \| 1 \| null \| null \|

	\| null \| null \| CRIME \| LAT \| 6 \| FLOAT \| null \| null \| null \| null \| 1 \| null \| null \|

	\| null \| null \| CRIME \| LONG \| 6 \| FLOAT \| null \| null \| null \| null \| 1 \| null \| null \|

	+————+————-+————+————-+————+————+————-+—————+—————-+—————-+————+————+———————+

	0: jdbc:phoenix:lab> select count(1) from CRIME where ARREST=’Y';

	+————+

	\| COUNT(1) \|

	+————+

	\|1092172 \|

	+————+

	1 row selected (9.964 seconds)

	Secondary Indexes : http://phoenix.apache.org/secondary_indexing.html

	0: jdbc:phoenix:lab> create index index_arrest on CRIME (arrest);

	3,837,776 rows affected (95.428 seconds)

	0: jdbc:phoenix:lab> select count(1) from CRIME where ARREST=’Y';

	+————+

	\| COUNT(1) \|

	+————+

	\|1092172 \|

	+————+

	1 row selected (2.169 seconds)

	0: jdbc:phoenix:lab> select count(1) from CRIME where ARREST=’N';

	+————+

	\| COUNT(1) \|

	+————+

	\|2745604 \|

	+————+

	1 row selected (4.055 seconds)

	explain select count(1) from CRIME where ARREST=’Y';

	+————+

	\| PLAN \|

	+————+

	\| CLIENT PARALLEL 32-WAY RANGE SCAN OVER INDEX_ARREST [‘Y’] \|

	\| SERVER AGGREGATE INTO SINGLE ROW \|

	+————+

	2 rows selected (0.03 seconds)

	Statistics: http://phoenix.apache.org/update_statistics.html

	Tuning : http://phoenix.apache.org/tuning.html

	Infomarion: http://phoenix.apache.org/language/index.html

	You can see table in HBASE

	hbase(main):004:0> list

	TABLE

	CRIME

	Data load

	/usr/lib/phoenix/bin/psql.py -t CRIME lab:2181 crim4.csv

	Bulk load – Be careful about ignore-errors parameter

	HADOOP_CLASSPATH=/usr/lib/hbase/lib/hbase-protocol.jar:/etc/hbase/conf hadoop jar /usr/lib/phoenix/phoenix-4.0.0.2.1.4.0-632-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool –table CRIME –input /tmp/crim4.csv -z lab:2181 –ignore-errors