
HBase Integration with Hive

Setting up HBase Integration with Hive:


To set up HBase integration with Hive, we mainly need a few jar files present in the
$HIVE_HOME/lib or $HBASE_HOME/lib directory. The required jar files are:
zookeeper-*.jar
hive-hbase-handler-*.jar
guava-*.jar
hbase-*.jar


We need to add the paths of the above jar files to the value of hive.aux.jars.path in the hive-site.xml configuration file:

<property>
<name>hive.aux.jars.path</name>
<value>file:///home/training/apache-hive-0.13.1-bin/lib/hive-hbase-handler-0.13.1.jar,
file:///home/training/apache-hive-0.13.1-bin/lib/zookeeper-3.4.5.jar,
file:///home/training/apache-hive-0.13.1-bin/lib/guava-11.0.2.jar,
...
</value>
<description>A comma separated list (with no spaces) of the jar files required for Hive-HBase
integration</description>
</property>
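The value is easy to get wrong: it must be a single comma-separated string of file:// URIs with no spaces anywhere. As a sanity check, here is a small illustrative Python sketch (not part of Hive; the jar paths are the ones from the example above) that assembles such a value:

```python
# Build a hive.aux.jars.path value: comma-separated file:// URIs, no spaces.
jars = [
    "/home/training/apache-hive-0.13.1-bin/lib/hive-hbase-handler-0.13.1.jar",
    "/home/training/apache-hive-0.13.1-bin/lib/zookeeper-3.4.5.jar",
    "/home/training/apache-hive-0.13.1-bin/lib/guava-11.0.2.jar",
]
aux_jars_path = ",".join("file://" + p for p in jars)
print(aux_jars_path)
```

If the printed string contains a space or a newline, Hive will fail to pick up the jars.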


Verify HBase Integration with Hive: (Managing HBase tables using Hive)

Let's create a new HBase table via the Hive shell. To test HBase table creation we need the
Hadoop and HBase daemons running:
start-all.sh
start-hbase.sh
Below is a sample HBase table creation DDL statement. In it, we create the
hbase_table_emp table in Hive and the emp table in HBase. The table has 3 columns in
Hive - id int, name string and role string. The last two are mapped to the name and
role columns of the cf1 column family. The :key entry at the beginning of the
hbase.columns.mapping property automatically maps to the first column (id int) of the Hive
table.

CREATE TABLE hbase_table_emp(id int, name string, role string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:role")
TBLPROPERTIES ("hbase.table.name" = "emp");
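The pairing rule is purely positional: the Nth entry in hbase.columns.mapping corresponds to the Nth Hive column. A small illustrative Python sketch (not Hive code) of that pairing for the DDL above:

```python
# Hive columns from the CREATE TABLE, in declaration order.
hive_columns = ["id", "name", "role"]
# Entries from hbase.columns.mapping, in the same order.
mapping = ":key,cf1:name,cf1:role".split(",")

# The handler pairs them position by position.
pairs = dict(zip(hive_columns, mapping))
print(pairs)
# ':key' marks the HBase row key; the rest are family:qualifier entries.
```

Because the pairing is positional, the number of mapping entries must equal the number of Hive columns, or table creation fails.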

Let's verify the emp table in the HBase shell and view its metadata.
$ hbase shell
hbase> list
hbase> describe 'emp'

We cannot load data directly into the HBase-backed table emp with the Hive LOAD DATA
INPATH command; we have to copy data into it from another Hive table. Let's create another
test Hive table with the same schema as hbase_table_emp and load records into it with the
LOAD DATA LOCAL INPATH command.

hive> create table testemp(id int, name string, role string) row format delimited
fields terminated by '\t';
hive> load data local inpath '/home/siva/sample.txt' into table testemp;
hive> select * from testemp;
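The input file must match the delimited row format declared above. Here is a hypothetical Python sketch (the rows and the sample.txt name are illustrative, not from the source) of what a tab-separated file for the testemp schema looks like:

```python
import csv
import os
import tempfile

# Hypothetical rows matching the testemp schema (id, name, role).
# A file like /home/siva/sample.txt would hold tab-separated lines of this shape.
rows = [(1, "siva", "developer"), (2, "john", "tester")]

path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", newline="") as f:
    csv.writer(f, delimiter="\t", lineterminator="\n").writerows(rows)

print(open(path).read())
```

Each line becomes one row of testemp, with fields split on the tab character.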

Let's copy the contents of testemp into hbase_table_emp and verify them.

hive> insert overwrite table hbase_table_emp select * from testemp;
hive> select * from hbase_table_emp;

Let's see the contents of the emp table from the HBase shell:


$ hbase shell
hbase> scan 'emp'

So we have successfully integrated HBase with Hive by creating and populating a new HBase
table from the Hive shell.

Mapping Existing HBase Tables to Hive:

Similar to creating new HBase tables, we can also map existing HBase tables to Hive. To
give Hive access to an existing HBase table with multiple columns and families, we need to
use CREATE EXTERNAL TABLE. Again, hbase.columns.mapping is required (and will be
validated against the existing HBase table's column families), whereas hbase.table.name
is optional.

For testing this, we will create a user table in HBase as shown below and map it to a Hive
table.
hbase(main):002:0> create 'user', 'cf1', 'cf2'
hbase(main):003:0> put 'user', 'row1', 'cf1:a', 'value1'
hbase(main):004:0> put 'user', 'row1', 'cf1:b', 'value2'
hbase(main):005:0> put 'user', 'row1', 'cf2:c', 'value3'
hbase(main):006:0> put 'user', 'row2', 'cf2:c', 'value4'
hbase(main):007:0> put 'user', 'row2', 'cf1:b', 'value5'
hbase(main):008:0> put 'user', 'row3', 'cf1:a', 'value6'
hbase(main):009:0> put 'user', 'row3', 'cf2:c', 'value7'
hbase(main):010:0> describe 'user'
hbase(main):011:0> scan 'user'

Let's create the corresponding Hive table for the above user HBase table. Below is the DDL
for creating the external table hbase_table_user. Note that the mapping needs one entry per
Hive column, including :key for the row key.
$ hive
hive> CREATE EXTERNAL TABLE hbase_table_user(key string, val1 string, val2 string, val3 string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:a,cf1:b,cf2:c")
TBLPROPERTIES("hbase.table.name" = "user");
Verify the contents of the hbase_table_user table.
hive> DESCRIBE hbase_table_user;
hive> SELECT * FROM hbase_table_user;
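HBase rows are sparse, so cells that are absent for a row come back as NULL in Hive. Here is an illustrative Python sketch (not Hive code) of how the user rows inserted above surface through the mapped Hive columns:

```python
# HBase cells from the puts above, keyed by (row, column).
cells = {
    ("row1", "cf1:a"): "value1", ("row1", "cf1:b"): "value2", ("row1", "cf2:c"): "value3",
    ("row2", "cf1:b"): "value5", ("row2", "cf2:c"): "value4",
    ("row3", "cf1:a"): "value6", ("row3", "cf2:c"): "value7",
}
mapping = ["cf1:a", "cf1:b", "cf2:c"]  # val1, val2, val3 in hbase_table_user

# Missing cells surface as NULL (None here) in the Hive result.
result = {row: [cells.get((row, col)) for col in mapping]
          for row in ("row1", "row2", "row3")}
print(result)
```

For example, row2 has no cf1:a cell, so its val1 column is NULL in the Hive query output.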

So, we have successfully mapped an existing HBase table to a Hive external table.

Hive MAP to HBase Column Family

Here's how a Hive MAP datatype can be used to access an entire column family. Each row
can have a different set of columns, where the column names correspond to the map keys
and the column values correspond to the map values.

CREATE TABLE hbase_table_1(value map<string,int>, row_key int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = "cf:,:key"
);
INSERT OVERWRITE TABLE hbase_table_1 SELECT map(bar, foo), foo FROM pokes
WHERE foo=98 OR foo=100;
(This example also demonstrates using a Hive column other than the first as the HBase row
key.)
Here's how this looks in HBase (with different column names in different rows):
hbase(main):012:0> scan "hbase_table_1"
ROW                      COLUMN+CELL
 100                     column=cf:val_100, timestamp=1267739509194, value=100
 98                      column=cf:val_98, timestamp=1267739509194, value=98
2 row(s) in 0.0080 seconds


And when queried back into Hive:
hive> select * from hbase_table_1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
...
OK
{"val_100":100}   100
{"val_98":98}     98
Time taken: 3.808 seconds

Note that the key of the MAP must have datatype string, since it is used for naming the
HBase column.
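To make that note concrete, here is an illustrative Python sketch (not Hive code) of how each map key is turned into an HBase column name under the cf family, which is why the keys must be strings:

```python
family = "cf"
row = {"val_100": 100}  # one Hive MAP value for one row

# Each map key becomes the qualifier of an HBase column in the family;
# the map value becomes that cell's value.
columns = {family + ":" + key for key in row} and \
          {family + ":" + k: v for k, v in row.items()}
print(columns)
```

A non-string key would have no natural representation as a column qualifier, so Hive rejects it.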
