| 2014, Cognizant
What is Hive?
Why Hive?
Hive Architecture
Hive Components
Metastore
Implements metastore server
Query Processor
Processing framework that translates HQL into MapReduce jobs
Hive SerDe
Hive Metastore
Embedded (Default)
Local
Remote
Hive Metastore - Embedded
Hive Metastore - Local
Each Hive client opens its own connection to the datastore and queries
it directly
Hive Metastore - Remote
All Hive clients connect to the metastore server, and the server
queries the datastore for metadata
Hive Workflow
Hive Execution Workflow
Hive Connection
Command Line Interface
$HIVE_HOME/bin/hive
Hive Server 2
$HIVE_HOME/bin/hiveserver2
Beeline
Beeline is started with the JDBC URL of HiveServer2, which
depends on the port where HiveServer2 was started
$HIVE_HOME/bin/beeline -u jdbc:hive2://localhost:10000
Beeswax
Hive Data Model
Data Unit
Data Types
Primitive types
tinyint
smallint
int
bigint
boolean
string
float
binary
double
timestamp
Complex types
arrays
structs
maps
union
Data Types
name map('first', 'Adams', 'last', 'John')
names array('Hello', 'World')
names[1] = 'World'
File Format
TextFile
Each record is a line in the file
Suitable for sharing data with other tools
Can be viewed and edited manually
SequenceFile
Flat files that store binary key-value pairs
Supports uncompressed, record-compressed and block-compressed
formats
RCFile
Stores the columns of a table in a record-columnar layout
ORC File
Optimized Row Columnar file format
Hive Query Language
Subset of SQL
Hive Databases
Hive Database Commands
Creating Database
CREATE DATABASE [IF NOT EXISTS] mydb1;
Listing Databases
SHOW DATABASES;
Describing database
DESCRIBE DATABASE mydb1;
DESCRIBE DATABASE EXTENDED mydb1;
Using Database
USE mydb1;
Hive Database Commands
Dropping Database
DROP DATABASE [IF EXISTS] mydb1;
DROP DATABASE [IF EXISTS] mydb1 CASCADE;
Altering Databases
ALTER DATABASE mydb1 SET DBPROPERTIES ('edited-by' = 'CTS');
Hive Tables
Managed tables
External tables
Managed Table
Not a good choice for sharing the data with other tools/apps.
External Table
When the table is dropped, only the metadata is deleted; the data
itself is left in place
Table Commands
Describe Table
DESCRIBE employees;
DESCRIBE employees.name;
Alter Table
ALTER TABLE employees ADD COLUMNS (lastname STRING);
Drop Table
DROP TABLE employees;
Add Partition
Hive Partitions
Static Partition
Dynamic Partition
Partition names are determined based on the partition column
values
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
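How partition names follow from column values can be sketched in a few lines of Python. This is only a model, not Hive code: the function name, the sample rows and the column names (country, state) are made up for illustration. It relies on Hive's key=value directory layout for partitions.

```python
from collections import defaultdict

def dynamic_partition(rows, partition_cols):
    """Group rows under directory names derived from partition column values."""
    partitions = defaultdict(list)
    for row in rows:
        # Hive lays partitions out as nested key=value directories.
        path = "/".join(f"{col}={row[col]}" for col in partition_cols)
        # Partition values live in the path, so drop them from the stored row.
        data = {k: v for k, v in row.items() if k not in partition_cols}
        partitions[path].append(data)
    return dict(partitions)

# Hypothetical sample rows
rows = [
    {"name": "Ann", "country": "US", "state": "CA"},
    {"name": "Bob", "country": "US", "state": "NY"},
    {"name": "Cid", "country": "IN", "state": "KA"},
    {"name": "Dee", "country": "US", "state": "CA"},
]
result = dynamic_partition(rows, ["country", "state"])
for path in sorted(result):
    print(path, [r["name"] for r in result[path]])
```

With a static partition the path would be given by the user in the statement; here it is derived row by row from the data.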
Hive Bucketing
Users can specify the number of buckets for their data set
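A rough Python model of bucket assignment, assuming the usual rule that a row goes to bucket hash(column) mod number-of-buckets and that the hash of an integer is the integer itself. The ids and the bucket count below are hypothetical.

```python
def bucket_for(value, num_buckets):
    """Assumed rule: bucket = hash(value) mod num_buckets (hash of an int = the int)."""
    return value % num_buckets

NUM_BUCKETS = 4                              # number of buckets chosen by the user
user_ids = [101, 102, 103, 104, 105, 108]    # hypothetical bucketing column values

buckets = {b: [] for b in range(NUM_BUCKETS)}
for uid in user_ids:
    buckets[bucket_for(uid, NUM_BUCKETS)].append(uid)

for b, ids in buckets.items():
    print(f"bucket {b}: {ids}")
```

The same value always lands in the same bucket, which is what makes sampling and bucketed joins possible.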
Hive DML Data Load
Hive supports Data load from Local File System and HDFS
From HDFS
LOAD DATA INPATH '/user/data/emp.txt' INTO TABLE employee;
Schema on Read
On load, Hive does not check that the files in the table directory
conform to the table schema
Any mismatch surfaces only at query time (typically as NULL values or errors)
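Schema on read can be modelled in Python as "store the raw lines untouched, apply the schema only when reading". The schema, the delimiter and the sample lines are hypothetical, and turning an unparseable field into None (NULL) is an assumption about how a text table typically behaves, not a statement about every SerDe.

```python
# Hypothetical table schema: (column name, type to apply at read time)
SCHEMA = [("id", int), ("name", str), ("salary", float)]

def read_row(line, schema=SCHEMA, delimiter=","):
    """Apply the schema to one raw line at query time."""
    fields = line.rstrip("\n").split(delimiter)
    row = {}
    for i, (col, cast) in enumerate(schema):
        try:
            row[col] = cast(fields[i])
        except (IndexError, ValueError):
            # Missing or malformed field: modelled here as NULL.
            row[col] = None
    return row

# "Loading" just keeps the lines as they are; nothing is validated here.
raw_lines = ["1,Ann,5000.0", "two,Bob,4200.5", "3,Cid"]

# The mismatch only shows up when the data is read.
rows = [read_row(line) for line in raw_lines]
for row in rows:
    print(row)
```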
Hive DML Data Load
Hive also supports loading data from one table into another, or
at table creation time
Hive DML Select
HQL retrieves data with SQL-style SELECT statements
Examples:
SELECT * FROM employee;
SELECT * FROM employee LIMIT 5;
SELECT * FROM employee WHERE name LIKE 'M%';
Hive DML Select
Supports
GROUP BY
HAVING
ORDER BY
SORT BY
DISTRIBUTE BY
Hive DML Select
ORDER BY
Performs a total ordering of the query result set
All rows pass through a single reducer, so queries can run long
SORT BY
Sorts the data within each reducer in ASC or DESC order
Each reducer's output is sorted, but the overall result is not
DISTRIBUTE BY
Used together with SORT BY
Controls how map output is divided among reducers
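The difference between the three clauses can be sketched with a small Python model. It is only an illustration: the rows are hypothetical (dept_id, salary) pairs, and the reducer for a row is assumed to be dept_id mod the number of reducers.

```python
def order_by(rows, sort_key):
    """ORDER BY: everything goes through one reducer, giving one total order."""
    return sorted(rows, key=sort_key)

def distribute_sort_by(rows, dist_key, sort_key, num_reducers):
    """DISTRIBUTE BY picks the reducer; SORT BY sorts inside each reducer."""
    reducers = [[] for _ in range(num_reducers)]
    for row in rows:
        # Assumed partitioning rule for integer keys: key mod num_reducers.
        reducers[dist_key(row) % num_reducers].append(row)
    return [sorted(r, key=sort_key) for r in reducers]

# Hypothetical (dept_id, salary) rows
rows = [(1, 300), (2, 100), (1, 200), (2, 400), (3, 50)]

total = order_by(rows, sort_key=lambda r: r[1])
per_reducer = distribute_sort_by(
    rows, dist_key=lambda r: r[0], sort_key=lambda r: r[1], num_reducers=2
)

print("ORDER BY:", total)
for i, out in enumerate(per_reducer):
    print(f"reducer {i}:", out)
```

Each reducer's list is sorted and all rows of one dept_id sit in the same reducer, but the reducer outputs put end to end are not globally sorted.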
Hive DML JOINS
Hive supports equi-joins only; non-equi joins are not supported
INNER JOIN
LEFT/RIGHT/FULL OUTER JOIN
LEFT SEMI JOIN
CROSS JOIN
When joining 3 or more tables, if every join condition uses the same
key, a single MapReduce job is used
In every MR join, the last table is streamed through the reducer while
the other tables are buffered in memory
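The buffering and streaming behaviour can be pictured with a simplified Python model of an inner join on one shared key. This is a sketch and not Hive's implementation: a real reducer would buffer only the rows for the key it is currently processing, and the table contents are hypothetical.

```python
from collections import defaultdict

def reduce_side_join(buffered_tables, streamed_table):
    """Inner join on a shared key: buffer the earlier tables, stream the last."""
    buffers = []
    for table in buffered_tables:
        index = defaultdict(list)
        for key, value in table:
            index[key].append(value)     # held in memory
        buffers.append(index)

    for key, value in streamed_table:    # read once, row by row, never stored
        partial = [()]
        for index in buffers:
            partial = [p + (v,) for p in partial for v in index.get(key, [])]
        for p in partial:
            yield (key,) + p + (value,)

# Hypothetical tables of (key, value) pairs, all joined on the same key
employees = [(1, "Ann"), (2, "Bob")]
departments = [(1, "Sales"), (2, "Ops")]
salaries = [(1, 5000), (2, 4200), (3, 100)]   # last table in the join: streamed

joined = list(reduce_side_join([employees, departments], salaries))
print(joined)
```

Since only the last table avoids being held in memory, the model suggests placing the largest table last in the join.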
Examples
Join Types
Hive DML VIEWS
Examples
Creating view
Hive DML INDEX
Examples
CREATE INDEX emp_idx ON
TABLE employee (name)
AS 'COMPACT'
WITH DEFERRED REBUILD;
Hive Functions
UDF
Example:
o Mathematical: round, floor, ceil
o String: ascii, concat, lower, upper
o Date: year, month
UDAF
Example:
o count, max, min, sum
UDTF
Example:
o explode: takes an array as input and produces each array
element as a separate row
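What explode does to its input can be shown with a short Python equivalent. It is a model of the behaviour only; the function and the sample arrays are illustrative.

```python
def explode(arrays):
    """One input array in, one output row per element out."""
    for arr in arrays:
        for element in arr:
            yield (element,)

# Hypothetical array column with three input rows
column = [["Hello", "World"], ["Hive"], []]
exploded = list(explode(column))
for row in exploded:
    print(row)
```

Three input rows become three output rows here only by coincidence: the two-element array yields two rows and the empty array yields none.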
Custom Map Reduce
Hive supports custom mapper and reducer scripts that override the
default MapReduce code generated by Hive
Example
FROM (
  FROM pv_users
  MAP pv_users.userid, pv_users.date
  USING 'map_script'
  CLUSTER BY key) map_output
INSERT OVERWRITE TABLE pv_users_reduced
  REDUCE map_output.key, map_output.value
  USING 'reduce_script'
  AS date, count;
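A possible shape for 'map_script', written in Python. The slides do not show the script, so this is a guess at what it might look like. It assumes Hive streams each row to the script as one tab-separated line on standard input and reads tab-separated key and value back from standard output. The name run and the sample rows are hypothetical.

```python
import io
import sys

def run(stdin, stdout):
    """Read 'userid<TAB>date' lines and emit 'date<TAB>userid' (key, value)."""
    for line in stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            continue                      # skip malformed rows
        userid, date = fields
        # The date becomes the key, so CLUSTER BY key groups rows per date.
        stdout.write(f"{date}\t{userid}\n")

# Used as a real script, the last line would be: run(sys.stdin, sys.stdout)
# Here the script is exercised on a few hypothetical rows instead.
sample = io.StringIO("u1\t2014-01-01\nu2\t2014-01-01\nbad line\n")
out = io.StringIO()
run(sample, out)
print(out.getvalue(), end="")
```

A matching 'reduce_script' would read the grouped key and value lines and emit one date and count per key.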
THANK YOU