29 December 2016

Hive Real Life Use Cases

Practice Query Creation


This Big Data blog is good practice for Hive beginners who want to get
comfortable writing queries. By the end, you will be able to create a table,
load data into it, and run analytical queries on the datasets provided in
these Hive real-life use cases.

The first part of this blog works with a real-life style dataset of petrol
suppliers, and the second part covers Olympic records.
The sample petrol dataset is modeled on today's rate of petrol consumption,
but it is not actual data. It has been modified and published for learning
purposes only.

In this blog we will compare the top petrol sellers of the world and get
some hands-on experience in Hive. Go to the link if you need to download
and set up Hive.

PETROL:
DATA SET: https://drive.google.com/open?id=0B1QaXx7tpw3SMTBqLUQwX0lOWnM

DATA SET DESCRIPTION:


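Column No. and name (names taken from the CREATE TABLE statement below)

Column 1: distributer_id
Column 2: distributer_name
Column 3: amt_IN
Column 4: amy_OUT
Column 5: vol_IN
Column 6: vol_OUT
Column 7: year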

Creation of Table in Hive and Loading of data


create table petrol (distributer_id STRING, distributer_name STRING,
amt_IN STRING, amy_OUT STRING, vol_IN INT, vol_OUT INT, year INT)
row format delimited fields terminated by ',' stored as textfile;

load data local inpath '/home/acadgild/Downloads/petrol.txt' into table petrol;
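
Before moving on to the questions, it can help to confirm that the load worked. The following is a minimal sketch, assuming the petrol table was created and loaded as above; the alias row_count is only illustrative and not part of the original post.

```sql
-- Show the column names and types Hive recorded for the table.
DESCRIBE petrol;

-- Peek at a few rows; NULLs in the INT columns usually point to a
-- delimiter mismatch or non-numeric values in the file.
SELECT * FROM petrol LIMIT 5;

-- Count the loaded rows (row_count is an illustrative alias).
SELECT COUNT(*) AS row_count FROM petrol;
```

If the row count is zero or the numeric columns are all NULL, recheck the file path and the field delimiter before running the analysis queries.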

1) What is the total volume of petrol sold by each distributor?
SELECT distributer_name, SUM(vol_OUT) FROM petrol GROUP BY distributer_name;
2) Which are the top 10 distributor IDs by petrol sold? Also display the
volume of petrol each of them sold.

SELECT distributer_id, vol_OUT FROM petrol ORDER BY vol_OUT DESC LIMIT 10;
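
The query for question 2 ranks individual rows. If the dataset holds several rows per distributor (for example one per year, which is an assumption about the data), a variant that sums the volume per distributor first may be closer to what the question asks. A sketch, with total_vol_out as an illustrative alias:

```sql
-- Sum the outgoing volume per distributor, then keep the 10 largest totals.
SELECT distributer_id,
       SUM(vol_OUT) AS total_vol_out   -- illustrative alias
FROM petrol
GROUP BY distributer_id
ORDER BY total_vol_out DESC
LIMIT 10;
```

Both forms return the same result when each distributor appears only once in the table.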

3) Find the names of the 10 distributors who sold the least petrol.
SELECT distributer_id, distributer_name, vol_OUT FROM petrol ORDER BY vol_OUT LIMIT 10;

4) Try one yourself

The constraint for this query: a difference between vol_IN and vol_OUT
greater than 500 is considered illegal. As we have seen, all distributors
receive petrol in every cycle.

List all distributors who have such a difference, along with the year and
the size of the difference in that year.

Hint: (vol_IN-vol_OUT)>500

xxx xxx xxx

We all know how exciting the Olympics are every time they are held, and we
all love them. The Olympic Games are considered the world's foremost sports
competition, with more than 200 nations participating. The Games are held
every four years, with the Summer and Winter Games alternating so that each
occurs every four years but two years apart.
The sample dataset is modeled on real Olympic competition, but it is not
actual data. It has been modified and published for learning purposes only.

Olympic:
DATA SET: https://drive.google.com/file/d/0B1QaXx7tpw3SaEE3bEFTQTMzNzg/view?usp=sharing

Olympic Data analysis using Hive

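Column No. and name (names taken from the CREATE TABLE statement below)

Column 1: athelete
Column 2: age
Column 3: country
Column 4: year
Column 5: closing
Column 6: sport
Column 7: gold
Column 8: silver
Column 9: bronze
Column 10: total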

Creation of Table in Hive and Loading of data

create table olympic (athelete STRING, age INT, country STRING, year STRING,
closing STRING, sport STRING, gold INT, silver INT, bronze INT, total INT)
row format delimited fields terminated by '\t' stored as textfile;

load data local inpath '/home/acadgild/Downloads/olympic_data.csv' into table olympic;
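
Because the file has a .csv extension while the table is declared as tab-delimited, a quick sanity check after loading may be worthwhile. If the delimiter does not match the file, Hive typically puts the whole line into the first column and leaves the remaining columns NULL. A minimal sketch, where total_rows and rows_missing_country are illustrative aliases:

```sql
-- Compare the number of loaded rows with the number of rows whose
-- country column could not be parsed.
SELECT COUNT(*) AS total_rows,
       SUM(CASE WHEN country IS NULL THEN 1 ELSE 0 END) AS rows_missing_country
FROM olympic;
```

If rows_missing_country is close to total_rows, the field delimiter in the table definition probably does not match the file.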

1) Using the dataset, list the total number of medals won by each country in
swimming.
select country, SUM(total) from olympic where sport = 'Swimming' GROUP BY country;
2) Display the number of medals India won, year by year.

select year, SUM(total) from olympic where country = 'India' GROUP BY year;

3) Find the total number of medals each country won, and display the country
name along with its total.

select country,SUM(total) from olympic GROUP BY country;

4) Find the number of gold medals each country won.

select country,SUM(gold) from olympic GROUP BY country;
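
The same GROUP BY pattern extends to a per-country medal breakdown. The following is a sketch only, assuming the olympic table defined earlier; the aliases and the choice of a top-10 cut-off are illustrative and not part of the original post.

```sql
-- Medal breakdown per country, ordered by gold medals, top 10 only.
SELECT country,
       SUM(gold)   AS gold_medals,
       SUM(silver) AS silver_medals,
       SUM(bronze) AS bronze_medals
FROM olympic
GROUP BY country
ORDER BY gold_medals DESC
LIMIT 10;
```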


5) Try one yourself

Which countries won medals in Shooting, classified by year?

I hope this blog helped you learn Hive through real-life scenarios of the
kind we come across every day.

While I leave you with a simple query to solve, keep visiting our site
ACADGILD for more practice queries on Hive and other trending
technologies.

Prateek Kumar

Prateek Kumar has been working with AcadGild as an Associate analyst with rich expertise in Big
data and Hadoop development and Administration. He has been a Java enthusiast and been
associated with implementation of many Big data projects. AcadGild was founded with the vision of
"Learn. Do. Earn". We provide skill development courses based on current industry needs. But what
sets us apart is earning opportunities we provide after successful completion of course. We also
provide live mentoring and 24x7 support. Our mentors are industry thought leaders in their
respective fields. We provide courses for Android Programming, Big Data, Front End, Full Stack,
AngularJS, NodeJS and Android Programming for children.
