
SQL on Hadoop - Analyzing Big Data with Hive

Ahmad Alkilani
www.pluralsight.com
Introduction to Hadoop

Ahmad Alkilani
www.pluralsight.com
Outline

 Why Hadoop? Motivation
 Hadoop architecture and distributed computing
   HDFS
   MapReduce
 Getting up and running
Motivation for Hadoop

[Diagram: a single server with its CPU, memory, and disk]
Google
 ~40 billion web pages x 30 KB each ≈ 1 petabyte
 Today's average disk reads at about 120 MB/sec
 At that rate, a single drive would take a little over 3 months to read the web!
 Approximately 1,000 drives to store and use the data
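
Checking the arithmetic (the 1 TB drive size is an assumption, not stated on the slide):

  40 x 10^9 pages x 30 KB ≈ 1.2 x 10^15 bytes ≈ 1.2 petabytes
  1.2 x 10^15 bytes / 120 MB/sec ≈ 10^7 seconds ≈ 116 days, a little over 3 months
  1.2 petabytes / 1 TB per drive ≈ 1,200 drives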
Distributed Computing Challenges

 Scale out with distributed computing
 Hadoop based on Google's implementation
 Volume, Velocity, and Variety
 Recover from failures
 Shared nothing architecture
 Hadoop file system (HDFS)
 MapReduce

[Diagram: a Name Node and a Job Tracker coordinating many Data Nodes, each with its own CPU and disk]
Hadoop File System (HDFS)
[Diagram: a file split into 64 MB blocks, with replicas spread across Data Nodes in Server Rack A and Server Rack B]
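
The 64 MB block size is the classic HDFS default and is configurable in hdfs-site.xml (shown for illustration; the property was named dfs.block.size in Hadoop 1.x and dfs.blocksize in later releases):

  <property>
    <name>dfs.block.size</name>
    <value>67108864</value> <!-- 64 MB expressed in bytes -->
  </property>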
MapReduce

 One mapper per block
 Parallel, distributed processing given a file that is split into blocks across multiple servers
 Each Map task runs on a Data Node and emits key-value pairs from its block of data
 Shuffle and Sort groups the pairs by key, so all values for one key arrive at a single reducer
 Reducers (Reducer A, Reducer B) write their output to a folder in HDFS

[Diagram: three Data Nodes each running a Map task over a block of data; the emitted key-value pairs (keys 5, 9, 2, 3, 7 in the figure) pass through Shuffle and Sort on their way to Reducer A and Reducer B]
Word Count Example
Input to the mappers (key = byte offset of the line, value = the line of text):

  (offset, "This is the first line")
  (offset, "This is the second line")

Each mapper emits a (word, 1) pair for every word on its line:

  Mapper 1: (This, 1) (is, 1) (the, 1) (first, 1) (line, 1)
  Mapper 2: (This, 1) (is, 1) (the, 1) (second, 1) (line, 1)

Shuffle and Sort delivers all pairs for a given word to the same reducer:

  Reducer A receives: (This, 1) (This, 1) (the, 1) (the, 1) (second, 1) (first, 1)
  Reducer B receives: (line, 1) (line, 1) (is, 1) (is, 1)

Each reducer sums the counts for its keys:

  Reducer A: first 1, second 1, the 2, This 2
  Reducer B: is 2, line 2
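
The same flow expressed in code, using the org.apache.hadoop.mapreduce Java API (a minimal sketch for illustration, not taken from the course; it assumes a Hadoop 2.x-style Job setup, and the class names are illustrative):

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: the input key is the line's byte offset, the value is the line
  // itself. Emits (word, 1) for every word, mirroring the table above.
  public static class TokenizerMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      StringTokenizer tokens = new StringTokenizer(line.toString());
      while (tokens.hasMoreTokens()) {
        word.set(tokens.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: after Shuffle and Sort, receives a word together with all of
  // its 1s and emits (word, total).
  public static class SumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable count : counts) {
        sum += count.get();
      }
      context.write(word, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // input folder in HDFS
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output folder in HDFS
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}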
Hadoop Demo

 Basic commands using HDFS
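
The demo itself uses the hadoop fs command line; the same basic operations are also available from the Java FileSystem API. A minimal sketch (the paths are illustrative; the matching CLI command is noted above each call):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsBasics {
  public static void main(String[] args) throws Exception {
    // Connects to the cluster named by the default filesystem setting
    // in core-site.xml on the classpath.
    FileSystem fs = FileSystem.get(new Configuration());

    // hadoop fs -mkdir /user/demo
    fs.mkdirs(new Path("/user/demo"));

    // hadoop fs -put localfile.txt /user/demo
    fs.copyFromLocalFile(new Path("localfile.txt"), new Path("/user/demo/localfile.txt"));

    // hadoop fs -ls /user/demo
    for (FileStatus status : fs.listStatus(new Path("/user/demo"))) {
      System.out.println(status.getPath());
    }

    // hadoop fs -get /user/demo/localfile.txt copy.txt
    fs.copyToLocalFile(new Path("/user/demo/localfile.txt"), new Path("copy.txt"));

    // hadoop fs -rm /user/demo/localfile.txt  (false = not recursive)
    fs.delete(new Path("/user/demo/localfile.txt"), false);
  }
}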
Environment Setup

 Course focus is on development
 Use a Virtual Machine image to follow along with the examples
 Pseudo-distributed sandbox
   Replication factor set to 1 (see the snippet after this list)
   Name Node, Job Tracker, Data Node, and Task Tracker on a single machine
 Demos use Hortonworks' HDP sandbox
   Hive 0.10, 0.11 and above
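
A pseudo-distributed setup gets its single-copy behavior from hdfs-site.xml (the sandbox ships preconfigured, so this snippet is shown only for reference):

  <property>
    <name>dfs.replication</name>
    <value>1</value> <!-- one Data Node, so keep one copy of each block -->
  </property>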
Summary

 Distributed computing and scaling out to solve big data problems
 Key system characteristics
   Built to handle failures
   Move processing to the data
   Failures are inevitable; embracing this allows for solutions built on commodity servers
 MapReduce
   A mapper is assigned to each block of data
   Key-value pairs are both the input to and the output of each phase
   Keys must implement the WritableComparable interface (see the sketch below)
   Shuffle and Sort plays a key role in solving the problem
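
To illustrate the WritableComparable point: the built-in key types (Text, IntWritable, LongWritable) already implement it, and a custom key only has to provide serialization and ordering. A minimal sketch with a hypothetical WordKey class:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical custom key. Hadoop serializes it between the map and reduce
// phases (write/readFields) and orders it during Shuffle and Sort (compareTo).
public class WordKey implements WritableComparable<WordKey> {
  private String word = "";

  public WordKey() {}                          // Hadoop needs a no-arg constructor
  public WordKey(String word) { this.word = word; }

  @Override public void write(DataOutput out) throws IOException { out.writeUTF(word); }
  @Override public void readFields(DataInput in) throws IOException { word = in.readUTF(); }
  @Override public int compareTo(WordKey other) { return word.compareTo(other.word); }

  // hashCode drives the default HashPartitioner's choice of reducer.
  @Override public int hashCode() { return word.hashCode(); }
  @Override public String toString() { return word; }
}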
