Big Data

Uploaded by Naimish Solanki


Big Data and Hadoop

Rahul Agarwal

irahul.com
Amr Awadallah: http://www.sfbayacm.org/wp/wp-content/uploads/2010/01/amr-hadoop-acm-dm-sig-jan2010.pdf
Hadoop: http://hadoop.apache.org/
Computerworld: http://www.computerworld.com/s/article/350908/5_Indispensable_IT_Skills_of_the_Future
Ashish Tushoo: http://www.sfbayacm.org/wp/wp-content/uploads/2010/01/sig_2010_v21.pdf
Big data: http://en.wikipedia.org/wiki/Big_data
Chukwa: http://www.cca08.org/papers/Paper-13-Ariel-Rabkin.pdf
Dean, Ghemawat: http://labs.google.com/papers/mapreduce.html
Big Data Problem
What is Hadoop
HDFS
MapReduce
HBase
PIG
HIVE
Chukwa
ZooKeeper
Q&A
Extremely large datasets that are hard to handle with relational databases
Storage/Cost
Search/Performance
Analytics and Visualization
Need for parallel processing on hundreds of machines
ETL cannot complete within a reasonable time
Once a run takes longer than 24 hours, processing never catches up

System shall manage and heal itself
Automatically and transparently route around failure
Speculatively execute redundant tasks if certain nodes are detected to be slow
Performance shall scale linearly
Proportional change in capacity with resource change
Compute should move to data
Lower latency, lower bandwidth
Simple core, modular and extensible
A scalable, fault-tolerant grid operating system for data storage and processing
Commodity hardware
HDFS: Fault-tolerant, high-bandwidth clustered storage
MapReduce: Distributed data processing
Works with structured and unstructured data
Open source, Apache license
Master/slave architecture (the master is the NameNode)
Hadoop stack (diagram):
HDFS (Hadoop Distributed File System)
HBase (key-value store)
MapReduce (Job Scheduling/Execution System), with Streaming/Pipes APIs
Pig (Data Flow), Hive (SQL)
BI Reporting, ETL Tools
Alongside the stack: ZooKeeper (Coordination), Chukwa (Monitoring)

HDFS block size = 64 MB
HDFS replication factor = 3
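
As a rough illustration of what these two settings imply, the following sketch works out how many blocks and block replicas a single file produces. The 1000 MB file size and the function name hdfs_footprint are assumptions made for the example, not values from the slides.

```python
import math

BLOCK_SIZE_MB = 64        # block size from the slide
REPLICATION_FACTOR = 3    # replication factor from the slide


def hdfs_footprint(file_size_mb):
    """Return (blocks, block replicas, raw MB stored) for one file."""
    blocks = math.ceil(file_size_mb / BLOCK_SIZE_MB)
    replicas = blocks * REPLICATION_FACTOR
    # The final block only occupies as much space as the data it holds,
    # so raw usage is the file size times the replication factor.
    raw_mb = file_size_mb * REPLICATION_FACTOR
    return blocks, replicas, raw_mb


# Hypothetical 1000 MB file: 16 blocks, 48 replicas, 3000 MB of raw storage.
print(hdfs_footprint(1000))
```

The point of the arithmetic is that storage cost grows with the replication factor, while the block count determines how many pieces can be read or processed in parallel.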
Patented Google framework
Distributed processing of large datasets

map (in_key, in_value) -> list(out_key, intermediate_value)
reduce (out_key, list(intermediate_value)) -> list(out_value)
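
The two signatures above can be made concrete with a small single-process sketch. It is only a model of the programming pattern, not Hadoop's API: the names map_fn, reduce_fn, and run_mapreduce and the word-count task are assumptions chosen for illustration, and everything runs in memory on one machine.

```python
from collections import defaultdict


def map_fn(in_key, in_value):
    """map(in_key, in_value) -> list of (out_key, intermediate_value)."""
    return [(word, 1) for word in in_value.split()]


def reduce_fn(out_key, intermediate_values):
    """reduce(out_key, list(intermediate_value)) -> list(out_value)."""
    return [sum(intermediate_values)]


def run_mapreduce(records):
    groups = defaultdict(list)
    # Map phase: apply map_fn to every input record.
    for in_key, in_value in records:
        for out_key, value in map_fn(in_key, in_value):
            # Shuffle: group intermediate values by their output key.
            groups[out_key].append(value)
    # Reduce phase: one reduce_fn call per distinct key.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Hypothetical input: (line id, line text) pairs.
records = [("line1", "big data big cluster"), ("line2", "data node")]
counts = run_mapreduce(records)
print(counts)
```

In a real cluster the map calls run in parallel on the nodes holding the input blocks, and the grouping step moves data across the network; the sketch keeps only the data flow.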

Project's goal is the hosting of very large tables - billions of rows X millions of columns - atop clusters of commodity hardware
Hadoop database, open-source version of Google BigTable
Column-oriented
Random access, realtime read/write
Random access performance on par with open source relational databases such as MySQL
High-level language (Pig Latin) for expressing data analysis programs
Compiled into a series of MapReduce jobs
Easier to program
Optimization opportunities

grunt> A = LOAD 'student' USING PigStorage() AS (name:chararray, age:int, gpa:float);
grunt> B = FOREACH A GENERATE name;
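
To show what these two statements compute, here is a rough Python analogue. It is a sketch rather than how Pig executes them (Pig compiles the script into MapReduce jobs). The sample rows and the function names load_students and project_names are hypothetical; the tab delimiter matches the default used by PigStorage().

```python
import csv
import io

# Hypothetical contents of the 'student' file, one tab-separated record per line.
STUDENT_DATA = "alice\t20\t3.5\nbob\t22\t3.1\n"


def load_students(text):
    """Rough analogue of: A = LOAD 'student' ... AS (name, age, gpa)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    # Apply the declared schema: chararray, int, float.
    return [(name, int(age), float(gpa)) for name, age, gpa in reader]


def project_names(relation):
    """Rough analogue of: B = FOREACH A GENERATE name."""
    return [(name,) for name, _age, _gpa in relation]


A = load_students(STUDENT_DATA)
B = project_names(A)
print(B)
```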
Managing and querying structured data
MapReduce for execution
SQL-like syntax
Extensible with types, functions, scripts
Metadata stored in an RDBMS (e.g., MySQL)
Joins, Group By, Nesting
Optimizer to reduce the number of MapReduce jobs required

hive> SELECT a.foo FROM invites a WHERE a.ds='<DATE>';
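
The shape of this query can be tried locally with Python's built-in sqlite3 module. This is only a stand-in to show the result of the filter and projection: SQLite is not Hive, nothing here runs on MapReduce, and the table columns, sample rows, and the date value used in place of <DATE> are all assumptions made for the example.

```python
import sqlite3


def select_foo_for_date(ds):
    """Run the slide's query shape against a small in-memory table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and rows for the 'invites' table.
    conn.execute("CREATE TABLE invites (foo INTEGER, bar TEXT, ds TEXT)")
    conn.executemany(
        "INSERT INTO invites VALUES (?, ?, ?)",
        [(1, "a", "2008-08-15"), (2, "b", "2008-08-15"), (3, "c", "2008-08-08")],
    )
    rows = conn.execute(
        "SELECT a.foo FROM invites a WHERE a.ds = ? ORDER BY a.foo", (ds,)
    ).fetchall()
    conn.close()
    return [foo for (foo,) in rows]


print(select_foo_for_date("2008-08-15"))
```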
A highly available, scalable, distributed service for configuration, consensus, group membership, leader election, naming, and coordination
Cluster management
Load balancing
JMX monitoring
Data collection system for monitoring distributed systems
Agents to collect and process logs
Monitoring and analysis
Hadoop Infrastructure Care Center
Hadoop
Affordable Storage/Compute
Structured or Unstructured
Resilient Auto Scalability

Relational Databases
Interactive response times
ACID
Structured data
Cost/Scale prohibitive
