HADOOP Course Content
By Mr. Kalyan, 7+ Years of Realtime Exp.
M.Tech, IIT Kharagpur, Gold Medalist.

Introduction to Big Data and Hadoop

Big Data
What is Big Data?
Why are all industries talking about Big Data?
What are the issues in Big Data?
Storage
What are the challenges for storing big data?
Processing
What are the challenges for processing big data?
What technologies support Big Data?
Hadoop
Databases
Traditional
NoSQL

Hadoop
What is Hadoop?
History of Hadoop
Why Hadoop?
Hadoop Use cases

Advantages and Disadvantages of Hadoop
Importance of Different Ecosystems of Hadoop
Importance of Integration with other Big Data solutions
Big Data Real time Use Cases

HDFS (Hadoop Distributed File System)
HDFS Architecture
Name Node
Importance of Name Node
What are the roles of Name Node
What are the drawbacks in Name Node
Secondary Name Node
Importance of Secondary Name Node
What are the roles of Secondary Name Node
What are the drawbacks in Secondary Name Node
Data Node
Importance of Data Node
What are the roles of Data Node
What are the drawbacks in Data Node
Data Storage in HDFS
How blocks are stored in Data Nodes
How replication works in Data Nodes
How it is related to MapReduce split size
How to write the files in HDFS
How to read the files in HDFS

#204, 2nd Floor, Annapurna Block, Aditya Enclave, Ameerpet, Hyderabad.
Ph: 040 6514 2345, 0970 320 2345. E-mail: info@orienit.com www

HDFS Replication factor
Importance of HDFS Replication factor in production environment
Can we change the replication for a particular file or folder
Can we change the replication for all files or folders
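The effect of the replication factor on cluster storage can be sketched with simple arithmetic. This is only an illustration: the function name is hypothetical, and the 128 MB block size and replication factor of 3 are assumed values (the usual Hadoop 2.x defaults), not settings taken from this course.

```python
import math


def hdfs_storage(file_size_mb, block_size_mb=128, replication=3):
    """Return (blocks, block replicas, raw MB stored) for one file.

    block_size_mb and replication are assumed defaults for illustration.
    """
    # A file is cut into fixed-size blocks; the last block may be smaller.
    blocks = math.ceil(file_size_mb / block_size_mb)
    # Every block is stored `replication` times on different Data Nodes.
    replicas = blocks * replication
    # The last block is not padded, so raw usage is file size x replication.
    raw_mb = file_size_mb * replication
    return blocks, replicas, raw_mb


print(hdfs_storage(1000))                 # (8, 24, 3000)
print(hdfs_storage(1000, replication=2))  # (8, 16, 2000)
```

On a real cluster the replication of an existing file or folder is normally changed with the `hdfs dfs -setrep` command, while the cluster-wide default comes from the `dfs.replication` property.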
Accessing HDFS
CLI (Command Line Interface) using hdfs commands
Java Based Approach
HDFS Commands
Importance of each command
How to execute the command
HDFS admin related commands explanation
Configurations
Can we change the existing configurations of hdfs or not?
Importance of configurations
How to overcome the Drawbacks in HDFS
Name Node failures
Secondary Name Node failures
Data Node failures
Where does it fit and where doesn't it fit?
Exploring the Apache HDFS Web UI
How to configure the Hadoop Cluster
How to add the new nodes (Commissioning)
How to remove the existing nodes (De-Commissioning)
How to verify the Dead Nodes
How to start the Dead Nodes
Hadoop 2.x.x version features
Introduction to Namenode federation
Introduction to Namenode High Availability
Difference between Hadoop 1.x.x and Hadoop 2.x.x versions

MAPREDUCE
Map Reduce architecture
JobTracker
Importance of JobTracker
What are the roles of JobTracker
What are the drawbacks in JobTracker
TaskTracker
Importance of TaskTracker
What are the roles of TaskTracker
What are the drawbacks in TaskTracker
Key Value Text Input Format
Sequence File Input Format
NLine Input Format
Importance of Input Format in Map Reduce
How to use Input Format in Map Reduce
How to write custom Input Formats and their Record Readers
Output Formats in Map Reduce
Text Output Format
Sequence File Output Format
Importance of Output Format in Map Reduce
How to use Output Format in Map Reduce
How to write custom Output Formats and their Record Writers
Mapper
What is a mapper in a Map Reduce Job?
Why do we need a mapper?
What are the Advantages and Disadvantages of a mapper?
Writing mapper programs
Reducer
What is a reducer in a Map Reduce Job?
Why do we need a reducer?
What are the Advantages and Disadvantages of a reducer?
Writing reducer programs
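The roles of the mapper and the reducer can be sketched with a word count written in plain Python. This is a single-process simulation under stated assumptions, not the Hadoop Java API: the names `mapper`, `shuffle`, `reducer` and `run_job` are hypothetical, and the shuffle step stands in for the grouping and sorting that the framework performs between the two phases.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key and sort the keys, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reducer(key, values):
    """Reduce phase: collapse all values of one key into a single total."""
    return key, sum(values)


def run_job(lines):
    mapped = [pair for line in lines for pair in mapper(line)]
    return dict(reducer(key, values) for key, values in shuffle(mapped).items())


print(run_job(["big data big cluster", "data node"]))
# {'big': 2, 'cluster': 1, 'data': 2, 'node': 1}
```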
Combiner
What is a combiner in a Map Reduce Job?
Why do we need a combiner?
What are the Advantages and Disadvantages of a Combiner?
Writing Combiner programs
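The benefit of a combiner can be sketched by counting how many records would cross the network with and without one. This is a plain Python simulation with hypothetical names (`map_split`, `combine`, `reduce_all`); it assumes a sum-style aggregation, which is safe to combine because addition is associative and commutative.

```python
from collections import Counter


def map_split(lines):
    """Map output for one input split: a (word, 1) pair per word."""
    return [(word, 1) for line in lines for word in line.split()]


def combine(pairs):
    """Combiner: pre-aggregate the map output of a single split locally."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return sorted(local.items())


def reduce_all(pairs):
    """Reducer: final totals across all splits."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


splits = [["a b a a"], ["b b a"]]
without = [p for split in splits for p in map_split(split)]
with_combiner = [p for split in splits for p in combine(map_split(split))]

print(len(without), len(with_combiner))  # 7 4
print(reduce_all(without) == reduce_all(with_combiner))  # True
```

The result is identical in both cases; only the number of shuffled records changes. An operation such as an average cannot be combined this naively, because the average of partial averages is not the overall average.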
Partitioner
What is a Partitioner in a Map Reduce Job?
Why do we need a Partitioner?
What are the Advantages and Disadvantages of a Partitioner?
Writing Partitioner programs
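A partitioner decides which reducer receives each key. The sketch below follows the idea behind hash partitioning (hash code, masked to a non-negative value, modulo the number of reducers). It is an illustration in plain Python: `java_string_hash` and `hash_partition` are hypothetical helper names, and the hash imitates Java's `String.hashCode()` for simple text keys, which is an assumption and not necessarily the hash a given Hadoop key type uses.

```python
def java_string_hash(text):
    """Imitate Java's String.hashCode() as a signed 32-bit integer."""
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Convert the unsigned 32-bit value to Java's signed int range.
    return h - 0x100000000 if h >= 0x80000000 else h


def hash_partition(key, num_reducers):
    """Mask off the sign bit, then take the remainder by the reducer count."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers


for key in ["ab", "hadoop", "hive", "pig"]:
    print(key, "-> reducer", hash_partition(key, 4))
```

Because the partition depends only on the key, every record with the same key reaches the same reducer. A custom partitioner replaces `hash_partition` with other logic, for example routing by a key prefix, usually to control skew or output layout.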
Distributed Cache
What is Distributed Cache in a Map Reduce Job?
Importance of Distributed Cache in a Map Reduce job
What are the Advantages and Disadvantages of Distributed Cache?
Writing Distributed Cache programs
Counters
What is a Counter in a Map Reduce Job?
Why do we need Counters in production environment?
How to write Counters in Map Reduce programs
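Counters are named tallies that each task increments and the framework sums for the whole job, which is why they are handy for tracking bad records in production. A minimal simulation in plain Python follows; the counter names, the three-field record layout and the function name are assumptions made for illustration.

```python
from collections import Counter


def map_with_counters(records, counters):
    """Emit (id, amount) for valid records and count valid/malformed ones."""
    for record in records:
        fields = record.split(",")
        if len(fields) != 3:
            counters["MALFORMED_RECORDS"] += 1
            continue  # skip the bad record instead of failing the task
        counters["VALID_RECORDS"] += 1
        yield fields[0], fields[2]


splits = [["1,a,10", "bad"], ["2,b,20", "3,c", "4,d,40"]]
task_counters = [Counter() for _ in splits]  # one counter set per map task

outputs = []
for split, counters in zip(splits, task_counters):
    outputs.extend(map_with_counters(split, counters))

# The job-level view is the sum of the per-task counters.
job_counters = sum(task_counters, Counter())
print(outputs)
print(dict(job_counters))
```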
Importance of Writable and Writable Comparable APIs
How to write custom Map Reduce Keys using Writable Comparable
How to write custom Map Reduce Values using Writable
Joins
Map Side Join
What is the importance of Map Side Join
Where we are using it
Reduce Side Join
What is the importance of Reduce Side Join
Where we are using it
What is the difference between Map Side join and Reduce Side Join?
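The difference between the two join styles can be sketched in plain Python. The sample tables and function names are hypothetical. The map side version assumes one table is small enough to hold in memory on every mapper, while the reduce side version tags records from both inputs, groups them by key (the shuffle) and joins them in the reducer.

```python
from collections import defaultdict


def map_side_join(orders, customers):
    """Join inside the mapper using an in-memory lookup of the small table."""
    lookup = dict(customers)
    return [(cid, lookup[cid], amount) for cid, amount in orders if cid in lookup]


def reduce_side_join(orders, customers):
    """Tag both inputs, group by key, then pair them up per key."""
    groups = defaultdict(lambda: {"C": [], "O": []})
    for cid, name in customers:
        groups[cid]["C"].append(name)    # tag C: customer record
    for cid, amount in orders:
        groups[cid]["O"].append(amount)  # tag O: order record
    joined = []
    for cid in sorted(groups):           # keys reach the reducer in sorted order
        for name in groups[cid]["C"]:
            for amount in groups[cid]["O"]:
                joined.append((cid, name, amount))
    return joined


customers = [(1, "Asha"), (2, "Ravi")]
orders = [(1, 250), (2, 90), (1, 40), (3, 15)]

print(map_side_join(orders, customers))
print(reduce_side_join(orders, customers))
```

Both functions return the same inner-join rows, in a different order. The map side join avoids the shuffle entirely but needs the small table to fit in memory; the reduce side join works for two large inputs at the cost of moving every record across the network.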

Compression techniques
Importance of Compression techniques in production environment
Compression Types
NONE, RECORD and BLOCK
Compression Codecs
Default, Gzip, Bzip2, Snappy and LZO
Enabling and Disabling these techniques for all the Jobs
Enabling and Disabling these techniques for a particular Job
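The trade-off between codecs can be tried out with the two that ship in the Python standard library, gzip and bz2. This is only a sketch: the sample log line is made up, Snappy and LZO are left out because they need third-party libraries, and real ratios depend heavily on the data.

```python
import bz2
import gzip

# Hypothetical, highly repetitive sample data (compresses very well).
data = b"2015-01-01,INFO,hadoop job finished\n" * 2000

gz = gzip.compress(data)
bz = bz2.compress(data)

print("original:", len(data), "bytes")
print("gzip:    ", len(gz), "bytes")
print("bzip2:   ", len(bz), "bytes")

# Both codecs are lossless: decompression restores the exact input.
print(gzip.decompress(gz) == data and bz2.decompress(bz) == data)  # True
```

Size is only one factor when choosing a codec for a job. Compression speed matters too, and so does whether the format can be split across mappers: a bzip2 file can be split, while a plain gzip file cannot.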
Map Reduce Schedulers
FIFO Scheduler
Capacity Scheduler
Fair Scheduler
Importance of Schedulers in production environment
How to use Schedulers in production environment
Map Reduce Programming Model
How to write the Map Reduce jobs in Java
Running the Map Reduce jobs in local mode
Running the Map Reduce jobs in pseudo mode
Running the Map Reduce jobs in cluster mode
Debugging Map Reduce Jobs
How to debug Map Reduce Jobs in Local Mode.
How to debug Map Reduce Jobs in Remote Mode.

YARN (Next Generation Map Reduce)
What is YARN?
What is the importance of YARN?
Where can we use the concept of YARN in Real Time?
What is the difference between YARN and Map Reduce?
Data Locality
What is Data Locality?
Does Hadoop follow Data Locality?
Speculative Execution
What is Speculative Execution?
Does Hadoop follow Speculative Execution?
Map Reduce Commands
Importance of each command

Apache PIG
Introduction to Apache Pig
Map Reduce Vs Apache Pig
SQL Vs Apache Pig
Different data types in Pig
Modes Of Execution in Pig
Local Mode
Map Reduce Mode
Execution Mechanism
Grunt Shell
Script
Embedded
UDFs
How to write the UDFs in Pig
How to use the UDFs in Pig
Importance of UDFs in Pig
Filters
How to write the Filters in Pig
How to use the Filters in Pig
Importance of Filters in Pig
Load Functions
How to write the Load Functions in Pig
How to use the Load Functions in Pig
Importance of Load Functions in Pig
Store Functions
How to use the Store Functions in Pig
Importance of Store Functions in Pig
Transformations in Pig
How to write the complex pig scripts
How to integrate the Pig and Hbase

Apache HIVE
Hive Introduction
Hive architecture
Driver
Compiler
Semantic Analyzer
Hive Integration with Hadoop
Hive Query Language (Hive QL)
SQL VS Hive QL
Hive Installation and Configuration
Hive, Map Reduce and Local-Mode
Hive DDL and DML Operations
Hive Services
CLI
Hiveserver
Metastore
embedded metastore configuration
external metastore configuration
UDFs
How to write the UDFs in Hive
How to use the UDFs in Hive
Importance of UDFs in Hive
UDAFs
How to use the UDAFs in Hive
Importance of UDAFs in Hive
UDTFs
How to use the UDTFs in Hive
Importance of UDTFs in Hive
How to write complex Hive queries
What is Hive Data Model?
Partitions
Importance of Hive Partitions in production environment
Limitations of Hive Partitions
How to write Partitions
Buckets
Importance of Hive Buckets in production environment
How to write Buckets
SerDe
Importance of Hive SerDes in production environment
How to write SerDe programs
How to integrate the Hive and Hbase

Apache Zookeeper
Introduction to zookeeper
Pseudo mode installations
Zookeeper cluster installations
Basic commands execution

Apache Hbase
Hbase introduction
Hbase usecases
Hbase basics
Column families
Scans
Hbase installation
Local mode
Pseudo mode
Cluster mode
Hbase Architecture
Storage

Mapreduce integration
Mapreduce over Hbase
Hbase Usage
Key design
Bloom Filters
Versioning
Coprocessors
Filters
Hbase Clients
REST
Thrift
Hive
Web Based UI
Hbase Admin
Schema definition
Basic CRUD operations

Apache SQOOP
Introduction to Sqoop
MySQL client and Server Installation
Sqoop Installation
How to connect to Relational Database using Sqoop
Sqoop Commands and Examples on Import and Export commands

Apache FLUME
Introduction to Flume
Flume installation
Flume agent usage and Flume examples execution

Apache OOZIE
Introduction to oozie
Oozie installation
Executing oozie workflow jobs
Monitoring Oozie workflow jobs

Apache Mahout
Introduction to mahout
Mahout installation
Mahout examples

Apache Cassandra
Introduction to Cassandra
Cassandra examples

Storm
Introduction to Storm
Storm examples

MongoDB
Introduction to MongoDB
MongoDB installation
MongoDB examples

Apache Nutch
Introduction to Nutch
Nutch Installation
Nutch Examples

Cloudera Distribution
Introduction to Cloudera
Cloudera Installation
Cloudera Certification details
How to use cloudera hadoop
What are the main differences between Cloudera and Apache hadoop

Hortonworks Distribution
Introduction to Hortonworks
Hortonworks Installation
Hortonworks Certification details
How to use Hortonworks hadoop
What are the main differences between Hortonworks and Apache hadoop

Amazon EMR
Introduction to Amazon EMR and Amazon EC2
How to use Amazon EMR and Amazon EC2
Why to use Amazon EMR and Importance of this

Advanced and New technologies architectural discussions
Mahout (Machine Learning Algorithms)
Storm (Real time data streaming)
Cassandra (NOSQL database)
MongoDB (NOSQL database)
Solr (Search engine)
Nutch (Web Crawler)
Lucene (Indexing data)
Ganglia, Nagios (Monitoring tools)
Cloudera, Hortonworks, MapR, Amazon EMR (Distributions)
How to crack the Cloudera certification questions

Pre-Requisites for this Course
Java Basics like OOPS Concepts, Interfaces, Classes and Abstract Classes etc (Free Java classes as part of course)
SQL Basic Knowledge (Free SQL classes as part of course)
Linux Basic Commands (Provided in our blog)

Administration topics:
Hadoop Installations
Local mode (hands on installation on your laptop)
Pseudo mode (hands on installation on your laptop)
Cluster mode (hands on 20 node cluster setup in our lab)
Nodes Commissioning and De-commissioning in Hadoop Cluster
Jobs Monitoring in Hadoop Cluster
Fair Scheduler (hands on installation on your laptop)
Capacity Scheduler (hands on installation on your laptop)
Hive Installations
Local mode (hands on installation on your laptop)
With internal Derby
Cluster mode (hands on installation on your laptop)
With external Derby
With external MySQL
Hive Web Interface (HWI) mode (hands on installation on your laptop)
Hive Thrift Server mode (hands on installation on your laptop)
Derby Installation (hands on installation on your laptop)
MySQL Installation (hands on installation on your laptop)
Pig Installations
Local mode (hands on installation on your laptop)
Mapreduce mode (hands on installation on your laptop)
Hbase Installations
Local mode (hands on installation on your laptop)
Pseudo mode (hands on installation on your laptop)
Cluster mode (hands on installation on your laptop)
With internal Zookeeper
With external Zookeeper
Zookeeper Installations
Local mode (hands on installation on your laptop)
Cluster mode (hands on installation on your laptop)
Sqoop Installations
Sqoop installation with MySQL (hands on installation on your laptop)
Sqoop with hadoop integration (hands on installation on your laptop)
Sqoop with hive integration (hands on installation on your laptop)
Flume Installation
Pseudo mode (hands on installation on your laptop)
Oozie Installation
Pseudo mode (hands on installation on your laptop)
Mahout Installation
Local mode (hands on installation on your laptop)
Pseudo mode (hands on installation on your laptop)
MongoDB Installation
Pseudo mode (hands on installation on your laptop)
Nutch Installation
Pseudo mode (hands on installation on your laptop)
Cloudera Hadoop Distribution installation
Hadoop
Hive
Pig
Hbase
Hue
Hortonworks Hadoop Distribution installation
Hadoop
Hive
Pig
Hbase
Hue

Hadoop ecosystem Integrations:


Hadoop and Hive Integration
Hadoop and Pig Integration
Hadoop and HBase Integration
Hadoop and Sqoop Integration
Hadoop and Oozie Integration
Hadoop and Flume Integration
Hive and Pig Integration
Hive and HBase Integration
Pig and HBase Integration
Sqoop and RDBMS Integration
Mahout and Hadoop Integration

What we are offering to you
Hands on MapReduce programming with 20+ programs; these will make you perfect in MapReduce both concept-wise and programmatically.
Hands on 5 POCs will be provided (these POCs will help you become perfect in Hadoop and its ecosystems).
Hands on 20 Node cluster setup in our Lab.
Hands on installation for all the Hadoop and ecosystems in your laptop.
Well documented Hadoop material covering all the topics in the course.
Well documented Hadoop blog containing frequent interview questions along with the answers and latest updates on Big Data technology.
Real time projects explanation will be provided.
Mock Interviews will be conducted on one-to-one basis.
Discussion of Hadoop interview questions on a daily basis.
Resume preparation with POCs or Projects based on your experience.
