Hadoop Learning Catalogue

Rahul Chaudhari

Course Objective: Build professional value for individuals interested in
technology and solutions across domains by delivering enough theory, together
with practical modules that simulate industry-level requirements and
workflows.

Module 01 - INTRODUCTION
Introduction to Hadoop
o Compare Hadoop vs. Traditional large scale systems
o How Hadoop comes to the rescue
o Problems solved by Hadoop
o The Apache Hadoop project, distributions and components
o HDFS Architecture Discussion
Module 02 - BIG PICTURE & MAP REDUCE
Hadoop & the Big Picture
HDFS Architecture Deep-Dive
What is Map Reduce
o Overview
o Hello World Program: Word Count
o Map Reduce Deep-Dive
o Subtle differences between MRv1 & MRv2
Module 03 - GETTING STARTED
Introducing Hortonworks and Cloudera Distributions
Introduction to Cloudera Manager Interface
Introduction to Ambari
Learn Hadoop using Hue
o Introduction
o Navigate & Browse HDFS
o Hive Interactions
    Creating new tables
    Running simple to complex HiveQL queries
o HBase Interactions
    Create new table & column families
    Insert, update and delete operations
    Query data
o Run Pig scripts
Module 04 - HADOOP INSTALLATION
Installation modes: Local, Pseudo-Distributed, Fully Distributed
Overview of Installation procedure
Configuration Files Overview
Hortonworks HDP Installation Demo (optional)
Module 05 - HADOOP INSTALLATION (continued)
Operationalizing Hadoop Cluster
Cluster Monitoring Tools
Hadoop CLI
Hadoop logging and troubleshooting
Module 06 - APACHE HIVE Deep Dive
Hive Architecture
Hive DDL: Create/Show/Drop Database/Table
Hive Table Types: Internal/External/ORC/Compression
HiveQL: Select, Filter, Join, Group By
Differences between Hive & RDBMS
Module 07 - APACHE HIVE Advanced
Custom Map Reduce with Hive
Hive SerDe
Hive UDF
Hive UDAF
Sample Exercises and Solutions
Module 08 - APACHE PIG Deep Dive
Pig Architecture
Pig Data Types
Pig Latin Language Constructs
Sample Exercises and Solutions
Module 09 - NoSQL Databases
Introduction to NoSQL
CAP Theorem
RDBMS vs. NoSQL
Types of NoSQL Databases
HBase Introduction
Module 10 - APACHE HBASE
Use cases for HBase
HBase Architecture
HBase Deep Dive
HBase CLI
HBase Programming API Introduction
Module 11 - APACHE SQOOP
How it Works
Performance Considerations
Exercises to move Data to/from MySQL
Module 12 - APACHE OOZIE
Introduction
Scheduling Basics
Exercise to create and run a workflow
Module 13 - APACHE FALCON
Need for Apache Falcon
Architecture
Exercise to create a data pipeline
Module 14 - ADVANCED MAPREDUCE (optional)
Custom Partitioner
Map Side Join
Distributed Join
Distributed Cache
Reduce Side Join
Counters
Custom Input Format
Understanding Output Formats
JUnit and MRUnit Testing Frameworks
Module 15 - MAPREDUCE Design Patterns (optional)
Why Study Design Patterns
Types of MR Design Patterns
Examples and Exercises
