Big Data / Hadoop Administration Training

Course Content
INTRODUCTION TO HADOOP

1. Module – 1 & Session - 1

a. Understanding Big Data Basics

b. Big Data Use Cases

c. Introduction to Hadoop

d. Understanding Hadoop Ecosystem

e. Introduction to HDFS

i. Introduction to Namenode

ii. Introduction to Datanode

iii. Introduction to Secondary Namenode

f. Introduction to MapReduce

i. Introduction to JobTracker

ii. Introduction to TaskTracker

g. Summarizing Hadoop Architecture

h. Roles and Responsibilities of a Hadoop Administrator


HADOOP 1

2. Module – 2 & Session – 2 & 3

a. Linux internals

i. Commands that are required

ii. Linux basics

b. Hadoop Cluster Installation Pre-requisites

i. Pre-requisites of Hadoop Installation

1. Software Downloads

2. Preparing your VM

3. Enabling VM with VMware

4. Understanding mandatory changes in the operating system

c. Installation and Configuration

i. Understanding Hadoop cluster installation modes

ii. Understanding Hadoop version 1 installation and configuration

iii. Passwordless SSH setup

3. Session - 4

a. Hands-On Practice for creating a Hadoop cluster

i. Individual help with practicing Hadoop cluster installation


UNDERSTANDING HADOOP INTERNALS

4. Module – 3 & Session - 5

a. Hadoop Cluster Planning

i. Recommended Hadoop cluster configuration

1. Hardware/Software/Network

2. Recommended configuration for Master and Slave Nodes

3. Sample Base configuration

4. Different Hadoop distributions on the market

b. Hadoop performance tuning

i. Important Hadoop tuning parameters to understand

ii. Hadoop Cluster Benchmarking Jobs – How to run the jobs

5. Module – 4 & Session – 6 & 7

a. Job Schedulers

i. FIFO Scheduler

ii. Fair Scheduler

b. Backup and Recovery

i. Data backup

ii. Meta-data backup

iii. Hadoop Quotas

iv. Safemode

v. Hadoop Ports

c. DistCP

d. Security

i. How to secure your cluster using Kerberos

e. Upgrades

i. Upgrading Hadoop cluster from Hadoop 1 to Hadoop 2


HADOOP 2

6. Module – 5 & Session – 8

a. Hadoop 2.0 new features

b. YARN

i. Understanding Resource Manager

ii. Understanding Application Master

iii. Understanding Node Manager

iv. Understanding Hadoop 2 Job Execution Framework

c. Hadoop 2 Multi-node cluster creation

i. Pre-requisites of Hadoop Installation

ii. Software Downloads

iii. Preparing your VM

iv. Enabling VM with VMware

v. Understanding mandatory changes in the operating system

vi. Installation and Configuration

vii. Understanding Hadoop version 2 installation and configuration

viii. Passwordless SSH setup


7. Session - 9

a. Practice Hadoop 2 multi-node Cluster Creation

i. Individual help with practicing Hadoop 2 cluster installation

b. Sample YARN job execution

8. Module – 6 & Session – 10 & 11


a. Understanding Issues of Hadoop 1

b. Understanding improvements in Hadoop 2

c. Namenode Federation

i. Enable segregation of HDFS using multiple namenodes

d. Namenode – High Availability

i. Achieving Namenode High-Availability using Quorum Journal Manager

ii. Achieving Namenode High-Availability using Network File System

9. Session - 12

a. Implementation of NN High Availability


i. Individual help with achieving Namenode High Availability

HADOOP ECOSYSTEM

10. Module – 7 & Session – 13, 14

a. Hadoop Ecosystem Introduction

i. Understanding the integration of Hadoop ecosystem

b. Overview of Hive

i. What is Hive

ii. Architecture of Hive

iii. Understanding Hive metastore concepts

c. HBase

i. Understanding HBase Basics

ii. Understanding HBase storage Model

iii. Understanding HBase Architecture

iv. Cluster Installation and Configuration


d. Pig

i. What is Pig?

ii. How does Pig integrate with a Hadoop cluster?

iii. Demo of Pig Jobs using MapReduce

e. Sqoop

i. What is Sqoop?

ii. How to import and export data between Hadoop and an RDBMS using Sqoop

iii. Example of Sqoop jobs using MySQL

f. Flume

i. What is Flume?

ii. Sample Flume jobs


CLOUDERA CLUSTER INSTALLATION

11. Module – 8 & Session - 15

a. Understanding the internals of Cloudera Manager

b. Understanding the automation of Hadoop installation using Cloudera Manager

c. Understanding Cloudera Hadoop Distribution and Cloudera Manager

d. Understanding the underlying directory structure of Cloudera Hadoop

e. Cloudera Hadoop Cluster Installation – CDH
