IND PH: +91-9000380723

USA PH: +1-(999)-666-5174

HADOOP DEVELOPER CONTENT


Introduction to Hadoop
What is a Distributed File System?
Problems with Traditional Large-Scale Systems
Introduction to Hadoop
Brief History of Hadoop
RDBMS/SQL vs. Hadoop
DWH vs. Hadoop
Scaling with Hadoop
Introduction to the Hadoop Ecosystem
Business Use Cases in the Health Care/Banking Industry
Assignment 1

HADOOP Cluster Setup


Hadoop Installation & Configuration
Setting up a Standalone System
Setting up a Pseudo-Distributed Cluster
Installing Hadoop in Pseudo-Distributed Mode
Understanding Important Configuration Files, Their Properties and Daemon Threads
Hadoop Daemon Addresses and Ports
Other Hadoop Properties
SSH Configuration
Basic Unix/Linux Commands
Hands-On
Assignment 2

HDFS Deep Dive


Significance of HDFS in Hadoop
Features of HDFS
HDFS Architecture
Daemons of Hadoop
NameNode and its functionality
DataNode and its functionality
Secondary NameNode and its functionality
JobTracker and its functionality
TaskTracker and its functionality
Data Flow (Anatomy of a File Read, Anatomy of a File Write, Coherency Model)

Email: hodoopbykoti@gmail.com

Heartbeats, DataNode Commissioning/Decommissioning
Rack Awareness, Block Scanner, Balancer, Trash, Health Check
Exploring the HDFS Web UI
Parallel Copying with DistCp
Hadoop Archives
Hadoop Commands
Hands-On in a Live Environment
Assignment 3

MapReduce

The MapReduce Flow
Hadoop Data Types
Functional Concept of Mappers
Functional Concept of Reducers
Basic MapReduce API Concepts
Writing MapReduce Drivers, Mappers and Reducers in Java
The Execution Framework
Combiner
Partitioner
Shuffle and Sort
Speculative Execution
Speeding Up Hadoop Development by Using Eclipse
Hands-On Exercise: Writing a MapReduce Program
Differences Between the Old and New MapReduce APIs
Exploring the MapReduce Web UI
Creating Input and Output Formats in MapReduce Jobs
Text Input Format
Key Value Input Format
Sequence File Input Format
How to Debug MapReduce Jobs in Local and Pseudo-Cluster Mode
Output Formats (Text Output, Binary Output, Multiple Outputs)
Joining Data Sets in MapReduce
Delving Deeper into the Hadoop API
More Advanced MapReduce Programming
Graph Manipulation in Hadoop
Algorithms: Traversing Graphs, etc.
Business Use Case: Facial Recognition against CCTV Video Files Using MapReduce
Unit Testing MapReduce Jobs
Assignment 4
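
As a preview of the MapReduce flow covered in this module (map, shuffle and sort, reduce), here is a minimal word-count sketch in plain Java. It is an illustration only: it runs in a single JVM and does not use the Hadoop API, and the class name WordCountSketch and its method names are hypothetical, chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Conceptual sketch only: imitates the map -> shuffle/sort -> reduce flow
// in memory. It is NOT the Hadoop MapReduce API.
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle and sort: group all values by key; TreeMap keeps keys sorted.
    public static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    public static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver: run the three phases over all input lines.
    public static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(pairs).entrySet()) {
            counts.put(e.getKey(), reduce(e.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {hbase=1, hive=2, pig=2}
        System.out.println(wordCount(List.of("pig hive pig", "hive hbase")));
    }
}
```

On a real cluster the map and reduce steps run as distributed tasks and the framework performs the grouping between them; the sketch only mirrors the order of the phases.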

Pigs Eat Anything


What Is Pig?
Pig Use Cases
How Pig Works
Installing and Configuring Pig
Pig Latin and the Grunt Shell
Modes of Execution in Pig
Local Mode
MapReduce or Distributed Mode
Loading Data
Data Types and Schemas
Pig Latin Details: Structure, Functions, Expressions, Relational Operators
Intro to User Defined Functions and Scripts
How to Write a Pig Script
Advanced Pig Latin, Evaluation and Filter Functions, Pig and Ecosystem
Real-Time Use Cases: Health Care Industry
Hands-On Exercise: Using Pig for ETL Processing
Assignment 5

Hive for Structured Data


Hive Introduction
Hive Architecture
Hive Metastore
Comparison with Traditional Databases (Schema on Read versus Schema on Write, Updates, Transactions and Indexes)
Hive Schema and Data Storage
Hive Setup and Configuration
Hive vs. Pig
HiveQL and the Hive Shell
Creating Hive Tables
Loading Data into Hive
Retrieving Data with the SELECT Command
Joining Tables
Storing Query Results in HDFS
Partitioning Data
Bucketing Data
Hive Variables
The Hive CLI
Hive and Thrift
Hive Transform
Hands-On Exercises
Playing with Huge Data and Querying Extensively
Debugging and Troubleshooting
Hive User Defined Functions
Appending Data into an Existing Hive Table
Custom Map/Reduce in Hive
Overview of Text Processing
Important String Functions
Using Regular Expressions in Hive
Sentiment Analysis and N-Grams
Hands-On Exercise
Assignment 6

Real-time I/O with HBase

HBase Introduction
HBase Architecture
HBase Versions and Origins
HBase vs. RDBMS
HBase Master and Region Servers
Data Modeling
Column Families and Regions
Bloom Filters and Block Indexes
Write Pipeline/Read Pipeline
Catalog Tables
Compactions
The HBase Shell
Running the Shell
Creating the Tables
Accessing Data in Tables
Administration
Scripting
HBase Administration
Monitoring
Backup Tools
Compression
Managed Operations
Capacity Planning
MapReduce Integration
Assignment 7

Sqoop
Introduction
ETL Concepts
Introduction to Sqoop
Setup and Configuration of Sqoop
MySQL Client and Server Installation
How to Connect to a Relational Database Using Sqoop
Sqoop Import
Connecting to a Database Server
Selecting the Data to Import
Free-form Query Imports
Controlling Parallelism
Controlling the Import Process
Controlling Type Mapping
Incremental Imports
File Formats
Importing Data into Hive
Importing Data into HBase
Hands-On Exercise
Working with Imported Data
Importing Large Objects
Sqoop Export
Introduction
Inserts vs. Updates
Exports and Transactions
Hands-On Exercise
Assignment 8

Flume
What is Flume?
Setup and Configuration of Flume
Flume Architecture
How It Works
Reliability
Scalability
Manageability
Extensibility
Assignment 9

ZooKeeper
The ZooKeeper Service (Data Model, Operations, Implementation, Consistency, Sessions, States)
Building Applications with ZooKeeper (ZooKeeper in Production)
Assignment 10

REAL TIME PROJECT


Health Care Dataset: The dataset holds the details of a health care system over a period of time, from which you can derive Member policy logins, Provide Services, Treatment Methadone Abstract, Early Dropout Abstract, Payment Processing to Providers and agents, etc.

Additional Features
Cloudera Hadoop Developer/Admin Certification Guidance
Hadoop Installation Process and Configuration
Comprehensive Materials Covering the Hadoop Ecosystem, UNIX and Java
Separate Java and Unix Training for Beginners
24x7 Support
