Big Data Hadoop Certification

This course is designed for aspiring Hadoop developers. We start from scratch, and by the end of the course you should be confident enough to handle any type of Hadoop interview or project.

Curriculum:

Module 1- Introduction to Hadoop

What is Big Data?
What is Hadoop?
The relation between Big Data and Hadoop.
What is the need for going ahead with Hadoop?
Scenarios to Adopt Hadoop Technology in Real-Time Projects
Challenges with Big Data

How Hadoop Addresses Big Data Challenges
Comparison with Other Technologies

Different Components of the Hadoop Ecosystem

Module 2- Introduction to HDFS (Hadoop Distributed File System)

What is a Cluster Environment?
Cluster Vs Hadoop Cluster.
The significance of HDFS in Hadoop
Features of HDFS
Storage aspects of HDFS

Module 3 – HDFS Architecture – 5 Daemons of Hadoop

Name Node and its functionality
Data Node and its functionality
Job Tracker and its functionality
Task Tracker and its functionality
Secondary Name Node and its functionality.

Module 4 – Data Storage in Data Nodes – Fail Over Mechanism

Data Storage in Data Nodes
Fail Over Mechanism in Hadoop – Replication
Replication Configuration
Custom Replication
Design Constraints with Replication Factor

Module 5 – Accessing HDFS

CLI (Command Line Interface) and HDFS Commands
Java Based Approach

Module 6- Map Reduce

Why is Map Reduce essential in Hadoop?
Processing Daemons of Hadoop
Job Tracker

Task Tracker

Module 7- Input Split

Input Split
Need for Input Split in Map Reduce
Input Split Size
Input Split Size Vs Block Size
Input Split Vs Mappers

Module 8 – Map Reduce Life Cycle

Communication Mechanism of Job Tracker & Task Tracker
Input Format Class
Record Reader Class
Success Case Scenarios
Failure Case Scenarios
Retry Mechanism in Map Reduce

Module 9 - Map Reduce Programming Model

Different phases of Map Reduce Algorithm
Different Data Types in Map Reduce
How to write a basic Map Reduce Program

Driver Code

Mapper Code

Reducer Code

Module 10 – Input format in Map reduce

Text Input Format
Key Value Text Input Format
NLine Input Format
DB Input Format
Sequence File Input Format.
How to Use a specific Input format in Map Reduce

Module 11 – Output Formats in Map Reduce

Text Output Format
Key Value Text Input Format
NLine Input Format
DB Output Format
Sequence File Output Format.
How to Use the specific Output format in Map Reduce

Module 12- Combiner and Partitioner in Map Reduce

Is a Combiner mandatory in Map Reduce?
How to Use the Combiner class in Map Reduce
Performance tradeoffs with respect to the Combiner
Importance of the Partitioner class in Map Reduce
How to use the Partitioner class in Map Reduce
Hash Partitioner Functionality
How to write a custom Partitioner

Module 13- Compression techniques in Map Reduce

Importance of Compression in Map Reduce
What is CODEC
Compression Types
Gzip Codec
Bzip2 Codec
LZO Codec
Snappy Codec
Configurations with respect to Compression Techniques
How to customize Compression for one job Vs all jobs

Module 14- Joins and Data Localization in Map Reduce

Joins – in Map Reduce

How to debug MapReduce jobs in Local and Pseudo cluster Mode
Data Localization in Map Reduce

Module 15- Introduction to Apache Pig

Introduction to Apache Pig
SQL Vs Apache Pig
Different data types in Pig
Modes of Execution in Pig

Module 16 – Scripting in PIG

Execution Mechanism

Module 17 – Develop Complex Pig Script

Embedded
Transformations in Pig
How to develop the Complex Pig Script

Module 18 – Bags, Tuples and UDFs in PIG

Bags, Tuples, and fields in PIG
UDFs in Pig

When to use Map Reduce & Apache PIG in Real-Time Projects

Module 19 – Introduction to Apache Hive

Hive Introduction
Need for Apache HIVE in Hadoop
Hive Architecture

Meta Store in Hive

Module 20- Hive Query Language Scripting

Hive Integration with Hadoop
Hive Query Language (Hive QL)
Configuring Hive With MySQL Metastore
SQL Vs Hive QL

Module 21 – Data Slicing Mechanisms and Data Types in Hive

Data Slicing Mechanisms

Collection Data Types in HIVE

Module 22- UDFs in HIVE

User Defined Functions(UDFs) in HIVE

Hive Serializer/De-serializer – SerDe
HIVE – HBase Integration

Module 23 – Apache SQOOP and its commands

Introduction to Sqoop.
MySQL client and Server Installation
How to connect to Relational Database using Sqoop
Different Sqoop Commands

Module 24- Apache HBase and Map reduce Integration

HBase Introduction
HDFS Vs HBase
HBase Use cases
HBase basics

HBase Architecture
Clients

Map Reduce Integration
Map Reduce over HBase
HBase Admin

Module 25- APACHE FLUME

Flume Introduction
Flume Architecture
Flume Master, Flume Collector, and Flume Agent
Flume Configurations
Real-Time Use Case using Apache Flume

Module 26 – Apache Oozie

Oozie Introduction
Oozie Architecture
Oozie Configuration Files
Oozie Job Submission

Module 27 - YARN (Yet Another Resource Negotiator) - Next-Gen Map Reduce

What is YARN?
YARN Architecture

When should we go ahead with YARN?
Classic Map Reduce Vs YARN Map Reduce
Different Configuration Files for YARN

Module 28 - MongoDB (As part of NoSQL Databases)

The need for NoSQL Database
Relational Vs Non-Relational Databases
Introduction to MongoDB
Installation of MongoDB
MongoDB Basic operations

Module 29 – Apache Spark

Spark Architecture
Spark Processing with Use cases
Spark with SCALA
Spark With SQL

Module 30 - Hadoop Administration

Hadoop Single Node Cluster Set Up(Hands on Installation on Laptops)

Module 31 – PIG, SQOOP, HIVE and HBase installation

PIG Installation (Hands on Installation on Laptops)

SQOOP Installation (Hands on Installation on Laptops)
HIVE Installation (Hands on Installation on Laptops)

HBase Installation (Hands on Installation on Laptops)

Module 32– Take Away from the Training

Provided 2 POCs to work with Hadoop and Its Components
Provided Soft Copies of All the Materials with Use cases
Provided Certification Assistance
Provided Project Exposure and Discussion

Course Highlights

No. of hours: 50 Hours

Star Rating: (4)

Trainer:

Suraj Sharma

A working professional and a training enthusiast. He has almost 10 years of corporate experience.

Fee:

Online Virtual Class Room

Enroll in any of the batches and attend the live class at the scheduled time

14999

Upcoming Batches:

FAQ:

1. Who can learn this course?

Anyone who is interested and has knowledge of SQL Server can easily learn this program.

Knowing Java is good to have, but it is not mandatory.

3. What are career opportunities from this course?

Hadoop is used by most of the top MNCs, such as Google, Yahoo, Facebook, Bank of America, and UnitedHealth Group. You can find good opportunities after completing this course.

4. How long will this course take?

This course takes 50 hours of practical live classes. After that, you can practice as much as you like.

5. Will you provide soft copy material?

Yes, we will share the soft copy material, and you will also get recordings of our live classes.

6. Do you provide projects to work on?

Yes, after completing the course you will get to work on 2 or 3 projects. This will give you the hands-on experience to clear interviews confidently.

Recommended Courses:

Data Visualization with Tableau

Microsoft Power BI

Core Python

Java Programming

Data Science with Python

Linux Administration

Reviews:

Jan-06-2021

Amardeep Kumar

Learned many interesting topics. I will implement all of these in my regular course work.

Jan-04-2021

Abhishek Tiwari

This course was really fantastic.

Nov-27-2020

Kartik Sharma

Very exciting and interesting. Concepts explained well.

Nov-27-2020

Komal Gupta

Excellent course! It was a good learning process.

Contact us

+91-9999468662 +91-9999468661 info@wifilearning.com