Big Data Hadoop Certification

This course is designed for aspiring Hadoop developers. We start from scratch, and by the end of the course you should be confident enough for any type of Hadoop interview or project.


  • What is Big Data?
  • What is Hadoop?
  • The relation between Big Data and Hadoop.
  • What is the need for going ahead with Hadoop?
  • Scenarios to Adopt Hadoop Technology in Real-Time Projects
  • Challenges with Big Data
  • Storage
  • How Hadoop Addresses Big Data Challenges
  • Comparison with Other Technologies
    Data Warehouse
  • Different Components of the Hadoop Ecosystem
  • Storage Components
    Processing Components
  • What is a Cluster Environment?
  • Cluster Vs Hadoop Cluster.
  • The significance of HDFS in Hadoop
  • Features of HDFS
  • Storage Aspects of HDFS
  • Block
    How to Configure block size
    Default Vs Configurable Block size
    Why is the HDFS Block size so large?
    Design Principles of Block Size
  • Name Node and its functionality
  • Data Node and its functionality
  • Job Tracker and its functionality
  • Task Tracker and its functionality
  • Secondary Name Node and its functionality.
  • Data Storage in Data Nodes
  • Fail Over Mechanism in Hadoop – Replication
  • Replication Configuration
  • Custom Replication
  • Design Constraints with Replication Factor
  • CLI (Command Line Interface) and HDFS Commands
  • Java Based Approach
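The block-size topics above can be sketched in plain Python. This is an illustration only, not the HDFS API; 128 MB is the default block size in Hadoop 2.x and later, and the splitting arithmetic below is what HDFS does internally when it stores a file.

```python
# Illustrative sketch (not the Hadoop API): how HDFS splits a file into
# fixed-size blocks, with the block size configurable per file or cluster.

DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the HDFS default since 2.x

def split_into_blocks(file_size_bytes, block_size=DEFAULT_BLOCK_SIZE):
    """Return (offset, length) pairs, one per HDFS-style block."""
    blocks = []
    offset = 0
    while offset < file_size_bytes:
        length = min(block_size, file_size_bytes - offset)
        blocks.append((offset, length))
        offset += length
    return blocks

# A 300 MB file with the default 128 MB block size yields three blocks:
# two full 128 MB blocks and one 44 MB remainder (the last block only
# occupies as much space as it needs).
blocks = split_into_blocks(300 * 1024 * 1024)
print(len(blocks))                      # 3
print(blocks[-1][1] // (1024 * 1024))   # 44
```

This also shows why a large block size matters: fewer blocks per file means fewer entries in the Name Node's metadata and fewer seeks per read.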
  • Why is Map Reduce essential in Hadoop?
  • Processing Daemons of Hadoop
  • Job Tracker
  • Roles Of Job Tracker
    Drawbacks w.r.t. Job Tracker Failure in the Hadoop Cluster
    How to Configure Job Tracker in the Hadoop cluster
  • Task Tracker
  • Roles of Task Tracker
    Drawbacks w.r.t. Task Tracker Failure in the Hadoop Cluster
  • Input Split
  • Need of Input Split in Map Reduce
  • Input Split Size
  • Input Split Size Vs Block Size
  • Input Split Vs Mappers
  • Communication Mechanism of Job Tracker & Task Tracker
  • Input Format Class
  • Record Reader Class
  • Success Case Scenarios
  • Failure Case Scenarios
  • Retry Mechanism in Map Reduce
  • Different phases of Map Reduce Algorithm
  • Different Data Types in Map Reduce
  • Primitive Data Types Vs Map Reduce Data Types
  • How to write a basic Map Reduce Program
  • Driver Code
    Mapper Code
    Reducer Code
  • Driver Code
  • Importance of Driver Code in a Map-Reduce Program
    How to Identify the Driver Code in Map Reduce Program
    Different sections of Driver code
  • Mapper Code
  • Importance of Mapper Phase in Map Reduce
    How to Write a Mapper Class?
    Methods in Mapper Class
  • Reducer Code
  • Importance of Reduce phase in Map Reduce
    How to Write Reducer Class?
    Methods in Reducer Class
    Identify Mapper and Reducer
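The Mapper, shuffle, and Reducer phases above can be sketched in plain Python. This is an illustrative word-count simulation, not the Hadoop Java API: the mapper emits (word, 1) pairs, the framework's shuffle groups them by key, and the reducer sums each group.

```python
# Plain-Python sketch of the three pieces of a word-count MapReduce job.
# It mirrors the Driver/Mapper/Reducer structure without the Hadoop API.
from collections import defaultdict

def mapper(line):
    """Map phase: emit (word, 1) for every word in the input line."""
    for word in line.split():
        yield word.lower(), 1

def shuffle(mapped_pairs):
    """Shuffle phase: group all values by key, as the framework does."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    """Reduce phase: sum the grouped counts for one key."""
    return key, sum(values)

# "Driver": wire the phases together over some sample input.
lines = ["Hadoop is fast", "Hadoop is scalable"]
mapped = [pair for line in lines for pair in mapper(line)]
counts = dict(reducer(k, v) for k, v in shuffle(mapped).items())
print(counts)  # {'hadoop': 2, 'is': 2, 'fast': 1, 'scalable': 1}
```

In a real job, the driver code configures input/output formats and submits the job, while the shuffle is done by the framework between the map and reduce phases.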
  • Text Input Format
  • Key Value Text Input Format
  • NLine Input Format
  • DB Input Format
  • Sequence File Input Format
  • How to use a specific Input Format in Map Reduce
  • Text Output Format
  • DB Output Format
  • Sequence File Output Format
  • How to use a specific Output Format in Map Reduce
  • Is the Combiner mandatory in Map Reduce?
  • How to use the Combiner class in Map Reduce
  • Performance tradeoffs w.r.t. the Combiner
  • Importance of the Partitioner class in Map Reduce
  • How to use the Partitioner class in Map Reduce
  • Hash Partitioner Functionality
  • How to write a custom Partitioner
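The Hash Partitioner topic above can be sketched in plain Python. Hadoop's default HashPartitioner routes a key to reducer `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`; the sketch below reproduces Java's `String.hashCode()` so the routing matches what Hadoop would compute for text keys. It is an illustration, not the Hadoop API.

```python
# Sketch of Hadoop's default HashPartitioner logic for string keys.

def java_string_hash(s):
    """Reproduce Java's String.hashCode(): h = 31*h + ch, 32-bit signed."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Interpret as a signed 32-bit int, as Java does.
    return h - 0x100000000 if h >= 0x80000000 else h

def partition(key, num_reducers):
    """(hashCode & Integer.MAX_VALUE) % numReduceTasks."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers

# Every occurrence of the same key lands on the same reducer, which is
# what guarantees all values for a key meet in one reduce call.
print(partition("hadoop", 4) == partition("hadoop", 4))  # True
```

A custom Partitioner simply replaces this function with domain-specific routing (for example, partitioning by the first letter of the key).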
  • Importance of Compression in Map Reduce
  • What is a Codec?
  • Compression Types
  • Gzip Codec
  • Bzip Codec
  • LZO Codec
  • Snappy Codec
  • Configurations w.r.t. Compression Techniques
  • How to customize Compression for a single job Vs all jobs
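Two of the codecs listed above have direct counterparts in the Python standard library, which makes the ratio-vs-speed tradeoff easy to see. This is an illustration of the codec families, not of Hadoop's compression configuration; LZO and Snappy are not in the Python stdlib, so they are only noted in the comments.

```python
# Quick comparison of two codec families covered above using Python's
# standard library: gzip (Gzip codec) and bz2 (Bzip codec). Bzip2 usually
# compresses tighter but is slower; Hadoop additionally offers LZO and
# Snappy, which trade compression ratio for speed.
import gzip
import bz2

data = b"hadoop " * 10_000  # highly repetitive sample payload

gz = gzip.compress(data)
bz = bz2.compress(data)

print(len(data))                                     # 70000 original bytes
print(len(gz) < len(data) and len(bz) < len(data))   # True
```

In Hadoop, the same choice is made via configuration (per job or cluster-wide), and splittability matters too: Bzip2 output is splittable across mappers, while plain Gzip output is not.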
  • Joins – in Map Reduce
  • Map Side Join
    Reduce Side Join
    Distributed cache
  • How to debug MapReduce jobs in Local and Pseudo cluster Mode
  • Data Localization in Map Reduce
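The map-side join and Distributed Cache topics above can be sketched in plain Python. The idea: the small table is loaded into memory on every mapper (that is what the Distributed Cache distributes), so each record of the large table is joined during the map phase with no shuffle at all. The table names and fields below are made up for illustration; this is not the Hadoop API.

```python
# Sketch of a map-side join using an in-memory lookup table.

dept_lookup = {           # small table, broadcast via the Distributed Cache
    "D1": "Engineering",
    "D2": "Sales",
}

employees = [             # large table, streamed through the mapper
    ("alice", "D1"),
    ("bob", "D2"),
    ("carol", "D1"),
]

def map_side_join(record):
    """Join one large-table record against the in-memory small table."""
    name, dept_id = record
    return name, dept_lookup.get(dept_id, "UNKNOWN")

joined = [map_side_join(r) for r in employees]
print(joined)  # [('alice', 'Engineering'), ('bob', 'Sales'), ('carol', 'Engineering')]
```

A reduce-side join, by contrast, tags records from both tables with the join key and lets the shuffle bring matching records together, which handles two large tables at the cost of a full shuffle.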
  • Introduction to Apache Pig
  • SQL Vs Apache Pig
  • Different data types in Pig
  • Modes of Execution in Pig
  • Local Mode
    Map Reduce OR Distributed Mode
  • Execution Mechanism
  • Grunt Shell
  • Embedded
  • Transformations in Pig
  • How to develop the Complex Pig Script
  • Bags, Tuples, and fields in PIG
  • UDFs in Pig
  • Need for UDFs in PIG
    How to use UDFs
    REGISTER keyword in PIG
  • When to use Map Reduce & Apache PIG in REAL TIME Projects
  • Hive Introduction
  • Need of Apache HIVE in Hadoop
  • Hive Architecture
  • Driver
    Executor (Semantic Analyzer)
  • Meta Store in Hive
  • Importance of Hive Meta Store
    Embedded meta store configuration
    External meta store configuration
    Communication mechanism with the Metastore
  • Hive Integration with Hadoop
  • Hive Query Language (Hive QL)
  • Configuring Hive With MySQL Metastore
  • SQL Vs Hive QL
  • Data Slicing Mechanisms
  • Partitions in Hive
    Buckets In Hive
    Partitioning Vs Bucketing
    Real-Time Use Cases
  • Collection Data Types in HIVE
  • Array
  • User Defined Functions (UDFs) in HIVE
  • UDFs
    Need of UDFs in HIVE
  • Hive Serializer/De-serializer – SerDe
  • HIVE – HBase Integration
  • Introduction to Sqoop.
  • MySQL client and Server Installation
  • How to connect to Relational Database using Sqoop
  • Different Sqoop Commands
  • Different Flavors of Imports
  • HBase Introduction
  • HDFS Vs HBase
  • HBase Use cases
  • HBase basics
  • Column Families
  • HBase Architecture
  • Clients
  • REST
    Java Based
  • Map Reduce Integration
  • Map Reduce over HBase
  • HBase Admin
  • Schema Definition
    Basic CRUD Operations
  • Flume Introduction
  • Flume Architecture
  • Flume Master, Flume Collector, and Flume Agent
  • Flume Configurations
  • Real-Time Use Case using Apache Flume
  • Oozie Introduction
  • Oozie Architecture
  • Oozie Configuration Files
  • Oozie Job Submission
  • Workflow.xml
  • What is YARN?
  • YARN Architecture
  • Resource Manager
    Application Master
    Node Manager
  • When should we go ahead with YARN?
  • Classic Map Reduce Vs YARN Map Reduce
  • Different Configuration Files for YARN
  • The need for NoSQL Database
  • Relational Vs Non-Relational Databases
  • Introduction to MongoDB
  • Installation of MongoDB
  • Mongo DB Basic operations
  • Spark Architecture
  • Spark Processing with Use cases
  • Spark with SCALA
  • Spark With SQL
  • Hadoop Single Node Cluster Set Up(Hands on Installation on Laptops)
  • Operating System Installation
    JDK Installation
    SSH Configuration
    Dedicated Group & User Creation
    Hadoop Installation
    Different Configuration Files Setting
    Name node format
    Starting the Hadoop Daemons
  • PIG Installation (Hands on Installation on Laptops)
  • Local Mode
    Clustered Mode
    Bashrc file configuration
  • SQOOP Installation (Hands on Installation on Laptops)
  • Sqoop installation with MySQL Client
  • HIVE Installation(Hands on Installation on Laptops)
  • Local Mode
    Clustered Mode
  • HBase Installation(Hands on Installation on Laptops)
  • Local Mode
    Clustered Mode
  • Two POCs provided to work with Hadoop and its components
  • Soft copies of all materials provided, with use cases
  • Certification assistance provided
  • Project exposure and discussion provided

Course Highlights

        No. of hours: 50 Hours

        Star Rating: (5)



Online Virtual Class Room

Enroll in any of the above batches and attend live classes at the scheduled time.


Upcoming Batches:


1. Who can learn this course?
Anyone who is interested and has knowledge of SQL Server can easily learn this course.
2. What are the prerequisites?
Java knowledge is good to have, but it is not mandatory.
3. What are the career opportunities from this course?
Hadoop is used by most top MNCs, such as Google, Yahoo, Facebook, Bank of America, and UnitedHealth Group. You can get good opportunities after completing this course.
4. How long will this course take?
The course consists of 50 hours of practical live classes. After that, you can practice as much as you like.
5. Will you provide soft copy material?
Yes, we will share the soft copy material, and you will also get recordings of our live classes.
6. Do you provide projects to work on?
Yes, after completing the course you will work on two or three projects. This hands-on experience will help you clear interviews confidently.




Amardeep Kumar

Learned many interesting topics; I will implement all of these in my regular coursework.


Abhishek Tiwari

This course was really fantastic.


Kartik Sharma

Very exciting and interesting; concepts were explained well.


Komal Gupta

Excellent course! It was a good learning process.

Contact us

Phone: +91-9999468662 +91-9999468661

Email: info@wifilearning.com