080-42091111 , +91-8892499499

Hadoop Development

COURSE HIGHLIGHTS

Trainers are certified professionals with real-time industry experience.

Main focus on Hands-on sessions.

Affordable course fee.

Course aligned to Cloudera Certification.

Flexible timings for working people.

100% money back guarantee.

Real-time projects on Hadoop.

Guidance in Resume Preparation.

100% placement assistance.

Post training support.

Life validity for attending classes.

Job Roles in Big Data

Data Scientist
Big Data Visualizer
Big Data Research Analyst
Big Data Engineer
Big Data Architect
Big Data Analyst

Introduction to Big Data

  1. What is Big Data?
  2. Examples of Big Data
  3. Reasons for Big Data generation
  4. Why Big Data deserves your attention
  5. Use cases of Big Data
  6. Different options for analyzing Big Data

Introduction to Hadoop & HDFS

  1. What is Hadoop?
  2. History of Hadoop
  3. How Hadoop got its name
  4. Problems with traditional large-scale systems and the need for Hadoop
  5. Understanding Hadoop architecture
  6. Fundamentals of HDFS (Blocks, Name Node, Data Node, Secondary Name Node)
  7. Rack Awareness
  8. Read/Write operations in HDFS
  9. HDFS Federation and High Availability
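The block fundamentals above can be illustrated with quick arithmetic. This is a minimal sketch assuming Hadoop 2.x defaults (128 MB block size, replication factor 3); the 1 GB file size is a hypothetical example:

```shell
FILE_MB=1024        # hypothetical file size in MB (1 GB)
BLOCK_MB=128        # default HDFS block size in Hadoop 2.x
REPL=3              # default replication factor

# Ceiling division: number of HDFS blocks the file occupies
BLOCKS=$(( (FILE_MB + BLOCK_MB - 1) / BLOCK_MB ))
echo "$BLOCKS blocks"                      # 8 blocks

# Each block is replicated, so the cluster stores this many copies
REPLICAS=$(( BLOCKS * REPL ))
echo "$REPLICAS physical block copies"     # 24 copies
```

Note that only the last block may be smaller than the configured block size; HDFS does not pad it out to 128 MB.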

Hadoop Cluster Setup

  1. Setting up a single-node Hadoop cluster (pseudo-distributed mode)
  2. Understanding Hadoop configuration files
  3. Hadoop components: HDFS, MapReduce
  4. Overview of Hadoop processes
  5. Overview of the Hadoop Distributed File System
  6. The building blocks of Hadoop
  7. Hands-On Exercise: Using HDFS commands
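As a preview of the hands-on session, a few commonly used HDFS shell commands are sketched below. The paths and file names are hypothetical examples, and a running cluster is assumed:

```shell
# List the contents of an HDFS directory
hdfs dfs -ls /user/train

# Copy a local file into HDFS and read it back
hdfs dfs -put sales.txt /user/train/sales.txt
hdfs dfs -cat /user/train/sales.txt

# Create a directory and check space usage
hdfs dfs -mkdir -p /user/train/input
hdfs dfs -du -h /user/train

# Inspect how a file is split into blocks and replicated
hdfs fsck /user/train/sales.txt -files -blocks
```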

MapReduce

  1. Understanding MapReduce
  2. Job Tracker and Task Tracker
  3. Architecture of MapReduce
  4. Data flow of MapReduce
  5. Hadoop Writable, Comparable & comparison with Java data types
  6. Creation of local files and directories with the Hadoop API
  7. Creation of HDFS files and directories with the Hadoop API
  8. Map function & Reduce function
  9. How MapReduce works
  10. Anatomy of a MapReduce job
  11. Submission & initialization of a MapReduce job
  12. Monitoring the progress of a MapReduce job
  13. Understanding the difference between a Block and an Input Split
  14. Role of the Record Reader, Shuffler and Sorter
  15. File input formats
  16. Getting started with the Eclipse IDE
  17. Setting up the Eclipse development environment
  18. Creating MapReduce projects
  19. Configuring the Hadoop API in the Eclipse IDE
  20. Differences between the Hadoop old and new APIs
  21. Life cycle of a job
  22. Identity Reducer
  23. MapReduce program flow with WordCount
  24. Combiner & Partitioner, custom Partitioner with example
  25. Joining multiple datasets in MapReduce
  26. Map-side and reduce-side joins with examples
  27. Distributed Cache with a practical example
  28. Stragglers & speculative execution
  29. Schedulers (FIFO, Fair, Capacity)
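The WordCount program flow covered above can be mimicked entirely on the local machine with a Unix pipeline, which is a useful mental model before writing the Java version: `tr` plays the map phase (emit one word per line), `sort` plays shuffle/sort (group identical keys together), and `uniq -c` plays the reduce phase (sum the counts per key). The input text is a hypothetical example:

```shell
# map -> shuffle/sort -> reduce, simulated locally
printf 'hello hadoop\nhello world\n' \
  | tr -s ' ' '\n' \
  | sort \
  | uniq -c
# counts per word: hadoop=1, hello=2, world=1
```

In real MapReduce the same three stages run in parallel across the cluster, with the Partitioner deciding which reducer receives each key.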

YARN

  1. Limitations of the MapReduce 1 architecture
  2. YARN architecture
  3. Application Master, Node Manager & Resource Manager
  4. Writing a MapReduce job using YARN

Apache Hive

  1. Introduction to Apache Hive
  2. Architecture of Hive
  3. Installing Hive
  4. Hive data types
  5. Exploring Hive metastore tables
  6. Types of tables in Hive
  7. Partitions (static & dynamic)
  8. Buckets & sampling
  9. Indexes & views
  10. Developing Hive scripts
  11. Parameter substitution
  12. Difference between ORDER BY & SORT BY
  13. Difference between CLUSTER BY & DISTRIBUTE BY
  14. File input formats (Text file, RC, ORC, Sequence, Parquet)
  15. Optimization techniques in Hive
  16. Creating UDFs
  17. Hands-On Exercise
  18. Assignment on Hive
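To give a flavor of the partitioning topic above, here is a minimal HiveQL sketch run through the Hive CLI. The table and column names are hypothetical, a Hive installation is assumed, and the dynamic-partition insert assumes a staging table already exists:

```shell
hive -e "
CREATE TABLE sales (id INT, amount DOUBLE)
PARTITIONED BY (sale_date STRING)
STORED AS ORC;

-- dynamic partition insert (hypothetical staging table raw_sales)
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT INTO TABLE sales PARTITION (sale_date)
SELECT id, amount, sale_date FROM raw_sales;
"
```

Each distinct sale_date value becomes its own partition directory in HDFS, so queries filtering on sale_date scan only the matching partitions.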

Apache Pig

  1. Introduction to Apache Pig
  2. Building blocks (Bag, Tuple & Field)
  3. Installing Pig
  4. Data types
  5. Different modes of executing Pig
  6. Working with various Pig commands, covering all the functions in Pig
  7. Developing Pig scripts
  8. Parameter substitution: command-line arguments & passing parameters through a param file
  9. Joins (left outer, right outer, full outer)
  10. Nested queries
  11. Specialized joins in Pig (replicated, skewed & merge join)
  12. HCatalog (moving data from Hive to Pig & vice versa)
  13. Working with unstructured data
  14. Working with semi-structured data like XML and JSON
  15. Optimization techniques
  16. Creating UDFs
  17. Hands-On Exercise
  18. Assignment on Pig
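A small Pig Latin sketch in the spirit of the script-development topics above, run in local mode so no cluster is needed. Pig is assumed to be installed, and grades.txt is a hypothetical comma-separated file of name,score pairs:

```shell
pig -x local -e "
grades = LOAD 'grades.txt' USING PigStorage(',') AS (name:chararray, score:int);
passed = FILTER grades BY score >= 50;
ranked = ORDER passed BY score DESC;
DUMP ranked;
"
```

The same script runs unchanged against a cluster by dropping `-x local`, which is what makes local mode convenient for development.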

Apache Sqoop

  1. Introduction to Sqoop & its architecture
  2. Installation of Sqoop
  3. Importing data from an RDBMS to HDFS
  4. Importing data from an RDBMS to Hive
  5. Exporting data from Hive to an RDBMS
  6. Hands-On Exercise
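The import and export topics above boil down to commands like the following. The JDBC URL, database, table and paths are hypothetical examples, and a reachable MySQL server plus a Hadoop cluster are assumed:

```shell
# Import a table from MySQL into HDFS with 4 parallel mappers
sqoop import \
  --connect jdbc:mysql://dbhost:3306/shop \
  --username train --password-file /user/train/.db_pass \
  --table customers \
  --target-dir /user/train/customers \
  --num-mappers 4

# Export HDFS data back into an RDBMS table
sqoop export \
  --connect jdbc:mysql://dbhost:3306/shop \
  --username train --password-file /user/train/.db_pass \
  --table customer_summary \
  --export-dir /user/train/summary
```

Using `--password-file` rather than `--password` keeps the database password out of the shell history and process listing.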

Apache HBase

  1. Introduction to HBase
  2. Installation of HBase
  3. Exploring the HBase Master & Region Server
  4. Exploring ZooKeeper
  5. CRUD operations in HBase with examples
  6. Hive integration with HBase
  7. Hands-On Exercise on HBase
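The CRUD operations above map onto HBase shell commands like these. The table, column family and row key are hypothetical, and a running HBase instance is assumed:

```shell
hbase shell <<'EOF'
create 'customers', 'info'
put 'customers', 'row1', 'info:name', 'Asha'
get 'customers', 'row1'
scan 'customers'
delete 'customers', 'row1', 'info:name'
disable 'customers'
drop 'customers'
EOF
```

Note that a table must be disabled before it can be dropped; this two-step sequence trips up most newcomers.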

Apache Oozie

  1. What is Oozie & why Oozie?
  2. Features of Oozie
  3. Job types in Oozie
  4. Control nodes & action nodes
  5. Oozie workflow process flow
  6. Oozie parameterization
  7. Oozie command-line examples (developer)
  8. Oozie Web Console
  9. Hands-On Exercise on Oozie
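Control nodes, action nodes and parameterization come together in a workflow definition. Below is a minimal sketch: a workflow.xml with one filesystem action, then the command-line submission. The workflow name, path and job.properties file are hypothetical, and a running Oozie server is assumed:

```shell
cat > workflow.xml <<'EOF'
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="cleanup"/>
  <action name="cleanup">
    <fs>
      <!-- ${nameNode} is supplied via job.properties (parameterization) -->
      <delete path="${nameNode}/user/train/output"/>
    </fs>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
EOF

# Submit the workflow and check its status (hypothetical server URL)
oozie job -oozie http://localhost:11000/oozie -config job.properties -run
oozie job -oozie http://localhost:11000/oozie -info <job-id>
```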

We will provide the raw data and requirements for the project, and you will work on it yourselves. Finally, there will be one project execution session in which we explain the steps for executing the project.
Workshops on Spark & Scala are held once every 2 months; interested people can attend.

Pre-requisites such as Core Java, Linux basics & SQL basics are also covered as part of the training, free of cost.