HADOOP TRAINING COURSE MODULE


The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. HDFS provides high throughput access to application data and is suitable for applications that have large data sets. HDFS relaxes a few POSIX requirements to enable streaming access to file system data. HDFS was originally built as infrastructure for the Apache Nutch web search engine project. HDFS is now an Apache Hadoop subproject. Apache Hadoop is a framework that allows for distributed processing of large data sets across clusters of commodity computers using a simple programming model.

Hadoop is used as a processing platform for Big Data, built on the “Map Reduce” processing paradigm. The main purpose of this hands-on Hadoop training is to build awareness of Big Data and deepen knowledge of Hadoop among research scholars, faculty and students.
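
As a rough illustration of the “Map Reduce” paradigm mentioned above, the following Python sketch counts words entirely in memory using separate map, shuffle and reduce steps. It does not use any Hadoop API. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example, and a real Hadoop job would run these steps in parallel across a cluster rather than in a single process.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle and sort: group emitted values by key, with keys in sorted order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(key, values):
    """Reducer: combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over a list of text lines (illustrative only)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data", "Big Data with Hadoop"]
    for word, count in word_count(sample).items():
        print(word, count)
```

The mapper and reducer here never share state; they communicate only through key/value pairs. That separation is what lets a framework such as Hadoop spread the same logic over many machines.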


6 WEEKS HADOOP COURSE MODULE

  1. Introduction to Hadoop
    1. What is Hadoop
    2. Current data challenges
    3. Big Data and its aspects
    4. Difference between Big Data and Traditional Framework
    5. Example: where to use Big Data
    6. Big Data Business Opportunities
  2. Hadoop and HDFS Architecture
    1. Hadoop architecture
    2. Hadoop Core Components
    3. HDFS Architecture
    4. Modes of Installation
    5. Hadoop Daemons
    6. Rack Awareness
    7. Data Replication

6 MONTHS HADOOP COURSE MODULE

  1. Introduction to Hadoop
    1. What is Hadoop
    2. Current data challenges
    3. Big Data and its aspects
    4. Difference between Big Data and Traditional Framework
    5. Example: where to use Big Data
    6. Big Data Business Opportunities
  2. Hadoop and HDFS Architecture
    1. Hadoop architecture
    2. Hadoop Core Components
    3. HDFS Architecture
    4. Modes of Installation
    5. Hadoop Daemons
    6. Rack Awareness
    7. Data Replication
  3. Overview of Linux and Hadoop Installation
    1. Installation of VMware and Linux
    2. Basic commands of Linux
    3. Hadoop Installation and configuration
    4. Practical Examples
    5. Hadoop Admin commands
  4. Map Reduce implementation and Hadoop Ecosystem
    1. Map Reduce Flowchart
    2. Map Reduce Daemons and Architecture
    3. Understanding the shuffle and sort phase
    4. Map slots and Reduce slots in a cluster
    5. Writing and executing Map Reduce Program
    6. Hadoop Ecosystem Architecture
  5. PIG Implementation
    1. Introduction to Pig
    2. Installation of Pig
    3. Data Loading and Data Extraction in Pig
    4. Data Transformation in Pig
    5. Understanding of functions/methods in Pig
    6. Practical exercise for working with Pig
  6. HIVE Implementation
    1. Introduction to Hive
    2. Installation of Hive
    3. Overview of Hive Query Language
    4. DDL and DML manipulations in Hive
    5. Partitions in Hive
    6. Practical exercise for working with Hive

Contact Us

E2Matrix Shop No. 12, Sutlej Market Near Bus Stand- Jalandhar support@e2matrix.com +91 9041262727 +91 9779363902