- Module-1
- Module-2
- Module-3
- Module-4
- Module-5
- Module-6
- Module-7
- Module-8
- Module-9

Hadoop
Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and distributed processing of big data using the MapReduce programming model. HDFS, the Hadoop Distributed File System, stores exceptionally large files by splitting them into blocks and distributing those blocks across multiple machines, which keeps the data accessible despite its size. MapReduce is a programming model for processing and generating big data sets with a parallel, distributed algorithm on a cluster.
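To illustrate the MapReduce programming model mentioned above, here is a minimal word-count sketch written against the standard Hadoop Java MapReduce API (org.apache.hadoop.mapreduce). The class name WordCount and the command-line input/output paths are illustrative choices, not part of the course material.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every word in each input line.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts emitted for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // Input and output HDFS paths are passed on the command line (illustrative).
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

A job like this is typically packaged as a JAR and submitted to the cluster with the hadoop jar command, reading its input from and writing its output to HDFS.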
Objective
In this module, you will understand Big Data, the limitations of existing solutions to the Big Data problem, how Hadoop solves the Big Data problem, and the common Hadoop ecosystem components.
- Introduction to Hadoop
- Hadoop Architecture
- HDFS Architecture
- Hadoop Environment Setup for hands-on practice
- HDFS Commands
- YARN & Its Architecture
- MapReduce
Case Studies
- Hadoop installation
- HDFS Commands
- Eclipse IDE (Integrated Development Environment) Setup
- MapReduce hands-on