Developed by Doug Cutting, Mike Cafarella, and team
An open-source project built on the MapReduce algorithm
Apache Hadoop is a registered trademark of the Apache Software Foundation
This document discusses big data and the Apache Hadoop framework. It defines big data as large, complex datasets that are difficult to process using traditional tools. Hadoop is an open-source framework for distributed storage and processing of big data across commodity hardware. It has two main components - the Hadoop Distributed File System (HDFS) for storage, and MapReduce for processing. HDFS stores data across clusters of machines with redundancy, while MapReduce splits tasks across processors and handles shuffling and sorting of data. Hadoop allows cost-effective processing of large, diverse datasets and has become a standard for big data.
This document provides an overview of big data and related technologies. It defines big data as large amounts of data from various sources, including sensors, social media, purchases, and mobile devices. Hadoop and MapReduce are introduced as frameworks for distributed storage and processing of large datasets across clusters of servers. Key-value database Hypertable is also summarized as an open source alternative to Bigtable. The advantages of big data systems include flexibility, scalability, efficiency, and failure resistance through data replication.
The document provides an overview of Hadoop including what it is, how it works, its architecture and components. Key points include:
- Hadoop is an open-source software framework for distributed storage and processing of large datasets across clusters of computers using simple programming models.
- It consists of HDFS for storage and MapReduce for processing via parallel computation using a map and reduce technique.
- HDFS stores data reliably across commodity hardware and MapReduce processes large amounts of data in parallel across nodes in a cluster.
The Big Data Hadoop Certification Training Course aims to provide complete knowledge of Big Data and Hadoop technologies including HDFS, YARN, and MapReduce. It offers comprehensive knowledge of tools in the Hadoop ecosystem like Pig, Hive, Sqoop, Flume, Oozie, and HBase. Students will learn to ingest and analyze large datasets stored in HDFS using real-world industry projects covering domains such as banking, telecommunications, social media, insurance, and e-commerce. Graduates can expect average salaries of Rs. 7,12,453 per year for Hadoop engineers according to payscale.com.
This document provides an overview of the Hadoop ecosystem. It begins by defining big data and explaining how Hadoop uses MapReduce and HDFS to allow for distributed processing and storage of large datasets across commodity hardware. It then describes various components of the Hadoop ecosystem for acquiring, arranging, analyzing, and visualizing data, including Flume, Sqoop, Kafka, HDFS, HBase, Spark, Pig, Hive, Impala, Mahout, and HUE. Real-world use cases of Hadoop at companies like Facebook, Twitter, and NASA are also discussed. Overall, the document outlines the key elements that make up the Hadoop ecosystem for working with big data.
The Fundamentals Guide to HDP and HDInsight - Gert Drapers
This session will give you an architectural overview and an introduction to the inner workings of HDP 2.0 (http://hortonworks.com/products/hdp-windows/) and HDInsight. The world has embraced the Hadoop toolkit to solve data problems ranging from ETL and data warehouses to event processing pipelines. As Hadoop consists of many components, services, and interfaces, understanding its architecture is crucial before you can successfully integrate it into your own environment.
Big Data is a recent phenomenon. Everyone talks about it, but do you really know what Big Data is? Join our four-part series about Big Data and you will get answers to your questions!
We will cover an introduction to Big Data and the platforms available for dealing with it. In the end, we will give you an insight into the possible future of dealing with Big Data.
Spark, Flink, Presto, and many others: this is just a sample of the frameworks used in real companies, and we will talk about some of them.
In the previous episode of this Big Data series, we talked about basic information concerning Big Data. This presentation, however, will be much more technical, as we will cover the most popular platforms you can use to deal with Big Data 2.0 systems and learn about the key differences between them. Let's go!
#CHEDTEB
www.chedteb.eu
Presenter: Ofer Mendelevitch of Hortonworks. Learn the benefits of big data for data scientists, and how Hadoop and HDInsight fit into the modern data architecture and enable data-driven products.
You'll learn:
* What data science actually means
* The term "data products"
* The benefits of using big data for data scientists
* How Hadoop helps data scientists work with big data
* About HDInsight, the big data platform from Microsoft and Hortonworks
Common and unique use cases for Apache Hadoop - Brock Noland
The document provides an overview of Apache Hadoop and common use cases. It describes how Hadoop is well-suited for log processing due to its ability to handle large amounts of data in parallel across commodity hardware. Specifically, it allows processing of log files to be distributed per unit of data, avoiding bottlenecks that can occur when trying to process a single large file sequentially.
These slides provide highlights of my book HDInsight Essentials. Book link is here: http://www.packtpub.com/establish-a-big-data-solution-using-hdinsight/book
BlueData Hunk Integration: Splunk Analytics for Hadoop - BlueData, Inc.
Hunk is a Splunk analytics tool that allows users to explore, analyze, and visualize raw big data stored in Hadoop and NoSQL data stores. It can interactively query raw data, accelerate reporting, create charts and dashboards, and archive historical data to HDFS. BlueData's EPIC platform enables running Hunk jobs on Hadoop clusters while accessing data from any storage system, such as HDFS, NFS, Gluster, and others. Hunk supports ingesting large amounts of data and provides pre-packaged analytics functions and intuitive visualization of results.
An introduction to big data, the problems associated with storing and analyzing it, and how Hadoop solves those problems with its HDFS and MapReduce frameworks. Also includes a short introduction to HDInsight, Hadoop on Windows Azure.
Hadoop is an open source software framework that allows for the distributed storage and processing of extremely large datasets across clusters of commodity hardware. It uses a scalable distributed file system called HDFS to store data reliably, and its MapReduce programming model enables parallel processing of huge datasets across large clusters of servers. The Hadoop ecosystem includes additional popular tools like Pig, Hive, HBase, and Zookeeper that provide SQL-like querying, real-time database access, and coordination services to make the Hadoop platform more full-featured and user-friendly.
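Since several of these summaries point to Hive's SQL-like querying, here is a minimal sketch of how an application might query Hive over JDBC. This is an illustration only, assuming a HiveServer2 endpoint; the host, port, credentials, and the page_views table are hypothetical placeholders, not details from any of the documents above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQuerySketch {
    public static void main(String[] args) throws Exception {
        // Requires the hive-jdbc driver on the classpath.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://localhost:10000/default", "hive", ""); // hypothetical endpoint
             Statement stmt = conn.createStatement();
             // Hypothetical table: page_views(url STRING, hits INT)
             ResultSet rs = stmt.executeQuery(
                 "SELECT url, SUM(hits) AS total FROM page_views GROUP BY url")) {
            while (rs.next()) {
                System.out.println(rs.getString("url") + " -> " + rs.getLong("total"));
            }
        }
    }
}
```

The same JDBC pattern works from any JVM tool, which is a large part of why Hive is often the first ecosystem component teams adopt.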
Real World Use Cases: Hadoop and NoSQL in Production - Codemotion
"Real World Use Cases: Hadoop and NoSQL in Production" by Tugdual Grall.
What's important about a technology is what you can use it to do. I've looked at what a number of groups are doing with Apache Hadoop and NoSQL in production, and I will relay what worked well for them and what did not. Drawing on real-world use cases, I show how people who understand these new approaches can employ them well in conjunction with traditional approaches and existing applications. Threat detection, data warehouse optimization, marketing efficiency, and biometric databases are some of the examples covered in this presentation.
This document discusses data ingestion with Spark. It provides an overview of Spark, which is a unified analytics engine that can handle batch processing, streaming, SQL queries, machine learning and graph processing. Spark improves on MapReduce by keeping data in-memory between jobs for faster processing. The document contrasts data collection, which occurs where data originates, with data ingestion, which receives and routes data, sometimes coupled with storage.
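To make the in-memory contrast with MapReduce concrete, here is a hedged sketch of a small Spark batch job in Java using the RDD API; the HDFS input and output paths are hypothetical placeholders, and the example is mine rather than code from the summarized document.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class SparkWordCountSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("WordCountSketch").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Hypothetical HDFS path; replace with a real location.
        JavaRDD<String> lines = sc.textFile("hdfs:///tmp/input.txt");

        JavaPairRDD<String, Integer> counts = lines
            .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator()) // split lines into words
            .mapToPair(word -> new Tuple2<>(word, 1))                      // emit (word, 1) pairs
            .reduceByKey(Integer::sum);                                    // sum counts per word

        counts.cache(); // keep the result in memory so later jobs reuse it without recomputation
        counts.saveAsTextFile("hdfs:///tmp/wordcount-output");             // hypothetical output path

        sc.stop();
    }
}
```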
Big Data and Hadoop - key drivers, ecosystem and use cases - Jeff Kelly
This document discusses big data and Hadoop. It defines big data as extremely large data sets that are difficult to process using traditional databases. Three key drivers of big data are identified as volume, variety and velocity of data. Hadoop is introduced as an open source framework for storing and processing big data across multiple machines in parallel. Examples of big data pioneers using Hadoop like Yahoo, Facebook and LinkedIn are provided. Potential uses of big data in the financial services industry are also briefly outlined.
2.5 billion gigabytes of data are generated daily, which organizations use to gain customer insights, improve offerings, and optimize operations. Working with large volumes and varieties of data generated too quickly presents challenges. Traditional methods of collecting, preparing, and analyzing data using coding tools and Excel are difficult. New AI-based tools now empower users to more intuitively work with data by automating data collection, cleaning, and analysis.
Here I talk about examples and use cases for Big Data and Big Data analytics, and how we accomplished massive-scale sentiment, campaign, and marketing analytics for Razorfish using a collection of database, Big Data, and analytics technologies.
An overview of big data and Hadoop, the architecture it uses, and the way it works on data sets. The slides also show the various fields where they are most often used and implemented.
This document discusses how a DBA can transition to becoming a data scientist using Oracle's big data tools. It provides an overview of big data concepts like Hadoop, NoSQL databases, and the Hadoop ecosystem. It also describes Oracle's Big Data Appliance and how it integrates with tools like Oracle NoSQL Database, Cloudera Hadoop, and the R programming environment. The document argues that with skills in Hadoop, MapReduce, NoSQL, and Hive/Pig, along with tools in Oracle's Big Data Appliance, a DBA can become a data scientist.
Apache Eagle is a distributed real-time monitoring and alerting engine for Hadoop that was created by eBay and later open sourced as an Apache Incubator project. It provides security for Hadoop systems by instantly identifying access to sensitive data, recognizing attacks/malicious activity, and blocking access in real time through complex policy definitions and stream processing. Eagle was designed to handle the huge volume of metrics and logs generated by large-scale Hadoop deployments through its distributed architecture and use of technologies like Apache Storm and Kafka.
Intro to Big Data and Hadoop (UBC CS lecture series) - G. Fawkes (gfawkesnew2)
The document is an introduction to analytics and big data using Hadoop presented by Geoff Fawkes. It discusses the challenges of large amounts of data, how Hadoop addresses these challenges through its HDFS distributed file system and MapReduce programming model. It provides examples of how companies use Hadoop for applications like analyzing customer behavior from set top cable boxes or performing sentiment analysis on product reviews. The presentation recommends further reading on analytics, big data, and data science topics.
This document summarizes Pervasive DataRush, a software platform that can eliminate performance bottlenecks in data-intensive applications. It processes data in parallel to provide high throughput and scale performance on commodity hardware. DataRush integrates with Apache Hadoop and can increase Hadoop performance, processing data up to 13x faster than MapReduce. It is used across industries for tasks like genomic analysis, fraud detection, cybersecurity, and more.
The document discusses Apache Hadoop, an open-source software framework for distributed storage and processing of large datasets across clusters of computers. It provides an overview of Hadoop core projects including HDFS, MapReduce, and related projects like Pig, Hive, HBase and Zookeeper. The document also references presentations and articles about Hadoop use cases at Yahoo and the evolution of the Hadoop ecosystem with higher-level tools and interfaces for programming, querying, and managing distributed Hadoop applications and data.
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie... - Cloudera, Inc.
The document summarizes the evolution from MapReduce to Apache Spark for data processing. Some key points:
- MapReduce provided breakthroughs like data locality, fault tolerance, and scalability but the programming model required developing generally scalable solutions.
- Apache Spark provides a richer, more expressive API that allows developing applications with 2-5x less code than MapReduce. It also provides fast in-memory execution up to an order of magnitude faster than MapReduce.
- A survey found 82% of developers replaced MapReduce with Spark for its speed and ability to handle large datasets faster than MapReduce. Spark is now an important part of the Hadoop ecosystem.
The document provides an overview of Hadoop, including:
- A brief history of Hadoop and its origins from Google and Apache projects
- An explanation of Hadoop's architecture including HDFS, MapReduce, JobTracker, TaskTracker, and DataNodes
- Examples of how large companies like Yahoo, Facebook, and Amazon use Hadoop for applications like log processing, searches, and advertisement targeting
The document provides an overview of Hadoop, including:
- A brief history of Hadoop and its origins at Google and Yahoo
- An explanation of Hadoop's architecture including HDFS, MapReduce, JobTracker, TaskTracker, and DataNodes
- Examples of how large companies like Facebook and Amazon use Hadoop to process massive amounts of data
This is a presentation on Apache Hadoop technology. It may help beginners learn Hadoop terminology, and it contains pictures that describe how the technology works. I hope it will be helpful for beginners.
Thank you.
This presentation is about Apache Hadoop technology and may be helpful for beginners, who will learn some Hadoop terminology. It also contains diagrams showing how the technology works.
Thank you.
Hadoop is an open-source software framework for distributed storage and processing of large datasets across clusters of computers. It has four main modules - Hadoop Common, HDFS, YARN and MapReduce. HDFS provides a distributed file system that stores data reliably across commodity hardware. MapReduce is a programming model used to process large amounts of data in parallel. Hadoop architecture uses a master-slave model, with a NameNode master and DataNode slaves. It provides fault tolerance, high throughput access to application data and scales to thousands of machines.
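As a rough illustration of that master-slave split (my sketch, not part of the summarized document), the standard HDFS Java client can ask the NameNode which DataNodes hold the blocks of a file; the file path below is a hypothetical placeholder.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // reads core-site.xml / hdfs-site.xml from the classpath
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical file stored in HDFS.
        Path file = new Path("/data/logs/2015-01-01.log");
        FileStatus status = fs.getFileStatus(file);

        // The NameNode (master) answers this metadata query; the data itself stays on the DataNodes (slaves).
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            System.out.println("Block at offset " + block.getOffset()
                + " replicated on: " + String.join(", ", block.getHosts()));
        }
        fs.close();
    }
}
```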
A summarized version of a presentation on Big Data architecture, covering everything from the Big Data concept to Hadoop and tools like Hive, Pig, and Cassandra.
This presentation provides an overview of Hadoop, including:
- A brief history of data and the rise of big data from various sources.
- An introduction to Hadoop as an open source framework used for distributed processing and storage of large datasets across clusters of computers.
- Descriptions of the key components of Hadoop - HDFS for storage, and MapReduce for processing - and how they work together in the Hadoop architecture.
- An explanation of how Hadoop can be installed and configured in standalone, pseudo-distributed and fully distributed modes.
- Examples of major companies that use Hadoop like Amazon, Facebook, Google and Yahoo to handle their large-scale data and analytics needs.
This document provides an introduction and overview of Hadoop. It describes Hadoop as an open source framework that allows distributed processing of large datasets across clusters of commodity hardware. It discusses that Hadoop consists of three key components - HDFS for storage, YARN for resource management, and MapReduce for distributed processing. The document also outlines several characteristics of Hadoop including that it is open source, fault tolerant, scalable, and able to handle huge volumes of data efficiently.
The presentation covers the following topics: 1) Hadoop introduction 2) Hadoop nodes and daemons 3) Architecture 4) Hadoop's best features 5) Hadoop characteristics. For further knowledge of Hadoop, refer to the link: http://data-flair.training/blogs/hadoop-tutorial-for-beginners/
M. Florence Dayana - Hadoop Foundation for Analytics.pptx - Dr. Florence Dayana
Hadoop Foundation for Analytics
History of Hadoop
Features of Hadoop
Key Advantages of Hadoop
Why Hadoop
Versions of Hadoop
Eco Projects
Essential of Hadoop ecosystem
RDBMS versus Hadoop
Key Aspects of Hadoop
Components of Hadoop
This document discusses deploying and researching Hadoop in virtual machines. It provides definitions of Hadoop, MapReduce, and HDFS. It describes using CloudStack to deploy a Hadoop cluster across multiple virtual machines to enable distributed and parallel processing of large datasets. The proposed system is to deploy Hadoop applications on virtual machines from a CloudStack infrastructure for improved performance, reliability and reduced power consumption compared to a single virtual machine. It outlines the hardware, software, architecture, design, testing and outputs of the proposed system.
Hadoop is an open source software framework that allows for distributed processing of large data sets across clusters of computers. It uses MapReduce as a programming model and HDFS for storage. Hadoop supports various big data applications like HBase for distributed column storage, Hive for data warehousing and querying, Pig and Jaql for data flow languages, and Hadoop ecosystem projects for tasks like system monitoring and machine learning.
The document provides an overview of Apache Hadoop and how it addresses challenges related to big data. It discusses how Hadoop uses HDFS to distribute and store large datasets across clusters of commodity servers and uses MapReduce as a programming model to process and analyze the data in parallel. The core components of Hadoop - HDFS for storage and MapReduce for processing - allow it to efficiently handle large volumes and varieties of data across distributed systems in a fault-tolerant manner. Major companies have adopted Hadoop to derive insights from their big data.
We provide Hadoop training in Hyderabad and Bangalore, including corporate training, delivered by faculty with 12+ years of experience.
Real-time industry experts from MNCs
Resume preparation by expert professionals
Lab exercises
Interview preparation
Expert advice
Hadoop is an open-source framework for distributed storage and processing of large datasets across clusters of commodity hardware. It addresses limitations in traditional RDBMS for big data by allowing scaling to large clusters of commodity servers, high fault tolerance, and distributed processing. The core components of Hadoop are HDFS for distributed storage and MapReduce for distributed processing. Hadoop has an ecosystem of additional tools like Pig, Hive, HBase and more. Major companies use Hadoop to process and gain insights from massive amounts of structured and unstructured data.
2. Big Data - Hadoop
90% of the world's data was generated in the last few years.
Big Data: large datasets that cannot be processed using traditional computing techniques.
What comes under Big Data:
• Social Media Data
• Stock exchange Data
• Search engine data
4. HADOOP
• Developed by Doug Cutting, Mike Cafarella, and team
• Open-source project built on the MapReduce algorithm
• Apache Hadoop is a registered trademark of the Apache Software Foundation
5. HADOOP Framework
• Hadoop Common: Java libraries
• Hadoop YARN: job scheduling and cluster management framework
• Hadoop HDFS: distributed file system that provides high-throughput access to application data (see the sketch after this list)
• Hadoop MapReduce: software framework for parallel processing of large data sets
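A minimal sketch of the HDFS component listed above, assuming the standard Hadoop FileSystem Java API; the file path is a hypothetical placeholder and the example is not part of the original slides.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadWriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // picks up fs.defaultFS from core-site.xml
        FileSystem fs = FileSystem.get(conf);

        Path path = new Path("/user/demo/hello.txt"); // hypothetical HDFS path

        // Write: HDFS splits the file into blocks and replicates them across DataNodes.
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("hello from hdfs\n".getBytes(StandardCharsets.UTF_8));
        }

        // Read the file back as a stream.
        try (FSDataInputStream in = fs.open(path);
             BufferedReader reader = new BufferedReader(
                 new InputStreamReader(in, StandardCharsets.UTF_8))) {
            System.out.println(reader.readLine());
        }
        fs.close();
    }
}
```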
6. How Does HADOOP Work?
Stage 1
A user submits a job to the Hadoop job client for the required processing by specifying:
• the input and output file locations in the DFS
• the job configuration, by setting different parameters specific to the job
Stage 2
• The Hadoop job client then submits the job and configuration to the JobTracker (a job-client sketch follows below).
• The JobTracker distributes the configuration to the slaves, scheduling tasks, monitoring them, and providing status back to the job client.
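A hedged sketch of what such a job client can look like with Hadoop's newer org.apache.hadoop.mapreduce Java API (the slides describe the classic JobTracker flow); the WordCountMapper/WordCountReducer classes, the reducer count, and the input/output paths are placeholders introduced for illustration, with the mapper and reducer themselves sketched after Stage 3 below.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapreduce.job.reduces", "2"); // example of a job-specific parameter

        Job job = Job.getInstance(conf, "word count");  // the job the client submits
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCountMapper.class);      // defined in the next sketch
        job.setReducerClass(WordCountReducer.class);    // defined in the next sketch
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Input and output locations in the distributed file system (hypothetical paths).
        FileInputFormat.addInputPath(job, new Path("/user/demo/input"));
        FileOutputFormat.setOutputPath(job, new Path("/user/demo/output"));

        // Submit to the cluster and wait; scheduling and monitoring happen on the cluster side.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```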
7. How Does HADOOP Work?
Stage 3
The TaskTracker executes the task as per the MapReduce implementation, and the output is stored in output files on the file system (a mapper/reducer sketch follows below).
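To show what a task actually runs, here is a hedged word-count style mapper and reducer matching the driver sketch above; this is illustrative code, not taken from the original slides.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Map task: runs on each input split, emitting (word, 1) pairs.
class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(value.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);
        }
    }
}

// Reduce task: receives all values for one key after the shuffle and sort, and sums them.
class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
```

Between the map and reduce phases the framework shuffles and sorts the emitted pairs by key, which is why the reducer sees all counts for one word together.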