2. Hadoop
What is MapReduce
an example
MapReduce Process
Job Tracker & Task Tracker
Anatomy of File Write
Anatomy of File Read
Replication & Rack awareness
3. Hadoop is a framework that allows
for distributed processing of large
data sets across clusters of
commodity computers using a
simple programming model
4. Hadoop was designed to enable applications to make most out of cluster
architecture by addressing two key points:
1. Layout of data across the cluster ensuring data is evenly distributed
2. Design of applications to benefit from data locality
It brings us two main mechanism of hadoop hdfs and hadoop MapReduce
Hello everyone. Welcome to the session on Demystifying Big Data & Hadoop.
In this session we will discuss the buzzword Big Data & Hadoop. What is big data and what is not big data.
I am Prakriti. I have 15 years of experience in Reporting and BI.
102,205,170
It is an Open-source Data Management with scale-out storage and distributed processing.