SlideShare a Scribd company logo
Hadoop Ecosystem: FromBig
Data to Big Results
Name: Tazeen GulrezSayed
Class : TE - A
Roll number :62
Subject: Data Science And Big Data Analytics
1.Introduction to Hadoop Ecosystem
2.HDFS - Hadoop Distributed File System
3.YARN - Yet Another Resource
NegotiatorMapReduce
4.Other Components of Hadoop Ecosystem
5.Conclusion
INDEX
Hadoop is an open-source software framework that is
used for distributed storage and processing of big
data. It was created by Doug Cutting and Mike
Cafarella in 2005, and it has since become one of the
most popular big data processing platforms in the
world.
The Hadoop ecosystem consists of several
components, including HDFS (Hadoop Distributed File
System), YARN (Yet Another Resource Negotiator),
and MapReduce. These components work together to
provide a scalable, fault-tolerant platform for
processing large amounts ofdata.
Introduction to Hadoop
Ecosystem
HDFS - Hadoop Distributed File
System
Hadoop is an open-source software framework that is
used for distributed storage and processing of big
data. It was created by Doug Cutting and Mike
Cafarella in 2005, and it has since become one of the
most popular big data processing platforms in the
world.
The Hadoop ecosystem consists of several
components, including HDFS (Hadoop Distributed File
System), YARN (Yet Another Resource Negotiator),
and MapReduce. These components work together to
provide a scalable, fault-tolerant platform for
processing large amounts ofdata.
YARN - Yet Another Resource
Negotiator
YARN is the resource management layer of Hadoop. It
is responsible for managing resources in a Hadoop
cluster, such as CPU, memory, and disk space. YARN
allows multiple applications to run on the same
cluster without interfering with each other.
YARN also enables dynamic allocation of resources,
allowing applications to request additional resources
as needed. This makes it possible to run complex big
data applications that require significant amounts of
resources.
MapReduce
Map Reduce is a programming model used for
processing large datasets in parallel. It works by
breaking down a large dataset into smaller chunks,
which are then processed in parallel across multiple
nodes in a cluster. Map Reduce consists of two main
functions: map andreduce.
The map function takes input data and converts it into
key-value pairs, while the reduce function takes the
output of the map function and combines it into a
smaller set of key-value pairs. Map Reduce is highly
scalable and fault-tolerant, making it ideal for
processing large amounts ofdata.
Other Components ofHadoop
Ecosystem
In addition to HDFS, YARN, and MapReduce, the
Hadoop ecosystem includes several other
components that provide additional functionality.
These include Hive, Pig, HBase, and Spark.n addition
to HDFS, YARN, and MapReduce, the Hadoop
ecosystem includes several other components that
provide additional functionality. These include Hive,
Pig, HBase, and Spark.
Hive is a data warehouse system that provides SQL-
like querying capabilities for Hadoop. Pig is a high-
level platform for creating MapReduce programs.
HBase is a NoSQL database that provides real-time
access to data stored in Hadoop. Spark is a fast, in-
memory data processing engine that can be used with
Hadoop to perform real-time analytics.
CONCLUSION
The Hadoop ecosystem is a powerful platform
for processing large amounts of data. With its
distributed architecture, fault tolerance, and
scalability, Hadoop has become the go-to
solution for big data processing.
By understanding the various components of the
Hadoop ecosystem, businesses and
organizations can take advantage of its
capabilities to gain insights and make informed
decisions based on theirdata.
62_Tazeen_Sayed_Hadoop_Ecosystem.pptx

More Related Content

Similar to 62_Tazeen_Sayed_Hadoop_Ecosystem.pptx

Big data
Big dataBig data
Big data
revathireddyb
 
Big data Analytics Hadoop
Big data Analytics HadoopBig data Analytics Hadoop
Big data Analytics Hadoop
Mishika Bharadwaj
 
Survey on Performance of Hadoop Map reduce Optimization Methods
Survey on Performance of Hadoop Map reduce Optimization MethodsSurvey on Performance of Hadoop Map reduce Optimization Methods
Survey on Performance of Hadoop Map reduce Optimization Methods
paperpublications3
 
Hadoop and its role in Facebook: An Overview
Hadoop and its role in Facebook: An OverviewHadoop and its role in Facebook: An Overview
Hadoop and its role in Facebook: An Overview
rahulmonikasharma
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
rebeccatho
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log Processing
Hitendra Kumar
 
Introduction to Apache hadoop
Introduction to Apache hadoopIntroduction to Apache hadoop
Introduction to Apache hadoop
Omar Jaber
 
G017143640
G017143640G017143640
G017143640
IOSR Journals
 
Big Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – HadoopBig Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – Hadoop
IOSR Journals
 
Unit-3_BDA.ppt
Unit-3_BDA.pptUnit-3_BDA.ppt
Unit-3_BDA.ppt
PoojaShah174393
 
BIGDATA MODULE 3.pdf
BIGDATA MODULE 3.pdfBIGDATA MODULE 3.pdf
BIGDATA MODULE 3.pdf
DIVYA370851
 
Cppt Hadoop
Cppt HadoopCppt Hadoop
Cppt Hadoop
chunkypandey12
 
Cppt
CpptCppt
Cppt
CpptCppt
A Glimpse of Bigdata - Introduction
A Glimpse of Bigdata - IntroductionA Glimpse of Bigdata - Introduction
A Glimpse of Bigdata - Introduction
saisreealekhya
 
Lecture 2 Hadoop.pptx
Lecture 2 Hadoop.pptxLecture 2 Hadoop.pptx
Lecture 2 Hadoop.pptx
Anonymous9etQKwW
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
AshishRathore72
 
Hadoop map reduce
Hadoop map reduceHadoop map reduce
Hadoop map reduce
VijayMohan Vasu
 
Harnessing Hadoop and Big Data to Reduce Execution Times
Harnessing Hadoop and Big Data to Reduce Execution TimesHarnessing Hadoop and Big Data to Reduce Execution Times
Harnessing Hadoop and Big Data to Reduce Execution Times
David Tjahjono,MD,MBA(UK)
 

Similar to 62_Tazeen_Sayed_Hadoop_Ecosystem.pptx (20)

Big data
Big dataBig data
Big data
 
Big data Analytics Hadoop
Big data Analytics HadoopBig data Analytics Hadoop
Big data Analytics Hadoop
 
Survey on Performance of Hadoop Map reduce Optimization Methods
Survey on Performance of Hadoop Map reduce Optimization MethodsSurvey on Performance of Hadoop Map reduce Optimization Methods
Survey on Performance of Hadoop Map reduce Optimization Methods
 
Hadoop and its role in Facebook: An Overview
Hadoop and its role in Facebook: An OverviewHadoop and its role in Facebook: An Overview
Hadoop and its role in Facebook: An Overview
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log Processing
 
Anju
AnjuAnju
Anju
 
Introduction to Apache hadoop
Introduction to Apache hadoopIntroduction to Apache hadoop
Introduction to Apache hadoop
 
G017143640
G017143640G017143640
G017143640
 
Big Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – HadoopBig Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – Hadoop
 
Unit-3_BDA.ppt
Unit-3_BDA.pptUnit-3_BDA.ppt
Unit-3_BDA.ppt
 
BIGDATA MODULE 3.pdf
BIGDATA MODULE 3.pdfBIGDATA MODULE 3.pdf
BIGDATA MODULE 3.pdf
 
Cppt Hadoop
Cppt HadoopCppt Hadoop
Cppt Hadoop
 
Cppt
CpptCppt
Cppt
 
Cppt
CpptCppt
Cppt
 
A Glimpse of Bigdata - Introduction
A Glimpse of Bigdata - IntroductionA Glimpse of Bigdata - Introduction
A Glimpse of Bigdata - Introduction
 
Lecture 2 Hadoop.pptx
Lecture 2 Hadoop.pptxLecture 2 Hadoop.pptx
Lecture 2 Hadoop.pptx
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
 
Hadoop map reduce
Hadoop map reduceHadoop map reduce
Hadoop map reduce
 
Harnessing Hadoop and Big Data to Reduce Execution Times
Harnessing Hadoop and Big Data to Reduce Execution TimesHarnessing Hadoop and Big Data to Reduce Execution Times
Harnessing Hadoop and Big Data to Reduce Execution Times
 

Recently uploaded

Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
TeeVichai
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
Democratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek AryaDemocratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek Arya
abh.arya
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
JoytuBarua2
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
Kamal Acharya
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
Robbie Edward Sayers
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.
PrashantGoswami42
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
SamSarthak3
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
karthi keyan
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
ankuprajapati0525
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
R&R Consult
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
seandesed
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
AJAYKUMARPUND1
 

Recently uploaded (20)

Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 
Democratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek AryaDemocratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek Arya
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
 

62_Tazeen_Sayed_Hadoop_Ecosystem.pptx

  • 1. Hadoop Ecosystem: FromBig Data to Big Results Name: Tazeen GulrezSayed Class : TE - A Roll number :62 Subject: Data Science And Big Data Analytics
  • 2. 1.Introduction to Hadoop Ecosystem 2.HDFS - Hadoop Distributed File System 3.YARN - Yet Another Resource NegotiatorMapReduce 4.Other Components of Hadoop Ecosystem 5.Conclusion INDEX
  • 3. Hadoop is an open-source software framework that is used for distributed storage and processing of big data. It was created by Doug Cutting and Mike Cafarella in 2005, and it has since become one of the most popular big data processing platforms in the world. The Hadoop ecosystem consists of several components, including HDFS (Hadoop Distributed File System), YARN (Yet Another Resource Negotiator), and MapReduce. These components work together to provide a scalable, fault-tolerant platform for processing large amounts ofdata. Introduction to Hadoop Ecosystem
  • 4. HDFS - Hadoop Distributed File System Hadoop is an open-source software framework that is used for distributed storage and processing of big data. It was created by Doug Cutting and Mike Cafarella in 2005, and it has since become one of the most popular big data processing platforms in the world. The Hadoop ecosystem consists of several components, including HDFS (Hadoop Distributed File System), YARN (Yet Another Resource Negotiator), and MapReduce. These components work together to provide a scalable, fault-tolerant platform for processing large amounts ofdata.
  • 5. YARN - Yet Another Resource Negotiator YARN is the resource management layer of Hadoop. It is responsible for managing resources in a Hadoop cluster, such as CPU, memory, and disk space. YARN allows multiple applications to run on the same cluster without interfering with each other. YARN also enables dynamic allocation of resources, allowing applications to request additional resources as needed. This makes it possible to run complex big data applications that require significant amounts of resources.
  • 6. MapReduce Map Reduce is a programming model used for processing large datasets in parallel. It works by breaking down a large dataset into smaller chunks, which are then processed in parallel across multiple nodes in a cluster. Map Reduce consists of two main functions: map andreduce. The map function takes input data and converts it into key-value pairs, while the reduce function takes the output of the map function and combines it into a smaller set of key-value pairs. Map Reduce is highly scalable and fault-tolerant, making it ideal for processing large amounts ofdata.
  • 7. Other Components ofHadoop Ecosystem In addition to HDFS, YARN, and MapReduce, the Hadoop ecosystem includes several other components that provide additional functionality. These include Hive, Pig, HBase, and Spark.n addition to HDFS, YARN, and MapReduce, the Hadoop ecosystem includes several other components that provide additional functionality. These include Hive, Pig, HBase, and Spark. Hive is a data warehouse system that provides SQL- like querying capabilities for Hadoop. Pig is a high- level platform for creating MapReduce programs. HBase is a NoSQL database that provides real-time access to data stored in Hadoop. Spark is a fast, in- memory data processing engine that can be used with Hadoop to perform real-time analytics.
  • 8. CONCLUSION The Hadoop ecosystem is a powerful platform for processing large amounts of data. With its distributed architecture, fault tolerance, and scalability, Hadoop has become the go-to solution for big data processing. By understanding the various components of the Hadoop ecosystem, businesses and organizations can take advantage of its capabilities to gain insights and make informed decisions based on theirdata.