SlideShare a Scribd company logo
 Introduction to Hadoop
 Hadoop nodes & daemons
 Hadoop Architecture
 Characteristics
 Hadoop Features
2
The Technology that empowers Yahoo, Facebook, Twitter, Walmart and
others
Hadoop
3
An Open Source framework
that allows distributed
processing of large data-sets
across the cluster of
commodity hardware
4
An Open Source framework
that allows distributed
processing of large data-sets
across the cluster of
commodity hardware
Open Source
 Source code is freely
available
 It may be redistributed
and modified
5
An open source framework
that allows Distributed
Processing of large data-sets
across the cluster of
commodity hardware
Distributed Processing
 Data is processed/
distributed on multiple
nodes / servers
 Multiple machines
processes the data
independently
6
An open source framework
that allows distributed
processing of large data-sets
across the Cluster of
commodity hardware
Cluster
 Multiple machines
connected together
 Nodes are connected via
LAN
7
An open source framework
that allows distributed
processing of large data-sets
across the cluster of
Commodity Hardware
Commodity Hardware
 Economic / affordable
machines
 Typically low
performance hardware
8
 Open source framework written in Java
 Inspired by Google's Map-Reduce programming model as
well as its file system (GFS)
9
Hadoop defeated
Super computer
Hadoop became
top-level project
launched Hive,
SQL Support for Hadoop
Development of
started as Lucene sub-project
published GFS &
MapReduce papers
2002 2003 2005 2006 2008
Doug Cutting started
working on
Doug Cutting added
DFS & MapReduce
in
converted 4TB of
image archives over
100 EC2 instances
Doug Cutting
joined Cloudera
2009
2004
Hadoop History
2007
10
Hadoop consists of three key parts
11
Master Node Slave Node
Nodes
12
Master Node Slave Node
Resource
Manager
NameNode
Node
Manager
DataNode
Nodes
13
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
Sub
Work
14
15
 Source code is freely
available
 Can be redistributed
 Can be modified
Free
Affordabl
e
Communi
ty
Transpare
nt
Inter-
operable
No
vendor
lock
Open
Source
16
 Data is processed
distributedly on cluster
 Multiple nodes in the
cluster process data
independently
Centralized Processing
Distributed Processing
17
 Failure of nodes are
recovered automatically
 Framework takes care of
failure of hardware as well
tasks
18
 Data is reliably stored on
the cluster of machines
despite machine failures
 Failure of nodes doesn’t
cause data loss
19
 Data is highly available
and accessible despite
hardware failure
 There will be no downtime
for end user application
due to data
20
 Vertical Scalability – New
hardware can be added to
the nodes
 Horizontal Scalability –
New nodes can be added
on the fly
21
 No need to purchase costly license
 No need to purchase costly hardware
Economic
Open
Source
Commodity
Hardware =
+
22
 Distributed computing
challenges are handled by
framework
 Client just need to concentrate
on business logic
23
 Move computation to data
instead of data to
computation
 Data is processed on the
nodes where it is stored
Storage Servers App Servers
Dat
a
Dat
a
Dat
a
Dat
a
Servers
Dat
a
Dat
a
Dat
a
Dat
a
Algorith
m
Alg
o
Alg
o
Alg
o
Alg
o
24
 Everyday we generate 2.5 quintillion bytes of data
 Hadoop handles huge volumes of data efficiently
 Hadoop uses the power of distributed computing
 HDFS & Yarn are two main components of Hadoop
 It is highly fault tolerant, reliable & available
25
Hadoop_Introduction_pptx.pptx

More Related Content

Similar to Hadoop_Introduction_pptx.pptx

2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai
Anand Haridass
 
OPERATING SYSTEM .pptx
OPERATING SYSTEM .pptxOPERATING SYSTEM .pptx
OPERATING SYSTEM .pptx
AltafKhadim
 
Big data Analytics Hadoop
Big data Analytics HadoopBig data Analytics Hadoop
Big data Analytics Hadoop
Mishika Bharadwaj
 
Hadoop architecture-tutorial
Hadoop  architecture-tutorialHadoop  architecture-tutorial
Hadoop architecture-tutorial
vinayiqbusiness
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
Chirag Ahuja
 
Hadoop and Big Data
Hadoop and Big DataHadoop and Big Data
Hadoop and Big Data
Harshdeep Kaur
 
Analyzing Big data in R and Scala using Apache Spark 17-7-19
Analyzing Big data in R and Scala using Apache Spark  17-7-19Analyzing Big data in R and Scala using Apache Spark  17-7-19
Analyzing Big data in R and Scala using Apache Spark 17-7-19
Ahmed Elsayed
 
Data Engineer's Lunch #55: Get Started in Data Engineering
Data Engineer's Lunch #55: Get Started in Data EngineeringData Engineer's Lunch #55: Get Started in Data Engineering
Data Engineer's Lunch #55: Get Started in Data Engineering
Anant Corporation
 
Bigdata and Hadoop Introduction
Bigdata and Hadoop IntroductionBigdata and Hadoop Introduction
Bigdata and Hadoop Introduction
umapavankumar kethavarapu
 
Hadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for women
Hadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for womenHadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for women
Hadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for women
maharajothip1
 
Introduction to hazelcast
Introduction to hazelcastIntroduction to hazelcast
Introduction to hazelcast
Emin Demirci
 
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014 WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
Chris Almond
 
Apache Hadoop- Hadoop Basics.pptx
Apache Hadoop- Hadoop Basics.pptxApache Hadoop- Hadoop Basics.pptx
Apache Hadoop- Hadoop Basics.pptx
Miraj Godha
 
IRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET- Secured Hadoop Environment
IRJET- Secured Hadoop Environment
IRJET Journal
 
Pivotal: Virtualize Big Data to Make the Elephant Dance
Pivotal: Virtualize Big Data to Make the Elephant DancePivotal: Virtualize Big Data to Make the Elephant Dance
Pivotal: Virtualize Big Data to Make the Elephant Dance
EMC
 
Hadoop_Its_Not_Just_Internal_Storage_V14
Hadoop_Its_Not_Just_Internal_Storage_V14Hadoop_Its_Not_Just_Internal_Storage_V14
Hadoop_Its_Not_Just_Internal_Storage_V14
John Sing
 
Hadoop jon
Hadoop jonHadoop jon
Hadoop jon
Humoyun Ahmedov
 
BDA Mod2@AzDOCUMENTS.in.pdf
BDA Mod2@AzDOCUMENTS.in.pdfBDA Mod2@AzDOCUMENTS.in.pdf
BDA Mod2@AzDOCUMENTS.in.pdf
KUMARRISHAV37
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
MLconf
 

Similar to Hadoop_Introduction_pptx.pptx (20)

2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai
 
OPERATING SYSTEM .pptx
OPERATING SYSTEM .pptxOPERATING SYSTEM .pptx
OPERATING SYSTEM .pptx
 
Big data Analytics Hadoop
Big data Analytics HadoopBig data Analytics Hadoop
Big data Analytics Hadoop
 
Hadoop architecture-tutorial
Hadoop  architecture-tutorialHadoop  architecture-tutorial
Hadoop architecture-tutorial
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
Hadoop and Big Data
Hadoop and Big DataHadoop and Big Data
Hadoop and Big Data
 
Analyzing Big data in R and Scala using Apache Spark 17-7-19
Analyzing Big data in R and Scala using Apache Spark  17-7-19Analyzing Big data in R and Scala using Apache Spark  17-7-19
Analyzing Big data in R and Scala using Apache Spark 17-7-19
 
Data Engineer's Lunch #55: Get Started in Data Engineering
Data Engineer's Lunch #55: Get Started in Data EngineeringData Engineer's Lunch #55: Get Started in Data Engineering
Data Engineer's Lunch #55: Get Started in Data Engineering
 
Bigdata and Hadoop Introduction
Bigdata and Hadoop IntroductionBigdata and Hadoop Introduction
Bigdata and Hadoop Introduction
 
Hadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for women
Hadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for womenHadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for women
Hadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for women
 
Introduction to hazelcast
Introduction to hazelcastIntroduction to hazelcast
Introduction to hazelcast
 
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014 WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
 
Apache Hadoop- Hadoop Basics.pptx
Apache Hadoop- Hadoop Basics.pptxApache Hadoop- Hadoop Basics.pptx
Apache Hadoop- Hadoop Basics.pptx
 
IRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET- Secured Hadoop Environment
IRJET- Secured Hadoop Environment
 
Pivotal: Virtualize Big Data to Make the Elephant Dance
Pivotal: Virtualize Big Data to Make the Elephant DancePivotal: Virtualize Big Data to Make the Elephant Dance
Pivotal: Virtualize Big Data to Make the Elephant Dance
 
Final White Paper_
Final White Paper_Final White Paper_
Final White Paper_
 
Hadoop_Its_Not_Just_Internal_Storage_V14
Hadoop_Its_Not_Just_Internal_Storage_V14Hadoop_Its_Not_Just_Internal_Storage_V14
Hadoop_Its_Not_Just_Internal_Storage_V14
 
Hadoop jon
Hadoop jonHadoop jon
Hadoop jon
 
BDA Mod2@AzDOCUMENTS.in.pdf
BDA Mod2@AzDOCUMENTS.in.pdfBDA Mod2@AzDOCUMENTS.in.pdf
BDA Mod2@AzDOCUMENTS.in.pdf
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
 

Recently uploaded

Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
AJAYKUMARPUND1
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
zwunae
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
R&R Consult
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
seandesed
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
Kamal Acharya
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
SamSarthak3
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
JoytuBarua2
 
ethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.pptethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.ppt
Jayaprasanna4
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
Pratik Pawar
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
ankuprajapati0525
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
FluxPrime1
 
ethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.pptethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.ppt
Jayaprasanna4
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
MdTanvirMahtab2
 
ML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptxML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptx
Vijay Dialani, PhD
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
VENKATESHvenky89705
 

Recently uploaded (20)

Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
 
ethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.pptethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.ppt
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
 
ethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.pptethical hacking in wireless-hacking1.ppt
ethical hacking in wireless-hacking1.ppt
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
 
ML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptxML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptx
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
 

Hadoop_Introduction_pptx.pptx

  • 1.
  • 2.  Introduction to Hadoop  Hadoop nodes & daemons  Hadoop Architecture  Characteristics  Hadoop Features 2
  • 3. The Technology that empowers Yahoo, Facebook, Twitter, Walmart and others Hadoop 3
  • 4. An Open Source framework that allows distributed processing of large data-sets across the cluster of commodity hardware 4
  • 5. An Open Source framework that allows distributed processing of large data-sets across the cluster of commodity hardware Open Source  Source code is freely available  It may be redistributed and modified 5
  • 6. An open source framework that allows Distributed Processing of large data-sets across the cluster of commodity hardware Distributed Processing  Data is processed/ distributed on multiple nodes / servers  Multiple machines processes the data independently 6
  • 7. An open source framework that allows distributed processing of large data-sets across the Cluster of commodity hardware Cluster  Multiple machines connected together  Nodes are connected via LAN 7
  • 8. An open source framework that allows distributed processing of large data-sets across the cluster of Commodity Hardware Commodity Hardware  Economic / affordable machines  Typically low performance hardware 8
  • 9.  Open source framework written in Java  Inspired by Google's Map-Reduce programming model as well as its file system (GFS) 9
  • 10. Hadoop defeated Super computer Hadoop became top-level project launched Hive, SQL Support for Hadoop Development of started as Lucene sub-project published GFS & MapReduce papers 2002 2003 2005 2006 2008 Doug Cutting started working on Doug Cutting added DFS & MapReduce in converted 4TB of image archives over 100 EC2 instances Doug Cutting joined Cloudera 2009 2004 Hadoop History 2007 10
  • 11. Hadoop consists of three key parts 11
  • 12. Master Node Slave Node Nodes 12
  • 13. Master Node Slave Node Resource Manager NameNode Node Manager DataNode Nodes 13
  • 15. 15
  • 16.  Source code is freely available  Can be redistributed  Can be modified Free Affordabl e Communi ty Transpare nt Inter- operable No vendor lock Open Source 16
  • 17.  Data is processed distributedly on cluster  Multiple nodes in the cluster process data independently Centralized Processing Distributed Processing 17
  • 18.  Failure of nodes are recovered automatically  Framework takes care of failure of hardware as well tasks 18
  • 19.  Data is reliably stored on the cluster of machines despite machine failures  Failure of nodes doesn’t cause data loss 19
  • 20.  Data is highly available and accessible despite hardware failure  There will be no downtime for end user application due to data 20
  • 21.  Vertical Scalability – New hardware can be added to the nodes  Horizontal Scalability – New nodes can be added on the fly 21
  • 22.  No need to purchase costly license  No need to purchase costly hardware Economic Open Source Commodity Hardware = + 22
  • 23.  Distributed computing challenges are handled by framework  Client just need to concentrate on business logic 23
  • 24.  Move computation to data instead of data to computation  Data is processed on the nodes where it is stored Storage Servers App Servers Dat a Dat a Dat a Dat a Servers Dat a Dat a Dat a Dat a Algorith m Alg o Alg o Alg o Alg o 24
  • 25.  Everyday we generate 2.5 quintillion bytes of data  Hadoop handles huge volumes of data efficiently  Hadoop uses the power of distributed computing  HDFS & Yarn are two main components of Hadoop  It is highly fault tolerant, reliable & available 25