Certified Big Data & Hadoop Training – DataFlair
Hadoop Tutorial
Agenda
• Introduction to Hadoop
• Hadoop nodes & daemons
• Hadoop Architecture
• Characteristics
• Hadoop Features
What is Hadoop?
The technology that powers Yahoo, Facebook, Twitter, Walmart and others
An open source framework that allows distributed processing of large data sets across a cluster of commodity hardware
Open Source
• Source code is freely available
• It may be redistributed and modified
Distributed Processing
• Data is processed in a distributed manner on multiple nodes / servers
• Multiple machines process the data independently
Cluster
• Multiple machines connected together
• Nodes are connected via LAN
Commodity Hardware
• Economical / affordable machines
• Typically standard, off-the-shelf hardware rather than specialized high-end servers
What is Hadoop?
• Open source framework written in Java
• Inspired by Google's MapReduce programming model and its file system (GFS)
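To make the MapReduce programming model concrete, here is a minimal single-process sketch in Python. It is not Hadoop code and uses no Hadoop API. The function names `map_phase`, `shuffle`, and `reduce_phase` are hypothetical, chosen only to illustrate the map, group-by-key, and reduce steps on a word count.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

In a real cluster the map calls would run on many nodes at once. Here they run in one process so the data flow is easy to follow.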
Hadoop History
• 2002: Doug Cutting started working on Nutch
• 2003-2004: Google published the GFS and MapReduce papers
• 2005: Doug Cutting added DFS and MapReduce to Nutch
• 2006: Development of Hadoop started as a Lucene sub-project
• 2007: The New York Times converted 4 TB of image archives using 100 EC2 instances
• 2008: Hadoop became a top-level project; Hadoop defeated a supercomputer; Facebook launched Hive, SQL support for Hadoop
• 2009: Doug Cutting joined Cloudera
Hadoop Components
Hadoop consists of three key parts
Hadoop Nodes
• Master Node
• Slave Node
Hadoop Daemons
• Master Node: ResourceManager, NameNode
• Slave Node: NodeManager, DataNode
Basic Hadoop Architecture
[Figure: Work divided into many Sub Work units]
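The idea in the architecture diagram, one piece of work divided into sub-work handled by separate workers, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: threads on one machine stand in for nodes, and `split_work`, `process_sub_work`, and `run` are hypothetical names, not part of Hadoop.

```python
from concurrent.futures import ThreadPoolExecutor


def split_work(items, num_parts):
    """Divide the full work list into roughly equal sub-work chunks."""
    size = max(1, -(-len(items) // num_parts))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_sub_work(chunk):
    """Stand-in for the real computation: sum the numbers in one chunk."""
    return sum(chunk)


def run(items, num_workers=4):
    """Split the work, process each chunk on its own worker, then combine the partial results."""
    chunks = split_work(items, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(process_sub_work, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(run(list(range(1, 101))))
```

The combining step at the end plays the role of the master collecting results from the workers.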
Hadoop Characteristics
Open Source
• Source code is freely available
• Can be redistributed
• Can be modified
• Benefits: free, affordable, community, transparent, interoperable, no vendor lock-in
Distributed Processing
• Data is processed in a distributed manner across the cluster
• Multiple nodes in the cluster process data independently
[Figure: Centralized Processing compared with Distributed Processing]
Fault Tolerance
• Node failures are recovered automatically
• The framework takes care of failures of hardware as well as tasks
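One way to picture automatic recovery is a task that is retried on another node when the first attempt fails. The sketch below is a simplified illustration, not Hadoop's actual scheduler. `run_with_retries`, `flaky_task`, and the node names are hypothetical.

```python
def run_with_retries(task, nodes, max_attempts=3):
    """Try the task on successive nodes until one attempt succeeds."""
    errors = []
    for attempt, node in enumerate(nodes[:max_attempts], start=1):
        try:
            return task(node)
        except RuntimeError as exc:
            # Record the failure and fall through to the next node.
            errors.append(f"attempt {attempt} on {node}: {exc}")
    raise RuntimeError("task failed on all attempted nodes: " + "; ".join(errors))


def flaky_task(node):
    """Simulated task that fails on one specific node (assumed for the demo)."""
    if node == "node-1":
        raise RuntimeError("node unreachable")
    return f"done on {node}"


if __name__ == "__main__":
    print(run_with_retries(flaky_task, ["node-1", "node-2", "node-3"]))
```

The caller never sees the failure on the first node. It only receives the result from the node that succeeded.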
Reliability
• Data is reliably stored on the cluster of machines despite machine failures
• Node failures do not cause data loss
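The slides do not spell out the mechanism. HDFS achieves this by keeping several replicas of each data block on different nodes. The sketch below assumes simple round-robin placement, which is a simplification and not HDFS's actual placement policy. `place_blocks` and `readable_blocks` are hypothetical names.

```python
def place_blocks(blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin (assumed policy)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for i, block in enumerate(blocks):
        placement[block] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a live node."""
    return [
        block
        for block, replicas in placement.items()
        if any(node not in failed_nodes for node in replicas)
    ]


if __name__ == "__main__":
    placement = place_blocks(["b0", "b1", "b2", "b3"], ["n0", "n1", "n2", "n3"])
    print(readable_blocks(placement, failed_nodes={"n0", "n1"}))
```

With three replicas per block, any two nodes can fail in this example and every block is still readable from a surviving replica.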
High Availability
• Data is highly available and accessible despite hardware failure
• End-user applications see no downtime due to unavailable data
Scalability
• Vertical Scalability: more hardware (for example CPU, memory, or disk) can be added to existing nodes
• Horizontal Scalability: new nodes can be added on the fly
Economic
• No need to purchase costly licenses
• No need to purchase costly hardware
• Open Source + Commodity Hardware = Economic
Easy to Use
• Distributed computing challenges are handled by the framework
• Clients just need to concentrate on business logic
Data Locality
• Move computation to the data instead of data to the computation
• Data is processed on the nodes where it is stored
[Figure: traditional model, where data moves from storage servers to app servers that run the algorithm, compared with Hadoop, where the algorithm runs on the servers that hold the data]
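Data locality can be pictured as a scheduling preference: run the task on a node that already stores the block, and fall back to another node only when none of those is free. The sketch below is an illustration under that assumption, not Hadoop's scheduler. `schedule`, the block IDs, and the node names are hypothetical.

```python
def schedule(block, block_locations, free_nodes):
    """Pick a node for the task that reads `block`, preferring a node that stores it."""
    local = [node for node in block_locations.get(block, []) if node in free_nodes]
    if local:
        # The computation moves to the data: no block transfer is needed.
        return local[0], "local"
    if free_nodes:
        # No free node holds the block, so the data would have to travel over the network.
        return sorted(free_nodes)[0], "remote"
    raise RuntimeError("no free node available")


if __name__ == "__main__":
    locations = {"b0": ["n1", "n2"]}
    print(schedule("b0", locations, {"n2", "n3"}))
    print(schedule("b0", locations, {"n3", "n4"}))
```

The "remote" branch is the case the slide advises against: it works, but the block has to be copied to the node before processing can start.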
Summary
• Every day we generate 2.3 trillion GB of data
• Hadoop handles huge volumes of data efficiently
• Hadoop uses the power of distributed computing
• HDFS & YARN are two main components of Hadoop
• It is highly fault tolerant, reliable & available
