SlideShare a Scribd company logo
1 of 10
The solution for Big data 
HADOOP 
Shubham Pendharkar 
Nitish Mowall
THEN WHAT COULD BE THE SOLUTION FOR BIGDATA 
HADOOP
What is Hadoop ? 
 It is a open source software written in java. 
 Apache Hadoop is a framework that allows for the distributed processing of large data 
sets across clusters of commodity computers using a simple programming model. 
 Here's what makes it especially useful: 
 Scalable: It can reliably store and process petabytes. 
 Economical: It distributes the data and processing across clusters of commonly 
available computers (in thousands). 
 Efficient: By distributing the data, it can process it in parallel on the nodes where 
the data is located. 
 Reliable: It automatically maintains multiple copies of data and automatically 
redeploys computing tasks based on failures.
Companies using Hadoop:
What does it do? 
 Hadoop implements Google’s MapReduce, using HDFS 
 MapReduce divides applications into many small blocks of work. 
 HDFS creates multiple replicas of data blocks for reliability, placing them on compute 
nodes around the cluster. 
 MapReduce can then process the data where it is located. 
 Hadoop ‘s target is to run on clusters of the order of 10,000-nodes.
Structure of Hadoop system: 
HDFS :for STORAGE 
MapReduce :for Processing
Master node 
Master of the system. 
Maintains and manages the blocks which 
are present on the DataNodes (does not store 
data). 
Slave nodes 
Slaves which are deployed on each 
machine and provide the actual storage. 
Responsible for serving read and write 
requests for the clients.
The Big Data Solution: Hadoop for Scalable Distributed Processing

More Related Content

What's hot

An Introduction to Apache Spark
An Introduction to Apache SparkAn Introduction to Apache Spark
An Introduction to Apache SparkElvis Saravia
 
Low latency access of bigdata using spark and shark
Low latency access of bigdata using spark and sharkLow latency access of bigdata using spark and shark
Low latency access of bigdata using spark and sharkPradeep Kumar G.S
 
Hadoop vs spark
Hadoop vs sparkHadoop vs spark
Hadoop vs sparkamarkayam
 
Introduction to hadoop
Introduction to hadoopIntroduction to hadoop
Introduction to hadoopChad Richeson
 
Peter_Smith_PhD_ACL_10000_Foot_View_of_Big_Data
Peter_Smith_PhD_ACL_10000_Foot_View_of_Big_DataPeter_Smith_PhD_ACL_10000_Foot_View_of_Big_Data
Peter_Smith_PhD_ACL_10000_Foot_View_of_Big_DataTriNimbus
 
Hadoop Architecture
Hadoop Architecture Hadoop Architecture
Hadoop Architecture Ganesh B
 
Cloud Computing: Hadoop
Cloud Computing: HadoopCloud Computing: Hadoop
Cloud Computing: Hadoopdarugar
 
Big-Data Hadoop Tutorials - MindScripts Technologies, Pune
Big-Data Hadoop Tutorials - MindScripts Technologies, Pune Big-Data Hadoop Tutorials - MindScripts Technologies, Pune
Big-Data Hadoop Tutorials - MindScripts Technologies, Pune amrutupre
 
Big Data Processing with Hadoop-MapReduce in Cloud Systems
Big Data Processing with Hadoop-MapReduce in Cloud SystemsBig Data Processing with Hadoop-MapReduce in Cloud Systems
Big Data Processing with Hadoop-MapReduce in Cloud SystemsIntellipaat
 
Hadoop Presentation - PPT
Hadoop Presentation - PPTHadoop Presentation - PPT
Hadoop Presentation - PPTAnand Pandey
 
Introducing the hadoop ecosystem
Introducing the hadoop ecosystemIntroducing the hadoop ecosystem
Introducing the hadoop ecosystemGeert Van Landeghem
 
WHAT IS HADOOP AND ITS COMPONENTS?
WHAT IS HADOOP AND ITS COMPONENTS? WHAT IS HADOOP AND ITS COMPONENTS?
WHAT IS HADOOP AND ITS COMPONENTS? nakshatraL
 

What's hot (20)

BIG DATA HADOOP
BIG DATA HADOOPBIG DATA HADOOP
BIG DATA HADOOP
 
An Introduction to Apache Spark
An Introduction to Apache SparkAn Introduction to Apache Spark
An Introduction to Apache Spark
 
Low latency access of bigdata using spark and shark
Low latency access of bigdata using spark and sharkLow latency access of bigdata using spark and shark
Low latency access of bigdata using spark and shark
 
Hadoop vs spark
Hadoop vs sparkHadoop vs spark
Hadoop vs spark
 
Introduction to hadoop
Introduction to hadoopIntroduction to hadoop
Introduction to hadoop
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
 
Hadoop
HadoopHadoop
Hadoop
 
Peter_Smith_PhD_ACL_10000_Foot_View_of_Big_Data
Peter_Smith_PhD_ACL_10000_Foot_View_of_Big_DataPeter_Smith_PhD_ACL_10000_Foot_View_of_Big_Data
Peter_Smith_PhD_ACL_10000_Foot_View_of_Big_Data
 
Hadoop Architecture
Hadoop Architecture Hadoop Architecture
Hadoop Architecture
 
Hadoop
HadoopHadoop
Hadoop
 
Apache hadoop & map reduce
Apache hadoop & map reduceApache hadoop & map reduce
Apache hadoop & map reduce
 
Hadoop info
Hadoop infoHadoop info
Hadoop info
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
Cloud Computing: Hadoop
Cloud Computing: HadoopCloud Computing: Hadoop
Cloud Computing: Hadoop
 
Big-Data Hadoop Tutorials - MindScripts Technologies, Pune
Big-Data Hadoop Tutorials - MindScripts Technologies, Pune Big-Data Hadoop Tutorials - MindScripts Technologies, Pune
Big-Data Hadoop Tutorials - MindScripts Technologies, Pune
 
Big Data Processing with Hadoop-MapReduce in Cloud Systems
Big Data Processing with Hadoop-MapReduce in Cloud SystemsBig Data Processing with Hadoop-MapReduce in Cloud Systems
Big Data Processing with Hadoop-MapReduce in Cloud Systems
 
Hadoop Presentation - PPT
Hadoop Presentation - PPTHadoop Presentation - PPT
Hadoop Presentation - PPT
 
Introducing the hadoop ecosystem
Introducing the hadoop ecosystemIntroducing the hadoop ecosystem
Introducing the hadoop ecosystem
 
Hadoop
HadoopHadoop
Hadoop
 
WHAT IS HADOOP AND ITS COMPONENTS?
WHAT IS HADOOP AND ITS COMPONENTS? WHAT IS HADOOP AND ITS COMPONENTS?
WHAT IS HADOOP AND ITS COMPONENTS?
 

Similar to The Big Data Solution: Hadoop for Scalable Distributed Processing

Introduction to Apache hadoop
Introduction to Apache hadoopIntroduction to Apache hadoop
Introduction to Apache hadoopOmar Jaber
 
Hadoop Ecosystem at a Glance
Hadoop Ecosystem at a GlanceHadoop Ecosystem at a Glance
Hadoop Ecosystem at a GlanceNeev Technologies
 
Big Data and Hadoop Basics
Big Data and Hadoop BasicsBig Data and Hadoop Basics
Big Data and Hadoop BasicsSonal Tiwari
 
Hadoop and BigData - July 2016
Hadoop and BigData - July 2016Hadoop and BigData - July 2016
Hadoop and BigData - July 2016Ranjith Sekar
 
Big Data Hadoop Technology
Big Data Hadoop TechnologyBig Data Hadoop Technology
Big Data Hadoop TechnologyRahul Sharma
 
Distributed Systems Hadoop.pptx
Distributed Systems Hadoop.pptxDistributed Systems Hadoop.pptx
Distributed Systems Hadoop.pptxUttara University
 
Hadoop Seminar Report
Hadoop Seminar ReportHadoop Seminar Report
Hadoop Seminar ReportAtul Kushwaha
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHitendra Kumar
 
Hadoop Platforms - Introduction, Importance, Providers
Hadoop Platforms - Introduction, Importance, ProvidersHadoop Platforms - Introduction, Importance, Providers
Hadoop Platforms - Introduction, Importance, ProvidersMrigendra Sharma
 
Big Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – HadoopBig Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – HadoopIOSR Journals
 

Similar to The Big Data Solution: Hadoop for Scalable Distributed Processing (20)

2.1-HADOOP.pdf
2.1-HADOOP.pdf2.1-HADOOP.pdf
2.1-HADOOP.pdf
 
Introduction to Apache hadoop
Introduction to Apache hadoopIntroduction to Apache hadoop
Introduction to Apache hadoop
 
Hadoop Tutorial for Beginners
Hadoop Tutorial for BeginnersHadoop Tutorial for Beginners
Hadoop Tutorial for Beginners
 
Hadoop Ecosystem at a Glance
Hadoop Ecosystem at a GlanceHadoop Ecosystem at a Glance
Hadoop Ecosystem at a Glance
 
Big Data and Hadoop Basics
Big Data and Hadoop BasicsBig Data and Hadoop Basics
Big Data and Hadoop Basics
 
Hadoop and BigData - July 2016
Hadoop and BigData - July 2016Hadoop and BigData - July 2016
Hadoop and BigData - July 2016
 
Hadoop in action
Hadoop in actionHadoop in action
Hadoop in action
 
Bigdata ppt
Bigdata pptBigdata ppt
Bigdata ppt
 
Bigdata
BigdataBigdata
Bigdata
 
Big Data Hadoop Technology
Big Data Hadoop TechnologyBig Data Hadoop Technology
Big Data Hadoop Technology
 
Hadoop in a Nutshell
Hadoop in a NutshellHadoop in a Nutshell
Hadoop in a Nutshell
 
Hadoop seminar
Hadoop seminarHadoop seminar
Hadoop seminar
 
Distributed Systems Hadoop.pptx
Distributed Systems Hadoop.pptxDistributed Systems Hadoop.pptx
Distributed Systems Hadoop.pptx
 
Hadoop Seminar Report
Hadoop Seminar ReportHadoop Seminar Report
Hadoop Seminar Report
 
Hadoop and Big Data
Hadoop and Big DataHadoop and Big Data
Hadoop and Big Data
 
Seminar ppt
Seminar pptSeminar ppt
Seminar ppt
 
Anju
AnjuAnju
Anju
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log Processing
 
Hadoop Platforms - Introduction, Importance, Providers
Hadoop Platforms - Introduction, Importance, ProvidersHadoop Platforms - Introduction, Importance, Providers
Hadoop Platforms - Introduction, Importance, Providers
 
Big Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – HadoopBig Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – Hadoop
 

The Big Data Solution: Hadoop for Scalable Distributed Processing

  • 1. The solution for Big data HADOOP Shubham Pendharkar Nitish Mowall
  • 2.
  • 3. THEN WHAT COULD BE THE SOLUTION FOR BIGDATA HADOOP
  • 4. What is Hadoop ?  It is a open source software written in java.  Apache Hadoop is a framework that allows for the distributed processing of large data sets across clusters of commodity computers using a simple programming model.  Here's what makes it especially useful:  Scalable: It can reliably store and process petabytes.  Economical: It distributes the data and processing across clusters of commonly available computers (in thousands).  Efficient: By distributing the data, it can process it in parallel on the nodes where the data is located.  Reliable: It automatically maintains multiple copies of data and automatically redeploys computing tasks based on failures.
  • 6. What does it do?  Hadoop implements Google’s MapReduce, using HDFS  MapReduce divides applications into many small blocks of work.  HDFS creates multiple replicas of data blocks for reliability, placing them on compute nodes around the cluster.  MapReduce can then process the data where it is located.  Hadoop ‘s target is to run on clusters of the order of 10,000-nodes.
  • 7.
  • 8. Structure of Hadoop system: HDFS :for STORAGE MapReduce :for Processing
  • 9. Master node Master of the system. Maintains and manages the blocks which are present on the DataNodes (does not store data). Slave nodes Slaves which are deployed on each machine and provide the actual storage. Responsible for serving read and write requests for the clients.