How Hadoop is Useful for Solving Problems of Big Data
RISHISH MOHAN BHATNAGAR
Contents
• Big Data
• Problems with Big Data
• Where This Much Data Is Generated
• Data Statistics
• Hadoop
• MapReduce
• Work Flow of MapReduce
• Work Flow Example
• Conclusion
Rishish Mohan Bhatnagar
How Hadoop is Useful for Solving Problems of Big Data
Big Data
• Big data is first and foremost about data volume: large data sets measured in tens or hundreds of terabytes, or in petabytes.
• Big data can also be a combination of:
  • Structured data (relational data)
  • Unstructured data (.doc files, images)
  • Semi-structured data (JSON files, CSV files)
• Big data is an extremely large set of data that may be processed to reveal patterns and trends, including patterns of human behaviour, about a particular topic.
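To make the three kinds of data concrete, here is a minimal sketch in Python. The sample records, the variable names and the `field_names` helper are all made up for illustration; they do not come from the slides.

```python
import json

# Structured: every row follows the same fixed schema (as in a relational table).
structured_rows = [
    ("u1", "Asha", 34),
    ("u2", "Ravi", 28),
]

# Semi-structured: records describe themselves, and fields may vary per record.
semi_structured = [
    json.loads('{"id": "u1", "name": "Asha", "tags": ["admin"]}'),
    json.loads('{"id": "u2", "name": "Ravi"}'),
]

# Unstructured: free text (or image bytes) with no schema at all.
unstructured = "Asha wrote: the new cluster finished the nightly job in two hours."


def field_names(record):
    """Return the sorted field names of one semi-structured record."""
    return sorted(record.keys())


# Two different sets of field names show up in the semi-structured records,
# while every structured row has exactly three columns.
schemas = {tuple(field_names(r)) for r in semi_structured}
```

The point of the sketch is only that a processing system for big data has to cope with all three shapes at once, not with tables alone.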
Problems with Big Data
• The problems with Big Data come down to the three V's:
  • Volume of the data
  • Variety of the data
  • Velocity of the data
• About 5 billion gigabytes of data had been produced in total up to 2004.
• In 2011, the same amount of data was produced every two days.
• In 2013, the same amount was produced every 10 minutes.
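The growth in velocity implied by these figures can be worked out with a little arithmetic. The sketch below takes the slide's numbers at face value and assumes decimal units (1 exabyte = 10^9 gigabytes); the function and constant names are illustrative.

```python
TOTAL_GB = 5_000_000_000        # 5 billion gigabytes, the figure quoted above
GB_PER_EXABYTE = 1_000_000_000  # decimal units: 1 EB = 10**9 GB (assumption)
MINUTES_PER_DAY = 24 * 60


def gb_per_day(total_gb, period_minutes):
    """Average daily rate if total_gb is produced once every period_minutes."""
    return total_gb * MINUTES_PER_DAY / period_minutes


total_exabytes = TOTAL_GB / GB_PER_EXABYTE             # 5 exabytes in all
rate_2011 = gb_per_day(TOTAL_GB, 2 * MINUTES_PER_DAY)  # same volume every two days
rate_2013 = gb_per_day(TOTAL_GB, 10)                   # same volume every ten minutes
speedup = rate_2013 / rate_2011                        # growth factor from 2011 to 2013
```

Under these assumptions the 2011 rate is 2.5 billion gigabytes per day, and the 2013 rate is 288 times higher, which is the "velocity" problem in one number.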
Where Is This Much Data Generated?
• Social media such as Facebook and Twitter are responsible for a huge amount of data.
• The black boxes of aircraft and helicopters record lots of unstructured data.
• Stock markets around the world, such as those tracked by the Sensex and Nifty indices, generate lots of data.
• Various types of sensors generate large volumes of data.
Monthly active users (Jan, 2017), in millions
• Facebook: 1871
• WhatsApp: 1000
• Instagram: 600
• LinkedIn: 106
• Twitter: 317
• Snapchat: 300
Data Statistics
Hadoop
• Google proposed a solution to the processing problems of Big Data: divide a task into small parts, assign those parts to many computers connected over a network, and collect the results to form the final data set.
• Doug Cutting and Mike Cafarella took the solution published by Google and in 2005 started a project called "Hadoop", named after a toy elephant belonging to Cutting's son.
• Hadoop is now a registered trademark of the Apache Software Foundation.
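The divide, assign and collect idea can be sketched on a single machine. This is not Hadoop code: a thread pool stands in for the networked computers, and the function names (`split_into_parts`, `process_part`, `run_distributed`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(items, n_parts):
    """Divide the work into at most n_parts contiguous chunks."""
    size = max(1, -(-len(items) // n_parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_part(part):
    """Work done independently by one 'computer' (here: a partial sum)."""
    return sum(part)


def run_distributed(items, n_workers=4):
    """Split the task, process the parts in parallel, collect the results."""
    parts = split_into_parts(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(process_part, parts))
    return sum(partial_results)  # combine partial results into the final answer


total = run_distributed(list(range(1, 101)))  # sum of 1..100 computed in four parts
```

In a real cluster the parts would live on different machines and the results would travel over the network, but the three stages are the same.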
Hadoop (Contd.)
• Written in Java.
• Allows distributed processing of large data sets across clusters of computers.
• Designed to scale from a single server to thousands of servers.
• Can run on commodity hardware.
• The Hadoop library itself is designed to detect and handle failures at the application layer.
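Handling failure at the application layer means the software, not the hardware, notices a failed task and runs it again elsewhere. The sketch below illustrates that idea only; it is not Hadoop's actual scheduler, and the node functions and `run_with_retries` are invented for the example.

```python
def run_with_retries(task, nodes, max_attempts=3):
    """Try the task on successive nodes until one of them succeeds."""
    errors = []
    for attempt in range(max_attempts):
        node = nodes[attempt % len(nodes)]  # move on to the next node each time
        try:
            return node(task)
        except RuntimeError as exc:
            errors.append(str(exc))  # record the failure and retry elsewhere
    raise RuntimeError(f"task failed after {max_attempts} attempts: {errors}")


def broken_node(task):
    """Hypothetical node that always fails, like a crashed commodity server."""
    raise RuntimeError("node-1 is down")


def healthy_node(task):
    """Hypothetical node that completes the task."""
    return task.upper()


# The first attempt fails on broken_node; the retry succeeds on healthy_node.
result = run_with_retries("count words", [broken_node, healthy_node])
```

Because failures are expected and absorbed in software, cheap commodity machines can be used without making the overall job unreliable.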
Hadoop (Contd.)
• Since 2012, the name "Hadoop" has often referred not only to the base modules but also to related projects that run on top of them, such as Apache Pig, Apache Hive, Apache HBase and Apache Spark.
• Runs on any platform that has a Java runtime, since it is Java-based.
• At its core, Hadoop rests on two components:
  • MapReduce (distributed processing)
  • HDFS, the Hadoop Distributed File System (distributed storage)
MapReduce
• Based on the paradigm "send the computation to where the data resides".
• It is a programming model; Hadoop's implementation of it is written in Java.
• There are two important tasks:
  • Map takes a set of data and converts it into another set of data, in which individual elements are broken down into key/value pairs (tuples).
  • Reduce takes the output of Map and combines those tuples into a smaller set of tuples.
• The Reduce task is always performed after the Map task.
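The two tasks can be illustrated with the classic word-count problem. This is a plain Python sketch of the model, not the Hadoop Java API; `map_words`, `reduce_counts` and the sample sentences are assumptions made for the example.

```python
from collections import defaultdict


def map_words(line):
    """Map: break a line into (key, value) tuples, one per word."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_counts(pairs):
    """Reduce: combine tuples that share a key into a smaller set of tuples."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Map runs first on each input line; Reduce runs on the combined map output.
pairs = map_words("Deer Bear River") + map_words("Car Car River")
counts = reduce_counts(pairs)
```

Six input tuples go into Reduce and four come out, one per distinct word, which is the "smaller set of tuples" the slide describes.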
Work Flow of MapReduce
The work flow of MapReduce consists of five steps:
1. Splitting – The input is split into records. The splitting parameter can be anything, such as a space, comma, semicolon or new line.
2. Mapping – Takes a set of data and converts it into another set of data, in which individual elements are broken down into key-value pairs.
3. Intermediate Splitting – Mapping runs in parallel on different systems. So that records can be grouped in the Reduce phase, all data with the same key must end up on the same system.
4. Reduce – Takes the output of the intermediate splitting and combines that data into a smaller set of tuples.
5. Combining – The last phase, in which all the individual results are combined to form the final result.
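The five steps above can be sketched end to end as a single-process simulation. Everything here is illustrative: the function names, the sample text, and the byte-sum partitioner in step 3 are assumptions, and a real cluster would run the map and reduce calls on different machines.

```python
from collections import defaultdict


def split_input(text):
    """1. Splitting: break the input into records, here one per line."""
    return [line for line in text.split("\n") if line.strip()]


def map_record(record):
    """2. Mapping: emit a (key, value) pair for every word."""
    return [(word, 1) for word in record.split()]


def shuffle(mapped, n_reducers):
    """3. Intermediate splitting: route each key to exactly one reducer."""
    buckets = [defaultdict(list) for _ in range(n_reducers)]
    for pairs in mapped:
        for key, value in pairs:
            # Deterministic toy partitioner (an assumption, not Hadoop's).
            index = sum(key.encode("utf-8")) % n_reducers
            buckets[index][key].append(value)
    return buckets


def reduce_bucket(bucket):
    """4. Reduce: collapse the values of each key into one tuple."""
    return {key: sum(values) for key, values in bucket.items()}


def combine(partials):
    """5. Combining: merge the per-reducer outputs into the final result."""
    final = {}
    for partial in partials:
        final.update(partial)  # keys never overlap across reducers
    return final


def word_count(text, n_reducers=2):
    records = split_input(text)
    mapped = [map_record(r) for r in records]
    buckets = shuffle(mapped, n_reducers)
    return combine(reduce_bucket(b) for b in buckets)


result = word_count("Deer Bear River\nCar Car River\nDeer Car Bear")
```

Step 3 is the one that makes step 4 possible: because every occurrence of a key lands in the same bucket, each reducer can total its keys without talking to the others.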
Work Flow Example
Conclusion
• Hadoop was initially developed largely by Yahoo engineers as an open-source answer to Google's infrastructure (the direct counterpart of Google's Bigtable in the Hadoop ecosystem is Apache HBase).
• Hadoop is used for distributed processing of all types of data.
• To learn Hadoop, one should know core Java well; without it, learning Hadoop is a hard nut to crack.
• There were reportedly 4.4 million Hadoop-related jobs in 2015, but only one-third of them were filled.
Thank You
