MapReduce with Big Data
Jagriti Srivastava
Tools
Big Data with Map Reduce
● Large volume of data, both structured and unstructured.
● What matters is what organizations do with the data.
● Analyzing it supports better decisions and strategic business moves.
● MapReduce in a big data scenario:
– Data: total social media sign-ups from different countries.
– Listing those data using the MapReduce technique.
– Search engines could determine page views, and marketers could perform sentiment analysis using MapReduce.
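The sign-up scenario above can be sketched in a few lines of plain Python. This is only an in-memory illustration of the map, group and reduce steps, not a distributed job. The sample records and the names `map_signup`, `shuffle`, `reduce_counts` and `count_signups` are assumptions invented for the example.

```python
from collections import defaultdict

# Hypothetical sign-up records (user_id, country), invented for illustration.
signups = [
    ("u1", "India"),
    ("u2", "Nepal"),
    ("u3", "India"),
    ("u4", "Brazil"),
    ("u5", "Nepal"),
    ("u6", "India"),
]

def map_signup(record):
    """Map step: emit a (country, 1) pair for one sign-up record."""
    _user_id, country = record
    return (country, 1)

def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(country, ones):
    """Reduce step: sum the values collected for one country."""
    return (country, sum(ones))

def count_signups(records):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    mapped = [map_signup(r) for r in records]
    grouped = shuffle(mapped)
    return dict(reduce_counts(k, v) for k, v in grouped.items())

totals = count_signups(signups)
print(totals)  # {'India': 3, 'Nepal': 2, 'Brazil': 1}
```

In a real cluster the map calls would run on many nodes in parallel and the grouping would be done by the framework. The per-record and per-key functions would keep the same shape.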
MapReduce Implementation
● At Google:
– Index building for Google Search
– Article clustering for Google News
– Statistical machine translation
● At Yahoo!:
– Index building for Yahoo! Search
– Spam detection for Yahoo! Mail
● At Facebook:
– Data mining
– Ad optimization
– Spam detection
● At Amazon:
– Product clustering
– Statistical machine translation
Why MapReduce in BigData
● Delegates work to the different nodes in the cluster (the map step).
● Collects the results from those nodes into one cohesive answer (the reduce step).
● Components of MapReduce (as named in Apache Hadoop):
– JobTracker: the master node.
– TaskTrackers: agents on the nodes of the cluster, which run the tasks assigned to them.
– JobHistoryServer: deployed as a separate service; it keeps track of completed jobs.
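The delegate-then-collect pattern described above can be imitated on a single machine. The following is a minimal sketch, assuming threads stand in for cluster nodes. It does not use the Hadoop API, and the names `split_into_chunks`, `worker_task`, `merge` and `run_job`, as well as the sample page views, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(items, n_chunks):
    """Split the input into roughly equal chunks, one per worker."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

def worker_task(chunk):
    """Work done by one worker: count the page views in its chunk."""
    counts = {}
    for page in chunk:
        counts[page] = counts.get(page, 0) + 1
    return counts

def merge(partials):
    """Combine the partial results from all workers into one answer."""
    total = {}
    for partial in partials:
        for page, n in partial.items():
            total[page] = total.get(page, 0) + n
    return total

def run_job(items, n_workers=3):
    """Master role: delegate chunks to workers, then collect and merge results."""
    chunks = split_into_chunks(items, n_workers)
    # Threads are an assumption standing in for separate cluster nodes.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(worker_task, chunks))
    return merge(partials)

# Hypothetical page-view log, invented for illustration.
page_views = ["/home", "/about", "/home", "/pricing", "/home", "/about", "/home"]
result = run_job(page_views)
print(result)  # {'/home': 4, '/about': 2, '/pricing': 1}
```

Here `run_job` plays the role of the master that hands out work, and each `worker_task` call plays the role of an agent processing its share. Scheduling across machines, retries and job history are left out.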