MAPREDUCE
MAKERERE UNIVERSITY MS-CS
RAJAB SSEMWOGERERE
2019/HD03/29911U
MapReduce Definition
MapReduce is a programming model that simplifies the implementation of
many data-parallel applications for processing and generating
large datasets.
10/1/2019 8:11:02 AM RAJAB SSEMWOGERERE
2/11
How MapReduce operates
A MapReduce job maps data and then reduces it. The mapper transforms
the data one input line at a time; in the example that follows, every
input line yields exactly one key/value pair from the mapper.
The reducer then aggregates the mapped data.
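The map-then-reduce flow described above can be sketched in plain Python. This is only a minimal single-process illustration, not how a distributed framework is implemented; the names `run_mapreduce`, `first_field_mapper` and `count_reducer`, and the three input lines, are hypothetical and chosen for this sketch.

```python
from collections import defaultdict


def run_mapreduce(lines, mapper, reducer):
    """Map every line, group the emitted values by key, then reduce each group."""
    groups = defaultdict(list)
    for line in lines:
        # The mapper may emit any number of (key, value) pairs per line.
        for key, value in mapper(line):
            groups[key].append(value)
    # Call the reducer once per unique key, visiting keys in sorted order.
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


def first_field_mapper(line):
    # Hypothetical mapper: use the first whitespace-separated field as the key.
    yield line.split()[0], 1


def count_reducer(key, values):
    # Hypothetical reducer: count how many values were emitted for the key.
    return sum(values)


result = run_mapreduce(["a x", "b y", "a z"], first_field_mapper, count_reducer)
print(result)
```

Here the key "a" appears on two input lines and "b" on one, so the reducer reports a count of 2 and 1 respectively.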
Example of where MapReduce can be applied
Counting how many movies each user has rated on MovieLens.
MovieLens is a web-based recommender system and virtual community
that recommends movies for its users to watch, based on their
preferences, using collaborative filtering of members’ movie
ratings and movie reviews.
Sample MovieLens Dataset
UserID  MovieID  Rating  Timestamp
196     242      3       881250949
186     302      3       891717742
196     377      1       878887116
244     51       2       880606923
166     346      1       886397596
186     474      4       884182806
186     265      2       881171488
The mapper function
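The mapper function converts raw data into key/value pairs. The
key is the userID and the value is the constant 1, so each pair
counts one rating. The movieID, rating and timestamp are not needed
for this count, so the mapper discards them, which keeps the
intermediate data small.
def mapper_get_userID(self, _, line):
    # Each input line is tab-separated: userID, movieID, rating, timestamp
    (userID, movieID, rating, timestamp) = line.split('\t')
    yield userID, 1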
The mapper function (continued)
Applied to the sample dataset, the mapper emits one key/value pair
per input line:
196:1 186:1 196:1 244:1 166:1 186:1 186:1
By the time the mapper function finishes, the userID has been
extracted from every line, ready to be grouped and then aggregated
by the reducer function.
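To make the mapper step concrete, the following sketch runs a standalone version of the mapper over the seven sample rows. It assumes the rows are tab-separated, and it drops the `self` and unused-key parameters of the slide's method so that it runs without any framework; `SAMPLE_LINES` and `mapper_get_user_id` are names introduced here for illustration.

```python
# The seven sample rows, assumed tab-separated: userID, movieID, rating, timestamp.
SAMPLE_LINES = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "196\t377\t1\t878887116",
    "244\t51\t2\t880606923",
    "166\t346\t1\t886397596",
    "186\t474\t4\t884182806",
    "186\t265\t2\t881171488",
]


def mapper_get_user_id(line):
    # Only the userID is used; the other three fields are discarded.
    user_id, movie_id, rating, timestamp = line.split("\t")
    yield user_id, 1


# Flatten the pairs emitted for every line, preserving input order.
mapped = [pair for line in SAMPLE_LINES for pair in mapper_get_user_id(line)]
print(" ".join(f"{key}:{value}" for key, value in mapped))
# prints: 196:1 186:1 196:1 244:1 166:1 186:1 186:1
```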
Shuffle and Sort
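MapReduce then sorts and groups the mapped data (“Shuffle and
Sort”): it orders the pairs by key and collects all the values for
each unique key into one list. The values are not summed yet; that
is the reducer's job.
Before: 196:1 186:1 196:1 244:1 166:1 186:1 186:1
After:  166:1 186:1,1,1 196:1,1 244:1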
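The grouping performed by shuffle and sort can be imitated on a single machine with the standard library. This is a sketch of the idea only; a real framework does this across many machines. The function name `shuffle_and_sort` is hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Mapper output for the sample dataset, in input order.
mapped = [("196", 1), ("186", 1), ("196", 1), ("244", 1),
          ("166", 1), ("186", 1), ("186", 1)]


def shuffle_and_sort(pairs):
    """Sort pairs by key, then collect the values of each key into one list."""
    ordered = sorted(pairs, key=itemgetter(0))
    # groupby only merges adjacent equal keys, which is why the sort comes first.
    return [(key, [value for _, value in group])
            for key, group in groupby(ordered, key=itemgetter(0))]


grouped = shuffle_and_sort(mapped)
print(grouped)
# prints: [('166', [1]), ('186', [1, 1, 1]), ('196', [1, 1]), ('244', [1])]
```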
The reducer function
The reducer takes the output of shuffle and sort as its input.
It is called once for each unique key, performs its computation on
that key's values, and produces the output.
def reducer_count_ratings(self, key, values):
    # values holds the 1s emitted for this userID; their sum is the rating count
    yield key, sum(values)
The reducer function (continued)
Input:  166:1 186:1,1,1 196:1,1 244:1
Output: 166:1 186:3 196:2 244:1
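The reduce step can be checked the same way. The sketch below applies a standalone version of the reducer (again without the `self` parameter, an adaptation made here so that it runs without a framework) to the grouped data shown above.

```python
# Output of shuffle and sort for the sample dataset.
grouped = [("166", [1]), ("186", [1, 1, 1]), ("196", [1, 1]), ("244", [1])]


def reducer_count_ratings(key, values):
    # Summing the 1s gives the number of movies this user rated.
    yield key, sum(values)


# The reducer is called once per unique key.
output = [pair for key, values in grouped
          for pair in reducer_count_ratings(key, values)]
print(" ".join(f"{key}:{count}" for key, count in output))
# prints: 166:1 186:3 196:2 244:1
```

User 186 rated three of the seven sample movies, user 196 rated two, and users 166 and 244 rated one each.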
THANK YOU FOR YOUR
ATTENTION
END