MapReduce with Big Data
Jagriti Srivastava
Tools
Big Data with MapReduce
● Big data is a large volume of data, both structured and unstructured.
● What matters is what organizations do with the data.
● Analyzing it supports better decisions and strategic business moves.
● MapReduce in a big data scenario:
– Take data on total social media sign-ups from different countries.
– List that data using the MapReduce technique.
– Search engines could determine page views, and marketers could perform sentiment analysis using MapReduce.
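The sign-up scenario above can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle and reduce steps on a single machine, not a distributed job: the records in `SIGNUPS` are invented, and the function names (`map_signup`, `shuffle`, `reduce_counts`, `count_signups`) are hypothetical, chosen for this example.

```python
from collections import defaultdict

# Hypothetical sign-up records: (user_id, country). Invented for illustration.
SIGNUPS = [
    ("u1", "India"),
    ("u2", "Nepal"),
    ("u3", "India"),
    ("u4", "Brazil"),
    ("u5", "Nepal"),
    ("u6", "India"),
]


def map_signup(record):
    """Map step: emit a (country, 1) pair for one sign-up record."""
    _user_id, country = record
    return (country, 1)


def shuffle(pairs):
    """Shuffle step: group all emitted values by key (country)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(country, values):
    """Reduce step: sum the values emitted for one country."""
    return (country, sum(values))


def count_signups(records):
    """Run map, shuffle and reduce in sequence and return counts per country."""
    mapped = [map_signup(r) for r in records]
    grouped = shuffle(mapped)
    return dict(reduce_counts(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(count_signups(SIGNUPS))
```

The same shape would apply to the page-view case: the key becomes the page URL instead of the country.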
MapReduce Implementation
● At Google:
– Index building for Google Search
– Article clustering for Google News
– Statistical machine translation
● At Yahoo!:
– Index building for Yahoo! Search
– Spam detection for Yahoo! Mail
● At Facebook:
– Data mining
– Ad optimization
– Spam detection
● At Amazon:
– Product clustering
– Statistical machine translation
Why MapReduce in Big Data
● Delegates work to the different nodes in the cluster (map), and
● Collects all the results from the query into one cohesive answer (reduce).
● Components of MapReduce (in Hadoop):
– JobTracker (the master node),
– TaskTrackers (agents on the nodes of the cluster, with functions of their own), and
– JobHistoryServer (deployed as a separate function, but a component that tracks jobs).
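The delegate-then-collect pattern described above can be imitated on one machine with a thread pool. The sketch below is an analogy only and does not use Hadoop's API: `run_job` plays the role of the master that splits and delegates work, `run_task` plays the role of a worker agent, and both names, along with the `num_workers` parameter, are assumptions made for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_task(split):
    """Worker role: count the words in one input split."""
    return Counter(word.lower() for line in split for word in line.split())


def run_job(lines, num_workers=3):
    """Master role: split the input, delegate the splits, merge the results."""
    # Divide the input lines into one split per worker.
    splits = [lines[i::num_workers] for i in range(num_workers)]

    # Delegate each split to a worker and wait for every partial result.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(run_task, splits))

    # Collect the partial counts into one combined answer.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


if __name__ == "__main__":
    print(run_job(["big data", "map reduce", "big map"]))
```

In a real cluster the splits would live on different machines and the master would also handle scheduling and failed tasks, which this sketch leaves out.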