Tuple MapReduce: beyond classic MapReduce

Tuple MapReduce is a new foundational model that extends
MapReduce with the notion of tuples. It bridges the gap
between the low-level constructs provided by MapReduce and
the higher-level needs of programmers, such as compound
records, sorting, and joins. The paper also presents Pangool,
an open-source framework implementing Tuple MapReduce.
Pangool eases the design and implementation of applications
based on MapReduce and increases their flexibility while
maintaining Hadoop’s performance.

Published in: Technology

  1. Tuple MapReduce: Beyond classic MapReduce
     Pedro Ferrera, Ivan de Prado, Eric Palacios (DataSalt, Barcelona, SPAIN; pere,ivan,epalacios@datasalt.com)
     Jose Luis Fernandez-Marquez, Giovanna Di Marzo Serugendo (University of Geneva, CUI, Geneva, SWITZERLAND; joseluis.fernandez@unige.ch)
  2. Outline
     ● Introduction
     ● Related Work
     ● Classic MapReduce
       – The problems of MapReduce
     ● Tuple MapReduce
       – The basic Tuple MapReduce
       – Joins
       – Generalization of MapReduce
     ● Pangool
     ● Conclusions and Future work
  3. Introduction
     ● A huge amount of information → need for new processing technologies.
     ● MapReduce → a major contribution ...
       – … but it involves a steep learning curve.
     ● Most of the design patterns found in real-world problems are not well covered.
     ● We propose Tuple MapReduce as a better foundational model.
     ● Tuple MapReduce on Hadoop → Pangool
       – No key architectural changes needed.
  4. Related work
     ● MapReduce: Google paper in 2004
     ● Hadoop
     ● Higher-level tools
       – Sawzall, FlumeJava, Pig, Hive, Jaql, Cascading
     ● Higher-level abstractions are very popular
       – This supports the idea that MapReduce is too low-level a paradigm
     ● Merge MapReduce
       – Targets the problem of relational operations (joins)
       – Implies changes in the architecture and a new merge step
  5. Classic MapReduce
     ● Jobs
       – Input file, output file
       – The developer provides two functions: map and reduce
     ● Distributed execution of work
       – First the map function, in the map phase
       – Then the reduce function, in the reduce phase
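The two-function contract described on this slide can be illustrated with a minimal, single-process sketch. It is not Hadoop code: `run_mapreduce`, the word-count functions, and the sample input are hypothetical names and data chosen only for illustration, and the in-memory dictionary stands in for the distributed shuffle.

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn):
    """Single-process stand-in for a MapReduce job (illustrative only)."""
    groups = defaultdict(list)
    # Map phase: every input record may emit any number of (key, value) pairs.
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    # Reduce phase: keys are visited in sorted order, but the values inside
    # one group keep their arrival order (no intra-group sorting).
    output = []
    for key in sorted(groups):
        output.extend(reduce_fn(key, groups[key]))
    return output


def word_count_map(line):
    for word in line.split():
        yield word, 1


def word_count_reduce(word, counts):
    yield word, sum(counts)


# Hypothetical input lines.
result = run_mapreduce(
    ["tuple map reduce", "map reduce"], word_count_map, word_count_reduce
)
print(result)
```

Note that the developer only sees a key and a bag of values; anything richer (several fields, an order among the values) has to be encoded by hand, which is the limitation the following slides discuss.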
  6. The problems of MapReduce
     ● Compound records
       – Real-world problems involve multi-field records, which don’t fit well in the key/value schema
     ● Sorting
       – No inherent sorting of the records within a reduce group
       – Implementations (Hadoop) rely on the “secondary sorting trick”
     ● Join
       – A quite common operation
       – Not directly possible in MapReduce without using “tricks”:
         ● secondary sorting
         ● compound records
  7. Tuple MapReduce
     ● Idea: replace key/value by tuples
     ● group-by and sort-by clauses
  8. Tuple MapReduce (II)
     ● group-by and sort-by constraint
       – group-by must be a prefix of sort-by
       – Needed in order to implement Tuple MapReduce over a MapReduce architecture
     ● Unlike MapReduce, Tuple MapReduce:
       – provides compound records → tuples
       – provides intra-reduce sorting
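One way to see why group-by has to be a prefix of sort-by: after sorting on the sort-by fields, tuples that share the group-by fields are contiguous only if those fields lead the sort order. The sketch below is a plain-Python illustration under that reading, not Pangool code; `is_prefix`, `count_runs`, and the sample tuples are hypothetical.

```python
from itertools import groupby


def is_prefix(group_by, sort_by):
    """True when the group-by fields are the leading fields of sort-by."""
    return list(sort_by[:len(group_by)]) == list(group_by)


def count_runs(tuples, group_by, sort_by):
    """Sort by the sort-by fields, then count contiguous runs of tuples
    that share the same group-by fields."""
    ordered = sorted(tuples, key=lambda t: tuple(t[f] for f in sort_by))
    runs = groupby(ordered, key=lambda t: tuple(t[f] for f in group_by))
    return sum(1 for _ in runs)


# Hypothetical sample data: two distinct URLs.
visits = [
    {"url": "a.com", "date": "2012-01-02", "visits": 3},
    {"url": "b.com", "date": "2012-01-01", "visits": 5},
    {"url": "a.com", "date": "2012-01-01", "visits": 2},
    {"url": "b.com", "date": "2012-01-02", "visits": 1},
]

# Prefix respected: each URL forms exactly one contiguous run.
contiguous = count_runs(visits, group_by=["url"], sort_by=["url", "date"])
# Prefix violated: the URLs interleave, so each group is split into pieces.
fragmented = count_runs(visits, group_by=["url"], sort_by=["date", "url"])
print(contiguous, fragmented)
```

With the prefix respected, a reducer can consume each group as a single sorted stream; without it, a group would be scattered through the sorted output.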
  9. Example: cumulative visits
     ● Cumulative number of visits up to each single date
       – Input → URL, date, visits
       – Expected output → URL, date, cumulative visits
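The cumulative-visits example can be sketched in plain Python, assuming group-by URL and sort-by (URL, date). The function name, the ISO-formatted date strings, and the sample rows are assumptions made for illustration; a real job would run distributed rather than in one process.

```python
from itertools import groupby


def cumulative_visits(rows):
    """rows: iterable of (url, date, visits) with ISO-formatted dates.
    Returns (url, date, cumulative_visits) tuples."""
    # sort-by (url, date): ISO date strings sort chronologically.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    output = []
    # group-by url: each group arrives already sorted by date.
    for url, group in groupby(ordered, key=lambda r: r[0]):
        running = 0
        for _, date, visits in group:
            running += visits
            output.append((url, date, running))
    return output


# Hypothetical input rows.
rows = [
    ("a.com", "2012-01-02", 3),
    ("b.com", "2012-01-01", 5),
    ("a.com", "2012-01-01", 2),
]
print(cumulative_visits(rows))
```

Because the dates reach the reduce step in order, the reducer only needs a running total and never has to buffer or re-sort a group.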
  10. Join-Tuple MapReduce
      ● Joins among heterogeneous datasets
        – Tuples are associated with a source-id
          ● Tuples reach the reducer sorted by source-id, enabling memoryless reduce joins
        – and grouped by some common fields
  11. Example: join between clients and payments
      [Figure: clients inner join payments; fields shown: name, client_id, payment_id, amount]
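A reduce-side inner join along these lines can be sketched as follows. The column layouts `(client_id, name)` and `(payment_id, client_id, amount)`, the source-id constants, and the sample rows are assumptions inferred from the field names on the slide; the sketch also assumes `client_id` is unique among clients.

```python
from itertools import groupby

CLIENTS, PAYMENTS = 0, 1  # hypothetical source-ids; clients sort first


def inner_join(clients, payments):
    """clients: (client_id, name); payments: (payment_id, client_id, amount).
    Returns (name, client_id, payment_id, amount) for matching rows."""
    tagged = [(cid, CLIENTS, name) for cid, name in clients]
    tagged += [(cid, PAYMENTS, (pid, amount)) for pid, cid, amount in payments]
    # sort-by (client_id, source-id); group-by client_id is its prefix.
    tagged.sort(key=lambda t: (t[0], t[1]))
    joined = []
    for cid, group in groupby(tagged, key=lambda t: t[0]):
        name = None
        for _, source, payload in group:
            if source == CLIENTS:
                name = payload  # the client row arrives before its payments
            elif name is not None:
                pid, amount = payload
                joined.append((name, cid, pid, amount))
    return joined


# Hypothetical sample data.
clients = [(1, "Ann"), (2, "Bob")]
payments = [(10, 1, 5.0), (11, 3, 7.0), (12, 1, 2.5)]
print(inner_join(clients, payments))
```

Since the client tuple is seen first within each group, the reducer keeps only that one row in memory and streams the payments past it, which is the sense in which the join is memoryless.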
  12. Generalization of MapReduce
      ● MapReduce is a Tuple MapReduce with...
        – tuples of two values and
        – group-by and sort-by set to the first value
      ● The opposite is also possible → implementing Tuple MapReduce on top of existing MapReduce implementations.
        – Architectural changes are not needed.
        – Pangool is a proof of that.
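The second direction, running a tuple job on a key/value engine, can be sketched with the usual composite-key approach known from Hadoop secondary sort: the sort-by fields become the key, routing uses only the group-by prefix, and group boundaries are decided on that same prefix. Whether Pangool does exactly this internally is an assumption; all function names and sample tuples below are hypothetical.

```python
def to_key_value(tup, sort_by):
    """Split a tuple (a dict of fields) into a composite key made of the
    sort-by fields and a value holding the remaining fields."""
    key = tuple(tup[field] for field in sort_by)
    value = {f: v for f, v in tup.items() if f not in sort_by}
    return key, value


def partition(key, n_group_fields, n_reducers):
    """Route on the group-by prefix only, so a whole group meets in one reducer."""
    return hash(key[:n_group_fields]) % n_reducers


def same_group(key_a, key_b, n_group_fields):
    """Group boundary test: compare only the group-by prefix of the keys."""
    return key_a[:n_group_fields] == key_b[:n_group_fields]


# Hypothetical tuples with sort-by (url, date) and group-by (url).
row1 = {"url": "a.com", "date": "2012-01-01", "visits": 2}
row2 = {"url": "a.com", "date": "2012-01-02", "visits": 3}
k1, v1 = to_key_value(row1, sort_by=["url", "date"])
k2, v2 = to_key_value(row2, sort_by=["url", "date"])
print(k1, v1, same_group(k1, k2, 1))
```

This also shows where the prefix constraint comes from: partitioning and grouping both slice the front of the composite key, so the group-by fields must be the leading sort-by fields.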
  13. Pangool (pangool.net)
      ● Tuple MapReduce implementation on top of Hadoop.
        – On top of an existing MapReduce implementation.
          ● It is just a library. No architecture change was needed.
      ● Used in real-world applications
        – Banking
        – Searching
        – Social networks
  14. Pangool benchmark – secondary sort
  15. Pangool benchmark – join
  16. Pangool performance
      ● Only between 5% and 8% worse than Hadoop
        – Pretty good considering that Pangool is built on top of the Hadoop API
          ● The difference would probably disappear with a native implementation
      ● Much better than higher-level APIs
        – Probably because Pangool is a low-level API
  17. Conclusions and Future work
      ● The MapReduce key/value model has been shown to be too strict.
      ● Tuple MapReduce keeps the MapReduce features
        – and enhances them with
          ● compound records,
          ● joins and
          ● intra-reduce sorting.
      ● Pangool is a proof of its viability,
        – including on existing implementations like Hadoop, without changing the architecture
      ● Future work would involve abstractions for flow creation
        – Simplifying job chaining and data flow.
  18. Thanks!
      Pedro Ferrera, Ivan de Prado, Eric Palacios, Jose Luis Fernandez-Marquez, Giovanna Di Marzo Serugendo
      ● Any questions or doubts?
        – ivan@datasalt.com
        – @ivanprado
