Million Monkeys User Group
Million Monkeys presentation given to Silicon Mountain Technology Group on 11-12-2012.

Slide notes
  • Interesting statistical question. Thought about since Aristotle. Randomness + Resources + Time = Anything Possible. No real monkeys; need virtual monkeys.
  • Shakespeare lazy. Heavily influenced English Literature. Big Data isn’t always a huge file. It can be high computation.
  • This is not a map of MT and ID. 1 to 20 node testing. Keep efficiency up. RDBMS efficiency in gutter.
  • Engineers not spending time coding to scale. Busy adding new features. No code changes for scaling. Took 1.5 months on one computer and 3.5 days on 20 nodes. Spending on new computers gives a consistent, linear increase. Compare spending on RDBMS and Hadoop.
  • Transcript

    • 1. Million Monkeys Jesse Anderson | Curriculum Developer and Instructor November 2012
    • 2. About Me • Cloudera - Educational Services Team • Twitter - @jessetanderson • Blog and more info: http://www.jesse-anderson.com • Screencasts on Pragmatic Programmers: Buy It Now on http://www.jesse-anderson.com • President – Northern Nevada Software Developers Group
    • 3. About Cloudera • Cloudera is “The commercial Hadoop company” • Founded by leading experts on Hadoop from Facebook, Google, Oracle and Yahoo • Provides consulting and training services for Hadoop users • Staff includes committers to virtually all Hadoop projects
    • 4. Introduction • Infinite Monkey Theorem • Hadoop • Million Monkeys Algorithm • Business Case
    • 5. Infinite Monkey Theorem
    • 6. Exponential Growth (aka Big Data) Odds of finding a group of characters is 1 in 26 raised to the power of the number of contiguous characters. Contiguous Characters: Combinations • 8: 208,827,064,576 • 9: 5,429,503,678,976 • 10: 141,167,095,653,376
    • 7. Hadoop • Apache Project • Reliable, Scalable, Distributed Computing • Software Framework • MapReduce • Distributed File System (HDFS) • Other projects
    • 8. Map: Create or process the input data
    • 9. Reduce: Process data from Map into something usable
    • 10. Data Flow
    • 11. Million Monkeys Algorithm
    • 12. Business Case
    • 13. Hadoop Scalability [Chart: Percent of Linear Scalability (0 to 100 percent) versus Nodes (1 to 20), with one series each for RDBMS and Hadoop. RDBMS = Relational Database]
    • 14. Business Value of Scalability • SAVE: Scaling does not require massive re-engineering and complete rewrites of code • SAVE: Adding more computers to cluster gets a predictable increase in computational power and storage
    • 15. Going Viral (and taking over the world) • Covered internationally in BBC, Wall Street Journal, Wired and Slashdot • 26,000 unique visits from 119 countries in one day
    • 16. Next Steps • Books • Hadoop: The Definitive Guide - Tom White • Hadoop Operations - Eric Sammer • Cloudera Training • Developer, Admin, Hive and Pig, HBase, Essentials • CDH • Cloudera's Distribution Including Apache Hadoop • Open Source • VM Image
    • 17. Conclusion • MapReduce breaks up problem efficiently • No code changes to scale • Incredible scalability • Enables previously impossible tasks
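
The combination counts on the Exponential Growth slide follow directly from the 26-letter alphabet: a group of n contiguous characters has 26 to the power n possible values. A minimal Python sketch of that arithmetic is below; the function name `combinations` is chosen here for illustration and does not come from the presentation.

```python
ALPHABET_SIZE = 26  # letters a-z, matching the "1 in 26" on the slide


def combinations(length: int) -> int:
    """Number of distinct strings of `length` contiguous letters."""
    return ALPHABET_SIZE ** length


if __name__ == "__main__":
    # Reproduce the three rows shown on the slide.
    for n in (8, 9, 10):
        print(f"{n}: {combinations(n):,}")
```

Each extra character multiplies the search space by 26, which is why the problem becomes compute-heavy even though the input text itself is small.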

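
The Map and Reduce slides describe the two phases only in outline, and the transcript does not include the Million Monkeys algorithm itself. As a rough single-process illustration (not the presenter's Hadoop code), the sketch below lets each map task act as a virtual monkey that types random letter groups and emits the ones found in a target text, while the reduce step merges the matches into counts. The names `monkey_map` and `monkey_reduce`, the sample text, and the group length of 3 are all assumptions made for this example.

```python
import random
import string

GROUP_LENGTH = 3  # hypothetical; kept short so matches turn up quickly


def monkey_map(seed, attempts, text):
    """Map: one virtual monkey types random groups and emits those found in text."""
    rng = random.Random(seed)  # seeded so each task is reproducible
    for _ in range(attempts):
        group = "".join(
            rng.choice(string.ascii_lowercase) for _ in range(GROUP_LENGTH)
        )
        if group in text:
            yield group, 1


def monkey_reduce(pairs):
    """Reduce: count how many times each matching group was typed."""
    counts = {}
    for group, n in pairs:
        counts[group] = counts.get(group, 0) + n
    return counts


if __name__ == "__main__":
    text = "tobeornottobethatisthequestion"  # illustrative sample only
    # Each seed stands in for an independent map task on a cluster node.
    pairs = (
        pair
        for seed in range(4)
        for pair in monkey_map(seed, 50_000, text)
    )
    found = monkey_reduce(pairs)
    print(f"{len(found)} distinct groups found")
```

Because the map tasks share no state, adding more of them (more seeds here, more nodes on a cluster) increases throughput without changing the code, which is the scaling property the Conclusion slide points to.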