Strata + Hadoop World 2012 Keynote: Beyond Batch - Doug Cutting
 


Hadoop started as an offline, batch-processing system. It made it practical to store and process much larger datasets than before. Subsequently, more interactive, online systems emerged, integrating with Hadoop. First among these was HBase, the key/value store. Now scalable interactive query engines are beginning to join the Hadoop ecosystem. Realtime is gradually becoming a viable peer to batch in big data.



Usage Rights

© All Rights Reserved

  • You've heard a lot. Worried it might be hype, a bubble. You might be hesitating. Belief: Hadoop has a great future. Over the next few minutes, tell you where Hadoop is today and where Hadoop's going, so you can be comfortable adopting it for long-term profit from all of your data.
  • Proven incredibly useful. Enables folks to benefit from vastly more data. Not something we're ashamed of, rather proud of.
  • … Need to look forward
  • …Back to today…
  • Major new capability in Impala. Not a niche: another step towards a grander future. We know where we're headed. We shouldn't resist adoption. Use Impala today and expect more tomorrow.
  • For more, attend the 1:40 presentation on Impala

Presentation Transcript

  • Beyond Batch. Doug Cutting, October 2012 (slide marked: do not use publicly prior to 10/23/12)
  • Hadoop Started As Batch
    • Simple, powerful MapReduce
    • Kills a lot of birds
    • Efficient, scalable
    • Compute at storage
    • Shared platform
    • Used by Pig, Hive, etc.
    • Incredibly useful!
    • But not sufficient
  • Big Data Is Not (Just) Batch. Its true themes are:
    • Scalability
    • Affordability
    • Commodity hardware
    • Open-source software
    • Distributed & reliable
    • Schema on read
    • Data beats algorithms
  • HBase: First Non-Batch Component. Online key/value store
    • Complement to batch
    • Online put/get
    • Batch load & analyze
    • Best of both
    • Popular combination
    • A step towards the future…
  • Holy Grail Of Big Data
    • Open source, commodity HW, etc.
    • Linear scaling
    • To scale, just buy more hardware
    • On many axes
    • Storage capacity
    • Throughput & latency
    • of batch & query
    • Transactions, Joins, Indexes
    • and batch!
  • Google Gives Us A Map (Google publication → open source / Apache project)
    • 2004 GFS & MapReduce → 2006 Hadoop: batch programs
    • 2005 Sawzall → 2008 Pig & Hive: batch queries
    • 2006 BigTable → 2008 HBase: online key/value
    • ...
    • 2012 Spanner → ?: transactions, etc. (holy grail?)
    • 5 years, 26 authors!
  • Impala Is Latest Step (Google publication → open source / Apache project)
    • 2004 GFS & MapReduce → 2006 Hadoop: batch programs
    • 2005 Sawzall → 2008 Pig & Hive: batch queries
    • 2006 BigTable → 2008 HBase: online key/value
    • 2010 Dremel/F1 → 2012 Impala: online queries
    • ...
    • 2012 Spanner → ?: transactions, etc. (holy grail?)
  • @cutting #bigquestions