Big Data for the Rest of Us - OpenWest 2014 - Matt Asay

Everyone wants to launch a Big Data project, but there's still a lot of confusion about how. The key, as this presentation shows, is to minimize the cost of failure and increase iteration by building on well-known open-source tools.

  1. Big Data for the Rest of Us. Matt Asay, VP, Marketing & Business Development, MongoDB. (MongoDB Inc. Proprietary and Confidential)
  2. Not the Future: “The relational database market is a $9 billion a year market. I want to shrink it to $3 billion and take a third of the market.” - Marten Mickos
  3. This Is the Future: “The biggest category of winners is the Big Data practitioners. These are the business people that have identified opportunities to use data to create new opportunities or disrupt legacy business models. We think this opportunity is so profound, we believe that the dividing line between winners and losers in the business world over the next decade will hinge on a company’s ability to leverage data as an asset.” - Peter Goldmacher, Cowen & Co.
  4. What’s at Stake: Enable a Generation of Innovative, Modern Applications Previously Impossible Or Too Difficult to Achieve
  5. The Big Data Unknown
  6. Top Big Data Challenges? Translation? Most struggle to know what Big Data is, how to manage it and who can manage it. Source: Gartner
  7. Big Data: Volume Matters
     • More than 90% of today’s data was created in the last 2 years
     • Moore’s Law for data: Doubles at regular intervals
  8. Big(ger) Is the New Normal
  9. Volume Is Not the Problem: “Of Gartner's "3Vs" of big data (volume, velocity, variety), the variety of data sources is seen by our clients as both the greatest challenge and the greatest opportunity.” - Forrester, 2014
     What are the primary data issues driving you to consider Big Data?*
     • Data Variety (68%): Diverse, streaming or new data types
     • Data Volume (15%): Greater than 100TB
     • Other Data (17%): Less than 100TB
     * From Big Data Executive Summary of 50+ execs from F100, gov orgs
  10. Modern, “Big” Data Is Messy
  11. Data Now Looks Like This
  12. And This
  13. And This
  14. Time to Rethink the Solution
     • 90% of the world’s data was created in the last two years
     • 80% of enterprise data is unstructured
     • Unstructured data growing 2X faster than structured
  15. Innovation As Iteration
  16. “I have not failed. I've just found 10,000 ways that won't work.” - Thomas A. Edison
  17. Back in 1970…Cars Were Great!
  18. So Were Computers!
  19. Including the Relational Database
  20. Lots of Great Innovations Since 1970
  21. Legacy Data Infrastructure Makes Development Hard (diagram labels: Relational Database, Object Relational Mapping, Application Code, XML Config, DB Schema)
  22. And Even Harder To Iterate (diagram labels: New Table, New Table, New Column, Name, Pet, Phone, Email, New Column, 3 months later…)
  23. So…Use Open Source
  24. Big Data != Big Upfront Payment
  25. Shouldn’t Be Penalized for Success: “Clients can also opt to run zEC12 without a raised datacenter floor -- a first for high-end IBM mainframes.” - IBM Press Release, 28 Aug, 2012
  26. Spoiled for choice: DB-Engines.com Database Ranking

      Ranking  Database          Type         Score    Changes
      1        Oracle            Relational   1514.08  22.28
      2        MySQL             Relational   1292.67  2.45
      3        Microsoft SQL     Relational   1210.43  5.15
      4        PostgreSQL        Relational   230.23   -4.82
      5        MongoDB           Document     214.34   14.35
      6        DB2               Relational   184.58   -2.74
      7        Microsoft Access  Relational   142.76   -3.72
      8        SQLite            Relational   90.17    -2.8
      9        Cassandra         Wide Column  78.72    0.63
      10       Sybase            Relational   78.14    -3.42

  27. Remember the Long Tail?
  28. It Didn’t Work Out So Well
  29. Use Popular, Well-Known Technologies. Source: Silicon Angle, 2012
  30. The Data Scientist Is You: “Organizations already have people who know their own data better than mystical data scientists…. Learning Hadoop is easier than learning the company’s business.” (Gartner, 2012)
  31. @mjasay
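
Slides 21 and 22 argue that a fixed relational schema slows iteration, since each new attribute means a new column or a new table. As a rough sketch of the alternative the talk points toward, the Python below uses plain dictionaries as stand-ins for documents. The records, the `pets` list, and the `contact_line` helper are all hypothetical, chosen only to echo the Name, Pet, Phone and Email labels on slide 22; no database or MongoDB API is involved.

```python
# Hypothetical "documents": each record carries only the fields it has.
# Field names loosely follow the labels on slide 22 (Name, Pet, Phone, Email).
people = [
    {"name": "Ada", "phone": "555-0100"},
    {"name": "Grace", "phone": "555-0101", "email": "grace@example.com"},
    {"name": "Linus", "pets": ["cat", "dog"]},  # a field added later, no migration
]


def contact_line(person):
    """Build a display line from whichever fields this record happens to have."""
    parts = [person["name"]]
    for field in ("phone", "email"):
        if field in person:
            parts.append(person[field])
    if person.get("pets"):
        parts.append("pets: " + ", ".join(person["pets"]))
    return " | ".join(parts)


lines = [contact_line(p) for p in people]
for line in lines:
    print(line)
```

The point of the sketch is only that older records keep working when a newer record introduces a field; in a relational design the same change would usually mean altering a table first.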
