THE BIG DATA CON: Why Big Data is a Problem, Not a Solution #TheBigDataCon
IAN PLOSKER, Director, Technical Operations, EMEA, Basho Technologies @dstroyallmodels
WHO IS basho?
WE MAKE
DISCLAIMERS
ALL OPINIONS EXPRESSED HEREIN ARE MY OWN AND NOT THOSE OF MY EMPLOYER OR ANYONE ELSE
I’M NOT TROLLING
I’M NOT FUD RAKING
WELL, MAYBE JUST A LITTLE BIT
LET’S GET STARTED
WHAT IS BIG DATA?
HOW BIG IS BIG?
GIGABYTES, TERABYTES, PETABYTES, EXABYTES?
IF THERE’S BIG DATA, SHOULDN’T THERE ALSO BE MEDIUM AND SMALL DATA?
BIG DATA IS MORE DATA THAN YOU KNOW WHAT TO DO WITH
BIG DATA IS THE DATA THAT YOU DON’T KNOW WHAT TO DO WITH
THE PROMISE OF BIG DATA
IS TO STORE THE REAMS
YOU DON’T KNOW HOW TO USE
SO YOU CAN EXTRACT VALUE FROM IT IN THE FUTURE
LET’S BE HONEST
THE PEOPLE WHO ARE MAKING MONEY OFF OF BIG DATA
ARE THE PEOPLE EXTRACTING VALUE FROM IT TODAY
I.E. THE PEOPLE SELLING BIG DATA SOLUTIONS
EVERYONE AND THEIR GRANDMA IS TRYING TO GET IN ON THE ACTION
VENDORS ARE REPACKAGING THE SAME OLD THING
AND TRYING TO TRICK US INTO THINKING IT’S THE NEW HOTNESS
LET’S NOT BE FOOLED BY MARKETING
WORDS ARE IMPORTANT
MARKETING WORKS TO SEPARATE WORDS FROM THEIR MEANING
FROM THEIR ORIGINAL INTENT
TO GET YOU TO ASSOCIATE WORDS
WITH PARTICULAR
PRODUCTS
BRANDS
AND VENDORS
</RANT>
LET’S TRY TO IMPROVE THE STATE OF DISCOURSE
SO LET’S BRING SOME NEW CATCHPHRASES TO THE TABLE
SO WE DON’T HAVE TO TALK ABOUT BIG DATA ANYMORE
INSTEAD LET’S TALK ABOUT CRITICAL DATA
WHAT IS CRITICAL DATA?
Is your data really that critical, dude?
IT’S MISSION CRITICAL DATA
IT’S DATA WHOSE UNAVAILABILITY
COSTS YOU MONEY
CRED
OR LIVES
IT IS THE DATA NEEDED NOW
NOT AT SOME DISTANT POINT IN THE FUTURE
IT IS THE DATA THAT YOU
YOUR CUSTOMERS
OR SOCIETY
CAN CAPITALIZE ON TODAY
ITS VALUE IS CAPTURED BY THE OWNERS OF THE DATA
RATHER THAN A THIRD PARTY
HOW DO YOU IDENTIFY YOUR CRITICAL DATA?
LATENCY AT SOME MARGIN APPEARS SIMPLY AS UNAVAILABILITY
Who cares about latency? Sometimes high latency looks like an outage to the end user.
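To make the point above concrete, here is a minimal sketch in Python of how latency turns into unavailability once a deadline is applied. The function names, the sample latencies, and the 500 ms deadline are all hypothetical values chosen for illustration; they do not come from the talk.

```python
def classify_requests(latencies_ms, deadline_ms):
    """Split observed latencies into served and effectively unavailable.

    A request slower than the deadline is treated exactly like a failed
    request, because the caller has already given up on it.
    """
    served = [latency for latency in latencies_ms if latency <= deadline_ms]
    timed_out = [latency for latency in latencies_ms if latency > deadline_ms]
    return served, timed_out


def effective_availability(latencies_ms, deadline_ms):
    """Fraction of requests answered within the deadline."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    served, _ = classify_requests(latencies_ms, deadline_ms)
    return len(served) / len(latencies_ms)


# Hypothetical samples: every request eventually succeeded on the server,
# but two of the six exceeded the (assumed) 500 ms client deadline.
samples = [40, 55, 90, 120, 800, 2500]
availability = effective_availability(samples, deadline_ms=500)
print(f"effective availability: {availability:.2%}")
```

In this sketch the server never returns an error, yet a third of the requests count as outages from the end user's point of view.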
FOR AMAZON: 100MS LATENCY DECREASES SALES BY 1% (Source: http://sites.google.com/site/glinden/Home/StanfordDataMining.2006-11-28.ppt)
FOR GOOGLE: A 500MS INCREASE IN LATENCY REDUCED TRAFFIC BY 20 PERCENT (Source: http://sites.google.com/site/glinden/Home/StanfordDataMining.2006-11-28.ppt)
WHAT DOES A SYSTEM FOR CRITICAL DATA LOOK LIKE?
STREAMING PROCESSING
STORM
RIAK_PIPE
EXAMPLES OF DYNAMO SYSTEMS
VOLDEMORT
THESE SYSTEMS SACRIFICE CONSISTENCY FOR HIGH AVAILABILITY/LOW LATENCY
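One common way to picture this trade-off in Dynamo-style stores is the replica quorum: each value is kept on N replicas, a read waits for R of them, and a write waits for W. The Python sketch below is a simplified model under that assumption. It is not the API of Riak, Voldemort, or any other product, and the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Simplified N/R/W model of a Dynamo-style store (illustrative only)."""

    n: int  # replicas per key
    r: int  # replicas that must answer a read
    w: int  # replicas that must acknowledge a write

    def __post_init__(self):
        if not (1 <= self.r <= self.n and 1 <= self.w <= self.n):
            raise ValueError("r and w must each be between 1 and n")

    def reads_overlap_writes(self):
        """True when every read set must intersect every write set."""
        return self.r + self.w > self.n

    def read_failures_tolerated(self):
        """Replicas that can be down while reads still succeed."""
        return self.n - self.r

    def write_failures_tolerated(self):
        """Replicas that can be down while writes still succeed."""
        return self.n - self.w


for config in (QuorumConfig(3, 1, 1), QuorumConfig(3, 2, 2), QuorumConfig(3, 3, 3)):
    print(
        config,
        "overlap:", config.reads_overlap_writes(),
        "read failures tolerated:", config.read_failures_tolerated(),
        "write failures tolerated:", config.write_failures_tolerated(),
    )
```

In this model, lowering R and W means waiting on fewer replicas and surviving more failures, at the price of reads that may miss the latest write. Raising them does the reverse.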
I’M NOT MAKING A SALES PITCH
DON’T USE MY DATABASE
SERIOUSLY, DON’T
UNLESS
YOU ARE WILLING TO SACRIFICE
FAMILIAR DATA AND QUERY MODELS
FAMILIAR HIRING PATTERNS
KNOWN OPERATIONAL ISSUES
FOR
PREDICTABLE LATENCY
AVAILABILITY
PREDICTABLE OPERATIONS
OR IF DATA UNAVAILABILITY COSTS YOU $$$ OR MORE
THANKS
ian@basho.com  @dstroyallmodels  github.com/ian-plosker
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian Plosker
"Big Data" is a commonly brandished term, but rarely does anyone provide a clear definition of it. If we stop to define Big Data, we find that at its core, it is about an inability to meet the challenges of information storage and access in a world where data is generated by the petabyte. The alternative is the concept of Critical Data: data that generates value for businesses, and whose unavailability costs money or even lives. This is the data that has already been identified and understood to be at the core of a business. Together, we will gain a deeper understanding of how to gather, store, and access this business-critical information.

Published in: Technology