
Big Data Week - Myths and Legends

Presentation by Nick Halstead on some of the Myths around Big Data.

Published in: Technology, Business
1 Comment
  • Thanks, Nick, for the slides. I won't pretend that I understood everything in them, but I wanted to ask: did you record this session as a video?

    If it's available, could you post it too? We're using data for our startup, yet we are still defining our stakeholders.
Transcript of "Big Data Week - Myths and Legends"

  1. NICK HALSTEAD, FOUNDER, DATASIFT, @NIK. Big Data “Myths and Legends” #BDW13. Thursday, 25 April 13
  2. BIG DATA: SOCIAL DATA + TV MONITORING, POLITICAL TRACKING, FINANCIAL FEEDS #DATASIFT #BDW13
  3. BIG DATA: SOCIAL DATA + TV MONITORING, POLITICAL TRACKING, FINANCIAL FEEDS; 1.5 BILLION ITEMS PER DAY #DATASIFT #BDW13
  4. BIG DATA: SOCIAL DATA + TV MONITORING, POLITICAL TRACKING, FINANCIAL FEEDS; 1.5 BILLION ITEMS PER DAY; 1.5 PETABYTES OF STORAGE #DATASIFT #BDW13
  5. BIG DATA: SOCIAL DATA + TV MONITORING, POLITICAL TRACKING, FINANCIAL FEEDS; 1.5 BILLION ITEMS PER DAY; 1.5 PETABYTES OF STORAGE; 5000 CPU HADOOP CLUSTER #DATASIFT #BDW13
  6. Big Data “Myths and Legends” #BD13
  7. BIG DATA PERCEPTION: I THOUGHT I WOULD ASK GOOGLE.... #GOOGLE
  8. BIG DATA PERCEPTION: I THOUGHT I WOULD ASK GOOGLE.... #GOOGLE
  9. BIG DATA PERCEPTION: I THOUGHT I WOULD ASK GOOGLE.... #GOOGLE
  10. BIG DATA VENDOR “MYTHS”
  11. (no text)
  12. BIG DATA VENDOR “MYTHS”
  13. #BDW13
  14. 1. YOU MUST BUY ALL OF THIS (for one job!) #BDW13
  15. 2. HOW BIG IS “BIG”
  16. #BDW13
  17. 20 PETABYTES IN EACH SEARCH INDEX REBUILD (this was 2 years ago) #BDW13
  18. 20 PETABYTES IN EACH SEARCH INDEX REBUILD (this was 2 years ago); 900,000 SERVERS #BDW13
  19. #BDW13
  20. 3.2 BILLION LIKES AND COMMENTS PER DAY #BDW13
  21. 3.2 BILLION LIKES AND COMMENTS PER DAY; OVER HALF A PETABYTE … EVERY 24 HOURS #BDW13
  22. #BDW13 #HADRON
  23. 150 MILLION SENSORS DELIVERING DATA 40 MILLION TIMES PER SECOND #BDW13 #HADRON
  24. 150 MILLION SENSORS DELIVERING DATA 40 MILLION TIMES PER SECOND; 10s OF PETABYTES PER YEAR #BDW13 #HADRON
  25. A TYPICAL COMPANY
  26. A TYPICAL COMPANY: 100 EMPLOYEES
  27. A TYPICAL COMPANY: 100 EMPLOYEES; 10,000 CUSTOMERS
  28. A TYPICAL COMPANY: 100 EMPLOYEES; 10,000 CUSTOMERS; 25 DATABASES (customers, transactions, etc)
  29. A TYPICAL COMPANY: 100 EMPLOYEES; 10,000 CUSTOMERS; 1 MILLION TRANSACTION RECORDS; 25 DATABASES (customers, transactions, etc)
  30. A TYPICAL COMPANY: 100 EMPLOYEES; 10,000 CUSTOMERS; 1 MILLION TRANSACTION RECORDS; 5,000 BYTES PER TRANSACTION; 25 DATABASES (customers, transactions, etc)
  31. A TYPICAL COMPANY: 100 EMPLOYEES; 10,000 CUSTOMERS; 1 MILLION TRANSACTION RECORDS; 5,000 BYTES PER TRANSACTION; 25 DATABASES (customers, transactions, etc); = 4 GIGABYTES (for largest database)
  32. A TYPICAL COMPANY: 100 EMPLOYEES; 10,000 CUSTOMERS; 1 MILLION TRANSACTION RECORDS; 5,000 BYTES PER TRANSACTION; 25 DATABASES (customers, transactions, etc); = 4 GIGABYTES (for largest database); = 20 GIGABYTES (for ALL company data)
  33. A TYPICAL HARD DRIVE: 2000 GIGABYTES (2TB)
  34. A TYPICAL HARD DRIVE: 2000 GIGABYTES (2TB); 4000 GIGABYTES (4TB)
  35. 3. YOU NEED *LOTS* OF DATA SCIENTISTS #DILBERT #BDW13
  36. 3. YOU NEED *LOTS* OF DATA SCIENTISTS #DILBERT #BDW13
  37. 4. HOW BIG DATA IS USED #BDW13
  38. 4. HOW BIG DATA IS USED: BANKING #BDW13
  39. 4. HOW BIG DATA IS USED: BANKING; COMMUNICATIONS #BDW13
  40. 4. HOW BIG DATA IS USED: BANKING; COMMUNICATIONS; GOVERNMENT #BDW13
  41. 4. HOW BIG DATA IS USED #BDW13
  42. 4. HOW BIG DATA IS USED: WEB LOGS 51% #BDW13
  43. 4. HOW BIG DATA IS USED: WEB LOGS 51%; CLICK STREAM 35% #BDW13
  44. 5. HADOOP GONE BAD +SQL #BDW13 #HADOOPGONEBAD
  45. RDBMS - RELATIONAL DATABASE #BDW13
  46. RDBMS - RELATIONAL DATABASE: NEEDS TO BE PRE-DEFINED #BDW13
  47. RDBMS - RELATIONAL DATABASE: NEEDS TO BE PRE-DEFINED; REQUIRES INDEX TO PERFORM #BDW13
  48. RDBMS - RELATIONAL DATABASE: NEEDS TO BE PRE-DEFINED; REQUIRES INDEX TO PERFORM; QUERIES ARE CONSTRAINED #BDW13
  49. MAP REDUCE #MAPREDUCE #BDW13
  50. MAP REDUCE: PROCESS CLOSE TO THE DATA #MAPREDUCE #BDW13
  51. MAP REDUCE: PROCESS CLOSE TO THE DATA; PARALLEL EXECUTION #MAPREDUCE #BDW13
  52. MAP REDUCE: PROCESS CLOSE TO THE DATA; PARALLEL EXECUTION; ANY TYPE OF ANALYSIS #MAPREDUCE #BDW13
  53. MAP REDUCE: PROCESS CLOSE TO THE DATA; PARALLEL EXECUTION; ANY TYPE OF ANALYSIS; HIDES DETAILS OF FAULT TOLERANCE, LOCALITY AND LOAD BALANCING #MAPREDUCE #BDW13
  54. BIG DATA SCHEMA #NOSQL: HBASE, COLUMNS, FILES #BDW13
  55. (QUICK ASIDE) #SIDEBAR
  56. GOOGLE FILE SYSTEM (GFS), GOOGLE MAPREDUCE (GMR). GOOGLE STARTED ALL THIS....
  57. GOOGLE DREMEL http://bit.ly/mS8QxX #BDW13
  58. GOOGLE DREMEL: INTERACTIVE ANALYSIS http://bit.ly/mS8QxX #BDW13
  59. GOOGLE DREMEL: INTERACTIVE ANALYSIS; SCALE UP TO 10,000 SERVERS http://bit.ly/mS8QxX #BDW13
  60. GOOGLE DREMEL: INTERACTIVE ANALYSIS; SCALE UP TO 10,000 SERVERS; COLUMN STORAGE http://bit.ly/mS8QxX #BDW13
  61. OpenDremel; GOOGLE BIG QUERY #BDW13
  62. GOOGLE SPANNER http://research.google.com/archive/spanner.html #SPANNER #NEWSQL
  63. GOOGLE SPANNER http://research.google.com/archive/spanner.html #SPANNER #NEWSQL
  64. GOOGLE SPANNER: RELATIONAL DATABASE http://research.google.com/archive/spanner.html #SPANNER #NEWSQL
  65. GOOGLE SPANNER: RELATIONAL DATABASE; GLOBALLY DISTRIBUTED http://research.google.com/archive/spanner.html #SPANNER #NEWSQL
  66. GOOGLE SPANNER: RELATIONAL DATABASE; GLOBALLY DISTRIBUTED; USE GPS / TRUETIME http://research.google.com/archive/spanner.html #SPANNER #NEWSQL
  67. GOOGLE SPANNER: RELATIONAL DATABASE; GLOBALLY DISTRIBUTED; USE GPS / TRUETIME; NO OPEN SOURCE EQUIVALENT http://research.google.com/archive/spanner.html #SPANNER #NEWSQL
  68. (no text)
  69. BIG DATA IS THE NEW OIL
  70. NICK HALSTEAD, FOUNDER. HTTP://DATASIFT.COM. WE ARE HIRING!!
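
The "typical company" and "typical hard drive" slides amount to a back-of-envelope sizing argument, which can be reproduced as a rough sketch. The record count, bytes per transaction, 20 gigabyte total and 2TB drive size below are the figures quoted on the slides; the function and variable names are illustrative assumptions, not anything from the talk. Multiplying the quoted figures gives 5 × 10^9 bytes (about 4.66 GiB) for the largest database, so the slide's "4 GIGABYTES" is presumably a rounded estimate.

```python
# Back-of-envelope sizing using the figures quoted on the slides.
# All names here are hypothetical; only the numbers come from the transcript.
TRANSACTION_RECORDS = 1_000_000   # "1 MILLION TRANSACTIONS RECORDS"
BYTES_PER_TRANSACTION = 5_000     # "5,000 BYTES PER TRANSACTION"
ALL_COMPANY_DATA_GB = 20          # "=20 GIGABYTES (for ALL company data)"
DRIVE_GB = 2_000                  # "2000 GIGABYTES (2TB)"


def largest_database_bytes(records: int, bytes_per_record: int) -> int:
    """Raw size of the transactions table, ignoring indexes and overhead."""
    return records * bytes_per_record


def fraction_of_drive(data_gb: float, drive_gb: float) -> float:
    """Share of a single drive that the data set would occupy."""
    return data_gb / drive_gb


largest = largest_database_bytes(TRANSACTION_RECORDS, BYTES_PER_TRANSACTION)
share = fraction_of_drive(ALL_COMPANY_DATA_GB, DRIVE_GB)

# Report decimal gigabytes and binary gibibytes side by side.
print(f"largest database: {largest / 10**9:.1f} GB ({largest / 2**30:.2f} GiB)")
print(f"all company data as a share of one 2TB drive: {share:.0%}")
```

Under these assumptions all of the company's data occupies roughly 1% of a single commodity drive, which appears to be the point of the "HOW BIG IS BIG" section.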