A (very) short history of big data

My lightning talk from the BigDataCamp in Washington, DC this past November (2011).

Published in: Sports, Technology, Business
Notes
  • What was the problem? It took 7 years to tabulate, using people. Not a job I’d want.
  • 24 values = 4.5 bits, so 80 variables = 360 bits of data, or 45 bytes per person. 45 bytes x 62M = 2.8GB. Held onto data for a few months.
  • In 1880, it was 1-2 GB of data. And they couldn’t order bigger people to process the data faster
  • Many years like this. So what changed? The World Wide Web, and social services.
  • 1999: 2 exabytes generated in the entire year. Images, movies, sensors. But what’s driving interest in big data is two things:
  • Two Stanford students were trying to solve the problem of divining a web page's "importance". This is separate from search, e.g. if two pages had roughly the same content, which to show first? It comes down to finding the dominant eigenvalue and eigenvector of a matrix. Google came up with systems for storing & processing the web.
  • We’ll talk about this in detail later
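The census size estimate in the notes above can be checked with a few lines of Python. This is only a rough sketch: it assumes the 360 bits come from the 80 variables on the punched card (slide 3) at about 4.5 bits each, and the function names are illustrative rather than anything from the talk.

```python
import math


def bits_per_variable(num_values: int) -> float:
    """Bits needed to encode one variable that can take num_values values."""
    return math.log2(num_values)


def dataset_bytes(num_records: int, num_variables: int, bits_per_var: float) -> float:
    """Total size in bytes for num_records records of num_variables each."""
    return num_records * num_variables * bits_per_var / 8


# log2(24) is roughly 4.58; the notes round this down to 4.5 bits.
exact_bits = bits_per_variable(24)
rounded_bits = 4.5

# Assumption: 80 variables per card, as on the Hollerith slide.
bytes_per_card = 80 * rounded_bits / 8

# 62M people, as quoted in the notes for the 1890 census.
total_bytes = dataset_bytes(62_000_000, 80, rounded_bits)

print(f"bits per variable (exact): {exact_bits:.2f}")
print(f"bytes per card: {bytes_per_card:.0f}")
print(f"total: {total_bytes / 1e9:.2f} GB")
```

With the rounded 4.5 bits the total comes to about 2.8 GB, matching the notes. Using the exact log2(24) would give a slightly larger figure.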
  • Transcript

    • 1. A Very Short History of Big Data. Lightning Talk. Photo by: exfordy, flickr. Copyright (c) 2008 Scale Unlimited, Inc. All Rights Reserved. Reproduction or distribution of this document in any form without prior written permission is forbidden. Monday, December 19, 2011
    • 2. The First Big Data Problem: 1880 Census. 50 million people. Age, gender, number of insane people in household.
    • 3. The First Big Data Solution: Hollerith Tabulating System. Punched cards - 80 variables. Used for 1890 Census. 6 weeks instead of 7+ years.
    • 4. What is Big Data? I know it when I see it. More than you can handle with the computer you’ve got. And scaling up isn’t an option.
    • 5. Big Science == Big Data: Weather predictions. Super-collider data. Astronomy images.
    • 6. A Data Explosion: “Every two days now we create as much information as we did from the dawn of civilization up until 2003. That’s something like five exabytes of data” -- Google CEO Eric Schmidt. Gigabyte = 10^9 = 1,000,000,000. Terabyte = 10^12 = 1,000,000,000,000. Petabyte = 10^15 = 1,000,000,000,000,000. Exabyte = 10^18 = 1,000,000,000,000,000,000. OK, there’s a lot of data. Increased to 800 billion gigabytes in 2009. If every person on earth tweeted continuously for a century...
    • 7. Search: Analyzing lots of data. Important pages are those that important pages link to. Solving Satan’s spreadsheet: 100 billion rows x 100 billion columns.
    • 8. Advertising: Specifically online advertising. Lots of data in the form of log files. Lots of value if you increase sales.
    • 9. Advertising: Specifically online advertising. Lots of data in the form of log files. Lots of value if you increase sales. Targeted advertising can be good.
    • 10. Advertising: Specifically online advertising. Lots of data in the form of log files. Lots of value if you increase sales. Targeted advertising can be good. But scary, when they know too much. (Ad: “Satisfy your Barney Fetish. Pictures of Barney being drop-kicked off bridges. Discreet shipping. No questions asked.”)
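The search slide's idea that "important pages are those that important pages link to", which the notes describe as finding the dominant eigenvector of a matrix, can be illustrated with power iteration on a toy link graph. This is a minimal sketch and not Google's actual system: the four-page graph, the function name `pagerank`, and the damping factor of 0.85 are all assumptions made for illustration, and every link target is assumed to appear as a key in the input.

```python
def pagerank(links, damping=0.85, iterations=100, tol=1e-10):
    """Approximate page importance by power iteration.

    links maps each page to the list of pages it links to.
    Assumes every link target is also a key in links.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform guess

    for _ in range(iterations):
        # Every page receives a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its damped rank evenly among its links.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical four-page web: "c" is linked to by every other page.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.4f}")
```

On this toy graph, page "c" ends up ranked highest and page "d", which nothing links to, ranked lowest. At the scale on the slide (100 billion rows by 100 billion columns) the same iteration has to be spread across many machines, which is where the storage and processing systems mentioned in the notes come in.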
