A (very) short history of big data

My lightning talk from the BigDataCamp in Washington, DC this past November (2011).

  • What was the problem? It took 7 years to tabulate, using people. Not a job I’d want.
  • 24 values ≈ 4.5 bits per variable, so 80 variables is 360 bits of data, or 45 bytes per person. 45 bytes x 62M people ≈ 2.8 GB. They held onto the data for a few months.
  • In 1880, it was 1-2 GB of data, and they couldn’t order bigger people to process the data faster.
  • It stayed like this for many years. So what changed? The World Wide Web, and social services.
  • 1999: 2 exabytes generated in the entire year. Images, movies, sensors. But what’s driving interest in big data is two things:
  • Two Stanford students were trying to solve the problem of divining a web page's "importance". This is separate from search, e.g. if two pages had roughly the same content, which to show first? It comes down to finding the dominant eigenvalue and eigenvector of a matrix. Google came up with systems for storing & processing the web.
  • We’ll talk about this in detail later.
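The back-of-the-envelope sizing of the 1890 census in the notes above can be checked with a few lines of Python. This is only a sketch of that arithmetic: the figures (80 variables per card, 24 values per variable, 62M people) come from the slides and notes, while the function and constant names are made up for illustration. It also shows that rounding log2(24) down to 4.5 bits, as the notes do, barely changes the result.

```python
import math

# Figures taken from the slides and speaker notes.
FIELDS_PER_CARD = 80        # "Punched cards - 80 variables"
VALUES_PER_FIELD = 24       # "24 values" per variable
PEOPLE_1890 = 62_000_000    # "62M" people


def dataset_bytes(people, fields=FIELDS_PER_CARD, bits_per_field=4.5):
    """Total size in bytes if each person is one card of `fields` variables."""
    return people * fields * bits_per_field / 8


# Exact information content of a 24-valued variable, versus the rounded 4.5 bits.
exact_bits = math.log2(VALUES_PER_FIELD)

rounded_total = dataset_bytes(PEOPLE_1890)
exact_total = dataset_bytes(PEOPLE_1890, bits_per_field=exact_bits)

print(f"bits per variable (exact): {exact_bits:.3f}")
print(f"size with 4.5 bits/variable: {rounded_total / 1e9:.2f} GB")
print(f"size with exact bits/variable: {exact_total / 1e9:.2f} GB")
```

Either way the total lands near 2.8 GB (using 1 GB = 10^9 bytes, as on the slides).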

A (very) short history of big data Presentation Transcript

  • 1. A Very Short History of Big Data. Lightning Talk. Photo by: exfordy, flickr. Copyright (c) 2008 Scale Unlimited, Inc. All Rights Reserved. Reproduction or distribution of this document in any form without prior written permission is forbidden. Monday, December 19, 2011
  • 2. The First Big Data Problem: 1880 Census. 50 million people. Age, gender, number of insane people in household.
  • 3. The First Big Data Solution: Hollerith Tabulating System. Punched cards, 80 variables. Used for 1890 Census. 6 weeks instead of 7+ years.
  • 4. What is Big Data? I know it when I see it. More than you can handle with the computer you’ve got, and scaling up isn’t an option.
  • 5. Big Science == Big Data: weather predictions, super-collider data, astronomy images.
  • 6. A Data Explosion. “Every two days now we create as much information as we did from the dawn of civilization up until 2003. That’s something like five exabytes of data” -- Google CEO Eric Schmidt. Gigabyte = 10^9 = 1,000,000,000; Terabyte = 10^12 = 1,000,000,000,000; Petabyte = 10^15 = 1,000,000,000,000,000; Exabyte = 10^18 = 1,000,000,000,000,000,000. OK, there’s a lot of data. Increased to 800 billion gigabytes in 2009. If every person on earth tweeted continuously for a century...
  • 7. Search: analyzing lots of data. Important pages are those that important pages link to. Solving Satan’s spreadsheet: 100 billion rows x 100 billion columns.
  • 8. Advertising: specifically online advertising. Lots of data in the form of log files. Lots of value if you increase sales.
  • 9. Advertising: specifically online advertising. Lots of data in the form of log files. Lots of value if you increase sales. Targeted advertising can be good.
  • 10. Advertising: specifically online advertising. Lots of data in the form of log files. Lots of value if you increase sales. Targeted advertising can be good. But scary, when they know too much. (Example ad: “Satisfy your Barney Fetish. Pictures of Barney being drop-kicked off bridges. Discreet shipping. No questions asked.”)
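Slide 7's rule, "important pages are those that important pages link to", is circular, which is why the notes describe it as finding the dominant eigenvector of a matrix. One common way to approximate that eigenvector is power iteration: start with equal scores and repeatedly pass each page's score along its links until the scores settle. The sketch below is a toy illustration only, not Google's implementation. The four-page graph, the function name, and the damping factor of 0.85 (a conventional choice, not a figure from the talk) are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate page importance by power iteration.

    `links` maps each page to the list of pages it links to. Every link
    target must also appear as a key. `damping` is an assumed value.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal importance

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of inbound links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its score over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by every other page.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

At web scale the same idea has to run over the "100 billion rows x 100 billion columns" matrix from the slide, which is far too large for one machine and is what motivates distributed storage and processing.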