History of the Info: Part II Nick Ducoff CEO and Co-Founder, Infochimps

The History of Data

A comprehensive history of data, presented by Nick Ducoff, CEO of Infochimps.

Published in: Technology
  • Internet brought offline businesses online
  • Social networks created massive amounts of data
  • Babylon was the first society to systematically record knowledge, including the first census, which counted and recorded people and commodities for taxation and other purposes
  • The library at Thebes was the first known effort to gather many sources of knowledge and make them available in one place
  • Charged with collecting all the world's knowledge, the Library of Alexandria collected what is thought to have been nearly a half million objects
  • Codex replaces scrolls, enabling random access of information, or browsing.
  • Gutenberg’s printing press enables mass production and distribution of information
  • William Playfair invents the line, bar and pie charts, paving the way for Charles Minard’s famous graphical representation of Napoleon’s March
  • Alan Turing showed that any reasonable computation could be done by programming a machine. Claude Shannon solved the engineering problem of transmitting information over a noisy channel. Computer languages advanced quickly from first-generation languages to third-generation languages such as COBOL. Henriette Avram created the Machine-Readable Cataloging (MARC) system to metatag books. Relational databases enabled storage and lookup of data at scale. Tim Berners-Lee created the WWW, which led to mass adoption of the internet; the web quickly grew to billions of pages, prompting Brewster Kahle to begin systematically capturing and storing the information.
    1930s – Computation theory (Turing)
    1940s – Information theory (Shannon)
    1950s – Computer languages (1GL, 2GL, 3GL)
    1960s – Standardized metadata (Avram)
    1970s – Relational databases (IBM)
    1980s – WWW (Al Gore)
    1990s – Internet Archive (Kahle)
  • 1.8 ZB (zettabytes) of data, but it is still hard to find the pieces you want
  • Aggregated, organized, accessible. When you can easily identify, understand and access the pieces, you can build anything.
  • Map by Charles Joseph Minard portrays the losses suffered by Napoleon's army in the Russian campaign of 1812
  • Better BI decisions and data-driven apps
  • Transcript of "The History of Data"

    1. History of the Info: Part II. Nick Ducoff, CEO and Co-Founder, Infochimps
    2. Early 2000s
    3. Mid 2000s
    4. Present Day
    5. 3000 BC – Recording
    6. 3000 BC – Recording; 1200 BC – Aggregating
    7. 3000 BC – Recording; 1200 BC – Aggregating; 300 BC – Storing at Scale
    8. 300s AD – Random Access. Timeline: 3000 BC – Recording; 1200 BC – Aggregating; 300 BC – Storing at Scale; 300 AD – Random Access
    9. 3000 BC – Recording; 1200 BC – Aggregating; 300 BC – Storing at Scale; 300 AD – Random Access; 1400 AD – Mass Distribution
    10. 3000 BC – Recording; 1200 BC – Aggregating; 300 BC – Storing at Scale; 300 AD – Random Access; 1400 AD – Mass Distribution; 1700 AD – Infographics
    11. 1930s – Computation theory (Turing); 1940s – Information theory (Shannon); 1950s – Computer languages (1GL, 2GL, 3GL); 1960s – Standardized metadata (Avram); 1970s – Relational databases (IBM); 1980s – WWW (Al Gore); 1990s – Internet archive (Kahle). Timeline: 3000 BC – Recording; 1200 BC – Aggregating; 300 BC – Storing at Scale; 300 AD – Random Access; 1400 AD – Mass Distribution; 1700 AD – Infographics
    18. Tables on web pages; Open APIs; Commercial data sources. Augmentation; Completion; Normalization.

        Augmentation
        Name            ZIP    Average Rent
        Walter Cureton  78701  $400-$599
        Ivy Caldwell    94103  >$1500
        Regina Wootton  10027  $1000-$1499

        Completion
        Name           Address           City           ZIP
        Brian James    901 Red River     Austin         78701
        Terri Becraft  262 7th St.       San Francisco  94103
        Paz Brummit    603 W. 114th St.  New York       10027

        Normalization
        Name         Address                     Normalized Address
        Cecil Bartz  901 red river austin texas  901 Red River, Austin, TX 78701
        Genaro Luz   702 w. 32nd st austin       702 W. 32nd St., Austin, TX 78705
        Ruth Brown   114th + broadway, nyc       W. 114th St. & Broadway, New York, NY 10027
    22. [email_address]
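
The augmentation step on slide 18 (adding an average-rent column to name and ZIP records) could be sketched roughly as below. This is only an illustration, not how Infochimps implemented it: the function name `augment_with_rent` and the in-memory lookup `AVERAGE_RENT_BY_ZIP` are hypothetical, and the lookup merely stands in for whatever external source (a web table, an open API, a commercial feed) would supply the values in practice. The sample values are the ones shown on the slide.

```python
# Hypothetical lookup table standing in for an external data source.
# The rent brackets are the ones shown on slide 18.
AVERAGE_RENT_BY_ZIP = {
    "78701": "$400-$599",
    "94103": ">$1500",
    "10027": "$1000-$1499",
}


def augment_with_rent(records, rent_by_zip=AVERAGE_RENT_BY_ZIP):
    """Return copies of the records with an 'average_rent' field added.

    Records whose ZIP is missing from the lookup get None, so gaps stay
    visible instead of being filled with a guess.
    """
    return [
        {**record, "average_rent": rent_by_zip.get(record["zip"])}
        for record in records
    ]


people = [
    {"name": "Walter Cureton", "zip": "78701"},
    {"name": "Ivy Caldwell", "zip": "94103"},
    {"name": "Regina Wootton", "zip": "10027"},
]

augmented = augment_with_rent(people)
for row in augmented:
    print(row["name"], row["zip"], row["average_rent"])
```

The input records are copied, not modified, so the original dataset stays available for the completion and normalization steps, which would need their own reference sources (for example an address or geocoding service).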