Data: A Cautionary Tale by Daniel Katz
Published in: Education


Transcript

  • 1. A Cautionary Tale
  • 2.  
  • 3.  
  • 4.  
  • 5.
    • The Big Picture
    • Collect
    • Clean
    • Model
    • Store
    • Present
  • 6. { "classes": [ { "name": "Fundamental Process of Design", "professor": "Joo Youn Paek", "year": "2010", "semester": "Fall", "students": [ { "student": { "name": "Joe Student", "email": "itp4life@gmail.com", "twitter_name": "@itp4life", "blog_url": "http://itp4life.blogspot.com" } } ] } ] }
  • 7. <classes> <class> <name>Fundamental Process of Design</name> <professor>Joo Youn Paek</professor> <year>2010</year> <semester>Fall</semester> <students> <student> <name>Joe Student</name> <email>itp4life@gmail.com</email> <twitter_name>@itp4life</twitter_name> <blog_url>http://itp4life.blogspot.com</blog_url> </student> </students> </class> </classes>
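Slides 6 and 7 show the same class record written once as JSON and once as XML. As a rough sketch of how the two formats map onto the same structure, the Python below parses both with the standard library and compares the student fields. The function names are made up for this illustration, and the sample values follow the XML version on slide 7.

```python
import json
import xml.etree.ElementTree as ET

# The record from slide 6, written as valid JSON.
JSON_TEXT = """
{"classes": [{"name": "Fundamental Process of Design",
  "professor": "Joo Youn Paek", "year": "2010", "semester": "Fall",
  "students": [{"student": {"name": "Joe Student",
    "email": "itp4life@gmail.com",
    "twitter_name": "@itp4life",
    "blog_url": "http://itp4life.blogspot.com"}}]}]}
"""

# The same record from slide 7, as XML.
XML_TEXT = """
<classes><class>
  <name>Fundamental Process of Design</name>
  <professor>Joo Youn Paek</professor>
  <year>2010</year>
  <semester>Fall</semester>
  <students><student>
    <name>Joe Student</name>
    <email>itp4life@gmail.com</email>
    <twitter_name>@itp4life</twitter_name>
    <blog_url>http://itp4life.blogspot.com</blog_url>
  </student></students>
</class></classes>
"""


def students_from_json(text):
    """Return one dict per student from the JSON layout (hypothetical helper)."""
    data = json.loads(text)
    return [entry["student"]
            for cls in data["classes"]
            for entry in cls["students"]]


def students_from_xml(text):
    """Return one dict per <student> element, keyed by child tag name (hypothetical helper)."""
    root = ET.fromstring(text.strip())
    return [{field.tag: field.text for field in student}
            for student in root.iter("student")]


if __name__ == "__main__":
    print(students_from_json(JSON_TEXT) == students_from_xml(XML_TEXT))
```

Either parser ends up with a plain list of dictionaries, so the code downstream does not need to know which format the data arrived in.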
  • 8.  
  • 9.  
  • 10.
    • The Open Data Movement is in Full Swing
      • Governments
      • Institutions
      • Scientists
      • Enthusiasts
      • http://vimeo.com/2598878
  • 11.
    • Commercial tools and open source are starting to converge
  • 12.
    • There will always be assumptions
  • 13.
    • Bring it down
  • 14.
    • Freebase – Entity Graph
    • Infochimps
    • Twitter
    • Facebook
  • 15.
    • Data.gov
    • MTA
  • 16.
    • Arduino
    • Smart Phone
    • Other sensors
  • 17.  
  • 18.
    • Don’t be intimidated by data from disparate sources
  • 19.  
  • 20.  
  • 21.
    • Clean up messy data
    • Inconsistent data points
    • Identify patterns
    • Combine data from disparate sources
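The cleaning steps on slide 21 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two row lists are hypothetical, the Twitter handle is assumed to be the key shared by both sources, and `normalize_handle` and `combine` are names invented for this example.

```python
def normalize_handle(raw):
    """Strip whitespace and a leading '@', then lower-case the handle."""
    return raw.strip().lstrip("@").lower()


# Hypothetical rows from two sources that spell the same handle differently.
roster = [{"name": "Joe Student", "twitter_name": "@ITP4Life "}]
tweet_counts = [{"screen_name": "itp4life", "tweets": 12}]


def combine(roster_rows, count_rows):
    """Join the two sources on the normalized handle; missing counts become 0."""
    counts = {normalize_handle(row["screen_name"]): row["tweets"]
              for row in count_rows}
    combined = []
    for person in roster_rows:
        handle = normalize_handle(person["twitter_name"])
        combined.append({"name": person["name"],
                         "handle": handle,
                         "tweets": counts.get(handle, 0)})
    return combined


if __name__ == "__main__":
    print(combine(roster, tweet_counts))
```

Normalizing the key before joining is what lets the inconsistent spellings line up across the two sources.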
  • 22. Collection of Twitter responses from the API: value.parseJson().user.screen_name
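The expression on slide 22 reads like a Google Refine expression (the tool is linked on slide 29): it parses a cell holding a raw API response and pulls out the user's screen name. A rough Python equivalent, assuming each value is a JSON string with a nested `user` object, might look like this. The sample response is a hypothetical, heavily trimmed stand-in, not real API output.

```python
import json

# Hypothetical, heavily trimmed stand-in for one Twitter API response.
raw_value = '{"text": "hello", "user": {"screen_name": "itp4life"}}'


def screen_name(value):
    """Return user.screen_name from a JSON string, or None if it cannot be read."""
    try:
        parsed = json.loads(value)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON.
        return None
    if not isinstance(parsed, dict):
        return None
    user = parsed.get("user")
    if not isinstance(user, dict):
        return None
    return user.get("screen_name")


if __name__ == "__main__":
    print(screen_name(raw_value))
```

Returning `None` for unreadable rows keeps one malformed response from stopping the whole collection.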
  • 23.  
  • 24.
    • Where you store data depends on the type of data you are collecting
  • 25.
    • Non-programmers
      • Google Fusion Tables
    • For programmers
      • Geo Database and programming tools
        • PostGIS (PostgreSQL)
        • GeoTools (Java)
  • 26.
    • Non-programmers
      • Google Docs (read into Processing)
      • Microsoft Excel (internal charting tool)
      • Text based formatting (visualize with Google Chart API)
    • For programmers
      • Any relational database
        • MySQL
        • PostgreSQL
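For the relational option, here is a minimal sketch of how the class and student record from the earlier slides could be split into two tables. It uses Python's built-in SQLite module as a stand-in for MySQL or PostgreSQL, and the table names, column names, and helper functions are assumptions made for this example.

```python
import sqlite3


def build_db():
    """Create an in-memory database holding one class and one student."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE classes (
            id INTEGER PRIMARY KEY,
            name TEXT, professor TEXT, year INTEGER, semester TEXT
        );
        CREATE TABLE students (
            id INTEGER PRIMARY KEY,
            class_id INTEGER REFERENCES classes(id),
            name TEXT, email TEXT
        );
    """)
    cur = conn.execute(
        "INSERT INTO classes (name, professor, year, semester) VALUES (?, ?, ?, ?)",
        ("Fundamental Process of Design", "Joo Youn Paek", 2010, "Fall"),
    )
    # lastrowid is the id of the class row that was just inserted.
    conn.execute(
        "INSERT INTO students (class_id, name, email) VALUES (?, ?, ?)",
        (cur.lastrowid, "Joe Student", "itp4life@gmail.com"),
    )
    conn.commit()
    return conn


def students_by_class(conn):
    """Return (class name, student name) pairs by joining the two tables."""
    return conn.execute(
        "SELECT c.name, s.name FROM students s "
        "JOIN classes c ON c.id = s.class_id "
        "ORDER BY c.name, s.name"
    ).fetchall()


if __name__ == "__main__":
    print(students_by_class(build_db()))
```

The nesting in the JSON and XML versions becomes a foreign key here: each student row points back to its class through `class_id`.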
  • 27.
    • Graph Database
  • 28.
    • http://blog.blprnt.com/blog/blprnt/your-random-numbers-getting-started-with-processing-and-data-visualization
    • http://code.google.com/p/gdocjdbc/
  • 29.
    • http://www.infochimps.com/datasets/tweets-during-state-of-the-union-address
    • http://code.google.com/p/google-refine/
    • http://dev.twitter.com/doc/get/geo/search
    • http://flowingdata.com/2009/07/14/how-does-the-average-consumer-spend-his-money/
    • http://www.bls.gov/cex/
    • http://www.google.com/fusiontables/Home