Open Government Data and MongoDB

Given at MongoDC on June 27, 2011.


  1. Open Government Data & MongoDB: Luigi Montanez, luigi@sunlightfoundation.com
  2. Question? @LuigiMontanez
  3. Open Data + Open Source = Open Government
  4. MongoDB enables open data
  5. Opening Up Data
       ✴ Gather data from disparate sources
           ✴ Data dumps (SQL, fixed-width columns)
           ✴ Web scraping
           ✴ Text/PDF parsing
       ✴ Serving RESTful JSON APIs
  6. JSON
       ✴ Tree structure, not tabular
       ✴ Still relational
       ✴ JSON for data, XML for documents
       ✴ Closely resembles native data structures
       ✴ No manual parsing needed
  7. Three Projects
       ✴ Poligraft
       ✴ Real Time Congress API
       ✴ Open State Project
  8. Three Projects
       ✴ Poligraft
       ✴ Real Time Congress API
       ✴ Open State Project
  9. App design drives schema design
  10. {
        "title": "President Obama's climate Plan B in hot water - Darren Samuelsohn - POLITICO.com"
      }
  11. {
        "title": "President Obama's climate Plan B in hot water - Darren Samuelsohn - POLITICO.com",
        "slug": "EOsc",
        "source_url": "http://www.politico.com/news/stories/0810/40534.html",
        "content": "................."
      }
  12. {
        "title": "President Obama's climate Plan B in hot water - Darren Samuelsohn - POLITICO.com",
        "slug": "EOsc",
        "source_url": "http://www.politico.com/news/stories/0810/40534.html",
        "content": ".................",
        "entities": [...]
      }
  13. {
        "title": "President Obama's climate Plan B in hot water - Darren Samuelsohn - POLITICO.com",
        "slug": "EOsc",
        "source_url": "http://www.politico.com/news/stories/0810/40534.html",
        "content": ".................",
        "entities": [
          {
            "name": "Barack Obama",
            "type": "politician"
          },
          ...
        ]
      }
  14. {
        "title": "President Obama's climate Plan B in hot water - Darren Samuelsohn - POLITICO.com",
        "slug": "EOsc",
        "source_url": "http://www.politico.com/news/stories/0810/40534.html",
        "content": ".................",
        "entities": [
          {
            "name": "Barack Obama",
            "type": "politician",
            "breakdown": {"indiv": "33", "pac": "67"},
            "top_industries": ["Lawyers/Lobbyists", "Finance/Insurance/Real Estate", "Misc. Business"]
          },
          ...
        ]
      }
  15. Natural Schemas
  16. Three Projects
       ✴ Poligraft
       ✴ Real Time Congress API
       ✴ Open State Project
  17. Real-Time Congress API. Credit: vgm8383 on Flickr
  18. Android App: “Congress”
  19. Politiwidgets
  20. Requirements
       ✴ Aggregate lots of data: Biographical, Bills, Votes, Earmarks, Video Clips, Floor Updates, Legislative Documents, Committee Schedules, Contributions, Interest Group Ratings
       ✴ Lightweight responses
  21. {legislator: {
        in_office: true,
        title: "Rep",
        nickname: "",
        district: "9",
        bioguide_id: "L000551",
        govtrack_id: "400237",
        phone: "202-225-2661",
        website: "http://lee.house.gov/index.html",
        twitter_id: "",
        last_name: "Lee",
        name_suffix: "",
        last_updated: "2010/04/13 00:00:14 +0000",
        party: "D",
        chamber: "house",
        state: "CA",
        youtube_url: "http://www.youtube.com/RepLee",
        first_name: "Barbara",
        gender: "F",
        congress_office: "2444 Rayburn House Office Building",
        earmarks: {
          average_number: 20,
          total_amount: 10000000,
          average_amount: 22994535,
          total_number: 28,
          last_updated: "2010-03-18",
          fiscal_year: 2010
        }
        ...
      }
  22. // limit selection to a subset of fields
      db.people.find( { first_name : "john" }, { last_name : 1, address : 1 } );
      // use dot-notation to dig into an object
      db.people.find( { state : "CA" }, { "address.zip_code" : 1 } );
  23. ?sections=last_name,first_name,state,earmarks
      {legislator: {
        last_name: "Lee",
        first_name: "Barbara",
        state: "CA",
        earmarks: {
          average_number: 20,
          total_amount: 10000000,
          average_amount: 22994535,
          total_number: 28,
          last_updated: "2010-03-18",
          fiscal_year: 2010
        }
      }}
  24. ?sections=last_name,first_name,state,earmarks.total_amount,earmarks.total_number
      {legislator: {
        last_name: "Lee",
        first_name: "Barbara",
        state: "CA",
        earmarks: {
          total_amount: 10000000,
          total_number: 28
        }
      }}
  25. Partial responses make payloads smaller
  26. Three Projects
       ✴ Poligraft
       ✴ Real Time Congress API
       ✴ Open State Project
  27. 50 States = 50 Formats
  28. Schemalessness allows for granular control
  29. Custom Fields
       ✴ Traditional RDBMS
           ✴ Update the schema for new fields, run a migration, feel icky
           ✴ Create a custom_fields table
       ✴ MongoDB
           ✴ Just store it
  30. Speaking JSON natively
  31. Python: Source → Scraped JSON → Transform → PostgreSQL
  32. Source → Scraped JSON → MongoDB
  33. Three Projects
       ✴ Poligraft
       ✴ Real Time Congress API
       ✴ Open State Project
  34. Developer Happiness
  35. Thanks! sunlightlabs.com, @LuigiMontanez
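
The partial-response slides (the `?sections=` requests and the field-selection queries with dot-notation) can be illustrated with a small sketch. This is not the Real Time Congress API's actual code, and it does not call MongoDB; it only mimics, on plain Python dicts, how a comma-separated `sections` value with dotted paths could be turned into a trimmed response. The function names `select_sections` and `_lookup` are hypothetical, and the sample document reuses a few fields from the legislator slide.

```python
import copy
from typing import Any

# Sentinel that marks "path not present", so stored None values still count.
_MISSING = object()


def _lookup(document: Any, keys: list[str]) -> Any:
    """Walk `keys` down through nested dicts; return _MISSING if any step fails."""
    current = document
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def select_sections(document: dict[str, Any], sections: str) -> dict[str, Any]:
    """Return a new dict holding only the fields named in `sections`.

    `sections` is a comma-separated list of field paths; a dot digs into a
    nested object, as in "earmarks.total_amount". Unknown paths are skipped.
    """
    result: dict[str, Any] = {}
    for path in sections.split(","):
        keys = [k for k in path.strip().split(".") if k]
        if not keys:
            continue
        value = _lookup(document, keys)
        if value is _MISSING:
            continue
        # Rebuild the chain of parent objects, then copy the leaf value in.
        target = result
        for key in keys[:-1]:
            target = target.setdefault(key, {})
        target[keys[-1]] = copy.deepcopy(value)
    return result


# Sample data taken from the legislator slide (subset of fields).
legislator = {
    "title": "Rep",
    "last_name": "Lee",
    "first_name": "Barbara",
    "state": "CA",
    "earmarks": {
        "average_number": 20,
        "total_amount": 10000000,
        "total_number": 28,
        "fiscal_year": 2010,
    },
}

partial = select_sections(
    legislator,
    "last_name,first_name,state,earmarks.total_amount,earmarks.total_number",
)
# partial == {"last_name": "Lee", "first_name": "Barbara", "state": "CA",
#             "earmarks": {"total_amount": 10000000, "total_number": 28}}
```

With a real MongoDB collection, the same trimming would normally be pushed to the database by passing a field-selection document as the second argument to `find`, as the query slide shows, rather than filtering in application code.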
