MongoDB @ Sunlight
Luigi Montanez
luigi@sunlightfoundation.com
Question? @LuigiMontanez
Open Source + Open Data = Open Government
High Quality Raw Data
✴ First: Raw data in JSON, XML, or CSV
✴ Second: RESTful APIs in JSON or XML
✴ Third: Nothing else...
MongoDB enables open data
JSON has won (among developers)
Opening Up Data
✴ Storing data from disparate sources
✴ Data dumps
✴ Web scraping
✴ Text/PDF parsing
✴ Serving RESTful JSON APIs
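As a rough illustration of the "data dump to JSON" step in the list above, the sketch below turns a CSV dump into JSON-ready documents. The CSV content and its column names are invented for this example and are not taken from any Sunlight project.

```python
import csv
import io
import json

# Hypothetical data dump; the columns are assumptions made for illustration.
CSV_DUMP = """title,organization,homepage
"Worldwide M1+ Earthquakes, Past Hour",Department of the Interior,http://data.gov/raw/32
"""


def csv_to_documents(text):
    """Turn each CSV row into a dict keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


documents = csv_to_documents(CSV_DUMP)

# Each dict can be stored as a document or serialized for a JSON API.
print(json.dumps(documents, indent=2))
```

Because every row becomes a plain dictionary, the same structure can be stored and served without a separate mapping layer.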
Three Projects
✴ National Data Catalog
✴ Real-Time Congress API
✴ Open State Project
National Data Catalog
App design drives schema design
{
  "title": "Worldwide M1+ Earthquakes, Past Hour"
}
{
  "title": "Worldwide M1+ Earthquakes, Past Hour",
  "description": "Real-time, worldwide earthquake list for the past h
  "homepage": "http://data.gov/raw/32",
  "official_docs": "http://earthquake.usgs.gov/eqcenter/catalogs/",
  "organization": "Department of the Interior",
  "original_catalog": "data.gov"
}
{
  "title": "Worldwide M1+ Earthquakes, Past Hour",
  "description": "Real-time, worldwide earthquake list for the past
  "homepage": "http://data.gov/raw/32",
  "official_docs": "http://earthquake.usgs.gov/eqcenter/catalogs/",
  "organization_id": "4cbcc0ff2c34576ba4000001",
  "catalog_id": "4cbcc0ab2d34d76b97020433"
}
{
  "title": "Worldwide M1+ Earthquakes, Past Hour",
  "description": "Real-time, worldwide earthquake list for the past h
  "homepage": "http://data.gov/raw/32",
  "official_docs": "http://earthquake.usgs.gov/eqcenter/catalogs/",
  "organization": {
    "name": "Department of the Interior",
    "id": "4cbcc0ff2c34576ba4000001",
    "slug": "us-dept-of-interior"
  },
  "original_catalog": {
    "name": "data.gov",
    "id": "4cbcc0ab2d34d76b97020433",
    "slug": "datagov"
  }
}
{
  "title": "Worldwide M1+ Earthquakes, Past Hour",
  "description": "Real-time, worldwide earthquake list for the past h
  "homepage": "http://data.gov/raw/32",
  "official_docs": "http://earthquake.usgs.gov/eqcenter/catalogs/",
  "organization": {
    "name": "Department of the Interior",
    "id": "4cbcc0ff2c34576ba4000001",
    "slug": "us-dept-of-interior"
  },
  "original_catalog": {
    "name": "data.gov",
    "id": "4cbcc0ab2d34d76b97020433",
    "slug": "datagov"
  },
  "downloads": [ { "type": "csv", "url": "http://data.gov/download/32
  "ratings" : {
    "average_rating": 3.5,
    "rating_count": 23
  },
  "comments": []
}
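The progression above moves from bare ID references to embedded sub-documents that carry the name, id, and slug the app needs to render a source. A minimal sketch of that embedding step follows; the helper name `embed_reference` and the extra `parent_id` field are hypothetical and only show that fields the app does not display are left out of the embedded copy.

```python
def embed_reference(record):
    """Copy only the fields shown next to a source: name, id, and slug."""
    return {"name": record["name"], "id": record["id"], "slug": record["slug"]}


# Full records as they might live in their own collections.
# "parent_id" is a hypothetical extra field, not from the slides.
organization = {
    "id": "4cbcc0ff2c34576ba4000001",
    "name": "Department of the Interior",
    "slug": "us-dept-of-interior",
    "parent_id": None,
}
catalog = {
    "id": "4cbcc0ab2d34d76b97020433",
    "name": "data.gov",
    "slug": "datagov",
}

source = {
    "title": "Worldwide M1+ Earthquakes, Past Hour",
    "homepage": "http://data.gov/raw/32",
    "organization": embed_reference(organization),
    "original_catalog": embed_reference(catalog),
}

print(source["organization"])
```

The trade-off is the usual one for denormalized data: a source can be displayed with a single read, but a renamed organization has to be updated in every source that embeds it.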
User-centric data?
✴ Source document: contains collection of user data
✴ User document: contains collection of source data
✴ UserSource document
✴ Rating, Favorite, Note docs
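To make the last option concrete, the sketch below keeps each rating as its own small document that links a user to a source, then derives the `average_rating` and `rating_count` summary that appears in the source document earlier. The field names `user_id`, `source_id`, and `value`, and the sample IDs, are assumptions made for illustration.

```python
def summarize_ratings(rating_docs, source_id):
    """Build the ratings summary for one source from standalone Rating docs."""
    values = [r["value"] for r in rating_docs if r["source_id"] == source_id]
    if not values:
        return {"average_rating": None, "rating_count": 0}
    return {
        "average_rating": sum(values) / len(values),
        "rating_count": len(values),
    }


# Hypothetical Rating documents: one per (user, source) pair.
ratings = [
    {"user_id": "u1", "source_id": "s1", "value": 3},
    {"user_id": "u2", "source_id": "s1", "value": 4},
    {"user_id": "u1", "source_id": "s2", "value": 5},
]

summary = summarize_ratings(ratings, "s1")
print(summary)
```

Storing the computed summary on the source document keeps reads cheap, while the standalone Rating docs remain the record of who rated what.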
Freedom of choice
Real-Time Congress API
(Drumbone)
Credit: vgm8383 on Flickr
Android App: “Congress”
Politiwidgets
Requirements
✴ Aggregate lots of data: Biographical, Bills, Votes, Earmarks, Video Clips, Floor Updates, Legislative Documents, Committee Schedules, Contributions, Interest Group Ratings
✴ Lightweight responses
{legislator: {
  in_office: true,
  title: "Rep",
  nickname: "",
  district: "9",
  bioguide_id: "L000551",
  govtrack_id: "400237",
  phone: "202-225-2661",
  website: "http://lee.house.gov/index.html",
  twitter_id: "",
  last_name: "Lee",
  name_suffix: "",
  last_updated: "2010/04/13 00:00:14 +0000",
  party: "D",
  chamber: "house",
  state: "CA",
  youtube_url: "http://www.youtube.com/RepLee",
  first_name: "Barbara",
  gender: "F",
  congress_office: "2444 Rayburn House Office Building",
  earmarks: {
    average_number: 20,
    total_amount: 10000000,
    average_amount: 22994535,
    total_number: 28,
    last_updated: "2010-03-18",
    fiscal_year: 2010
  }
  ...
}}
// limit selection to a subset of fields
db.people.find( { 'first_name' : 'john' },
                { 'last_name' : 1,
                  'address' : 1 } );
// use dot-notation to dig into an object
db.people.find( { 'state': 'CA' },
                { 'address.zip_code': 1 } );
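The two queries above pass a second argument that selects which fields come back. The sketch below imitates that selection in memory so the effect of dot-notation is easy to see. It is a simplified stand-in, not MongoDB's implementation: it ignores `_id`, arrays, and exclusion projections, and the sample `person` document is invented.

```python
import copy


def project(document, fields):
    """Return a copy of document containing only the listed field paths."""
    result = {}
    for path in fields:
        keys = path.split(".")
        source = document
        target = result
        for i, key in enumerate(keys):
            if not isinstance(source, dict) or key not in source:
                break  # path does not exist in this document
            if i == len(keys) - 1:
                # Leaf of the path: copy the value into the result.
                target[key] = copy.deepcopy(source[key])
            else:
                # Intermediate key: descend in both source and result.
                target = target.setdefault(key, {})
                source = source[key]
    return result


# Hypothetical document shaped like the ones the queries assume.
person = {
    "first_name": "john",
    "last_name": "smith",
    "state": "CA",
    "address": {"street": "1 Main St", "zip_code": "94000"},
}

print(project(person, ["last_name", "address"]))
print(project(person, ["address.zip_code"]))
```

A dotted path keeps the nesting of the original document, so clients read `address.zip_code` from the same place whether or not the response was trimmed.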
{legislator: {
  last_name: "Lee",
  first_name: "Barbara",
  state: "CA",
  earmarks: {
    average_number: 20,
    total_amount: 10000000,
    average_amount: 22994535,
    total_number: 28,
    last_updated: "2010-03-18",
    fiscal_year: 2010
  }
}}
?sections=last_name,first_name,state,earmarks
{legislator: {
  last_name: "Lee",
  first_name: "Barbara",
  state: "CA",
  earmarks: {
    total_amount: 10000000,
    total_number: 28
  }
}}
?sections=last_name,first_name,state,earmarks.total_amount,earmarks.total_number
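The slides do not show the server code behind the `sections` parameter. One plausible approach, sketched here as an assumption, is to translate the comma-separated list directly into a field-selection dictionary of the kind MongoDB's `find` accepts. The function name `sections_to_projection` is hypothetical.

```python
from urllib.parse import parse_qs


def sections_to_projection(query_string):
    """Map '?sections=a,b.c' to a field-selection dict such as {'a': 1, 'b.c': 1}."""
    params = parse_qs(query_string.lstrip("?"))
    raw = params.get("sections", [""])[0]
    fields = [name.strip() for name in raw.split(",") if name.strip()]
    return {name: 1 for name in fields}


projection = sections_to_projection(
    "?sections=last_name,first_name,state,"
    "earmarks.total_amount,earmarks.total_number"
)
print(projection)
```

Because the dotted names in the URL already match the database's dot-notation, the translation needs no lookup table. A production version would likely also check the requested names against a list of allowed fields.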
Partial responses make payloads smaller
Open State Project
50 States = 50 Formats
Schemalessness allows for losslessness
Source → Scraped JSON → Python Transform → PostgreSQL
Source → Scraped JSON → MongoDB
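The contrast between the two pipelines can be sketched as follows. Fitting scraped records into a fixed set of columns drops whatever a state publishes beyond those columns, while storing the scraped JSON as a document keeps it. The column list and the sample record are invented for this example and do not reflect the Open State Project's real schema.

```python
# Hypothetical fixed relational columns shared by all states.
FIXED_COLUMNS = ("bill_id", "title")


def to_fixed_row(scraped):
    """Keep only the predefined columns; anything else is lost."""
    return {column: scraped.get(column) for column in FIXED_COLUMNS}


def to_document(scraped):
    """Keep the scraped record whole, whatever fields it has."""
    return dict(scraped)


# Hypothetical scraped record with a field only some states provide.
scraped_bill = {
    "bill_id": "HB 1",
    "title": "An example bill",
    "fiscal_note_url": "http://example.com/hb1-fiscal-note",
}

row = to_fixed_row(scraped_bill)
document = to_document(scraped_bill)

print(sorted(set(document) - set(row)))  # fields the fixed schema dropped
```

With fifty differently shaped sources, the document approach defers the decision about which fields matter until the data is actually used.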
Thanks!
sunlightlabs.com
@LuigiMontanez

Sunlight Labs & MongoDB @ MongoDC