Open Legislation Spring 2011 Talk 1

Published in: Technology

  1. Open Legislation, Spring 2011
  2. Open Data (Government)
  3. Secondary Sources are nice
     ● OpenCongress
     ● GovTrack.US
     ● OpenStates
     ● FedSpending.org
     ● Many more
  4. Primary Sources are better
     ● Data.gov
     ● USAspending.gov
     ● California
     ● Oregon
     ● Washington
     ● Many more
  5. Sometimes though... Open Data is not Enough. We need Platforms.
  6. A Different Breed of Open
     ● Making data accessible:
        ● Built-in search
        ● Permanent URIs
        ● Standardized Feeds
        ● Real-time Alerts
     ● REST Architecture with Feed Publishing
        ● RSS/Atom => PubSubHubbub => Alerts
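The "RSS/Atom => PubSubHubbub => Alerts" chain on slide 6 depends on the publisher telling a hub that a feed has changed. As a rough sketch of that one step (not the project's actual code), the PubSubHubbub protocol has the publisher POST a form-encoded body with `hub.mode=publish` and `hub.url` set to the feed URL. The hub and feed URLs below are hypothetical placeholders, and the helper names are invented for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical URLs for illustration only; neither appears in the talk.
HUB_URL = "https://hub.example.org/"
FEED_URL = "https://example.org/legislation/feed.atom"


def build_publish_ping(hub_url: str, topic_url: str) -> Request:
    """Build the "new content" notification a publisher sends to a hub."""
    body = urlencode({"hub.mode": "publish", "hub.url": topic_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def notify_hub(hub_url: str, topic_url: str) -> int:
    """Send the ping and return the HTTP status code from the hub."""
    with urlopen(build_publish_ping(hub_url, topic_url), timeout=10) as resp:
        return resp.status
```

After a ping like this, the hub fetches the feed and pushes new entries to its subscribers, which is what makes near-real-time alerts possible without clients polling the feed.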
  7. So back to Open Legislation
  8. Browse, Search, and Share
     http://open.nysenate.gov/legislation
  9. It's not a Service; it's an Open Platform
  10. 1 Year Recap
      ● Open Sourced It (for real)
      ● Improved the API (XML/JSON)
      ● Decreased Load Times
      ● Restructured the Back-end
      ● Basic Documentation
      ● Wrapped into a build system
  11. The next year
      ● In general...
         ● Data Quality and Documentation
         ● Usage Tracking and Statistics
         ● User Interface Improvements
         ● Further separation of the Platform and Service
      ● Right now
         ● Data Quality, Data Quality, Data Quality
         ● And a little bit of documentation
  12. The Senate has Legislative Data Quality issues?
  13. Well, not exactly
      ● Legislative Research Service (LRS) has the data
         ● Big, ancient mainframe to boot
      ● They FTP us updates every 5 minutes
         ● In SOBI formats (what?)
         ● With some XML mixed in
      ● We parse it back into XML/JSON/SQL structure
  14. Reasons for Difficulty
      ● Poorly documented SOBI behavior
      ● Formatted as a change log (sometimes)
         ● Finding sources of error can be hard
      ● LRS is not co-operative
  15. Solutions
      ● Version Control
         ● Write objects to JSON/XML files
         ● With Git, commit each new version
            – Commit message points to the source SOBI
         ● Use Git to trace data errors back to SOBI files
      ● Unit test known corner cases
      ● Periodically do a scrape check?
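The version-control idea on slide 15 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's parser: the function names, the one-file-per-object layout, and the commit message wording are all hypothetical. Each parsed object is written as JSON with sorted keys so that diffs stay small, and the commit message names the SOBI file that produced the change.

```python
import json
import subprocess
from pathlib import Path
from typing import List


def write_object(repo_dir: Path, object_id: str, data: dict) -> Path:
    """Write one parsed object as deterministic JSON (sorted keys, fixed indent)."""
    path = repo_dir / f"{object_id}.json"
    text = json.dumps(data, indent=2, sort_keys=True) + "\n"
    path.write_text(text, encoding="utf-8")
    return path


def commit_commands(path: Path, sobi_file: str) -> List[List[str]]:
    """Return the git commands that record this version and its source file."""
    message = f"Update {path.stem} from {sobi_file}"
    return [
        ["git", "add", "--", path.name],
        ["git", "commit", "--quiet", "-m", message],
    ]


def record_version(repo_dir: Path, object_id: str, data: dict, sobi_file: str) -> None:
    """Write the object and commit it inside an existing git repository.

    Assumes repo_dir is already a git repository. Note that `git commit`
    exits non-zero when nothing changed, so check=True raises in that case.
    """
    path = write_object(repo_dir, object_id, data)
    for cmd in commit_commands(path, sobi_file):
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

With a history built this way, running `git log -p -- <object>.json` in the repository lists every change to that object along with the SOBI file named in each commit message, which is the trace-back the slide describes.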
  16. Progress
      ✔ Parsing has been overhauled
      ✔ Objects are written to file
      ✔ Bugs have been found and fixed
      ✔ Periodic scrapes are approved
  17. A short task list
      ✗ Integrate Git into the parsing system
      ✗ Document expected behavior
      ✗ Write a small test suite
      ✗ Try to avoid having to scrape
  18. HFOSS Symposium 2011
      ● Bryan Sivak – Civic Commons
      ● Mark Prutalis – Sahana Foundation
      ● Many universities, Mozilla, Google
      ● David, Moorthy, Brian, and myself!
         ● 1 hour and a few 3 x 4 posters
