"The Lean Startup Approach to Open Data: How Demand-Driven Open Data (DDOD) Improves Relevance, Discoverability and Usability" was presented on Jun 21, 2016 at the Data Cabinet meeting run by the White House Office of Science and Technology Policy.
It discusses how DDOD is a better framework for implementing an agency's open data initiatives, explaining the technology platform and methodology that make it possible. The framework is reviewed in terms of its three core deliverables: knowledge base, data assets and technical capability. The quantified accomplishments of the program over the past year are demonstrated in terms of use cases, agencies covered, users served, datasets indexed, data assets improved or released, standardization, and technical capabilities, such as metric calculations, data federation and quality analytics. Finally, it covers several successful case studies that demonstrate DDOD in action.
More information:
* DDOD: http://ddod.healthdata.gov
* White House Office of Science and Technology Policy: https://www.whitehouse.gov/blog/2016/02/05/open-data-empowering-americans-make-data-driven-decisions
DDOD - The Lean Startup Approach to Open Data
1. Presented at
Executive Office of the President
Office of Science and Technology Policy
US Gov Data Cabinet meeting
June 21, 2016
The Lean Startup
Approach to Open Data
How Demand-Driven Open Data (DDOD)
Improves Relevance, Discoverability and Usability
David Portnoy
Entrepreneur-in-Residence
U.S. Department of Health & Human Services
Twitter: @dportnoy
http://ddod.healthdata.gov
2. Piloted as “Lean Startup for Open Data”
at HHS IDEA Lab across Department of Health & Human Services agencies
Demonstrated capabilities in improving data quality
at White House Open Data Roundtable
Optimizing DDOD to be scalable and applicable across government
with OSTP/OMB, Data Cabinet and Center for Open Data Enterprise
Discuss data maturity model application beyond open data
at MIT Chief Data Officer conference
The Background
3. CIOs and agency heads must:
• Maintain an EDI (Enterprise Data Inventory)
• Implement a “process to evaluate and improve timeliness, completeness, accuracy, usefulness, and availability” of open data
• Implement a method for understanding data asset usage, responding to quality issues, usability, recommendations for improvements, and adherence complaints
• Ensure conformance with open data best practices
• Produce an Open Data Compliance Report
Agencies must:
• Analyze data asset usage, including responding to quality issues, usability, recommendations for improvements, and complaints about adherence
• Monitor public satisfaction and performance improvement needs
• Engage the public in using open data and encourage collaborative approaches to improving data use
• Provide information for the GAO report on the value of information made available to the public and additional data assets that should be made available publicly
Looking Ahead... OPEN Gov Data Act
Focus on measuring value of data
Engage the public in using open data and encourage collaborative approaches to improving data use
Analyze data asset usage. “Monitor public satisfaction and performance improvement needs”
Institute a process to continuously improve on “quality issues, usability, recommendations, complaints...”
4. What happens when we don’t measure value?
Data owners focus on datasets that are:
easiest to generate and
least risky to release
Unusable and low-value datasets
Difficult to find useful data
The Reality
The Result
5. Take community engagement (on steroids, of course)
The Solution
And pair it with lean startup principles
The Shift
6. What’s a Use Case?
All metrics in DDOD are in terms of Use Cases,
...which is simply a well-defined application of a dataset
for a specific purpose in industry, research or media.
It always includes a statement of value -- both
to the requester and the general public
7. Each use case has core
sections…
Description
Value
Specifications
Solution
See them at:
http://ddod.healthdata.gov/
Anatomy of a Use Case
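The four core sections listed on this slide can be pictured as a simple record. The following is only a minimal sketch: the `UseCase` class, its field names and the `missing_sections` helper are hypothetical illustrations, not the structure DDOD actually stores in its knowledge base.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical field names mirroring the four core sections on the slide.
CORE_SECTIONS = ("description", "value", "specifications", "solution")


@dataclass
class UseCase:
    """Illustrative record for one use case (not DDOD's real schema)."""
    title: str
    description: str = ""
    value: str = ""            # statement of value to requester and public
    specifications: str = ""
    solution: str = ""

    def missing_sections(self) -> List[str]:
        """Return the core sections that are still empty."""
        return [name for name in CORE_SECTIONS
                if not getattr(self, name).strip()]


if __name__ == "__main__":
    draft = UseCase(
        title="Example request",
        description="What the requester wants to do with the data.",
        value="Why it matters to the requester and the public.",
    )
    print(draft.missing_sections())
```

A check like `missing_sections` is one way an administrator could see at a glance which parts of a use case still need to be filled in.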
8. Processes for administration of use cases, such as
• Encouraging responsiveness, transparency and documentation
• Ensuring use cases and resulting datasets are indexed in HealthData.gov
Specialized tools for administering use cases
• Workflow engine, communications method, knowledge base
• Data processing, storage, hosting, versioning
Proactive outreach to industry and academia for a thriving
community
DDOD provides 3 core services to Data Owners
9. The Framework
Identify missing
technical capability
Manually improve
data catalog and
data assets
Contribute to
Use Case
knowledge base
External DDOD Activity
• Outreach & collaboration
• Use case administration
DDOD drives 3 types of deliverables:
cataloging of use cases,
improvements of data assets and
development of technical capabilities
Internal DDOD Activity
• Systems development
• Program evaluation
Ongoing
Systems development specification
Increase & measure value Improve capability
Knowledge Base Data Assets Technical Capability
10. The Process
DDOD’s workflow for a Use Case is enabled by 3 types of participants:
Data User, DDOD Admin, Data Owner
11. Communications Platform
(Github Issues)
Data Catalog
(HealthData.gov)
Knowledge Base (MediaWiki)
The Tools
Middleware (Python)
Tied together with middleware
that monitors changes and
tracks progress
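As a rough illustration of progress tracking over GitHub Issues, the sketch below tallies issues by state and by label. It is an assumption-laden example, not the actual DDOD middleware: the input is a list of dicts shaped like items from GitHub's REST "list repository issues" response (only the `state`, `labels` and `pull_request` keys are used), and the label names in the sample data are made up.

```python
from collections import Counter


def summarize_issues(issues):
    """Count issues by state and by label name.

    Items carrying a "pull_request" key are skipped, because GitHub's
    issue listing also returns pull requests.
    """
    real_issues = [i for i in issues if "pull_request" not in i]
    by_state = Counter(issue["state"] for issue in real_issues)
    by_label = Counter(
        label["name"]
        for issue in real_issues
        for label in issue.get("labels", [])
    )
    return {"by_state": dict(by_state), "by_label": dict(by_label)}


if __name__ == "__main__":
    # Hypothetical sample data; label names are invented for illustration.
    sample = [
        {"state": "open", "labels": [{"name": "use-case"}]},
        {"state": "closed", "labels": [{"name": "use-case"},
                                       {"name": "solved"}]},
        {"state": "open", "labels": [], "pull_request": {}},
    ]
    print(summarize_issues(sample))
```

Keeping the tally as a pure function over already-fetched data means it can be tested without network access. Fetching the issues themselves is left out here.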
12. The Architecture
Data.json Hosted
charts
(Flask, Google
Charts, Bokeh)
Embed
Middleware
(Python, Flask, math libraries)
HealthData.gov
Drupal
(CMS,
workflow)
Semantic
MediaWiki
Drupal DKAN theme
SMW
API
GitHub
issues
GH API
DKAN Drupal
Extension
(DKAN data
catalog)
Requests
Library
...but it’s always changing
15. ✤ As of May 2016
The Deliverables
Knowledge Base Data Catalog & Assets Technical Capability
◼ 44 use cases documenting
specific applications of open data
assets added to DDOD knowledge
base
◼ 8 agencies covered: CMS,
FDA, CDC, HRSA, ONC, ACF,
ACL, ASPE
◼ 47 users served by DDOD,
including companies, data
scientists, researchers, journalists
and nonprofits
◼ 20 use cases driving additional
datasets indexed
◼ 180 previously uncataloged
URLs identified
◼ 9 use cases driving new or
improved datasets released
◼ 2 standards for open data
resulting from 8 use cases
◼ Automated calculation and
visualization of value metrics
◼ Dataset count fluctuation
monitoring
◼ Daily catalog change reports
◼ Data asset federation report &
harvest flow visualization
◼ DDOD/HealthData.gov integration
roadmap
• Single source of truth monitoring
• Data quality notifications
• Auto sync between platforms
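The catalog monitoring items above (dataset count fluctuation, daily change reports) could be approximated by comparing two snapshots of a data.json catalog. This is a minimal sketch, not DDOD's implementation: it assumes each snapshot is a parsed data.json document with a top-level `dataset` list whose entries carry `identifier` and `modified` fields, and the function names and the 10% threshold are arbitrary choices for illustration.

```python
def index_by_identifier(catalog):
    """Map dataset identifier -> dataset entry for one catalog snapshot."""
    return {d["identifier"]: d for d in catalog.get("dataset", [])}


def catalog_changes(old, new):
    """Report identifiers added, removed, or modified between snapshots."""
    old_ds, new_ds = index_by_identifier(old), index_by_identifier(new)
    common = old_ds.keys() & new_ds.keys()
    return {
        "added": sorted(new_ds.keys() - old_ds.keys()),
        "removed": sorted(old_ds.keys() - new_ds.keys()),
        "modified": sorted(
            i for i in common
            if old_ds[i].get("modified") != new_ds[i].get("modified")
        ),
    }


def count_fluctuates(old, new, threshold=0.10):
    """Flag a relative change in dataset count above `threshold`."""
    old_n = len(old.get("dataset", []))
    new_n = len(new.get("dataset", []))
    if old_n == 0:
        return new_n > 0
    return abs(new_n - old_n) / old_n > threshold


if __name__ == "__main__":
    # Hypothetical snapshots with made-up identifiers and dates.
    yesterday = {"dataset": [
        {"identifier": "a", "modified": "2016-05-01"},
        {"identifier": "b", "modified": "2016-05-01"},
    ]}
    today = {"dataset": [
        {"identifier": "b", "modified": "2016-05-02"},
        {"identifier": "c", "modified": "2016-05-02"},
    ]}
    print(catalog_changes(yesterday, today))
    print(count_fluctuates(yesterday, today))
```

Comparing on `modified` alone misses edits that leave the date unchanged. A stricter variant could compare whole entries instead.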
16. It started with frustration
about data quality
Can’t reconcile multiple sources
Missing unique identifiers
Refreshes change history
And ended with a release of
new data
(including an API!)
Example Use Case
17. Quality improvements
using machine readability
and consolidation
Medicaid enrollment
data reports have been
published only as PDFs
...with different files by
years and state!
Lots of overhead and
transcription errors
If only they could all be
that easy!
Example Use Case
18. Data quality improves by
eliminating manual entry
Federal poverty
guidelines are tables
published annually
Lots of organizations
enter these by hand
But community already
solved the problem
The best kind of problems
solve themselves!
Example Use Case
19. Insights for regulation Stimulate adoption
Sometimes, the biggest
gains come when you
observe trends.
Observe 7 use cases with
common challenge
Need standardized
provider dimension
Work with regulators and
industry
DDOD was able to
contribute to a new standard
7 use cases impacted
Industry work group
Example Use Case
20. Your insights please!
Help fine-tune DDOD to be most
applicable to your agency’s needs
1. Channels used to reach public
2. Prioritization of releases / improvements
3. Measuring value of data assets
4. Incentives for program owner
The Request