Data All the Way Down


Presentation at OKCon 2011 on how to build web applications that provide complex data using a layered architecture.

  1. Data All the Way Down
     Jeni Tennison, @JeniT
  2. Data All the Way Down
     • challenges of complex open data
     • layered approach to data publishing
     • essential steps
     • benefits
  3. Complex Datasets
     • too much for a single spreadsheet
     • need to navigate
       • browse through data
       • look at slices of larger dataset
       • view summary statistics
     • need to explain
       • definitions of terms, provisos & disclaimers
  4. User Challenge
     • complex data sets have a range of users
       • different hardware / platforms
       • different tasks / goals
       • different ability / understanding
     • no one interface satisfies everyone
     • data owners cannot satisfy everyone
     • create ecosystem around open data
  5. visualisation / data gap: end user vs reuser
  6. Visualisations
     • approachable for real people
     • necessary for stakeholder buy-in
     • beauty is in what's left out
       • advertisement or taster of rich datasets
       • often not possible in official data
     • leaves questions unanswered
       • what if we looked at the data in a different way?
  7. Raw Data
     • importable into own data store
       • often only interested in particular slice
       • data set may be massive / changing
     • run whatever analysis you want
       • requires at least some programming skills
       • analysis might not be appropriate for the data
     • documentation probably lacking
  8. bridging the gap: layered data access
     (Photo by Nikita Kravchuk)
  9. Layered Architecture
     • user interface
       • navigation and global understanding
     • API
       • curated, targeted, programmable access
     • query
       • free-form programmable access
     • raw data
  10. lists as Atom feeds
  11. content as XML
  12. layer other views
  13. organograms: navigable visualisation
  14. organograms: JSON data
  15. organograms: RDF / XML / HTML
  16. organograms: SPARQL query
  17. organograms: raw data
  18. Key Techniques
      • resource-driven design (good URIs)
      • every page built based on API calls
      • explicit links to API access
        • for bonus points, link to your transformation code
      • consistent terminology
        • clear mapping from UI to API
      • caching & access control at each level
  19. Benefits
      • fork at any point
        • don't like the visualisation / API? create your own!
      • everyone is human
        • reusers gain understanding from user interface
      • visualisation benefits the stack
        • API oriented towards achieving a goal
        • visual validation of data improves quality
  20. Questions?
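
The layered architecture (slide 9) and the key techniques (slide 18) can be illustrated with a small sketch. This is not code from the talk. It is a minimal Python illustration under stated assumptions: the organogram rows, the column names, the URI patterns, and every function name (`load_raw`, `query`, `api_reports`, `ui_reports`) are hypothetical and chosen only to show how each layer might build on the one below it.

```python
import csv
import io
import json

# Raw data layer: hypothetical organogram rows, invented for illustration.
RAW_CSV = """post_id,title,unit,reports_to
p1,Director,Strategy,
p2,Deputy Director,Strategy,p1
p3,Analyst,Strategy,p2
p4,Analyst,Finance,p1
"""


def load_raw(text=RAW_CSV):
    """Raw data layer: parse the CSV dump into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def query(rows, **filters):
    """Query layer: free-form filtering on any column."""
    return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]


def api_reports(post_id, rows=None):
    """API layer: a curated, goal-oriented call (who reports to this post?).

    The URI patterns are assumptions; the point is that every resource
    gets a stable, predictable address (resource-driven design).
    """
    rows = load_raw() if rows is None else rows
    return {
        "uri": f"/post/{post_id}/reports.json",
        "items": [
            {"uri": f"/post/{r['post_id']}", "title": r["title"]}
            for r in query(rows, reports_to=post_id)
        ],
    }


def ui_reports(post_id):
    """User interface layer: built only from the API response.

    The last line is the explicit link back to the API that a reuser
    could follow to get the same data in machine-readable form.
    """
    data = api_reports(post_id)
    lines = [f"Posts reporting to {post_id}:"]
    lines += [f"  - {item['title']} ({item['uri']})" for item in data["items"]]
    lines.append(f"Data for this page: {data['uri']}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(ui_reports("p1"))
    print(json.dumps(api_reports("p1"), indent=2))
```

Because `ui_reports` never touches the rows directly, anyone who dislikes the page can fork at the API, the query layer, or the raw CSV, which is the "fork at any point" benefit from slide 19. Caching and access control are left out of the sketch; they would wrap each function separately.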