Enough with the UIs


Show me the data!

Turning Statistics Into Knowledge, July 2009
Usability
 Accessibility
Easier for people to find data
Agenda

• The Problem
• Linked Data primer
• Prototype walk through
• Take homes and vision
History
• WHO: Openhealth prototype - global
  disease incidence reporting platform
• OECD: QWIDS - Query Wizards for
  In...
Answer Questions
Where are flu
pandemics erupting?
How much is being
spent on HIV/AIDS by
        Japan?
Is aid tied to malaria
  activities making a
      difference?
Trying to reduce child
mortality by two-thirds
Where is this data?
Our mission

• make it easy to answer the questions
• cross organization
• don’t reinvent the wheel
• keep it simple
“the possibility of
delivering abundant data
 without the need for
massive centralization.”
Agenda

• The Problem
• Linked Data primer
• Prototype walk through
• Take homes and vision
Linked Data,
  Explained
Current web +
Some new technology
Mature technology
Anyone can say
anything about
   anything.
“Japan is a DAC country”


<http://www.geonames.org/countries/#JP>
  oecd:memberOf
  oecd:DAC .
Standard interchange
Naturally extensible
Queryable
  SPARQL
Provenance
Trust
Inference
Japan ∈ DAC
DAC ⊂ Donors

Japan ∈ Donors
That’s it! (on linked data)
Agenda

• The Problem
• Linked Data primer
• Prototype walk through
• Take homes and vision
Is aid tied to malaria
  activities making a
      difference?
Embeddable Maps!
      (no really)
Agenda

• The Problem
• Linked Data Primer
• Prototype walk through
• Take homes and vision
Critical Adoption
“RT @NovakKevin:just
got word that Fed may
    experiment with
#linkeddata. Great news
     #semanticweb
 #w3cegov #semtec...
Take Homes

• build shared software for Int’l Orgs
• sits on existing infrastructure
• end users can answer the harder que...
Where are we going?

• Funding from foundations
• Expressions of Interest
• WHO, OECD, IMF, WB, UNESCO,
 UNCTAD, FAO
Come talk to us!

       www.2paths.com/conf/tsik2009

       aaron@2paths.com

       michal@2paths.com
Show me the Data!  Seminar on Innovative Approaches to Turn Statistics into Knowledge 2009

At the 2009 Seminar on Innovative Approaches to Turn Statistics into Knowledge (http://www.oecd.org/progress/ict/statknowledge), jointly organized by the OECD, US Census Bureau and World Bank, we proposed and demo'd a proof of concept on data sharing between international organizations. We demonstrated how open source tools could sit on top of existing infrastructure, and we reused visualization tools to show how data could be pulled and combined from the various organizations on the fly.

  • my name is Aaron Gladders, that’s me on the bottom left
    my colleague, Michal Urbanski and his recent addition
  • we work @ 2Paths
  • we’re from canada yo
  • specifically Vancouver, where we’re known for our rain
  • but it’s also a beautiful city
    http://www.flickr.com/photos/tommyauphoto/2579810410/sizes/o/
  • Our Focus: making it easier for people (and machines!) to find data
  • At 2Paths we’re not about world domination. We want to feed our kids....
  • and ideally make the world a better place
  • So what is the problem
    There are lots of UIs out there, including some great ones that we’ve seen here, and we’ve also heard from some great storytellers. But they need the data in an accessible form to tell their stories. As we heard yesterday, much of the North American focus is on just getting access to the data.
  • So a little history. This began for us with the WHO Openhealth prototype. It was a global disease incidence reporting platform, meant to assemble data from all over the world on the diseases occurring there. Ultimately this was to show up on portals, with the associated grids, graphs and maps. Later we worked on making it easier for people to dig for the data they wanted, with the OECD Query Wizard for Int’l Development Statistics. This was built on top of .STAT, which you saw yesterday in Trevor Fletcher’s gangster flick. In the last while we’ve been helping the BMGF to classify their information and ultimately to share it. And our most recent addition has been IATI, where we are helping to define which tech donor governments should use to share their aid activities, to improve aid transparency and effectiveness.
  • Really though, we’re trying to answer questions. Such as...
  • Where are flu pandemics erupting? Generally you would go to the WHO for this.
  • How much is being spent on HIV/AIDS by Japan? Here you would go to the OECD.
  • What if we want to get a little more complicated? You’d have to go to the WHO and the OECD.
  • And even more complicated, when data comes from places all over the world, like with the Millennium Development Goals
  • So where is this data?
  • You’d go to NSOs
  • Or the websites of Int’l Orgs. Note that some of these orgs are providing data feeds: the OECD has them (not advertised, but they are there; we used them and we even built a nice RESTful one). It’s very exciting that the World Bank has a public API.
  • More recently you can go to data.un.org
  • And now with the United States, data.gov
  • Really though, you go to as many sources as you can find (or even just one)
    download it
  • combine it, chart it
  • and maybe map it (and with tools like Google Fusion, it’s a snap)
  • Our goal was to make it easy to answer these questions. That required cross org data mashups. But we wanted to leverage the existing tools out there, and more importantly, keep it simple.
  • We quickly zeroed in on semantic web tech: flexibility for each org to define their information as they needed, while allowing for mappings between them, in a queryable way.
  • I’m here to present a quick primer on Linked Data for the uninitiated. We’re going to quickly go over what it is, and why we should be paying attention to it.
  • The main technology behind linked data is one we should already be familiar with: the semantic web. It is, essentially, an extension of our current web. It grafts some new standards onto existing ones, in order to give meaning to content.
  • The technology driving the semantic web has been around for a while. While it may have started out as a highly academic exercise, it has evolved into a very compelling platform for the sharing of data. Semantic technologies have also benefited from advances in a lot of different areas, from better XML support in languages to the emergence of a new class of semantic-specific vendors. It’s a technology that has “escaped the lab”, so to speak, and is being used to solve actual problems, today.
  • If you had to summarize semantic technology in one sentence, it would be “Anyone can say anything, about anything”. However, what’s novel about it is that the way you say things is standardized, because ...
  • ... the “meaning” of a statement isn’t intended only for us, it’s also for machines, for tools and agents who can act on _behalf_ of people. Here we see a statement, “Japan is a DAC country”, represented in an example form that a machine would understand.
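As a rough illustration of the "Japan is a DAC country" statement above, here is a minimal Python sketch that holds the statement as a subject/predicate/object triple and prints it in a line-based form with full URIs. The `oecd:` namespace URI (`http://example.org/oecd#`) and the helper names `expand` and `to_ntriples` are assumptions made for this example; the slides do not say what the `oecd:` prefix expands to.

```python
# Hypothetical prefix table: the real oecd: namespace is not given in the talk.
PREFIXES = {"oecd": "http://example.org/oecd#"}

# Subject URI exactly as shown on the slide.
JAPAN = "http://www.geonames.org/countries/#JP"


def expand(curie):
    """Turn a prefixed name such as 'oecd:DAC' into a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


# One statement = one (subject, predicate, object) triple.
triple = (JAPAN, expand("oecd:memberOf"), expand("oecd:DAC"))


def to_ntriples(t):
    """Serialize a triple of URIs as one line: <s> <p> <o> ."""
    return "<{}> <{}> <{}> .".format(*t)


print(to_ntriples(triple))
```

Every part of the statement is a URI, which is what lets someone else say more about Japan, or about the DAC, without coordinating with the original publisher.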
  • If we represent all our information in these semantic formats, we can leverage a significant number of tools that already understand them. We still have to do some work, just like we used to when rolling our own XML formats for moving data around, except now we’re more likely to use other people’s vocabularies or produce vocabularies that others can use. These vocabularies are the “link” of linked data.
  • Because the semantic web has been in development for some time, there are already a number of existing vocabularies you can use to describe your data. Using them is the normal and natural way of participating in the world of linked data. In the case where you need to define your vocabulary because a suitable one doesn’t already exist, you will want others to use it as well, making the entire semantic ecosystem naturally extensible. Once you release and describe your data, others can easily say things about it, or use your vocabulary to describe *their* data, or link their data to yours.
  • A large amount of linked data is of limited utility if we have no way to find what we’re looking for. Relatively recently, the semantic web gained a nice, shiny query language called SPARQL. It became an official W3C recommendation at the beginning of 2008, so it’s pretty new, but it’s already become quite a popular tool for making complex queries into distributed stores of linked data.
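To give a feel for what a SPARQL query does, the following sketch matches a single triple pattern against a tiny in-memory list of triples. It is a toy stand-in written in plain Python, not a SPARQL engine. The `geo:` and `ex:` prefixes, the placeholder row for `ex:CountryX`, and the function name `match` are all assumptions for illustration; only the Japan/DAC statement comes from the slides.

```python
# Illustrative triples as (subject, predicate, object).
# Only the first row reflects the talk; the second is a made-up placeholder.
TRIPLES = [
    ("geo:JP", "oecd:memberOf", "oecd:DAC"),
    ("ex:CountryX", "oecd:memberOf", "ex:OtherGroup"),
]


def match(triples, pattern):
    """Yield variable bindings for one triple pattern.

    Terms beginning with '?' are variables; anything else must match exactly.
    """
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings


# Roughly what this SPARQL query asks for:
#   SELECT ?country WHERE { ?country oecd:memberOf oecd:DAC }
results = list(match(TRIPLES, ("?country", "oecd:memberOf", "oecd:DAC")))
print(results)
```

A real SPARQL endpoint joins many such patterns and can span several data stores, but the core idea is the same: describe the shape of the statements you want and let the store fill in the variables.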
  • There’s another aspect of the semantic web that we should find very interesting as we examine the world of linked data. The idea of “provenance”, which lets you trace where and from whom a piece of data comes, will be highly useful to organizations which are concerned about data accuracy.
  • Suppose you are looking at a chart of data you’ve found online. With a proper provenance system in place, you would be able to tell where that chart’s data came from, and more importantly, you could determine whether or not you can ...
  • ... trust it. By enumerating your trusted data sources, a system can automatically determine whether the data you’re looking at is trustworthy by examining its provenance.
  • Suppose that chart had used data from Wikipedia in addition to officially published figures from the OECD and the WHO. In this case, you might want to be a bit more careful using the data from that chart.
  • Of course, if Wikipedia also has a provenance system in place, ideally your software will follow that chain and perhaps you can trust that data after all.
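The chart example above can be sketched as a small recursive check. This is only one possible reading of "follow the provenance chain", under the assumption that an item is trusted if it is on your trusted list, or if everything it was derived from is trusted. The dictionaries and the function name `is_trusted` are hypothetical, and the claim that Wikipedia's figures trace back to the OECD is invented purely to show the second case.

```python
# Sources you have chosen to trust (from the example in the notes).
TRUSTED = {"OECD", "WHO"}

# Case 1: the chart cites Wikipedia, which records no provenance of its own.
NO_CHAIN = {"chart": ["OECD", "WHO", "Wikipedia"]}

# Case 2 (hypothetical): Wikipedia records that its figures came from the OECD.
WITH_CHAIN = {"chart": ["OECD", "WHO", "Wikipedia"], "Wikipedia": ["OECD"]}


def is_trusted(item, trusted, derived_from, path=frozenset()):
    """Return True if item is trusted directly or via all of its sources."""
    if item in trusted:
        return True
    if item in path:
        # Circular provenance proves nothing, so treat it as untrusted.
        return False
    sources = derived_from.get(item, [])
    if not sources:
        # Not on the trusted list and no provenance to follow.
        return False
    return all(
        is_trusted(source, trusted, derived_from, path | {item})
        for source in sources
    )


print(is_trusted("chart", TRUSTED, NO_CHAIN))
print(is_trusted("chart", TRUSTED, WITH_CHAIN))
```

The first call reports the chart as untrusted because one of its inputs cannot be traced; the second reports it as trusted because the chain ends at a trusted source.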
  • The last thing that I’d like to touch on with respect to semantic technology is the idea of inference. This basically means that a semantic system is able to derive “new knowledge” based on things it already knows.
  • A basic example would be, given a system that knows “Japan is a DAC country” and “DAC countries are donors”, it would be able to infer that Japan is a donor country. This is one of the keys for semantic technology... it’s arguably what’s driving widespread semantic adoption. Once the data and metadata are there, this is the tool that will really drive innovation.
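The inference in the example above (Japan ∈ DAC, DAC ⊂ Donors, therefore Japan ∈ Donors) can be sketched as a simple forward-chaining loop. This is a toy version of the kind of rule that RDFS subclass reasoning applies, not how any particular reasoner is implemented; the function name `infer_memberships` and the plain-string group names are assumptions for illustration.

```python
# Known facts from the slide: membership and subset statements.
MEMBER_OF = {("Japan", "DAC")}   # Japan ∈ DAC
SUBSET_OF = {("DAC", "Donors")}  # DAC ⊂ Donors


def infer_memberships(member_of, subset_of):
    """Apply 'x ∈ A and A ⊂ B implies x ∈ B' until nothing new appears."""
    inferred = set(member_of)
    changed = True
    while changed:
        changed = False
        for member, group in list(inferred):
            for sub, sup in subset_of:
                if group == sub and (member, sup) not in inferred:
                    inferred.add((member, sup))
                    changed = True
    return inferred


facts = infer_memberships(MEMBER_OF, SUBSET_OF)
print(("Japan", "Donors") in facts)
```

Nobody published the statement that Japan is a donor; the system derived it from two statements that could have come from different organizations.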
  • And that’s it. Hopefully that gives you a reasonable picture as to what linked data and the semantic web are, and why we ought to be interested in them.
  • And now we’re going to walk through a little prototype we’ve done up, in order to demonstrate a working system that can make use of distributed, linked data. I’m presenting screenshots because we’re paranoid about demo curses, but we do have a running system available here today, which we’re willing to show you under less stressful circumstances.

    1. Enough with the UIs Show me the data! Turning Statistics Into Knowledge, July 2009
    2. Usability Accessibility Easier for people to find data
    3. Agenda • The Problem • Linked Data primer • Prototype walk through • Take homes and vision
    4. History • WHO: Openhealth prototype - global disease incidence reporting platform • OECD: QWIDS - Query Wizards for International Development Statistics • Bill & Melinda Gates Foundation • International Aid Transparency Initiative
    5. Answer Questions
    6. Where are flu pandemics erupting?
    7. How much is being spent on HIV/AIDS by Japan?
    8. Is aid tied to malaria activities making a difference?
    9. Trying to reduce child mortality by two-thirds
    10. Where is this data?
    11. Our mission • make it easy to answer the questions • cross organization • don’t reinvent the wheel • keep it simple
    12. “the possibility of delivering abundant data without the need for massive centralization.”
    13. Agenda • The Problem • Linked Data primer • Prototype walk through • Take homes and vision
    14. Linked Data, Explained
    15. Current web + Some new technology
    16. Mature technology
    17. Anyone can say anything about anything.
    18. “Japan is a DAC country” <http://www.geonames.org/countries/#JP> oecd:memberOf oecd:DAC .
    19. Standard interchange
    20. Naturally extensible
    21. Queryable SPARQL
    22. Provenance
    23. Trust
    24. Inference
    25. Japan ∈ DAC DAC ⊂ Donors Japan ∈ Donors
    26. That’s it! (on linked data)
    27. Agenda • The Problem • Linked Data primer • Prototype walk through • Take homes and vision
    28. Is aid tied to malaria activities making a difference?
    29. Embeddable Maps! (no really)
    30. Agenda • The Problem • Linked Data Primer • Prototype walk through • Take homes and vision
    31. Critical Adoption
    32. “RT @NovakKevin:just got word that Fed may experiment with #linkeddata. Great news #semanticweb #w3cegov #semtech”
    33. Take Homes • build shared software for Int’l Orgs • sits on existing infrastructure • end users can answer the harder questions • Query and combine across organizations • accessibility, usability
    34. Where are we going? • Funding from foundations • Expressions of Interest • WHO, OECD, IMF, WB, UNESCO, UNCTAD, FAO
    35. Come talk to us! www.2paths.com/conf/tsik2009 aaron@2paths.com michal@2paths.com
