Idescat on the Google Public Data Explorer

Idescat on the Google Public Data Explorer: The Why, the What and the near Future.

Google Public Data Explorer Day. Eurostat. Luxembourg, 30 June 2011.

Published in: Technology, News & Politics
  • Actually I thought you knew all people who find the GPDE directory at all ;-)

    But otherwise, I agree something could be done in terms of better discovery. I was surprised when I checked google search for a few of the indicators from your data bundle (in English), but no graph showed up in the search results. Well, 'slideware' doesn't work in all cases, it's not perfect (yet) and there is still enough room for developments and our contributions :-)
  • Katja, you are absolutely right when you say that ‘reuse, searchability, malleability, mobility outweigh the directory list in terms of visibility’. That said, a single list still does not seem the right model for a dataset catalog (unless that catalog has fewer than 10 items!).

    And it’s not only a matter of the dataset’s visibility as a whole: more important is the visibility of the dataset’s *contents* (metrics, dimensions, locations, time...). That’s why this issue is referred to as ‘discovery’ in the summary (slide 110). For example, as it is now, a user has a hard time discovering all the information available at GPDE for a certain country like Italy.

    ' I could speculate that you personally know all people who find your dataset through the directory ;-)'

    If that was the case that would prove my point :-): only those who already knew our dataset was there did actually find it. Probably, many of those interested in our information that didn’t expect it to be there and weren’t looking for it didn’t find out about it after visiting the directory list.
  • Great presentation! I'd just like to comment on slides 37 + 39. I believe it is not very relevant how your dataset scales in the Google Public Data directory. Reuse, searchability, malleability, mobility outweigh the directory list in terms of visibility. I could speculate that you personally know all people who find your dataset through the directory ;-)
  • GPDE has a good data model. But, IMHO, the user interface maps that data model too closely, which I think can sometimes lead to confusion. For instance, a filter (for example, for labor market statistics purposes: population > 15 years old) can be treated, in the data model, as a dimension with only 1 category (and this is perfectly OK) but should be shown in the user interface as something different from a regular dimension.

    Besides, considering topics as groups of metrics doesn’t seem right. For some topics (’society’, ’labor market’, ’education’...) [see slide 71], it is not just about a metric (’population’) but about a metric * dimensions. Now, if you have many metrics, the only tool at hand to help your users is grouping them into topics. So you are forced to choose between the data model or the user interface.

    We have so many metrics that not grouping them wasn’t an option, so we had to ‘cheat’ on the data model front: we made up some metrics like ‘Economic activity of the population’ (or ‘Knowledge of Catalan’). Of course, this is not a real metric: the metric is ‘population’, filtered by age and classified by employment status.

    It is very wrong to mess up the data model for user interface reasons, but we couldn’t find a better solution for this trade-off. My proposal for Google is on slide 74: forget about topics for grouping metrics; introduce the idea of ’related’ or ’derived’ metrics as a way of narrowing the metrics’ list.
  • Dear Xavier,
    I have a friend at Facebook asking for an explanation to page 72. Can you help?
    Best regards
    Alf
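
The data-model point made in the comments (a made-up metric such as ‘Economic activity of the population’ is really the metric ‘population’, filtered by age and classified by employment status) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `derive` helper, the dimension names and every figure below are hypothetical and do not come from the Idescat dataset or from GPDE.

```python
from collections import defaultdict

# Hypothetical observations: a single metric ("population") broken down by
# two dimensions. All figures are invented for illustration only.
OBSERVATIONS = [
    {"age_group": "15_or_under", "employment_status": "not_applicable", "population": 120},
    {"age_group": "over_15", "employment_status": "employed", "population": 500},
    {"age_group": "over_15", "employment_status": "unemployed", "population": 80},
    {"age_group": "over_15", "employment_status": "inactive", "population": 300},
]


def derive(observations, metric, classify_by, filters=None):
    """Sum `metric` per category of `classify_by`, keeping only the rows
    whose dimension values match every entry in `filters`."""
    filters = filters or {}
    totals = defaultdict(int)
    for row in observations:
        if all(row[dim] == value for dim, value in filters.items()):
            totals[row[classify_by]] += row[metric]
    return dict(totals)


# "Economic activity of the population" expressed as a derived view of the
# real metric: population, filtered by age, classified by employment status.
economic_activity = derive(
    OBSERVATIONS,
    metric="population",
    classify_by="employment_status",
    filters={"age_group": "over_15"},
)
print(economic_activity)
```

Here the filter behaves like a dimension with a single category, as the comment describes, while the underlying metric stays ‘population’ throughout.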
Idescat on the Google Public Data Explorer

  1. Idescat on the Google Public Data Explorer: The Why, the What and the near Future | Xavier Badosa (@badosa) | Statistical Institute of Catalonia (Idescat) | Google Public Data Explorer Day | Eurostat | Luxembourg, 30 June 2011
  2. Idescat on the Google Public Data Explorer: The Why, the What and the near Future | Xavier Badosa (@badosa) | Statistical Institute of Catalonia (Idescat) | Google Public Data Explorer Day | Eurostat | Luxembourg, 30 June 2011
  3. 7.5 M | Barcelona
  4. idescat
  5. Dissemination products | idescat
  6. Dissemination products | Statistics as platform
  7. “Apps” | Statistics as platform | “O.S.”
  8. General-purpose “Apps” | Statistics as platform | “O.S.”
  9. General-purpose “Apps” | Third-party “Apps” that solve specific needs | Statistics as platform
  10. General-purpose “Apps” | Third-party “Apps” that solve specific needs | REUSE | Statistics as platform
  11. CC BY | REUSE | Statistics as platform
  12. CC BY | APIs | REUSE | Statistics as platform
  13. CC BY | APIs | Widgets | ... | REUSE | Statistics as platform
  14. CC BY | APIs | Widgets | ... | GPDE | REUSE
  15. CC BY | APIs | Widgets | ... | GPDE | REUSE | Very powerful tool
  16. “To use again” | REUSE
  17. “To use again” | elsewhere | REUSE
  18. in a new way | “To use again” | elsewhere | REUSE
  19. Malleability | elsewhere | REUSE
  20. Malleability | Ease of transformation | elsewhere | REUSE
  21. Malleability | Mobility | REUSE
  22. Malleability | Ease of transportation | Mobility | REUSE
  23. Malleability | Mobility
  24. Malleability | Mobility
  25. Malleability | Mobility
  26. Malleability | Mobility
  27. Malleability | highly customizable | Mobility
  28. highly customizable
  29. A single big dataset (vs. many small datasets)
  30. Datasets are unconnected worlds | I feel so lonely! | A single big dataset (vs. many small datasets)
  31. 1 dataset | many sources | A single big dataset (vs. many small datasets)
  32. 1 dataset | many sources
  33. 1 dataset | many sources
  34. Feb. 2011 | 1 dataset | many sources | 28 | 31 DS
  35. Feb. 2011 | May 2011 | 1 dataset | many sources | 28 | 31 DS | 40 DS
  36. Feb. 2011 | May 2011 | 1 dataset | many sources | 35 | 28 | 31 DS | 40 DS
  37. Feb. 2011 | May 2011 | DOES NOT SCALE | 35 | 28 | 31 DS | 40 DS
  38. (no text)
  39. Employment Barcelona | Hierarchical list of places | List of metrics | Common vocabularies | List of dimensions | Available years/months | List of sources | Users don’t care about datasets
  40. (no text)
  41. 1 dataset | many sources | data dissemination | data visualization
  42. 1 existing dataset | many sources | data dissemination | data visualization
  43. 1 open existing dataset | many sources | Machine processable
  44. 1 local open dataset | many sources
  45. 1 local open dataset | many sources
  46. 988 | 1 local open dataset | many sources | Catalonia 1 | Counties 41 | Municipalities 946
  47. 1 local open dataset | many sources | annual
  48. DSPL | annual
  49. Separation of data & metadata | Commonsensical use of XML+CSV | DSPL | annual
  50. Separation of data & metadata | Commonsensical use of XML+CSV | DSPL | easy to automate | annual
  51. easy to automate | annual
  52. Full bundle | easy to automate | annual
  53. Catalan municipalities indicators | Full bundle | 128 files! | <10 updated | easy to automate | annual
  54. Full bundle | Designed for humans | easy to automate | annual
  55. Single files | Designed for machines | Write API | easy to automate | annual
  56. The King of APIs
  57. Single files | PUSH | Designed for machines | Write API | easy to automate | annual
  58. Single files | PULL | Designed for machines | Read | easy to automate | annual
  59. DSPL | PULL
  60. DSPL | PULL
  61. DSPL | PULL
  62. (no text)
  63. local | open | many sources | annual
  64. (no text)
  65. 53 metrics
  66. 4 topics | 53 metrics
  67. ∑ | 4 topics | 53 metrics
  68. metrics x dimensions | 4 topics | 53 metrics
  69. metrics x dimensions | 4 topics | 53 metrics | population x employment status
  70. metrics x dimensions | 4 topics | 53 metrics | population x employment status
  71. metrics x dimensions | population x employment status | ! | These aren’t metrics
  72. Too close | data model | user interface | ! | These aren’t metrics
  73. topics
  74. topics | related metrics | Better | derived metrics
  75. 30 dimensions | 946 mun. | 53 metrics | 41 counties | 4 topics
  76. 30 dimensions | 946 mun. | 53 metrics | 41 counties | 4 topics | 3 languages
  77. highly customizable
  78. (no text)
  79. (no text)
  80. (no text)
  81. (no text)
  82. (no text)
  83. Malleability | highly customizable
  84. Malleability | highly customizable | Mobility
  85. embeddable | Mobility
  86. embeddable | Mobility
  87. embeddable | Mobility
  88. embeddable | Mobility
  89. embeddable | Mobility
  90. embeddable | Mobility
  91. embeddable | Mobility | Reversing the communication initiative
  92. Idescat | Users | Mobility | Reversing the communication initiative
  93. Analytics Dashboard | embeddable | Mobility | Reversing the communication initiative
  94. Analytics Dashboard | embeddable | Mobility | There’s no GPDE analytics dashboard!
  95. Analytics Dashboard | # installs, # visits/visitors | installs with + visits/visitors | info with + visits/visitors | chart with + visits/visitors | ...
  96. (no text)
  97. embeddable | Mobility
  98. embeddable | Mobility | 3S
  99. 3S | Youtubify yourself
  100. 3S
  101. http://www.google.com/publicdata/explore?ds=z1foifl1a0gsn2_&ctype=l&strail=false&nselm=h&met_y=f_pop&hl=en&dl=en#ctype=c&strail=false&nselm=s&met_y=f_pop_percent&fdim_y=birth_place:Abroad&scale_y=lin&ind_y=false&ifdim=mun&hl=en&dl=en | http://goo.gl/XtpLa | http://goo.gl/pd/XtpLa | http://gp.de/z1foifl1a0gsn2_?8vH | Shorten | 3S
  102. Shorten | 3S | Share
  103. Shorten | 3S | Share
  104. Shorten | Share
  105. Support | oEmbed | Shorten | Share
  106. Support | oEmbed | via | Shorten | Share
  107. Support | oEmbed | Shorten | Share
  108. Malleability | idescat | Mobility | REUSE
  109. Malleability | idescat | Mobility | REUSE | Google APIs
  110. Better discovery | Automatic updates | Easier embedding | ANALYTICS
  111. (no text)
  112. (no text)
  113. (no text)
  114. pageviews? | visits? | unique visitors? | What about our website’s success?
  115. pageviews? | visits? | unique visitors? | Success metrics?
  116. pageviews? | visits? | unique visitors? | Business model? | Success metrics?
  117. pageviews? | visits? | unique visitors? | Business model? | Success metrics?
  118. pageviews? | We don’t operate in the eyeball market | visits? | unique visitors? | Business model? | Success metrics?
  119. pageviews? | We don’t operate in the eyeball market | visits? | unique visitors? | We operate in the reference market | Business model? | Success metrics?
  120. maximum data exposure & reach | reference market
  121. maximum data exposure & reach | reference market | accuracy preservation
  122. maximum data exposure & reach | reference market | accuracy preservation | brand recognition
  123. (no text)
  124. (no text)
  125. (no text)
  126. (no text)
  127. Thank You! | See also: Statistical dissemination 2.0
  128. borman818 / Daniel Borman | JoshBancroft | jakevance / Jacob Vance | Prizmatic | Cristian Torras | Mick ㋡rlosky | Michelle Kinsey Bruns | Niamor83 | Clarissa Rossarola
  129. Wikimedia Commons | NASA | http://en.wikipedia.org/wiki/File:The_Earth_seen_from_Apollo_17.jpg | NuclearVacuum | http://en.wikipedia.org/wiki/File:The_Earth_seen_from_Apollo_17.jpg | Mutxamel / HansenBCN | http://en.wikipedia.org/wiki/File:Localizaci%C3%B3n_de_Catalu%C3%B1a.svg | Author unknown | http://www.taltopia.com/media/6/6374/SPERM-ART.jpg | Maps © by Google and TeleAtlas | PD
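
The “Shorten” request on slide 101 is easier to appreciate once the long Public Data Explorer URL is taken apart. The sketch below uses only Python's standard `urllib.parse` to split that URL, copied from the slide, into its query-string and fragment parameters. The `chart_state` helper name is hypothetical, and reading the fragment as the current chart state is an assumption based on the URL's shape, not on any documented GPDE behaviour.

```python
from urllib.parse import urlsplit, parse_qs

# The long URL shown on slide 101, joined back into a single string.
LONG_URL = (
    "http://www.google.com/publicdata/explore?ds=z1foifl1a0gsn2_&ctype=l"
    "&strail=false&nselm=h&met_y=f_pop&hl=en&dl=en#ctype=c&strail=false"
    "&nselm=s&met_y=f_pop_percent&fdim_y=birth_place:Abroad&scale_y=lin"
    "&ind_y=false&ifdim=mun&hl=en&dl=en"
)


def chart_state(url):
    """Return (query_params, fragment_params) as plain dicts.

    Both parts use key=value pairs joined by '&', so parse_qs can read
    either one; only the first value of each key is kept.
    """
    parts = urlsplit(url)
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    fragment = {key: values[0] for key, values in parse_qs(parts.fragment).items()}
    return query, fragment


query, fragment = chart_state(LONG_URL)
print("dataset id:", query["ds"])
print("metric in query:", query["met_y"])
print("metric in fragment:", fragment["met_y"])
print("filter in fragment:", fragment["fdim_y"])
```

The dataset identifier (`ds`) appears only once, in the query string, while the chart settings are repeated in the fragment with different values, which helps explain why the slide proposes a short form built around the dataset identifier.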
