Idescat on the Google Public Data Explorer

Idescat on the Google Public Data Explorer: The Why, the What and the near Future.

Google Public Data Explorer Day. Eurostat. Luxembourg, 30 June 2011.

Usage Rights

CC Attribution License

  • Actually I thought you knew all people who find the GPDE directory at all ;-)

    But otherwise, I agree something could be done in terms of better discovery. I was surprised when I checked google search for a few of the indicators from your data bundle (in English), but no graph showed up in the search results. Well, 'slideware' doesn't work in all cases, it's not perfect (yet) and there is still enough room for developments and our contributions :-)
  • Katja, you are absolutely right when you say that ’reuse, searchability, malleability, mobility outweigh the directory list in terms of visibility’. That said, a single list still does not seem the right model for a dataset catalog (unless that catalog has fewer than 10 items!).

    And it’s not only a matter of the dataset’s visibility as a whole: more important than that is the visibility of the dataset’s *contents* (metrics, dimensions, locations, time...). That’s why this issue is referred to as ’discovery’ in the summary (slide 110). For example, as it is now, a user has a hard time discovering all the information available at GPDE for a certain country like Italy.

    ’I could speculate that you personally know all people who find your dataset through the directory ;-)’

    If that was the case, that would prove my point :-): only those who already knew our dataset was there actually found it. Probably, many of those interested in our information who didn’t expect it to be there, and weren’t looking for it, didn’t find out about it after visiting the directory list.
  • Great presentation! I'd just like to comment on slides 37 + 39. I believe it is not very relevant how your dataset scales in the Google Public Data directory. Reuse, searchability, malleability, mobility outweigh the directory list in terms of visibility. I could speculate that you personally know all people who find your dataset through the directory ;-)
  • GPDE has a good data model. But, IMHO, the user interface maps that data model too closely, which I think can lead to confusion sometimes. For instance, a filter (for example, for labor market statistics’ purposes: population > 15 years old) can be treated, in the data model, as a dimension with only 1 category (and this is perfectly OK) but should be shown in the user interface as something different from a regular dimension.

    Besides, considering topics as groups of metrics doesn’t seem right. For some topics (’society’, ’labor market’, ’education’...) [see slide 71], it is not just about a metric (’population’) but about a metric * dimensions. Now, if you have many metrics, the only tool at hand to help your users is grouping them into topics. So you are forced to choose between the data model or the user interface.

    We have so many metrics that not grouping them wasn’t an option, so we had to ’cheat’ on the data model front: we made up some metrics like ’Economic activity of the population’ (or ’Knowledge of Catalan’). Of course, this is not a real metric: the metric is ’population’, filtered by age and classified by employment status.

    It is very wrong to mess up the data model for user interface reasons, but we couldn’t find a better solution for this trade-off. My proposal for Google is on slide 74: forget about topics for grouping metrics; introduce the idea of ’related’ or ’derived’ metrics as a way of narrowing the metrics’ list.
  • Dear Xavier,
    I have a friend at Facebook asking for an explanation to page 72. Can you help?
    Best regards
    Alf

Idescat on the Google Public Data Explorer: Presentation Transcript

  • Idescat on the Google Public Data Explorer:
    The Why, the What and the near Future
    Xavier Badosa (@badosa)
    Statistical Institute of Catalonia (Idescat)
    Google Public Data Explorer Day
    Eurostat
    Luxembourg, 30 June 2011
  • 7.5 M
    Barcelona
  • idescat
  • Dissemination
    products
    idescat
  • Dissemination
    products
    Statistics as platform
  • “Apps”
    Statistics as platform
    “O.S.”
  • General-purpose
    “Apps”
    Statistics as platform
    “O.S.”
  • General-purpose
    “Apps”
    Third-party
    “Apps”
    that solve specific needs
    Statistics as platform
  • General-purpose
    “Apps”
    Third-party
    “Apps”
    that solve specific needs
    REUSE
    Statistics as platform
  • CC BY
    REUSE
    Statistics as platform
  • CC BY
    APIs
    REUSE
    Statistics as platform
  • CC BY
    APIs
    Widgets
    ...
    REUSE
    Statistics as platform
  • CC BY
    APIs
    Widgets
    ...
    GPDE
    REUSE
  • CC BY
    APIs
    Widgets
    ...
    GPDE
    REUSE
    Very powerful tool
  • “To use again”
    REUSE
  • “To use again”
    elsewhere
    REUSE
  • in a new way
    “To use again”
    elsewhere
    REUSE
  • Malleability
    elsewhere
    REUSE
  • Malleability
    Ease of
    transformation
    elsewhere
    REUSE
  • Malleability
    Mobility
    REUSE
  • Malleability
    Ease of
    transportation
    Mobility
    REUSE
  • Malleability
    Mobility
  • Malleability
    highly customizable
    Mobility
  • highly customizable
  • A single big dataset (vs. many small datasets)
  • Datasets are
    unconnected worlds
    I feel so lonely!
    A single big dataset (vs. many small datasets)
  • 1
    dataset
    many sources
    A single big dataset (vs. many small datasets)
  • 1
    dataset
    many sources
  • Feb. 2011
    1
    dataset
    many sources
    28
    31 DS
  • Feb. 2011
    May 2011
    1
    dataset
    many sources
    28
    31 DS
    40 DS
  • Feb. 2011
    May 2011
    1
    dataset
    many sources
    35
    28
    31 DS
    40 DS
  • Feb. 2011
    May 2011
    DOES
    NOT
    SCALE
    35
    28
    31 DS
    40 DS
  • Employment Barcelona
    Hierarchical list of places
    List of metrics
    Common vocabularies
    List of dimensions
    Available years/months
    List of sources
    Users don’t care about datasets
  • 1
    dataset
    many sources
    data dissemination
    data visualization
  • 1
    existing
    dataset
    many sources
    data dissemination
    data visualization
  • 1
    open
    existing
    dataset
    many sources
    Machine
    processable
  • 1
    local
    open
    dataset
    many sources
  • 988
    1
    local
    open
    dataset
    many sources
    Catalonia 1
    Counties 41
    Municipalities 946
  • 1
    local
    open
    dataset
    many sources
    annual
  • DSPL
    annual
  • Separation of
    data & metadata
    Commonsensical
    use of XML+CSV
    DSPL
    annual
  • Separation of
    data & metadata
    Commonsensical
    use of XML+CSV
    DSPL
    easy to automate
    annual
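The DSPL slides above stress two properties: data and metadata live in separate files (CSV and XML), and the bundle is easy to generate automatically. A minimal Python sketch of that idea follows. It is not a valid DSPL bundle: the element names (`dataset`, `table`, `column`, `file`), the file name `population.csv`, and the sample rows are all invented for illustration, and a real bundle would have to follow the DSPL schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Invented placeholder rows: (municipality code, year, population).
ROWS = [
    ("M001", 2009, 100),
    ("M001", 2010, 110),
]

# Column ids and types shared by the data file and its description.
COLUMNS = [("municipality", "string"), ("year", "date"), ("population", "integer")]


def build_data_csv(rows):
    """Return the data part: a plain CSV with one row per observation."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([col_id for col_id, _ in COLUMNS])
    writer.writerows(rows)
    return buf.getvalue()


def build_metadata_xml(csv_name):
    """Return the metadata part: XML that describes the CSV columns.

    Element names are illustrative only, not the real DSPL vocabulary.
    """
    root = ET.Element("dataset")
    table = ET.SubElement(root, "table", id="population_table")
    for col_id, col_type in COLUMNS:
        ET.SubElement(table, "column", id=col_id, type=col_type)
    ET.SubElement(table, "file", format="csv").text = csv_name
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_data_csv(ROWS))
    print(build_metadata_xml("population.csv"))
```

Because the XML only describes the columns, an annual update under this layout would regenerate the CSV and leave the metadata file untouched.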
  • easy to automate
    annual
  • Full
    bundle
    easy to automate
    annual
  • Catalan municipalities indicators
    Full
    bundle
    128 files!
    <10 updated
    easy to automate
    annual
  • Full
    bundle
    Designed
    for humans
    easy to automate
    annual
  • Single
    files
    Designed
    for machines
    Write API
    easy to automate
    annual
  • The King
    of
    APIs
  • Single
    files
    PUSH
    Designed
    for machines
    Write API
    easy to automate
    annual
  • Single
    files
    PULL
    Designed
    for machines
    Read
    easy to automate
    annual
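One way to read the PULL slides: the publisher exposes single files, and the consumer fetches only the ones that changed instead of receiving the full bundle again (128 files, fewer than 10 updated). The Python sketch below illustrates that with content fingerprints. The manifest layout, the file names, and the `files_to_pull` helper are assumptions made for this example, not anything GPDE or Idescat provides.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a stable fingerprint of a file's content."""
    return hashlib.sha256(content).hexdigest()


def files_to_pull(published: dict, last_seen: dict) -> list:
    """Return the names of files that are new or changed, sorted.

    `published` maps file name -> fingerprint as exposed by the publisher;
    `last_seen` holds the fingerprints recorded at the previous pull.
    """
    return sorted(
        name for name, digest in published.items()
        if last_seen.get(name) != digest
    )


if __name__ == "__main__":
    # Hypothetical file names and contents.
    last_seen = {
        "population.csv": fingerprint(b"2010 data"),
        "employment.csv": fingerprint(b"2010 data"),
    }
    published = {
        "population.csv": fingerprint(b"2011 data"),   # updated
        "employment.csv": fingerprint(b"2010 data"),   # unchanged
        "education.csv": fingerprint(b"2011 data"),    # new
    }
    print(files_to_pull(published, last_seen))
```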
  • DSPL
    PULL
  • local
    open
    many sources
    annual
  • 53 metrics
  • 4 topics
    53 metrics

  • metrics
    x
    dimensions
    4 topics
    53 metrics
  • metrics
    x
    dimensions
    4 topics
    53 metrics
    population
    x
    employment status
  • metrics
    x
    dimensions
    population
    x
    employment status
    !
    These aren’t metrics
  • Too close
    data model
    user interface
    !
    These aren’t metrics
  • topics
  • topics
    related metrics
    Better
    derived metrics
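The proposal on these slides (replace topics with related or derived metrics) can be sketched as a small data model in which "Economic activity of the population" is not a metric of its own but the `population` metric plus a filter and a dimension. Everything below is a hypothetical illustration in Python: the class names, the `derived_from` helper, and the dimension identifiers are assumptions, not part of DSPL or GPDE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A real metric, such as a population count."""
    id: str
    name: str


@dataclass(frozen=True)
class DerivedMetric:
    """A named view of a base metric.

    Filters narrow the base metric; dimensions classify it.
    """
    name: str
    base: Metric
    filters: tuple = ()
    dimensions: tuple = ()


def derived_from(base, catalog):
    """List the names of the derived metrics built on a given base metric."""
    return [d.name for d in catalog if d.base == base]


population = Metric("population", "Population")

catalog = [
    DerivedMetric(
        "Economic activity of the population",
        population,
        filters=(("age", "> 15"),),
        dimensions=("employment_status",),
    ),
    DerivedMetric(
        "Knowledge of Catalan",
        population,
        dimensions=("knowledge_of_catalan",),
    ),
]

if __name__ == "__main__":
    print(derived_from(population, catalog))
```

With this shape, a user interface could narrow a long metric list by showing `population` first and offering its derived views on demand, without inventing fake metrics in the data model.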
  • 30 dimensions
    946 mun.
    53 metrics
    41 counties
    4 topics
  • 30 dimensions
    946 mun.
    53 metrics
    41 counties
    4 topics
    3 languages
  • highly customizable
  • Malleability
    highly customizable
  • Malleability
    highly customizable
    Mobility
  • embeddable
    Mobility
  • embeddable
    Mobility
    Reversing
    the communication
    initiative
  • Idescat
    Users
    Mobility
    Reversing
    the communication
    initiative
  • Analytics Dashboard
    embeddable
    Mobility
    Reversing
    the communication
    initiative
  • Analytics Dashboard
    embeddable
    Mobility
    There’s no GPDE analytics dashboard!
  • Analytics Dashboard
    # installs, # visits/visitors
    installs with + visits/visitors
    info with + visits/visitors
    chart with + visits/visitors
    ...
  • embeddable
    Mobility
  • embeddable
    Mobility
    3S
  • 3S
    Youtubify yourself
  • 3S
  • http://www.google.com/publicdata/explore?ds=z1foifl1a0gsn2_&ctype=l&strail=false&nselm=h&met_y=f_pop&hl=en&dl=en#ctype=c&strail=false&nselm=s&met_y=f_pop_percent&fdim_y=birth_place:Abroad&scale_y=lin&ind_y=false&ifdim=mun&hl=en&dl=en
    http://goo.gl/XtpLa
    http://goo.gl/pd/XtpLa
    http://gp.de/z1foifl1a0gsn2_?8vH
    Shorten
    3S
  • Shorten
    3S
    Share
  • Shorten
    Share
  • Support
    oEmbed
    Shorten
    Share
  • Support
    oEmbed
    via
    Shorten
    Share
  • Support
    oEmbed
    Shorten
    Share
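To make the "Support oEmbed" request concrete, here is a sketch in Python of what a provider could return for a chart URL. The field names (`version`, `type`, `html`, `width`, `height`, `provider_name`) follow the oEmbed 1.0 "rich" response type. The rest is assumption: GPDE offered no such endpoint at the time of the talk, and the `oembed_response` function, the iframe markup, and the example URL are hypothetical.

```python
import html
import json


def oembed_response(chart_url: str, maxwidth: int = 600, maxheight: int = 400) -> dict:
    """Build an oEmbed 1.0 'rich' response that embeds a chart in an iframe.

    The chart URL is HTML-escaped because it is placed inside an attribute.
    """
    iframe = '<iframe src="{}" width="{}" height="{}" frameborder="0"></iframe>'.format(
        html.escape(chart_url, quote=True), maxwidth, maxheight
    )
    return {
        "version": "1.0",          # required by the oEmbed spec
        "type": "rich",            # rich type requires html, width, height
        "provider_name": "Google Public Data Explorer",
        "html": iframe,
        "width": maxwidth,
        "height": maxheight,
    }


if __name__ == "__main__":
    # Hypothetical chart URL; a consumer would pass the page URL it wants to embed.
    response = oembed_response("http://example.org/chart?ds=x&ctype=l", 500, 300)
    print(json.dumps(response, indent=2))
```

A blog platform that understands oEmbed could then turn a pasted chart link into an embedded chart without the author copying any markup.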
  • Malleability
    idescat
    Mobility
    REUSE
  • Malleability
    idescat
    Mobility
    REUSE
    Google
    APIs
  • Better
    discovery
    Automatic
    updates
    Easier embedding
    A N A L Y T I C S
  • pageviews?
    visits?
    unique visitors?
    What about
    our website’s
    success?
  • pageviews?
    visits?
    unique visitors?
    Success metrics?
  • pageviews?
    visits?
    unique visitors?
    Business model?
    Success metrics?
  • pageviews?
    We don’t operate in the
    eyeball market
    visits?
    unique visitors?
    Business model?
    Success metrics?
  • pageviews?
    We don’t operate in the
    eyeball market
    visits?
    unique visitors?
    We operate in the
    reference
    market
    Business model?
    Success metrics?
  • maximum data exposure & reach
    reference
    market
  • maximum data exposure & reach
    reference
    market
    accuracy preservation
  • maximum data exposure & reach
    reference
    market
    accuracy preservation
    brand recognition
  • Thank you!
    See also:
    Statistical dissemination 2.0
  • borman818 / Daniel Borman
    JoshBancroft
    jakevance / Jacob Vance
    Prizmatic
    Cristian Torras
    Mick ㋡rlosky
    Michelle Kinsey Bruns
    Niamor83
    Clarissa Rossarola
  • Wikimedia Commons
    NASA
    http://en.wikipedia.org/wiki/File:The_Earth_seen_from_Apollo_17.jpg
    NuclearVacuum
    http://en.wikipedia.org/wiki/File:The_Earth_seen_from_Apollo_17.jpg
    Mutxamel / HansenBCN
    http://en.wikipedia.org/wiki/File:Localizaci%C3%B3n_de_Catalu%C3%B1a.svg
    Author unknown
    http://www.taltopia.com/media/6/6374/SPERM-ART.jpg
    Maps © by Google and TeleAtlas
    PD