OPEN DATA

  READY, SET, GO!




Paul Groth
Twitter: @pgroth
Blog: thinklinks.wordpress.com
http://www.few.vu.nl/~pgroth
The Science Lifecycle

[Diagram labels: Scientists; Experimentation; Undergraduate Students; Graduate Students; Next Generation Researchers; Virtual Learning Environment; Digital Libraries; Peer-Reviewed Journal & Conference Papers; Preprints & Metadata; Technical Reports; Reprints; Local Web Repositories; Certified Experimental Results & Analyses; Data, Metadata, Provenance, Scripts, Workflows, Services, Ontologies, Blogs, ...]

Adapted from David De Roure’s slides
TWO STORIES

THE CONSUMER AND PRODUCER
MEET JULIE

PhD Student
“institutional influences on
patterns of collaboration in
producing research of
interdisciplinary character”




                               Faculteit der Exacte Wetenschappen
Julie needs data




I AM NOT A LAWYER
    Web of Knowledge Terms of Use


    You are entitled to access the product, download or extract reasonable amounts of data from the
    product that are required for the activities you carry out individually or as part of your employment, and
    include insubstantial portions of extracted data in your work documents and reports, provided that such
    documents or reports are for the benefit of (and belong to) your organization, or where such documents
    or reports are intended for the benefit of third parties (not your organization ), extracted data is
    immaterial in the context of such documents or reports and used only for illustrative/demo purposes.

    Thomson Reuters determines a “reasonable amount” of data to download by comparing your download
    activity against the average annual download rates for all Thomson Reuters clients using the product in
    question. Thomson Reuters determines an “insubstantial portion” of downloaded data to mean an
    amount of data taken from the product which (1) would not have significant commercial value of its
    own; and (2) would not act as a substitute for access to a Thomson Reuters product for someone who
    does not have access to the product.

    You are not entitled to do anything that would cause a breach of the terms of the agreement between
    your organization and Thomson Reuters, such as (1) allowing anyone else to use your
    username/password, (2) downloading excessive amounts of data, (3) providing data to anyone else,
    other than in licensed, source-acknowledged documents or reports created as part of your normal work,
    (4) archiving or using downloaded data to create a derivative database or metrics, (5) using the product
    or any downloaded data to provide services to anyone outside your organization, or (6) using the
    product in a way that risks damaging, disabling, overburdening or impairing the operation of the
    product, or any other person’s use or enjoyment of the product.




OPEN DATA: 2 WEEKS → 15 MINUTES
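# Prefixes are not shown on the slide; the namespaces below are assumed.
PREFIX swrc: <http://swrc.ontoware.org/ontology#>
PREFIX swc:  <http://data.semanticweb.org/ns/swc/ontology#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>

# $graph and $article are template placeholders filled in before the query is sent.
SELECT ?author ?affiliation ?uriAffiliation WHERE
{
    GRAPH <$graph> {
        # An author may be linked via swrc:author, foaf:maker, or dc:creator.
        {
            <$article> swrc:author ?author.
            OPTIONAL{?author swrc:affiliation ?uriAffiliation.}
            OPTIONAL{?author swc:affiliation ?affiliation.}
        }
        UNION {
            <$article> foaf:maker ?author.
            OPTIONAL{?author swrc:affiliation ?uriAffiliation.}
            OPTIONAL{?author swc:affiliation ?affiliation.}
        }
        UNION {
            <$article> dc:creator ?author.
            OPTIONAL{?author swrc:affiliation ?uriAffiliation.}
            OPTIONAL{?author swc:affiliation ?affiliation.}
        }
    }
}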
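
The `$graph` and `$article` markers in the query look like template placeholders, so the query was presumably filled in once per article before being sent to an endpoint. A minimal sketch of that step in Python follows. It is an assumption about the workflow, not the method actually used: the query is shortened to the `swrc:author` branch, and the endpoint, graph, and article URIs are hypothetical.

```python
from string import Template
from urllib.parse import urlencode

# Shortened version of the slide's query: only the swrc:author branch is kept.
# $graph and $article are the placeholders that appear on the slide.
QUERY_TEMPLATE = Template("""\
PREFIX swrc: <http://swrc.ontoware.org/ontology#>
SELECT ?author ?uriAffiliation WHERE {
  GRAPH <$graph> {
    <$article> swrc:author ?author .
    OPTIONAL { ?author swrc:affiliation ?uriAffiliation . }
  }
}
""")


def build_query(graph_uri: str, article_uri: str) -> str:
    """Substitute the two placeholders and return the SPARQL text."""
    return QUERY_TEMPLATE.substitute(graph=graph_uri, article=article_uri)


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as a GET request URL with a 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical URIs, for illustration only.
query = build_query(
    "http://example.org/graph/conference",
    "http://example.org/paper/42",
)
url = build_request_url("http://example.org/sparql", query)
```

Looping `build_query` over a list of article URIs would give one request per article, which is the kind of bulk access the Web of Knowledge terms quoted earlier restrict.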




PRODUCER

INSTITUTION
PRODUCER

PERSONAL
Photo by IvanClow - http://www.flickr.com/photos/ivanclow/4201955402/
ERR….SUPPORT?




5 TAKE-AWAYS

     1.   Open Data is a boon to young scientists as consumers
     2.   Trade-offs for producers of open data
     3.   Producers need support
     4.   Clear, simple guidelines for data publication
     5.   Data citation is a key to open data






Editor's Notes

  • #6 Talk about citation data: it is difficult to get. It took 2 weeks to gather a couple of hundred citation scores.
  • #8 Open data to the rescue…
  • #9 My own community
  • #10 Faster; easier to experiment; access to more data
  • #11 Effective at the institutional level. Examples: UniProt, ChEMBL, astronomical data service, US government weather data
  • #12 Not as much experience at the personal level, but there are good examples from open source software
  • #13 Built software during my PhD and released it as open source…
  • #14 A fairly highly cited paper at the UK e-Science All Hands Meeting (not the biggest outlet in the world)
  • #15 Led to new collaborators
  • #17 Exposing your dirty laundry is scary
  • #18 Lots of questions about the software. People want support. This is a distraction and can take time away from “science”.
  • #19 Name-check e-science center for 3