Data management: international challenges, national infrastructure, and institutional responses
Presentation delivered to UKOLN on April 1, 2011.

  • So, let’s look at the state of data in scholarly communication. Unfortunately, it’s inconvenient, imprisoned, invisible, inaccessible, and incomprehensible
  • Need to retype
  • Near impossible to liberate. Talk about ChemXSeer example and DataThief Java application
  • Too transformed
  • Discipline scientist may know how to get these data but I don’t
  • NOTE: Some of these arguments are at individual, national, global level
    Efficiency for researcher: don’t reinvent the wheel
    Validation: repeatability of research
    Integrity: of scholarly record
    Value for money for funder: public money funded it, it should be available to public (ClimateGate!)
    Self-interest: sharing with a future self, greater visibility, more citations
    So, what are some good stories around data sharing?
  • Number of initiatives around the world working to do a better job on data: NSF DataNet (Sayeed/Bill later in conference), JISC Managing Research Data, NL SURF/DANS
  • I’m going to take a programmatic view (because that explains how we are funding stuff), while recognising that the issues don’t necessarily fit neatly inside those boundaries
  • And thank you for the opportunity to speak to you this afternoon.

Transcript

  • 1. Data Management: International challenges, National Infrastructure, and Institutional Responses - an Australian Perspective
    Dr Andrew Treloar
    Director of Technology
    Australian National Data Service
  • 2. International Challenges
  • 3. Inconvenient data
    DOI: 10.1098/rsta.2005.1569
  • 4. Imprisoned data
    DOI: 10.1098/rsta.2006.1793
  • 5. Invisible data
    DOI: 10.1098/rsta.2006.1793
  • 6. Inaccessible data
  • 7. Incomprehensible data
    ands.org.au
  • 8. Summary
    Not a first class object
    Unmanaged
    Disconnected
    Unfindable
    Unreusable
  • 9. Why re-use data?
    Efficiency
    Validation
    Integrity
    Value for money
    Self-interest
  • 10. Astronomy case study
    Hubble Space Telescope (HST) operating since 1990
    Observations are proposed, and if accepted, data is collected and made available to the proposers – who then write a research paper
    Each year around 1,000 proposals are reviewed and approximately 200 are selected, for a total of 20,000 individual observations
    Data is stored at the Space Telescope Science Institute and made available after embargo period
    There are now more research papers written by “second use” of the research data, than by the use initially proposed
  • 11.
    Source: http://archive.stsci.edu/hst/bibliography/pubstat.html
  • 12. Cancer micro-array trial case study
    Piwowar et al., “Sharing Detailed Research Data Is Associated with Increased Citation Rate”
    http://www.plosone.org/article/info:doi/10.1371/journal.pone.0000308
    Looked at the citation history of cancer microarray clinical trial publications
    Found that publicly available data was associated with a 69% increase in citations, independent of journal impact factor, date of publication, and author country of origin
  • 13. Alzheimer’s Disease NeuroImaging Initiative
    Collaborative effort to find brain biomarkers for Alzheimer’s disease
    Key: All brain scans and other data freely available to scientific community without embargo.
    Over 3K full downloads and 1M scan downloads by over 400 investigators world-wide
    Over 100 publications
    Institut Douglas CC BY-NC-ND
    http://www.fnih.org/work/areas/chronic-disease/adni
  • 14. National Infrastructure
  • 15. National approaches
    Number of different countries: UK, US, DE, NL
    Different environments => different ecosystems
    and so some local tradeoffs
    But some common themes emerging:
    Do the things that only you can do
    Be the ‘voice for data’
    Prime the pump
  • 16. Australian National Data Service
    An initiative of the Australian Government being conducted as part of the National Collaborative Research Infrastructure Strategy ($A24M) and the Super Science Initiative ($A48M)
    A collaboration between Monash University, the Australian National University and CSIRO
    Nearly 50 staff, funded to mid 2013
    More researchers re-using more data more often
    Data as a first-class object
  • 21. ANDS is enabling the transformation of:
    Data that are:
    Unmanaged
    Disconnected
    Invisible
    Single use
    Collections that are:
    Managed
    Connected
    Findable
    Reusable
    so that Australian researchers can easily discover, access and re-use data
  • 22. Defining characteristics of ANDS
    Building national services
    Engaging with institutions not researchers (mostly)
    Working within funding constraints
    use, not amount!
    Building the Australian Research Data Commons
  • 23.
  • 24. ANDS Programs
    Frameworks and Capability
    Seeding the Commons
    Data Capture
    Metadata Stores
    ARDC Core
    Public Sector Data
    Applications
  • 25. Spending profile
  • 26. RDA Demo
    http://www.google.com/
  • 27. Institutional Responses
  • 28.
    Driven by Australian Code for Responsible Conduct of Research
    Equivalent of UKRIO’s Code of Practice for Research: Promoting good practice and preventing misconduct
    Takes significant time to get accepted
    ANDS providing models of good practice
    Seeding the Commons
    U->M (Unmanaged -> Managed)
    Data management policy and planning
  • 29.
    Retrospective data description
    Different selection mechanisms
    Seeding the Commons
    U->M (Unmanaged -> Managed)
    Fixing the past
  • 30.
    Improving internal CRIS systems
    Better integration
    Moving beyond publications
    Better links to data collection descriptions
    Seeding the Commons, Metadata Stores
    D->C (Disconnected -> Connected)
  • 31.
    Facilitating easier/better capture of data and metadata from selected ‘instruments’
    Making the right thing easier
    Improving quality of metadata
    Data Capture
    U->M (Unmanaged -> Managed)
    S->R (Single use -> Reusable)
    Fixing the future
  • 32.
    Describing institutions’ research data assets
    Series of metadata stores rollouts plus some ancillary activity
    Metadata Stores, Seeding the Commons, Data Capture
    D->C (Disconnected -> Connected)
    I->F (Invisible -> Findable)
  • 33.
  • 34. Ongoing Issues
  • 35. Country-Institution-Discipline
    Who wins?
    Who should win?
  • 36. Sustainability, sustainability, sustainability…
    Institutional activity
    National services/resources
    Developed software
  • 37.
    Priming the pump, or continuing to pump?
    If institutions/researchers/disciplines don’t care, why should the funders?
    Role of Government
  • 38. Questions/Links
    ands.org.au
    services.ands.org.au
    andrew.treloar@ands.org.au
    @atreloar
    andrew.treloar.net