Acting as Advocate? Seven steps for libraries in the data decade


Presentation given at the IATUL Conference, Purdue University in June 2010.

Published in: Technology
  • The I2S2 project aims to understand and identify the requirements for a data-driven research infrastructure in the Structural Sciences. The work focuses on the exemplar domain of Chemistry, with a view towards inter-disciplinary application. The Idealised Scientific Research Data Lifecycle Model produced by the I2S2 project extends and adapts the Keeping Research Data Safe (KRDS) Activity Model from a “researcher perspective”. It moves KRDS from an archive-centric to a researcher-centric view by: defining and emphasising more of the activities in the research (KRDS “Pre-Archive”) phase, where research data is created; adding a “Publication” set of activities; concatenating the KRDS “Archive” phase activities in the centre of the model for simplification and presentational purposes; and adding some specific local research administration activities. In addition, for the purposes of the project, it adds some selective detail of information flows and information objects between the activities. Note that this is an idealised model, and several activities, such as peer review or conducting an experiment, may have multiple instances or repetitions. It also represents a project view as of June 2010 and may be subject to further changes.

    1. UKOLN is supported by:
       Acting as Advocate? Seven steps for libraries in the data decade
       Dr Liz Lyon, Director, UKOLN, University of Bath, UK; Associate Director, UK Digital Curation Centre
       IATUL Conference, Purdue University, June 2010
       This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0
    2. • Scale, Complexity, Predictive Potential
       • Continuum of Openness
       • Citizen Science
       • Credentials, Incentives, Rewards
       • Institutional Readiness & Response
       • Data Informatics Capacity & Capability
       http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/publications.html#november-2009
       • Open Science at Web-Scale
       • Consultation:
           • Write-To-Reply
       • Keynote Presentations:
           • eResearch Australasia Nov 2009
           • CNI, Baltimore April 2010
           • http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html
    3. Data scale
       Human Genome printed: http://www.flickr.com/photos/johnjobby/2252981353/sizes/l/
    4. “Data sets are becoming the new instruments of science”
    5. $1000 genome in <15 minutes ....by 2013?
    6. ...data logistic challenges....
       • Large-scale data storage that is:
           • Cost-effective (rent on-demand)
           • Secure (privacy and IPR)
           • Robust and resilient
           • Low entry barrier / ease-of-use
           • Has data-handling / transfer / analysis capability
       • Move sequencing out of genome centres
       • “.... analyse an entire human genome in a single day sitting with a laptop at your local Starbucks.”
       ...cloud services
    7. Clients in the cloud
    8. Library Actions
       • Provide Briefings on Cloud Data Services (in partnership with local IT Services?)
    9. Workflows, Models, Tools
       Sage Bionetworks genomics Workflow
    10. Figure: An Idealised Scientific Research Data Lifecycle Model
        Key: Research Activity; Research Admin Activity; Archive Activity; Publication Activity; Information Flow
    11. State-of-the-Art Report: Models & Tools (Alex Ball, June 2010)
        • Data Lifecycles
        • Data Policies (UK) incl DMP
        • Standards & tools
        • Data Asset Framework (DAF)
        • DANS Seal of Approval
        • Preservation metadata
        • Archive management tools
        • Cost / benefit tools
    12. Library Actions
        • Provide Briefings on Cloud Data Services (in partnership with local IT Services?)
        • Build usable Data Management Tools working in partnership with researchers
    13. Data Sustainability….
    14. Benefits Taxonomy: Summary
        Keeping Research Data Safe2 Report: April 2010
        • Dimension 1: Direct / Indirect (costs avoided)
        • Dimension 2: Near-term / Long-term
        • Dimension 3: Private / Public
    15. Library Actions
        • Provide Briefings on Cloud Data Services (in partnership with local IT Services?)
        • Build usable Data Management Tools working in partnership with researchers
        • Develop Data Sustainability Strategies and articulate the cost-benefits
    16. Ethics, Privacy, Culture
        “You have zero privacy anyway. Get over it” Scott McNealy, CEO Sun Microsystems, 1999
    17. Post-genome decade
        Human genomes: >24 published & almost 200 unpublished
    18. “P4 medicine: Predictive, Personalised, Preventive, Participatory.” Leroy Hood, Institute for Systems Biology
        • ...“medicine is going to become an information science”...
        Image from Scientific American
    19. P4 medicine
        • Each patient’s genome sequenced
        • Your genome is basis of your medical record
        • New method to anonymise medical records for genomics research at Vanderbilt Univ (April ‘10)
        • New predictive models of health and disease
        • Personalised treatments focus on preventative therapies
        Genome scale network biology
        Genomic data as a commodity
    20. They have shared their data….
    21. Share my data?
    22. “While many researchers are positive about sharing data in principle, they are almost universally reluctant in practice. ..... using these data to publish results before anyone else is the primary way of gaining prestige in nearly all disciplines.” INCREMENTAL Project
    23. • Sage Bionetworks: Integrative genomics
        • Open data in the Sage Commons repository
        • Human and mouse: clinical and genetics data
        • Develop predictive models of disease: liver / breast / colon cancer, diabetes, obesity
        • Crowd-sourced effort: global scope
        Stephen Friend
    24. Participatory medicine: share data & empower the patient...
        Sage Congress, San Francisco, April 2010
    25. Library Actions
        • Provide Briefings on Cloud Data Services (in partnership with local IT Services?)
        • Build usable Data Management Tools working in partnership with researchers
        • Develop Data Sustainability Strategies and articulate the cost-benefits
        • Publish Case Studies on Open Science to show benefits of universal data sharing
    26. Library Actions
        • Provide Briefings on Cloud Data Services (in partnership with local IT Services?)
        • Build usable Data Management Tools working in partnership with researchers
        • Develop Data Sustainability Strategies and articulate the cost-benefits
        • Publish Case Studies on Open Science to show benefits of universal data sharing
        • Present at University Ethics Committee to highlight open data issues for faculty
    27. • Professional Scientists: Training; Standards and ethics; Peer-review; Organisational support
        • Enthusiastic amateurs: Citizen scientist; Local: natural history, environ.; Global: astronomy; Self-supporting
    29. Citizen Science: validated in the professional press
    30. Working with science professionals
    31. Library Actions
        • Raise awareness of Citizen Science opportunities & guidelines for good practice
    32. Data Publication and Attribution
        http://www.flickr.com/photos/digitalfemme57/3271063366/
    33. Calls for action, new metrics
    34. What are we citing?
        Attribution granularity, from Macro to Micro / Nano:
        • Journal
        • Article
        • Workflow
        • Visualisation
        • Model
        • Data
        • Annotation
        • Concept
    35. How to cite large-scale predictive network models?
        • Multiple data sources
        • Linked data approach
        • Visualise: Cytoscape
        • Workflow: Taverna
        • Provenance issues
    36. Library Actions
        • Raise awareness of Citizen Science opportunities & guidelines for good practice
        • Promote Data Citation and Attribution to embed in publication practice and influence funder policy
    37. Take homes...
        • Briefings on Cloud Data Services
        • Build usable Data Management Tools
        • Develop Data Sustainability Strategies
        • Publish Case Studies on Open Science
        • Present at University Ethics Committee
        • Raise awareness of Citizen Science
        • Promote Data Citation and Attribution
        ...Acting as Advocate
    38. Chicago Mart Plaza, 6-8 December 2010
        Thank you…
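
The Benefits Taxonomy summarised on slide 14 names three dimensions, each with two values. As a minimal sketch of how a library might record benefits against those dimensions when articulating cost-benefits, the Python below models each dimension as an enumeration. The class names, the `Benefit` record, the `group_by` helper and the two sample entries are all hypothetical illustrations; they are not taken from the Keeping Research Data Safe2 report.

```python
from dataclasses import dataclass
from enum import Enum


class Directness(Enum):
    """Dimension 1 on the slide (class name is an assumption)."""
    DIRECT = "direct"
    INDIRECT = "indirect (costs avoided)"


class Timescale(Enum):
    """Dimension 2 on the slide (class name is an assumption)."""
    NEAR_TERM = "near-term"
    LONG_TERM = "long-term"


class Scope(Enum):
    """Dimension 3 on the slide (class name is an assumption)."""
    PRIVATE = "private"
    PUBLIC = "public"


@dataclass(frozen=True)
class Benefit:
    """One benefit, classified along all three dimensions."""
    description: str  # hypothetical free-text label
    directness: Directness
    timescale: Timescale
    scope: Scope


def group_by(benefits, dimension):
    """Group benefit descriptions by one dimension, given its attribute name."""
    groups = {}
    for benefit in benefits:
        key = getattr(benefit, dimension)
        groups.setdefault(key, []).append(benefit.description)
    return groups


# Illustrative entries only; invented for this sketch.
benefits = [
    Benefit("Re-use of data avoids repeating an experiment",
            Directness.INDIRECT, Timescale.LONG_TERM, Scope.PUBLIC),
    Benefit("New publication from archived data",
            Directness.DIRECT, Timescale.NEAR_TERM, Scope.PRIVATE),
]

by_timescale = group_by(benefits, "timescale")
for timescale, descriptions in by_timescale.items():
    print(timescale.value, "->", descriptions)
```

Grouping by one dimension at a time is one possible way to prepare a short summary for a data sustainability strategy; three two-valued dimensions give eight possible classifications in total.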