Codes, Clouds & Constellations: Open Science in the Data Decade

Presentation given at the CNI Meeting, Baltimore in April 2010.


Usage Rights

CC Attribution-ShareAlike License


    Codes, Clouds & Constellations: Open Science in the Data Decade: Presentation Transcript

    • Codes, Clouds & Constellations: Open Science in the Data Decade. Dr Liz Lyon, Director, UKOLN, University of Bath, UK; Associate Director, UK Digital Curation Centre. CNI Meeting, Baltimore, April 2010. This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0.
      • Scaling to Share
      • Publication and Attribution
      • Pathways to Participation
      • Institutions and Informatics
      • 2010 Perspectives
      • November 2009
      • Consultation
      • eResearch Australasia slides
      • http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html#2009-november-australasia
      • Progress, Prospects?
    • Scaling to Share Human Genome printed http://www.flickr.com/photos/johnjobby/2252981353/sizes/l/
    • From the Laboratory bench....
    • … to a national crystallography service....
    • ....to Diamond Light Source
      • “Bridging the chasm” between the local laboratory bench and large scale facilities
      • Develop Integrated Information Model
      • Use cases and Inter-disciplinary Pilots
      • Cost-benefit analysis: before and after
    • Diamond Light Source | National Crystallography Service (NCS) | Local Earth Sciences Lab, University of Cambridge
      • Function: International service, multiple communities | UK service, multiple institutions; also uses Diamond | Lone researcher at institution; uses NCS and ISIS large-scale facility
      • Administration: Peer-reviewed proposal required | Paper-based records: experiments, safety ERA, instrument time | Multiple proposals, multiple forms
      • Metadata: Core Scientific MetaData Model | eBank/eCrystals schema | ?
      • Identifiers: Beam-line number | DOI, InChI | ?
      • Workflow: Formulaic and bespoke | Formulaic, unrecorded | Complex, unrecorded
      • Software: In-house scripts | In-house scripts + open-source suite | In-house scripts + open-source suite
      • Raw data: In-house GDA store | ATLAS data-store | Laptop / local server
      • Derived data: Taken offsite on laptop / USB stick | eCrystals repository | Laptop / local server / USB stick
    • Technology race to market $1000 genome in <15 minutes ....by 2013?
    • ...data deluge challenges....
      • Large-scale data storage that is:
        • Cost-effective (rent on-demand)
        • Secure (privacy and IPR)
        • Robust and resilient
        • Low entry barrier / ease-of-use
        • Has data-handling / transfer / analysis capability
      • Move sequencing out of genome centres
      • “.... analyse an entire human genome in a single day sitting with a laptop at your local Starbucks.”
      ...cloud services?
    • ...data clouds in the media
    • Clients in the cloud
    • Post-genome decade Human genomes: >24 published & almost 200 unpublished
    • “P4 medicine: predictive, personalised, preventive, participatory.” Leroy Hood – Institute for Systems Biology
      • Each patient’s genome sequenced
      • Your genome is the basis of your medical record
      • New predictive models of health and disease
      • Individualised treatments focusing on preventative therapies
      Image from Scientific American. Genome scale network biology. Genomic data as a commodity.
      • Sage Bionetworks: Integrative genomics
      • Develop predictive models of disease: liver / breast / colon cancer, diabetes, obesity
      • Open data in the Sage Commons
      • Human and mouse: clinical and genetics data
      • Congress San Francisco 23-24 April 2010
      Stephen Friend
    • They have shared their data….
    • … but many researchers don’t share… and are reluctant to re-use data… (Heather Piwowar)
    • Publication and Attribution http://www.flickr.com/photos/digitalfemme57/3271063366/
    • Calls for action, new metrics
      • Journal
      • Article
      • Workflow
      • Data
      • Annotation
      • Concept
      Attribution granularity (Macro, Micro / Nano) ... complexity challenges...
    • Citing network models
      • Multiple data sources
      • Many standards
      • Workflow integration
      • User requirements
      • Service functionality?
    • Pathways to Participation http://www.flickr.com/photos/lemontwist/502860137/sizes/o/
    • Continuum of Openness (diagram labels): Open access; Closed access; Participation; Lone scholar; Professional, experts; Volunteers, interested amateurs; Citizen science; “dark data”. Image licence: Creative Commons Attribution-Non-Commercial-Share Alike 2.0
    • Data Informatics: Logistics dilemma (diagram labels): Professional scientist; Citizens; Capability; Capacity; Data scientists, LIS; Peer production; Volunteers, interested amateurs; Community curation; Observations; Audit; Preservation; Ontologies; Metadata schema; Annotation; Data management plans; Selection & Appraisal; Data cleansing; Training; Visualisation. Image licence: Creative Commons Attribution-Non-Commercial-Share Alike 2.0
    • Peer Production
    • Using gaming to drive curation
    • Professional Scientists Enthusiastic amateurs Training Citizen scientist Standards and ethics Local : natural history, environ. Peer-review Global : astronomy Organisational support Self-supporting
    • Citizen science...
    • Privacy issues? … “participatory urbanism”?
    • “You have zero privacy anyway. Get over it” Scott McNealy, CEO Sun Microsystems, 1999
    • Working with science professionals ...cultural challenges for faculty?
    • Institutions and Informatics University of Edinburgh Informatics Forum http://www.flickr.com/photos/chris_malcolm/2638210422/sizes/l/
    • Open Science at Web-Scale Report 2009
    • Institutional response: High Throughput Biology
      • North Carolina universities
      • Cyber-infrastructure project
      • Data cloud across three campuses
      • “regional”
      • Policy & practice
    • New data support structures
    • Facilitating team science
      • Future Chips
      • Biocomputation & Bioinformatics
      • Tetherless World
      • Integrative Systems Biology
      • Graphic designers?
      • Animators?
      • Social scientists?
      • Legal experts?
    • Embedding data informatics education ...for faculty & LIS...
    • Take homes
      • Data sharing requires pragmatic solutions
      • Attribution granularity & citation complexity
      • We need “the crowd”
      • Institutional strategies embrace informatics
      • The prospects are transformational ...
    • Slides will be available at: http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html http://www.dcc.ac.uk/