The life-sciences as a pathfinder in data-intensive research practice
Dr Andrew Treloar, Director of Technology
11 July 2014 CC-BY-SA, @atreloar 1
Structure of presentation
 Research Lifecycles
 Functions of Scholarly Communication
 Pointers to the future
 Characterising the future
 Pathfinder problems
 Conclusions
So many lifecycles…
11 July 2014 CC-BY-SA, @hvdsomp and @atreloar 3
Minimal Research Lifecycle
Think
Do
Share
Sharing: Scholarly Communication System and its Functions
 Registration
 Certification
 Awareness
 Archiving
(Roosendaal and Geurts, 1997)
System of Journals
 Registration
   submission of manuscript
 Certification
   peer-review (pre-publication)
   commentary (post-publication)
 Awareness
   discovery services
 Archiving
   libraries (print)
   publishers (electronic)
   special purpose organisations (e.g. Portico)
Pointers to the future
“the future is already here – it’s
just not very evenly distributed”
William Gibson, NPR interview
Registration: BioRxiv
Registration: Github
Registration: WikiPathways
Registration: NeuroLex
Registration: Nanopublications
Registration: some observations
 Decoupling registration from certification
 Timestamping, versioning
 Registration of various types of objects
 Machines as creators and contributors
Certification: PubMed Commons
Certification: PubPeer
Certification: Publons
Certification: some observations
 Peer-review decoupled from publication process
 Certification of various types of objects
 Machines validating form
 Social endorsement
Awareness: myExperiment
Awareness: eLabNotebook RSS
Awareness: Twitter
Awareness: some observations
 Awareness for various types of objects
 Real time awareness
 Awareness support targeted at machines
 Awareness through social media
Archiving: PDB
Archiving: GenBank
Characterising the future
 Research process: Hidden → Visible
 Nature of object: Fixed → Varying
 Process of making public: Discrete → Continuous
 Speed of communication: Delayed → Instant
 Atomicity of object: Atomic → Compound
 Communicated object: Publication + data proxies → Publication + linked data + linked models
 Nature of process: Formal → Informal
Fundamental changes
 The research process (objects, social dimension) is becoming more exposed
 Articles, books are no longer the only relevant objects for research communication
 Objects are no longer static
 Machines are joining humans as (co-)creators and consumers of research objects
Pathfinder problems
 Integrity of the scholarly record
 The three obsolescences
   hardware
   file format
   software
System of Journals: Archiving
Web of Objects: Archiving?
Not just citation relationships
The problem of obsolescence
 The life-sciences research environment can be viewed as undergoing a process of accelerated evolution
 Other disciplines will hit these problems in time
Cambrian explosion
Hardware obsolescence: Roche 454
Software obsolescence: too much choice, not enough support
Abandonware
 “Last summer, a member of the biology department of the
University of Udine in Italy approached Nicola Vitacolonna
with an intriguing project. The ANREP program, which
annotates structural motifs in gene or protein sequences,
was out of date having been written more than a decade
ago. Although still used by molecular biologists, its slow
computing ability meant a straightforward multiple search
could take all night on a desktop PC. The Udine biologist
wanted Vitacolonna, a postdoctoral fellow in
computational biology, to write a program that could do
the job more quickly.”
 Sam Jaffe, Scientists Abandon their Software, The Scientist, Feb 16, 2004
File format obsolescence: Illumina
 Probability of error in base calling is encoded as ASCII characters to reduce file size
 The meaning of the ASCII codes changed over the life cycle of the platform, so data generated at different times may encode quality differently
 “If you get an error like "Invalid quality score value",
your fastq file probably has Sanger (offset 33) instead
of Illumina (ASCII offset 64) quality scores. You'll need
to add the option "-Q33" to your FASTX Toolkit
arguments”. Obviously…
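
The offset problem quoted above can be made concrete with a minimal Python sketch. It assumes only the two encodings named on the slide (Sanger at ASCII offset 33 and older Illumina at ASCII offset 64) and the standard Phred relation Q = -10 * log10(P). The function names `guess_offset` and `error_probabilities` are hypothetical illustrations and are not part of the FASTX Toolkit or any other tool.

```python
from typing import List

SANGER_OFFSET = 33    # Sanger-style encoding, as named on the slide
ILLUMINA_OFFSET = 64  # older Illumina encoding, as named on the slide


def guess_offset(quality: str) -> int:
    """Heuristically guess the ASCII offset of a FASTQ quality string.

    Assumption: only the two encodings above exist and scores are
    non-negative Phred values. Any character below ASCII 64 would then
    be a negative score under offset 64, so the string must use offset 33.
    """
    lowest = min(ord(ch) for ch in quality)
    if lowest < ILLUMINA_OFFSET:
        return SANGER_OFFSET
    # Ambiguous case: a very high-quality offset-33 read also lands here.
    return ILLUMINA_OFFSET


def error_probabilities(quality: str, offset: int) -> List[float]:
    """Convert a quality string to per-base error probabilities."""
    probs = []
    for ch in quality:
        q = ord(ch) - offset  # Phred score for this base
        if q < 0:
            # The same kind of failure the quoted error message describes.
            raise ValueError(f"Invalid quality score value {q} for {ch!r}")
        probs.append(10 ** (-q / 10))  # invert Q = -10 * log10(P)
    return probs


if __name__ == "__main__":
    read_quality = "IIIIIIII#"  # hypothetical quality line from a FASTQ record
    offset = guess_offset(read_quality)
    print(offset, error_probabilities(read_quality, offset))
```

A single high-quality read can be ambiguous under this heuristic, so a guess made over many reads is likely to be more reliable than one made per record.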
Everett Rogers, Diffusion of Innovations, 1962
Conclusions
 Need to move to a smaller number of standard file formats
 Need to move to a more sustainable model of software development and maintenance
 Need to encourage platform manufacturers to innovate around the hardware, not the software
 NOTE: other disciplines are looking to the life-sciences to work out how to solve some of these problems
On best practices in the development of bioinformatics software, Front. Genet., 02 Jul 14
 Source code available to reviewers
 Software indexed, citable, available
 Source code documented
 Source code managed
 Test libraries, sample data and dataset repositories available
Questions?
 andrew.treloar@ands.org.au
 @atreloar
 https://www.slideshare.net/atreloar/the-lifesciences-as-a-pathfinder-in-dataintensive-research-practice

The life-sciences as a pathfinder in data-intensive research practice

  • 1. The life-sciences as a pathfinder in data- intensive research practice Dr Andrew Treloar, Director of Technology 11 July 2014 CC-BY-SA, @atreloar 1
  • 2. Structure presentation  Research Lifecycles  Functions of Scholarly Communication  Pointers to the future  Characterising the future  Pathfinder problems  Conclusions 11 July 2014 CC-BY-SA, @atreloar 2
  • 3. So many lifecycles… 11 July 2014 CC-BY-SA, @hvdsomp and @atreloar 3
  • 4. Minimal Research Lifecycle Think DoShare 11 July 2014 CC-BY-SA, @atreloar 4
  • 5. Sharing: Scholarly Communication System and its Functions  Registration  Certification  Awareness  Archiving (Rosendaal and Geurts, 1997) 11 July 2014 CC-BY-SA, @hvdsomp and @atreloar 5
  • 6. System of Journals  Registration  submission of manuscript  Certification  peer-review (pre-publication)  commentary (post-publication)  Awareness  discovery services  Archiving  libraries (print)  publishers (electronic)  special purpose organisations (e.g. Portico) 11 July 2014 CC-BY-SA, @hvdsomp and @atreloar 6
  • 7. Pointers to the future “the future is already here – it’s just not very evenly distributed” William Gibson, NPR interview 11 July 2014 CC-BY-SA, @hvdsomp and @atreloar 7
  • 8. Registration: bioRxiv
  • 9. Registration: GitHub
  • 10. Registration: WikiPathways
  • 11. Registration: NeuroLex
  • 12. Registration: Nanopublications
  • 13. Registration: some observations
      • Decoupling registration from certification
      • Timestamping, versioning
      • Registration of various types of objects
      • Machines as creators and contributors
  • 14. Certification: PubMed Commons
  • 15. Certification: PubPeer
  • 16. Certification: Publons
  • 17. Certification: some observations
      • Peer-review decoupled from publication process
      • Certification of various types of objects
      • Machines validating form
      • Social endorsement
  • 18. Awareness: myExperiment
  • 19. Awareness: eLabNotebook RSS
  • 20. Awareness: Twitter
  • 21. Awareness: some observations
      • Awareness for various types of objects
      • Real time awareness
      • Awareness support targeted at machines
      • Awareness through social media
  • 22. Archiving: PDB
  • 23. Archiving: GenBank
  • 24. Characterising the future (diagram: each dimension runs from the first pole to the second)
      • Research process: Hidden to Visible
      • Nature of object: Fixed to Varying
      • Process of making public: Discrete to Continuous
      • Speed of communication: Delayed to Instant
      • Atomicity of object: Atomic to Compound
      • Communicated object: Publication + data proxies to Publication + linked data + linked models
      • Nature of process: Formal to Informal
  • 25. Fundamental changes
      • The research process (objects, social dimension) is becoming more exposed
      • Articles, books are no longer the only relevant objects for research communication
      • Objects are no longer static
      • Machines are joining humans as (co-)creators and consumers of research objects
  • 26. Pathfinder problems
      • Integrity of the scholarly record
      • The three obsolescences: hardware, file format, software
  • 27. System of Journals: Archiving
  • 28. Web of Objects: Archiving?
  • 29. Not just citation relationships
  • 30. The problem of obsolescence
      • The life-sciences research environment can be viewed as undergoing a process of accelerated evolution
      • Other disciplines will hit these problems in time
  • 32. Hardware obsolescence: Roche 454
  • 33. Software obsolescence: too much choice, not enough support
  • 34. Abandonware
      • “Last summer, a member of the biology department of the University of Udine in Italy approached Nicola Vitacolonna with an intriguing project. The ANREP program, which annotates structural motifs in gene or protein sequences, was out of date having been written more than a decade ago. Although still used by molecular biologists, its slow computing ability meant a straightforward multiple search could take all night on a desktop PC. The Udine biologist wanted Vitacolonna, a postdoctoral fellow in computational biology, to write a program that could do the job more quickly.”
      • Sam Jaffe, Scientists Abandon their Software, The Scientist, Feb 16, 2004
  • 35. File format obsolescence: Illumina
      • Probability of error in basecalling is encoded as ASCII characters to reduce file size
      • The meaning of the ASCII codes changed over the platform's life cycle, so data generated at different times may encode quality differently
      • “If you get an error like "Invalid quality score value", your fastq file probably has Sanger (offset 33) instead of Illumina (ASCII offset 64) quality scores. You'll need to add the option "-Q33" to your FASTX Toolkit arguments”. Obviously…
  • 36. Everett Rogers, Diffusion of Innovations, 1962
  • 37. Conclusions
      • Need to move to a smaller number of standard file formats
      • Need to move to a more sustainable model of software development and maintenance
      • Need to encourage platform manufacturers to innovate around the hardware, not the software
      • NOTE: other disciplines are looking to the life-sciences to work out how to solve some of these problems
  • 38. On best practices in the development of bioinformatics software, Front. Genet., 02 Jul 14
      • Source code available to reviewers
      • Software indexed, citable, available
      • Source code documented
      • Source code managed
      • Test libraries, sample data and dataset repositories available
  • 39. Questions?
      • andrew.treloar@ands.org.au
      • @atreloar
      • https://www.slideshare.net/atreloar/the-lifesciences-as-a-pathfinder-in-dataintensive-research-practice
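
Slide 35 describes base-call quality stored as ASCII characters under two different offsets (33 for Sanger, 64 for Illumina). The following is a minimal Python sketch of why the same quality string means different things under each offset. The function names are hypothetical, the Phred relation Q = -10 * log10(p) is the standard definition rather than something stated on the slide, and the offset guess is a simple heuristic for illustration, not how FASTX Toolkit works.

```python
from typing import List

SANGER_OFFSET = 33    # offset named on slide 35 for Sanger-style scores
ILLUMINA_OFFSET = 64  # offset named on slide 35 for Illumina-style scores


def decode_quality(quality: str, offset: int) -> List[int]:
    """Convert a FASTQ quality string to Phred scores for a given ASCII offset."""
    scores = [ord(ch) - offset for ch in quality]
    if min(scores) < 0:
        # A negative score means the string was written with a smaller offset.
        raise ValueError(f"Invalid quality score value for offset {offset}")
    return scores


def error_probability(score: int) -> float:
    """Phred definition: Q = -10 * log10(p), so p = 10 ** (-Q / 10)."""
    return 10 ** (-score / 10)


def guess_offset(quality: str) -> int:
    """Heuristic (assumption): any character below ASCII 64 rules out offset 64.

    A string made only of high characters stays ambiguous; this sketch
    simply falls back to offset 64 in that case.
    """
    lowest = min(ord(ch) for ch in quality)
    return SANGER_OFFSET if lowest < ILLUMINA_OFFSET else ILLUMINA_OFFSET


if __name__ == "__main__":
    quality = "II5+"
    print(decode_quality(quality, SANGER_OFFSET))  # [40, 40, 20, 10]
    print(guess_offset(quality))                   # 33
    try:
        decode_quality(quality, ILLUMINA_OFFSET)
    except ValueError as err:
        print(err)
```

Reading the string with the wrong offset either fails outright, as in the last call, or silently shifts every score by 31, which is the harder failure to notice when data from different time points are combined.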

Editor's Notes

  1. Story that is being told here – might seem initially in pieces, but there is a common thread. Point of first section is broad context for two case studies
  2. Increasingly, Share is bleeding into Do, so let’s zoom in on this
  3. Want to provide a series of snapshots of the future drawn from lifesciences
  4. Sourceforge is another example
  5. DNA variant of NG_000007.3 (hemoglobin), Sardinian population. Provenance: authors of the article from which the nanopub was mined
  6. Content: Post-publication peer review of pubs
  7. Content: Post-publication peer review of pubs
  8. Publons aims to change all that. Members of the site can import papers, rate them, and discuss them. In ongoing discussions, members can endorse reviews. When the endorsements reach a certain threshold, the review gains a digital object identifier (DOI), turning it into an object that can be cited in more traditional academic literature.
  9. Content: Multiple sources checking the validity/classification of data
  10. Content: Multiple sources checking the validity/classification of data
  11. Content: Multiple sources checking the validity/classification of data
  12. Could also have had this for Registration, of course
  13. Content: Multiple sources checking the validity/classification of data
  14. Problem of reproducibility is just part of the problem
  15. Integrity used to be based on reliable archives
  16. Accelerated evolution (again, like Cambrian explosion)
  17. Not supported after 2016
  18. Omictools, Seqanswers. I am reminded a bit of the early days of computing and the proliferation of word processors
  19. One way to think about this problem is in terms of diffusion of innovation
  20. So no pressure then…