Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
A revolution in
open science?
- open data & the
role of libraries
- Geoffrey Boulton
LIBER
Munich
June 2013
Open communication of data: the source of a
scientific revolution and of scientific progress
Henry Oldenburg
Problems & opportunities in the data deluge
1020 bytes
Available storage
The challenge to Oldenburg’s principle:
- a crisis of replicability and credibility?
A fundamental principle: the data pro...
The opportunity:
new knowledge from data
(some technical opportunities & priorities)
Exploiting the potential
of linked da...
BUT - its not just sharing and
integrating data – its also what we do with it!
Jim Gray - “When you go and look at what sc...
Benefits of open communication of data
• Greater benefit to the individual than hugging their own data (e.g.
bioinformatic...
Benefits 1 : data sharing in ethos & practice
– the example of bio-informatics
ELIXIR Hub (European Bioinformatic Institut...
• E-coli outbreak spread through
several countries affecting 4000 people
• Strain analysed and genome
released under an op...
“Scientific fraud is rife: it's time to stand up for good science”
“Science is broken”
Examples:
 psychology academics ma...
Mathematics related discussions
Tim Gowers
- crowd-sourced mathematics
An unsolved problem posed on
his blog.
32 days – 27...
• Openly collected science is already helping policy
makers.
• AshTag app allows users to submit photos and
locations of s...
•
Benefits 6: The Economy
Benefits 7: Openness to public scrutiny
Benefits 8: Global challenges
Boundaries of openness?
Openness should be the default position, with
proportional exceptions for:
• Legitimate commercial...
Openness of data per se has little value.
Open science is more than disclosure
For effective communication, replication an...
Responsibilities & actions
• Scientists: - changing the mindset
• Learned Societies: - influencing their communities
• Uni...
A data management ecology?
The role of the top-down
and the bottom-up?
Massive data loss
Sum (little science data) >
Sum (...
Can libraries rise to the challenges of a post-
Gutenberg world?
“Libraries do the wrong things, employ the wrong people”
...
A taxonomy of openness
Inputs Outputs
Open access
Administrative
data (held by
public
authorities e.g.
prescription
data)
...
A realiseable aspiration: all scientific
literature open & online,
all data open & online, and for them to
interoperate
… ...
www.royalsociety.org
Upcoming SlideShare
Loading in …5
×

A Revolution in Open Science: Open Data and the Role of Libraries (Professor Geoffrey Boulton at LIBER 2013)

This talk was given by Prof. Geoffrey Boulton of the University of Edinburgh at LIBER's 42nd annual conference in Munich. Here is a brief summary: "The data storm that has been unleashed by novel means of data acquisition, manipulation and their instantaneous communication have posed both great challenges and opportunities for science. The challenge is to maintain scientific self-correction, which depends on concurrent publication of concepts and the underlying evidence. The opportunity is to exploit massive and complex data volumes in creating new knowledge. Both are non-trivial tasks. The former requires ‘intelligent openness‘."

"The latter requires new ways of thinking and new forms of collaboration, which make major demands on scientists, their institutions, those that fund science and those who publish it. Open access publishing is important, but open data is fundamental to scientific progress."

"In a post-Gutenberg era, can the library maintain its historic role as an efficient repository of scientific knowledge? Can it provide support for the creation of new knowledge? What responsibilities should it discharge, and how? What skills are required by those discharging the library function? And how do we achieve a realisable objective, of having all the publications online, all the data online, and for the two to be interoperable?"

Learn more about LIBER at www.libereurope.eu

  • Be the first to comment

A Revolution in Open Science: Open Data and the Role of Libraries (Professor Geoffrey Boulton at LIBER 2013)

  1. 1. A revolution in open science? - open data & the role of libraries - Geoffrey Boulton LIBER Munich June 2013
  2. 2. Open communication of data: the source of a scientific revolution and of scientific progress Henry Oldenburg
  3. 3. Problems & opportunities in the data deluge 1020 bytes Available storage
  4. 4. The challenge to Oldenburg’s principle: - a crisis of replicability and credibility? A fundamental principle: the data providing the evidence for a published concept MUST be concurrently published, together with the metadata But what about the vast data volumes that are not used to support publication as well as those that are?
  5. 5. The opportunity: new knowledge from data (some technical opportunities & priorities) Exploiting the potential of linked data requires: • data integration • dynamic data Solutions/agreements are needed for: • provenance • persistent identifiers • standards • data citation formats • algorithm integration • file-format translation • software-archiving • automated data reading • metadata generation • timing of data release
  6. 6. BUT - its not just sharing and integrating data – its also what we do with it! Jim Gray - “When you go and look at what scientists are doing, day in and day out, in terms of data analysis, it is truly dreadful. We are embarrassed by our data!” ….. and we need a new breed of informatics-trained data scientist as the new librarians of the post- Gutenberg world but will they be in the Library? So what are the priorities? 1. Ensuring valid reasoning 2. Innovative manipulation to create new information 3. Effective management of the data ecology 4. Education & training in data informatics & statistics
  7. 7. Benefits of open communication of data • Greater benefit to the individual than hugging their own data (e.g. bioinformatics) • Permits faster responses to emergencies (e.g. Hamburg 2011 E. Coli outbreak) • Deterrent to fraud (system integrity as well as personal integrity) • Stimulates novel and creative collaborations (e.g. Tim Gowers & “crowd-sourcing in mathematics) • Support for citizen science movement ( e.g. Galaxy Zoo, Fold-it, Ash-Tag, etc) • Demands by citizens for access (e.g. “Climategate”) • Efficient responses to global challenges
  8. 8. Benefits 1 : data sharing in ethos & practice – the example of bio-informatics ELIXIR Hub (European Bioinformatic Institute) and ELIXIR Nodes provide infrastructure for data, computing, tools, standards and training.
  9. 9. • E-coli outbreak spread through several countries affecting 4000 people • Strain analysed and genome released under an open data license. • Two dozen reports in a week with interest from 4 continents • Crucial information about strain’s virulence and resistance Benefits 2: Response to Gastro-intestinal infection in Hamburg
  10. 10. “Scientific fraud is rife: it's time to stand up for good science” “Science is broken” Examples:  psychology academics making up data,  anaesthesiologist Yoshitaka Fujii with 172 faked articles  Nature - rise in biomedical retraction rates overtakes rise in published papers Cause: Rewards and pressures promote extreme behaviours, and normalise malpractice (e.g. selective publication of positive novel findings) Cures: Open data for replication Transparent peer review Not just personal integrity – but system integrity Benefits 3: Response to fraud
  11. 11. Mathematics related discussions Tim Gowers - crowd-sourced mathematics An unsolved problem posed on his blog. 32 days – 27 people – 800 substantive contributions Emerging contributions rapidly developed or discarded Problem solved! “Its like driving a car whilst normal research is like pushing it” What inhibits such processes? - The criteria for credit and promotion. Benefits 4: Opening-up science: e.g. crowd-sourcing
  12. 12. • Openly collected science is already helping policy makers. • AshTag app allows users to submit photos and locations of sightings to a team who will refer them on to the Forestry Commission, which is leading efforts to stop the disease's spread with the Department for Environment, Food and Rural Affairs (Defra). Chalara spread: 1992-2012 Benefits 5: Citizen Science
  13. 13. • Benefits 6: The Economy
  14. 14. Benefits 7: Openness to public scrutiny
  15. 15. Benefits 8: Global challenges
  16. 16. Boundaries of openness? Openness should be the default position, with proportional exceptions for: • Legitimate commercial interests (sectoral variation) • Privacy (completely anonymised data is impossible) • Safety, security & dual use (impacts contentious) All these boundaries are fuzzy
  17. 17. Openness of data per se has little value. Open science is more than disclosure For effective communication, replication and re-purposing we need intelligent openness. Data and meta-data must be: • Accessible • Intelligible • Assessable • Re-usable Only when these four criteria are fulfilled are data properly open. But, intelligent openness must be audience sensitive. Intelligent openness to fellow citizens normally makes far greater demands than openness to peers – we should prioritise “PUBLIC INTEREST SCIENCE” for the former.
  18. 18. Responsibilities & actions • Scientists: - changing the mindset • Learned Societies: - influencing their communities • Universities/Insts: - responsibility for the knowledge they produce? - incentives & promotion criteria - proactive, not just compliant - strategies (e.g. the library) - management processes • Funders of research: - mandate intelligent openness - accept diverse outputs - cost of open data is a cost of science - strategic funding for technical solutions (a priority for international collaboration) • Publishers: - mandate concurrent open deposition
  19. 19. A data management ecology? The role of the top-down and the bottom-up? Massive data loss Sum (little science data) > Sum (big science data)?
  20. 20. Can libraries rise to the challenges of a post- Gutenberg world? “Libraries do the wrong things, employ the wrong people” People • Funders mandate novel customers – the public • Can they attract data scientists? • Support for researchers & students Things • Reversing centralisation • A data repository – directory - metadata – background • Dynamic data • Selection problem • Compliant or proactive?
  21. 21. A taxonomy of openness Inputs Outputs Open access Administrative data (held by public authorities e.g. prescription data) Public Sector Research data (e.g. Met Office weather data) Research Data (e.g. CERN, generated in universities) Research publications (i.e. papers in journals) Open data Open science Collecting the data Doing research Doing science openly Researchers - Citizens - Citizen scientists – Businesses – Govt & Public sector Science as a public enterprise
  22. 22. A realiseable aspiration: all scientific literature open & online, all data open & online, and for them to interoperate … but, this is a process, not an event!
  23. 23. www.royalsociety.org

    Be the first to comment

    Login to see the comments

  • SarahG_SS

    Jul. 5, 2013
  • keita.bando

    Jul. 5, 2013
  • mishdalton

    Jul. 6, 2013
  • m_akbulut

    Jul. 6, 2013
  • NicolasLoubet

    Jul. 6, 2013
  • bobb81

    Jul. 8, 2013
  • scholz

    Jul. 8, 2013
  • albertoconti

    Jul. 8, 2013
  • mondaugen

    Jul. 10, 2013
  • carlosqg

    Jul. 11, 2013
  • rutsi

    Jul. 13, 2013
  • e_zazani

    Jul. 15, 2013
  • heypeggy

    Jul. 18, 2013
  • PaolaPala

    Jul. 24, 2013
  • TetianaYaroshenko

    Jul. 24, 2013
  • Federico_Morando

    May. 23, 2014
  • kosson

    Dec. 29, 2014
  • iradche

    Dec. 4, 2016
  • inforan

    Feb. 12, 2017
  • MichaelFagbohun1

    Nov. 26, 2018

This talk was given by Prof. Geoffrey Boulton of the University of Edinburgh at LIBER's 42nd annual conference in Munich. Here is a brief summary: "The data storm that has been unleashed by novel means of data acquisition, manipulation and their instantaneous communication have posed both great challenges and opportunities for science. The challenge is to maintain scientific self-correction, which depends on concurrent publication of concepts and the underlying evidence. The opportunity is to exploit massive and complex data volumes in creating new knowledge. Both are non-trivial tasks. The former requires ‘intelligent openness‘." "The latter requires new ways of thinking and new forms of collaboration, which make major demands on scientists, their institutions, those that fund science and those who publish it. Open access publishing is important, but open data is fundamental to scientific progress." "In a post-Gutenberg era, can the library maintain its historic role as an efficient repository of scientific knowledge? Can it provide support for the creation of new knowledge? What responsibilities should it discharge, and how? What skills are required by those discharging the library function? And how do we achieve a realisable objective, of having all the publications online, all the data online, and for the two to be interoperable?" Learn more about LIBER at www.libereurope.eu

Views

Total views

8,729

On Slideshare

0

From embeds

0

Number of embeds

3,298

Actions

Downloads

0

Shares

0

Comments

0

Likes

20

×