The document discusses openness in 21st century science. It argues that open science, including open data and open collaboration, has benefits like increased scale, statistical power and collaboration. However, open data also presents challenges regarding privacy, security and commercial interests. The document advocates for "intelligent openness" through frameworks that balance openness with protecting privacy and addressing issues like dual use science. When open data is made intelligible, accessible, assessable and usable, it can translate to societal benefits through increased knowledge and innovative applications.
1. Rigour and Openness in 21st Century Science
Sir Mark Walport, Chief Scientific Advisor to HM Government
2. Government Chief Scientific Adviser
Rigour and Openness in 21st Century Science, 11-12th April 2013
• Knowledge translated to economic advantage: Promoting the contribution of science, engineering, technology and the social sciences to economic growth by linking industry, academia and government
• Infrastructure resilience: Developing the capabilities that are vital to the infrastructure that underpins our security, well-being and resilience
• The right science for emergencies: Providing the best scientific advice in the case of emergencies
• Underpinning policy with evidence: Ensuring the best use of quantitative and qualitative analysis across government
• Advocacy and leadership for science: Providing advocacy and strong leadership for science inside and outside government
3. A taxonomy of openness
Stages: Inputs, Research, Outputs
• Administrative data (held by public authorities, e.g. prescription data)
• Public sector research data (e.g. Met Office weather data)
• Research data (e.g. CERN, generated in universities)
• Research publications (i.e. papers in journals)
Diagram labels: Open access; Open data; Open science; Collecting the data; Doing science openly
4. Collecting the Data: professional community working in a different way
Examples:
• Genome project
• Citizen science:
– AshTag
– Galaxy Zoo
Benefits:
• Collaboration
• Scale
• Statistical power
5. Unraveling the genome…
• Human Genome Project
• SNP consortium
• HapMap
• Cancer Genome Project
• Copy Number Variation
• Wellcome Trust Case Control Consortium
• 1000 Genomes
• ENCODE
• UK 10,000 Genomes
• Deciphering Developmental Disorders
• H3Africa
7. Materials Genome Initiative
• Infrastructure to accelerate advanced materials discovery
• Time frame for incorporating new classes of materials into applications is typically about 10 to 20 years from initial research to first use.
• $100M initiative proposes: open innovation; advances in modelling algorithms; a data exchange system.
• Aims to shorten cycle by 50%.
8. Citizen Science
• Openly collected science is already helping policy makers.
• AshTag app allows users to submit photos and locations of sightings of suspected ash dieback to a team who will refer them on to the Forestry Commission, which is leading efforts to stop the disease's spread with the Department for Environment, Food and Rural Affairs (Defra).
Chalara spread: 1992-2012
9. Massively Collaborative Mathematics
• Novel communication technologies are changing the social dynamics of science
• Massively Collaborative Mathematics: Tim Gowers and the Polymath Project.
• A blog serving as an open forum for contributors to work on a complex unsolved mathematical problem: “A new combinatorial proof to the density version of the Hales-Jewett theorem”
• 27 people made more than 800 comments
• In just over a month, the problem was solved
10. Data per se often of little value
• Intelligible
• Accessible
• Assessable
• Useable
11. From data to knowledge to society
• Data
• Information
• Knowledge
• Application
• Societal benefit
12. From data to knowledge to society
Vemurafenib – oral targeted therapy for treatment of metastatic malignant melanoma in patients whose tumors carry the BRAF V600E mutation. Approximately 60% of melanoma patients have tumors that carry this mutation.
13. Openness not an unalloyed good
A challenge to:
• Commerce
• Privacy
• Security
…Need intelligent openness
…Some dilemmas
14. Privately collected data often can be open…
Structural Genomics Consortium
• Not-for-profit public-private partnership to conduct basic science.
• Determine 3D protein structures which may be targets for drug discovery.
• Once such targets are discovered, they are placed in the public domain.
…but not always
- and what about issues of public safety:
- Clinical trials,
- Aircraft safety
15. Dual use science
Developed for one context, used for others
• Civilian/military use
• Science subverted to do harm
• Civilian: developed in one domain – used in others
16. Dual use science
• Flu: Spanish flu sequencing 2005, H5N1
• Avian flu controversy, Netherlands and US teams introducing genetic changes to understand H5N1 transmissibility between species
• Publication in Nature and Science put on hold 2012
• International actors explored the issues:
• Research is not risk free
• Need to get framework right (legislative, ethical, physical)
• Tools to address risks: risk assessment, ethical standards, checks and balances, e.g. peer review
• Self governance and the role of scientists – research funders, science press: putting in place standards and guidance to balance benefits of publication vs potential harm
Image: Science Photo Library
17. The default should be openness
• Where it has worked
• Where lack of openness has caused problems
• Where there are genuine difficulties
18. Gastro-intestinal infection in Hamburg
• E. coli outbreak spread through several countries affecting 4000 people
• Strain analysed and genome released under an open data license.
• Two dozen reports in a week with interest from 4 continents
• Crucial information about strain’s virulence and resistance
19. Climate change transparency
20. Developing world - Indonesian flu
• There are difficulties in ensuring access to data from developing countries.
• Some uneasy at the prospect that those with greater scientific resources will benefit overseas interests, to the detriment of home researchers.
• Indonesia ceased providing access to their flu samples in 2007
• This policy was reversed only after the World Health Organisation put in place protocols for equitable access to vaccines and medicines in future pandemics.
21. Open Access
• Research isn’t finished until it is published
• Maximise impact of research by maximising distribution
• Publication is a cost of research
• IT revolution enables revolution in methods of disseminating the findings of research
22. Open access offers much more than just the ‘paper’
• Massive data sets and metadata
• Data display
• Text mining
• Community annotation and feedback
• Other linkages
23. Rigour
• How we can improve process:
– Scale, collaboration
– Scrutiny
– Enabling reproducibility
• Honest error vs fraud
• The best way forward is through openness
24. ‘Faster than light’ neutrinos
• Muon neutrinos fired 730 km from CERN to the Gran Sasso National Laboratory, central Italy
• Neutrinos seemed to travel faster than the speed of light
• CERN opened the result to broader scrutiny
• More than 200 papers appeared on arXiv.org attempting to debunk or explain the effect
• Team later reported two flaws in equipment set-up: a fiber optic cable attached improperly & a clock oscillator ticking too fast
Image: Science Photo Library
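To give a sense of why a loose cable or a fast clock could matter over the 730 km baseline quoted on the slide, here is a minimal sketch of the timing arithmetic. The baseline comes from the slide and the speed of light is the defined SI value. The 60 ns early arrival is an illustrative assumption, not a figure from the slides.

```python
# Baseline as stated on the slide; the speed of light is the exact SI value.
BASELINE_M = 730_000.0
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

# Hypothetical early arrival, for illustration only (not from the slides).
ASSUMED_EARLY_ARRIVAL_S = 60e-9


def light_travel_time(distance_m: float) -> float:
    """Time in seconds for light to cover distance_m in vacuum."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S


def fractional_speed_excess(distance_m: float, early_s: float) -> float:
    """Apparent (v - c) / c if a particle arrives early_s sooner than light."""
    measured_time = light_travel_time(distance_m) - early_s
    apparent_speed = distance_m / measured_time
    return (apparent_speed - SPEED_OF_LIGHT_M_PER_S) / SPEED_OF_LIGHT_M_PER_S


t_light = light_travel_time(BASELINE_M)
excess = fractional_speed_excess(BASELINE_M, ASSUMED_EARLY_ARRIVAL_S)

print(f"Light travel time over baseline: {t_light * 1e3:.3f} ms")
print(f"Apparent fractional speed excess: {excess:.2e}")
```

Light crosses the baseline in roughly 2.4 ms. Under the assumed offset, a timing error of a few tens of nanoseconds is enough to produce an apparent speed excess of a few parts in 100,000. That is why equipment faults of this kind were plausible explanations for the open scrutiny to find.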
25. Societal benefit
The right open data enables:
• Measurement
• Accountability
• Better execution
• Innovation
26. Harnessing ICT: A national diabetes system for Scotland
Total Scottish Population 5.2M
People with diabetes: 251,132 (4.9%)
People with Type 1 DM: ~27,000 (0.5%)
All patients nationally are registered onto a single register; the SCI-DC register
SCI-DC used in all 38 hospitals
Nightly capture of data from all 1043 primary care practices across Scotland
Courtesy of Andrew Morris
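As an example of the kind of measurement a single national register makes possible, the prevalence figures can be recomputed from the counts on the slide. This is a minimal sketch that uses only the slide's numbers. The variable and function names are illustrative, and the 5.2M population is a rounded value.

```python
# Figures as stated on the slide; the 5.2M population is a rounded value.
population = 5_200_000
people_with_diabetes = 251_132
people_with_type1 = 27_000  # the slide gives this as approximate (~27,000)


def prevalence_percent(cases: int, total: int) -> float:
    """Return cases as a percentage of the total population."""
    return 100.0 * cases / total


diabetes_pct = prevalence_percent(people_with_diabetes, population)
type1_pct = prevalence_percent(people_with_type1, population)

print(f"Diabetes prevalence: {diabetes_pct:.2f}%")
print(f"Type 1 prevalence:   {type1_pct:.2f}%")
```

With the rounded population, overall prevalence comes out at about 4.8% and Type 1 at about 0.5%. The small gap to the slide's 4.9% presumably reflects a different or unrounded denominator, which is an assumption here.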
27. Scottish Diabetes Survey – over 90% capture of key variables since 2007
Chart: Recording of Key Biomedical Markers (y-axis: Percentage of Patients; data recorded within the previous 15 months)
http://www.diabetesinscotland.org.uk/Publications/SDS%202010.pdf
Courtesy of Andrew Morris
28. Improved clinical outcomes
Diabetes Care 2008; Diabetic Medicine 2009
Courtesy of Andrew Morris
29. Privacy and confidentiality
• Personal information is individual & precious to each one of us – it’s vital that we treat it properly
• A balancing act…between the right to privacy and the necessity to hold and share data
• A framework is needed:
- to protect individuals
- build & maintain confidence
- facilitate research
30. Administrative Data Taskforce
• Formed in December 2011, reported in December 2012.
• Examined the best procedures and mechanisms to make administrative data available for research safely
• Aim: to allow research that is already technically feasible to be undertaken with integrity in a much more consistent, reliable and efficient manner
• Value to academic research, but also research and policy evaluation within government departments
Challenge to bring together data sets:
• across government
• and from the private sector
Good policy depends on the best data
31. Administrative Data Taskforce
5 key recommendations:
• An Administrative Data Research Centre (ADRC) should be established in each of the four countries in the UK
• Legislation should be enacted to facilitate research access to administrative data and to allow data linkage between departments to take place more efficiently
• A single UK-wide researcher accreditation process, built on best national and international practice, should be established
• The centres should put in place plans for engaging with the public
• Funding should be provided
Government to respond by Summer 2013
32. Summary
Open science requires:
• Collecting the data in the best way
• Sharing the data openly
• Making the products of research available openly
But open data per se not an unalloyed good. Needs to be:
• Intelligible
• Accessible
• Assessable
• Useable
When we get that right we are able to ensure:
Data → Information → Knowledge → Application → Societal benefit
Editor's Notes
Lots of interchangeable and fluid terms but many shared principles.
Advanced materials are essential to economic security and human well-being, with applications in multiple industries, including those aimed at addressing challenges in clean energy, national security, and human welfare. Accelerating the pace of discovery and deployment of advanced material systems will therefore be crucial to achieving global competitiveness in the 21st century. The Materials Genome Initiative will create a new era of materials innovation that will serve as a foundation for strengthening domestic industries in these fields. This initiative offers a unique opportunity for the United States to discover, develop, manufacture, and deploy advanced materials at least twice as fast as possible today, at a fraction of the cost. At present, the time frame for incorporating new classes of materials into applications is remarkably long, typically about 10 to 20 years from initial research to first use. As today’s scientists and engineers explore a new generation of advanced materials to solve the grand challenges of the 21st century, reducing the time required to bring these discoveries to market will be a key driving force behind a more competitive domestic manufacturing sector and economic growth.
To achieve faster materials development, the materials community must embrace open innovation. Rapid advances in computational modeling and data exchange, along with more advanced algorithms for modeling materials behavior, must be developed to supplement physical experiments, and a data exchange system that allows researchers to index, search, and compare data must be implemented to allow greater integration and collaboration. Currently, no infrastructure exists to allow different engineering teams to share data or models. Data transparency may have the largest impact after the material has been deployed, because every industry relies on materials as components of product design. A product designer who needs a material of certain specifications may not be aware that the material has already been designed because there is no standard method to search for it. Data transparency encourages cross-industry and multidisciplinary applications.
Ash dieback, caused by the fungus Chalara fraxinea, was found in the UK in October outside of plantations and nurseries in East Anglia, raising fears of a repeat of Dutch elm disease, which killed 25 million mature elms in the 1970s and 80s. In an attempt to map and help prevent the spread of the disease across the country, a team of developers and academics worked through the weekend to create an app that smartphone owners can use to report suspected cases of infection. Infected ash trees are recognisable by lesions on their bark, dieback of leaves at the tree's crown, and leaves turning brown – though experts say the arrival of autumn makes the latter harder to accurately spot. The AshTag app for iOS and Android devices allows users to submit photos and locations of sightings to a team who will refer them on to the Forestry Commission, which is leading efforts to stop the disease's spread with the Department for Environment, Food and Rural Affairs (Defra).
Novel communication technologies permit modes of interaction that change the social dynamics of science and exploit the collective intelligence of the scientific community. Free online resources and search engines have become integral to science in ways that have replaced the library as a source of information, searches and cataloguing. New tools, for example myExperiment, offer much enhanced abilities to share and execute scientific workflows. Live and open debate played out via wikis and blogs has changed the dynamic of academic discussion – sometimes in extreme ways. In January 2009 Tim Gowers, an eminent mathematician and recipient of the Fields Medal, launched the Polymath Project, a blog serving as an open forum for contributors to work on a complex unsolved mathematical problem. He posed the question: “Is massively collaborative mathematics possible?” He then set out the problem, his ideas about it and an invitation for others to contribute to its solution. Twenty-seven people made more than 800 comments, rapidly developing or discarding emerging ideas. In just over a month, the problem was solved. Together they solved not only the core problem, but a harder generalisation of it. In describing this, Gowers said, “It felt like the difference between driving a car and pushing it”.
Fold.it: The fold.it website offers participants a game in which players solve intricate puzzles, figuring out the ways in which amino acids fold to create different protein molecules.
Galaxy Zoo: Galaxy Zoo enables users to participate in the analysis of the imagery of hundreds of thousands of galaxies drawn from NASA’s Hubble Space Telescope archive and the Sloan Digital Sky Survey.
BOINC: Numerous Citizen Science projects employ so-called volunteer computing, where individuals provide the resources of their home computers to contribute to big science research. Today there are over 50 active projects based on the BOINC platform developed at the University of California, Berkeley.
The changes that are needed go to the heart of the scientific enterprise and are much more than a requirement to publish or disclose more data. Realising the benefits of open data requires effective communication through a more intelligent openness: data must be accessible and readily located; they must be intelligible to those who wish to scrutinise them; data must be assessable so that judgments can be made about their reliability and the competence of those who created them; and they must be usable by others. For data to meet these requirements they must be supported by explanatory metadata (data about data). As a first step towards this intelligent openness, data that underpin a journal article should be made concurrently available in an accessible database. We are now on the brink of an achievable aim: for all science literature to be online, for all of the data to be online and for the two to be interoperable.
The Structural Genomics Consortium (SGC) is a not-for-profit public-private partnership to conduct basic science. Its main goal is to determine 3D protein structures which may be targets for drug discovery. Once such targets are discovered, they are placed in the public domain. By collaborating with the SGC, pharmaceutical companies save money by designing medicines that they know will ‘fit’ the target. The SGC was initiated through funding from the Wellcome Trust, the Canadian Institutes of Health Research, Ontario Ministry of Research and Innovation and GlaxoSmithKline. More recently other companies (Novartis, Pfizer and Eli Lilly) have joined this public-private partnership. The group of funders recently committed over US$50 million to fund the SGC for another four years.
Other examples:
InnoCentive: InnoCentive is a service for problem solving through crowdsourcing. Companies post challenges or scientific research problems on InnoCentive’s website, along with a prize for their solution. More than 140,000 people from 175 countries have registered to take part in the challenges, and prizes for more than 100 Challenges have been awarded. Institutions that have posed challenges include Eli Lilly, NASA, nature.com, Procter & Gamble, Roche and the Rockefeller Foundation.
Rolls-Royce University Technology Centres
Imanova: Imanova is an innovative alliance between the UK’s Medical Research Council and three London universities: Imperial College, King’s College and University College. It trains scientists and physicians, and hopes to become an international partner for pharmaceutical and biotechnology companies.
Syngenta: The Technology Strategy Board and Syngenta are building a system that helps scientists visualise the similarities between the molecules in ChEMBL, an openly accessible drug discovery database of over one million drug-like small compounds, and those in their own research.
The benefits of intelligently open data were powerfully illustrated by events following an outbreak of a severe gastro-intestinal infection in Hamburg, Germany, in May 2011. This spread through several European countries and the US, affecting about 4000 people and resulting in over 50 deaths. All tested positive for an unusual and little-known Shiga-toxin–producing E. coli bacterium. The strain was initially analysed by scientists at BGI-Shenzhen in China, working together with those in Hamburg, and three days later a draft genome was released under an open data licence. This generated interest from bioinformaticians on four continents. 24 hours after the release of the genome it had been assembled. Within a week two dozen reports had been filed on an open-source site dedicated to the analysis of the strain. These analyses provided crucial information about the strain’s virulence and resistance genes – how it spreads and which antibiotics are effective against it. They produced results in time to help contain the outbreak. By July 2011, scientists published papers based on this work. By opening up their early sequencing results to international collaboration, researchers in Hamburg produced results that were quickly tested by a wide range of experts, used to produce new knowledge and ultimately to control a public health emergency.
Wikipedia: A novel strain of Escherichia coli O104:H4 bacteria caused a serious outbreak of foodborne illness focused in northern Germany in May through June 2011. The illness was characterized by bloody diarrhea, with a high frequency of serious complications, including hemolytic-uremic syndrome (HUS), a condition that requires urgent treatment. The outbreak was originally thought to have been caused by an enterohemorrhagic (EHEC) strain of E. coli, but it was later shown to have been caused by an enteroaggregative E. coli (EAEC) strain that had acquired the genes to produce Shiga toxins.
Epidemiological fieldwork suggested fresh vegetables were the source of infection. The agriculture minister of Lower Saxony identified an organic farm[2] in Bienenbüttel, Lower Saxony, Germany, which produces a variety of sprouted foods, as the likely source of the E. coli outbreak.[3] The farm has since been shut down.[3] Although laboratories in Lower Saxony did not detect the bacterium in produce, a laboratory in North Rhine-Westphalia later found the outbreak strain in a discarded package of sprouts from the suspect farm.[4] A control investigation confirmed the farm as the source of the outbreak.[5] On 30 June 2011 the German Bundesinstitut für Risikobewertung (BfR, Federal Institute for Risk Assessment), an institute of the German Federal Ministry of Food, Agriculture and Consumer Protection, announced that seeds of fenugreek imported from Egypt were likely the source of the outbreak.[6] In all, 3,950 people were affected and 53 died, including 51 in Germany.[7] A handful of cases were reported in several other countries including Switzerland,[8] Poland,[8] the Netherlands,[8] Sweden,[8] Denmark,[8] the UK,[8][9] Canada[10] and the USA.[10][11] Essentially all affected people had been in Germany or France shortly before becoming ill.
Initially German officials made incorrect statements on the likely origin and strain of Escherichia coli.[12][13][14][15] The German health authorities, without results of ongoing tests, incorrectly linked the O104 serotype to cucumbers imported from Spain.[16] Later, they recognised that Spanish greenhouses were not the source of the E. coli and cucumber samples did not contain the specific E. coli variant causing the outbreak.[17][18] Spain consequently expressed anger about having its produce linked with the deadly E. coli outbreak, which cost Spanish exporters US$200M per week.[19] Russia banned the import of all fresh vegetables from the European Union until 22 June.[20]
There are difficulties in ensuring access to data from developing countries. Whereas some are developing open access journals (for example the journal African Health Sciences [32]), others are uneasy at the prospect that those with greater scientific resources will benefit overseas interests, to the detriment of home researchers. Indonesia ceased providing access to its flu samples in 2007 because of worries that more scientifically developed countries would create flu vaccines based on its data, with no benefit to Indonesia. This policy was reversed only after the World Health Organisation put in place protocols for equitable access to vaccines and medicines in future pandemics.
Recent developments at the OPERA collaboration at CERN illustrate how data openness can help in the scrutiny of scientific results. The OPERA team fired a beam of muon neutrinos from CERN to the Gran Sasso National Laboratory, 730 km away in central Italy. In September 2011, and to the surprise of the experiment’s scientists, the neutrinos seemed to travel faster than the speed of light – understood to be a universal speed limit. Hoping for ideas to explain this apparent violation of physical law, CERN opened the result to broader scrutiny, uploading the results in unprecedented detail to the physics pre-print archive, arXiv.org. More than 200 papers appeared on arXiv.org attempting to debunk or explain the effect. A large group of papers focused on the technique used to time the neutrinos’ flight. On 23 February 2012, the OPERA collaborators announced two potential sources of timing error. There was a delay in the stop and start signals sent via GPS to the clock at Gran Sasso due to a faulty fibre optic cable, and there was a fault inside the master clock at Gran Sasso. It was announced in June 2012 that attempts to replicate the original result with four separate instruments at Gran Sasso found that neutrinos respected the universal speed limit, confirming the suspected experimental error.
Privacy
The use of datasets containing personal information is vital for much research in the medical and social sciences, but poses considerable challenges for information governance because of the potential to compromise individual privacy. Citizens have a legitimate interest in safeguarding their privacy by avoiding personal data being used to exploit, stigmatise or discriminate against them or to infringe on their personal autonomy (see box 3.4). [159] The legal framework for the “right to respect for private and family life” is based on article 8 of the European Convention on Human Rights (ECHR) [160] for member states of the Council of Europe. Some aspects of privacy rights are codified by the EU Data Protection Directive (95/46/EC) and implemented in the UK by the Data Protection Act 1998 (DPA).
There is a live issue with the EU Data Protection Regulation: the research community is very concerned that amendments proposed by the rapporteur of the LIBE committee will prevent or severely impair scientific research studies using personal data. BIS and DH are working with MoJ (who are in the lead on negotiations) to address this.
There has been a series of public perception flashpoints, e.g. the 2 CDs holding 25m items of child benefit data (2007), Google Maps and Street View.
Improving access for research and policy
The Administrative Data Taskforce (ADT) was formed in December 2011 by the Economic and Social Research Council (ESRC), the Medical Research Council (MRC) and Wellcome Trust, and chaired by Sir Alan Langlands. The ADT has been working with a range of government departments, academic experts, the funding agencies and representatives from all four nations in the UK to examine the best procedures and mechanisms to make administrative data available for research safely. The report from the ADT was published in December 2012. The ADT recommendations propose a UK Administrative Data Research Network that would be responsible for linking data between government departments. The proposed network will provide a single governance structure that will allow for consistent and robust decision-making.
"Our recommendations would allow research that is already technically feasible to be undertaken with integrity in a much more consistent, reliable and efficient manner. This would be of huge value to academic research, but would also benefit research and policy evaluation within government departments, whose researchers are also constrained by the existing arrangements." - Professor Paul Boyle, Chief Executive of the ESRC