African Open Science Platform
Geoffrey Boulton
CODATA
Pretoria
December 2016
The digital revolution
storage – analysis – communication
[Figure: Global information storage capacity, in optimally compressed bytes. Analogue storage vs digital storage for 1986, 1993, 2000 and 2007. Labels: 19 Exabytes; 280 Exabytes; "Explosion of the digital revolution"; 2014 - 4000 Exabytes]
Based on: http://www.martinhilbert.net/WorldOnfoCapacity.html
1 Exabyte = 10^18 bytes
The technological bases
for open science
if we choose to use them!
Key Questions
• Why Open Data/Open Science?
• The international context
• What is a Platform, how should it be structured, what is its value?
• How is it governed?
• Role of CODATA & ICSU
International Accord on Open Data
First statement on Open Data by the International Scientific Community
• Identifies the opportunities and challenges of the data revolution as
the dominant issue of policy for science
• Sets out 12 guiding principles for the practice of open data
• Outlines the responsibilities of all stakeholders in supporting such
practice
• Addresses the boundaries of openness, concluding that open data
should be the default position for publicly funded science
EMBL-EBI services
Labs around the world send us their data and we…
• Archive it
• Classify it
• Share it with other data providers
• Analyse, add value and integrate it
• …provide tools to help researchers use it
A collaborative enterprise
Discipline-driven Government-driven
International Systemic Platforms/Commons
European science cloud
CODATA – ICSU COMMISSION ON ONTOLOGIES & METADATA FOR SCIENCE & TECHNOLOGY
International Union of Crystallography
DECADE OF DATA?
The Open Data Iceberg
The Technical Challenge
The Consent Challenge
The Ecosystem Challenge
The Funding Challenge
The Support Challenge
The Skills Challenge
The Incentives Challenge
The Mindset Challenge
[Figure labels: Processes & Organisation; People: motivation and ethos; National/Regional Infrastructure; Technology]
African Open Science Platform
Purpose:
• To provide a federated virtual space for scientists to find, deposit,
manage, share and reuse data, software and metadata
Functions:
• Establishing common principles, policies and practices for data
acquisition and use, and providing the facilitating tools in ways that
are adapted to varying national, disciplinary and application
priorities and approaches.
• Recognising the roles and developing responsibilities of different
actors at all levels in national scientific ecosystems.
• Developing the technical capacities of researchers and data
professionals.
• Creating meaning from data: awareness and access to developing
Functions - 1
a) Principles – Policies – Practices – Tools
• Shared open data principles (Science International Accord on Open Data)
• A computational environment for access, utilisation and storage
• Common digital data compliance model that describes the properties of data
that enable them to be Findable, Accessible, Interoperable and Reusable
(FAIR)
• Publicly available datasets that adhere to accepted principles and practices
• Software services and tools to facilitate access to data and their responsible
use
b) National Science Ecosystems
• Governments: enunciate policy, create incentives
• Funders: treat the costs of open data as part of the costs of doing science; require FAIR data
deposition from the projects they fund; collaborate in Platform evolution
• Universities and Institutes: research data management; capacity building;
research support; incentives for researchers
• Publishers: require concurrent FAIR data deposition
• Researchers: changing the mindset – data custodians not owners
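The FAIR compliance model listed under a) could be approximated by a simple metadata check. The sketch below is illustrative only: the `DatasetRecord` fields and the rule for each FAIR property are assumptions made for this example, not a model defined by the Platform or by the FAIR principles themselves.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; all field names are illustrative."""
    identifier: str = ""      # persistent identifier, e.g. a DOI
    title: str = ""
    keywords: List[str] = field(default_factory=list)
    access_url: str = ""      # where the data can be retrieved
    data_format: str = ""     # e.g. a media type such as "text/csv"
    vocabulary: str = ""      # shared vocabulary or ontology used
    licence: str = ""         # terms of reuse
    provenance: str = ""      # how and by whom the data were produced


def fair_report(record: DatasetRecord) -> Dict[str, bool]:
    """Return one pass/fail flag per FAIR property (assumed rules)."""
    return {
        "findable": bool(record.identifier and record.title and record.keywords),
        "accessible": bool(record.access_url),
        "interoperable": bool(record.data_format and record.vocabulary),
        "reusable": bool(record.licence and record.provenance),
    }


def is_fair(record: DatasetRecord) -> bool:
    """A record passes only if every property check passes."""
    return all(fair_report(record).values())


# Example: a record with no declared vocabulary fails the interoperability check.
example = DatasetRecord(
    identifier="doi:10.0000/example",
    title="Rainfall observations",
    keywords=["rainfall"],
    access_url="https://example.org/data/rainfall.csv",
    data_format="text/csv",
    licence="CC-BY-4.0",
    provenance="Collected by a hypothetical station network",
)
report = fair_report(example)
```

A check like this only tests that metadata fields are present; whether a dataset is genuinely reusable still depends on the quality of what those fields contain.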
Functions - 2
c) Capacity building amongst researchers and data professionals
· Coordination of technical capacity building exercises
· Including scaled-up versions of existing CODATA training workshops
and CODATA/RDA School of Research Data Science in Africa
· Collaboration with disciplinary bodies in offering discipline-specific workshops
· Discussions with universities about their longer-term adoption of data science
curricula
d) Creating Meaning from Data
· Ensuring access to cutting-edge analytic tools
· Matching analytic tools for big data to project purpose
· Using machine learning
· Applying semantic methods to data integration
· Developing/using relevant ontologies and vocabularies for discovery and
integration
· Linking with international efforts in data science and application areas
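As a rough illustration of what applying semantic methods and shared vocabularies to data integration can mean in practice, the sketch below renames the local field names of two datasets to common terms before combining them. The vocabulary, field names and sample rows are all hypothetical; real integration would rely on curated ontologies rather than a plain dictionary.

```python
# Hypothetical shared vocabulary: maps local field names to common terms.
SHARED_VOCABULARY = {
    "temp_c": "air_temperature",
    "temperature": "air_temperature",
    "rain_mm": "precipitation_amount",
    "precip": "precipitation_amount",
    "site": "station_id",
    "station": "station_id",
}


def harmonise(record, vocabulary):
    """Rename a record's fields to shared terms; report fields with no mapping."""
    harmonised = {}
    unmapped = []
    for name, value in record.items():
        term = vocabulary.get(name)
        if term is None:
            unmapped.append(name)
        else:
            harmonised[term] = value
    return harmonised, unmapped


def integrate(datasets, vocabulary):
    """Combine rows from several datasets after harmonising their field names."""
    combined = []
    for rows in datasets:
        for row in rows:
            harmonised, _ = harmonise(row, vocabulary)
            combined.append(harmonised)
    return combined


# Two hypothetical sources describing the same quantities under different names.
source_a = [{"site": "A1", "temp_c": 21.5}]
source_b = [{"station": "B7", "temperature": 19.0, "obs": "x"}]
merged = integrate([source_a, source_b], SHARED_VOCABULARY)
```

Fields without a mapping, such as `obs` here, are dropped from the merged rows but reported by `harmonise`, so gaps in the vocabulary stay visible.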
Pilot Phase
Funded by DST/NRF - Managed by ASSAf - Directed by CODATA
Interim Governance
• Advisory Council
• Technical Advisory Board
Pilot Phase Priorities
• Developing the Partnership
• Developing Governance
• Creating a Roadmap
• Stimulating engagement
