iODaV Data Workshop.
iPiC Centre, JKUAT Main Campus, JUJA 19th September 2017
Open Data & JORD Policy
Prof Joseph Muliaro Wafula PhD, FCCS, FCSK.
Chair, iODaV & Director, iCEOD
Jomo Kenyatta University of Agriculture and Technology
Kenya
Towards Open Data/Science
Climate Change Conference- Kyoto University
Japan 2016
Why Data especially in this digital era?
• Science demands that you support your arguments with
evidence/data.
• Open research data are essential for reproducibility and
self-correction.
• Academic publishing has not kept up with the age of digital data.
• Danger of a replication / evidence / credibility gap.
• Open data foster innovation and accelerate scientific discovery
through reuse of data.
• Data for research should be intelligently open: accessible, assessable,
intelligible, usable.
• FAIR: Findable, Accessible, Interoperable, Reusable.
• Publications and data should be open and available concurrently:
it is argued that failing to make data concurrently open is scientific
malpractice.
• Science International Accord on Open Data in a Big Data World:
http://www.science-international.org/ (JKUAT has signed this
accord)
Open Data Guiding Principles-FAIR
• FAIR Data
• Findable: have sufficiently rich metadata and a unique and persistent identifier.
• Accessible: retrievable by humans and machines through a standard protocol;
open and free; authentication and authorization where necessary.
• Interoperable: metadata use a ‘formal, accessible, shared, and broadly
applicable language for knowledge representation’.
• Reusable: metadata provide rich and accurate information; clear usage license;
detailed provenance.
• FAIR Guiding Principles for scientific data management and
stewardship, http://dx.doi.org/10.1038/sdata.2016.18
• Guiding Principles for FAIR Data: https://www.force11.org/node/6062
FAIR Principles
• To be Findable:
• F1. (meta)data are assigned a globally unique and
persistent identifier
• F2. data are described with rich metadata (defined by R1
below)
• F3. metadata clearly and explicitly include the identifier
of the data it describes
• F4. (meta)data are registered or indexed in a searchable
resource
• To be Accessible:
• A1. (meta)data are retrievable by their identifier using a
standardized communications protocol
• A1.1 the protocol is open, free, and universally
implementable
• A1.2 the protocol allows for an authentication and
authorization procedure, where necessary
• A2. metadata are accessible, even when the data are no
longer available
• (Wilkinson, M. D., et al., The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data, 2016, http://dx.doi.org/10.1038/sdata.2016.18)
• To be Interoperable:
• I1. (meta)data use a formal, accessible, shared, and
broadly applicable language for knowledge representation.
• I2. (meta)data use vocabularies that follow FAIR principles
• I3. (meta)data include qualified references to other
(meta)data
• To be Reusable:
• R1. (meta)data are richly described with a plurality of
accurate and relevant attributes
• R1.1. (meta)data are released with a clear and accessible
data usage license
• R1.2. (meta)data are associated with detailed provenance
• R1.3. (meta)data meet domain-relevant community
standards
• (Wilkinson, M. D., et al., The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data, 2016, http://dx.doi.org/10.1038/sdata.2016.18)
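A minimal sketch of how a few of the principles above might be checked against a metadata record. The field names (identifier, title, description, license, provenance) and the DOI-only identifier test are assumptions made for illustration; the FAIR principles do not prescribe them, and real repositories use richer metadata schemas.

```python
import re

# Hypothetical field names mapped to the FAIR principle each one supports.
REQUIRED_FIELDS = {
    "identifier": "F1: globally unique and persistent identifier",
    "title": "F2/R1: rich descriptive metadata",
    "description": "F2/R1: rich descriptive metadata",
    "license": "R1.1: clear and accessible data usage license",
    "provenance": "R1.2: detailed provenance",
}

# Assumption: identifiers are DOIs. F1 also allows other persistent identifiers.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def check_fair_metadata(record):
    """Return a list of human-readable problems found in a metadata record."""
    problems = []
    for field, principle in REQUIRED_FIELDS.items():
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing '{field}' ({principle})")
    identifier = record.get("identifier")
    if isinstance(identifier, str) and identifier.strip():
        if not DOI_PATTERN.match(identifier.strip()):
            problems.append("identifier does not look like a DOI (F1)")
    return problems


# Example record: the license field is deliberately left empty.
example = {
    "identifier": "10.1038/sdata.2016.18",
    "title": "The FAIR Guiding Principles for scientific data management and stewardship",
    "description": "Article describing the FAIR principles.",
    "license": "",
    "provenance": "Published in Scientific Data, 2016.",
}

for problem in check_fair_metadata(example):
    print(problem)
```

A check like this only covers the presence of fields. Whether the metadata are actually rich, accurate and community-compliant (R1, R1.3) still needs human review.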
Opportunities & Challenges for JKUAT/PAUSTI
Open and FAIR Research Data Present Major Opportunities for Universities:
• Research intensive universities will be data intensive universities.
• Supporting researchers’ use of data is a key strategic mission and enabler: a world
class research environment includes support for data stewardship.
• A university’s reputation is increasingly built on all research outputs and wider
societal and economic impact: data is core to this.
• Development of significant data collections by research intensive universities.
Leading departments / research groups will be characterised by excellence in
data, by Open FAIR data collections.
• The contribution to research of both the individual researcher
and the institution will increasingly be measured on the basis of data outputs as
well as research articles.
• Policies are less and less ambiguous: data stewardship and RDM are necessary for grant
funding success.
• Avoid reputational damage through data loss.
Challenges:
• Policy development: unpicking Open
and FAIR data (JKUAT has JORD).
• Supporting data through the
lifecycle.
• Culture and incentives: what’s in it
for us?
• Skills gaps: training and support.
• Technical systems and infrastructure.
• Developing a culture of conscious data
stewardship: what to keep and what
to discard.
• Supporting the long term
stewardship of research data.
• Sustainability and finance.
Boundaries of Open
• For data created with public funds or where there is a strong
demonstrable public interest, Open should be the default.
• As Open as Possible, as Closed as Necessary.
• Proportionate exceptions for:
• Legitimate commercial interests (sectoral variation)
• Privacy (‘safe data’ vs Open data – the anonymisation problem)
• Public interest (e.g. endangered species, archaeological sites)
• Safety, security and dual use (impacts contentious)
• All these boundaries are fuzzy and need to be understood better!
• There is a need to evolve policies, practices and ethics around
closed, shared, and open data.
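The anonymisation problem mentioned above can be illustrated with a small sketch: even after names are removed, a combination of remaining attributes may single out one person. The function below counts the smallest group of rows that share the same values for a chosen set of attributes. The rows, the attribute names and the choice of attributes are all hypothetical, and this is one simple check, not a complete privacy assessment.

```python
from collections import Counter


def smallest_group_size(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    if not rows:
        return 0
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


# Hypothetical records with names already removed.
rows = [
    {"age_band": "30-39", "town": "Juja", "diagnosis": "A"},
    {"age_band": "30-39", "town": "Juja", "diagnosis": "B"},
    {"age_band": "40-49", "town": "Thika", "diagnosis": "A"},
]

# A result of 1 means at least one person is unique on these attributes.
print(smallest_group_size(rows, ["age_band", "town"]))
```

A result of 1 suggests the data are not yet 'safe' to open on those attributes; coarser categories or restricted sharing may then be the proportionate choice.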
Supporting the Research Data Lifecycle
[Figure: research data lifecycle diagram. Stages shown: Plan, Create, Use, Appraise, Publish, Discover, Reuse, Store, Annotate, Select, Discard, Describe, Identify, Hand Over?, Access]
Incentives: Data Citation
Out of Cite, Out of Mind
http://bit.ly/out_of_cite
Joint Declaration of Data Citation
Principles:
https://www.force11.org/datacitation
Background and Developments:
http://bit.ly/data_citation_principles
International Series of Data Citation
Workshops
http://bit.ly/data-citation-workshops
CODATA Task Group on
Data Citation
Principles and Practices
If publications are the stars and
planets of the scientific universe,
data are the ‘dark matter’ –
influential but largely unobserved
in our mapping process
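To make data citation concrete, here is a minimal sketch that assembles a one-line dataset citation from metadata fields. The layout (creators, year, title, version, publisher, identifier) is an assumption chosen for illustration and is not prescribed by the Joint Declaration of Data Citation Principles; the example values, including the DOI, are made up.

```python
def format_data_citation(creators, year, title, publisher, identifier, version=None):
    """Build a one-line dataset citation; the layout is an illustrative assumption."""
    if not creators:
        raise ValueError("at least one creator is required")
    authors = "; ".join(creators)
    version_part = f" (Version {version})" if version else ""
    return (
        f"{authors} ({year}). {title}{version_part} [Data set]. "
        f"{publisher}. https://doi.org/{identifier}"
    )


# Hypothetical dataset; the DOI below is a placeholder, not a real identifier.
citation = format_data_citation(
    creators=["Researcher, A.", "Researcher, B."],
    year=2017,
    title="Example maize yield survey",
    publisher="Example Repository",
    identifier="10.1234/example",
    version="1.0",
)
print(citation)
```

Including a persistent identifier and a version in the citation is what lets a reader locate exactly the data that were used.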
Open Data Policy
Key Objectives:
1. Promote Data publication, preservation and reuse.
2. Promote multi-disciplined research capabilities and activities that are
ICT enabled
3. Accelerate ICT innovation through equipping innovators with
requisite skills and credible and quality data
4. Change the culture from keeping data private to making it public by default
The long end of the tail … has individual scientists’ data
• Much of this revolution is taking place at the top end
– at the head and neck
• Although ‘big data’ is all the rage, the vast majority
of data sets created through research fall into the
“Long Tail”
Source: Wagging the Long Tail, Kathleen Shearer et al., 2014
Best practices for open research data adopted by JKUAT
Data-driven Innovation
Successfully capitalizing on the data
revolution requires public policies
and strategies designed to allow
data-driven innovation to
flourish (WB, 2013).
These policies and strategies will
remove barriers and stimulate the release,
use and impact assessment of
open data (Rininta et al., 2015).
[Diagram labels: Open Data, Policy, Strategy, Action Plan]
Open Data Initiative (ODI) 1: JORD Policy
http://www.jkuat.ac.ke/directorates/iceod/wp-content/uploads/2017/06/JORD-Policy-ISO-ref-April-2016.pdf
JKUAT, with the support of
CODATA, developed and
implemented an open research
data (JORD) policy
(February 2016)
ROI
• Encouragement of diverse studies and opinion
• Promotion of new areas of work not envisioned by the initial investigators
• Development of new products and services
• Strengthened credibility of scholarly publications
ODI 2: Innovative Open Data and Visualization (iODaV) - JKUAT and PAUSTI
The specific objectives of the AFRICA-ai-JAPAN
Project Sub-Task Force
are as follows:
See link: http://www.jkuat.ac.ke/wp-content/uploads/2017/02/Innovation-Research-Grants-AFRICA-ai-JAPAN-Project.pdf
[Diagram: iODaV components: Open Research Data-based Innovation; Data Analytics; Data, Info & Scientific Visualization; Smart Learning - ThinkBoard S/W; Open Data Principles, Stds & JORD; Reuse of Research Data]
ODI 3: DRAFT NATIONAL INFORMATION & COMMUNICATIONS TECHNOLOGY
(ICT) POLICY JUNE 2016 (http://icta.go.ke/pdf/National-ICT-Policy-20June2016.pdf)
• Article 5.10 – Data Centre: The government will:
(a) Promote Data Centre infrastructure buildout carried out in cognizance of globally
approved standards for purposes of ensuring quality of service under an open access,
carrier neutral model;
(b) Develop incentives to ensure and protect investment in the field of data centres;
(c) Facilitate the development and enactment of legislation on localization to support
growth in IT service consumption – as an engine to spur data centre growth;
(d) Ensure that data is processed fairly and lawfully in accordance with the rights of
citizens and obtained only for specific, lawful purposes.
In support of Kenya Open Data Initiative (http://www.opendata.go.ke/)
ODI 3….2
• Article 7.1 – Digital Content
(a) Adopting Open Data principles: in order to share historical/archive data that can be
a rich source for the creative and broadcast industry;
(b) Promoting Animation Labs (A-Lab): Government will support incubation labs focused
on animation & film production that is largely computer generated;
(c) Content Ratings: The Government will develop policies and legislation that take into
consideration age appropriate content that upholds national values.
(d) Copyright Protection: Government will recognise digital content as copyright
material and will actively protect the rights of copyright owners through law
enforcement to prevent digital content piracy.
ODI 3….3
• 15.4 Information Security
The government will develop information security policies and
guidelines to ensure protection of the confidentiality, integrity and
availability of information
ODI 4: DIGITAL HEALTH APPLIED RESEARCH CENTRE
(DHARC) -JKUAT
• DHARC is one of the deliverables of the HIGDA Project, a 5-year project funded by USAID
that started in October 2016.
• DHARC: implementation of interoperability solutions informed by Open Data
principles and standards.
• DHARC will join a network of interoperability labs which have been established in
Canada (2007), South Africa (2010), and the Philippines (2016).
• It will provide examples of how key components (DHIS2, DATIM, MFL ver2, AMRS
and other mHealth solutions) interoperate, providing guidelines-based care
workflows, policies, and M&E mandates.
In 6: Open Data Policy Development
• Open Data policy development needs to be based on the following three pillars:
1. C-context
2. C-content
3. I-impact
Policy Context Pillar
Key factors include:
Level of Gov organization
Key motivations, policy objectives
Open data platform launch
Resource allocation & economic context
Legislation
Social, cultural & Political context
Drivers for open data
Forces against Opening data
Policy Content Pillar
Key factors include:
Licensing
Access fee
Data restriction
Data presentation
Contact with user
Amount published
Processing before publishing
Cost of opening
Types of Data
Data Formats & stds
Data quality
Provision of metadata
Policy Impact Pillar
Key factors include:
Re-use of published data
Possible predicted risks
Benefits aligned with motivation
Public value
Transparency & accountability
Economic growth
Entrepreneurial open data use/ innovation
Efficiency
Environmental sustainability
Inclusion of the marginalized
Key Strategic Pillars of Sustainable Open Data
Programs
Support open data infrastructure built on open data
policies, standards and supportive legal and licensing frameworks
Make data publishing and access available and easy
Create feedback channels for data users
Prioritize datasets that users want
Address quality issues of datasets
Protect privacy rights
Provide clear, consistent, and useful metadata
Open data implementation best practices
• Have an open data policy (e.g. JORD-JKUAT)
• Ensure easy to understand content & formatting
• Release high-value and high-impact data first
• Ensure compatibility and interoperability of systems (e.g. Kenya
health sector DHARC project, USAID/JKUAT)
• Establish data ownership
• Involve stakeholders
• Plan for open data advocacy (e.g. KALRO)
• Implement interaction and feedback mechanisms
• Build communities of data producers and users
• Organize training programs
• Organize hackathons (e.g. CODATA, the AFRICA-ai-JAPAN Project, IBM,
JKUAT and USAID have sponsored hackathons on agriculture and health
sector open data to promote innovation and data use in Kenya)
Data Quality Principles
Designed in consultation with IBM expert (Ben Mann)
JKUAT Open Data Platform
Thank you-Asante Sana
