Opendata repository-v2
1. Open data repository for scientific data
sharing with the southern countries
Desconnets J-C., Aventurier P., Banon S., Doucouré C., Coupin T., Soumaré A.
Gaborone, on 8th November 2018
2. Working mainly in partnership with Mediterranean and inter-tropical
countries on the science of global development issues.
Multidisciplinary research: health and society, climate change,
humanitarian and political crises, agriculture and biodiversity,
IRD : French National Research Institute for Sustainable
Development
Publications : 1300/Year
65 Research Units
52 % co-publications with
Southern countries
2048 Agents
Budget : 230 M€/Year
3. General context: French national plan for open science
1 - Generalizing Open Access to publications
2 - Structure research data and make it available through Open Access
3 - Be part of a sustainable European and international Open Science dynamic
http://cache.media.enseignementsup-recherche.gouv.fr/file/Recherche/50/1/SO_A4_2018_EN_01_leger_982501.pdf
OLInFER'2018. IRD Open Science, nov 2018
4. Current practices in the research community: science data life cycle
[Figure: science data life cycle. Data cycle during the project, within the project bounds: research project design, start of project, data management plan, data acquisition, data analysis, scientific paper(s), end of project. Outside the project bounds without a repository: memory lapse, data destruction. Data cycle in the data repository: deposit, storage, identification, description, discovery, sharing, reuse, new papers, new citations; it adds value to the research work.]
5. IRD data repository objectives
First piece for the ecosystem « data management for open
science »
Short term objective (2019-2020)
Provide a service (platform + support + curation) to researchers to control
the dissemination of their data and their preservation
Mid-term objective (2020-2025)
Ensure the discovery of data archived in other repositories, data centres
or research infrastructures (directory function)
6. Specific objectives
Institutional issues
Internally
Responding to the national plan for open science
Improve the knowledge and management of datasets
A first "concrete" element towards an open science policy at the IRD
For our southern partners
Define and control data governance ourselves instead of relying on private repositories
Improve the accessibility of our data to Southern partners
Support open science initiatives in the South (replication of the data
repository, capacity building)
Europe and international level
Meeting the requirements of European programmes
Integrate into European EOSC (European Open Science Cloud)
infrastructures
7. Data repository scope
Targeted data: unstructured, undigitised data stored on PCs, and historical data not linked to internally or externally accessible databases
Observatories, data centres, online databases
[Figure: statistical distribution of research data (Ferguson et al., 2014)]
8. Targeted data and needs
Coming from various scientific domains with different characteristics
Genomics
Exploited marine ecosystems
Marine and agro biodiversity
Health
Environment sciences
With various needs and expectations from researchers
Data rescue, data preservation
Reproducibility of experiments
DOI allocation, data papers
Data sharing beyond the data producers
9. Principles for design
• Limit the scope to data discovery and access facilities
• Make FAIR data available
• Support metadata harvesting through the OAI-PMH protocol
[Figure: a core metadata model (Dublin Core) that uses standard identification systems (DOI, ORCID…), can be harvested, has its metadata values controlled by domain categories, spatial locations and species taxonomies, and is extended with domain-specific metadata standards; the result is extended discovery facilities and interoperability.]
10. And key user requirements
• Digital Object Identifier (DOI) allocation for each dataset
• Flexibility to describe a dataset (enrich the core metadata model with metadata standards coming from a specific domain)
• Possibility to put data management in the hands of researchers: each data folder can be managed by a different administrator
• Publication workflow which allows
• Obtaining a secure temporary link for reviewers of an article related to unpublished data
• Data versioning
• Metrics (downloads, views, guest book)
11. Software for data repository: Dataverse (https://dataverse.org/)
• Open source software, created in 2006 by Harvard University
• Set up a local Dataverse instance and participate in the Dataverse network (CIRAD, INRA, Sciences Po...)
• Integrate an "ecosystem" of interoperable data repositories
12. Dataverse web user interface: data discovery
[Screenshot: Dataverse search page listing data series and data sets, with full text search and faceted search]
14. Typical use of the data repository
• Creation of a repository folder and training of a referent person
• (Research units, projects can create a customizable repository folder "Dataverse", ...)
Description and data deposit by researcher
Researchers deposit, in accordance with the data management plan, a data set in their dedicated repository folder, using standardised formats and metadata to describe their data
Validation of the deposit by a general administrator
Publication of the dataset (open / closed / embargo / only metadata)
FAIR data
15. First element for an open science policy at IRD
IRD open data repository: Spring 2019
Data management plans (DMPs): design and availability of an IRD DMP template
Training courses for researchers
Train scientific data experts and offer real careers in these professions, which will be the bridge between IT and thematic research
Reform the research assessment to encourage data sharing
My talk is about an ongoing project conducted by IRD to set up an open ……
After introducing the context of the project, I will describe its objectives and its scope in terms of data.
I will also introduce the design principles and user requirements that guide its implementation.
A few words about IRD. IRD stands for …
The main role of IRD is to work …
The research units of IRD are involved in multidisciplinary challenges such as ….
Some key figures (if I am not running too long)
The general context of this project is the open science dynamic in general and, in particular, the open science promoted by the European Commission and, since this summer, officially by the French Ministry of Research.
The French government proposes a national plan for open science with a roadmap in three points:
- The data repository is IRD's official answer to point number two of the roadmap.
Another element of the context is the current practices of the scientific community in terms of data sharing and openness.
Currently, research practices are often orthogonal to open data initiatives.
The diagram describes the science data cycle in a simple way.
Generally, research begins with the design of a research project. After funding is allocated, the project starts. One of the first phases is data acquisition or data simulation; the second phase is data analysis, which generally concludes with a scientific paper. In most cases, at the end of the project, the data produced may be forgotten or unusable for any other purpose.
A data repository service could introduce better practices for sustainable data management and will provide functionalities to identify, describe, store and make the data findable for researchers other than the data producer.
So setting up a data repository could really increase the visibility of research work and, ultimately, the data can be used for other research projects.
The main objective of the data repository is to build the first piece of the "data management for open science" ecosystem.
In detail, we have a short-term objective, which is to:
provide… including a web platform, user support and data curation
The mid-term objective is to… which we can call the directory function of our platform
Horizontal axis
Vertical axis
- Genomics, which provides large amounts of data coming from sequencing of the rice genome or palm tree genome, for instance
- Marine: coral data collection
and agro biodiversity
- Health could be epidemiological data, clinical data, Ebola or HIV sequencing
- Data rescue: a researcher is retiring; what happens to the database accumulated during their career?
- Data preservation: the issue is to provide secure and sufficient storage capabilities
- Data sharing beyond the data producers, for valorisation through new citations in data papers or scientific communications
The principles which guide the implementation of the data repository are quite simple:
We want to limit …
Make … according to the FAIR principles
Support metadata harvesting through the OAI-PMH protocol, with a view to the data directory function and to being harvestable by other repositories or search engines
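To make the harvesting principle concrete, here is a minimal sketch of what a harvester does with OAI-PMH: it builds a ListRecords request and reads Dublin Core titles from the response. The base URL, the sample response and the helper names (list_records_url, parse_records) are assumptions made for illustration; this is not the IRD repository's actual endpoint or code. Only the verb, the metadataPrefix parameter and the XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return one dict per record with its identifier and Dublin Core titles."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        titles = [t.text for t in rec.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    return records


# Hypothetical response, shortened to the elements the sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:dataset-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical coral survey</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # The host name is a placeholder, not a real service.
    print(list_records_url("https://repository.example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also follow the resumptionToken that the protocol uses for paging; that part is left out here to keep the sketch short.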
The diagram below shows the approach we chose in terms of interoperability and discovery facilities. The key principle is to have a generic and concise metadata model based on Dublin Core metadata elements.
To ensure semantic consistency of data descriptions, metadata values should be filled in using standard identification systems for organisations, persons and documents.
To ensure semantic discovery, metadata values will be controlled, for instance by lists of domain categories, place names or species taxonomies, according to the scientific domain.
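As an illustration of controlled metadata values, the sketch below checks a Dublin Core-like record against small controlled lists and checks that a person identifier has the shape of an ORCID iD. The lists, the field names (subject, coverage, creator_orcid) and the function name are hypothetical; a real deployment would rely on maintained vocabularies and on the repository software's own validation. The ORCID check covers the format only, not the checksum.

```python
import re

# Hypothetical controlled lists; real ones would come from maintained
# vocabularies (domain categories, gazetteers, species taxonomies).
CONTROLLED = {
    "subject": {"Genomics", "Health", "Environment sciences"},
    "coverage": {"Senegal", "Botswana", "Mediterranean Sea"},
}

# Shape of an ORCID iD: four groups of four characters, the last
# character being a digit or X. The checksum is not verified here.
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def invalid_values(record):
    """Return the sorted (field, value) pairs that break a rule."""
    errors = []
    for field, allowed in CONTROLLED.items():
        for value in record.get(field, []):
            if value not in allowed:
                errors.append((field, value))
    for value in record.get("creator_orcid", []):
        if not ORCID_PATTERN.match(value):
            errors.append(("creator_orcid", value))
    return sorted(errors)


if __name__ == "__main__":
    record = {
        "title": ["Hypothetical rice genome sequencing dataset"],
        "subject": ["Genomics", "Astronomy"],
        "coverage": ["Senegal"],
        "creator_orcid": ["0000-0002-1825-0097"],
    }
    print(invalid_values(record))
```

Free-text fields such as the title are deliberately left unchecked: only the fields used for faceted discovery need controlled values.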
After several end-user meetings, we gathered a first set of requirements.
The main ones are the following:
Possibility to put data management in the hands of researchers: each data space can be managed by a different administrator, so the description of a dataset can be done by someone other than the producer.
Obtaining a secure temporary link for reviewers of an article related to unpublished data: this requirement is increasingly frequent during the review process of a scientific publication.
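The reviewer-link requirement can be pictured with a minimal sketch: an unguessable token is issued for an unpublished dataset and stops working after a given period. This is a generic illustration, not a description of how Dataverse implements the feature; the class name ReviewLinks, its methods and the 30-day default are assumptions.

```python
import secrets
from datetime import datetime, timedelta, timezone


class ReviewLinks:
    """In-memory registry of temporary access tokens (illustrative only)."""

    def __init__(self):
        self._links = {}  # token -> (dataset_id, expiry)

    def create(self, dataset_id, valid_for=timedelta(days=30), now=None):
        """Issue a random token that gives access to one dataset."""
        now = now or datetime.now(timezone.utc)
        token = secrets.token_urlsafe(32)
        self._links[token] = (dataset_id, now + valid_for)
        return token

    def resolve(self, token, now=None):
        """Return the dataset id, or None if the token is unknown or expired."""
        now = now or datetime.now(timezone.utc)
        entry = self._links.get(token)
        if entry is None:
            return None
        dataset_id, expiry = entry
        if now >= expiry:
            del self._links[token]  # expired links are forgotten
            return None
        return dataset_id


if __name__ == "__main__":
    links = ReviewLinks()
    token = links.create("dataset-42", valid_for=timedelta(days=7))
    print(links.resolve(token))
```

The token carries no information about the dataset or the depositor, which also suits anonymous peer review.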
-> Integrate an "ecosystem" of interoperable data repositories. It is essential to be harvested by French and European data repositories or international search engines such as re3data.org, OpenAIRE, DataCite Search and so on.
The Dataverse user interface offers classical functionalities to discover and filter the data repository, with full text search and faceted search.
About faceted search, an interesting detail:
Each domain community can customise the faceted search by selecting its own facet categories, such as spatial coverage for environmental data or in situ observed properties such as meteorological variables.
A second screenshot of the Dataverse user interface, shown once you have chosen a dataset.
Associated data files to download.
When necessary, Dataverse offers various ways to restrict data download or to require users to fill in a guestbook giving the reason why they want to download the dataset.
As a complement to the Dataverse platform, we are working on the organisation around the repository data cycle.
The objective is to have an efficient data management cycle and deliver FAIR data.
The typical use of the data repository that we have defined is the following:
- Publication can be done under various statuses: fully open, closed, under an embargo of a few months or years, with conditions on data use, and so on,
depending on the scientific and legal restrictions on the dataset.
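The deposit, validation and publication steps, together with the publication statuses, can be sketched as a small state model. The class and method names are hypothetical and the rules are a simplified reading of the workflow described here, not the behaviour of the Dataverse software.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Access(Enum):
    OPEN = "open"
    CLOSED = "closed"
    EMBARGO = "embargo"
    METADATA_ONLY = "only metadata"


class Stage(Enum):
    DEPOSITED = 1   # described and deposited by the researcher
    VALIDATED = 2   # checked by a general administrator
    PUBLISHED = 3   # visible and harvestable


@dataclass
class Dataset:
    title: str
    access: Access
    embargo_until: Optional[date] = None
    stage: Stage = Stage.DEPOSITED

    def validate(self):
        """Administrator step: an embargo must carry an end date."""
        if self.access is Access.EMBARGO and self.embargo_until is None:
            raise ValueError("an embargoed dataset needs an end date")
        self.stage = Stage.VALIDATED

    def publish(self):
        """Publication is only allowed after validation."""
        if self.stage is not Stage.VALIDATED:
            raise RuntimeError("dataset must be validated before publication")
        self.stage = Stage.PUBLISHED

    def files_downloadable(self, today):
        """Files are public if open, or once the embargo has ended."""
        if self.stage is not Stage.PUBLISHED:
            return False
        if self.access is Access.OPEN:
            return True
        if self.access is Access.EMBARGO:
            return today >= self.embargo_until
        return False  # closed or metadata only


if __name__ == "__main__":
    ds = Dataset("Hypothetical survey", Access.EMBARGO, date(2020, 1, 1))
    ds.validate()
    ds.publish()
    print(ds.files_downloadable(date(2019, 6, 1)))
```

In every status the metadata stay visible once the dataset is published; only access to the files changes.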
Finally, the content of the data repository will be harvested by search engines or other data repositories.
To conclude the presentation, I want to give some perspectives on this ongoing work.
First of all, the IRD data repository will open next spring (2019).
In the short term, the data repository will be complemented by:
- Design of data management plans, with support for researchers and partners in building them
- We also plan to work on a catalogue of training courses for researchers on computer literacy and data management challenges
We should strengthen the presence at IRD of data scientists in research teams, who are the mediators between computer science and classical research themes.
In the mid term:
We have to work on reforming the research assessment system, in compliance with the San Francisco Declaration on Research Assessment, to give better value to data sharing.
It will be very long-term work.