Good afternoon! My name is Yulia. I am a researcher and a PhD student at the University of Porto. I am also part of the TAIL project, which deals with research data management. As the name suggests, it is a project focused on the long tail of science.
Today I would like to present a continuation of our work with B2SHARE and B2FIND: a use case with B2NOTE at the end of the RDM workflow.
Currently, there are numerous initiatives to improve research data management and promote the FAIR data principles.
In this context, data reuse is influenced by contextual information that researchers associate with their data, so the description process plays a fundamental role.
However, there are many obstacles to data reuse, such as a lack of time, knowledge or support tools for creating good-quality metadata.
So, the TAIL project is currently developing the Dendro platform, at INESC TEC and the University of Porto, to help researchers organize, describe and prepare their data for transfer to data repositories, for publication and long-term preservation ...
... whether institutional, based on CKAN, or external,
(CL) supported by international initiatives such as EUDAT B2SHARE.
In the context of the Data Pilot established between the TAIL team and EUDAT, we propose an RDM workflow that integrates different tools, namely the Dendro platform, EUDAT B2SHARE and B2NOTE.
Data is organized and described in Dendro; published on B2SHARE and annotated using the B2NOTE service.
In our use case the data was deposited on the training instance of B2SHARE, because the B2NOTE service was only available for testing and was under development.
First of all, the data was described on the Dendro platform, which has four main areas: the user area, the file manager area, the description zone and the descriptor selection zone.
To describe the data, the researcher can choose generic descriptors from Dublin Core or specific descriptors loaded from ontologies developed for domains like Hydrogen Production, Vehicle Simulation and others. Combining these descriptors helps researchers provide more detailed metadata records.
In our use case, the researcher described his data using descriptors from Dublin Core, namely Title, Keywords, Description, License, Type, Format and Language.
When the researcher decides that the data is ready for publication, the data and its metadata are transferred to the B2SHARE repository
making them more visible to the scientific community, following the FAIR data principles.
In the process, all the descriptors recognized by the B2SHARE metadata schema are automatically filled in and displayed in the B2SHARE interface.
The remaining descriptors are included in an RDF file with the complete metadata record.
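To make this step concrete, here is a minimal sketch of the idea, not the actual Dendro export code. The set of recognized descriptors, the example record and the dataset URI are all hypothetical; only the Dublin Core terms namespace is real. The sketch splits a record into the descriptors a repository form would accept and the remaining ones, and writes the complete record as N-Triples.

```python
# Assumption: an illustrative subset of descriptors that the repository form
# accepts. The real B2SHARE metadata schema is not reproduced here.
RECOGNIZED = {"title", "description", "subject", "license", "language"}

DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms namespace


def split_record(record, recognized=RECOGNIZED):
    """Return (mapped, remaining) descriptors for a metadata record."""
    mapped = {k: v for k, v in record.items() if k in recognized}
    remaining = {k: v for k, v in record.items() if k not in recognized}
    return mapped, remaining


def to_ntriples(dataset_uri, record):
    """Serialize the complete record as N-Triples with plain string literals."""
    lines = []
    for name, value in record.items():
        # Escape backslashes, quotes and newlines inside the literal.
        literal = (
            value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        lines.append(f'<{dataset_uri}> <{DCTERMS}{name}> "{literal}" .')
    return "\n".join(lines)


# Hypothetical record using the Dublin Core descriptors named in the talk
# (keywords are expressed with the Dublin Core "subject" term).
record = {
    "title": "Example dataset",
    "subject": "example keyword",
    "description": "Hypothetical description.",
    "license": "CC-BY-4.0",
    "type": "Dataset",
    "format": "text/csv",
    "language": "en",
}

mapped, remaining = split_record(record)
rdf_text = to_ntriples("http://example.org/dataset/1", record)

print(sorted(remaining))  # descriptors that only appear in the RDF file
print(rdf_text)
```

With this hypothetical set, `type` and `format` would not be shown in the repository form, but they are still preserved in the serialized record.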
Now the data are published.
Anyone can locate the data, register on the B2NOTE service and annotate them. This can be done by the author of the data and by project collaborators, but also by any interested researcher who finds the dataset and wants to reuse it.
In general, this service enriches the information associated with datasets by adding informal metadata, such as missing information, comments about the use of the dataset or its limitations; in other words, any information that is considered useful to communicate to others.
Moreover, annotations are saved in a machine-readable format, according to the W3C Web Annotation model, to improve visibility and findability.
(as you have seen) B2NOTE has three types of annotations: semantic tags, free-text keywords, and comments.
In our use case, the researcher needed to add detailed information about a tool used in his project, so he created a comment describing the OpenNLP tool.
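As an illustration of what such a comment can look like in the W3C Web Annotation model, here is a minimal sketch. The property names follow the W3C model, but the identifier, the target URL, the user name and the comment text are hypothetical, and the exact document produced by B2NOTE may differ.

```python
import json


def make_comment(annotation_id, target_url, text, nickname):
    """Build a comment annotation following the W3C Web Annotation model."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": annotation_id,
        "type": "Annotation",
        "motivation": "commenting",
        "creator": {"type": "Person", "nickname": nickname},
        "body": {"type": "TextualBody", "value": text, "format": "text/plain"},
        "target": target_url,
    }


# Hypothetical values, used only to show the shape of the document.
annotation = make_comment(
    annotation_id="http://example.org/annotations/1",
    target_url="http://example.org/records/1/files/data.csv",
    text="Hypothetical comment describing the tool used to produce this file.",
    nickname="example_user",
)

serialized = json.dumps(annotation, indent=2)
print(serialized)
```

Semantic tags and free-text keywords would use the same structure with the `tagging` motivation instead of `commenting`.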
(CL) All annotations created through B2NOTE are visible to all registered users, who can also create their own annotations for each file in this dataset.
(CL) Therefore, it is possible to view both the annotations created by the author of this dataset and all annotations on that file created by other users.
(as you have seen) In addition, registered users can search the annotated file and
(CL) export all annotations to a JSON-LD or RDF file for their own purposes.
Although users can (AND) add their own information about each file in this dataset,
(CL) they are identified by username and can only edit their own annotations.
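The rule just described can be sketched as a simple ownership check. This is only an illustration under assumed names: the `Annotation` class and the `edit_annotation` function are hypothetical and are not part of the B2NOTE code.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    creator: str  # username of the user who created the annotation
    text: str


def edit_annotation(annotation, username, new_text):
    """Update the annotation text only if the requester is its creator."""
    if annotation.creator != username:
        raise PermissionError(
            f"{username} cannot edit an annotation created by {annotation.creator}"
        )
    annotation.text = new_text
    return annotation


note = Annotation(creator="author", text="First version of the comment.")

# The creator can edit the annotation.
edit_annotation(note, "author", "Updated comment.")

# Any other user is rejected and the text stays unchanged.
try:
    edit_annotation(note, "another_user", "Overwritten comment.")
    rejected = False
except PermissionError:
    rejected = True

print(note.text, rejected)
```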
(CL) The author of this dataset can view all additional information about the dataset at any time and reply.
Data reuse is strongly influenced by the contextual information that data creators provide.
The Dendro + B2SHARE + B2NOTE workflow covers important stages of the data lifecycle, making the data easier for others to interpret and, consequently, more reusable.
Thank you for your attention! I would like to invite you to read our paper on this topic.
Moreover, we have a demo version of Dendro for you to explore!