Leonard&Dhollander_OpenScienceBelgium.pptx


Presentation at the “Open Science: connecting the actors” event on the 21st of November 2022: Share best practices, foster community, and encourage knowledge-sharing on Open Science. At the heart of the Open Access Belgium community is the ambition to open up the way we organize and conduct scientific research. The Open Science teams of the Belgian universities have developed and tested a wide range of training methods, training materials, networking activities and data solutions to facilitate and foster Open Science. Achievements, tools and lessons learned by different institutions will be shared in this networking event. Programme can be found here: https://openaccess.be/2022/10/04/open-science-connecting-the-actors/


POST-INGEST CURATION: CURATING WITHOUT AN INSTITUTIONAL REPOSITORY
Kevin Leonard - Evelien Dhollander

Data Curation Requires Datasets
• Data curation: adding value to (meta)data for long-term preservation
• Imagined (ideal) workflow:
  1. Researcher provides data to curator for curation
    • Voluntary submission
    • Automatic part of ingest in institutional repository
  2. Curator makes changes and recommendations
  3. Data is put online for long-term preservation
• Is that realistic for many institutions?

What Often Happens to Datasets, Really?
[Figure: Scientist and Curator]

Proposed Workflow
1. Find datasets online
  • Employ existing data linking architectures
  • Use repository APIs
2. Produce (meta)data augmentation plan for discovered datasets
  • Develop plan based on current best practices for FAIR metadata
  • Recommend changes that maintain existing DOI networks
3. Provide researchers with an easily actionable curation plan

Step 1: Where Are The Datasets?
• Difficulties:
  • Datasets are broadly distributed
  • Affiliation information is not stored in a consistent location (or format!)
  • Existing data linking systems (e.g., Scholix, DataCite) have limited coverage
• Solution:
  • Use repository APIs to search for institutional datasets
  • Search beyond just the <creator><affiliation> field

Example Python Code
• Python code to search for institutional records
• searchQuery can include multiple items
  • Universiteit Gent
  • UGent
  • Ghent University
  • 00cv9y106 (ROR id)
• Saves DOIs of all datasets to csv
• Can use OAI-PMH to extract more metadata information
• Focused on several popular repositories, easily extended
  • Zenodo, OSF, Dryad, Figshare, PANGAEA
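
A minimal sketch of this kind of search, not the authors' actual script. It assumes Zenodo's public records search endpoint and its `q`, `type`, `page` and `size` parameters, and it assumes each hit carries its DOI under `doi` or `metadata.doi`. The function names and the CSV layout are hypothetical. Other repositories would need their own URL builder and response parser.

```python
import csv
import io
import json
import urllib.parse
import urllib.request

# Search terms listed on the slide; the ROR id is matched as plain text here.
SEARCH_TERMS = ["Universiteit Gent", "UGent", "Ghent University", "00cv9y106"]

# Assumed endpoint: Zenodo's public records search API.
ZENODO_API = "https://zenodo.org/api/records"


def build_search_query(terms):
    """Join terms into one OR query, quoting each so phrases stay intact."""
    return " OR ".join('"{}"'.format(term) for term in terms)


def build_search_url(terms, page=1, size=100):
    """Build a full-text dataset search URL (parameter names are assumptions)."""
    params = {
        "q": build_search_query(terms),
        "type": "dataset",
        "page": page,
        "size": size,
    }
    return ZENODO_API + "?" + urllib.parse.urlencode(params)


def fetch_page(url):
    """Perform the HTTP request and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def extract_dois(response):
    """Collect unique DOIs from a decoded search response, keeping first-seen order."""
    seen = set()
    dois = []
    for hit in response.get("hits", {}).get("hits", []):
        doi = hit.get("doi") or hit.get("metadata", {}).get("doi")
        if doi and doi not in seen:
            seen.add(doi)
            dois.append(doi)
    return dois


def write_dois_csv(dois, handle):
    """Write one DOI per row, under a single 'doi' header, to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["doi"])
    for doi in dois:
        writer.writerow([doi])
```

A full-text query of this kind matches the search terms wherever the repository indexes them, which is one way to look beyond the creator affiliation field. A harvest would loop over `fetch_page(build_search_url(SEARCH_TERMS, page=n))` until a page returns no hits, then pass the collected DOIs to `write_dois_csv`.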

Step 2: What To Do With What You’ve Found
• Repositories often allow metadata fields to be edited
  • WITHOUT triggering the creation of a new version (and therefore a new DOI)
• Editable fields vary by repository:
  • STRICT: Editing any metadata fields creates a new version
  • LENIENT: Most fields can be edited
    • Title, authors, relatedTO

Develop Recommendation Plan
• Is the title clear?
• Are keywords provided?
• Are there links to related publications?
• Do the authors have linked ORCIDs or affiliations?
• Is there sufficient documentation in a README?
  • Can this information be provided in <description descriptionType="Abstract">?
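
The questions above can be turned into a first-pass automated check, sketched below. This is an illustration under stated assumptions, not the authors' method: the record is a plain dict with hypothetical keys (`title`, `keywords`, `related_identifiers`, `creators`, `description`), and the word-count threshold for titles is an arbitrary stand-in for "is the title clear?", which in practice needs human judgment.

```python
# Arbitrary threshold, assumed for illustration only.
MIN_TITLE_WORDS = 4


def review_metadata(record):
    """Return (field, issue) pairs for a dict-based metadata record.

    All key names are hypothetical; map them to the repository's own schema.
    """
    findings = []

    title = record.get("title", "").strip()
    if len(title.split()) < MIN_TITLE_WORDS:
        findings.append(("title", "Title is missing or very short"))

    if not record.get("keywords"):
        findings.append(("keywords", "No keywords provided"))

    if not record.get("related_identifiers"):
        findings.append(("related_identifiers", "No links to related publications"))

    for creator in record.get("creators", []):
        # Flag creators that have neither an ORCID nor an affiliation.
        if not (creator.get("orcid") or creator.get("affiliation")):
            name = creator.get("name", "unnamed creator")
            findings.append(("creators", "No ORCID or affiliation for " + name))

    if not record.get("description", "").strip():
        findings.append(("description", "No abstract or README-style documentation"))

    return findings
```

Each finding is a candidate row for the curation plan. A curator would still review the list before sending it, since a check like this can only detect missing values, not unclear ones.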

Step 3: Communicating the Recommendations
• Implementation relies on participation of the researcher
• Curation plan must be easily actionable with clearly articulated benefits
• Reduce burden on researcher to interpret instructions

Metadata Field | Current Value | Recommended Changes | Rationale
Title | | |
Abstract | | |
… | | |
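
One way to produce such a plan is to write it as a CSV file with the four column headings from the slide, so the researcher can open it in a spreadsheet. The sketch below assumes the plan is held as a list of dicts keyed by those headings. The function name and the choice of CSV are illustrative assumptions, not something the presentation specifies.

```python
import csv
import io

# Column headings as given on the slide.
COLUMNS = ["Metadata Field", "Current Value", "Recommended Changes", "Rationale"]


def render_plan(rows):
    """Serialize curation plan rows (dicts keyed by COLUMNS) as CSV text.

    Missing keys are written as empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keeping one row per metadata field, with the current value next to the recommended change and its rationale, lets the researcher copy each change directly into the repository's edit form.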

Current results
• Currently, the code harvests >2000 total records
• Frequently encountered issues:
  • Abstracts redundant with publication
  • No direct contact information
  • Missing keywords

Source | Number of Records Found
DataCite | 236
Dryad | 196
Figshare | 302
OSF | 186
PANGAEA | 724
Zenodo | 710

Conclusions
• Relatively simple method to provide value to existing datasets
• Benefits even if author declines to make recommended edits:
  • Helps institution find their research outputs
  • Provides researchers with FAIRness recommendations that they can implement for future datasets
  • Communicates the existence (and utility!) of data support staff

More information?
Contact us:
Kevin Leonard
kevinmichael.leonard@ugent.be
Evelien Dhollander
evelien.dhollander@ugent.be

Leonard&Dhollander_OpenScienceBelgium.pptx

1. POST-INGEST CURATION: CURATING WITHOUT AN INSTITUTIONAL REPOSITORY Kevin Leonard - Evelien Dhollander

2. Data Curation Requires Datasets • Data curation: adding value to (meta)data for long-term preservation • Imagined (ideal) workflow: 1. Researcher provides data to curator for curation • Voluntary submission • Automatic part of ingest in institutional repository 2. Curator makes changes and recommendations 3. Data is put online for long-term preservation • Is that realistic for many institutions?

3. What Often Happens to Datasets, Really? Scientist Curator + + + +

4. Proposed Workflow 1. Find datasets online • Employ existing data linking architectures • Use repository APIs 2. Produce (meta)data augmentation plan for discovered datasets • Develop plan based on current best practices for FAIR metadata • Recommend changes that maintain existing DOI networks 3. Provide researchers with an easily actionable curation plan

5. Step 1: Where Are The Datasets? • Difficulties: • Datasets are broadly distributed • Affiliation information is not located in a consistent location (or format!) • Existing data linking systems (e.g., Scholix, DataCite) have limited coverage • Solution: • Use repository APIs to search for institutional datasets • Search outside of just <creator><affiliation> field

6. Example Python Code • Python code to search for institutional records • searchQuery can include multiple items • Universiteit Gent • UGent • Ghent University • 00cv9y106 (ROR id) • Saves DOIs of all datasets to csv • Can use OAI-PMH to extract more metadata information • Focused on several popular repositories, easily extended • Zenodo, OSF, Dryad, Figshare, PANGAEA
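
The slide describes the search code only in outline. The block below is a minimal sketch of that outline for a single repository (Zenodo), not the presenters' actual script. The endpoint URL, the `q`, `type` and `size` parameters, and the `hits.hits[].doi` response layout are assumptions about the Zenodo REST API that should be checked against its current documentation; the function names are hypothetical.

```python
import csv
import json
import urllib.parse
import urllib.request

# Name variants and ROR id taken from the slide.
SEARCH_TERMS = ["Universiteit Gent", "UGent", "Ghent University", "00cv9y106"]

# Assumption: Zenodo's public record-search endpoint.
ZENODO_API = "https://zenodo.org/api/records"


def build_query(terms):
    """Combine all name variants into one OR query, quoting each phrase."""
    return " OR ".join(f'"{term}"' for term in terms)


def extract_dois(payload):
    """Pull DOIs out of one page of search results.

    Assumes the response layout {"hits": {"hits": [{"doi": ...}, ...]}}.
    """
    hits = payload.get("hits", {}).get("hits", [])
    return [hit["doi"] for hit in hits if hit.get("doi")]


def search_zenodo(terms, page_size=100):
    """Fetch the first page of dataset records matching any search term."""
    params = urllib.parse.urlencode(
        {"q": build_query(terms), "type": "dataset", "size": page_size}
    )
    with urllib.request.urlopen(f"{ZENODO_API}?{params}", timeout=30) as resp:
        return extract_dois(json.load(resp))


def save_dois(dois, path):
    """Write the de-duplicated, sorted DOIs to a one-column CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["doi"])
        for doi in sorted(set(dois)):
            writer.writerow([doi])
```

Under these assumptions, `save_dois(search_zenodo(SEARCH_TERMS), "zenodo_dois.csv")` would write one page of results to disk. Pagination, rate limiting and the other repositories named on the slide are left out to keep the sketch short.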

7. Step 2: What To Do With What You’ve Found • Repositories often allow metadata fields to be edited • WITHOUT triggering the creation of a new version (and therefore a new DOI) • Editable fields vary by repository: • STRICT: Editing any metadata fields creates a new version • LENIENT: Most fields can be edited • Title, authors, relatedTO

8. Develop Recommendation Plan • Is the title clear? • Are keywords provided? • Are there links to related publications? • Do the authors have linked ORCIDs or affiliations? • Is there sufficient documentation in a README? • Can this information be provided in <description descriptionType=“Abstract”>?

9. Step 3: Communicating the Recommendations • Implementation relies on participation of the researcher • Curation plan must be easily actionable with clearly articulated benefits • Reduce burden on researcher to interpret instructions
Metadata Field | Current Value | Recommended Changes | Rationale
Title | | |
Abstract | | |
… | | |

10. Current results • Currently, the code harvests >2000 total records • Frequently encountered issues: • Abstracts redundant with publication • No direct contact information • Missing keywords
Source | Number of Records Found
DataCite | 236
Dryad | 196
Figshare | 302
OSF | 186
Pangaea | 724
Zenodo | 710

11. Conclusions • Relatively simple method to provide value to existing datasets • Benefits even if author declines to make recommended edits: • Helps institution find their research outputs • Provides researchers with FAIRness recommendations that they can implement for future datasets • Communicates the existence (and utility!) of data support staff

12. More information? Contact us: Kevin Leonard kevinmichael.leonard@ugent.be Evelien Dhollander evelien.dhollander@ugent.be

Editor's Notes

Thank you. Today we are going to talk about post-ingest curation, and what we at UGent have been considering to curate research data despite not having an institutional repository.
As I’m sure you are all aware, research institutions are becoming increasingly aware of the importance and utility of data curation, which, for our purposes, we can broadly define as any activity that adds value to data or metadata prior to its long-term preservation in a data repository. It is usually conceptualized as an ideal workflow, wherein the researcher provides their data to a curator for curation. This can either be a voluntary submission, as envisioned here in this diagram from the Data Curation Network, with the researcher actively seeking out and requesting the assistance of a curator, or it can be automatic, such as when a researcher deposits their data in an institutional repository and that institution’s curators can immediately begin to work on the data, making the curation a necessary part of the data’s path towards preservation. Regardless, the curator is then able to make changes and recommendations, ideally through some kind of back and forth dialogue with the researcher, before finally the curated data is put online for long-term preservation. Therefore, in this conception, curators always get to curate the data BEFORE it is published online. What we asked in this project is how realistic that workflow is for most researchers and institutions, and whether an alternate model might be necessary for cases in which such curation is not so automatic.
Looking outside the ideal, we turned our focus on what often actually happens to datasets in the research data lifecycle. The scientist completes some research project, generating a manuscript for publication and some associated data. They submit their manuscript to a journal and (as it approaches acceptance) have to publish their data online. Even if the researcher knows about curation services available to them at their institution, they might not feel that they have time to go through rounds of curation as they need their datasets online NOW, and so they circumvent the data curators and deposit the datasets directly into the general or domain-specific repository of their choice. They annotate the dataset with metadata according to their own understanding of best practices and what little time they have available to dedicate to documentation, and the data then sits online in the repository without ever having the opportunity to have value added to it by data curation specialists. Note that, while this workflow is possible for researchers from any institution, it’s especially likely for institutions that don’t have their own institutional repository, as the data curators will never have datasets automatically pass through their desks on the way to the institutional repository. What we aimed to do in this project is to define an alternate workflow for curators, wherein they can go out and find these datasets where they are posted online. Then, once they have knowledge of these datasets that are associated with their institution, they can develop individualized recommendation plans for the creators of those datasets, with the hope that the researcher implements those changes, thereby improving the FAIRness of those datasets.
Our proposed workflow comes in three steps: First, we find the datasets that have already been posted online. To do so, we first looked at existing data linking architectures, such as Scholix, but ultimately ended up relying on repository APIs for the most popular repositories for researchers from our institution. Then, once we’ve found the datasets through these various methods, we can develop augmentation plans for these uncovered datasets. This is because many popular repositories actually allow users to edit the metadata of their published datasets without triggering the generation of a new version, therefore preserving existing DOI link networks. So, it should not be considered “too late” to curate a dataset just because it has already been hosted online. There are still things that can be done to improve its FAIRness. Finally, once we’ve developed a set of recommendations for a given dataset in an online repository, the last step is to create an action plan that can be communicated to the researcher, providing them with an easily actionable way to improve the FAIRness of their own datasets.
So, the first step is to find the datasets online. This is more difficult than it sounds, because datasets are broadly distributed across many different repositories. To make matters worse, the affiliation information is not consistent, in location or format. Some records have the affiliation information associated with the creators. Some use the name of the institution written out in full, whereas others use the ROR, a specific id for institutions. For these and other complicated reasons, existing data linking systems end up missing a lot of the datasets that are out there. This can be easily verified… If you compare the results from using these services to just going onto one of these repository pages and entering the name of your institution into the search bar, you’ll find many records that these systems fail to pick up. Our solution then is to use the APIs to find as many additional institutional datasets as possible, and wherever possible, by searching outside of the CREATOR:AFFILIATION field.
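Searching outside the creator affiliation field can be sketched as a scan over every string in a record. The block below is an illustration only: the record layout is a hypothetical nested dictionary, the function names are invented, and whole-word matching is just one possible way to limit false positives.

```python
import re

# Name variants and ROR id listed on the "Example Python Code" slide.
VARIANTS = ["Universiteit Gent", "UGent", "Ghent University", "00cv9y106"]


def iter_strings(value):
    """Yield every string found anywhere inside nested dicts and lists."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_strings(item)


def mentions_institution(record, variants=VARIANTS):
    """Return True if any variant appears as a whole word in any field."""
    patterns = [
        re.compile(rf"(?<!\w){re.escape(variant)}(?!\w)", re.IGNORECASE)
        for variant in variants
    ]
    return any(
        pattern.search(text)
        for text in iter_strings(record)
        for pattern in patterns
    )
```

A record that names the institution only in its description, or only through a ROR URL, is matched as well. A match anywhere in the record is weaker evidence than a match in the affiliation field, so hits of this kind probably still need a manual check.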
We’ve written some Python code that harnesses the APIs of popular repositories to search for institutional datasets. Importantly for us, and probably for many institutions in Belgium, our institution is known by many names, all of which we see authors freely use when tagging their datasets. As currently implemented, the code saves the DOIs of all the datasets it finds to a CSV file, because the DOI is what matters most for ingest into the systems that we use, but you could easily use OAI-PMH or alternate systems to extract more metadata. Lastly, we focused on the main repositories used by researchers from Ghent University, but this could easily be extended to other repositories, insofar as they have APIs to plug into.
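As an illustration of the OAI-PMH route, the block below parses a response in the standard `oai_dc` format and extracts a few extra metadata fields. It is a sketch under stated assumptions: the sample XML is invented, the function name is hypothetical, and the request itself is omitted because each repository's base URL and identifier scheme differ.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response, for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:subject>soil</dc:subject>
          <dc:subject>moisture</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""


def parse_oai_dc(xml_text):
    """Return title, creators and subjects for each record in the response."""
    root = ET.fromstring(xml_text)
    records = []
    for dc in root.iterfind(".//oai_dc:dc", NS):
        records.append({
            "title": dc.findtext("dc:title", default="", namespaces=NS),
            "creators": [el.text for el in dc.findall("dc:creator", NS)],
            "subjects": [el.text for el in dc.findall("dc:subject", NS)],
        })
    return records
```

The same function works on `ListRecords` responses, since it collects every `oai_dc:dc` element it finds. An empty `subjects` list is a direct signal for the missing-keywords issue.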
Once the dataset records have been located, the next step is to figure out what to do with what you’ve found. The first part of that is determining what you can edit without triggering the creation of a new version (and therefore a new DOI). Even though these new DOIs are typically linked to the DOIs of the older versions, our thought was that it is best to avoid the potential issues a new DOI could cause for existing DOI link networks. Repositories vary with respect to which metadata fields are editable without triggering a new version, from very strict repositories (like Dryad), which allow essentially no editing, to very lenient repositories (like Zenodo), for which you can edit almost anything, including the title, abstract, and authors.
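One way to keep track of this is a small lookup of editing policies per repository. The block below is a sketch that records only what is stated in this talk (Dryad strict; Zenodo lenient for at least title, abstract and authors); the names are hypothetical, and anything not listed is treated as unknown rather than guessed.

```python
# Fields that can be edited without creating a new version, as described
# in the talk. Zenodo allows more than these three; only the fields named
# in the talk are listed. Other repositories are omitted on purpose.
EDIT_POLICY = {
    "Dryad": set(),                              # strict
    "Zenodo": {"title", "abstract", "authors"},  # lenient
}


def split_recommendations(repository, fields):
    """Split fields into (editable in place, to check before recommending)."""
    editable = EDIT_POLICY.get(repository, set())
    in_place = [field for field in fields if field in editable]
    to_check = [field for field in fields if field not in editable]
    return in_place, to_check
```

For example, `split_recommendations("Zenodo", ["title", "abstract"])` puts both fields in the first list, while the same call for Dryad or for a repository missing from the lookup puts them in the second.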
Once you’ve decided which metadata fields are in principle editable, you can then develop an individualized recommendation plan for that record. The exact recommendations you provide will depend on your institution’s priorities and on current best practices, but we’ve collected here a few of the major items that could be included in such a plan: Is the title clear? Are there keywords? Has it been linked to a publication? Are there ORCIDs linked? Have the authors provided an affiliation identifier such as a ROR id? Did they provide a detailed README, and if not, could that information be provided in the abstract field?
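The checklist can be sketched as a function that walks through one record and collects findings. This is an illustration, not the presenters' implementation: the record layout and field names are hypothetical, and the word-count test is only a crude stand-in for judging whether a title is clear.

```python
def review_record(meta):
    """Return a list of (field, recommendation) pairs for one dataset record."""
    findings = []
    # Crude proxy for "is the title clear?": very short titles get flagged.
    if len(meta.get("title", "").split()) < 4:
        findings.append(("title", "Make the title more descriptive."))
    if not meta.get("keywords"):
        findings.append(("keywords", "Add keywords."))
    if not meta.get("related_identifiers"):
        findings.append(("related_identifiers", "Link the related publication."))
    for creator in meta.get("creators", []):
        name = creator.get("name", "unnamed creator")
        if not creator.get("orcid"):
            findings.append(("creators", f"Add an ORCID for {name}."))
        if not creator.get("affiliation"):
            findings.append(
                ("creators", f"Add an affiliation (ideally a ROR id) for {name}.")
            )
    if not meta.get("has_readme"):
        findings.append(
            ("description", "Add a README or put its content in the abstract.")
        )
    return findings
```

A record that passes every check returns an empty list. Automated checks of this kind can only pre-fill a plan; whether a title or README is actually adequate remains a curator's judgment.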
The last step is to communicate the recommendation plan to the researcher. Because the actual implementation of the plan relies on the participation of the researcher, steps should be taken to maximize the likelihood that they cooperate. For this, we envision a clearly articulated plan like the table shown here, which outlines the metadata field in question, what that field currently contains, what the curators believe should be changed for that field, and their rationale. The aim is to do anything that reduces the burden on the researcher and lets them clearly see the reasoning and benefit behind the recommendations.
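A plan with these four columns can be produced as a CSV file that opens directly in a spreadsheet. The block below is a minimal sketch; the column names come from the slide, while the function name and the example row in the note are invented.

```python
import csv
import io

# Column headings taken from the table on the slide.
COLUMNS = ["Metadata Field", "Current Value", "Recommended Changes", "Rationale"]


def render_plan(rows):
    """Render plan rows (dicts keyed by COLUMNS) as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

A row might pair the field "Title" with its current value "Data", a recommended change such as "Describe what was measured, where and when", and a rationale such as "A specific title makes the dataset easier to find". The `csv` module quotes values that contain commas, so free-text rationales survive the round trip.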
This is all very interesting, of course, but as the saying goes, ‘the proof of the pudding is in the eating’, so here are some of our results. By running the code we gathered over 2000 dataset records from five major repositories and DataCite. We analysed a subset of these records to get some idea of the issues we will encounter in the future. A first issue is the redundancy of abstracts: most datasets have the same abstract as their corresponding publication. This isn’t necessarily a big issue when a datasheet or README is provided for the dataset, but when the abstract is the only information on the content of the dataset, there might not be enough information for researchers to reuse the data. A possible solution to this was mentioned a few slides ago: README-style documentation could be provided in the abstract metadata field. A second issue is the absence of contact information in the dataset metadata. In most cases the contact information can be found via the linked publication, but this is not good practice; we want to encourage researchers to provide contact information in their dataset metadata as well. A third issue is that no keywords are provided. The dataset should at least have some of the related publication’s keywords, and ideally its own specific keywords, to improve FAIRness.
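The "over 2000" figure can be checked against the per-source counts on the results slide with a short tally. The counts are copied from the slide; the slide does not say whether records found via DataCite overlap with those found in the repositories, so the sum is an upper bound on unique datasets.

```python
# Number of records found per source, as listed on the results slide.
RECORDS = {
    "DataCite": 236,
    "Dryad": 196,
    "Figshare": 302,
    "OSF": 186,
    "Pangaea": 724,
    "Zenodo": 710,
}

total = sum(RECORDS.values())
largest = max(RECORDS, key=RECORDS.get)

# Print each source with its share of the total, largest first.
for source, count in sorted(RECORDS.items(), key=lambda item: -item[1]):
    print(f"{source:10s} {count:5d} {100 * count / total:5.1f}%")
print(f"{'Total':10s} {total:5d}")
```

The total comes to 2354 records, with Pangaea and Zenodo together accounting for well over half of them.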
This is the basis of our proposal to provide curatorial benefits after a dataset has already been uploaded to a repository. Of course, we don’t want to suggest that this solves all problems, and it will not find ALL datasets, but it still has several benefits. Even if the author declines to make the recommended edits, the integration with repository APIs helps institutions find their research outputs. And the emails to the researchers, which include the detailed plan for improving the FAIRness of their datasets, provide the researchers with knowledge that they can carry with them in the future, and work as a way to let them know of the curatorial services that your institution might offer and how those services can help improve their online data.
If you would like more information, please contact us. Thank you for your attention!
