Presentation to the EC Workshop on Maximizing investments in health research: FAIR data for a coordinated COVID-19 response. Workshop III, November 8, 2021.
This document discusses challenges around scholarly data, including fragmented and poorly described data. It emphasizes the importance of experimental details, data availability, and data publication for reproducibility. Springer Nature's Scientific Data is highlighted as a new open-access journal for detailed data descriptors. The Scientific Data ISA-explorer is presented as a web application for discovering, exploring and visualizing data descriptors.
Increased access to the data generated is fuelling increased consumption and accelerating the cycle of discovery. But the successful integration and re-use of heterogeneous data from multiple providers and scientific domains is a major challenge within academia and industry, often due to incomplete description of the study details or metadata about the study. Using the BioSharing, ISA Commons and the STATistics Ontology (STATO) projects as exemplar community efforts, in this breakout session we will discuss the evolving portfolio of community-based standards and methods for structuring and curating datasets, from experimental descriptions to the results of analysis.
http://www.methodsinecologyandevolution.org/view/0/events.html#Data_workshop
Westminster Higher Education Forum policy conference Open research data in the UK: https://www.westminsterforumprojects.co.uk/conference/open-research-data-20
Role of BioSharing in domain disciplinary ‘research data management protocols’ (RDMP): life science use case.
RDA Domain Repositories IG session on "Community-driven Research Data Management: Towards Domain Protocols for Research Data Management": https://www.rd-alliance.org/ig-domain-repositories-rda-9th-plenary-meeting
Brief overview of the FAIR Cookbook for the UK Conference of Bioinformatics and Computational Biology 2021: https://www.earlham.ac.uk/uk-conference-bioinformatics-and-computational-biology-21
Brief introduction to the FAIRsharing collaborative, including work with DataCite and publishers on "Repository Selection: Criteria that Matter", and with COS on applying TOP guidelines to data policies registered in FAIRsharing.
This document summarizes the international FAIR movement, which developed principles to enhance the value of digital resources. The Findable, Accessible, Interoperable, Reusable (FAIR) principles were developed in 2014 and endorsed by researchers, funders, and other stakeholders in 2016. The principles aim to make data and other digital resources discoverable, accessible, interoperable, and reusable for both humans and machines. Implementing FAIR requires effort across technical, social, and cultural dimensions and long-term investment but helps ensure better science and more efficient research.
The FAIR Cookbook poster, as presented at the ELIXIR-UK Node and the UK Conference of Bioinformatics and Computational Biology 2021: https://www.earlham.ac.uk/uk-conference-bioinformatics-and-computational-biology-21
Impact Narrative; Research Librarian Support Day February 8th 2016 - SusanMRob
This document summarizes a presentation about increasing emphasis on research impact outside of academia. It discusses why stakeholders are focusing more on impact, how to describe impact in funding applications, and examples of impact statements. Specifically, it notes that governments and other funders want evidence that research provides benefits beyond academic publications. Applicants should describe both realized and aspirational impacts in various sections of funding proposals. Impact statements for top publications should be 30 words and explain the significance or influence of the work.
Open access - where are we now and where to from here? - SusanMRob
This document summarizes Virginia Barbour's presentation on open access publishing. It discusses where open access is now, with many institutions and funders adopting green open access policies that support archiving publications in institutional repositories. However, policies still vary in strength. It also discusses the developing open access publishing ecosystem, which includes preprints, journals, archiving, and innovations circumventing traditional publishers. Going forward, it argues that coordinated high-level policy action is needed regarding licensing standards, funding flows, and making open access a formal part of research infrastructure globally. Recent policy developments in various countries show the field is poised for significant changes in 2016.
The FAIR Cookbook poster, as presented at the UK Conference of Bioinformatics and Computational Biology 2021: https://www.earlham.ac.uk/uk-conference-bioinformatics-and-computational-biology-21
FAIRsharing presentation at the Japan Science and Technology Agency - Peter McQuilton
A 30-minute seminar presented at the National Bioscience Database Center, part of the Japan Science and Technology Agency, based in Tokyo, Japan. This presentation covers the FAIR Principles, the aims, methodology and use of FAIRsharing, related projects such as Bioschemas, and international initiatives such as ELIXIR and EOSC.
Susanna-Assunta Sansone is a data consultant and honorary academic editor who works on several projects related to making data FAIR (Findable, Accessible, Interoperable, Reusable). She is the associate director of Scientific Data, a peer-reviewed journal focused on publishing data descriptors to describe and provide access to scientifically valuable datasets. The goal of Scientific Data is to help promote open science and data reuse by publishing structured metadata and narratives about datasets alongside traditional research articles.
FAIRsharing is a registry of standards, databases, repositories, and data policies that are curated and interlinked to accelerate discovery and reuse. It provides tools and services to guide users in selecting these resources for describing and sharing research data. The registry classifies over 1,000 records to maximize findability and tracks the evolution of resources over time. FAIRsharing is a core part of the FAIR data ecosystem and enables the FAIR principles by ensuring these resources are findable, accessible, interoperable, and reusable.
The document discusses problems with the traditional scholarly publishing model and how scholarship is being transformed through open access. It summarizes that under the traditional model, commercial publishers profit while libraries face rising subscription costs and authors sign away their rights. This limits access to scholarly work. However, open access provides a solution by making research freely available online under open licenses. The document recommends authors publish in open access journals, deposit work in open repositories, understand their copyright options, and advocate for open access to maximize distribution and impact of their research.
FAIRsharing - manually curated metadata on standards, repositories and data p... - Peter McQuilton
A 10-minute presentation on FAIRsharing, highlighting the manually curated metadata we provide on domain-specific and cross-domain standards (ontologies, reporting guidelines, identifier schema, models and formats), databases (both knowledgebases and repositories) and data policies from funders and journal publishers. Presented at the RDA P14 meeting in Helsinki, Finland (October 2019).
The document discusses the University of Kentucky Libraries' efforts to build a digital repository by leveraging partnerships across campus. It outlines how the library advocated for a campus-wide repository model in 2007 and began populating the UKnowledge repository. As new data management requirements emerged from funders like NSF and NIH, the library explored technical options and settled on a microservices-based approach using Hydra, Archivematica, and CDL microservices. The library's roles include technical leadership, metadata, and data management plans, while IT provides storage and infrastructure and research provides policies and proposal support. The initial scope is serving research data needs, with potential future expansion to an enterprise repository.
This presentation was provided by Jill Emery of Portland State University during a NISO webinar on the topic of OA and acquisitions, delivered on Sept 7, 2016.
A 10-minute presentation for the virtual ELIXIR All Hands Meeting 2020 - FAIRification mini symposium. In this presentation I talk about some of the community work we do in FAIRsharing, from sharing our metadata with other resources to research on data policy repository criteria.
2021 04 Introduction to FAIRsharing - cineca - Allyson Lister
Part of the “How FAIR are you” webinar series and hackathon, which aim to increase and facilitate the uptake of FAIR approaches into software, training materials and cohort data, to facilitate responsible and ethical data and resource sharing and the implementation of federated applications for data analysis.
More information at
* the webinar page: https://www.cineca-project.eu/news-events-all/how-fair-are-you-hackathon
* the recording of the talk: https://www.youtube.com/watch?v=UdGZOynyuGo
The Australian National Data Service (ANDS) Vocabulary Service aims to make Australia's research data more valuable by organizing controlled vocabularies. It provides a portal for discovering, learning about, and integrating shared vocabularies. Organizations can also upload their own vocabularies to the service to share and receive automated updates. The service helps reduce duplication of effort by advocating for existing vocabularies and allowing multiple users to access and adapt shared vocabularies in a machine-readable format. Support is available to help users set up and integrate vocabularies through the ANDS liaison and online documentation.
The Growing Call for Open Access - Heather Joseph (2007) - faflrt
Heather Joseph, formerly of BioOne and currently the Executive Director of SPARC (Scholarly Publishing & Academic Resources Coalition), discussed her group’s advocacy efforts related to Open Access and the Federal Research Public Access Act of 2006. Sponsored by the ALA Federal and Armed Forces Libraries Roundtable (FAFLRT). Presented on June 25, 2007 at the ALA Annual Conference in Washington, DC.
Presentation at the Workshop on Open Citations, University of Bologna, Bologna, Italy, September 4, 2018.
I will demonstrate the use of the VOSviewer software (www.vosviewer.com), of which I am one of the developers, for creating bibliometric visualizations of science based on openly available bibliographic data sources. Both the use of Crossref data and the use of data from the OpenCitations Corpus will be demonstrated. In addition, I will show how data from Dimensions can be used. The possibilities and limitations of the currently available open data sources will be discussed, also in comparison with more established data sources such as Web of Science and Scopus. Finally, I will provide my perspective on future developments, focusing especially on the integration of open data sources and visual analysis tools.
Discovery event: Stuart Lee (the humanities researcher) - RDTF-Discovery
The document discusses the challenges faced by humanities researchers in the digital age. It outlines four phases of research: project planning, data creation, storage and retrieval, and safeguarding knowledge. Researchers now have opportunities to access and analyze large quantities of data from different sources using new tools. However, challenges remain around lack of standardization, technical skills and long-term sustainability. The document calls for training researchers in data management, promoting open data and metadata, providing better analysis tools, and rethinking how research is disseminated and preserved for the long term.
Training in Data Curation as Service in a Federated Data Infrastructure - the... - Andrea Scharnhorst
The document discusses DANS, an institute in the Netherlands that promotes and provides permanent access to digital research information. DANS operates an electronic archiving system called EASY that allows researchers to self-deposit publications, theses, datasets, and other research materials. It also operates NARCIS, a portal that makes research information discoverable. DANS provides data curation and consulting services and conducts research on long-term data availability through its eResearch program. It advocates for a "front office back office" model where national organizations provide front-facing services to researchers while technical infrastructure and support is handled by back office organizations. The document raises questions about training needs, responsibilities for research data archiving, and how to organize professional compet
OSGIS Conference: report on RDA/MPG Science workshop - Herman Stehouwer
1) A workshop on data was held in Munich with 15 leading scientists and data practitioners to discuss challenges around data sharing, reuse, and persistence.
2) Key topics discussed included incentives for sharing data, establishing trust in data quality, and ensuring long-term access to data.
3) Participants made recommendations for the Research Data Alliance (RDA) to develop interoperable standards and specifications to facilitate more widespread and sustainable data sharing across disciplines.
This document summarizes a workshop on open science and open data for librarians. The workshop covered introducing open science and open data, how data can inform the library profession and support research, tools and applications for working with data, and developing a data strategy for libraries. It discussed stakeholders in research data, why librarians are important data partners, the role of librarians in advocating for open data and managing repositories. The workshop also covered data skills needed by librarians and introducing trusted data repositories.
This document is a learning material for a workshop on research data management. It introduces key concepts in research data management (RDM) such as defining RDM and digital curation. It discusses the research process and challenges around managing large amounts of diverse research data. It also covers drivers for taking RDM seriously such as funder mandates, benefits of open data, and the strategic context of RDM as an emerging field. Learners are guided through reflection activities to relate RDM concepts to their own research interests and roles as information professionals.
Why should I care about information literacy? - nmjb
This document summarizes a workshop on improving researchers' competency in information handling and data management. The workshop covered how information literacy relates to researcher development, defined information literacy using the 7 Pillars model, and discussed national initiatives and case studies in applying information literacy. Participants engaged in group work applying information literacy concepts to the Researcher Development Framework and discussed motivation and examples of good practice in supporting information literacy development.
This document summarizes challenges and efforts around managing research data in the arts and humanities. It discusses how "data" is not clearly defined in these domains as it is in STEM fields. Universities like UAL and GSA are working to educate researchers on identifying, organizing, and sharing their diverse research outputs and formats. This includes developing data repositories, training, and communities of practice to establish best practices and support researchers in meeting new data management policies and obligations. While there are fewer external funder requirements compared to STEM, these universities are using collaborative approaches to engage arts and humanities researchers in responsible research data management.
This document summarizes a course for doctoral students on research data management and open data. It discusses:
- The complexity and diversity of research methodologies and data types.
- An open data project in Slovenia that aimed to establish national policies through stakeholder interviews and workshops.
- The research and data lifecycles, highlighting key roles and responsibilities at different stages for researchers, institutions, libraries, and funders.
- The role of data services in managing data through the lifecycle, from depositors to curation to access for users.
Presentation - First International Library Staff Exchange Week, Zagreb - Iva Vrkic
Librarians at the Faculty of Science in Zagreb provide information literacy courses for graduate students and scholars. Topics covered include using plagiarism detection software, changes in scientific publishing, and copyright issues. Plans exist to expand offerings to include workshops for freshmen. Librarians look to colleagues at the University of Zagreb for inspiration on developing robust education programs.
Libraries at Harvard and Oxford offer diverse information literacy instruction through workshops, seminars, and online/hybrid courses. Common topics are using library resources, research skills like literature reviews, data management, reference management software, and open scholarship issues. Both institutions dedicate over 50% of instruction to online formats, with the remainder split between in-person and hybrid
Research Data Management in the Humanities and Social Sciences - Celia Emmelhainz
This document provides an introduction to research data management for humanities and social sciences librarians. It discusses why data management is an important part of a librarian's role in supporting faculty research, and some key concepts in data management including data formats, storage, security, preservation, and sharing. The document emphasizes that while librarians do not need to be data experts, having a basic understanding of data management concepts can help librarians better serve faculty research needs and expand their role on campus.
Research data management during and after your research; an introduction / L... - Leon Osinski
This document outlines a workshop on research data management for PhD students. The workshop covers managing data during research to ensure integrity and allow replication, as well as archiving or publishing data after research. During the workshop, presentations will discuss scientific integrity and data management during research, and data management after research. Discussions will explore topics like dealing with failed experiments, accessibility of data during research, and archiving data after a project is finished. The goal is to provide insight on responsible data practices during and after research.
IFLA ARL Webinar Series: Research Ethics in an Open Research Environment - IFLAAcademicandResea
The document summarizes a presentation on institutional data support in the open research environment at Nanyang Technological University in Singapore. It discusses NTU's policies on research integrity and data governance, as well as the support provided through its research data infrastructure, education and training programs, and recognition initiatives. The presentation highlights lessons learned around making it easier for researchers to practice FAIR data sharing principles and clarifying language around data classification and use. It also emphasizes the importance of awareness and recognition activities to promote open data practices.
Lecture at an event "SEEDS Kick-off meeting", FORS, Lausanne, Switzerland.
Related page: http://www.snf.ch/en/funding/programmes/scopes/Pages/default.aspx
http://seedsproject.ch/?p=1
Presenter: Peter Burnhill, Director, EDINA national academic data centre, University of Edinburgh, Scotland UK
Presentation given at Beyond Books: What STM & Social Science publishing should learn from each other Marriott Hotel/Kensington, London, 22 April 2010
presentation at Electronic Resources & Libraries, April 5, 2016
http://erl2016.sched.org/event/5ZQN/s45-sharing-and-reuse-of-scientific-and-research-data-risky-for-privacy
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017 | ARDC
The Australian National Data Service (ANDS) aims to make Australian research data more valuable by partnering with research organizations and funding data projects. In 2015, ANDS conducted over 100 workshops and events with over 4,000 participants and developed online resources. ANDS provides guides on topics like data management and the FAIR data principles. ANDS also advocates for practices like data citation and publishing to ensure research data is preserved and reusable over time. The presentation outlines ANDS' role in supporting good research data management practices and sharing to ensure the integrity and impact of research evidence.
The document provides an overview of a presentation on open science and open data for librarians. It includes:
- An introduction to open science/open data concepts and the library's role in research data services.
- Examples of activities working with research data, including data collection, visualization, cleaning, analysis and preservation.
- A discussion of the benefits of open data, challenges researchers face in opening their data, and the role of data repositories and standards.
- An overview of the African Open Science Platform project which aims to promote open science on the continent.
This document summarizes a presentation given by Susanna Sansone at the GSC 23rd meeting education day in Bangkok, Thailand on August 7, 2023. The presentation discussed standards across life sciences, including definitions of different types of standards and over 1,600 identified standards. It covered standard organizations and grassroots groups, as well as the FAIRsharing database which catalogs over 2,885 standards and databases and aims to promote their use and value across research.
The FAIRsharing journey in RDA document discusses:
1) FAIRsharing's growth and involvement with RDA since 2011, including its Working Group established in 2015 to curate standards, databases, and policies to promote FAIR data.
2) FAIRsharing's current activities and impact, such as its registry of over 4,000 records from many disciplines and usage in various tools and services.
3) Opportunities for further engagement with RDA, such as leveraging their expertise for contributions to the FAIR Cookbook, an open resource providing technical recipes for applying FAIR principles to life science data.
Overview of metadata standards, and how FAIRsharing and the FAIR Cookbook help with selecting and using them. Presentation to the "What is metadata? Common standards and properties" EHP Workshop, November 9, 2022: https://ephconference.eu/pre-conference-programme-441
Pharmas and academia are joining forces to make data FAIR (Findable, Accessible, Interoperable, and Reusable) through the development of the FAIR Cookbook. The FAIR Cookbook provides a growing set of over 70 recipes that give step-by-step guidance on improving the FAIRness of different data types through the use of tools, technologies, and best practices. It aims to provide practical examples and guidelines to support researchers, data managers, and others in managing data according to FAIR principles. The FAIR Cookbook is an open, community-developed resource overseen by an editorial board, with contributions from nearly 100 life sciences professionals.
FAIR, community standards and data FAIRification: components and recipes | Susanna-Assunta Sansone
Overview of FAIR, FAIRsharing and the FAIR Cookbook at the ATI event on Knowledge Graphs: https://github.com/turing-knowledge-graphs/meet-ups/blob/main/symposium-2022.md
Presentation to the EOSC workshop on policies (https://www.google.com/url?q=https://eoscfuture.eu/eventsfuture/monitoring-eosc-readiness-fair-data-policies) on what FAIRsharing does for policies, including providing registration, discovery, flexible and clearer descriptions, relationships, machine readability and comparability.
The document summarizes how FAIRsharing assists others with promoting FAIR data principles without directly assessing FAIRness compliance. It does this by (1) providing a lookup service for standards and repositories via its API, (2) serving as a registry for FAIRness tests and indicators to make them discoverable, and (3) enabling communities to create profiles declaring which standards and repositories they use. The document also outlines FAIRsharing's operations, advisory boards, and future plans to further support assessment and tracking of FAIRness improvements over time.
ELIXIR is a European infrastructure that brings together life science resources from across Europe. It offers databases, tools, computing capabilities, and training opportunities. ELIXIR nodes provide these services and connect national data infrastructures. ELIXIR communities connect infrastructure experts to drive service developments. ELIXIR is funded through a mixed model including public sources. It works to sustain important biological data resources and make data FAIR through recommended standards and interoperability resources. ELIXIR also aims to develop a sustainable tools ecosystem and provides training through its portal.
Presentation to the EC Workshop on Maximizing investments in health research: FAIR data for a coordinated COVID-19 response. Workshop I, October 11, 2021.
Brief introduction to FAIRsharing work with industry (publishers, pharmas) and the FAIR Cookbook (for the life sciences): https://www.opensciencefair.eu/2021/workshops/applying-fair-principles-to-open-science-and-industry-to-drive-innovation-challenges-and-opportunities
This document summarizes Susanna-Assunta Sansone's presentation on FAIRsharing, a global resource for data repositories, standards, and policies that promotes FAIR data principles. FAIRsharing guides users to discover and select these resources and helps data producers make their resources more visible, widely adopted and cited. It contains over 3,500 indexed resources and has a dedicated collection of COVID-19 data sharing platforms and registries. The presentation discusses using FAIRsharing to map relationships between repositories in the collection and to external repositories and standards. It also notes the importance of stronger data policies from publishers to ensure access and reuse of COVID-19 research data.
Presentation to the "FAIRification put into practice: Characterization of energy data and development of workflows" event by https://www.eeradata.eu => https://www.eeradata.eu/event/2857:online-discussion-fairification-put-into-practice-characterization-of-energy-data-and-development-of-workflows.html#
Presented at http://mcbios-maqc.org. The FAIR Principles have propelled the global debate, in all disciplines, about better RDM and transparent, reproducible data. Since their endorsement by global and intergovernmental leaders, FAIR has de facto become a global norm for good RDM and a prerequisite for data science. Funding bodies are consolidating FAIR into their funding agreements; publishers have united behind FAIR as a way to remain at the forefront of open research; and in the private sector FAIR is adopted and enshrined in policy in major biopharmas, libraries, and unions. FAIR is changing the culture of data science, but work is needed to turn the principles into reality. I will use the work of the FAIRplus project as an exemplar to illustrate challenges and progress.
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake | Walaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world, where data privacy and compliance are a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences. (3) They are context-aware, encoding a different set of transformations for different use cases. (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
Open Source Contributions to Postgres: The Basics POSETTE 2024 | ElizabethGarrettChri
Postgres is the most advanced open-source database in the world and it's supported by a community, not a single company. So how does this work? How does code actually get into Postgres? I recently had a patch submitted and committed and I want to share what I learned in that process. I’ll give you an overview of Postgres versions and how the underlying project codebase functions. I’ll also show you the process for submitting a patch and getting that tested and committed.
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You... | Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
End-to-end pipeline agility - Berlin Buzzwords 2024 | Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag... | sameer shah
"Join us for STATATHON, a dynamic 2-day event dedicated to exploring statistical knowledge and its real-world applications. From theory to practice, participants engage in intensive learning sessions, workshops, and challenges, fostering a deeper understanding of statistical methodologies and their significance in various fields."
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens" | sameer shah
Embark on a captivating financial journey with 'Financial Odyssey,' our hackathon project. Delve deep into the past performance of two companies as we employ an array of financial statement analysis techniques. From ratio analysis to trend analysis, uncover insights crucial for informed decision-making in the dynamic world of finance.
DTP2016
1. Introduction to the Data Management Day
Friday Dec 9th, 2016
Susanna-Assunta Sansone, PhD. Associate Director, Principal Investigator; Data Consultant, Founding Academic Editor
Philippe Rocca-Serra, PhD. Senior Research Lecturer
Alejandra Gonzalez-Beltran, PhD. Research Lecturer
www.slideshare.net/SusannaSansone/DTP2016
3. Management and re-use of various research digital objects, besides data
Datasets
Papers
SOPs
Figures
Workflows
Slides
Codes
Tools
Databases
Algorithms
4. Outline – morning session: lectures
• Susanna - research digital objects
o Findability, Accessibility, Interoperability, Reusability (FAIR) concept
o Importance of FAIR objects in science and what is in it for you
o Metadata or content standards, why should you care
o Data management, not a service but R&D and a possible career path
• Alejandra - information retrieval
o Databases and repositories of digital objects
o Focus on ontologies, a type of content standard
o Application of ontologies in searches
• Philippe - reporting experimental metadata
o Experimental design
o Statistical results
5. Outline – afternoon session: exercise
• Alejandra and Philippe – introductions
o The Investigation, Study, Assays (ISA) format
o The ISA tools: focus on the Google spreadsheet-based OntoMaton
• You will structure and describe an experiment using terms from ontologies
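As a rough illustration of what the exercise asks for, the sketch below models an experiment with nested Investigation, Study and Assay layers and attaches ontology terms to its descriptive values. The class and field names are hypothetical and heavily simplified; they are not the ISA tools' data model or API. The experiment itself is invented, and the term identifiers (NCBITaxon:10090 for Mus musculus, UBERON:0002107 for liver) are given as examples that should be checked against the source ontologies.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OntologyTerm:
    label: str    # human-readable name
    term_id: str  # identifier in the source ontology (example values, verify before use)


@dataclass
class Assay:
    measurement: str
    sample_names: List[str] = field(default_factory=list)


@dataclass
class Study:
    title: str
    # Descriptive characteristics, each value annotated with an ontology term
    characteristics: Dict[str, OntologyTerm] = field(default_factory=dict)
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)


def summarize(investigation: Investigation) -> List[str]:
    """Flatten the nested description into one readable line per item."""
    lines = [f"Investigation: {investigation.title}"]
    for study in investigation.studies:
        lines.append(f"  Study: {study.title}")
        for name, term in study.characteristics.items():
            lines.append(f"    {name}: {term.label} [{term.term_id}]")
        for assay in study.assays:
            lines.append(
                f"    Assay: {assay.measurement} ({len(assay.sample_names)} samples)"
            )
    return lines


# Invented example experiment, for illustration only
example = Investigation(
    title="Effect of compound X on liver",
    studies=[
        Study(
            title="Mouse exposure study",
            characteristics={
                "organism": OntologyTerm("Mus musculus", "NCBITaxon:10090"),
                "organ": OntologyTerm("liver", "UBERON:0002107"),
            },
            assays=[Assay("transcription profiling", ["S1", "S2", "S3"])],
        )
    ],
)

for line in summarize(example):
    print(line)
```

Keeping the label and the identifier together means the description stays readable for people while remaining unambiguous for software.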
From free text descriptions to structured FAIR representations
Notes in Lab Books (information for humans)
Spreadsheets and Tables (the compromise)
Facts as RD (information
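The progression on this slide, from free-text notes through tables to structured facts, might look something like the sketch below. It is only an illustration: the note, the column names, the `TERMS` lookup and the `row_to_facts` helper are all invented here, and the subject-property-value tuples are one assumed way of writing "facts", not a format taken from the slides.

```python
# Stage 1: a lab-book note, meaningful to a human reader only.
note = "Liver samples from mice, treated with compound X for 24 h."

# Stage 2: the same information as a spreadsheet-like row (the compromise).
row = {
    "sample_id": "S1",
    "organism": "mouse",
    "organ": "liver",
    "treatment": "compound X",
    "duration_h": 24,
}

# Hypothetical lookup from free-text values to ontology identifiers.
# The identifiers are examples and should be verified against the ontologies.
TERMS = {
    "mouse": "NCBITaxon:10090",
    "liver": "UBERON:0002107",
}


def row_to_facts(row, terms):
    """Turn one table row into (subject, property, value) statements.

    Values with a known ontology term are replaced by the term identifier;
    all other values are kept unchanged.
    """
    subject = row["sample_id"]
    facts = []
    for key, value in row.items():
        if key == "sample_id":
            continue
        facts.append((subject, key, terms.get(value, value)))
    return facts


# Stage 3: explicit statements that software can compare and query.
for fact in row_to_facts(row, TERMS):
    print(fact)
```

Each stage carries the same content, but only the last one states it in a form where "mouse" in one dataset and "Mus musculus" in another can be recognised as the same thing.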