Presentation by Susan Reilly at Bibsys2013 on the opportunities for libraries and their role in the collaborative data infrastructure. It looks at data sharing, authentication, preservation and advocacy.
This document summarizes a presentation about opportunities for data exchange and optimizing data sharing conditions. It discusses several projects by LIBER, including Europeana, which aims to make cultural content available online. It notes that with proper infrastructure, researchers can collaborate on shared data sets across locations. However, challenges include authentication, skills, and managing the large amounts of data being generated. Overall, the presentation argues that data sharing can advance scientific inquiry if barriers are addressed and key stakeholders work together.
Data sharing and data management – what are they all about? Belinda Weaver
This document discusses the importance of data sharing and management in research. It provides several reasons why data sharing is important, including that data is needed to understand research findings, large datasets require integration across disciplines, and publicly-funded research should benefit the public. However, researchers often face barriers to sharing data, such as a lack of incentives or time and concerns about losing control over their data or compromising its confidentiality. While data sharing is increasingly expected, researchers have flexibility in how and when they share based on funder policies, confidentiality, and use of repositories. The benefits of data sharing include enabling new research, collaboration, and preserving data.
This document summarizes Rob Grim's presentation on e-Science, research data, and the role of libraries. It discusses the Open Data Foundation's work in promoting metadata standards like DDI and SDMX. It also outlines the research data lifecycle and how metadata management can help libraries support research through services like data registration, archiving, discovery and access. Finally, it provides examples of how Tilburg University library supports research data through services aligned with data availability, discovery, access and delivery.
This document summarizes key points from a presentation given at the Entomological Collections Network meeting about the Biodiversity Information Standards (TDWG) Conference 2013. The presentation discussed iDigBio's goals of building an accessible database of US specimen data and facilitating digitization. It provided an overview of TDWG topics like data quality, semantics, and standards. Researchers, collections managers, and others were encouraged to get involved in TDWG to help bridge the gap between research data and databases and avoid duplicating efforts.
IJCER (www.ijceronline.com) International Journal of computational Engineerin... ijceronline
This document summarizes text mining techniques for information retrieval, extraction, and indexing. It discusses common information retrieval techniques like inverted indices and signature files. It also covers stemming, domain dictionaries, exclusion lists, and research directions in text mining like finding better representations for extracted information, enabling multilingual analysis, and integrating domain knowledge. The key techniques discussed are text indexing, query processing, and information extraction from text.
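As a rough illustration of two of the techniques named above, the following minimal sketch builds an inverted index and applies an exclusion list before indexing. It is not taken from the summarized paper: the function names, the tiny exclusion list, and the sample documents are all assumptions made for the example, and stemming is left out for brevity.

```python
from collections import defaultdict

# Hypothetical exclusion list (stop words); a real one would be much longer.
EXCLUSION_LIST = {"the", "a", "of", "and", "in", "is", "for"}

def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop excluded words."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w and w not in EXCLUSION_LIST]

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Illustrative documents, invented for this sketch.
docs = {
    1: "Text mining for information retrieval",
    2: "Indexing and retrieval of text",
    3: "Information extraction in the library",
}
index = build_inverted_index(docs)
print(search(index, "text retrieval"))
```

Query processing here is a plain set intersection over the postings of each term; excluded words never enter the index, so a query made up only of such words returns nothing.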
These are Robert H. McDonald's slides for the Future Trends Panel Presentation at the Inter-institutional Approaches to Supporting Scholarly Communication Symposium held on August 16, 2012 at the Georgia Institute of Technology.
Providing support and services for researchers in good data governance Robin Rice
The University of Edinburgh provides support and services to help researchers with good data governance. This includes a research data policy, research data service with various tools across the data lifecycle, and a data safe haven for sensitive data. The research data service offers centralized storage, version control, collaboration tools, and repositories for sharing data openly or long-term retention. Training and outreach aim to educate researchers on topics like data management plans, sensitive data, and GDPR compliance.
The document provides information about MANTRA, a free online course for research data management created by the University of Edinburgh. MANTRA teaches best practices for managing research data through open educational modules aligned with the research data lifecycle. It is available for reuse and repurposing under an open license. The course covers topics like data planning, organization, documentation, storage, security, and sharing.
Merritt’s micro-services-based architecture provides a number of options for easy integration with diverse external discovery services with specific disciplinary focus on scientific data sharing. By removing many of the barriers faced by researchers interested in data publication, the integrations of Merritt with DataShare and Research Hub exemplify a new service model for cooperative and distributed data sharing. The widespread adoption of such sharing is critical to open scientific inquiry and advancement.
This document summarizes statistical disclosure control techniques for protecting private data, specifically microaggregation. Microaggregation involves clustering individual records into small groups to anonymize the data before release. It aims to minimize information loss while preventing re-identification of individuals. The document discusses challenges with multivariate microaggregation and reviews different heuristic approaches. It also covers related topics like k-anonymity algorithms, various clustering techniques for microaggregation like k-means, and using genetic algorithms to handle large datasets.
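The grouping idea behind microaggregation can be sketched for the simplest case, a single numeric attribute. The code below is an illustrative assumption, not one of the heuristics reviewed in the summarized document: it sorts the values, forms groups of at least `k` records (the last group absorbs any remainder), and replaces each value with its group mean. The function name and the sample ages are invented for the example.

```python
def microaggregate(values, k):
    """Univariate fixed-size microaggregation.

    Sort the records by value, split them into consecutive groups of
    at least k records, and replace each value with its group mean.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k values")
    # Indices of the records in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    result = [0.0] * len(values)
    for g in range(n_groups):
        start = g * k
        # The last group takes the leftover records so no group is smaller than k.
        end = (g + 1) * k if g < n_groups - 1 else len(values)
        group = order[start:end]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = mean
    return result

# Illustrative data, invented for this sketch.
ages = [23, 45, 31, 52, 38, 29, 61]
masked = microaggregate(ages, 3)
print(masked)
```

Each released value is then shared by at least `k` records, and the overall mean of the attribute is preserved. The multivariate case is harder because there is no single natural ordering of the records, which is why heuristic approaches are needed there.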
This document provides an overview of research data management and outlines the steps for creating a data management plan. It discusses why research data management is important, including enabling data reuse and sharing and meeting funder requirements. The document then walks through creating a data management plan, covering topics like the types and formats of data that will be generated, ethical and intellectual property issues, how data will be stored and backed up, and long-term preservation and deposition of data. It emphasizes that planning early helps ensure accurate, complete and secure data, and avoids problems down the line.
Creating a sustainable business model for a digital repository: the Dryad exp... ASIS&T
Creating a sustainable business model for a digital repository: the Dryad experience
Peggy Schaeffer
Datadryad.org
Presentation at Research Data Access & Preservation Summit
22 March 2012
Scientific discovery and innovation in an era of data-intensive science
William (Bill) Michener, Professor and Director of e-Science Initiatives for University Libraries, University of New Mexico; DataONE Principal Investigator
The scope and nature of biological, environmental and earth sciences research are evolving rapidly in response to environmental challenges such as global climate change, invasive species and emergent diseases. Scientific studies are increasingly focusing on long-term, broad-scale, and complex questions that require massive amounts of diverse data collected by remote sensing platforms and embedded environmental sensor networks; collaborative, interdisciplinary science teams; and new tools that promote scientific data preservation, discovery, and innovation. This talk describes the challenges facing scientists as they transition into this new era of data intensive science, presents current solutions, and lays out a roadmap to the future where new information technologies significantly increase the pace of scientific discovery and innovation.
SEAD is an NSF DataNet project that aims to provide cyberinfrastructure for long tail data in sustainability science research. It develops tools for active and social curation of data including an Active Curation Repository (ACR) and VIVO profiles. It also creates a Virtual Archive to facilitate long-term access and preservation of datasets across multiple institutional repositories. The presentation provides an overview of SEAD's approach and highlights pilots with the National Center for Earth Surface Dynamics, including ingesting their data collections into the ACR and Virtual Archive and building a social network in VIVO.
Data Equivalence
Mark Parsons, Lead Project Manager, Senior Associate Scientist, National Snow and Ice Data Center
Data citation, especially using persistent identifiers like Digital Object Identifiers (DOIs), is an increasingly accepted scientific practice. Recently, several respected organizations have developed guidelines for data citation. The different guidelines are largely congruent in that they agree on the basic practice and elements of data citation, especially for relatively static, whole data collections. There is less agreement on the more subtle nuances of data citation that are sometimes necessary to ensure precise reference and scientific reproducibility, the core purpose of data citation. We need to be sure that if you follow a data reference you get to the precise data that were used or at least their scientific equivalent. Identifiers such as DOIs are necessary but not sufficient for the precise, detailed references required. This talk discusses issues around data set versioning, micro-citation, and scientific equivalence. I propose some interim solutions and suggest research strategies for the future.
Enriching Scholarship 2014 Beyond the Journal Article: Publishing and Citing ... Natsuko Nicholls
The document discusses data sharing policies and mandates from various organizations including federal funding agencies in the US and internationally, journals, and a paradigm shift toward more transparent and collaborative research that integrates publications and data. Key points include requirements for data management plans from NIH and NSF, expectations of funding agencies in other countries to maximize access to research data, a journal policy requiring data to be made available, and challenges around measuring the impact of shared data given the lack of common practices and standards for citing data.
University of Bath Research Data Management training for researchers Jez Cope
Slides from a workshop on Research Data Management for research staff and students at the University of Bath.
Part of the Research360 project (http://blogs.bath.ac.uk/research360).
Authors: Cathy Pink and Jez Cope, University of Bath
This presentation sets out some of the challenges around citing and identifying datasets and introduces DataCite, the international data citation initiative. DataCite was founded on 1 December 2009 to support researchers by providing methods for them to locate, identify, and cite research datasets with confidence.
This presentation was given by Adam Farquhar at the STM Publishers Association Innovation Conference on 4-Dec-2009.
DataCite and Campus Data Services
Paul Bracke, Associate Dean for Digital Programs and Information Services, Purdue University
Research libraries are increasingly interested in developing data services for their campuses. There are many perspectives, however, on how to develop services that are responsive to the many needs of scientists; sensitive to the concerns of scientists who are not always accustomed to sharing their data; and that are attractive to campus administrators. This presentation will discuss the development of campus-based data services programs, the centrality of data citation to these efforts, and the ways in which engagement with DataCite can enhance local programs.
Research Data Management and the Research Data Lifecycle: a Gentle Introduction Glen Newton
This document provides a gentle introduction to research data management and the research data lifecycle. It defines key terms like research, research data, and research data lifecycles. It discusses the benefits of data sharing, including enabling new research and testing new hypotheses. Research data can be complex with data, metadata, transformations, and combinations. The document outlines the research data lifecycle from collection to archiving and roles in data management.
Preparing your data for sharing and publishing Varsha Khodiyar
This document provides information on preparing data for sharing and publishing. It discusses organizing data through clear file and folder labeling, including additional context about methods and instruments. It also describes publishing data through journals like Scientific Data, which provide peer review and credit. Sensitive data requires careful handling and may be suitable for controlled access repositories. Overall the document offers guidance on effective data organization, documentation, sharing and receiving credit for shared data.
Translational Research Intelligence - Beyond Traditional BI shc66columbia
I. The document discusses the challenges of using traditional business intelligence (BI) approaches for translational research due to the complex nature of clinical and translational data.
II. It proposes a translational research intelligence (TRI) approach with key components like a centralized data hub, data exploration and analysis tools, and an iterative process for data cleaning, standardization, and scientific insights.
III. The TRI approach emphasizes rapid iterations, collaboration between informatics and research teams, understanding the data landscape, and gaining insights from "less-than-perfect" data earlier in the process.
Research Commons @ Stellenbosch University presented at RLC Academy, Mont Fle... Reed Elsevier
The Research Commons at Stellenbosch University aims to empower postgraduate research through specialized facilities, assistance, partnerships and events. It provides (1) dedicated work spaces, equipment and reference materials for postgraduate students; (2) research assistance from librarians and peer advisors on topics like referencing, bibliographic tools and technical writing; and (3) workshops and seminars on research skills. Initial feedback indicates that the Research Commons is motivating students and that its services have been well received, while also identifying opportunities to expand assistance and resources. Usage statistics show growing utilization of its study areas, seminar rooms and research support services.
Making the link: the library’s role in facilitating research collaboration. P... Reed Elsevier
The document discusses the trend of collaboration in science and how it aligns with human nature, as well as Stellenbosch University's goals of facilitating research collaboration. The Stellenbosch University Library and Information Service has a plan to strengthen collaboration by providing resources like seminar rooms, social networking sites, and tools to analyze researchers' outputs and connections. The library aims to identify opportunities for new collaborations and fill in "weak links" by connecting researchers with similar interests but who have not previously collaborated.
What's in the research librarian's tool shed? Reed Elsevier
This document provides an overview of bibliometric tools and metrics that can be used to measure and evaluate research impact and productivity. It discusses common metrics like publication counts, citations, h-index, g-index and m-index, which are calculated using bibliographic databases like Scopus and Web of Science. It also explores newer altmetric tools that can measure the impact of research outputs beyond publications. The document emphasizes that bibliometrics should be used cautiously and as a supplement to peer review, to avoid creating perverse incentives for researchers.
Part 2: Research support in a 21st century academic library: a case study of ... Reed Elsevier
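To make two of the metrics above concrete, here is a minimal sketch that computes the h-index and the g-index from a list of per-paper citation counts. It follows the usual definitions (h: the largest h such that h papers have at least h citations each; g: the largest g such that the g most-cited papers together have at least g² citations, here capped at the number of papers). The function names and the sample counts are assumptions for the example; in practice the counts would come from a bibliographic database.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

def g_index(citations):
    """Largest g such that the top g papers total at least g*g citations.

    This variant caps g at the number of papers in the list.
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g

# Illustrative citation counts, invented for this sketch.
papers = [10, 8, 5, 4, 3]
print(h_index(papers), g_index(papers))
```

Because the g-index works on cumulative citations, a few highly cited papers raise it more than they raise the h-index, which is one reason the two metrics can rank the same researchers differently.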
The document outlines the partnership between the library and research office at a university. It describes various services provided to support the different phases of the research lifecycle from preparing a project, gathering resources, creating research outputs, preserving and sharing results, and measuring impact. These include access to journals and books, workshops on topics like writing and career development, research repositories, and tools for identifying collaborators and tracking metrics. The goal is to provide dynamic support for academic excellence through all stages of research.
Elevating research librarianship to a new level, presented at LIASA Conferenc...Reed Elsevier
The document summarizes discussions from a research librarianship conference in South Africa. It covers several topics:
1) An overview of the Research Libraries Consortium (RLC) in South Africa and an internship program at US universities to learn from their research support practices.
2) Different aspects of research support discussed, including services at various US universities like research commons, data services, and support for the entire research process.
3) Information literacy instruction from a research perspective, emphasizing integrating it within courses, using collaborative activities, and teaching critical thinking.
4) Challenges and opportunities in collection building to support researchers, such as embracing digital formats, institutional repositories, and data management.
Confessions of an ex-librarian: research support across divisional bordersReed Elsevier
This document discusses the stages a librarian went through in adapting to changing research needs and environments. It begins with the librarian getting acquainted with new factors like the Research Libraries Consortium project and literature on the evolving research process. The librarian then pursued understanding new modes of research like Mode 2 knowledge production and the triple helix model. This led to a stage of commitment to support the entire research lifecycle. However, disagreements with colleagues on the librarian's role led to a stage of disillusionment. Finally, the librarian reached a stage of finding balance by acknowledging changes, collaborating with stakeholders, and empowering researchers through the correct use of tools.
Networked Science, And Integrating with DataverseAnita de Waard
This document discusses the growing interconnectedness of research data and tools in a networked science environment. It summarizes Elsevier's current and potential future connections to the Dataverse platform, including exporting data from the Hivebench ELN to Dataverse, linking articles to datasets in Dataverse through frameworks like Scholix, indexing Dataverse through Elsevier's data search tools, and tracking metrics on Dataverse datasets through analytics platforms like PlumX. The author expresses interest in further strengthening integration between these systems to advance open sharing of research data.
David Shotton - Research Integrity: Integrity of the published recordJisc
The document discusses several issues related to publishing research data and proposes solutions to address them. It describes projects that aim to make it easier for researchers to publish, archive, cite and reuse research data. This includes developing metadata standards, data repositories, and publishing data citations as linked open data to improve data discovery and attribution.
From metadata to data curation: the role of libraries in data exchangeLIBER Europe
This document summarizes a presentation on the role of libraries in data exchange. It discusses several projects by LIBER and the European Research Infrastructure to promote data sharing and preservation. A survey found that libraries need to develop skills and strategies for data citation, curation, and preservation. Moving forward, libraries should advocate for digital preservation policies, invest in workforce training, and help educate researchers on data management best practices. Overall, the presentation argues that libraries have an important role to play in supporting open data and research reproducibility.
This presentation sets out some of the challenges around citing and identifying datasets and introduces DataCite, the international data citation initiative. DataCite was founded on 1-December 2009 to support researchers by providing methods for them to locate, identify, and cite research datasets with confidence.
This presentation was given by Adam Farquhar at the STM Publishers Association Innovation Conference on 4-Dec-2009.
I gave this presentation to the STM Publishers Association Innovation Conference in London, 4-December-2009. It frames the data citation problem and introduces DataCite - the international data citation initiative.
The role of libraries in data exchangeLIBER Europe
The document discusses the role of libraries in data exchange and management. It summarizes the results of a survey of over 800 librarians on their current practices and skills related to supporting data management and exchange. Key findings include:
- 81% of respondents see demand from researchers for support in data exchange, but current library support is not meeting this demand.
- Only 32% of libraries believe they have the necessary skills currently, but 53% are working to develop these skills.
- The best place for underlying research data is agreed to be in official data repositories and archives.
- Developing subject expertise and integrating data management training into professional education are seen as important ways for libraries to build their skills
Linking Data to Publications through Citation and Virtual ArchivesMicah Altman
This document discusses linking data to publications through citation and virtual archives. It argues that data citation and sharing infrastructure are necessary for scientific reproducibility and open data. It outlines elements of data management plans and requirements for data sharing infrastructure, including persistence, provenance, access control and incentives. The document advocates for data citations as first-class objects and emerging practices like assigning DOIs to datasets. It presents several use cases for the Dataverse network, a virtual archive designed for research data sharing through federated and organizational models.
This document discusses engaging researchers in research data management (RDM) through data reference interviews. It provides an overview of EDINA and the University of Edinburgh Data Library and their roles in assisting researchers. It then describes the data reference interview process, highlighting the importance of understanding the researcher's field and data. Recommendations are provided for interviewing researchers and tools for assessing data are introduced. The document concludes by discussing the University's RDM strategy and engagement tools.
Data Publishing at Harvard's Research Data Access SymposiumMerce Crosas
Data Publishing: The research community needs reliable, standard ways to make the data produced by scientific research available to the community, while giving credit to data authors. As a result, a new form of scholarly publication is emerging: data publishing. Data publishing - or making data reusable, citable, and accessible for long periods - is more than simply providing a link to a data file or posting the data to the researcher’s web site. We will discuss best practices, including the use of persistent identifiers and full data citations, the importance of metadata, the choice between public data and restricted data with terms of use, the workflows for collaboration and review before data release, and the role of trusted archival repositories. The Harvard Dataverse repository (and the Dataverse open-source software) provides a solution for data publishing, making it easy for researchers to follow these best practices, while satisfying data management requirements and incentivizing the sharing of research data.
Digital preservation of geoscience information is important to ensure continued access to valuable scientific data over long periods of time. The Viking Mars mission from 1975 illustrates this need, as the original magnetic tape data degraded and became unreadable after 20 years. Proper digital preservation strategies like the Open Archival Information System model help ensure long-term access through migration to new formats, technology emulation, and institutional repositories. The presentation outlines the OAIS model and its key elements, and proposes implementing a pilot institutional repository in India to test an OAIS-compliant preservation approach for geoscience data.
DataCite: the Perfect Complement to CrossRefCrossref
DataCite was created to address the lack of effective ways to link datasets to articles and identify datasets. It assigns digital object identifiers (DOIs) to datasets to allow them to be cited similarly to scholarly articles. Many research institutions and libraries around the world are members of DataCite, including organizations in Europe, North America, and Asia. DataCite helps establish datasets as legitimate contributions to the scientific record that can be identified and cited.
A Generic Scientific Data Model and Ontology for Representation of Chemical DataStuart Chalk
The current movement toward openness and sharing of data is likely to have a profound effect on the speed of scientific research and the complexity of questions we can answer. However, a fundamental problem with currently available datasets (and their metadata) is heterogeneity in terms of implementation, organization, and representation.
To address this issue we have developed a generic scientific data model (SDM) to organize and annotate raw and processed data, and the associated metadata. This paper will present the current status of the SDM, implementation of the SDM in JSON-LD, and the associated scientific data model ontology (SDMO). Example usage of the SDM to store data from a variety of sources will be discussed along with future plans for the work.
This document contains notes from a database fundamentals class taught by Eng. Javier Daza on April 8, 2024. The notes cover the history and evolution of databases, definitions of databases, types of databases including relational and NoSQL, and characteristics and advantages of databases. The class included activities on database history, a pre-test quiz, and a discussion of the top Gartner technology trends and technologies from CES 2024. The goal of the class was to provide context on relational databases by exploring related topics.
LIBER Europe Covid-19 Research Libraries Survey - December 2020LIBER Europe
This document presents the results of a LIBER COVID-19 survey categorized by country and institution groups. It divides respondent institutions into three categories: Category A includes Western European countries, Category B includes Central and Eastern European countries, and Category C includes Southeastern European and Eastern European countries. The document consists of a series of graphs comparing survey responses across the different categories of institutions regarding the impact of the COVID-19 pandemic.
LIBER Webinar: Turning FAIR Data Into RealityLIBER Europe
These slides relate to a LIBER Webinar given on 23 April 2018. Turning FAIR Data Into Reality — Progress and Plans from the European Commission FAIR Data Expert Group.
In this webinar, Simon Hodson, Executive Director of CODATA and Chair of the FAIR Data Expert Group, and Sarah Jones, Associate Director at the Digital Curation Centre and Rapporteur, reported on the Group’s progress.
Copyright Reform: EU Legislative Process & LIBER AdvocacyLIBER Europe
LIBER's Copyright & Legal Matters Working Group met in Helsinki on 7 December 2017. This presentation, outlining the EU legislative process on copyright reform and LIBER advocacy, was given at the meeting by Helena Lovegrove, LIBER's Advocacy Adviser.
Applying Bourdieu's Field Theory to MLS Curricula Development. Charlotte Nord...LIBER Europe
The document discusses applying Pierre Bourdieu's field theory concept to analyze the changing positions of research librarians within university structures over time. It presents field theory concepts such as fields, doxa, habitus, and forms of capital. Diagrams show how positions within the university and library fields have changed, with research librarians previously higher in cultural capital now lower. Reasons for this include changes in client needs and other library staff professionalizing. It suggests ways for research librarians to reclaim prestige by ensuring services' value and combining domain knowledge with client needs. Finally, it outlines a new flexible master's program to help research librarians specialize in areas like project management, bibliometrics and data management
Growing a Culture for Change at The University of Manchester Library. Penny H...LIBER Europe
The University of Manchester Library underwent a culture change process to improve their strategy and leadership. Their initial strategy saw over 100 projects but lacked staff involvement which led to disconnect and resistance. To improve, they held meetings to get staff feedback and have staff self-elect involvement in developing a new strategy. For the new strategy, 30 staff were involved across 4 themes linked to the university's goals, compared to just 3 staff previously. Lessons learned included the importance of empowering staff, maintaining involvement, and regularly checking in with staff.
Enabling the Exchange and use of Data in AgricultureLIBER Europe
This presentation by Imma Subirats was part of the "Research Data Support Meets Disciplines: Opportunities & Challenges" workshop at LIBER's 2017 Annual Conference in Patras, Greece. For more information, see www.libereurope.eu
GDPR - Thoughts on the EU Data Protection Regulation, Research and LibrariesLIBER Europe
This presentation by Jonas Holm was part of the "Research Data Support Meets Disciplines: Opportunities & Challenges" workshop at LIBER's 2017 Annual Conference in Patras, Greece. For more information, see www.libereurope.eu
Research Data Services and Data Collections: Library Synergies for Economic R...LIBER Europe
This presentation by Thomas Bourke was part of the "Research Data Support Meets Disciplines: Opportunities & Challenges" workshop at LIBER's 2017 Annual Conference in Patras, Greece. For more information, see www.libereurope.eu
Where is the opportunity for libraries in the collaborative data infrastructure?
1. Where is the opportunity for
libraries in the collaborative
data infrastructure?
Susan Reilly
Project Manager
LIBER
susan.reilly@kb.nl
@skreilly
2. Contents
About LIBER
Some context
What is the collaborative data infrastructure?
Introducing the researcher to the CDI
Introducing the CDI to the researcher
Now and next?
3. LIBER: reinventing the library of the future
Largest network of European research libraries: 450 in over 40
countries
Mission:
To provide an information infrastructure to enable research
in LIBER institutions to be world class
4. Key performance areas
Scholarly communication and research infrastructures
Reshaping the research library
Advocacy
5. LIBER Projects
Reshaping the research library
Scholarly Communication & Research Infrastructure
Advocacy
6. So why am I here?
Reshaping the research library
Scholarly Communication & Research Infrastructure
Advocacy
Collaborative data infrastructure
7. What is the collaborative data infrastructure
(scientific data infrastructure)?
…it’s about data
8. Not just the 20+ petabytes that the LHC at CERN
produces every year
9. Libraries in the data deluge
Increasing amount of digitised and born digital content
in libraries
Increasing emphasis on open access publications and
data: mandates, institutional repositories
Demand for data management support
10.
11. What is the collaborative data infrastructure?
“a broad, conceptual framework for how different
companies, institutes, universities, governments and
individuals would interact with the system – what types of
data, privileges, authentication or performance metrics
should be planned. This framework would ensure the
trustworthiness of data, provide for its curation, and
permit an easy interchange among the generators and
users of data”
12. Now and Next
Authentication & authorisation
New skills
13. Introducing the researcher to the CDI
Current situation
ODE & linking data to publications
Demand for data management support
Advocacy
14.
15. Opportunities for data exchange (ODE)
identify, collate, interpret and deliver evidence of
emerging best practices in sharing, re-using, preserving
and citing data, the drivers for these changes and barriers
impeding progress, in forms suited to each audience
policy makers, funders, infrastructure operators, data
centres, data providers and users, libraries and publishers
16. Steps to creating the conditions for data
sharing
Understand data sharing today
Collection of “success stories”, “near misses” and “honourable
failures” in data sharing, re-use and preservation
Data & scholarly communications
Integrating data and publications
Best practice in data citation
New roles
Identify drivers and barriers
Interviews with stakeholders to seek consensus
Foto "Bell", Noordewierweg 116, Amersfoort.
17.
18. Hypotheses
“Without the infrastructure
that helps scientists manage
their data in a convenient
and efficient way, no
culture of data sharing will
evolve.”
Stefan Winkler-Nees
(German Research Foundation, DFG)
20. The Data Publication Pyramid
(1) Data contained and explained within the article
(2) Further data explanations in any kind of supplementary files to articles
(3) Data referenced from the article and held in data centers and repositories
(4) Data publications, describing available datasets
(5) Data in drawers and on disks at the institute
21. The Pyramid’s likely short term reality:
(1) Top of the pyramid is stable but small
(2) Risk that supplements to articles turn into Data Dumping places
(3) Too many disciplines lack a community endorsed data archive
(4) Estimates are that at least 75 % of research data is never made openly available
22. The Ideal Pyramid
(1) More integration of text and data, viewers and seamless links to interactive datasets
(2) Only if data cannot be integrated in article, and only relevant extra explanations
(3) Seamless links (bi-directional) between publications and data, interactive viewers within the articles
(4) More Data Journals that describe datasets, data mgt plans and data methods
23. Issues for researchers
Researchers need somewhere to put data and
make it safe for reuse
Researchers need to control its sharing and
access
Researchers need the ability to integrate data and
publication
Researchers need to get credit
for data as a first class research
object
Researchers need someone to
pay for the costs of data availability
and re-use
24. Library support for the researcher
Libraries and data centres must support…
Availability: data as first class research object: publishing, persistent identification/citation of datasets
Findability: data description, metadata, standards, documentation and retrieval
Interpretability: proper documentation of data
Re-usability: long-term data archiving including data curation and preservation
25. Implications for libraries
Level of integration | Implication for library
Data contained within the article | Prepare for adequate preservation strategies
Data published in supplementary files to articles | Presentation and preservation mechanisms; persistent link
Datasets referenced from the articles | Citability of dataset; persistent link; perpetual access to dataset
Data published independently from written publications (“data publication”) | Support publication process; curation of datasets; metadata and documentation
Data in drawers and on disks at the institute | Engage in data management planning
27. Advocacy
“Many researchers do not appear to see the value and
benefits of data citation. There is a gap, which could be
filled by libraries, in advocacy for data sharing, the use of
subject specific repositories, and best practice in data
citation. These, if filled, would increase the number of
researchers sharing and reusing data.”
http://www.alliancepermanentaccess.org/wp-content/plugins/download-monitor/downlo
28. Introducing the CDI to the researcher
Scoping the researcher’s requirements
Collaboration & policy development
29. The AAA Study: a research passport
“evaluate the feasibility of delivering an integrated
Authentication and Authorisation Infrastructure, AAI, to
help the emergence of a robust platform for access to and
preservation of scientific information within a Scientific
Data Infrastructure (SDI)”
30. Now and Next
Authentication & authorisation
New skills
33. Collaboration
“Networked science is on the rise, the researcher is no
longer working alone in his office, he is working virtually
with other researchers from around the world. For them it
is important that they can use the same software and
share and reuse the same content related objects, in a
trusted environment.”
Heinke Neuroth, Head of Innovation, Goettingen State &
University Library
34. Use Cases
1. Creating Data
2. Processing Data
3. Sharing Data
4. Preserving Data
5. Multi-disciplinary Data Services
6. Analysing Data
7. Accessing Data
8. Accessing Experiments and Data
35. Requirements…
Tracking of provenance, authenticity, integrity of the material
Integration of researcher ID with institutional credentials
Researchers’ self registration
Securely linking researcher and data identifiers for tracking
provenance
Delegation of identity management to home institute
Attribute provisioning for users participating in specific research
projects managed by the specific research groups (VOs)
Attribute aggregation
Unification and homogenisation of identity federations’ attributes and
agreed levels of assurance in order to facilitate authorisation
Accreditation of trusted Identity Providers (IdPs), based on
international standards, depending on the required level of assurance
Entitlement management to minimise the occurrence of events where
license monies are being paid twice without necessity (e.g., for
access to scientific journals).
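The attribute aggregation and level-of-assurance requirements above can be illustrated with a minimal sketch. It is not taken from the AAA Study: the class, function and attribute names, and the three assurance levels, are hypothetical and chosen only for illustration. It assumes attributes arrive from a home institute and from a research group (VO), keeps track of which source asserted each attribute, and authorises a request only when the matching attribute carries at least the required assurance.

```python
from dataclasses import dataclass, field

# Hypothetical assurance levels, ordered from weakest to strongest.
ASSURANCE_ORDER = ["low", "substantial", "high"]


@dataclass
class AttributeSource:
    """One source of attributes about a researcher (home institute or VO)."""
    name: str
    assurance: str  # must be one of ASSURANCE_ORDER
    attributes: dict = field(default_factory=dict)


def aggregate_attributes(sources):
    """Merge attributes from several sources into one profile.

    Each value is stored with the source that asserted it, so provenance
    is kept. On a key clash the source with the higher assurance wins.
    """
    profile = {}
    for source in sources:
        rank = ASSURANCE_ORDER.index(source.assurance)
        for key, value in source.attributes.items():
            current = profile.get(key)
            if current is None or rank > current["rank"]:
                profile[key] = {"value": value, "source": source.name, "rank": rank}
    return profile


def is_authorised(profile, required_attribute, required_value, minimum_assurance):
    """Grant access only if the attribute matches at sufficient assurance."""
    entry = profile.get(required_attribute)
    if entry is None or entry["value"] != required_value:
        return False
    return entry["rank"] >= ASSURANCE_ORDER.index(minimum_assurance)


# Illustrative data only: identity is delegated to the home institute,
# project attributes are provisioned by the research group (VO).
home_institute = AttributeSource(
    name="home-institute-idp",
    assurance="high",
    attributes={"researcher_id": "R-0001", "affiliation": "staff"},
)
research_group = AttributeSource(
    name="project-vo",
    assurance="substantial",
    attributes={"project_role": "data-curator", "affiliation": "member"},
)

profile = aggregate_attributes([home_institute, research_group])
```

Calling `is_authorised(profile, "project_role", "data-curator", "substantial")` then checks a project role asserted by the VO against the assurance level a service requires. A real federation would carry these attributes in signed assertions rather than plain dictionaries.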
38. Collaboration & policy development
Policies for data sharing
Values & Ecosystems
Infrastructure & Technology
Legal & Ethical
Institutional Support
http://recodeproject.eu/
39. Now & next
What should our priorities be?
LIBER ten recommendations:
http://www.libereurope.eu/news/ten-recommendations-for-libraries-to-get-started-with-research-data
41. 2. Collaborate
Alliance for Permanent Access to the Record of Science
in Europe Network (APARSEN)
look across the excellent work in digital preservation which is
carried out in Europe and to try to bring it together under a
common vision
Trust! Sustainability! Usability! Access!
http://www.alliancepermanentaccess.org/
This figure suggests, in the broadest possible terms, how different actors, data types and services should interrelate in a global e-infrastructure for science. Data generators and users gather, capture, transfer and process data - often, across the globe, in virtual research environments. They draw upon support services in their specific scientific communities - tools to help them find remote data, work with it, annotate it or interpret it. The support services, specific to each scientific domain and provided by institutes or companies, draw on a broad set of common data services that cut across the global system; these include systems to store and identify data, authenticate it, execute tasks, and mine it for unexpected insights. At every layer in the system, there are appropriate provisions to curate data - and to ensure its trustworthiness.
Libraries and data centres must support data publishing as a prerequisite for data availability, including persistent identification/citation of datasets, and solutions for data description and retrieval, which together facilitate findability. They must also ensure that data is properly documented as a condition for data interpretability and re-usability and prepare for long-term data archiving including data curation and preservation.