Library Partnerships--Oh, the possibilities! (Joanne Romano)
Library Partnerships with patron institutions are more important than ever before. Lack of staff and funding should not be a barrier to expanding research collaborations with your patrons. Find out how the Texas Medical Center Library used creativity and teamwork to successfully establish new institutional partnerships within the Texas Medical Center.
RDAP 16 Poster: Data Management Training Clearinghouse (ASIS&T)
Research Data Access and Preservation Summit, 2016
Atlanta, GA
May 4-7, 2016
Poster session (Wednesday, May 4)
Presenters:
JC Nelson, Upper Midwest Environmental Sciences Center, USGS
Nancy Hoebelheinrich, Knowledge Motifs, LLC
Tamar Norkin, Core Science Analytics, Synthesis & Libraries, USGS
Amber Budden, DataONE
Sophie Hou, University of Illinois Urbana-Champaign
Shelley Knuth, University of Colorado Boulder
Erin Robinson, Foundation of Earth Science / ESIP Federation
David Bassendine, Blue Dot Lab
RDAP 16 Poster: Librarian Research Data: Customizing the DMP Assistant for Pr... (ASIS&T)
The document discusses research standards and support structures for tenured and tenure-track librarians at the University of Saskatchewan. It mentions that librarians must meet research standards outlined in the university's 2015 promotion and tenure guidelines. It also discusses support structures established like the Librarians' Forum and Dean's Research Lecture. Finally, it references a 2010 survey that further outlined librarian research areas and where more support was needed.
The document provides an overview of research at the University of York, which was founded in 1963 and now has around 16,000 students including over 450 international students. The university's research strategy focuses on research excellence, innovation, international collaboration, impact, and interdisciplinary themes like creativity, health, and technologies. Implementing the strategy involves recruiting top researchers, supporting postgraduate students, and managing research data, which presents particular challenges for data in the arts and humanities like paper notes, archives, and media files.
Finding Insights in Article-Level Metrics for Research Evaluation (Richard Cave)
The use of Article-Level Metrics (ALMs) as an indicator of an article’s quality and impact has dramatically increased in the last year. Publishers continue to add ALMs to research articles and new organizations have been created to aggregate ALMs across multiple fields including usage, citations, and social media. Using ALMs, researchers, librarians, funders, and the general public are able to gain insight into research articles that are the most widely read and used. PLOS launched ALM Reports (http://almreports.plos.org/) which allow users to view ALMs for any set of PLOS articles and visualize the data results. This allows users to quickly explore and compare ALMs for a large number of articles by searching for papers published by researchers at their institutions, for papers funded by specific funding agencies, or by searching on generic terms within an article. The application can be used to access up-to-date information on research papers, to view data on the downstream impact of the research, and to measure evidence of wider engagement with the research. These insights provide a powerful way to evaluate impact of research across many articles in a single view.
This document provides guidance on conducting background research for a topic. It discusses defining the topic, using dictionaries and encyclopedias to understand the topic at a high level, and how to narrow the focus for a research project. Key aspects covered include using reference sources to understand the history, context and breadth of a topic. The document also recommends strategies for keyword mining and searching to find additional relevant sources, and provides an example of narrowing a broad topic to a more specific research question.
Data management basics, for UC Davis EDU 292 (Phoebe Ayers)
This document provides information and guidance about data management for EDU 292. It lists resources for data management from UC Davis Libraries and highlights key reasons for properly managing research data such as reproducibility, credibility, and fulfilling requirements. It discusses metadata, storage options, backups, file formats, and security. It also covers citing data sources accurately and linking works together. The document encourages participants to consider aspects like long-term maintenance, access, and version control for research data and raises questions to facilitate planning proper data management practices.
This document summarizes the history and evolution of scholarly publishing, including:
- Commercial publishers began acquiring journals in the 1960s, dominating the market with high profit margins.
- Journal prices skyrocketed in the 1990s-2000s beyond what libraries could afford, known as the "serials crisis".
- The rise of open access emerged due to the internet and high journal costs, allowing unrestricted online access to peer-reviewed research.
- Publishers were seen as only caring about profits, fueling a backlash against traditional models.
Get assistance with grant compliance (public access policies), copyright questions, publication agreements, and rights retention from U of Tennessee's Scholarly Communication & Publishing Librarian.
This document provides an overview of information resources available at Lipscomb University's Beaman Library. It discusses reference books, periodicals, indexes, databases, and websites that can be used for research. Evaluation of online sources is also covered, highlighting the importance of considering the author, date, purpose, and content of a website. Students needing help with research are encouraged to consult reference librarians.
This presentation was provided by Courtney R. Butler of The Federal Reserve Bank - Kansas City, during part two of the NISO two-part webinar "Building Data Science Skills: Strategic Support for the Work, Part Two," which was held on March 18, 2020.
Discover the Power Inside Web of Science (Maira Bundza)
The document summarizes research conducted to analyze Eastern Michigan University faculty publications and citations between 2005-2007. Researchers retrieved article and citation data for EMU authors from the Web of Science database. They surveyed EMU faculty authors and interviewed a sample. The analysis found that 121 EMU faculty published 244 articles in 209 unique journals, 87.56% of which the university library owned. The most frequently cited journals were also identified. Faculty responses indicated the library resources were generally adequate but could be improved with additional online journals and resources.
This document summarizes challenges and efforts around managing research data in the arts and humanities. It discusses how "data" is not clearly defined in these domains as it is in STEM fields. Universities like UAL and GSA are working to educate researchers on identifying, organizing, and sharing their diverse research outputs and formats. This includes developing data repositories, training, and communities of practice to establish best practices and support researchers in meeting new data management policies and obligations. While there are fewer external funder requirements compared to STEM, these universities are using collaborative approaches to engage arts and humanities researchers in responsible research data management.
This document discusses predatory publishing and provides tips to help researchers avoid predatory journals. It notes that gold open access models have allowed corrupt publishers to flourish by only charging publication fees after acceptance. It outlines characteristics of predatory publishers like using similar names to reputable journals, having grammatical errors on their websites, no legitimate peer-review processes, and charging high author fees after publication. The document provides advice on how to check publishers and journals, such as looking for valid contact details, reviewing previous papers, and checking peer-review processes. It also suggests using a university repository as an alternative open access option without fees.
This document provides an overview of literature searching and using databases to find veterinary journal articles. It discusses what databases are and how they index journal articles. Key databases for veterinary literature are identified as Medline, Science Citation Index, and Science Direct. Search strategies are recommended, including defining your question and identifying relevant concepts and terms. Instructions are provided for accessing databases through the library website and conducting sample searches.
This presentation was provided by Jan Fransen of the University of Minnesota - Twin Cities during the NISO virtual conference, Research Information Systems: The Connections Enabling Collaboration, held on August 16, 2017.
Creation, Transformation, Dissemination and Preservation: Advocating for Scho... (NASIG)
This document discusses scholarly communication and research workflows. It defines scholarly communication as the creation, transformation, dissemination, and preservation of knowledge related to teaching, research, and scholarly endeavors. It notes trends toward increased inter-institutional collaboration and the use of social media and tools to support collaboration. Libraries are focusing on supporting discoverability, availability, and research management. Comparison is made of citation management tools like EndNote, Mendeley, and Zotero. The conclusion emphasizes that scholarly communication now involves multiple authorship, inter-institutional collaboration, and collaboration through social networks.
EBSCO Discovery Service @ Union Institute & University (Tina Beis)
This document discusses Tina Beis' role as the Technical Services and Electronic Resources Librarian at a small online university. It provides details about how the university library selected and implemented EBSCO Discovery Service (EDS) to provide a unified search across its resources. The library customized EDS settings and holdings to best serve its remote students. Usage statistics and feedback are reviewed regularly to improve EDS.
This document provides an overview of basic resources available through the Penn State University Libraries. It describes the library blog, website, online catalog, and key databases like ProQuest, CQ Researcher, Opposing Viewpoints Resource Center, and NewsBank. It also mentions research guides, citation and writing guides, and ways to get help from University librarians. The goal is to introduce students to the many tools and resources available to support their research needs.
Collaboration between libraries, archives and museums: Essential for maintain... (tsoleau)
This is a presentation I gave on the topic of my Master's portfolio at UCLA in November 2009. Most of the content was spoken and not included in the slides, but you can still get the idea.
Library databases are collections of published information from reliable sources that are only accessible through the school library homepage. They contain fact-checked references, newspapers, magazines, academic journals, and primary sources across many subject areas. Databases are valuable research tools as they go beyond regular web searches by providing focused, curated resources to help narrow topics, and recommend related information from experts in various fields.
BEng Product Design 1st year Session 2 Oct 2021 (EISLibrarian)
This document provides an overview of different sources of information and inspiration for product design students, including books, journals, magazines, trade journals, objects, websites, and library resources. It discusses the purpose and strengths of each information source, and provides guidance on evaluating online information and using library search tools and subject guides to find relevant materials.
The vision for 'the Research Paper of the Future' promises to make scholarship more discoverable, transparent, inspectable, reusable and sustainable. Yet new forms of scientific output also challenge authors, librarians, publishers and service providers to register, validate, disseminate and preserve them as elements of the scholarly record. What constitutes authorship in a collaborative process of GitHub pull requests and commits? When to capture, reference and preserve dynamic data sets that change over time? How to package and render complex executable collections for review and delivery? This session considers key challenges in operationalising the Research Paper of the Future from the perspectives of a publisher, a library administrator and a scientist/developer of a collaborative authoring platform.
Short overview responding to the following 4 questions, as suggested by the RDA Long Tail Data IG:
1. Name and location of institution/service
2. What type of data do you collect and how do you acquire the data?
3. What services do you provide?
4. How do you intend to interoperate with a global ecosystem of research data?
SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ... (Susanna-Assunta Sansone)
Part of the SciDataCon14 workshop on "Data Papers and their applications" run by myself and Brian Hole to help attendees understand current data-publishing journals and trends and help them understand the editorial processes on NPG's Scientific Data and Ubiquity's Open Health Data.
Oxford DTP - Sansone - Data publications and Scientific Data - Dec 2014 (Susanna-Assunta Sansone)
- The document discusses the need for open and accessible data in research. It notes that over 50% of studies are not published due to selective reporting of results.
- There is a movement for "FAIR data" in life and medical sciences, where data is findable, accessible, interoperable, and reusable. However, not much data currently meets these standards.
- Publishers can play a role in incentivizing data sharing by implementing policies requiring data availability and format standards for publishing research. This includes supporting data citations and data journals.
The document summarizes a presentation about making scientific data FAIR (Findable, Accessible, Interoperable, Reusable). It discusses the concept of FAIR data and several of the presenter's related projects. Examples are provided of using standards like ISA-Tab to structure metadata and make datasets interoperable. The presentation outlines the presenter's roles in data capture, publication, and standards development efforts to promote FAIR data principles. Scientific Data, a new journal for peer-reviewed data descriptions, is introduced as a way to make datasets more discoverable and reusable.
Scientific Data is a data journal that publishes structured data descriptors and accompanying research data to promote open and reproducible science. Data descriptors provide detailed methods and validation so that other researchers can understand and reuse shared data. Through peer review of data quality and reuse potential, and by providing incentives such as citations, Scientific Data aims to address issues like selective reporting and to make shared research data more accessible and useful.
This document summarizes Susanna-Assunta Sansone's presentation on open access and open data at Nature Publishing Group. Some key points discussed include:
- The benefits of open data including reducing errors/fraud and increasing return on investment in research. However, barriers also exist such as lack of incentives and standards.
- Recent initiatives at NPG to improve data/reproducibility such as requiring data behind figures and expanding methods sections.
- The role of data journals in increasing credit/visibility for shared data and promoting standards/best practices.
- Market research found researchers want increased visibility, usability, and credit for sharing their data.
Susanna-Assunta Sansone is a data consultant and honorary academic editor who works on several projects related to making data FAIR (Findable, Accessible, Interoperable, Reusable). She is the associate director of Scientific Data, a peer-reviewed journal focused on publishing data descriptors to describe and provide access to scientifically valuable datasets. The goal of Scientific Data is to help promote open science and data reuse by publishing structured metadata and narratives about datasets alongside traditional research articles.
Scientific Data overview of Data Descriptors - WT Data-Literature integration... (Susanna-Assunta Sansone)
This document introduces Scientific Data, a new peer-reviewed journal for publishing data descriptors from Nature Publishing Group. It will provide structured metadata and narrative articles to describe datasets for reuse. The journal is now open for submissions and will launch in May 2014, featuring an advisory panel and sections for standardized data descriptor articles and experimental metadata. It aims to give proper credit for data sharing and promote open access, reuse and peer review of curated scientific datasets.
The document provides an overview of research data management and the importance of avoiding a "DATApocalypse" or data disaster. It discusses the definition of research data, why data management is important, questions to consider, best practices for data management planning, documentation, and long-term preservation. The goal is to help researchers and institutions properly manage data to enable sharing and preservation, as required by most major funders.
Alain Frey: Research Data for universities and information producers (Incisive_Events)
Research data is growing exponentially but is disparate and challenging to understand fully. Universities face challenges in managing research data to meet funding and standards requirements. Thomson Reuters launched the Data Citation Index to make research data discoverable, accessible, and citable by bringing important data from diverse repositories into one searchable index. This addresses the need for a single access point for quality research data across disciplines and locations.
This document describes a web-based registry that aims to provide guidance for researchers, developers, and curators on selecting content standards and databases. There are almost 600 content standards in the life sciences. The registry will allow users to search for, filter, submit, and view information on standards and databases. It will link standards to databases that implement them and provide visualizations of standard formats and terminologies. The goal is to help stakeholders make informed decisions by providing a curated, searchable registry of standards and database information. An advisory board and working group will oversee the registry's development and operations.
This document discusses the need for critical infrastructure to promote data synthesis and evidence-based nutrient management. It outlines 10 steps for real-time data uptake, analysis, and customized nutrient recommendations. Key challenges include data standards, minimum data sets, provenance, and repositories. The Purdue University Research Repository is presented as a solution, providing preservation, curation, and publication of agricultural data. Hands-on support from librarians and agronomists is discussed to help researchers transition data and ensure best practices.
Overview to: BBSRC Oxford Doctoral Training Partnership - Dr Sansone - July 2014 (Susanna-Assunta Sansone)
What to know when planning for your data management strategy and preparing a data management statement for a research proposal for BBSRC DTP first year students
NPG Scientific Data - Metabolomics Society meeting, Tsuruoka, Japan, 2014 (Susanna-Assunta Sansone)
This document provides information about Scientific Data, an online publication from Nature Research that publishes peer-reviewed descriptions of scientifically valuable datasets. It summarizes the goals of Scientific Data, which are to promote data sharing, reuse, and reproducibility. The document outlines the structured format for Data Descriptors, which include both a narrative component and experimental metadata. It describes the peer review process, which focuses on data quality, completeness of description, and potential for reuse rather than novelty of findings. Finally, it provides examples of diverse current content and encourages collaboration with data repositories.
A 45-minute presentation given at the 'Getting published in Nature's Scientific Data journal' event, hosted by the University of Cambridge Research Data Management team (www.data.cam.ac.uk). Presented on Monday 11th January 2016.
1) The document discusses best practices for managing research data, including organizing files, documenting data with metadata, storing data securely both internally and externally, and presenting data through tables, charts, and text for publication and sharing.
2) Key recommendations for data management include using logical file naming conventions, non-proprietary file formats, and documenting data with standard metadata fields. External repositories can increase data accessibility and preservation.
3) Effective data presentation involves using tables and charts to clearly visualize quantitative and qualitative findings. Graphs should have clear titles and labels while tables should have logical data placement. Text should concisely summarize results.
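The file-naming and metadata recommendations above can be sketched in a few lines of Python. The naming pattern (`project_experiment_date_version.ext`) and the sidecar field names are illustrative assumptions, not a prescribed standard; adapt them to your own lab's convention:

```python
import json
import re
from datetime import date
from pathlib import Path

# Assumed naming convention: project_experiment_YYYY-MM-DD_vN.ext
NAME_PATTERN = re.compile(r"^[a-z0-9]+_[a-z0-9]+_\d{4}-\d{2}-\d{2}_v\d+\.\w+$")

def is_well_named(filename: str) -> bool:
    """Check a data file name against the (assumed) convention."""
    return NAME_PATTERN.match(filename) is not None

def write_metadata(data_file: Path, creator: str, description: str) -> Path:
    """Write a minimal metadata sidecar with Dublin Core-like fields."""
    record = {
        "title": data_file.name,
        "creator": creator,
        "date": date.today().isoformat(),
        "description": description,
        "format": data_file.suffix.lstrip("."),
    }
    sidecar = data_file.with_suffix(data_file.suffix + ".metadata.json")
    sidecar.write_text(json.dumps(record, indent=2))
    return sidecar

print(is_well_named("soilsurvey_plot3_2014-11-11_v2.csv"))  # True
print(is_well_named("final FINAL(2).csv"))                  # False
```

A machine-checkable convention like this can be run over a project directory before deposit, catching ad hoc names and missing documentation early rather than at publication time.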
Why should I care about information literacy? (nmjb)
This document summarizes a workshop on improving researchers' competency in information handling and data management. The workshop covered how information literacy relates to researcher development, defined information literacy using the 7 Pillars model, and discussed national initiatives and case studies in applying information literacy. Participants engaged in group work applying information literacy concepts to the Researcher Development Framework and discussed motivation and examples of good practice in supporting information literacy development.
This document discusses the BioSharing registry, which connects standards, databases, and policies in the life sciences. BioSharing provides a searchable portal for standards and databases, helping researchers choose the right options for publishing and funding requirements. It monitors the development of standards and their adoption. The registry links three sections on standards, databases, and policies to help answer common questions about which options to use. Users can search, filter, and refine results or create customized collections. BioSharing aims to support better informed decisions across the life sciences research community.
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data (Susanna-Assunta Sansone)
1) The document discusses Susanna-Assunta Sansone's roles and work related to promoting FAIR data standards and practices.
2) It highlights some of her leadership positions with organizations like BioSharing that work to map and promote standards.
3) The document also discusses Scientific Data, a peer-reviewed journal launched by Nature Publishing Group to publish detailed descriptions of scientifically valuable datasets to facilitate reuse.
Similar to: Big data, small data, data papers - short statement for "BDebate on Biomedicine 2014"
This document summarizes a presentation given by Susanna Sansone at the GSC 23rd meeting education day in Bangkok, Thailand on August 7, 2023. The presentation discussed standards across life sciences, including definitions of different types of standards and over 1,600 identified standards. It covered standard organizations and grassroots groups, as well as the FAIRsharing database which catalogs over 2,885 standards and databases and aims to promote their use and value across research.
The FAIRsharing journey in RDA document discusses:
1) FAIRsharing's growth and involvement with RDA since 2011, including its Working Group established in 2015 to curate standards, databases, and policies to promote FAIR data.
2) FAIRsharing's current activities and impact, such as its registry of over 4,000 records from many disciplines and usage in various tools and services.
3) Opportunities for further engagement with RDA, such as leveraging their expertise for contributions to the FAIR Cookbook, an open resource providing technical recipes for applying FAIR principles to life science data.
Overview of metadata standards, and how FAIRsharing and the FAIR Cookbook help with selecting and using them. Presentation to the "What is metadata? Common standards and properties" EPH Workshop, November 9, 2022: https://ephconference.eu/pre-conference-programme-441
Pharma companies and academia are joining forces to make data FAIR (Findable, Accessible, Interoperable, and Reusable) through the development of the FAIR Cookbook. The FAIR Cookbook provides a growing set of over 70 recipes that give step-by-step guidance on improving the FAIRness of different data types through the use of tools, technologies, and best practices. It aims to provide practical examples and guidelines to support researchers, data managers, and others in managing data according to FAIR principles. The FAIR Cookbook is an open, community-developed resource overseen by an editorial board, with contributions from nearly 100 life sciences professionals.
FAIR, community standards and data FAIRification: components and recipes (Susanna-Assunta Sansone)
Overview of FAIR, FAIRsharing and the FAIR Cookbook at the ATI event on Knowledge Graphs: https://github.com/turing-knowledge-graphs/meet-ups/blob/main/symposium-2022.md
Presentation to the EOSC workshop on policies (https://eoscfuture.eu/eventsfuture/monitoring-eosc-readiness-fair-data-policies) on what FAIRsharing does for policies, including providing registration, discovery, flexible and clearer descriptions, relationships, machine readability and comparability.
The document summarizes how FAIRsharing assists others with promoting FAIR data principles without directly assessing FAIRness compliance. It does this by (1) providing a lookup service for standards and repositories via its API, (2) serving as a registry for FAIRness tests and indicators to make them discoverable, and (3) enabling communities to create profiles declaring which standards and repositories they use. The document also outlines FAIRsharing's operations, advisory boards, and future plans to further support assessment and tracking of FAIRness improvements over time.
ELIXIR is a European infrastructure that brings together life science resources from across Europe. It offers databases, tools, computing capabilities, and training opportunities. ELIXIR nodes provide these services and connect national data infrastructures. ELIXIR communities connect infrastructure experts to drive service developments. ELIXIR is funded through a mixed model including public sources. It works to sustain important biological data resources and make data FAIR through recommended standards and interoperability resources. ELIXIR also aims to develop a sustainable tools ecosystem and provides training through its portal.
Presentation to the EC Workshop on Maximizing investments in health research: FAIR data for a coordinated COVID-19 response. Workshop III, November 8, 2021.
Presentation to the EC Workshop on Maximizing investments in health research: FAIR data for a coordinated COVID-19 response. Workshop I, October 11, 2021.
The FAIR Cookbook poster, as presented at the ELIXIR-UK Node and the UK Conference of Bioinformatics and Computational Biology 2021: https://www.earlham.ac.uk/uk-conference-bioinformatics-and-computational-biology-21
The FAIR Cookbook poster, as presented at the UK Conference of Bioinformatics and Computational Biology 2021: https://www.earlham.ac.uk/uk-conference-bioinformatics-and-computational-biology-21
Codeless Generative AI Pipelines (GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
End-to-end pipeline agility - Berlin Buzzwords 2024 (Lars Albertsson)
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
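The schema-metaprogramming idea above can be illustrated with a minimal Python sketch. This is not the implementation described in the talk, only an assumed-form example: downstream job schemas are derived programmatically from one upstream field list, so an upstream change propagates without per-job boilerplate while the result remains a statically structured type rather than schema-on-read:

```python
from dataclasses import make_dataclass, fields

# Upstream schema defined once, as (name, type) pairs.
RAW_EVENT_FIELDS = [
    ("user_id", str),
    ("timestamp", int),
    ("country", str),
]

def derive_schema(name, base, add=(), drop=()):
    """Metaprogram a downstream schema: start from the upstream
    field list, drop some fields, and append derived ones."""
    kept = [(n, t) for (n, t) in base if n not in set(drop)]
    return make_dataclass(name, kept + list(add))

RawEvent = make_dataclass("RawEvent", RAW_EVENT_FIELDS)

# Hypothetical downstream job: anonymized aggregate, so no user_id,
# plus one derived column.
SessionStats = derive_schema(
    "SessionStats", RAW_EVENT_FIELDS,
    add=[("session_count", int)], drop=["user_id"],
)

print([f.name for f in fields(SessionStats)])
# ['timestamp', 'country', 'session_count']
```

Adding a field to `RAW_EVENT_FIELDS` now flows into every derived schema automatically, while typos in a derived record still fail at construction time, keeping the early-error protection that schema-on-read gives up.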
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W... (Social Samosa)
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai... (Kaxil Naik)
Navigating today's data landscape isn't just about managing workflows; it's about strategically propelling your business forward. Apache Airflow has stood out as the benchmark in this arena, driving data orchestration forward since its early days. As we dive into the complexities of our current data-rich environment, where the sheer volume of information and its timely, accurate processing are crucial for AI and ML applications, the role of Airflow has never been more critical.
In my journey as the Senior Engineering Director and a pivotal member of Apache Airflow's Project Management Committee (PMC), I've witnessed Airflow transform data handling, making agility and insight the norm in an ever-evolving digital space. At Astronomer, our collaboration with leading AI & ML teams worldwide has not only tested but also proven Airflow's mettle in delivering data reliably and efficiently—data that now powers not just insights but core business functions.
This session is a deep dive into the essence of Airflow's success. We'll trace its evolution from a budding project to the backbone of data orchestration it is today, constantly adapting to meet the next wave of data challenges, including those brought on by Generative AI. It's this forward-thinking adaptability that keeps Airflow at the forefront of innovation, ready for whatever comes next.
The ever-growing demands of AI and ML applications have ushered in an era where sophisticated data management isn't a luxury—it's a necessity. Airflow's innate flexibility and scalability are what makes it indispensable in managing the intricate workflows of today, especially those involving Large Language Models (LLMs).
This talk isn't just a rundown of Airflow's features; it's about harnessing these capabilities to turn your data workflows into a strategic asset. Together, we'll explore how Airflow remains at the cutting edge of data orchestration, ensuring your organization is not just keeping pace but setting the pace in a data-driven future.
Session in https://budapestdata.hu/2024/04/kaxil-naik-astronomer-io/ | https://dataml24.sessionize.com/session/667627
Big data, small data, data papers - short statement for "BDebate on Biomedicine 2014"
1. What is Big Data in Biomedicine? Data Types to be considered
Susanna-Assunta Sansone, PhD
@biosharing | @isatools | @scientificdata
B-DEBATE: Big Data in Biomedicine. Challenges and Opportunities, 11 Nov, 2014
Data Consultant; Honorary Academic Editor; Associate Director; Principal Investigator
2. Let’s not forget the long tail of research data
• Big science efforts represent only a small proportion
o often featuring homogeneous and well-organized data
• There is a large proportion of small independent research efforts
o a rich variety of specialty data sets
3. Let’s not forget the long tail of research data
• Small independent research efforts fall in the long tail of the distribution
o Most of this (such as siloed databases and null findings) is unpublished
o These dark data hold a potential wealth of knowledge
4. Plagued by selective reporting of data and methods
• Over 50% of completed studies in biomedicine do not appear in the published literature
• Instead they reside in file drawers and on personal hard drives
• Often because results do not conform to authors’ hypotheses
“Only half the health-related studies funded by the European Union between 1998 and 2006 - an expenditure of €6 billion - led to identifiable reports”
5. Role of data papers and data journals
• Incentive and credit for sharing
o Big and small data
o Unpublished data
o Long tail of data
o Curated aggregation
• Peer review focus
• Value of data vs. analysis
• Discoverability and reusability
o Complementing community databases
• Narrative/context
6. Role of data papers and data journals
• The power of “small data” is in their aggregation and integration with other datasets
• There is value in all well-curated, validated and reusable data - big and small
7. Adding value to research articles and data records
[Diagram: Data Descriptors positioned between research articles and data records]
8. Adding value to research articles and data records
[Diagram: Data Descriptors positioned between research articles and data records, annotated: credit for sharing your data; focused on reuse and reproducibility; peer reviewed, curated; Open Access; promoting community data and code repositories]
12. Help stakeholders to make informed decisions
Researchers, developers and curators lack support and guidance on how to best navigate and select content standards, understand their maturity, or find databases that implement them.
Funders, journals and librarians do not have enough information to make informed decisions on which content standards or databases to recommend in policies, fund, or implement.
13. Summarizing
• Selective reporting of data and methods is still an issue
• Let’s not forget the potential value of the long tail of data
• Data papers and journals can provide incentive and credit to share more data - big and small
• Content standards do help - but the current wealth of options is an obstacle