Metadata and data citation. Session 2.5 of the RDMRose v3 materials.
The JISC-funded RDMRose project (June 2012-May 2013) was a collaboration between the libraries of the Universities of Leeds, Sheffield and York and the Information School at Sheffield to provide an Open Educational Resource on Research Data Management for information professionals. The materials were revised between November 2014 and February 2015 for the consortium of North West Academic Libraries (NoWAL).
http://www.sheffield.ac.uk/is/research/projects/rdmrose
1. Metadata and data citation
May-15
Learning material produced by RDMRose
http://www.sheffield.ac.uk/is/research/projects/rdmrose
Research Data Management
Workshop 2.5
2. Learning Outcomes
By the end of this session you will be able to:
• Discuss the varying requirements of metadata that will enable researchers to identify the potential of a particular dataset
• Evaluate ways of citing data
• Articulate and reflect upon some of the issues involved with citing data and datasets
3. Session 2.5 overview
• EPSRC principles and expectations
• What is sufficient metadata?
• How to cite data?
4. EPSRC Principle 6
• “Sufficient metadata should be recorded and made openly available to enable other researchers to understand the potential for further research and re-use of the data. Published results should always include information on how to access the supporting data.”
http://www.epsrc.ac.uk/about/standards/researchdata/Pages/principles.aspx
5. EPSRC Expectation 5
• “Research organisations will ensure that appropriately structured metadata describing the research data they hold is published (normally within 12 months of the data being generated) and made freely accessible on the internet; in each case the metadata must be sufficient to allow others to understand what research data exists, why, when and how it was generated, and how to access it. Where the research data referred to in the metadata is a digital object it is expected that the metadata will include use of a robust digital object identifier (For example as available through the DataCite organisation - http://datacite.org).”
http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx
6. Activity 1: Metadata
• What is “sufficient metadata” that enables
“other researchers to understand the
potential for further research and re-use of
the data”?
7. Activity 1: Metadata
The University of Poppleton holds a dataset with meteorological
observations, taken at the university’s weather station. In particular, it
contains a set of precipitation measurements since the foundation of
the university. A climatologist, Jenny Fairweather, is interested in this
dataset for her research into climate change. She is looking for trends
in the weather. A meteorologist, Wilson Rainbird, who works for the
UK Met Office wants to use these data for the purposes of weather
prediction. He is mainly interested in combining these precipitation
measurements with other similar datasets. A researcher, Alice Snowe,
from another university’s Accident Research Unit conducts most of her
research in the area of road traffic accidents. She would like to map
the precipitation measurements to another dataset containing
information on road accidents in order to analyse possible
correlations. Lastly, the university’s data repository manager, John
Shower, is concerned with issues regarding data access and IPR.
8. Activity 1: Metadata
• What is “sufficient metadata” for each of
these stakeholders “to understand the
potential for further research and re-use of
the data”?
9. Example
• The DaMaRO project at the University of Oxford has developed a
metadata schema for its DataFinder (Rumsey, 2012).
• A three-tier metadata approach:
– Mandatory minimal metadata to enable basic discovery, such as
Creator, Title, Publisher, Date, Location, Access terms & conditions
– Mandatory contextual metadata (mostly administrative and partly
based on EPSRC expectations), such as Funding Agency, Grant Number,
Last access request date, Project Information, Data Generation
Process, Why the data was generated, Date (range) of data collection,
Reasons for embargo
– Optional metadata (including discipline-specific metadata) to enable
reuse, such as machine settings and experimental conditions under
which the data were gathered
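The tiers above can be read as a checklist against which a metadata record is tested. The following is a minimal Python sketch, assuming snake_case field names adapted from the examples on the slide. The field names, the selection of contextual fields, and the missing_fields helper are illustrative assumptions and are not the DaMaRO schema itself.

```python
# Mandatory tiers, using field names adapted from the slide's examples.
# These lists are illustrative assumptions, not the actual DaMaRO schema.
MINIMAL = ["creator", "title", "publisher", "date", "location", "access_terms"]
CONTEXTUAL = [
    "funding_agency",
    "grant_number",
    "data_generation_process",
    "collection_date_range",
]


def missing_fields(record):
    """Return the mandatory fields that are absent or empty, per tier."""
    def absent(fields):
        return [name for name in fields if not record.get(name)]

    return {"minimal": absent(MINIMAL), "contextual": absent(CONTEXTUAL)}


# A partly completed record for the Poppleton dataset used in the activities.
record = {
    "creator": "University of Poppleton",
    "title": "Precipitation measurements 1905-2010 taken at Western Bank weather station",
    "publisher": "The University of Poppleton, Meteorological Service",
    "date": "2011",
    "access_terms": "open access (hypothetical value for illustration)",
}

report = missing_fields(record)
print(report)
```

A repository could run a check of this kind at deposit time: gaps in the minimal tier would block discovery, whereas gaps in the contextual tier could be followed up with the depositor.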
10. Activity 2: Data citation
• How should data be cited?
• There are no established standards for data
citation yet, although some style manuals
such as the APA’s (in the 5th and 6th editions)
and some repositories such as the UK Data
Archive do provide instructions.
11. Activity 2: Data citation
• A researcher, Alice Snowe, from another university’s
Accident Research Unit is seeking to use the dataset
with precipitation measurements going back to the
foundation of the University. This dataset was
deposited in 2011 by the University’s meteorologist,
Christopher Oldman Frost, and covers all years up to
and including 2010. It consists of data subsets that are
organised per year, each consisting of several files,
including Excel spreadsheets, Word files, and image
files (digitised observations written down on paper). Of
course, Mr Oldman Frost is not the only meteorologist
who has been involved in taking the measurements
that make up this dataset.
12. Activity 2: Data citation
• Alice Snowe is now writing a research paper for Science
called ‘The correlation between bicycle accidents and
precipitation in urban centres during the rush hour’.
She needs to cite our institutional repository’s dataset.
In particular she will need to refer to the precipitation
measurements of 4 May 1979. Elsewhere in her article
she also needs to refer to a subset covering the winter
months of the years 1981-1985.
• Write down the references that Alice Snowe needs to
give in her article.
13. APA
Basic form:
• Rightsholder. (Year). Title of data set (Version number)
[Description of form]. Location: Name of producer.
or
Rightsholder. (Year). Title of data set (Version number)
[Description of form]. Retrieved from http://
• University of Poppleton. (2011). Precipitation
measurements 1905-2010 taken at Western Bank
weather station [Data files and documentation].
Poppleton: The University of Poppleton,
Meteorological Service.
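The basic APA form can be assembled mechanically from its elements. The sketch below is one way to do so in Python. The function name apa_dataset_reference and its parameters are hypothetical, and the punctuation follows the basic form given on this slide rather than a full implementation of APA style.

```python
def apa_dataset_reference(rightsholder, year, title, form,
                          location=None, producer=None, url=None, version=None):
    """Build a dataset reference following the basic APA form on the slide."""
    # The version number is optional and sits in parentheses after the title.
    version_part = f" ({version})" if version else ""
    head = f"{rightsholder}. ({year}). {title}{version_part} [{form}]."
    # The form ends either with a retrieval URL or with "Location: Producer."
    if url:
        return f"{head} Retrieved from {url}"
    if location and producer:
        return f"{head} {location}: {producer}."
    raise ValueError("give either a url or both location and producer")


reference = apa_dataset_reference(
    rightsholder="University of Poppleton",
    year=2011,
    title="Precipitation measurements 1905-2010 taken at Western Bank weather station",
    form="Data files and documentation",
    location="Poppleton",
    producer="The University of Poppleton, Meteorological Service",
)
print(reference)
```

Note that the form has no element for a part of a dataset, such as a single day or a range of years, so a reference built this way always points at the dataset as a whole.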
14. DataCite
• DataCite (http://www.datacite.org) is a not-for-profit
organisation that aims to promote and support the
sharing of research data
• They are developing an infrastructure that supports
methods of data citation, discovery, and access
• They are currently leveraging the DOI (Digital Object
Identifier) infrastructure, which is also used for
research articles
• They can provide DOIs for datasets
• DataCite DOIs have to resolve to a public landing page
with information about the dataset and a direct link to
it
15. DataCite
Basic form:
• Creator (PublicationYear): Title. Version. Publisher.
ResourceType. Identifier
• Version and ResourceType are optional elements
• For citation purposes, DataCite recommends that DOI
names are displayed as linkable, permanent URLs
• More info in DataCite (2011)
• University of Poppleton (2011): Precipitation
measurements 1905-2010 taken at Western Bank weather
station. Meteorological service, The University of
Poppleton. http://dx.doi.org/10.1594/UoP.MS.298
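The DataCite form, including its optional elements and the recommendation to display the DOI as a linkable URL, can be sketched in Python as follows. The function names datacite_citation and doi_to_url are hypothetical, and the default resolver prefix is simply the one used in the example on this slide.

```python
def doi_to_url(identifier, resolver="http://dx.doi.org/"):
    """Turn a bare DOI name into a linkable URL; leave existing URLs alone."""
    if identifier.startswith(("http://", "https://")):
        return identifier
    # Accept the common "doi:" prefix as well as a bare DOI name.
    if identifier.lower().startswith("doi:"):
        identifier = identifier[4:]
    return resolver + identifier.strip()


def datacite_citation(creator, year, title, publisher, identifier,
                      version=None, resource_type=None):
    """Creator (PublicationYear): Title. Version. Publisher. ResourceType. Identifier"""
    parts = [f"{creator} ({year}): {title}"]
    if version:  # optional element
        parts.append(version)
    parts.append(publisher)
    if resource_type:  # optional element
        parts.append(resource_type)
    return ". ".join(parts) + ". " + doi_to_url(identifier)


citation = datacite_citation(
    creator="University of Poppleton",
    year=2011,
    title="Precipitation measurements 1905-2010 taken at Western Bank weather station",
    publisher="Meteorological service, The University of Poppleton",
    identifier="doi:10.1594/UoP.MS.298",
)
print(citation)
```

The resolver is a parameter because the preferred prefix has changed over time; more recent guidance generally favours https://doi.org/, which can be passed in place of the default.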
16. Activity 2: Data citation
• What practical issues did you encounter when
writing the references for Alice Snowe’s
research paper? How could these issues be
solved?
17. Data Citation
• Issues include (Ball & Duke, 2011a, 2011b):
– At what granularity should data be made citeable?
– How to credit each contributor in a dataset that is
assembled from very many contributions?
– Where in a research paper should a data citation
be given (e.g. a paper describing a dataset versus
subsequent papers using it)?
– What to do with frequently updated data?
19. References
• American Psychological Association (2010). Publication Manual of the American Psychological Association (6th edition). Washington, DC: American Psychological Association, pp. 210-211.
• Ball, A., & Duke, M. (2011a). Data Citation and Linking. DCC Briefing Papers. Edinburgh: Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation/data-citation-and-linking
• Ball, A., & Duke, M. (2011b). How to Cite Datasets and Link to Publications. DCC How-To Guides. Edinburgh: Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/resources/how-guides/cite-datasets
20. References
• DataCite (2011). DataCite Metadata Schema for the Publication and Citation of Research Data. Version 2.2. London: DataCite. Retrieved from http://schema.datacite.org/meta/kernel-2.2/doc/DataCite-MetadataKernel_v2.2.pdf. doi:10.5438/0005
• DataCite (n.d.). Why cite data? Hannover. Retrieved from http://datacite.org/whycitedata
• Rumsey, S. (2012). Just enough metadata: Metadata for research datasets in institutional data repositories [PowerPoint presentation]. Oxford: The University of Oxford. Retrieved from http://damaro.oucs.ox.ac.uk/docs/Just%20enough%20metadata%20v3-1.pdf
• UK Data Archive (n.d.). Citing Data. Colchester. Retrieved from http://www.data-archive.ac.uk/conditions/citing-data
Editor's Notes
Cf UK Data Archive’s distinction between data-level documentation and study-level documentation