2013 DataCite Summer Meeting - Making Research better
DataCite. Co-sponsored by CODATA.
Thursday, 19 September 2013 at 13:00 - Friday, 20 September 2013 at 12:30
Washington, DC. National Academy of Sciences
http://datacite.eventbrite.co.uk/
2013 DataCite Summer Meeting - Closing Keynote: Building Community Engagement on Research Transparency and Data Citation (George Alter - ICPSR)
1. Data Access and Research Transparency: a Data Repository View
George Alter
ICPSR, University of Michigan
2. About the Inter-university Consortium for Political and Social Research (ICPSR)
Mission: ICPSR provides leadership and training in data access, curation, and methods of analysis for a diverse and expanding social science research community.
• Acquire and archive social science data
• Distribute data to researchers
• Preserve data for future generations
• Provide training in quantitative methods
4. ICPSR Then and Now
• ICPSR History
– Established in 1962 so that social scientists could share data
– Started as a partnership among 21 universities
– Data distributed on punched cards and then magnetic reel-to-reel tape
• ICPSR Today
– More than 700 members
– 390+ U.S. institutions
– 46 national memberships
– 8,000+ data collections
– Direct downloads
– Online analysis
5. Data archiving and dissemination for more than 20 federal and private agencies
9. “Building Community Engagement in Data Citation and Open Access to Data”
• Funded by Alfred P. Sloan Foundation
– Challenge Grants to improve data citation and access
– Social science journals
– Domain repositories
10. “Building Community Engagement in Data Citation and Open Access to Data”
• Challenge grants: 4 selected from 26 applications:
– Richard Ball and Norm Medeiros, "Replication of Empirical Research: A Soup-to-Nuts Protocol for Documenting Data Management and Analysis," Haverford College
– Thomas Carsey, "Implementing a Data Citation Workflow within the State Politics and Policy Journal," University of North Carolina at Chapel Hill
– Lisa Neidert, "OPEN Data Through a Restricted Data Portal," The University of Michigan
– Jian Qin and Kevin Crowston, "Development and Dissemination of a Capability Maturity Model for Research Data Management Training and Performance Assessment," Syracuse University
11. Data Citation and Research Transparency Standards for the Social Sciences, June 13-14, 2013
• AERA Educational Evaluation and Policy Analysis
• American Economic Journal: Applied Economics
• American Economic Review
• American Educational Research Association
• American Journal of Political Science
• American Journal of Sociology
• American Psychological Association
• American Sociological Review
• American Statistical Association
• Archives of Scientific Psychology
• Demography
• Institute for Quantitative Social Science, Harvard University
• Journal of Politics
• MIT Libraries
• Society for Research on Educational Effectiveness
• State Politics and Policy Quarterly
12. Sustaining Domain Repositories for Digital Data, June 24-25, 2013
• Association of Religion Data Archives
• CIESIN
• Cultural Policy and the Arts National Data Archive
• Data Conservancy
• DataONE
• Databrary
• Dryad
• Human Relations Area Files
• Linguistic Data Consortium
• National Academies of Science
• National Snow and Ice Data Center
• Odum Institute
• Roper Center
• SEAD
• tDAR (the Digital Archaeological Record)
• UCLA Data Archive
• University of Michigan Transportation Research Institute
• US Virtual Astronomical Observatory
• Worldwide Protein Data Bank
13.
14. What do we know about sharing of social
science data?
15. Source: Pienta, Amy, Myron Gutmann, & Jared Lyle. 2009. “Research Data in The Social Sciences: How Much is Being Shared?” Research
Conference on Research Integrity, Niagara Falls, NY.
Most data are not shared.
16. Median # of Publications by Data Sharing Status
                                  Data Archived   Data Shared Informally   Data Not Shared
                                  (n=111)         (n=415)                  (n=409)
Primary PI Pubs (median)          6               6                        3
Secondary Pubs, No PI (median)    8               6                        3
Pubs with Students (median)       4               3                        1
Total                             18              15                       7
Source: Pienta, Amy M., George Alter, and Jared Lyle. 2010. “The Enduring Value of Social Science Research: The Use and Reuse of
Primary Research Data.” Presented at the BRICK, DIME, STRIKE Workshop, The Organisation, Economics, and Policy of Scientific
Research, Turin, Italy, April 23‐24, 2010 (http://hdl.handle.net/2027.42/78307)
Shared Data Produce More Publications
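The arithmetic behind this slide can be checked with a short sketch. It assumes, as the slide's "Total" row appears to, that the three medians are simply added per sharing status; that sum is not itself a median of total publications. The dictionary keys are illustrative names, not taken from the source.

```python
# Median publication counts as reported on the slide, by data sharing status.
# Key names are illustrative labels chosen for this sketch.
medians = {
    "Data Archived": {"primary_pi": 6, "secondary_no_pi": 8, "with_students": 4},
    "Data Shared Informally": {"primary_pi": 6, "secondary_no_pi": 6, "with_students": 3},
    "Data Not Shared": {"primary_pi": 3, "secondary_no_pi": 3, "with_students": 1},
}

# Reproduce the "Total" row by summing the three medians per status.
totals = {status: sum(counts.values()) for status, counts in medians.items()}

# Compare each status against the "Data Not Shared" baseline.
baseline = totals["Data Not Shared"]
ratios = {status: total / baseline for status, total in totals.items()}

for status in medians:
    print(f"{status}: total={totals[status]}, ratio={ratios[status]:.2f}")
```

On these figures, archived data are associated with roughly two and a half times as many publications as data that were not shared.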
17. Why don’t researchers share their
data?
The usual suspects:
• I don’t have time.
• My grant doesn’t pay for it.
• It will be used incorrectly.
• Someone might scoop me with my own data.
Our usual replies:
• You will get credit for sharing.
• More research will be done.
• Transparency and replication are good for science.
18.
19. What are the weak points
in this story?
Will Researcher 2 cite the data?
Will Researcher 1 deposit the data?
20. Researcher 1 collects
data and publishes an
article.
Publication as Seen by a
Researcher
Researcher 1 is
rewarded.
45. Achieving Data Access and Research
Transparency:
• Enforcement by funding
agencies
• Ethics codes from Professional
Associations
• Author guidelines from
Journals
• Enforcement by journals
46. Why should funding agencies require
data sharing?
• Data re-use is a more efficient use of funds
– Collecting data is expensive
– Data that are shared produce more science
• Funding agencies are the biggest beneficiaries of data
citation.
• Political winds favor open data
47. Reproducibility should be the gold standard that all peer reviewers and editors aim
for when assessing whether a manuscript has supplied sufficient information to
allow others to repeat and build on the experiments. As such, the presumption must
be that, unless there is a strong reason otherwise, data should be fully disclosed and
made publicly available. In line with this principle, data associated with all publicly
funded research should, where possible, be made widely and freely available. The
work of researchers who expend time and effort
adding value to their data, to make it usable by others, should be acknowledged and
encouraged.
House of Commons, Science and Technology Committee - Eighth Report of
Session 2010-12, Peer review in scientific publications. Ordered by the House
of Commons to be printed 18 July 2011.
http://www.publications.parliament.uk/pa/cm201012/cmselect/cmsctech/856/856.pdf
Transparency and reproducibility
are politically popular
48. The White House has mandated public
access to federally funded data
49. Congress favors open access to data
“The growing lack of scientific integrity and transparency has many causes but one thing is
very clear: without open access to data, there can be neither integrity nor transparency from
the conclusions reached by the scientific community. Furthermore, when there is no reliable
access to data, the progress of science is impeded and leads to inefficiencies in the scientific
discovery process. Important results cannot be verified, and confidence in scientific claims
dwindles.”
Statement of Research Subcommittee Chairman Larry Bucshon (R-Ind.) Hearing
on Scientific Integrity and Transparency, March 5, 2013.
Open data has bi-partisan support!
50. National Institutes of Health,
Data and Informatics Working Group
Draft Report to The Advisory Committee to the Director,
June 15, 2012
Recommendation 1: Promote Data Sharing Through
Central and Federated Catalogues
1a. Establish a Minimal Metadata Framework for Data
Sharing
1b. Create Catalogues and Tools to Facilitate Data
Sharing
1c. Enhance and Incentivize a Data Sharing Policy for
NIH-Funded Data
51. What is motivating Professional
Associations and Journals?
• Concern about legitimacy
– Cases of fraud and misuse of data
52.
53.
54. What is motivating Professional
Associations and Journals?
• Concern about legitimacy
– Cases of fraud and misuse of data
– Failures of replication
– Public attacks on science
55. How can Professional Associations and
Journals respond?
• Professional associations
– Ethics guidelines that emphasize data access and
research transparency
• Journals
– Data citation guidelines
– Data access policies
• Replication data
• Codes and scripts
– Journals worry about
• Cost
• Compliance
• Competition
56. Improving Data Citation in Journals
Data-PASS letter to the American Sociological Association, August 8, 2010
Similar letters sent to American Economic Association, American Educational Research
Association, and American Political Science Association.
57. Data
Citation
References for data sets should include a
persistent identifier, such as a Digital Object
Identifier (DOI). Persistent identifiers ensure
future access to unique published digital
objects, such as a text or data set. Persistent
identifiers are assigned to data sets by digital
archives, such as institutional repositories and
partners in the Data Preservation Alliance for the
Social Sciences (Data-PASS).
58. American Political Science Association
“Guide to Professional Ethics”
October 2012
6. Researchers have an ethical obligation to facilitate the evaluation of their
evidence-based knowledge claims through data access, production
transparency, and analytic transparency so that their work can be tested or
replicated.
6.1 Data access: Researchers making evidence-based knowledge claims
should reference the data they used to make those claims. If these are data
they themselves generated or collected, researchers should provide access to
those data or explain why they cannot.
6.2 Production transparency: Researchers providing access to data they
themselves generated or collected, should offer a full account of the
procedures used to collect or generate the data.
6.3 Analytic Transparency: Researchers making evidence-based knowledge
claims should provide a full account of how they draw their analytic
conclusions from the data, i.e., clearly explicate the links connecting data to
conclusions.
American Political Science Association Guide to Professional Ethics, Rights and
Freedoms
59. The American Economic Review: Data Availability Policy
It is the policy of the American Economic Review to publish papers only if the data
used in the analysis are clearly and precisely documented and are readily available to
any researcher for purposes of replication. Authors of accepted papers that contain
empirical work, simulations, or experimental work must provide to the Review, prior
to publication, the data, programs, and other details of the computations sufficient
to permit replication. These will be posted on the AER Web site. The Editor should be
notified at the time of submission if the data used in a paper are proprietary or if, for
some other reason, the requirements above cannot be met.
As soon as possible after acceptance, authors are expected to send their
data, programs, and sufficient details to permit replication, in electronic form, to the
AER office.
…
If a request for an exemption based on proprietary data is made, authors should
inform the editors if the data can be accessed or obtained in some other way by
independent researchers for purposes of replication. Authors are also asked to
provide information on how the proprietary data can be obtained by others in their
Readme PDF file. A copy of the programs used to create the final results is still
required.
60. Concluding thoughts
• Changing researcher behavior is difficult
• The rewards of data citation are not enough
• Funding agencies and Journals
– have the greatest leverage for changing behavior
– are sympathetic to data access and transparency
61. What can we do?
Funding agencies
• Fund data stewardship
– Researchers should not be faced with a tradeoff
between their scientific aims and data stewardship
• Enforce data management plans
• Improve funding of data repositories
– Recognize data repositories as scientific infrastructure
– Develop relevant evaluation criteria
62. What can we do?
Journals
• Guidelines to authors should include
– Data access policies
– Data citation policies
– Persistent identifiers for data
– Examples
• Keep it simple
– Focus on key elements: Author, Title, Date,
Location (i.e. persistent identifier)
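A minimal sketch of the "keep it simple" advice: assembling a data citation from the four key elements. The function name, the element order, and the punctuation are assumptions made for illustration, not a style prescribed by the slides or by any journal. The example values, including the DOI, are hypothetical placeholders.

```python
def format_data_citation(author: str, title: str, date: str, doi: str) -> str:
    """Join Author, Date, Title, and Location into one reference string.

    The layout is an illustrative assumption; journals define their own styles.
    """
    # Accept a DOI with or without a leading "doi:" prefix.
    bare_doi = doi.strip()
    if bare_doi.lower().startswith("doi:"):
        bare_doi = bare_doi[4:].strip()
    # Express the location as a resolvable link through the DOI resolver.
    location = f"https://doi.org/{bare_doi}"
    return f"{author.strip()}. {date.strip()}. {title.strip()}. {location}"


# Hypothetical values, not a real data set.
example = format_data_citation(
    author="Doe, Jane",
    title="Example Survey Data",
    date="2013",
    doi="doi:10.1234/example",
)
print(example)
```

Writing the location as a resolver link keeps the persistent identifier both readable and clickable in a reference list.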
63. What can we do?
Data Archiving Community
• See the whole picture
• Train researchers in data management
– See Ball and Medeiros, “Teaching Students to
Document Empirical Research” on YouTube
• Reduce the costs of capturing metadata in
scientific workflows
• Rate journals on their policies and
performance