Identity Awareness: Toward an Invisible e-Infrastructure for Identifying Data and Authors
1. Identity Awareness:
Toward an Invisible e-Infrastructure
for Identifying Data and Authors
Amir Aryani, Adrian Burton
Australian National Data Service
2. Identity Awareness
• Connecting data to
  • Researchers
  • Grants
  • Publications
  • Licence
3. ODIN Project
• Interoperability between
  • ORCID
  • DataCite
4. “My vision is a scientific community that
does not waste resources on recreating
data that have already been produced, in
particular if public money has helped to
collect those data in the first place.”
Neelie Kroes, Vice-President of the European Commission, Digital Agenda
5. Research Data Australia (RDA)
[Chart: number of published research collections in RDA, plotted from 2009-11 to 2012-10; vertical axis from 0 to 40,000]
7. Identity Awareness
means knowing how to
• Identify the researchers who contributed to a dataset
• Identify the publications that use a dataset
• Identify the related grant or the research project
• Identify the licence for a dataset
[Diagram: Data linked to Researcher, Licence, Publication, and Grants and projects]
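As a rough illustration of the four checks listed on this slide, the sketch below reports which identity links a dataset description is missing. It is a minimal sketch under stated assumptions: `DatasetRecord`, its field names, and `missing_identity_links` are hypothetical and are not part of RDA, RIF-CS, or any ANDS tool, and the identifier in the example is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical, simplified dataset description (not the RIF-CS schema)."""
    title: str
    researchers: List[str] = field(default_factory=list)   # e.g. researcher identifiers
    publications: List[str] = field(default_factory=list)  # e.g. publication DOIs
    grants: List[str] = field(default_factory=list)        # grant or project identifiers
    licence: Optional[str] = None


def missing_identity_links(record: DatasetRecord) -> List[str]:
    """Return the names of the four identity links that the record lacks."""
    checks = {
        "researchers": bool(record.researchers),
        "publications": bool(record.publications),
        "grants": bool(record.grants),
        "licence": record.licence is not None,
    }
    return [name for name, present in checks.items() if not present]


# Assumed example: a dataset that names a researcher and a licence only.
example = DatasetRecord(
    title="Example survey dataset",
    researchers=["0000-0000-0000-0000"],  # placeholder identifier
    licence="CC-BY",
)
print(missing_identity_links(example))
```

A record with an empty result would be fully "identity aware" in the sense of this slide; anything else names the links that still need to be captured.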
8. RDA Quality Model for Data
#   RIF-CS Elements                               Requirement            1  2  3
1   registry object                               Required               *  *  *
2   originating source                            Required               *  *  *
3   group                                         Required               *  *  *
4   key                                           Required               *  *  *
5   collection type                               Required               *  *  *
6   name/title                                    Required                  *  *
7   related party (researcher or organisation)    Required                  *  *
8   description                                   Required                  *  *
9   location/address                              Required                  *  *
10  rights (Licence)                              Required                  *  *
11  activity (grant or research project)          Required if available        *
12  subject                                       Recommended                  *
13  spatial coverage                              Recommended                  *
14  temporal coverage                             Recommended                  *
15  citation                                      Recommended                  *
16  identifier                                    Recommended                  *
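One way to read the table is as three cumulative levels: elements 1 to 5 for level 1, elements 1 to 10 for level 2, and all sixteen for level 3. The sketch below computes a level from the set of elements present in a record. This is an assumption-laden illustration, not the ANDS implementation: the cumulative reading of the star columns, the treatment of "Required if available" and "Recommended" elements as mandatory for level 3, and the names `LEVEL_1`, `LEVEL_2`, `LEVEL_3` and `quality_level` are all hypothetical.

```python
from typing import Iterable

# Element names follow the table above; the grouping into levels is an
# assumed reading of the star columns.
LEVEL_1 = ["registry object", "originating source", "group", "key", "collection type"]
LEVEL_2 = LEVEL_1 + ["name/title", "related party", "description",
                     "location/address", "rights"]
LEVEL_3 = LEVEL_2 + ["activity", "subject", "spatial coverage",
                     "temporal coverage", "citation", "identifier"]


def quality_level(present: Iterable[str]) -> int:
    """Return the highest level (0 to 3) whose elements are all present."""
    have = set(present)
    level = 0
    for number, required in enumerate((LEVEL_1, LEVEL_2, LEVEL_3), start=1):
        if all(element in have for element in required):
            level = number
        else:
            break  # levels are cumulative, so stop at the first gap
    return level


# Assumed example: a record with every level 2 element plus a subject.
record_elements = LEVEL_2 + ["subject"]
print(quality_level(record_elements))
```

Under this reading, a record reaches level 3 only when it also carries the identity links from the earlier slide (related party, rights, activity), which is why the quality model and identity awareness are closely tied.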
17. Data Creator, Researcher, Author
[Diagram labels: Birth Cohort Study dataset; Non-Birth Cohort Study dataset; External Data (Census, Health etc.); Derived dataset; Grey Literature; Published article; Citation; years 1958 and 1970. Roles: Data Creator; Derived Data Creator; External Data input; Author: Grey lit; Author: Article]
21. Acknowledgment
ANDS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative.
The ODIN project is funded by the European Union under FP7 call INFRA-2012-3.3 (Grant Agreement number 312788).
22. Conclusion
Enabling identity awareness is an international challenge
that requires a collaborative effort. ANDS encourages your
collaboration in this area, particularly in investigating
these questions:
• How can we measure and improve identity awareness
of research data?
• How can we measure and improve data reuse?
• How can we measure research impact?
• How can your organisation take advantage of some
of the emerging global identity infrastructures?