
iSchools future of data management Dec 2017


Keynote by Natasha Simons at the iSchools Data Science Winter Institute, 7-9 December 2017, University of Hong Kong

Published in: Education

  1. 1. Natasha Simons What’s coming next? The future of research data management Australian National Data Service iSchools Data Science Winter Institute Hong Kong, 7-8 December 2017
  2. 2. Brisbane, Australia
  3. 3. University of Queensland
  4. 4. What is ANDS?
  5. 5. NCRIS • National Collaborative Research Infrastructure Strategy (NCRIS) • Australian government program • Drives research excellence and collaboration between researchers, government and industry to deliver practical outcomes • Funds research infrastructure projects including ANDS, Nectar and RDS • 2016 National Research Infrastructure Roadmap outlines Australian research infrastructure required over next decade
  6. 6. ANDS/Nectar/RDS Aligned set of joint investments to deliver four key transformations in the research sector: 1. A world leading data advantage 2. Accelerated innovation 3. Collaboration for borderless research 4. Enhanced translation of research
  7. 7. Our approach Building on and leveraging previous investments and relationships: 1. Research domain program 2. Research data platforms 3. Sector-wide support and engagement
  8. 8. What is the future of Research Data Management? Photo by Michal Lomza on Unsplash
  9. 9. Trend #1 Data policies Funder data sharing policies are on the rise. Examples: Data sharing is essential for expedited translation of research results into knowledge, products and procedures to improve human health….[and it] should be made as widely and freely available as possible... - National Institutes of Health USA (1) Publicly funded research data are a public good...which should be made openly available with as few restrictions as possible in a timely and responsible manner - Research Councils UK (2) (1) National Institutes of Health. 2003. Data Sharing Policy and Implementation Guidance. (2) Research Councils UK. 2011 (revised 2015). RCUK Common Principles on Data Policy. Photo by Christine Roy on Unsplash
  10. 10. Trend #1 Data policies Research data principle: as open as possible, as closed as necessary - European Commission Horizon 2020 Guidelines (1) We expect our researchers to maximise the availability of research data, software and materials with as few restrictions as possible - Wellcome Trust (2) The ARC is committed to maximising the benefits from ARC-funded research, including by ensuring greater access to research data. Since 2007, the ARC has encouraged researchers to deposit data arising from research projects in publicly accessible repositories. The ARC’s position reflects an increased focus in Australian and international research policy and practice on open access to data generated through publicly funded research. - Australian Research Council (3) (1) European Commission. Horizon 2020 Guidelines. (2) Wellcome Trust. (3) Australian Research Council.
  11. 11. Trend #1 Data policies Government open data policies are on the rise. Examples: Newly-generated [USA] government data is required to be made available in open, machine-readable formats, while continuing to ensure privacy and security (1) All EU institutions are invited to make their data publicly available whenever possible (2) The Japanese government is promoting the Open Data initiative, in which the government widely discloses public data (3) (1) USA Federal Government. 2013. Memorandum - Open Data Policy - Managing Information as an Asset. (2) European Union. Open Data Portal - About. (3) Japan Open Data Initiative
  12. 12. Trend #1 Data policies The Australian Government Open Data Declaration is about making more government information available to the public online (4) The public sector information portal of the Government of the Hong Kong Special Administrative Region with datasets from different government departments and public/private organisations (5) Keywords: transparency, openness, return on investment, economy, industry/government/research collaborations, innovation (4) Australian Government. 2010. Declaration of Open Government. (5) Hong Kong Open Data Portal.
  13. 13. Trend #1 Data policies Publisher/Journal data policies and initiatives are on the rise. Examples: PLOS journals require authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception - PLOS (1) A condition of publication in a Nature Research journal is that authors are required to make materials, data, code, and associated protocols promptly available to readers without undue qualifications. - Nature (2) (1) PLOS. Data availability. (2) Nature. Availability of data, materials and methods.
  14. 14. Trend #1 Data policies Publisher signed statement examples: The ultimate measure of success is in the replicability of science, generation of new discoveries, and in progress on the grand challenges facing society that depend on the integration of open data, tools, and models from multiple sources. This statement of commitment signals important progress and a continuing commitment by publishers and data facilities to enable open data in the Earth and space sciences - COPDESS (1) Transparency, open sharing, and reproducibility are core values of science. Over 5,000 journals and organizations have already become signatories of the TOP Guidelines. - TOP Guidelines (2) (1) COPDESS. Statement of Commitment. (2) Center for Open Science. Transparency and Openness Promotion Guidelines. Photo by Drew Hays on Unsplash
  15. 15. Policy challenges ● Existence of data policy e.g. the higher the Impact Factor of a journal, the more likely it is to have a data availability policy and to enforce it (1) ● Data policies vary widely: content; discoverability; ease of interpretation; infrastructure providers; support for compliance (2) ● Most journal data sharing policies do not provide specific guidance on the practices that ensure data is maximally available and reusable (3) (1) Piwowar, HA and Chapman, WW (2010) Public sharing of research datasets: A pilot study of associations. Journal of Informetrics, 4 (2). 148 - 156. ISSN 1751-1577 (2) Naughton, L. & Kernohan, D., (2016). Making sense of journal research data policies. Insights. 29(1), pp.84–89. DOI: (3) Vasilevsky NA, Minnier J, Haendel MA, Champieux RE. (2017) Reproducible and reusable research: are journal data sharing policies meeting the mark? PeerJ 5:e3208
  16. 16. Policy challenges ● Data availability declines over time (1) ● The most effective journal data policies mandate data sharing in a repository and a data availability statement with a link to the data (2) ● Data availability from authors on request has been found wanting in several studies/case studies (3-5) ● The introduction of a data availability policy can polarize the research community e.g. PLOS, ICMJE (1) Vines et al. (2013) Current Biology. DOI: (2) Vines, et al. (2013) FASEB J doi: 10.1096/fj.12-218164 (3) Systematic Reviews 2014, 3:97 doi:10.1186/2046-4053-3-97 (4) American Psychologist, Vol 61(7), Oct 2006, 726-728. doi:10.1037/0003-066X.61.7.726 (5) PLoS ONE 4(9): e7078. doi:10.1371/journal.pone.0007078 Thanks to Iain Hrynaszkiewicz, Springer Nature, for dot points 1-3 above.
  17. 17. Trend #2 Data sharing Figshare open data survey 2017: ● 82% aware of open data sets ● 80% willing to reuse open data sets in own research ● 60% routinely share their data (frequently or sometimes) ● 21% have never made a data set openly available ● 74% are now curating their data for sharing ● 77% value a data citation the same as an article Science, Digital (2017): The State of Open Data 2017 Report - Infographic. figshare. pp. 7- 11
  18. 18. Trend #2 Data sharing We can see strong signals that open data is becoming more embedded [but] there is still a lack of confidence around open data. Figshare open data survey 2017
  19. 19. Trend #2 Data sharing A 2011 study of 500 papers that were published in 2009 from 50 top-ranked research journals showed that only 47 papers (9%) of those reviewed had deposited full primary raw data online. As another study notes, the number of datasets being shared annually has increased by more than 400% from 2011 to 2015, and this pace will likely continue. What Constitutes Peer Review of Data? A Survey of Peer Review Guidelines by Todd A. Carpenter. Scholarly Kitchen blog post 11 April 2017. research-data/
  20. 20. Trend #2 Data sharing More than two thirds of Wiley researchers reported they are now sharing their data. Though this varies geographically and across research disciplines we are seeing that more researchers are sharing their data and taking efforts to make it reproducible. Wiley Global Data Sharing Infographic June 2017. hor-resources/Journal-Authors/licensing-open-access/open-access/data-sharing.html
  21. 21. Data sharing challenges Lack of understanding of the open/shared/closed model. Lack of skills/understanding about how to share sensitive data. Still too few “rewards” for data sharing. Researchers may lack skills needed to manage and share data. Wiley survey - Top 4 reasons why researchers are hesitant to share their data: ● 50% Intellectual Property or confidentiality issues ● 31% Ethics concerns ● 23% Concerns about misinterpretation or misuse of my research ● 22% Concerns that my research will be scooped Photo by on Unsplash
  22. 22. Trend #3 Connected research/data Connected research (researchers, research organisations, publications, data, grants, software, methods and more) is important: ● for better discovery of research (data) ● to assist the ability to reproduce research ● to support research transparency ● to aid attribution and credit ● to track use and impact Persistent Identifiers (PIDs) and global standards play a key role in connecting research.
  23. 23. Looks something like this.. Research Graph is an open collaborative project that builds the capability for connecting researchers, publications, research grants and research datasets (data in research).
  24. 24. Trend #3 Connected research/data Examples of progress: The ability to access and review the data behind research is a well sought after, but often elusive, resource. In recognition of this, Scopus has been working to incorporate new tools that can make it easier to search and share data - Scopus makes strides in data linking Major publishers have committed to requiring ORCID iDs in the publishing process for their journals and invite other publishers to do the same - Requiring ORCID in Publication Workflows: Open Letter Approximately 148 million DOIs have been assigned [to publications, data, software and more] through a federation of Registration Agencies world-wide - Frequently asked questions about the DOI system
  25. 25. Connected data challenges Photo by William Bout on Unsplash ● Raise PID adoption levels e.g. THOR Project ● ORCID iDs - need to be populated and used ● Increasing PIDs in research workflows ● Need standard ways to exchange information e.g. Scholix initiative to link data and publications ● Data Citation practice challenges
  26. 26. Trend #4 Data reuse There is a push for reusable research data. Examples: Why enable reuse? The UK Data Archive provides many reasons, including: encouraging scientific enquiry and debate; promoting innovation and potential new data uses. 2013 study: “We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003” - Piwowar HA, Vision TJ. (2013) Data reuse and the open data citation advantage. PeerJ 1:e175 Early 2017: Springer Nature responded to the US National Institutes of Health’s request for information on Strategies for NIH Data Management, Sharing, and Citation. They made a number of recommendations to the NIH, and funding organisations, including: Encouraging researchers to share and describe datasets in a way that facilitates reuse and reproducibility.
  27. 27. Data reuse challenges “87% of researchers don’t know what licence to apply to their data” - Daniel Hook, CEO Digital Science, 3/1/17 There is a quality issue: (a) sharing data is necessary but not sufficient for future reuse, (b) ensuring that data is “independently understandable” is crucial, and (c) incorporating a data review process is feasible - Peer et al. Committing to a Data Quality review. IDCC14 Practice Paper. Other issues: Geographic differences and differences across age groups: younger respondents feel more favorably toward data sharing and reuse, yet make less of their data available than older respondents - Tenopir et al. Changes in Data Sharing and Data Reuse Practices and Perceptions among Scientists Worldwide
  28. 28. So the future of RDM is...FAIR Data #1 Data policies help support Findable data #2 Data sharing helps create Accessible data #3 Connected research/data is part of Interoperable data #4 Data reuse is enabled by Reusable data FORCE11 FAIR Data Principles By SangyaPundir (Own work) [CC BY-SA 4.0], via Wikimedia Commons
  29. 29. FAIR Data... ● Requires good data management across the whole lifecycle. ● Requires many stakeholders to work together. iProfessionals have: ● a challenge ● an opportunity ● an incredible amount of skills and knowledge to contribute! ANDS FAIR Data flyer
  30. 30. Research lifecycle - traditional University of Bournemouth - DATA
  31. 31. Research lifecycle - data infused Find data Plan to manage data Publish data Collect, store, analyse, visualise data Cite data
  32. 32. The future is an opportunity "The challenge of the unknown future is so much more exciting than the stories of the accomplished past." - Simon Sinek Photo by Warren Wong on Unsplash
  33. 33. With the exception of third party images or where otherwise indicated, this work is licensed under the Creative Commons 4.0 International Attribution Licence. ANDS, Nectar and RDS are supported by the Australian Government through the National Collaborative Research Infrastructure Strategy Program (NCRIS). @n_simons Natasha Simons
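
Slides 22 to 25 say that persistent identifiers such as ORCID iDs connect research only when they are populated and used in workflows. The following sketch is an illustration and does not come from the slides. It shows one small, practical piece of that: checking that an ORCID iD is well formed before storing it in a metadata record. It assumes the ISO 7064 MOD 11-2 check digit scheme that ORCID documents for its identifiers. The function names `orcid_check_digit` and `is_valid_orcid` are hypothetical, and the sample iD is the one ORCID publishes for the fictitious researcher Josiah Carberry.

```python
import re

# Assumed layout: four groups of four characters separated by hyphens,
# where the final character may be a digit or the letter X.
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD has the expected shape and a matching check character."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


if __name__ == "__main__":
    print(is_valid_orcid("0000-0002-1825-0097"))  # sample iD, check digit matches
    print(is_valid_orcid("0000-0002-1825-0098"))  # last digit altered
```

A check like this only catches typing and transcription errors. It does not confirm that the iD is registered or that it belongs to the named researcher, which would require a lookup against the ORCID registry.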