2013 DataCite Summer Meeting - Closing Keynote: Building Community Engagement on Research Transparency and Data Citation (George Alter - ICPSR)

2013 DataCite Summer Meeting - Making Research better

DataCite. Co-sponsored by CODATA.

Thursday, 19 September 2013 at 13:00 - Friday, 20 September 2013 at 12:30

Washington, DC. National Academy of Sciences

http://datacite.eventbrite.co.uk/

Usage Rights

CC Attribution License

Presentation Transcript

  • 1. Data Access and Research Transparency: a Data Repository View George Alter ICPSR University of Michigan
  • 2. About the Inter-university Consortium for Political and Social Research (ICPSR) Mission: ICPSR provides leadership and training in data access, curation, and methods of analysis for a diverse and expanding social science research community. • Acquire and archive social science data • Distribute data to researchers • Preserve data for future generations • Provide training in quantitative methods
  • 3. ICPSR Then and Now • ICPSR History – Established in 1962 so that social scientists could share data – Started as a partnership among 21 universities – Data distributed on punched cards and then on magnetic reel-to-reel tape
  • 4. ICPSR Then and Now • ICPSR History – Established in 1962 so that social scientists could share data – Started as a partnership among 21 universities – Data distributed on punched cards and then on magnetic reel-to-reel tape • ICPSR Today – More than 700 members – 390+ U.S. institutions – 46 national memberships – 8,000+ data collections – Direct downloads – Online analysis
  • 5. Data archiving and dissemination for more than 20 federal and private agencies
  • 6. Summer Program 2013 1,000+ participants 42 four-week courses 37 one- to five-day courses
  • 7. ICPSR Bibliography of Data Related Publications (66,000+) in the Data Citation Index
  • 8. “Building Community Engagement in Data Citation and Open Access to Data” • Funded by Alfred P. Sloan Foundation – Challenge Grants to improve data citation and access – Social science journals – Domain repositories
  • 9. “Building Community Engagement in Data Citation and Open Access to Data” • Challenge grants: 4 selected from 26 applications: – Richard Ball and Norm Medeiros, "Replication of Empirical Research: A Soup-to-Nuts Protocol for Documenting Data Management and Analysis," Haverford College – Thomas Carsey, "Implementing a Data Citation Workflow within the State Politics and Policy Journal," University of North Carolina at Chapel Hill – Lisa Neidert, "OPEN Data Through a Restricted Data Portal," The University of Michigan – Jian Qin and Kevin Crowston, "Development and Dissemination of a Capability Maturity Model for Research Data Management Training and Performance Assessment," Syracuse University
  • 10. Data Citation and Research Transparency Standards for the Social Sciences, June 13-14, 2013 • AERA Educational Evaluation and Policy Analysis • American Economic Journal: Applied Economics • American Economic Review • American Educational Research Association • American Journal of Political Science • American Journal of Sociology • American Psychological Association • American Sociological Review • American Statistical Association • Archives of Scientific Psychology • Demography • Institute for Quantitative Social Science, Harvard University • Journal of Politics • MIT Libraries • Society for Research on Educational Effectiveness • State Politics and Policy Quarterly
  • 11. Sustaining Domain Repositories for Digital Data, June 24-25, 2013 • Association of Religion Data Archives • CIESIN • Cultural Policy and the Arts National Data Archive • Data Conservancy • DataONE • Databrary • Dryad • Human Relations Area Files • Linguistic Data Consortium • National Academies of Science • National Snow and Ice Data Center • Odum Institute • Roper Center • SEAD • tDAR Digital Archaeological Record • UCLA Data Archive • University of Michigan Transportation Research Institute • US Virtual Astronomical Observatory • Worldwide Protein Data Bank
  • 12. What do we know about sharing of social science data?
  • 13. Most data are not shared. Source: Pienta, Amy, Myron Gutmann, & Jared Lyle. 2009. “Research Data in The Social Sciences: How Much is Being Shared?” Research Conference on Research Integrity, Niagara Falls, NY.
  • 14. Shared Data Produce More Publications. Median # of Publications by Data Sharing Status:
                                       Data Archived   Data Shared Informally   Data Not Shared
                                       (n=111)         (n=415)                  (n=409)
      Primary PI Pubs (median)         6               6                        3
      Secondary Pubs, No PI (median)   8               6                        3
      Pubs with Students (median)      4               3                        1
      Total                            18              15                       7
    Source: Pienta, Amy M., George Alter, and Jared Lyle. 2010. “The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data.” Presented at the BRICK, DIME, STRIKE Workshop, The Organisation, Economics, and Policy of Scientific Research, Turin, Italy, April 23-24, 2010 (http://hdl.handle.net/2027.42/78307)
  • 15. Why don’t researchers share their data? The usual suspects: • I don’t have time. • My grant doesn’t pay for it. • It will be used incorrectly. • Someone might scoop me with my own data. Our usual replies: • You will get credit for sharing. • More research will be done. • Transparency and replication are good for science.
  • 16. What are the weak points in this story? Will Researcher 2 cite the data? Will Researcher 1 deposit the data?
  • 17. Publication as Seen by a Researcher: Researcher 1 collects data and publishes an article. Researcher 1 is rewarded.
  • 18. Researcher 2 reads the article and has an idea
  • 19. Researcher 2 obtains the data
  • 20. Researcher 2 writes a new manuscript
  • 21. Researcher 2 sends the manuscript to the Journal. [Diagram labels: Journal, Editor]
  • 22. The Editor sends the manuscript out for reviews. [Diagram labels: Journal, Editor, Reviewers]
  • 23. The accepted manuscript goes to the Copy Editor. [Diagram labels: Journal, Editor, Reviewers, Copy Editor]
  • 24. The Copy Editor sends the article to the Publisher. [Diagram labels: Journal, Publisher, Editor, Reviewers, Copy Editor, Printer, Publisher]
  • 25. The article is published and Researcher 2 is rewarded. [Diagram labels: Journal, Publisher, Editor, Reviewers, Copy Editor, Printer, Publisher]
  • 26. The Researcher’s view of publication does not include a Repository. [Diagram labels: Journal, Publisher, Repository]
  • 27. The Researcher’s view of publication does not include a Repository or data citation with a persistent identifier. [Diagram labels: Journal, Publisher, Repository]
  • 28. Who can assure that data are sent to a repository? [Diagram labels: Repository]
  • 29. Most data collection is supported by a Funding Agency. [Diagram labels: Funding Agency, Repository]
  • 30. The Funding Agency has a carrot. [Diagram labels: Funding Agency, Repository, Awards]
  • 31. The Funding Agency has a carrot and a stick. [Diagram labels: Funding Agency, Repository, Awards, Compliance]
  • 32. Who can assure that data are cited? [Diagram labels: Funding Agency, Journal, Publisher, Repository]
  • 33. The Journal is owned by a Professional Association. [Diagram labels: Professional Association, Journal, Publisher]
  • 34. Committees of the Professional Association oversee the Journal. [Diagram labels: Professional Association, Journal, Publisher]
  • 35. Committees of the Professional Association oversee the Journal. [Diagram labels: Professional Association, Journal, Publisher, Executive]
  • 36. Committees of the Professional Association oversee the Journal. [Diagram labels: Professional Association, Journal, Publisher, Ethics, Executive]
  • 37. Committees of the Professional Association oversee the Journal. [Diagram labels: Professional Association, Journal, Publisher, Ethics, Publications, Executive]
  • 38. The Professional Association issues an Ethics Code requiring data access and research transparency. [Diagram labels: Professional Association, Journal, Publisher, Ethics Code]
  • 39. The Ethics Code informs the Journal’s Author Guide. [Diagram labels: Professional Association, Journal, Publisher, Ethics Code, Author Guide]
  • 40. Someone at the Journal enforces the Author Guide requirements for data access and citation. [Diagram labels: Professional Association, Journal, Publisher, Ethics Code, Author Guide]
  • 41. [Diagram labels: Funding Agency, Professional Association, Journal, Publisher, Ethics Code, Repository, Author Guide]
  • 42. Achieving Data Access and Research Transparency: • Enforcement by funding agencies • Ethics codes from Professional Associations • Author guidelines from Journals • Enforcement by journals
  • 43. Why should funding agencies require data sharing? • Data re-use is a more efficient use of funds – Collecting data is expensive – Data that are shared produce more science • Funding agencies are the biggest beneficiaries of data citation. • Political winds favor open data
  • 44. Transparency and reproducibility are politically popular. “Reproducibility should be the gold standard that all peer reviewers and editors aim for when assessing whether a manuscript has supplied sufficient information to allow others to repeat and build on the experiments. As such, the presumption must be that, unless there is a strong reason otherwise, data should be fully disclosed and made publicly available. In line with this principle, data associated with all publicly funded research should, where possible, be made widely and freely available. The work of researchers who expend time and effort adding value to their data, to make it usable by others, should be acknowledged and encouraged.” House of Commons, Science and Technology Committee, Eighth Report of Session 2010-12, Peer review in scientific publications. Ordered by the House of Commons to be printed 18 July 2011. http://www.publications.parliament.uk/pa/cm201012/cmselect/cmsctech/856/856.pdf
  • 45. The White House has mandated public access to federally funded data
  • 46. Congress favors open access to data “The growing lack of scientific integrity and transparency has many causes but one thing is very clear: without open access to data, there can be neither integrity nor transparency from the conclusions reached by the scientific community. Furthermore, when there is no reliable access to data, the progress of science is impeded and leads to inefficiencies in the scientific discovery process. Important results cannot be verified, and confidence in scientific claims dwindles.” Statement of Research Subcommittee Chairman Larry Bucshon (R-Ind.) Hearing on Scientific Integrity and Transparency, March 5, 2013. Open data has bi-partisan support!
  • 47. National Institutes of Health, Data and Informatics Working Group Draft Report to The Advisory Committee to the Director, June 15, 2012 Recommendation 1: Promote Data Sharing Through Central and Federated Catalogues 1a. Establish a Minimal Metadata Framework for Data Sharing 1b. Create Catalogues and Tools to Facilitate Data Sharing 1c. Enhance and Incentivize a Data Sharing Policy for NIH-Funded Data
  • 48. What is motivating Professional Associations and Journals? • Concern about legitimacy – Cases of fraud and misuse of data
  • 49. What is motivating Professional Associations and Journals? • Concern about legitimacy – Cases of fraud and misuse of data – Failures of replication – Public attacks on science
  • 50. How can Professional Associations and Journals respond? • Professional associations – Ethics guidelines that emphasize data access and research transparency • Journals – Data citation guidelines – Data access policies • Replication data • Codes and scripts – Journals worry about • Cost • Compliance • Competition
  • 51. Improving Data Citation in Journals: Data-PASS letter to the American Sociological Association, August 8, 2010. Similar letters were sent to the American Economic Association, the American Educational Research Association, and the American Political Science Association.
  • 52. Data Citation References for data sets should include a persistent identifier, such as a Digital Object Identifier (DOI). Persistent identifiers ensure future access to unique published digital objects, such as a text or data set. Persistent identifiers are assigned to data sets by digital archives, such as institutional repositories and partners in the Data Preservation Alliance for the Social Sciences (Data-PASS).
  • 53. American Political Science Association “Guide to Professional Ethics” October 2012 6. Researchers have an ethical obligation to facilitate the evaluation of their evidence-based knowledge claims through data access, production transparency, and analytic transparency so that their work can be tested or replicated. 6.1 Data access: Researchers making evidence-based knowledge claims should reference the data they used to make those claims. If these are data they themselves generated or collected, researchers should provide access to those data or explain why they cannot. 6.2 Production transparency: Researchers providing access to data they themselves generated or collected, should offer a full account of the procedures used to collect or generate the data. 6.3 Analytic Transparency: Researchers making evidence-based knowledge claims should provide a full account of how they draw their analytic conclusions from the data, i.e., clearly explicate the links connecting data to conclusions. American Political Science Association Guide to Professional Ethics, Rights and Freedoms
  • 54. The American Economic Review: Data Availability Policy It is the policy of the American Economic Review to publish papers only if the data used in the analysis are clearly and precisely documented and are readily available to any researcher for purposes of replication. Authors of accepted papers that contain empirical work, simulations, or experimental work must provide to the Review, prior to publication, the data, programs, and other details of the computations sufficient to permit replication. These will be posted on the AER Web site. The Editor should be notified at the time of submission if the data used in a paper are proprietary or if, for some other reason, the requirements above cannot be met. As soon as possible after acceptance, authors are expected to send their data, programs, and sufficient details to permit replication, in electronic form, to the AER office. … If a request for an exemption based on proprietary data is made, authors should inform the editors if the data can be accessed or obtained in some other way by independent researchers for purposes of replication. Authors are also asked to provide information on how the proprietary data can be obtained by others in their Readme PDF file. A copy of the programs used to create the final results is still required.
  • 55. Concluding thoughts • Changing researcher behavior is difficult • The rewards of data citation are not enough • Funding agencies and Journals – have the greatest leverage for changing behavior – are sympathetic to data access and transparency
  • 56. What can we do? Funding agencies • Fund data stewardship – Researchers should not be faced with a tradeoff between their scientific aims and data stewardship • Enforce data management plans • Improve funding of data repositories – Recognize data repositories as scientific infrastructure – Develop relevant evaluation criteria
  • 57. What can we do? Journals • Guidelines to authors should include – Data access policies – Data citation policies – Persistent identifiers for data – Examples • Keep it simple – Focus on key elements: Author, Title, Date, Location (i.e. persistent identifier)
  • 58. What can we do? Data Archiving Community • See the whole picture • Train researchers in data management – See Ball and Medeiros, “Teaching Students to Document Empirical Research” on YouTube • Reduce the costs of capturing metadata in scientific workflows • Rate journals on their policies and performance
  • 59. Thank you! George Alter ICPSR altergc@umich.edu
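
As an illustration of the "keep it simple" guidance on slide 57, the four key elements named there (Author, Title, Date, Location) could be assembled into a data citation roughly as follows. This is a minimal sketch, not a citation style prescribed in the talk: the `DataCitation` class, the element order and punctuation, and the sample author, title, and DOI `10.1234/example` are all hypothetical. The only outside fact assumed is that a DOI can be resolved by prefixing it with `https://doi.org/`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCitation:
    """Hypothetical container for the four key elements on slide 57."""

    author: str
    title: str
    year: int
    doi: str  # persistent identifier; the value used below is made up

    def location(self) -> str:
        # Turn the DOI into a resolvable link via the doi.org proxy.
        return f"https://doi.org/{self.doi}"

    def render(self) -> str:
        # Author, Title, Date, Location, in the order listed on the slide.
        # The punctuation is an assumption, not a published style.
        return f"{self.author}. {self.title}. {self.year}. {self.location()}"


# Placeholder values for illustration only; not a real data set.
citation = DataCitation(
    author="Example Researcher",
    title="Example Survey Data",
    year=2013,
    doi="10.1234/example",
)
print(citation.render())
```

Keeping the persistent identifier as its own field, separate from the rendered string, means a journal or repository could change its display style without touching the identifier itself.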