Reuse of Repository Data

Presentation of initial findings for Summer 2010 DataONE internship.

Published in: Technology, Education

  1. Reuse of Repository Data
     Valerie Enriquez – DataONE – Summer 2010
  2. Motivation
     • Data deposit vs. data reuse
     • Why track the reuse of data?
         • Transparency
         • Collaboration
             • Confirm existing data
             • Refute existing data
             • Combine with existing data to form new conclusions
         • Healthy competition
         • Invigoration
  3. Initial Questions
     • How are data currently cited, and how often?
     • How do we find data citations using available resources (search engines, databases, etc.)?
     • How difficult is it to find data citations using these tools, and why?
     • What are the best and worst ways to find data citations?
     • How do the citations vary across discipline, repository and publication?
     • What is the most common form of citation: repository name, data author name, or a unique identifier such as a study number or DOI?
  4. To whose benefit?
     • Scientists
     • Academic researchers
     • Students
     • Anyone who uses or deposits data
     • Anyone interested in the citation or reuse of data
     • Similar projects
         • See also: list of projects, discussion and editorials on the OpenWetware DataONE Web Resources page: http://openwetware.org/wiki/User:Valerie_Enriquez/Notebook/DataONE_Web_resources
  5. Methods
     • Initial search process: test TreeBASE searches
     • Focused search
         • Repositories
             • TreeBASE
             • Pangaea
             • ORNL DAAC
         • Databases
             • ISI Web of Science Cited Reference Search
             • Scirus
             • Google Scholar
     • Limits
         • Date range: 2008-2010
         • Language: English
         • Journal articles only
     • Repository-specific search terms
         • TreeBASE: repository name, study accession number (S####), data author name
         • Pangaea: repository name, DOI prefix: 10.1594/PANGAEA.######, data author name
         • ORNL DAAC: repository name, DOI prefix: 10.3334/ORNLDAAC/###, data author name, project name (BOREAS, FLUXNET, etc.)
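The repository-specific identifiers listed on the Methods slide can also be matched programmatically once article text is in hand. The following is a minimal sketch, not the search procedure used in the internship: the function name, the digit counts, and the sample sentence are all assumptions made for illustration, and the TreeBASE pattern in particular will also match unrelated tokens such as figure labels.

```python
import re

# Patterns follow the identifier shapes shown on the Methods slide.
# The number of digits is an assumption; the slide only shows "#" placeholders.
PATTERNS = {
    "TreeBASE": re.compile(r"\bS\d+\b"),  # study accession number, e.g. S1234
    "Pangaea": re.compile(r"10\.1594/PANGAEA\.\d+", re.IGNORECASE),
    "ORNL DAAC": re.compile(r"10\.3334/ORNLDAAC/\d+", re.IGNORECASE),
}


def find_data_identifiers(text):
    """Return a dict mapping each repository name to the identifiers found in text."""
    return {repo: pattern.findall(text) for repo, pattern in PATTERNS.items()}


if __name__ == "__main__":
    # Hypothetical sentence; the identifiers are invented for the example.
    sample = (
        "Alignments are in TreeBASE (study S1234); sediment data: "
        "doi:10.1594/PANGAEA.123456; flux data: 10.3334/ORNLDAAC/840."
    )
    for repo, ids in find_data_identifiers(sample).items():
        print(repo, ids)
```

Matching the full DOI prefix is far more specific than matching a repository name or an accession number, which is one reason a pattern like the TreeBASE one would need manual checking of each match.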
  6. Initial Analysis
     • Search comparison spreadsheet hosted here
         • Records the search methods, the search terms, and the datasets used to construct those terms, along with the total number of results for each search and how many were hits and how many were misses.
         • Percentages of hits vs. misses are calculated within the spreadsheet.
         • Reasons for each miss are recorded.
         • Reasons for each hit are recorded.
     • Shared fields template from Sarah, with my input data, hosted here
         • Holds data about individual articles, including DOIs where applicable, metadata, and coding for hits and misses.
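The hit and miss percentages described on the Initial Analysis slide are simple proportions of the total results per search. A minimal sketch of that arithmetic follows; the function name and the example counts are hypothetical, and the spreadsheet itself may compute or round the values differently.

```python
def hit_miss_percentages(hits, misses):
    """Return (hit %, miss %) for one search, given counts of hits and misses."""
    total = hits + misses
    if total == 0:
        # A search with no results has no meaningful percentage; report zeros.
        return (0.0, 0.0)
    return (100.0 * hits / total, 100.0 * misses / total)


if __name__ == "__main__":
    # Hypothetical counts: 3 articles that reuse the data, 1 that does not.
    print(hit_miss_percentages(3, 1))
```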
  7. Stumbles and Other Worrisome Things
     • Finding focus and the difficulty of going beyond the obvious
     • “Missing” searches
     • How broad is too broad? How narrow is too narrow?
     • Article cited vs. data cited
     Image courtesy of: http://currentskateofmind.com/2008/03/25/glossary-of-skating-falls/
  8. Initial Findings
     Key: $ = effective, # = ineffective, * = invalid field input
     • TreeBASE
         • ISI Web of Science: $ Repository name; *; $ Cited Author Name/original publication title/date
         • Scirus: $ Repository name; # Study Accession Number; # Cited Author Name/original publication title/date
         • Google Scholar: # Repository name; # Study Accession Number; # Cited Author Name/original publication title/date
     • Pangaea
         • ISI Web of Science: $ Repository name; *; $ Cited Author Name/original publication title/date
         • Scirus: Repository name; $ DOI prefix; # Cited Author Name/original publication title/date
         • Google Scholar: # Repository name; $ DOI prefix; # Cited Author Name/original publication title/date
     • ORNL DAAC
         • ISI Web of Science: $ Repository name; *; $ Cited Author Name/original publication title/date
         • Scirus: $ Repository name; $ DOI prefix; $ Cited Author Name/project name/original publication title/date
         • Google Scholar: # Repository name; $ DOI prefix; $ Cited Author Name/project name/original publication title/date
  9. Lessons Learned
     “Hey, I think I found that data citation you were looking for.”
     Image courtesy of: http://www.squidoo.com/stop_information_overload
  10. Where do we go from here?
     • Solidify conclusions from initial findings.
     • Compare data with other interns.
     • Examine other repositories, search terms and databases.
     • Write an article about how difficult it is to find data reuse citations. Some possible publications:
         • Collection Management
         • DLib (link provided by Heather)
         • Information Services & Use (Author Guidelines)
         • Informing Science
         • International Digital Curation Conference Call for Papers (link provided by Nic)
         • Journal of the American Society for Information Science & Technology
         • Journal of Information Science
         • Library Technology Reports
         • Scientometrics