Reuse of Repository Data
Valerie Enriquez

Reuse of repository_data_2.0

Presentation of initial findings for Summer 2010 DataONE internship.

Published in: Technology, Education

  1. Reuse of Repository Data
     Valerie Enriquez
  2. Motivation
     • Data deposit vs. data reuse
     • Why track the reuse of data?
       • Transparency
       • Collaboration
         • Confirm existing data
         • Refute existing data
         • Combine with existing data to form new conclusions
       • Healthy competition
       • Invigoration
  3. Initial Questions
     • How are data currently cited, and how often?
     • How do we find data citations using available resources (search engines, databases, etc.)?
     • How difficult is it to find data citations using these tools, and why?
     • What are the best and worst ways to find data citations?
     • How do the citations vary across discipline, repository and publication?
     • What is the most common form of citation?
       • Repository name?
       • Data author name?
       • Unique identifier, such as a study number or DOI?
  4. To whose benefit?
     • Scientists
     • Academic researchers
     • Students
     • Anyone who uses or deposits data
     • Anyone interested in the citation or reuse of data
     • Similar projects
     See also: list of projects, discussion and editorials on the OpenWetware DataONE Web Resources page: http://openwetware.org/wiki/User:Valerie_Enriquez/Notebook/DataONE_Web_resources
  5. Methods
     Initial search process:
       • Test TreeBASE searches
       • Focused search
     Limits:
       • Date range: 2008-2010
       • Language: English
       • Journal articles only
     Repositories:
       1. TreeBASE
       2. Pangaea
       3. ORNL DAAC
     Databases:
       1. ISI Web of Science Cited Reference Search
       2. Scirus
       3. Google Scholar
     Repository-specific search terms:
       • TreeBASE: repository name, study accession number (S####), data author name
       • Pangaea: repository name, DOI prefix: 10.1594/PANGAEA.######, data author name
       • ORNL DAAC: repository name, DOI prefix: 10.3334/ORNLDAAC/###, data author name, project name (BOREAS, FLUXNET, etc.)
  6. Initial Analysis
     1. Search comparison spreadsheet hosted here
        • Search methods, search terms and the datasets used to construct the search terms were captured, along with the total number of results and the respective hits and misses.
        • Percentages of hits vs. misses are calculated within the spreadsheet.
        • Reasons for each miss captured
        • Reasons for each hit captured
     2. Shared fields template from Sarah with my input data hosted here
        • Hosts data about individual articles, including DOIs as applicable, metadata and coding for hits and misses.
  7. Stumbles and other Worrisome Things
     • Finding focus and the difficulty of going beyond the obvious
     • “Missing” searches
     • How broad is too broad? How narrow is too narrow?
     • Article cited vs. data cited
     Image courtesy of: http://currentskateofmind.com/2008/03/25/glossary-of-skating-falls/
  8. Initial Findings
     Key: $ = effective search; # = ineffective search; * = invalid field input
     TreeBASE
       • ISI Web of Science: 1. $ Repository name; 2. *; 3. $ Cited author name/original publication title/date
       • Scirus: 1. $ Repository name; 2. # Study accession number; 3. # Cited author name/original publication title/date
       • Google Scholar: 1. # Repository name; 2. # Study accession number; 3. # Cited author name/original publication title/date
     Pangaea
       • ISI Web of Science: 1. $ Repository name; 2. *; 3. $ Cited author name/original publication title/date
       • Scirus: 1. Repository name; 2. $ DOI prefix; 3. # Cited author name/original publication title/date
       • Google Scholar: 1. # Repository name; 2. $ DOI prefix; 3. # Cited author name/original publication title/date
     ORNL DAAC
       • ISI Web of Science: 1. $ Repository name; 2. *; 3. $ Cited author name/original publication title/date
       • Scirus: 1. $ Repository name; 2. $ DOI prefix; 3. $ Cited author name/project name/original publication title/date
       • Google Scholar: 1. # Repository name; 2. $ DOI prefix; 3. $ Cited author name/project name/original publication title/date
  9. Lessons Learned
     “Hey, I think I found that data citation you were looking for.”
     Image courtesy of: http://www.squidoo.com/stop_information_overload
  10. Where do we go from here?
     • Solidify conclusions from initial findings.
     • Compare data with other interns.
     • Examine other repositories, search terms and databases.
     • Write article about how difficult it is to find data reuse citations.
     Some possible publications:
       • Collection Management
       • DLib (link provided by Heather)
       • Information Services & Use (Author Guidelines)
       • Informing Science
       • International Digital Curation Conference (Call for Papers; link provided by Nic)
       • Journal of the American Society for Information Science & Technology
       • Journal of Information Science
       • Library Technology Reports
       • Scientometrics
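
The repository-specific search terms on the Methods slide (TreeBASE study accession numbers, Pangaea and ORNL DAAC DOI prefixes) can also be used to scan article text for data citations. The following is a minimal sketch and not the method used in the presentation: the function name `find_data_identifiers` is hypothetical, and the digit counts in the patterns are assumptions, since the slide only shows placeholders such as S####.

```python
import re

# Hypothetical patterns based on the identifier shapes listed on the Methods slide.
# The digit counts are assumptions: the slide only shows "S####", "######" and "###".
PATTERNS = {
    "TreeBASE": re.compile(r"\bS\d{3,5}\b"),
    "Pangaea": re.compile(r"\b10\.1594/PANGAEA\.\d+"),
    "ORNL DAAC": re.compile(r"\b10\.3334/ORNLDAAC/\d+"),
}


def find_data_identifiers(text):
    """Return {repository: [matched identifiers]} for every repository with a match."""
    found = {}
    for repo, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[repo] = matches
    return found


# Made-up sentence for illustration only.
sample = "Data are available from TreeBASE (study S1234) and doi:10.1594/PANGAEA.123456."
print(find_data_identifiers(sample))
```

A bare pattern like S#### is likely to produce false positives in real article text, which is consistent with the study accession number being marked as an ineffective search term in Scirus and Google Scholar.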

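The Initial Analysis slide says that percentages of hits vs. misses were calculated within the search comparison spreadsheet. The same arithmetic can be sketched as below. The function name `hit_percentage` and the example counts are hypothetical and are not taken from the spreadsheet.

```python
def hit_percentage(hits, misses):
    """Return hits as a percentage of all results, or None when there are no results."""
    total = hits + misses
    if total == 0:
        return None  # avoid dividing by zero for searches that returned nothing
    return 100.0 * hits / total


# Hypothetical counts for a single search, for illustration only.
print(hit_percentage(3, 1))
```

Returning None for an empty result set keeps searches with no results distinguishable from searches where every result was a miss.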