Data Citation from the perspective of tracking data reuse

Presentation by Heather Piwowar at DataCite 2011 Summer Meeting, Aug 24 2011.

Transcript

  • 1. Data Citation: Challenges and Opportunities from the perspective of Tracking Data Reuse. Heather Piwowar, DataONE postdoc with NESCent and Dryad, @researchremix. DataCite Summer Meeting, August 2011
  • 2. http://www.metmuseum.org/toah/ho/09/euwf/ho_24.45.1.htm
  • 3. http://www.flickr.com/photos/jsmjr/62443357/
  • 4. http://www.flickr.com/photos/camilleharrington/3587294608/
  • 5. http://www.flickr.com/photos/rkuhnau/3318245976/
  • 6. http://www.flickr.com/photos/conformpdx/1796399674/
  • 7. http://www.flickr.com/photos/rkuhnau/3317418699/
  • 8. http://www.flickr.com/photos/zemlinki/261617721/
  • 9. http://www.flickr.com/photos/tracenmatt/3020786491/
  • 10. http://www.flickr.com/photos/the-o/2078239333/
  • 11. ? http://www.flickr.com/photos/ryanr/142455033/
  • 12. http://www.flickr.com/photos/archeon/2941655917/
  • 13. http://upload.wikimedia.org/wikipedia/commons/thumb/e/e6/Gamma_distribution_pdf.svg/500px-Gamma_distribution_pdf.svg.png
  • 14. http://www.flickr.com/photos/jima/606588905/
  • 15. http://www.flickr.com/photos/lofaesofa/248546821/
  • 16. We have observed reuse of at least 35% of GEO datasets submitted in 2005.
  • 17. Piwowar, Vision, Whitlock (2011) Data archiving is a good investment. Nature 473, 285. http://researchremix.wordpress.com/2011/05/19/nature-letter/
  • 18. Tracking 1k
  • 19. 10 * 100 = 1000
  • 20. [Chart; text garbled in extraction. Recoverable labels: y-axis 0% to 100%; repositories GEO, Pangaea, Treebase; legend: attribution in reference (estimated), attribution in footnote, attribution in table, attribution in text.]
  • 21. [Chart; text garbled in extraction. Recoverable labels: series GEO, Pangaea, Treebase, each with a moving average; x-axis in years through 2011.]
  • 22. Piwowar, Carlson, Vision (2011) Beginning to track 1000 datasets from public repositories into the published literature. ASIS&T poster. https://notebooks.dataone.org/tracking1000datasets/
  • 23. My research blog: ResearchRemix.wordpress.com http://www.flickr.com/photos/myklroventine/892446624/
  • 24. Data citation in the wild. IDCC 2010 poster.
  • 25. A best-practice solution!
  • 26. http://www.flickr.com/photos/nilsrinaldi/5157809483/
  • 27. #1 Lack of tool support for our best practice
  • 28. Research Remix blog: http://bit.ly/aOwLoJ “Tracking Dataset Citations Using Common Citation Tracking Tools Doesn’t Work”
  • 29. Research Remix blog: http://bit.ly/aOwLoJ “Tracking Dataset Citations Using Common Citation Tracking Tools Doesn’t Work”
  • 30. We need more diversity. We need more players. We need start-ups.
  • 31. Abstracts are open. Ref lists should be too.
  • 32. #2 Our best practice doesn’t scale to mega-reuse
  • 33. [Chart; text garbled in extraction. Recoverable title: Number of datasets referenced by articles that reused GEO data (cumulative); y-axis 0% to 100%; x-axis 0 to 30.]
  • 34. [Same chart as slide 33.]
  • 35. [Same chart as slide 33.]
  • 36. http://www.nature.com/nature/authors/gta/
  • 37. But wait!
  • 38. #2 Our best practice can scale to mega-reuse (if we work at it)
  • 39. Another place where having a few big players is a bottleneck. Open reference lists.
  • 40. #3a Adoption of best practices erodes incentives in the short term
  • 41. ~70% in multivariate analysis
  • 42. #3b Data citations only matter if they are valued
  • 43. Please don’t tweet or publicize this next bit... Early results from an ongoing survey. n=538
  • 44. [Two survey charts; text garbled in extraction. Response scale runs from "Strongly disagree" through "Neutral" to "Strongly agree". Recoverable statement text: "pleased others can build on my work more easily"; "pleased I will get more citations".] Do not publicize
  • 45. [Four survey charts on the same response scale; text garbled in extraction. Recoverable statement text: "pleased others can build on my work more easily"; "pleased it will be valued by my funder"; "pleased I will get more citations"; "pleased it will be valued by my promotion or tenure committee".] Do not publicize
  • 46. Top-down
  • 47. http://www.nsf.gov/pubs/policydocs/pappguide/nsf08_1/gpg_2.jsp
  • 48. Bottom-up
  • 49. Text DataCite!
  • 50. http://www.flickr.com/photos/ginable/325235488/
  • 51. http://www.flickr.com/photos/ryanr/142455033/
  • 52. http://www.flickr.com/photos/supersam5/216868485/
  • 53. Thank you: Todd Vision, Jonathan Carlson, Estephanie Sta Maria, Nicholas Weber, Sarah Judson, Valerie Enriquez; Jason Priem and Beyond Impact; Dryad and DataONE teams; the open science online community and those who release their articles, datasets and photos openly. Blog: ResearchRemix.wordpress.com
  • 54. No consistent practice
  • 55. We reviewed 500 articles in six major evolution and ecology journals for evidence of data citation: Sarah Judson, Data citation in the wild. IDCC 2010 poster.
  • 56. We reviewed 500 articles in six major evolution and ecology journals for evidence of data citation: Sarah Judson, Data citation in the wild. IDCC 2010 poster.
  • 57. In 2009, 116 articles cited ORNL DAAC data. Finding these articles took 70-80 hours across at least 12 resources, all chosen from a deep understanding of this specific research domain; then the full text of all the hits was manually reviewed. Valerie Enriquez interview with James Kidder. http://openwetware.org/wiki/DataONE:Notebook/Reuse_of_repository_data