In 2009, 116 articles cited ORNL DAAC data. Finding these articles took 70-80 hours across at least 12 resources, all chosen from a deep understanding of this specific research domain. The full text of all the hits was then manually reviewed. Valerie Enriquez interview with James Kidder http://openwetware.org/wiki/DataONE:Notebook/Reuse_of_repository_data
How to identify dataset reuse in the published literature

Does the dataset have an identifier (DOI, URL, accession #)?
Search the references sections of all papers for the dataset's unique ID.
This citation pattern (dataset DOI/ID in references section) is used almost exclusively for dataset reuse.
Manual disambiguation is not required: the search can be automated pending API support.
Does not require access to full text.
This citation pattern is currently rare.
This citation pattern is difficult to track with existing tool limitations.
DOI/ID reference search is possible in full-text portals like PubMed Central and HighWire Press, however portal coverage is limited and search is not restricted to the references section.
DOI/ID search works in Google Scholar, but scope is poorly defined and results are messy.
DOI/ID search is not supported by ISI Web of Science or Scopus.

Search in the full text of all papers with the dataset's unique ID, then sort hits to disambiguate reuse from submission.
This citation pattern (accession numbers in full text) is very common in some subdisciplines, so it probably finds most reuses.
IDs are difficult to unambiguously identify in full text unless they have a unique pattern (DOI) or an unusual prefix or suffix.
Requires the ability to query full text across all literature that may contain reuse.
Disambiguation is time consuming.
Requires access to the full text of search hits for sorting.

Does the dataset submission record have a publicly archived submitter name or dataset title?
Search in the full text of all papers with (submitter surname AND repository name), and also (dataset title AND repository name), or with (first author surname AND repository name), then sort hits to disambiguate reuse from submission.
Names and titles are messy identifiers.

Does the dataset submission record mention a data collection article publication (journal, volume, page, etc.)?
Gather papers that cite the data collection paper, then sort hits to disambiguate data reuse from the collection article's other citation contexts.
This citation pattern (citation to data creation paper) is very common in some subdisciplines, so it probably finds most reuses.
Disambiguation is time consuming: most citations are not in the context of reuse.
Citation history export is time consuming: automation is not supported.
The link to the data collection paper is often missing from the dataset submission record, especially when dataset submission predates article publication.
Only finds citations indexed by citation databases.
Requires access to the full text of search hits for sorting.

This flow still misses attributions embedded in supplementary information, reuses attributed through a query description, etc.

Heather Piwowar, v1.0, CC-BY
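The decision flow above could be sketched in code roughly as follows. This is only an illustration: the `DatasetRecord` fields and the `plan_searches` function are hypothetical names invented here, and the returned strings are human-readable search descriptions, not queries for any real search API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical summary of what a dataset submission record exposes."""
    identifier: Optional[str] = None          # DOI, URL, or accession number
    submitter_surname: Optional[str] = None
    title: Optional[str] = None
    repository: Optional[str] = None
    collection_article: Optional[str] = None  # citation of the data collection paper


def plan_searches(rec: DatasetRecord) -> List[str]:
    """List the tracking strategies from the flow that apply to this record."""
    plans = []
    if rec.identifier:
        # Identifier available: search references sections and full text.
        plans.append(f'references search: "{rec.identifier}"')
        plans.append(f'full-text search: "{rec.identifier}"')
    if rec.repository and rec.submitter_surname:
        plans.append(
            f'full-text search: "{rec.submitter_surname}" AND "{rec.repository}"'
        )
    if rec.repository and rec.title:
        plans.append(f'full-text search: "{rec.title}" AND "{rec.repository}"')
    if rec.collection_article:
        # Hits still need manual sorting: most citations are not reuse.
        plans.append(f'citation search: papers citing {rec.collection_article}')
    return plans
```

Every strategy except the references-section search still ends in the manual sorting step the flow describes, so the sketch only decides where to look, not which hits count as reuse.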
10 * 100 = 1000
deposited in 2005
1. following citations to the paper that describes the data collection, then filtering.
2. searching for accession numbers, URLs, and DOIs in full text
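As a rough illustration of the second approach, a full-text scan for identifiers might look like the sketch below. The regular expressions are assumptions, not patterns from the slides: the DOI pattern is a common heuristic, and the accession pattern assumes GEO-style series IDs (`GSE` followed by digits) purely as an example. Each repository would need its own pattern.

```python
import re

# Assumed patterns for illustration only.
DOI_RE = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')
URL_RE = re.compile(r'https?://[^\s"<>]+')
ACCESSION_RE = re.compile(r'\bGSE\d+\b')  # hypothetical: GEO-style series IDs


def find_dataset_mentions(text: str) -> dict:
    """Return candidate dataset identifiers found in a paper's full text."""
    def clean(matches):
        # Drop trailing punctuation that the greedy patterns pick up.
        return [m.rstrip('.,;)') for m in matches]

    return {
        "dois": clean(DOI_RE.findall(text)),
        "urls": clean(URL_RE.findall(text)),
        "accessions": ACCESSION_RE.findall(text),
    }


sample = ("We reanalyzed GSE2109 (doi:10.5061/dryad.1234) "
          "downloaded from http://example.org/data.")
print(find_dataset_mentions(sample))
```

Hits from a scan like this are only candidates: they still have to be sorted by hand to separate reuse from the original submission.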