RDAP13 Kathleen Fear: The impact of data reuse: a pilot study of 5 measures
The impact of data reuse: a pilot study of five measures
Kathleen Fear
April 5, 2013
https://www.asis.org/rdap/
What is reuse impact?
• Scholarly contribution through producing data
  – Recognized and rewarded through publications and publication metrics
• Scholarly contribution through sharing data
  – Recognized and rewarded through … ?
What can we do to measure and communicate the scholarly contribution a data producer makes when their data is reused?
Pilot study of 5 measures
• Identify a set of social science datasets
• Find out how much and in what contexts they have been reused
• Demonstrate a variety of measures
  – Do they all come out the same?
  – Or do different measures highlight different data?
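Sample set: 273 studies
[Figure: three pie charts describing the sample.
• Release Date: 2000, 2001, 2002 (segments of 30%, 38%, 32%)
• Processed vs. unprocessed: Processed, Unprocessed (segments of 19% and 81%)
• Author Type: Single author, Two or more authors, Government, Non-governmental institution, Media organization (segments of 3%, 3%, 11%, 38%, 45%)
Percentages are listed in the order they appear on the slide, not matched to categories.]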
Reuse citations
• How many times has the data been reused?
• ICPSR Bibliography of Data-Related Literature
  – Excluded: publications by study authors and research team members, literature reviews, commentary
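The counting rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual procedure: the record layout (`study_id`, `authors`, `type`), the `study_team` lookup, and the type labels in `EXCLUDED_TYPES` are all hypothetical, and the sample records are made up.

```python
from collections import Counter

# Hypothetical labels for the publication types the slide excludes.
EXCLUDED_TYPES = {"literature review", "commentary"}


def reuse_counts(citations, study_team):
    """Count reuse publications per study.

    citations:  iterable of dicts with keys 'study_id', 'authors', 'type'
    study_team: dict mapping study_id -> set of study author / team member names
    """
    counts = Counter()
    for pub in citations:
        # Skip literature reviews and commentary.
        if pub["type"] in EXCLUDED_TYPES:
            continue
        # Skip publications written by the data producers themselves.
        team = study_team.get(pub["study_id"], set())
        if team & set(pub["authors"]):
            continue
        counts[pub["study_id"]] += 1
    return counts


# Made-up records, for illustration only.
citations = [
    {"study_id": "S1", "authors": ["A. Producer"], "type": "journal article"},
    {"study_id": "S1", "authors": ["B. Reuser"], "type": "journal article"},
    {"study_id": "S1", "authors": ["C. Critic"], "type": "commentary"},
    {"study_id": "S2", "authors": ["D. Reuser", "E. Reuser"], "type": "thesis"},
]
study_team = {"S1": {"A. Producer"}, "S2": {"F. Producer"}}

counts = reuse_counts(citations, study_team)
print(counts)
```

Matching authors by exact name is the weakest part of this sketch; a real bibliography would likely need name normalization or author identifiers.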
Lots of data is reused a little
Some data is reused a lot
Even more is reused not at all
Study ID | Study Name | Reuse count
6693 | National Comorbidity Survey: Baseline (NCS-1), 1990-1992 | 175
2884 | National Treatment Improvement Evaluation Study (NTIES), 1992-1997 | 32
3160 | Project on Policing Neighborhoods in Indianapolis, Indiana, and St. Petersburg, Florida, 1996-1997 | 34
2851 | Hispanic Established Populations for the Epidemiologic Studies of the Elderly, 1993-1994: [Arizona, California, Colorado, New Mexico, and Texas] | 24
2258 | Drug Abuse Treatment Outcome Study (DATOS), 1991-1994: [United States] | 19
How high-quality are the data’s reuse publications?
Study ID | Study Name | Secondary Impact
6693 | National Comorbidity Survey: Baseline (NCS-1), 1990-1992 | 83
2258 | Drug Abuse Treatment Outcome Study (DATOS), 1991-1994: [United States] | 21
2851 | Hispanic Established Populations for the Epidemiologic Studies of the Elderly, 1993-1994: [Arizona, California, Colorado, New Mexico, and Texas] | 20
2884 | National Treatment Improvement Evaluation Study (NTIES), 1992-1997 | 19
2778 | Gambling Impact and Behavior Study, 1997-1999: [United States] | 18
Diversity
• Variety, balance, disparity among reuse publications + disparity between citing disciplines and data discipline
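The slide does not give the formula behind this measure. As a rough illustration of how variety, balance, and disparity can be folded into one number, the sketch below uses a Rao-Stirling-style index (the sum of p_i * p_j * d_ij over unordered pairs of citing disciplines) and adds the share-weighted mean distance from the citing disciplines to the data's own discipline. Both the choice of index and the way the two terms are added are assumptions, as are the function names, the toy distance, and the counts.

```python
from itertools import combinations


def pairwise_diversity(counts, distance):
    """Rao-Stirling-style term: sum of p_i * p_j * d(i, j) over unordered pairs.

    counts:   dict mapping citing discipline -> number of reuse publications
    distance: function (discipline, discipline) -> disparity, here in [0, 1]
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    shares = {d: n / total for d, n in counts.items() if n > 0}
    return sum(
        shares[a] * shares[b] * distance(a, b)
        for a, b in combinations(sorted(shares), 2)
    )


def distance_to_source(counts, source, distance):
    """Share-weighted mean disparity between citing disciplines and the data discipline."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n / total * distance(d, source) for d, n in counts.items())


def toy_distance(a, b):
    # Hypothetical disparity: 0 for the same discipline, 1 otherwise.
    return 0.0 if a == b else 1.0


# Made-up counts of reuse publications by citing discipline.
counts = {"sociology": 2, "criminology": 2}

among = pairwise_diversity(counts, toy_distance)
to_source = distance_to_source(counts, "sociology", toy_distance)
print(among, to_source, among + to_source)
```

With an all-or-nothing distance the first term only reflects variety and balance; a graded distance between disciplines is what lets disparity contribute.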
ID | Study Title | Data Diversity
3190 | National Organizations Survey (NOS), 1996-1997 | 2.5000
3337 | Evaluation of the Gang Resistance Education and Training (GREAT) Program in the United States, 1995-1999 | 2.2230
2976 | Police Stress and Domestic Violence in Police Families in Baltimore, Maryland, 1997-1999 | 2.1794
3334 | Aging, Status, and Sense of Control (ASOC), 1995, 1998, 2001 [United States] | 2.0313
2993 | Reintegrative Shaming Experiments (RISE) in Australia, 1995-1999 | 2.0000
How large is the publication network stemming from the data?
Downloaders
• How many individuals download the data?
• Unique users identified by email address or IP address
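Counting unique downloaders amounts to deduplicating a download log by user identity. The sketch below assumes a hypothetical log of `(study_id, email, ip)` tuples and prefers the email address when one is present, falling back to the IP address otherwise. The slide does not say how the two identifiers were combined, so that rule is an assumption, and the log entries are made up.

```python
def count_downloaders(download_log):
    """Count unique downloaders per study.

    download_log: iterable of (study_id, email, ip) tuples; email may be None
    for anonymous downloads.
    """
    users = {}
    for study_id, email, ip in download_log:
        # Assumed rule: identify by normalized email if available, else by IP.
        key = ("email", email.strip().lower()) if email else ("ip", ip)
        users.setdefault(study_id, set()).add(key)
    return {study_id: len(keys) for study_id, keys in users.items()}


# Made-up log entries, for illustration only.
download_log = [
    ("S1", "a@example.edu", "10.0.0.1"),
    ("S1", "A@example.edu ", "10.0.0.2"),  # same person, different machine
    ("S1", None, "10.0.0.3"),
    ("S1", None, "10.0.0.3"),              # repeat anonymous download
    ("S2", None, "10.0.0.1"),
]

downloaders = count_downloaders(download_log)
print(downloaders)
```

Under this rule a person who downloads once while logged in and once anonymously is counted twice, so the result is an upper bound on distinct individuals.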
Study ID | Study Name | Downloaders
6693 | National Comorbidity Survey: Baseline (NCS-1), 1990-1992 | 3787
2790 | World Values Surveys and European Values Surveys, 1981-1984, 1990-1993, and 1995-1997 | 3393
2778 | Gambling Impact and Behavior Study, 1997-1999: [United States] | 2637
3088 | Alcohol and Drug Services Study (ADSS), 1996-1999: [United States] | 2478
3355 | Recidivism of Prisoners Released in 1994 | 2209
Study ID | Study Name | Diversity | Reuse Count | Sec. Impact | Downloaders
3052 | Risk Factors for Violent Victimization of Women in a Major Northeastern City, 1990-1991 and 1996-1997 | 5 | 18 | 26 | 27
3355 | Recidivism of Prisoners Released in 1994 | 29 | 18 | 18 | 4
3450 | Pennsylvania Sentencing Data, 1998 | 30 | 14 | 6 | 29
Thank you! Questions?
Kathleen Fear
kfear@umich.edu