Datasets are increasingly emerging as a ‘new currency’ in collection development. While purchasing models may in some ways mirror more traditional forms of electronic information, there are many unique considerations in the collection and acquisition of datasets. The purpose of this study is to determine the extent to which academic libraries have formalized dataset collection development policies and to highlight some of the key considerations in the development of such policies. The focus here is on commercially available datasets, rather than datasets produced at home institutions.
What to do about data? An overview of guidelines and policies for dataset collection development
1. What to do about data?
An overview of guidelines and policies for dataset collection development
Sarah Young, Health Science and Policy Librarian, Cornell University, sy493@cornell.edu
Why purchase data?
Secondary datasets are increasingly
important to researchers as they attempt to
answer questions, make predictions and
test hypotheses in new and powerful ways.
For libraries that strive to provide
information to support research needs,
these datasets can be considered a ‘new
currency’ in collection
development. There are many unique
considerations in the collection and
acquisition of datasets.
Methods
Currently existing dataset collection
development policies, guidelines and
programs were gathered from web
searches of academic library websites,
calls to listservs and personal
communications. A total of 18 policies,
guidelines, or programs were identified and
considered in this work. A literature review
was conducted with a focus on the
collection of commercially available
datasets. For references and links to dataset collection development
policies, please see handout.
Getting a dataset collection
development program off the
ground:
Getting the word out
Liaison librarians and subject selectors can
and should be involved in working with
researchers and faculty across disciplines,
particularly in the beginning stages of the dataset
evaluation process. They can help determine if
free datasets, or datasets already held in library
collections, meet researcher needs and can get
the word out to departments.
Handling requests
Requests can be handled on an ad-hoc basis or
via formal application procedures. Two
institutions examined in this study provided an
online application process through which
researchers could apply for library support for
dataset purchasing (University of Cincinnati and
the University of Illinois).
Negotiating licenses
License negotiation can be lengthy and tedious;
commercial vendors selling datasets are often
used to working with individual researchers, not
libraries or institutional licensing arrangements.
Datasets in the Workflow
Decide whether datasets will be treated like
other electronic acquisitions. Licenses may be
negotiated by e-resource acquisitions
departments with expertise in negotiating terms
of use. Datasets should be integrated into the
normal cataloguing workflow, and should be
considered a part of the digital preservation
program.
Purpose
The purpose of this overview was
to get a sense of current
approaches to dataset collection
development at other research
institutions, to determine key
considerations in dataset
purchasing, and to highlight
particular challenges in
implementing a dataset collection
development program.
Considerations in dataset purchasing
Cost
The amount the library is willing and
able to contribute to a given dataset
should be considered, with joint
purchases between the library and the
researchers when possible.
Format
Data should be provided in a format that can
be supported by the library and used by the
researcher. Consider readability in commonly
used statistical software.
Documentation
Datasets that come with adequate
documentation and relevant metadata are
preferred. Consider the language and ease of
cataloguing.
Storage needs
Datasets should comply with the library's
existing storage capabilities. Confidential
data requires special storage and access
considerations.
Quality
The commercial supplier of the data and the
data itself should be vetted for quality and
reliability, and long-term access ensured.
Terms of Use
Datasets purchased should be institutionally
accessible to all faculty, students and staff.
Terms should be in accordance with those for
other electronic resource purchases made by
the library. Consider fair use and the rights
of scholars to data derivatives.
Scope and Relevance
Datasets with a broad subject appeal to the
research community, supporting the mission of
the institution, should be prioritized.
Consider currency, the value of historical
data, and geographic scope. Will the value of
a dataset increase or decrease over time?
Acknowledgements
Thanks to all of those who took the time to thoughtfully
respond to listserv inquiries!
2. New England Area Librarian e-Science Symposium Poster Session, April 9, 2014
Dataset purchasing policies, guidelines and programs
*Brown University http://library.brown.edu/about/datacenter
Carleton College https://apps.carleton.edu/campus/library/assets/FacParticipation2012_13rev.pdf
Duke University not online; personal communication
Emory University https://edc.library.emory.edu/content/policy
Georgetown University http://guides.library.georgetown.edu/datapolicy
Harvard University https://hcl.harvard.edu:8001/forms/requests/data_purchase_guidelines.cfm
*MIT http://libguides.mit.edu/ssds/suggest
New York University not online; personal communication
*Yale University http://csssi.yale.edu/collections/data?destination=node%2F18
James Madison University http://www.lib.jmu.edu/faculty/datasetcdpolicy.aspx
*McMaster University http://library.mcmaster.ca/maps/Library_Data_Service_Collection_Policy.pdf
*Michigan State University http://libguides.lib.msu.edu/dataservicescollectiondevpolicy
*NC State University http://libguides.lib.msu.edu/dataservicescollectiondevpolicy
*Texas A&M http://library.tamu.edu/about/collections/collection-development/tamu-purchased-data-value-statement.html
**University of Cincinnati http://webcentral.uc.edu/taftawards/programdetail.cfm?programid=8
**University of Illinois at Urbana-Champaign http://www.library.illinois.edu/sc/datagis/purchase/description2013.html
University of New Hampshire http://www.library.unh.edu/research-support/data-services
University of North Carolina http://library.unc.edu/services/data/purchase/
* Denotes institutions with detailed dataset collection development policies online.
** Denotes institutions with formal application processes for dataset purchasing programs.
3. New England Area Librarian e-Science Symposium Poster Session, April 9, 2014
Other data policies to consider:
ICPSR (Inter-University Consortium for Political and Social Research)
http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/policies/colldev.html
UK Data Archive
http://www.data-archive.ac.uk/media/54773/ukda067-rms-collectionsdevelopmentpolicy.pdf
References
Church, J. (2008). International Survey Data: Challenges and Strategies for Collection
Development. DttP: A Quarterly Journal of Government Information Practice &
Perspective, 36(1), 12–16.
Davis, H. M., & Vickery, J. N. (2007). Datasets, a Shift in the Currency of Scholarly
Communication: Implications for Library Collections and Acquisitions. Serials Review,
33(1), 26–32. doi:10.1016/j.serrev.2006.11.004
Dollar, D., Eow, G., Linden, J., & Grafe, M. (2013). Distinctive Collections: The Space Between
“General” and “Special” Collections and Implications for Collection Development. In
Proceedings of the Charleston Library Conference. Charleston, SC: Purdue University
Press. doi:10.5703/1288284315094
Erwin, T., Sweetkind-Singer, J., & Larsgaard, M. L. (2009). The National Geospatial Digital
Archives—Collection Development: Lessons Learned. Library Trends, 57(3), 490–515.
Florance, P. (2006). GIS collection development within an academic library. Library Trends,
55(2), 222–234.
Lee, S. D. (2002). Electronic collection development: a practical guide. New York; London:
Neal-Schuman Publishers; Library Association Pub.
Mooney, H., Hogenboom, K., Bordelon, B., Partlo, K., Hudson, M., & Jankowska, M. (2013, May
29). Strategies and Models for Data Collection Development. Presented at the
International Association for Social Science Information Services & Technology (IASSIST)
2013, Cologne, Germany. Retrieved from
http://iassistdata.org/downloads/2013/2013_c2_mooney_etal.pdf
Teper, T. H., Hogenboom, K., & Wiley, L. N. (2011). Collecting Small Data. Research Library
Issues: A Quarterly Report from ARL, CNI, and SPARC, 276, 12–19.
Walters, W. H. (1999). Building and maintaining a numeric data collection. Journal of
Documentation, 55(3), 271–287.