RDAP 15 Data Management Outreach for the Humanities: A University of Illinois at Chicago Case Study
WRITING
SOURCES
IMAGES
TABLES
MANAGING HUMANITIES DATA
TEXT
Identifying Solutions
•Scholars generate their own original writing in the form of
monographs, articles, chapters, proposals for funding, and
other works.
• Many such projects are created through collaborative
efforts: scholars co-authoring a chapter, an editor and
author correcting an article, an institution or department
requesting funding for a workshop series.
•Collaborative work must be coherent to all parties; chronology of drafts
must be clear and edits preserved through versions.
•Working from a single draft is key to version control.
•Scholars across UIC departments are encouraged to work in Box;
scholars at other institutions without access to Box can use Google Drive.
•Neither service will claim ownership of written data.
•Textual data consist of words; words form works, and works
form corpora.
•These appear as print material, manuscripts, digital plain
text, or marked-up and encoded text (marginalia compiled in
a text file; an HTML version of a 19th-century book of poetry;
encoded lists of adjectives ascribed to a Twitter hashtag).
•Tabular data include spreadsheets or comma-separated
value files.
•Tables may be renditions of primary source tables (e.g. a
ledger book, census), or aggregations of data points (e.g.
deaths due to scurvy across a century, commodities prices
in multiple countries).
•Image data are digital files of photographs, artwork, maps,
screenshots, and other visual renderings for study or
publication.
•Scholars visiting archives may take their own photos of
collection materials; some of these images must last years.
•Scholars note their primary and secondary sources (their
locations and significance); before the proliferation of
personal computers, these were typically tracked on notecards.
•Sources are tracked through the research phase of a
project; they must remain accessible through the writing
phase (2-10 years).
•Source notes must be kept for access in future projects.
•Some projects reach completion in 2-5 years and others will last 5+
years, but certain data must last throughout a scholar’s career; clear,
thorough documentation is key for later analysis.
•Images should be embedded with metadata on subject matter, dates,
location, rights holders (in the event of publication), and other information
as necessary for the field. Metadata (structured or unstructured) will also
clarify textual and tabular data.
•Tabular data benefit from data dictionaries, which illuminate “hidden”
information in tables and charts that the researcher may forget
over time.
•Constraints or definitions of variables can be noted: a column “County
Name” may include “as defined by 1980 county boundaries”, a column
“Year” may include “fiscal year July 1-June 30”.
•Coded values can be forgotten after a period of disuse: values S, C, and
F in a column may refer to “sonata”, “cantata”, and “fugue” and can be
noted in the dictionary.
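A data dictionary of this kind could be kept as a small structured file alongside the table. The following is a minimal sketch, not a prescribed format: the column name "Form" and the three-field layout (column, code, meaning) are assumptions made for illustration, while the notes and coded values are the examples given above.

```python
import csv
import io

# Hypothetical data dictionary. The "Form" column name is an assumption;
# the notes and coded values are the examples from the poster text.
DATA_DICTIONARY = {
    "County Name": {"note": "as defined by 1980 county boundaries"},
    "Year": {"note": "fiscal year July 1-June 30"},
    "Form": {
        "note": "musical form (coded)",
        "codes": {"S": "sonata", "C": "cantata", "F": "fugue"},
    },
}


def write_dictionary(dictionary):
    """Return the dictionary as CSV text: one row per column note or coded value."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["column", "code", "meaning"])
    for column, entry in dictionary.items():
        # A row with an empty code records the column-level constraint.
        writer.writerow([column, "", entry["note"]])
        for code, meaning in entry.get("codes", {}).items():
            writer.writerow([column, code, meaning])
    return buffer.getvalue()


def decode(column, value, dictionary=DATA_DICTIONARY):
    """Translate a coded value; values without a code entry pass through unchanged."""
    codes = dictionary.get(column, {}).get("codes", {})
    return codes.get(value, value)
```

Saving the output of `write_dictionary` next to the spreadsheet keeps the definitions readable in any text editor, even if the original software is no longer available.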
•Backup is essential for all data types that take many hours to replicate
(compiled tables or writing); are prohibitively expensive to replace (data
gathered from remote archives); or may be impossible to replace (e.g.
photographs taken by a researcher at an event).
•Scholars should understand the necessity of additional copies of static
data and regular backups for active data.
•Institutions should promote simple, safe, cloud storage providers with
which the institution has business agreements.
•Cloud storage or hard drive storage can simplify the process of
transferring data to a researcher’s new laptop or other device.
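For static data such as archive photographs, an additional copy is only useful if it matches the original. One possible way to make and check such a copy is sketched below, using only the Python standard library. The function names and the choice of SHA-256 are assumptions for illustration; cloud services such as Box handle this differently.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source, backup_dir):
    """Copy a file into backup_dir and confirm the copy is identical."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    # copy2 also preserves timestamps where the platform allows it.
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Backup of {source} does not match the original")
    return target
```

Calling `backup_file("photo.jpg", "/path/to/external/drive")` (both paths hypothetical) would place a verified copy on a second device.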
•Sources accumulate over the length of a career and must last years.
•Citation management software may be obtained through an institution;
Zotero, however, is independent and open source.
•Zotero can be synced to the cloud and saved on a local drive.
•RDF or CSV copies of sources should be retained and backed up, in case
of loss of access to institutional or other source management programs.
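A plain CSV copy of source notes can be read without any particular citation manager. The sketch below shows one way such a file might be written and read back; it does not use Zotero's own export feature, and the three field names (title, location, significance) are assumptions based on the description of source notes above.

```python
import csv
import io

# Hypothetical field names, following "their locations and significance".
FIELDS = ["title", "location", "significance"]


def sources_to_csv(sources):
    """Serialize a list of source-note dictionaries to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(sources)
    return buffer.getvalue()


def sources_from_csv(text):
    """Read source notes back from CSV text as a list of dictionaries."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


# Example record; the significance note is invented for illustration.
example = [
    {
        "title": "A. Philip Randolph, September, 1926",
        "location": "Pullman Company Archives, 06-01-04, Box 17, Folder 457",
        "significance": "example note",
    }
]
```

Writing `sources_to_csv(example)` to a file and including that file in regular backups gives a copy that survives a change of institution or software.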
•UIC Library provides data management assistance to all disciplines
across 15 colleges and 4 campuses.
•Outreach takes the form of workshops, library liaison contacts, and a
postcard mailing to individuals and departments.
•The library assists with data management plans, metadata,
preservation, locating repositories for sharing, security and privacy, and
general data workflows for individuals and teams.
•Current efforts focus on teaching simple steps with immediate benefits
to researchers: standardized organization, rigorous documentation, and
regular backup.
•Historically, researchers in the biological, social, and health sciences
have collaborated with librarians to improve their data management
practices.
•Additional efforts were needed to connect and educate humanist
scholars in history, linguistics, literary and cultural studies, anthropology,
art history, music, and other fields.
•Promotional postcards and the library's web-based research guide were
designed with discipline-neutral features and text suggesting a broad
definition of research data.
• A workshop specifically addressing the issues and solutions for
humanities data was taught, recorded, and distributed via the research
guide.
•As in other aspects of library outreach, success still relies heavily on
personal connections; resources and workshops circulate through word-
of-mouth by scholars who have a relationship with librarians.
•Continuing efforts should strive to establish more connections or take
advantage of other existing channels.
•Graduate students have been receptive to other forms of outreach, but
their time becomes severely limited at certain times of the year.
•Convenience may trump content; many scholars are interested, but
demands on their time make ease of access paramount.
Carmen Caswell,
Academic Resident in Data
Management
DOCUMENTATION
BACKUP
VERSION CONTROL
SOURCE MANAGEMENT
Defining Data Types
Planning Outreach
UIC DATA MANAGEMENT
CONNECTING WITH HUMANISTS
ASSESSMENT AND EVALUATION
A. Philip Randolph, September, 1926. Pullman Company Archives, 06-01-04, Box 17, Folder 457.
S. S. Weinsheimer, May 12, 1890. Pullman Company Archives, 07-00-04, Box 4, Folder 6.
Reserve Officer Training Corps (ROTC), undated. UIC University Archives, 086 UA 90-999 2221.
Collection of the author. Collection of the author.
Diagnosing, Resolving, and Educating on Data Management Issues at the University of Illinois at Chicago