DATAVERSE
Julie Goldman 2014
HARVARD UNIVERSITY
INSTITUTE FOR QUANTITATIVE SOCIAL SCIENCE
DATAVERSE NETWORK PROJECT
• repository for research data that takes care of
long-term preservation	

• employs established archival practices while allowing
researchers to keep control of and receive
recognition for their data
DATA MANAGEMENT
• provides access and sharing capabilities	

• allows researchers to deposit data in organized,
curated and citable network	

• promotes access and sharing
ARCHIVING
• metadata is exported to XML	

• data files reformatted for long-term access	

• all versions are kept	

• metadata and data are replicated to multiple locations
through LOCKSS	

๏ Lots of Copies Keep Stuff Safe (Stanford University)
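
As a rough illustration of the first bullet, exporting a study's cataloging information to XML might look like the sketch below. The function name `metadata_to_xml` and the element names (`study`, `title`, `author`, `date`) are hypothetical and chosen only for illustration; they are not the export schema the Dataverse Network actually uses.

```python
import xml.etree.ElementTree as ET


def metadata_to_xml(metadata):
    """Serialize a flat dict of study metadata to an XML string.

    Element names are illustrative only; they are NOT the real
    Dataverse Network export schema.
    """
    root = ET.Element("study")
    for field, value in metadata.items():
        # One child element per metadata field, holding its value as text.
        child = ET.SubElement(root, field)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up example values, not taken from a real study.
example = {"title": "Midterm Study", "author": "A. Researcher", "date": "2014"}
xml_text = metadata_to_xml(example)
```

A plain-text format such as XML can be read back without the software that wrote it, which is one reason it suits long-term preservation.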
TERMINOLOGY
Schematic Diagram of a Dataverse in a Dataverse Network
• Dataverse: Container for your Research Data Studies
๏ Research Study #1
๏ Research Study #2
Schematic Diagram of a Study in a Dataverse
• Study: Container for your Research Data
๏ Cataloging Information
๏ Data sets
๏ Documentation
๏ Code
UNIQUE SERVICES
• tabular data sets

๏ files with rows and columns (SPSS, STATA, CSV) can be subset	

๏ user can extract only some of the variables	

• social network data	

๏ data that describes a network of entities and relationships	

๏ sets uploaded in GraphML format to provide flexibility
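
To make the two services above concrete, here is a minimal sketch of what subsetting could look like on the user's side: keeping only some variables of a tabular file, and listing the relationships in a GraphML file. The helper names `subset_columns` and `graphml_edges` and the toy data are assumptions made for this example; this is not how the Dataverse Network implements these features.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Standard GraphML XML namespace, in ElementTree's {uri}tag notation.
GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"


def subset_columns(csv_text, keep):
    """Return the rows of a CSV text restricted to the variables in `keep`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{name: row[name] for name in keep} for row in reader]


def graphml_edges(graphml_text):
    """Return (source, target) pairs for every edge in a GraphML document."""
    root = ET.fromstring(graphml_text)
    return [
        (edge.get("source"), edge.get("target"))
        for edge in root.iter(GRAPHML_NS + "edge")
    ]


# Toy tabular data: three variables, of which we keep two.
table = "id,age,income\n1,34,52000\n2,41,61000\n"
subset = subset_columns(table, ["id", "income"])

# Toy network: three entities and two relationships.
graph = (
    '<graphml xmlns="http://graphml.graphdrawing.org/xmlns">'
    '<graph id="G" edgedefault="undirected">'
    '<node id="a"/><node id="b"/><node id="c"/>'
    '<edge source="a" target="b"/><edge source="b" target="c"/>'
    "</graph></graphml>"
)
edges = graphml_edges(graph)
```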
Data sharing and archiving with control and recognition for data authors
• Persistent Data Citations: linking data to publications
• Customized Branding: or embed on your site
• Support for All File Types: any format
• Data Restrictions: & terms of use options set by data author
Rich data support for certain file formats
• SPSS, Stata, R Data: metadata extraction, subset & R analysis
• FITS Data: metadata extraction
• Social Network Data (GraphML): smart queries & subsetting
• Data Visualizations: for time series
Data management, standards and archival best practices
• General and Domain-Specific Metadata: following metadata standards
• Data Versioning: preserve & cite previous versions
• Traffic & Downloads Tracking: for your data with Guestbook
• Permanent Storage: preservation format; copies in multiple locations
Harvard Dataverse Network Features
Learn more at: thedata.org or start searching and uploading at thedata.harvard.edu
Coming soon in Dataverse 4.0:
• Redesign of the entire user interface
• Dataverses can contain other dataverses
• Simplified workflows for creating an account, a dataverse, and datasets as well as uploading files
• Terminology changes:
๏ Study now called Dataset
๏ Cataloging Information now called Metadata (general, domain-specific and file metadata)
๏ Collections are now dataverses
Want to participate in Dataverse 4.0 testing? Sign up @ http://tinyurl.com/DVUserTesting
USING DATAVERSE
• web interface	

๏ individual researcher can create a dataverse through a web form
on the DVN and deposit their own data sets	

๏ dataverses are organized into studies, each of which is given a data citation
so it can be referenced

• software installation	

๏ institution can install it on their servers and create their own
Dataverse Network
Getting Started with the Harvard Dataverse Network
Step 1: Go to: thedata.harvard.edu
Step 2: Create Account
open to Harvard & non-Harvard users
Step 3: Create your own Dataverse
for your own research, project, journal, and more
Step 4: Create a Study
describe the study to receive a formal data citation (w/ persistent URL) for others to discover and cite your work (required fields are author, title, and date, plus optional fields)
Step 5: Upload Data Sets + Code + Documentation
any format or # of files, with a max of 2GB/file, with more features for certain formats (SPSS, Stata, R, FITS, GraphML)
Step 6: Release Study + Dataverse
for others to find, cite, share, and reproduce analyses of your study
Learn more at: thedata.org
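
Before working through Steps 4 and 5, it can help to check your material against the two constraints stated above: the required fields (author, title, date) and the 2GB/file limit. The sketch below is one way to do that locally. The function `check_study`, the dict shapes, and the reading of "2GB" as 2 GiB are all assumptions made for this example; none of it is part of the Dataverse Network software.

```python
REQUIRED_FIELDS = ("author", "title", "date")  # required fields named in Step 4
MAX_FILE_BYTES = 2 * 1024**3  # Step 5 limit of 2GB/file (assumed to mean 2 GiB)


def check_study(metadata, file_sizes):
    """Return a list of problems; an empty list means nothing was flagged.

    `metadata` maps field names to values and `file_sizes` maps file
    names to sizes in bytes. Both shapes are assumptions of this sketch.
    """
    problems = []
    for field in REQUIRED_FIELDS:
        # A missing key and an empty value are both treated as missing.
        if not metadata.get(field):
            problems.append(f"missing required field: {field}")
    for name, size in file_sizes.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"file too large: {name}")
    return problems


# Made-up example values for illustration.
ok = check_study(
    {"author": "A. Researcher", "title": "Midterm Study", "date": "2014"},
    {"transcript.txt": 12_000},
)
bad = check_study({"title": "Midterm Study"}, {"interview.wav": 3 * 1024**3})
```

Here `ok` is an empty list, while `bad` reports the missing author and date and the oversized file.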
CREATE ACCOUNT
DATAVERSE REPORT
• Due November 1	

• create dataverse	

• add midterm study to the dataverse	

• add datasets to the midterm study	

๏ interview instrument, recorded interview, interview
transcript, data management plan
DATAVERSE REPORT
• report on your experience using DVN	

• 1 page write-up: 	

๏ what you liked, did not like	

๏ what was useful, what was confusing	

๏ did you have to do a lot of searching on the website for help	

๏ try searching for data sets: was it easy or hard
QUESTIONS?

Dataverse Network Project