
Using Dataverse Virtual Archive Technology for Research Data Management


One of the most important components of research is access to quality data. Digital data archives must work to increase submission rates to ensure that quality data exist for future researchers. This is a challenge, given that recent studies show that vast amounts of data collected during publicly funded projects are not being archived. Even the best-planned methodology will not succeed when researchers use tainted data or fail to find adequate data. Social science data archivists play a key role in the effort to maintain quality sources of data for social science investigators to repurpose and reuse. The dynamic, circular movement of data between producers and archives is critical to the future of social science research. Data archives have historically provided for this data interchange using considerable human capital. Dedicated archivists and investigators have worked together to ensure that data were processed and placed into an archive best designed for their preservation. This manual process has become increasingly expensive and unwieldy because of the volume of data being produced and the advanced metadata required to give future researchers enough detail to reuse a study. Typically, researchers work with the archives to deposit the data long after the project has been completed and the papers published. Creating metadata manually at this point takes far longer than collecting it earlier in the research life cycle. Recent advances in archival repository software may be the key to streamlining this increasingly inefficient archival process by allowing archivists and researchers to create detailed metadata earlier in the research life cycle, at a point where it takes far less time. Such software gives researchers greater personal control over archival ingest processes, bridging the gap between researchers and archives and possibly increasing submission rates of valuable data to archives.
Archival technology provides tools that manage automated ingest, data cataloging, advanced search and indexing, and rights and access issues. Archival tools also provide proper citation, creation of persistent identifiers, automatic creation of preservation formats, format migration, and statistical analysis of data. Customized branding and citation management give the investigators who collect these data a tool that helps ensure they get the credit they deserve. The Dataverse Network technology has the potential to aid many research groups at UNC in their data management processes and could be used in many disciplines. This presentation will explain the technology and its applicability for managing research data.

Published in: Education, Technology

Using Dataverse Virtual Archive Technology for Research Data Management

  1. Using Dataverse Virtual Archive Technology for Research Data Management (Jonathan Crabtree, Cheryl Thompson)
  2. Outline
     • Overview of Odum and issues around data management
     • Concepts around Dataverse and federated data systems
     • A look into Dataverse Virtual Archives
     • Features of the Dataverse Network
     • Benefits to Researchers & IT providers
     • Exploring new possibilities
  3. H. W. Odum Institute Archive Services
     • The Howard W. Odum Institute was founded in 1924.
     • It is the oldest multidisciplinary social science university institute.
     • Odum Archive Services is host to the third largest catalog of machine-readable social science data in the U.S.
     • Founding member of Data-PASS
     • Founding member of The Library of Congress NDSA
     • The Odum Dataverse Network (DVN) catalog includes polling, census, and other social science and health-related data.
  4. The Problem
     • Different needs for archives, data libraries, researchers, journals, funding agencies…
     • "We should preserve the data"
     • "I want credit for my data"
     • "We need persistent links"
     • "I need a Data Management Plan"
     • "No publications without data"
     Source: Cross, M. Why the Dataverse Network? Available at: thedata.org
  5. Odum’s Solution
     • Dataverse Network: centralized professional archiving with distributed control and recognition
     • Persistent identifiers
     • Fixity
     • Backups & recovery
     • Metadata standards
     • Conversion standards
     • Preservation standards
     • Branding & visibility
     • Data discovery
     • Ease of use
     • Scholarly citation
     • Control over updates
     • Terms of access & use
     Source: Cross, M. Why the Dataverse Network? Available at: thedata.org
  6. How it works
     Source: Cross, M. Why the Dataverse Network? Available at: thedata.org
  7. Supporting data
     • Convert to a preservation format (data and metadata)
     • Calculate Universal Numerical Fingerprint (UNF)
     • Download in multiple formats
     • Download a subset of the data
     • Generate summary statistics
     • Apply Zelig (R) statistical methods
     • Visualize time series
     • Define Terms of Use and Permission
     Supported formats:
     • Tabular data: STATA, SPSS, CSV + control card, tab delimited + DDI
     • Social network data: GraphML
     • Other data or relevant files: all formats are accepted, but only tabular files have full data support
     Source: Cross, M. Why the Dataverse Network? Available at: thedata.org
  8. Creating data citations
     • Author(s)
     • Year
     • Title
     • Persistent URL and ID
     • UNF
     • Distributor
     • Version
     • Other optional fields
     Example: Louis Harris and Associates, Inc., 1992, "Harris 1984 Female Veterans Survey, study no. 843002", http://hdl.handle.net/1902.29/H-843002 UNF:3:4VngKZgBorG/7T6aZSaq1g== Odum Institute;Odum Institute for Research in Social Science [Distributor] V1 [Version]
     Source: Cross, M. Why the Dataverse Network? Available at: thedata.org
  9. Managing data and versions
     Diagram labels: Contributor, curator, admin view; End user view; Data File 1; Data File 2; Edit study & add new file
     Source: Cross, M. Why the Dataverse Network? Available at: thedata.org
  10. Data never permanently deleted
     • A study is never permanently deleted after it is released. Curators or admins can deaccession the study.
     Screenshot labels: Edit study; "This study is deaccessioned. [Go to other study]"
     Source: Cross, M. Why the Dataverse Network? Available at: thedata.org
  11. Supporting standards
     • Study and variable metadata are exported into XML (Dublin Core, Data Documentation Initiative (DDI), FGDC) and MARC
     • OAI-PMH for harvesting metadata
     • LOCKSS for data duplication in multiple locations
     • Z39.50 for distributed search
     • E-Z Proxy to authenticate for data access
     • Federations enabled via standards
     Source: Cross, M. Why the Dataverse Network? Available at: thedata.org
  12. Replicating data
  13. Dataverse Virtual Archives
     • Custom web skins
     • Researchers retain control of data access
     • Citations provide academic credit for data collection work
     • Easy access to online research tools
  14. (Slide 17) Dataverse Features
     • Federated search & discovery
     • Online analysis
     • Multi-format download
     • Collection organization
     • Automated metadata generation
     • Custom metadata templates
     • Controlled ingest workflows
  15. (Slide 24) Data archiving in 4 steps
     • Gather and convert study files to the appropriate format
     • Log into your virtual archive
     • Add a new study
     • Add the study files
  16. (Slide 34) Moving beyond social science
     • Dataverse Network is cross-disciplinary.
     • We are expanding the study metadata and building communities of interested groups: [email_address]
     Source: Cross, M. Why the Dataverse Network? Available at: thedata.org
  17. (Slide 35) Benefits to…
     Researchers:
     • Gives recognition to authors/researchers
     • Creates a permanent data citation with UNF
     • Converts data and study files to a preservable format
     • Allows researchers to set who can access the data (and modify this at a later point)
     IT/Computer support:
     • It’s free
     • No additional software is needed for Dataverse
     • Offloads long-term data preservation concerns
  18. (Slide 36) Questions?
     • Jonathan Crabtree, Asst. Director for Archives & IT
       Phone: (919) 962-0517
       Email: Jonathan_Crabtree@unc.edu
     • Cheryl A. Thompson, Graduate Research Assistant
       Email: cheryl_thompson@unc.edu
     • Email: odumarchive@unc.edu
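
The citation fields listed on the "Creating data citations" slide can be assembled mechanically. The following is a minimal Python sketch that reproduces the layout of the slide's Harris example. The `DataCitation` class and its `render` method are hypothetical names introduced here for illustration; this is not the Dataverse Network's own citation code, and the optional fields mentioned on the slide are left out.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """Required citation fields named on the slide (hypothetical container)."""
    authors: str
    year: int
    title: str
    persistent_url: str
    unf: str
    distributor: str
    version: str

    def render(self) -> str:
        # The layout copies the example citation on the slide; it is an
        # assumption that other studies follow exactly the same pattern.
        return (
            f'{self.authors}, {self.year}, "{self.title}", '
            f"{self.persistent_url} {self.unf} "
            f"{self.distributor} [Distributor] {self.version} [Version]"
        )


# Values taken verbatim from the example citation on the slide.
harris = DataCitation(
    authors="Louis Harris and Associates, Inc.",
    year=1992,
    title="Harris 1984 Female Veterans Survey, study no. 843002",
    persistent_url="http://hdl.handle.net/1902.29/H-843002",
    unf="UNF:3:4VngKZgBorG/7T6aZSaq1g==",
    distributor="Odum Institute;Odum Institute for Research in Social Science",
    version="V1",
)
print(harris.render())
```

Keeping the UNF and version as separate fields reflects the slide's point that a citation identifies both where the data live (the persistent URL) and exactly which content was cited (the fingerprint and version).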
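
The "Supporting standards" slide names OAI-PMH as the protocol for harvesting metadata. As a rough sketch of what a harvester does, the Python below builds a `ListRecords` request URL and pulls Dublin Core titles out of a response. The endpoint `https://archive.example.org/oai` is a placeholder, since the slides give no base URL, and `SAMPLE_RESPONSE` is a hand-written, heavily trimmed response, not output from a real Dataverse server. The helper names are likewise assumptions made for this example.

```python
from typing import List
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Placeholder endpoint: the slides do not give an OAI-PMH base URL.
BASE_URL = "https://archive.example.org/oai"

# Namespace of the Dublin Core element set used by the oai_dc format.
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build a ListRecords request; the verb and parameter come from OAI-PMH 2.0."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(response_xml: str) -> List[str]:
    """Collect the text of every dc:title element in a harvested response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter(f"{{{DC_NS}}}title")]


# Hand-written, trimmed response for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Harris 1984 Female Veterans Survey, study no. 843002</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url(BASE_URL))
print(extract_titles(SAMPLE_RESPONSE))
```

A real harvester would also fetch the URL over HTTP and follow resumption tokens for large result sets; both are omitted here to keep the sketch self-contained.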