Computers in Libraries 2012 - Discovering Data: Cataloguing Data Collections
Discovering Data: Cataloguing Data Collections
Kimberly Silk, Data Librarian, Martin Prosperity Institute, University of Toronto
Steve Marks, Digital Preservation Policy Librarian, University of Toronto Libraries (in absentia)
Setting the Stage
• Steve works for OCUL, a consortium of Ontario’s university libraries, which is housed at the University of Toronto.
• Kim works for MPI, a think tank at the University of Toronto.
• We both have a lot of data to manage, but (until recently) had no way to do so.
• This is the story of how we met, and how we’re figuring out how to manage our data.
The University of Toronto is Canada’s largest university, with almost 80,000 undergraduate students and over 15,000 graduate students. Thirty physical libraries make up the UT Library system, plus various specialized collections. With over 18 million holdings, UTL is the third-largest system in North America.
Martin Prosperity Institute
The MPI is a think tank within the Rotman School of Management at the University of Toronto. We study the role of location, place and city-regions in global economic prosperity.
OCUL is a consortium of Ontario’s 21 university libraries. Goal: to support and enhance research and to create rich learning environments.
How Kim Met Steve
Kim:
• “Since I began working at the MPI in 2008, I have been building systems and introducing tools to manage our information and our research process;
• BUT, I had a growing collection of over 4 TB of data to deal with; lots of data, and no way to manage or search it;
• How do I catalogue data??”
Steve:
• “We were starting to hear concerns from OCUL schools that they were starting to field questions from faculty and administration about what they were doing to support research data.
• A lot of these schools don’t have the IT resources to rapidly launch a program like that, so we saw it as an opportunity to help them out.
• It also gave us a controlled way to start exploring some of the ideas around research data management ourselves.”
The Challenge
Lots of research data, and no way to manage it.
Demands from various audiences (researchers, faculty, students, staff, administration).
Kim came across Dataverse, which looked like a very promising solution, and a much better alternative to an unwieldy, awkward network drive.
• Used by leading data repositories, including ICPSR at the University of Michigan, UCLA’s Social Science Data Archive, and the National Bureau of Economic Research
• Open source, so free; not that easy to install, but the documentation is now much better and Steve got lots of support from IQSS staff
• Take a look at http://thedata.org
This is how it happened:
Kim: Hey Steve, I’ve been looking at Dataverse, and the guys told me you’re playing with it.
Steve: Yeah, I am. I’m in the middle of installing it.
Kim: I’ve been trying to find a way to catalogue my data. Can I give Dataverse a try?
Steve: Sure! Would you mind being a test case for a presentation I’m making to OCUL?
Kim: Would love to! Let me know when it’s ready, and I’ll add some data.
And now, a demo.
• MPI Dataverse: http://dataverse.scholarsportal.info/dvn/dv/mpi
What I Like
• Very powerful, lots of metadata options available
• Fairly easy to use
• Lots of other Dataverses to look at as models
• Can control access, and create multiple access/permission levels
• Usage statistics on downloads, web site traffic via Google Analytics
• Great marketing tool for our data collection - we can show the world what we have (but not necessarily let the world access it).
Challenges
• It takes a LOT of planning and time
• Planning, because you want your records to be consistent
• Time, because it just takes time to create the records - there’s no WorldCat for data!
• Every data collection is unique, but you can look at other models to inform your own design.
• We’re still at the beginning, in pilot mode.
Thanks for Listening
Kim Silk, Data Librarian, Martin Prosperity Institute
firstname.lastname@example.org
@kimberlysilk