The summary of what I want to discuss:
- Data management as an emerging service of the academic library: how other places are exploring it, and what the role of catalogers has been so far
- Public service aspects of data management: how we have determined the data needs of researchers, how we have marketed our services to PIs developing DMPs, finding out what they know about curation, and education specifically about what metadata is and why it’s important
- Technical service aspects of data management: knowing about repositories (ours, others), and using our skills as metadata specialists to instruct PIs on best practices for annotating and describing their data so it can be easily found and referenced in whatever repository they drop it in
- Staff training needs: what skill sets are necessary for metadata librarians (faculty and staff) to fully participate in these services as they evolve? How can we make it relevant to existing activities in cataloging departments (eTDs, digital collections)?
Common data services tasks include:
- reaching out to researchers managing data to alert them to library services and determine their needs
- data literacy training for new researchers (i.e. grad students)
- partnering with early-adopting faculty who can champion the library as a data curation service
Need to know:
- The institution’s IR status: if you have one, what its development status is, what sorts of metadata it handles currently, and where it falls short in terms of meeting the institution’s data curation needs (whatever they happen to be).
- If faculty are depositing in external disciplinary repositories, you need to know how to access those (persistent URLs, metadata guidelines for things like Dryad, etc.).
- Establishing metadata standards: things like Dublin Core and MODS, mappings from existing metadata standards to whatever is in place in your repository, and working with data contributors to see how much metadata they have created (likely it’s not much, but you never know) and how to map it to your standards.
- Much of what we’ve discovered is that authority control, more than knowing about MARC, is the most important metadata task in the IR. Because a lot of our metadata is user-contributed and batch editing tools aren’t very mature yet, the most important thing is mapping what we get from users to shared or local vocabularies to enhance discoverability, track the output of various departments and institutes within the university, etc. We’ve struggled with this in a few different contexts, from just getting things up and linked to figuring out what vocabularies to use to describe certain fields in the IR. But this is an area where catalogers have lots of expertise and can leverage it in development work with IT folks.
- Never stop putting yourself in the room: ensure that you advocate for cataloging and metadata services as an essential component of IR development, and in your department as a resource.
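To make the "mapping to your standards" point concrete, here is a minimal sketch of a crosswalk from whatever field labels a contributor happens to use to Dublin Core element names. The contributor labels, the sample record, and the function name are all made up for illustration; only the element names on the right-hand side come from the 15-element Dublin Core set. This is not how any particular repository does it, just the general shape of the task.

```python
# Hypothetical crosswalk: contributor-supplied field labels -> Dublin Core elements.
# The labels on the left are illustrative assumptions, not from any real ingest form.
CROSSWALK = {
    "name of dataset": "title",
    "author": "creator",
    "pi": "creator",
    "abstract": "description",
    "keywords": "subject",
    "date collected": "date",
    "file type": "format",
    "license": "rights",
}


def to_dublin_core(record):
    """Split a contributor record into mapped Dublin Core values and leftovers.

    Returns (mapped, unmapped). Each Dublin Core element holds a list because
    elements are repeatable; unmapped fields are kept for a metadata librarian
    to review rather than silently dropped.
    """
    mapped = {}
    unmapped = {}
    for label, value in record.items():
        element = CROSSWALK.get(label.strip().lower())
        if element is None:
            unmapped[label] = value
        else:
            mapped.setdefault(element, []).append(value)
    return mapped, unmapped


# Made-up sample record for illustration.
sample = {
    "Name of dataset": "Stream temperature readings",
    "PI": "Doe, Jane",
    "Author": "Roe, Richard",
    "Instrument": "thermistor probe",
}
dc, needs_review = to_dublin_core(sample)
```

Keeping the unmapped fields around is deliberate: they are usually where the conversation with the data contributor starts.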
How to introduce current catalogers to data curation metadata tasks?
- Know your environment: participate in the design and UX testing process to ensure that the metadata forms are humane and easy to understand, and that it is as simple as possible for users to contribute their metadata (and for metadata librarians to enhance it via authority control).
- Know your content: cataloging departments have experience in providing metadata for many different forms and genres of content, and data curation is no different, but relationships between different units of content in a data set are challenging to describe.
- Possibilities of using data curation as a teaching tool for RDA and linked data standards, because relationships are so important to this type of work.
What our future steps will be:
- ScholarSphere launches in September. There will be a lot of outreach done to get content, and we’ll be working with Hydra community partners on beta-testing and rolling out new features (batch-editing of metadata, etc.) that will make it easier to manage what we hope will be lots of content over the course of the academic year.
- Outreach, both to let researchers know that we have a place for them to put their data, and just in general on how to develop a DMP and all the various parts thereof. For metadata, it depends on the repository they’re depositing in and to what extent they are depositing data; advising can be anything from just telling them what Dublin Core is to actually going over best practices for documenting individual components of a data set and how they relate to one another, like building a finding aid. (I usually relate it to commenting code, which takes for some people but not for others. YMMV.)
- We’re doing a whole month of usability testing of our repository in July. I suspect this will be an ongoing activity as we extend and refine both our metadata standards and the ingest forms we use to get metadata from users. Because all metadata aggregations in ScholarSphere begin with the end user, we want it to be as humane as possible, so people don’t get discouraged from contributing data to us. More from this awesome article about UX design and forms: http://www.netmagazine.com/features/10-things-every-designer-needs-know-about-forms
- I am still thinking about how to train our staff in a way that doesn’t overwhelm them, which is hard because of RDA. For now I’m mostly encouraging them to participate in our July UX tests and give feedback on how the metadata forms work.
The nature of the content we’re getting is pretty similar to eTDs, and because all of our content has a local focus it’s both natural and kind of easy to think about authority control issues within ScholarSphere: things like controlling for academic programs, institutes, faculty and staff contributors, and eventually units on campus if we ever start to get into university electronic records (off-target for this talk). This one’s a work in progress, like I said before.
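As a rough illustration of what controlling for academic programs or departments can look like, here is a minimal sketch that reconciles user-entered unit names against a small local authority file. The headings, the variant forms, and the function names are hypothetical, made up for the example rather than taken from ScholarSphere; a real authority file would be much larger and probably maintained outside the code.

```python
# Hypothetical local authority file: preferred heading -> known variant forms.
LOCAL_AUTHORITY = {
    "Department of Geography": [
        "geography", "geog", "dept. of geography", "geography department",
    ],
    "Department of Statistics": [
        "statistics", "stats", "dept. of statistics",
    ],
}


def normalize(text):
    """Lowercase, turn commas into spaces, and collapse runs of whitespace."""
    return " ".join(text.lower().replace(",", " ").split())


# Build one lookup table from every normalized form to its preferred heading.
LOOKUP = {}
for heading, variants in LOCAL_AUTHORITY.items():
    LOOKUP[normalize(heading)] = heading
    for variant in variants:
        LOOKUP[normalize(variant)] = heading


def reconcile(value):
    """Return the preferred heading for a user-entered value.

    Returns None when there is no match, so the value can be flagged for
    manual review instead of being guessed at.
    """
    return LOOKUP.get(normalize(value))
```

Exact matching on normalized strings is the simplest thing that could work; anything fuzzier is a separate design decision, since a wrong automatic match is harder to spot later than a value left for review.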
ALA 2012 - Metadata and data curation services
Developing practical metadata services for data management programs
kevin m. clair
american library association annual conference
anaheim, california
25 june 2012
introduction to data management
• data management services in libraries
• public service components
• technical and infrastructural service components
• staff training needs
data services in libraries
• Purdue University
• Cornell University
• University of Houston
public service components
• outreach
• instruction
• partnerships with early adopters
technical service components
• repository development
• awareness of disciplinary repositories
• metadata standards (ours and theirs)
• authority control
• advocating for metadata services
staff training needs
• systems
• managing expectations for data and related content
• connections with RDA and linked data standards
future steps
• repository launch
• further outreach to users
• usability, usability, usability
• training of staff
thanks for listening!
kevin m. clair
metadata librarian
penn state university
firstname.lastname@example.org: @kevinclair