Public Service for Large-Scale Digital Collections
1. Public Service for Large-Scale Digital Collections
John Mark Ockerbloom (Penn)
Leslie Johnston (Library of Congress)
Chris Powell and Jeremy York (Michigan)
DLF forum working session
October 31, 2011
2. It's all about collaboration
• We collaborate with users
• We collaborate with each other
• A distinctive quality of libraries
From a photo by Colleen McMahon, CC-BY
http://www.flickr.com/photos/gatz125/5134393346/
4. Different collections, different scales
• MLibrary Digital Collections
– Presented by Chris Powell
• HathiTrust Digital Library
– Presented by Jeremy York
• Lib. of Congress Digital Collections & Services
– Presented by Leslie Johnston
• The Online Books Page
– Presented by John Mark Ockerbloom
5. Qs: Goals of digital public service
• How do you find out who your users are, and what they most need?
• What's the most useful thing users can do for you, and how can you best get them to do it?
• How do you ensure you're paying attention to the people and issues you should?
• How does the type of collection, interface, or audience affect your interactions?
6. Qs: Implementing digital public service
• Do you have service quality goals and benchmarks? How do you evaluate them?
• Who does the work? How do you allocate labor, and do triage?
• What kinds of technologies or implementations do you find most useful?
• Has user feedback prompted you to change your design or what you do? How?
7. Qs: Evaluating and scaling up digital public service
• Do you use public service data to justify collections and service work? If so, what data do you find most effective?
• How can different services and institutions most effectively collaborate with each other?
• What else is good to know about supporting large-scale digital collection services?
9. Lots of stuff
• Various formats:
– Continuous-tone image collections
– Full-text collections: encoded XML and page images
– Non-MARC metadata collections
– EAD-encoded finding aid collections
• Various sources
• Various access models
11. Footprints
• Centralize email from users of our online resources
• Reply to people who provide contact information
• Act on those that we can, even without contact information
• Delete those we cannot follow up on
• Assigned to those responsible for the content (the triage rules are sketched below)
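These rules amount to a small decision procedure. A minimal sketch in Python; the Ticket fields and return strings are illustrative assumptions, not drawn from the actual Footprints system:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical ticket record; field names are illustrative, not from
# the actual Footprints system.
@dataclass
class Ticket:
    message: str
    contact_email: Optional[str]  # None if the sender was anonymous
    actionable: bool              # can we act without follow-up?
    content_owner: Optional[str]  # staff responsible for the content

def triage(ticket: Ticket) -> str:
    """Apply the rules above: reply when we have contact information,
    act on what we can regardless, delete the rest."""
    if ticket.contact_email is not None:
        return f"reply to {ticket.contact_email}; assign to {ticket.content_owner}"
    if ticket.actionable:
        return f"act without replying; assign to {ticket.content_owner}"
    return "delete: anonymous and not actionable"

print(triage(Ticket("broken page image", None, True, "mbooks-team")))
```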
12. A few numbers
• Just over 5,000 "tickets" in the system, open and closed
• 25,000 tickets have been deleted
– Spam
– Anonymous and not actionable
• Roughly 650 of the 5,000 tickets are categorized as MBooks or HathiTrust
• Moved HathiTrust to its own system in March
14. What are they asking?
• Topics consistent regardless of collection/site
– Did you know this was broken/wrong?
– Now that I've found this, how can I get access?
– Can I use this image in my book/project?
– I have this old book – is it valuable?
– Can you help me with this very complex query?
• The percentages vary in HathiTrust
– More about content errors and access desires
15. What are they not asking about HathiTrust?
• Questions related to the content
• Many repeated questions in some of our collections
– What is Apocrypha? Why is Ecclesiastes 9 showing as Qoh. 9? Why isn't this quote in your online Collected Works of Abraham Lincoln? Why can't I find the verse 9:11 about the wrath of the eagle in your online Koran?
• Also general questions
– This census from 1901 lists people born in this town, but none of the gazetteers or atlases show a town of this name. The Philadelphia Centennial exhibition history says this, a souvenir medallion from a relative says something else. Which is right? Is it fake? Are you sure these poems in Catholic World are by William Gibson?
16. Observations
• HathiTrust feedback forms guide you to error/access reporting, not subject inquiries
• Single-title sites (Bibles, Koran, Lincoln) get the most repeated content questions
• MoA gets many content questions, but has a unifying (if general) theme
• Affiliation with MLibrary may make content questions seem more welcome
• Data skewed by so much past practice
18. Timeline
• October 2008 – HathiTrust launched
– hathitrust-info@umich.edu
• February 2010 – started keeping regular statistics
• August 2010 – Chris joined hathitrust-info
• User support group since April 2011
19. What Digital Public Service Accomplishes (1)
• How do you find out who your users are, and what they most need?
– User feedback via the feedback link
– Frequent discussions with reference & instruction librarians who share their "front-line" encounters with users
– HathiTrust UX Advisory Group & UX-SIG for interested staff at partner institutions to discuss and bring issues to the table
• Developing set of personas and scenarios
– Watch blogs & Twitter for unsolicited feedback
– Somewhat regular user research (surveys)
22. Use statistics

Period         Visits     % new visits  Page views   Pages/visit  Time on site
Jan-Jun 2010   253,129    69%           2,154,385    11.6         6.3 min
Jul-Dec 2010   907,524    75%           8,443,692    9.3          5.3 min
Jan-Jun 2011   2,154,385  83%           14,945,119   6.9          4 min
Jul-Oct 2011   2,274,468  84%           12,072,991   5.3          3.3 min
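For reference, the pages-per-visit column is simply page views divided by visits. A quick check in Python against three of the published rows (figures taken from the table above):

```python
# pages/visit = page views / visits, checked against the table above.
rows = {
    "Jul-Dec 2010": (907_524, 8_443_692),   # (visits, page views)
    "Jan-Jun 2011": (2_154_385, 14_945_119),
    "Jul-Oct 2011": (2_274_468, 12_072_991),
}
for period, (visits, page_views) in rows.items():
    print(f"{period}: {page_views / visits:.1f} pages/visit")
# -> 9.3, 6.9, and 5.3, matching the table.
```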
23. What Digital Public Service Accomplishes (2)
• What's the most useful thing users can do for you, and how can you best get them to do it?
– Let us know of problems, issues, desires (content and services)
– People who are interested do let us know
24. User Support Issues
[Chart: user support issues by category (Content, Cataloging, Access and Use, Web Applications, Partner Ingest, General) for Apr-Jun 2011, Jul 2011, Aug 2011, and Sep 2011; counts range from 0 to roughly 450]
25.

Issue Type                           August Issues  September Issues
Content                              110            171
  Quality                            96             154
  Non-partner Digital Deposit        3              2
  Collections                        8              4
Cataloging                           26             25
Access and Use                       111            127
  Copyright                          58             73
  Permissions                        23             12
  Takedown                           2              3
  Print on Demand                    6              17
  Inter-library loan                 0              5
  Full-PDF or e-copy requests        14             24
  Datasets                           1              1
  Data Availability and APIs         1              7
  Reuse of content                   7              5
Web applications                     27             22
  Functionality problems             5              5
  Problems specifically with login   1              0
  General questions about login      3              2
  Partners setting up login          4              5
  Usability issues                   11             6
  Feature requests                   7              2
Partner Ingest                       2              0
General                              59             65
  Partnership                        13             12
  Infrastructure                     1              0
  Miscellaneous                      45             53
27. What Digital Public Service Accomplishes (3)
• How do you ensure you're paying attention to the people and issues you should?
– Ensure that we look at many different sources to get the largest variety possible
– We receive, find, and solicit feedback in the ways noted above
28. What Digital Public Service Accomplishes (4)
• How does the type of collection, interface, or audience affect your interactions?
– We are pretty uniform across collections and audiences
29. Implementing Digital Public Services (1)
• Do you have service goals and benchmarks? How do you evaluate them?
– Respond to feedback within 1 business day
– Handle issues such as take-down notices and critical systems problems immediately
– No goals for resolving other problems, but corrections happen pretty quickly
– User support as outreach mechanism
• Important to how we are seen by the community
30. Implementing Digital Public Services (2)
• How do you allocate staff, and do triage?
– User support group
– Rotation of 24-hour "on-call" periods
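One simple way to implement such a rotation is round-robin over the support-group roster by day. A minimal sketch with placeholder names and an assumed rotation start date; the slides do not describe the actual scheduling mechanism:

```python
from datetime import date, timedelta

# Placeholder roster and assumed start date for the rotation.
ROSTER = ["staff_a", "staff_b", "staff_c", "staff_d"]
ROTATION_START = date(2011, 4, 1)

def on_call(day: date) -> str:
    """Round-robin: each member covers every len(ROSTER)-th 24-hour period."""
    return ROSTER[(day - ROTATION_START).days % len(ROSTER)]

# Print a week of coverage starting the day of the session.
for offset in range(7):
    day = date(2011, 10, 31) + timedelta(days=offset)
    print(day, on_call(day))
```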
31. Implementing Digital Public Services (3)
• What kinds of technologies or implementations do you find most useful?
[Diagram: users (partner and non-partner) reach us through all-HathiTrust and University of Michigan contact points; issues flow to the User Support Working Group, with specialists for Copyright, Quality, and Print on Demand, supported by documentation and tracked in JIRA]
32. Implementing Digital Public Services (4)
• Have you changed your design or what you do based on user feedback? How?
– PageTurner improvements
• BookReader integration and accompanying interface reorganization; full-screen mechanism
• Advanced search features for full-text search
• Labeling for PDF download
33. Evaluating and Scaling Up Digital Public Service (1)
• Do you use public service data to justify collections and service work? If so, what do you find most effective?
– No, but communication was a key factor in expanding to a working group and a partner-wide system
34. Evaluating and Scaling Up Digital Public Service (2)
• How can different services and institutions most effectively collaborate with each other?
– Commonly-usable system
– Defined roles (contacts, etc.)
– Defined workflows and procedures
• What else is good to know about supporting large-scale digital collection services?
– Factor it in; don't underestimate the time needed
36. So Much Stuff, It's Used as a Unit of Measure
• What constitutes a "Library of Congress" worth of digital content changes all the time.
• A huge variety of formats: Full Text, Page Images, Image Collections, Finding Aids, Electronic Serials, Video, Audio, Legislation, Web Archives.
– An estimate of 24.6 million files in the main LC web presence at the end of 2010.
• A variety of acquisition methods, including through the US Copyright Office
• A variety of access rules (including classified) and access methods and systems.
37. Questions are Always Coming In
• Via email
• Via "Ask a Librarian" Online Reference
• In Person
• In 2010, Library staff answered over 191,000 email or online reference questions.
38. Who are we serving?
• Congress
• Librarians at other ins5tu5ons
• Academic Researchers
• The General Public
39. What Do They Want to Know?
• Do you have this thing that I couldn't find? You must have it, because the Library of Congress has everything.
• Why isn't everything full text?
• How do I get the rights to use your collections, and get permission to use them in my research or in a publication?
• How can I get copies of your digital files?
• Will you digitize something in your collections for me?
• How do I cite your online collections?
• Please fix this error in your web site, or in your bibliographic or authority records.
• Can I donate digital collections to the Library?
• Will the Library help me digitize my collection?
• What standards do you use for digitizing?
40. What Digital Public Service Accomplishes
• How we learn something about who our users are, and what they need most:
– Interactions with Digital Reference librarians
– User feedback via the Contact Form
– Comments on the Library's blogs
– Share Tool and Web log statistics
• Over 580 million page views of the Library's website in 2010.
41. What Digital Public Service Accomplishes
• What we learn:
– What are we doing wrong? One issue is the complexity of the Library's web presence. We now have an initiative to make it easier to find digital collections without knowing about them in the first place.
– What are we doing right? We get more positive reactions and thanks for what we do than negative ones.
43. What Digital Public Service Accomplishes
• How does the type of collection, interface, or audience affect your interactions?
– Our feedback from Congress is different from our feedback from the public
• Congress makes direct requests for digital collection building and services, depending on the tasks at hand
• Researchers and the public are relatively uniform
48. Evaluating and Scaling Up Digital Public Service
• Do you use public service data to justify collections and service work? If so, what do you find most effective?
– No, we do not. But that doesn't mean that we don't report on them.
49. Evaluating and Scaling Up Digital Public Service
• How can different services and institutions most effectively collaborate with each other?
– Share more information about what collections and digitized collections we have
• Explicitly document rights for media files and metadata
• Make them available as linked open data where we can (a sketch follows this slide)
– Common-ish standards, or at least well-documented standards
– Consistent, documented workflows
• What else is good to know about supporting large-scale digital collection services?
– The scale really does make a difference. Our response times are always going to be slower.
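As promised above, a minimal sketch of publishing a rights statement as linked open data, using Python's rdflib and Dublin Core terms; the item URI is an illustrative placeholder, not an actual Library record:

```python
from rdflib import Graph, URIRef
from rdflib.namespace import DCTERMS

g = Graph()
# Hypothetical item URI; a real record would use the institution's
# persistent identifier for the digitized object.
item = URIRef("http://example.org/collection/item/1234")
# Explicitly document the rights for the media file as a
# machine-readable triple.
g.add((item, DCTERMS.license,
       URIRef("http://creativecommons.org/publicdomain/mark/1.0/")))
print(g.serialize(format="turtle"))
```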
51. Testing the limits of a low-resource catalog project, since 1993
• 1 person working part-time falls behind
– Both in the catalog, and in responding to the public
• Scaling up the catalog
– Metadata automatically downloaded from HathiTrust and other sources (see the sketch after this slide)
– I continue to add new entries and "curate" auto-loaded entries, on my own and at user request
• Scaling up user service
– Invite people to standard forms
– Make an efficient back end to deal with requests
• Ultimately, no substitute for human investment
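A minimal sketch of automatically pulling volume metadata via the public HathiTrust Bib API; the API and URL pattern are real, but this is an illustrative approach with a sample OCLC number, not The Online Books Page's actual loader:

```python
import json
import urllib.request

def hathi_brief(oclc_number: str) -> dict:
    """Fetch brief record and item metadata for one OCLC number from
    the HathiTrust Bib API."""
    url = (f"https://catalog.hathitrust.org/api/volumes/brief/"
           f"oclc/{oclc_number}.json")
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# Sample OCLC number; the response lists HathiTrust item IDs and
# rights codes that a catalog loader could act on.
data = hathi_brief("424023")
for item in data.get("items", []):
    print(item.get("htid"), item.get("rightsCode"))
```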
58. Looking ahead: Collaboration
• Number of online books still to grow a lot
– Especially as I add more automatically loaded sources
• Staff and budget not likely to grow
– User feedback helps me determine where effort is best spent
• Enhanced back-end interface may make multiple maintainers feasible
• May be useful collaborations with interns, volunteers
– A way to get hands-on experience with librarianship
• Sharing data helps other services build on my work
– Enhancing "regular" library collections may enable support
59. Discuss: Goals of digital public service
• How do you find out who your users are, and what they most need?
• What's the most useful thing users can do for you, and how can you best get them to do it?
• How do you ensure you're paying attention to the people and issues you should?
• How does the type of collection, interface, or audience affect your interactions?
60. Discuss: Implementing digital public service
• Do you have service quality goals and benchmarks? How do you evaluate them?
• Who does the work? How do you allocate labor, and do triage?
• What kinds of technologies or implementations do you find most useful?
• Has user feedback prompted you to change your design or what you do? How?
61. Discuss: Evaluating and scaling up digital public service
• Do you use public service data to justify collections and service work? If so, what data do you find most effective?
• How can different services and institutions most effectively collaborate with each other?
• What else is good to know about supporting large-scale digital collection services?
– What have you learned that you didn't expect?