“More than Meets the Eye” - Analyzing the Success of User Queries in Oria
1. “More than Meets the Eye”
Analyzing the Success of
User Queries in Oria
Hugo Huurdeman, Mikaela Aamodt, Dan Michael Heggø
University of Oslo Library
VIRAK conference, 2017-06-13
2. 1. Introduction
• More insight is needed into library search
issues: what goes right and what goes wrong
• Opportunity: analyze data gathered at UiO during
the last two years
3. Research Questions
1. Which insights can we gain from classifying user
queries within Oria by their popularity, specificity
and intended target resources?
2. To what extent are the most popular user queries
successful?
3. What underlying reasons for unsuccessful queries
can be determined?
6. 2. Data: Primo Analytics
• Actions
• Device usage
• Facets
• Popular Searches (monthly)
• Timeframes: Jan-Jun 2015; Nov-Dec 2015; Jan-Sep 2016
• Sessions
• Zero result queries (daily)
• Aug 7 2015-Sep 30 2016
8. 2. Processing & Analysis
• Basic normalization of queries
• "stanislav andreski" → stanislav andreski
• Manual annotation of
• (1) top 50 popular queries
• (2) random selection of 50 "zero result" queries
• If derivable from query, determine
• nature of query?
• for which resource type?
• curriculum-related? [in pensum lists UiO]
• successful? [result on first results page]
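The basic normalization step could be sketched roughly as follows. Only the removal of surrounding quotation marks is shown on the slide; lowercasing and whitespace collapsing are assumptions, and `normalize_query` and `count_queries` are hypothetical helper names.

```python
import re
from collections import Counter

def normalize_query(query: str) -> str:
    """Normalize one raw query (assumed rules, not the exact ones used for Oria)."""
    # Drop surrounding whitespace and surrounding straight or curly double quotes
    q = query.strip().strip('"\u201c\u201d')
    # Collapse internal runs of whitespace and lowercase (both assumptions)
    q = re.sub(r"\s+", " ", q).strip()
    return q.lower()

def count_queries(queries):
    """Count how often each normalized query occurs."""
    return Counter(normalize_query(q) for q in queries)

print(normalize_query('"stanislav andreski"'))
```

Counting on the normalized form means that quoted and unquoted variants of the same query are tallied together.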
9. 3. Previous work
• Dominance of commercial search engines in
students’ information seeking (Griffiths & Brophy,
2005)
• Students adopting search behavior from
commercial search engines in library OPACs
(Shiv, 2012; Willson & Given, 2010)
• In particular, undergraduates just entering
academia (Novotny, 2014)
10. 4. Nature of user queries
RQ1 Which insights can we gain from classifying
user queries within Oria by their popularity,
specificity and intended target resources?
• Popular Searches and Zero Result Searches
11. 4.1 Popular Searches dataset
• 5,776 different
queries, 115,590
searches (UiO)
• monthly totals (500)
• 2015 & 2016
• 4.9% of all search actions
• Analyzing top 50
• Together issued
almost 20,000 times
13. Top 50
• What types of queries are most popular?
titles (33x, 66%)
• det kvalitative
forskningsintervju
• books, journals, databases
topics / titles (6x, 12%)
• spesialpedagogikk
• neurology
topics (4x, 8%)
• cessio legis
14. Top 50
• What resource types are sought?
Books (50%)
• det kvalitative
forskningsintervju
• det norske samfunn
Journals (12%)
• science
• lancet
Databases (10%)
• atekst
• pubmed
• duo
15. Curriculum?
• Were the queries related to pensum books?
• Quite often: at least 58%!
• menneskets fysiologi
• det kvalitative
forskningsintervju
• Or other things (9%) :)
• aftenposten;
morgenbladet
• harry potter
17. Curriculum?
• Were the queries related to pensum materials?
• Quite often: at least
58%!
• menneskets fysiologi
• det kvalitative
forskningsintervju
• Or other things (34%) :)
• aftenposten;
morgenbladet
18. 4.2 “Zero result” queries
• 39,925 different
queries
• In total 52,257
searches
• Aug 2015-Sep
2016
• 2.2% of all search
actions
• Annotating
random sample
(50)
Examples of zero-result queries, with the number of times each was issued:
• al azm sadik «orientalisme og omvendt orientalisme»: 28
• curr eye res: 27
• allmenningen olaf: 24
• 12136173x: 22
• am j ophthalmol: 21
• 821017268: 21
• askim j.: 19
• 900317809: 19
• cheng: 19
• (direkte krav or direktekrav) and subrogasjon: 17
• 151582998: 16
• 142734500: 16
• direkte krav or direktekrav: 16
• 961675616: 16
• 9780549547303: 16
• agirdag phalet: 16
• andresen steinar elin l. boasson og geir hønneland (2012) international environmental agreements: 15
19. Queries without results
What is the nature of the “zero result” queries performed in Oria?
• Pasted reference (20x, 40%)
• Browning, N. (2015). The
ethics of two-way symmetry
and the dilemmas of dialogic
kantianism. Journal of Media
Ethics
• Title (15x, 30%)
• Sentralbankens oppgaver i
dag og i fremtiden
• Author (8x, 16%)
• Christopher Hotchens
20. Queries without results
Which resource types are not found?
• Book (28%)
• Prcopius Secret History
• Book Chapter (12%)
• Solhaug, (2006). Kapittel 13:
Strategisk læring i
samfunnsfag. I
• Article (24%)
• E.g., pasted references
21. Degree of pensum queries?
At least 28% of the
unsuccessful queries
are for pensum
materials
• Fukuyama, F. (2013): What Is
Governance? Governance,
Vol. 26, No. 3, July 2013 (s.
347–368).
• basic immubology
27. Were the queries successful?
• Often, yes:
• main result in first 10
results: 58%
• not (easily) found: 20%
• For example:
• pubmed
• nature
• science
28. What are causes for
unsuccessful queries?
• Ambiguous names
• nature, science
• At the time of writing, no entries for
some databases
• pubmed
29. 6. Queries without results
RQ3 What underlying reasons for zero result
queries can be determined?
31. Why no results?
• Query being too specific (pasted reference, pasting quote) (22%)
• Browning, N. (2015). The ethics of two-way symmetry and the
dilemmas of dialogic kantianism. Journal of Media Ethics
• Misspellings, reference mistakes (e.g. wrong year) (20%)
• svennevig j.: ledelsesretorikk i nedbemanningssituasjoner 2009
• Using incorrect query syntax (2%)
• "McLuhan" AND/OR "Understanding media"
• 978-147996410-9
• Wrong scope (12%), wrong field (4%)
32. Why no results (2)
• Searching for an ISBN (specific edition not in library)
• 9780618721566
• Searching for journal titles, ISSNs, DOIs
• journal of speech and hearing disorders
• Searching for course codes
• MED1100
• Resource not (indexed) in Oria (16%)
• Haugianerliberalistene: En analyse av haugianere som politikere
og næringslivsaktører
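The categories above were assigned by manual annotation. As a rough illustration only, a few of them could be pre-sorted automatically with simple patterns. The rules, thresholds and the name `guess_cause` below are assumptions, not the method used in this study.

```python
import re

COURSE_CODE = re.compile(r"^[A-Za-z]{2,5}\d{4}$")    # e.g. MED1100, INF2260
ISBN_LIKE = re.compile(r"^(97[89])?\d{9}[\dXx]$")    # 10 or 13 characters, no separators
BOOLEAN_MIX = re.compile(r"\bAND/OR\b")              # mixed operators as in the slide example
YEAR_IN_PARENS = re.compile(r"\(\d{4}\)")            # "(2015)" as in a pasted reference

def guess_cause(query: str) -> str:
    """Heuristically guess why a query may have returned zero results."""
    q = query.strip()
    compact = re.sub(r"[\s-]", "", q)  # ISBNs are often typed with hyphens or spaces
    if ISBN_LIKE.match(compact):
        return "isbn"
    if COURSE_CODE.match(q):
        return "course code"
    if BOOLEAN_MIX.search(q):
        return "query syntax"
    # Long queries containing a year in parentheses look like pasted references
    if YEAR_IN_PARENS.search(q) and len(q.split()) >= 8:
        return "pasted reference"
    return "unknown"

for example in ["9780618721566", "MED1100", '"McLuhan" AND/OR "Understanding media"']:
    print(example, "->", guess_cause(example))
```

Anything the patterns do not recognize stays "unknown" and would still need manual annotation, for instance misspellings or materials that are not indexed.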
33. Suggestions for misspellings?
• 10 out of 50 queries (20%) returned no
results because of misspellings.
• For half of these, a correct spelling
suggestion exists (as of April 2017)
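One possible way to produce spelling suggestions is to compare a failed query against queries that are known to work. The sketch below uses Python's standard `difflib` module. The vocabulary list, the cutoff value and the name `suggest_spelling` are illustrative assumptions, and this is not how Oria generates its own suggestions.

```python
import difflib

def suggest_spelling(query, known_queries, cutoff=0.8):
    """Return the closest known query, or None if nothing is similar enough."""
    q = query.lower()
    matches = difflib.get_close_matches(q, known_queries, n=1, cutoff=cutoff)
    # An exact match needs no suggestion
    if matches and matches[0] != q:
        return matches[0]
    return None

# Hypothetical vocabulary, e.g. taken from previously successful queries
known = ["why students underachieve", "menneskets fysiologi", "det norske samfunn"]
print(suggest_spelling("why students underacheive", known))
```

A higher cutoff gives fewer but safer suggestions; the value 0.8 is only a starting point.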
34. Do queries still return 0 results?
• In 72% of the cases there is no
improvement, but 28% are
now resolved.
• As of April 2017
35. 7. Conclusion & discussion
• Library catalog containing more than
meets the eye
• Even though materials are available,
they do not always show in searches
• Issues in:
• query formulation
• system support...
36. Discussion
• Refine and extend search suggestions
• Enhanced spell check / query corrections
• why students underacheive → 0 results
• why students underachieve → 463 results
• Query suggestions and autocomplete
• Could be based on previous (successful) queries
• especially recurring queries should be supported (50% of
popular queries!)
• ISBN suggestions
• Automatically search for the book title and author
(using information derived from the ISBN?)
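Before an ISBN suggestion can be made, the system has to recognize that a query is an ISBN at all. A minimal sketch, assuming only the standard ISBN-13 check digit rule; the name `isbn13_from_query` is hypothetical, and looking up title and author would need a metadata source that is not covered here.

```python
import re

def isbn13_from_query(query: str):
    """Return the ISBN-13 digits found in a query, or None if it is not a valid ISBN-13."""
    digits = re.sub(r"[\s-]", "", query)  # accept hyphenated or spaced input
    if not re.fullmatch(r"97[89]\d{10}", digits):
        return None
    # ISBN-13 check digit: weights alternate 1, 3 over the first 12 digits
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits[:12]))
    check = (10 - total % 10) % 10
    return digits if check == int(digits[12]) else None

print(isbn13_from_query("978-147996410-9"))
```

A query that passes this check could then be retried as a title and author search instead of ending in zero results.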
37. Discussion
• 0 results: get more helpful information
• Requesting assistance, material
• Contextual search suggestions
• e.g. broaden search, simplify query, alternate
formulations
• Available scopes
• show how many results the search would return in another scope
38. Discussion
• Importance of "pensum" queries
• Better integration with pensum lists
• Detecting curriculum queries
• E.g. widget which suggests pensum
materials directly // referring students to
UiO fagsider // etc.
• On the cataloging side:
• Course codes (e.g. INF2260)
"to catalog or not to catalog"
• More support for Database queries
• Atekst could be found, not pubmed
39. Discussion
• Monitoring searches
• Detect sudden ‘spikes’ in zero result queries
• Finding errors in curriculum lists
• Detecting ‘holes’ in collections (acquisition
staff), and e.g. popular books with too few
copies
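Detecting spikes could start from the daily zero-result counts. The sketch below flags a day when its count is far above the preceding days. The window length, the threshold and the name `find_spikes` are assumptions, and the numbers in the example are invented for illustration.

```python
from statistics import mean, stdev

def find_spikes(daily_counts, window=7, threshold=3.0):
    """Return indexes of days whose count exceeds mean + threshold * stdev of the preceding window."""
    spikes = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Use a floor of 1.0 so that a flat history does not flag tiny changes
        limit = mu + threshold * max(sigma, 1.0)
        if daily_counts[i] > limit:
            spikes.append(i)
    return spikes

# Invented daily zero-result counts; day 7 is an obvious outlier
counts = [100, 105, 98, 102, 101, 99, 103, 400, 104]
print(find_spikes(counts))
```

A flagged day would then be inspected by hand, for example to see whether many users pasted the same reference from a curriculum list.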
40. 8. Future work
• Plans for further analysis
• Comparing queries with most frequent loans
• Alma Analytics
• Current analysis: many known-item queries. Also look at
common exploratory queries
• medicine; math; united nations; economics
• how can we support those types of queries better?
• Obtaining more data (limits of Primo Analytics)
• e.g. analyzing "struggling sessions", location statistics, etc.
• Look at use of external databases, link resolver stats