Reinventing Discovery: An Analysis of Big Data



As the late Steve Jobs would say, don’t listen to your customers; they don’t know what they want. Usability studies and live user analysis provide valuable feedback about how your product or web site is used, but asking users what they want out of the tool can result in a “whack-a-mole” scenario in which you solve a problem for one user but create new problems for others. Analyzing usage data provides a very different perspective on how live users actually use the tool and lets you identify different personas and use cases. This talk will share how Serials Solutions collects and analyzes a dataset of queries and clicks generated by millions of users at hundreds of libraries around the world to find behaviors, patterns, successes, and failures in the interface design and search algorithms, and how we use those findings to improve and redesign the interface. We will share the details of our custom-developed data warehouse system and how we use it to perform our analysis. We will also share before-and-after examples developed from the results of the ongoing analysis.

Published in: Technology, Education

  • Despite the positive impact that Summon is having, we know that libraries continue to face challenges. This statement from Project Information Literacy highlights the real challenge that libraries are facing: users’ expectations have evolved and keep changing, so how can libraries possibly keep up?
  • Technology changes rapidly, so it’s no wonder that libraries can’t keep pace. As technology changes, so do user expectations. Four years is an eternity in technology terms. Consider that the iPad had not even been invented when we introduced Summon, and what a profound impact it has had on how users seek and interact with information.
  • And take a look at what Google has done lately. For better or worse, Google sets the bar for what users think of as discovery. This quote comes from Google’s lead interface designer, who explains that Google has changed more in the last two years than in the previous decade. That matters because if Google makes a change, your users’ expectations change right along with it. Their expectations can change almost overnight.
  • Millennials will text the reference desk even if they are in the library
  • 1.1 billion items, 600 active clients, 40 countries, 17 languages, 40 queries per second.
  • Here you can see the behaviors of users. We can easily group them into two main areas: broad topical queries and known-item queries.
  • Let’s put this chart into perspective using a logarithmic scale so you can see it. The majority of searches are one-, two-, and three-word queries.
  • You will see here that there are lots of known item queries and also lots of subject terms grouped together.
  • Here you can see the number of queries in a session. This shows great success. It’s not sharp like the last graph: we have a large number of users who are using the system and working with it, staying in the environment and running lots of queries.
  • Note: This is on a logarithmic scale so you can see the chart. Without the log scale, all you would see is a spike for English.
  • The result is a redesigned interface that’s more streamlined and modern, faster and even easier to use with new functionality designed to provide more guidance and information to the user to help them determine the right results for them.
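
The notes above describe counting the terms in each query and viewing the distribution on a logarithmic scale. A minimal sketch of that tally follows. The list of raw query strings and the whitespace tokenization are assumptions made for illustration; the talk does not describe how Summon tokenizes queries or stores its logs.

```python
from collections import Counter
import math


def terms_per_query(queries):
    """Count how many queries have 1, 2, 3, ... whitespace-separated terms."""
    counts = Counter()
    for query in queries:
        n_terms = len(query.split())
        if n_terms:  # ignore empty queries
            counts[n_terms] += 1
    return counts


def share_at_most(counts, max_terms):
    """Fraction of queries that contain at most `max_terms` terms."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    short = sum(c for n, c in counts.items() if n <= max_terms)
    return short / total


# Hypothetical sample; a real log would hold many millions of rows.
sample_queries = [
    "jstor",
    "global warming",
    "childhood obesity organic baby food",
    "diabetes",
    "endocrinology prolactin",
]

counts = terms_per_query(sample_queries)
for n_terms in sorted(counts):
    # log10 compresses the tall head of the distribution so that the
    # long tail remains visible when the values are charted.
    print(n_terms, counts[n_terms], round(math.log10(counts[n_terms]), 2))
print(share_at_most(counts, 3))
```

Plotting the log of each count, rather than the raw count, is what keeps the rare long queries from disappearing next to the one- to three-word queries.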

    1. Andrew Nagy, Sr. Product Manager, ProQuest | Serials Solutions. Re-inventing Discovery: An Analysis of Big Data
    2. “The reality is user expectations have evolved and library systems have not kept up.”
    3. World Wide Web (1990?), Amazon (1995), Google (1998), Wikipedia (2001), Facebook (2004), YouTube (2005), Twitter (2006), Amazon Kindle (2006/2007), iPhone (2007), iPad (2010), Instagram (2010), Pinterest (2011)
    4. The product’s design changed more in the last two years than it had in the previous decade. Jon Wiley, head designer of Google Search
    5. It starts with users in mind: want to be self-sufficient; they do not ask questions; they want to be anonymous; expect everything to be online and searchable; moving to mobile computing as primary device
    6. Usability Studies: one-on-one interviews; use of open-ended questions; don’t force them into unfamiliar territory. Provides valuable feedback about: experience; how it’s used; facial and body reactions
    7. Some Findings: Start with broad topical searches and add words until they find something they like, then they begin to engage with filters. Students highly value the abstracts within the discovery experience as it helps them to determine if an item is “click worthy”. Students did not find subject terms particularly useful but did like the disciplines.
    8. Matching User Expectations… “…the test subjects criticised the fact that the location of the saved entries and the ‘save’ icon do not correspond with their experience of other systems. Thus a different icon and the location of the saved entries in the top right corner, as is the case with most shopping baskets on various websites, would correspond more closely to the experience of the users” Helena Luca, University of Konstanz
    9. Usability Studies. Problem: users actually think during usability studies
    10. Big Data: monitor usage of real-time activity. Where is your data? How do you mine your data? Do you even keep your data?
    11. Big Data Warehouse: open source; schema free; data versioning; fast/safe writing; built on Lucene
    12. Number of terms per query [chart; axis from 0 to 80]
    13. Number of terms per query [chart; logarithmic axis from 1 to 100]. 45% of searches: 3 words or less
    14. Abandonment Rate: Terms in Query [chart; axis from 0 to 12]
    15. Top search terms: Jstor; Pubmed; Leadership; Global warming; Diabetes; Marketing; Obesity; Psychology; Depression
    16. Long tail search terms: Branding in the Digital Age: You’re Spending Your Money in All the Wrong Places; Towards an Ethnographic Approach to Art Therapy Research: People with Psychiatric Disability as Collaborators; A method for bias-reduction of sample-based MLE of the autologistic model; group therapy native american women effectivness; endocrinology prolactin; childhood obesity organic baby food
    17. Improved User Experience
    18. Improved User Experience
    19. Filters per query in sequence [chart; axis from 1 to 5]
    20. Number of Queries in a Session [chart; axis from 1 to 39]
    21. Queries/Sessions per Language [chart; logarithmic axis from 1 to 1E+09; languages: en sv za es de ko fr jp nl tu br pl zh it he ca ar cs th gb da no gr fa ru mi ms pt fi; series: Total Sessions, Total Queries]. 93% of users work in English
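
The charts on slides 14, 20, and 21 summarize abandonment by query length, queries per session, and activity per language. The sketch below shows one way such summaries could be derived from a click log. It is illustrative only: the row layout `(session_id, language, query, clicked)` is hypothetical, and treating a query with no result click as "abandoned" is an assumption, since the slides do not define the term or describe the warehouse schema.

```python
from collections import Counter, defaultdict

# Hypothetical log rows: (session_id, language, query, clicked_a_result)
events = [
    ("s1", "en", "leadership", True),
    ("s1", "en", "leadership styles nursing", True),
    ("s2", "en", "pubmed", False),
    ("s3", "sv", "global warming", True),
    ("s3", "sv", "global warming policy sweden", False),
    ("s3", "sv", "climate policy", True),
]


def queries_per_session(rows):
    """Histogram: queries in a session -> number of sessions of that size."""
    per_session = Counter(session_id for session_id, _, _, _ in rows)
    return Counter(per_session.values())


def abandonment_by_terms(rows):
    """Share of queries with no click, grouped by number of terms.

    Assumption: "abandoned" means the query produced no result click.
    """
    total = Counter()
    abandoned = Counter()
    for _, _, query, clicked in rows:
        n_terms = len(query.split())
        total[n_terms] += 1
        if not clicked:
            abandoned[n_terms] += 1
    return {n: abandoned[n] / total[n] for n in total}


def totals_per_language(rows):
    """Map language -> (distinct sessions, total queries)."""
    sessions = defaultdict(set)
    queries = Counter()
    for session_id, language, _, _ in rows:
        sessions[language].add(session_id)
        queries[language] += 1
    return {lang: (len(sessions[lang]), queries[lang]) for lang in queries}


print(queries_per_session(events))
print(abandonment_by_terms(events))
print(totals_per_language(events))
```

At the scale quoted in the talk, these aggregations would run inside the data warehouse rather than in memory, but the grouping logic is the same.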