Mosaic Search Engine


The Mosaic search engine is a prototype bibliographic search engine with personalisation facilities, produced as part of the JISC-funded Mosaic Project.

Published in: Technology, Business

  1. The Mosaic Search Engine
     Mark van Harmelen
     Hedtek Ltd
  2. Aim
     Provide a proof of concept that:
     • Users can have search results personalised according to their place and stage of studies
     • Users can adopt other personas or points of view to explore academic resources
     • We can exploit ‘mass’ attention data as revealed by library circulation information
     So far only working with ISBN-identified books
  3. [Diagram: circulation data from each HEI is anonymised and, together with reading lists, used to build a Solr index of partial Copac records annotated with use and reading list data; the front-end sits on top of Solr.]
  4. Anonymisation
     Level 1: Current prototype, enables faceting
     Level 2: With extra information, enables “people who borrowed this also borrowed” and “people who borrowed this went on to borrow”
     Anonymisation utility provided
     DPA compliant; can also use fair processing agreements
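The two anonymisation levels above could be sketched roughly as follows. This is an illustration under stated assumptions, not the Mosaic anonymisation utility: the record fields (`borrower_id`, `isbn`, `institution`, `course`, `level`, `year`), the function names, and the salted-hash pseudonym are all hypothetical. Level 1 is read here as dropping the borrower and keeping only use counts (enough for faceting); level 2 is read as keeping a one-way pseudonym so that co-borrowing can still be computed.

```python
import hashlib
from collections import Counter, defaultdict
from itertools import combinations


def loan(borrower_id, isbn, course):
    """Build a hypothetical loan record (field names are assumptions)."""
    return {"borrower_id": borrower_id, "isbn": isbn, "institution": "HEI-A",
            "course": course, "level": 1, "year": 2008}


# Made-up sample data with placeholder identifiers.
LOANS = [
    loan("u1", "isbn-A", "Physics"), loan("u1", "isbn-B", "Physics"),
    loan("u2", "isbn-A", "Physics"), loan("u2", "isbn-B", "Physics"),
    loan("u3", "isbn-A", "Chemistry"), loan("u3", "isbn-C", "Chemistry"),
]


def anonymise_level1(loans):
    """Level 1 (assumed): discard the borrower, keep use counts per context."""
    return Counter(
        (l["isbn"], l["institution"], l["course"], l["level"], l["year"])
        for l in loans
    )


def anonymise_level2(loans, salt):
    """Level 2 (assumed): replace the borrower id with a salted one-way hash."""
    records = []
    for l in loans:
        rec = {k: v for k, v in l.items() if k != "borrower_id"}
        digest = hashlib.sha256((salt + l["borrower_id"]).encode("utf-8"))
        rec["borrower"] = digest.hexdigest()[:16]
        records.append(rec)
    return records


def also_borrowed(level2_loans):
    """Count, for each item, how many borrowers also borrowed each other item."""
    by_borrower = defaultdict(set)
    for l in level2_loans:
        by_borrower[l["borrower"]].add(l["isbn"])
    pairs = defaultdict(Counter)
    for isbns in by_borrower.values():
        for a, b in combinations(sorted(isbns), 2):
            pairs[a][b] += 1
            pairs[b][a] += 1
    return pairs
```

Note that a salted hash is pseudonymisation rather than full anonymisation; whether it would satisfy the DPA is outside what this sketch can claim.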
  5. Augmenting Solr’s index
     Solr’s search index is loaded with items and any associated use information
     Use information is:
     • institution
     • course
     • progression level
     • year of use
     • number of uses in that year
     Use information enables faceting
     Reading list information is also added to items
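To make the shape of such an index entry concrete, here is a minimal sketch of one item document and a faceted query over it. The field names (`use_institution`, `use_course`, `use_level`, `use_year`, `use_total`, `reading_list`) and both function names are hypothetical; the slides do not give the actual schema. The request parameters `q`, `fq`, `wt`, `facet` and `facet.field` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_item_doc(isbn, title, uses, reading_lists):
    """Flatten use records into one Solr-style document (hypothetical schema).

    `uses` is a list of dicts with institution, course, level, year, count.
    """
    return {
        "id": isbn,
        "title": title,
        "use_institution": sorted({u["institution"] for u in uses}),
        "use_course": sorted({u["course"] for u in uses}),
        "use_level": sorted({u["level"] for u in uses}),
        "use_year": sorted({u["year"] for u in uses}),
        "use_total": sum(u["count"] for u in uses),
        "reading_list": sorted(reading_lists),
    }


def facet_query_string(text, institution=None):
    """Build a query string that asks Solr for course and level facets."""
    params = [
        ("q", text),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.field", "use_course"),
        ("facet.field", "use_level"),
    ]
    if institution is not None:
        # Filter query: restrict results to items used at one institution.
        params.append(("fq", 'use_institution:"%s"' % institution))
    return urlencode(params)
```

Flattening into multi-valued fields keeps the example short but loses the link between, say, a particular course and a particular year; a real index might store one entry per use tuple instead.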
  6. [Diagram: the client-side front-end (browser) sends a query to Solr and receives a result set; it sends an item query to the OPAC and receives item data.]
  7. Narrowing and broadening
     Thoughts (NB, ‘thoughts’) of narrowing of choice led to two features to broaden choice
     Don’t believe that the Mosaic demo in itself narrows choice when used for browsing
     Broadening features:
     • ‘More like this’ link
     • Reading lists
  8. The Harry Potter ‘problem’ and scale
     The Harry Potter ‘problem’: Balderdash!
     We can control this using Library of Congress subject categories and Dewey Decimal shelfmarks
     Paul Miller raises questions of scale
     Dave Pattern has shown successful use of usage data at a single (small) institution
     We want to leverage reasonably large scale: 3.5-4M students in HE, over, say, the last five years
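One possible reading of "control this using ... Dewey Decimal shelfmarks" is to keep heavily borrowed but off-topic titles out of course-specific results by checking each item's Dewey class against the classes relevant to the course. The sketch below assumes that reading; the `COURSE_DEWEY` mapping, the `shelfmark` field and the function names are hypothetical, and the slides do not say how the prototype actually does it.

```python
# Hypothetical mapping from course to a half-open range of Dewey classes.
COURSE_DEWEY = {
    "Physics": (530, 540),
    "Chemistry": (540, 550),
}


def dewey_class(shelfmark):
    """Return the leading three-digit Dewey class, or None if there is none."""
    head = shelfmark.strip()[:3]
    return int(head) if len(head) == 3 and head.isdigit() else None


def filter_by_course(items, course):
    """Keep only items whose Dewey class falls in the course's range."""
    low, high = COURSE_DEWEY[course]
    kept = []
    for item in items:
        cls = dewey_class(item["shelfmark"])
        if cls is not None and low <= cls < high:
            kept.append(item)
    return kept
```

With this filter, a popular novel shelved in the 800s would not surface in a Physics student's results no matter how often it is borrowed.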
  9. User context and attention
     It has been relatively simple to parameterise an open source search engine with user context:
     • Institution, course, progression level, academic year
     This is only part of the user context; we can add:
     • Location
     • Attention data, e.g. search history
     • Further social search information
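As a sketch of what parameterising a search with user context might look like, the function below turns a context into Solr boost queries, so that items used by similar students rank higher without excluding anything. It assumes the eDisMax query parser (`defType=edismax`) and its `bq` boost-query parameter, both of which exist in Solr; the `use_*` field names, the weights and the function name are hypothetical, and the slides do not say which mechanism the prototype uses.

```python
from urllib.parse import urlencode

# Hypothetical weights: how strongly each part of the context boosts a match.
DEFAULT_BOOSTS = {"institution": 2.0, "course": 4.0, "level": 1.5, "year": 1.2}


def context_query_string(text, context, boosts=None):
    """Build a Solr query string boosted by the given user context.

    `context` maps some of institution/course/level/year to values;
    missing keys are simply not boosted.
    """
    boosts = DEFAULT_BOOSTS if boosts is None else boosts
    params = [("q", text), ("defType", "edismax")]
    for key, weight in boosts.items():
        value = context.get(key)
        if value is not None:
            # One boost query per context attribute, e.g. use_course:"Physics"^4.0
            params.append(("bq", 'use_%s:"%s"^%s' % (key, value, weight)))
    return urlencode(params)
```

Adopting another persona or point of view then amounts to passing a different `context`, for example `{"institution": "HEI-B", "course": "Chemistry"}`, with the same search text.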
  10. Disclaimer
      The next slide is independent of any decisions on a pure data approach
      Could be a pure data approach in there
      Or maybe not
  11. Where is this going? A personal view
      Bind together:
      • FRBRish catalogue: better search UX and persistent URLs for personalisation purposes
      • Mosaic search: personalised/point-of-view search
      • Massively parallel search for blindingly fast response times
      • Data mining for library ‘stewardship’
      We have prototypes for the first two, and we’re about to start experimenting with parallel search using Hadoop+Lucene
  13. Building institutional contributions
      Propose union-cat-local: search in the local library
      Mosaic-like search utilises local loan data if it is available
      Two ways to encourage library contribution of loan data (thoughts in progress):
      • Narrow: Libraries which contribute loan data to the pool get Mosaic search over the pool
      • Broad: Make the contextual/PoV search available everywhere; users will agitate if they don’t see local data
  14. This is a Just Do It moment
      A national union catalogue with contextual search and local library interfaces
      Relatively cheap to do
      Potentially massive gains for learners, teachers and researchers
      Portends the development of shared services across the library domain and large cost savings
      Doesn’t preclude, and is agnostic on, an open data approach
      Could incorporate a pure data service approach and/or a centralised service
  15. Questions