WEB TECHNOLOGIES FOR LIBRARIES - 2 WORKSHOPS
June 28-29, 2011, in Petrozavodsk

Web-scale discovery systems
Karen J. Buset, NTNU University Library, Trondheim, Norway

WHY
Google has set the standard for searching, not only for our users but also for many librarians.
Federated search was implemented at libraries in an attempt to compete with Google and Google Scholar.
This failed because of the limitations of the technology:
 • the small number of resources that could be searched simultaneously
 • the slow response times and the problems encountered when merging results
 • the need to deal with many different and constantly changing interfaces

WHAT
Content
 • harvest content from local and remotely hosted repositories
 • create a centralized index, down to the article level
 • suited for rapid search and retrieval of results ranked by relevancy
 • harvesting of local library resources, combined with brokered agreements with publishers and aggregators that allow access to metadata and/or full-text content
Discovery
 • single search box providing a Google-like search experience
Delivery
 • quick results ranked by relevancy
 • modern interface offering functionality such as faceted navigation to drill down to more specific results
Flexibility
 • agnostic to underlying systems
 • more open than traditional library systems, giving a library more scope to customize the services
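
To make the Discovery and Delivery points concrete, here is a minimal Python sketch of a single search box over one central index, with relevance-ranked results and facet counts for drilling down. The records, the field names (`format`, `year`) and the term-count scoring are illustrative assumptions; the commercial systems use their own, far more elaborate ranking.

```python
import re
from collections import Counter

# Made-up records standing in for a pre-harvested central index.
RECORDS = [
    {"title": "Discovery systems in academic libraries", "format": "article", "year": 2010},
    {"title": "Library discovery: a handbook", "format": "book", "year": 2009},
    {"title": "Federated search and discovery", "format": "article", "year": 2008},
    {"title": "Cataloguing rules", "format": "book", "year": 2010},
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def search(query, records, filters=None):
    """Return matching records, best score first.

    The score is simply the number of query-term occurrences in the title,
    a deliberately naive stand-in for real relevance ranking.
    """
    terms = tokenize(query)
    filters = filters or {}
    hits = []
    for rec in records:
        # Facet drill-down: skip records that do not match every selected facet value.
        if any(rec.get(field) != value for field, value in filters.items()):
            continue
        words = tokenize(rec["title"])
        score = sum(words.count(term) for term in terms)
        if score > 0:
            hits.append((score, rec))
    hits.sort(key=lambda hit: hit[0], reverse=True)  # stable sort keeps input order on ties
    return [rec for _, rec in hits]


def facet_counts(hits, field):
    """Count how many hits fall under each value of a facet field."""
    return Counter(rec[field] for rec in hits)


results = search("library discovery", RECORDS)
print([rec["title"] for rec in results])
print(facet_counts(results, "format"))
# Drill down to articles only.
print([rec["title"] for rec in search("library discovery", RECORDS, {"format": "article"})])
```

Because every record already sits in one index, ranking and facet counts come from a single pass over the hits; nothing has to be merged from separate result lists.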

HOW
Each vendor has agreements with several content suppliers from whom they harvest materials. In addition, they harvest locally held material, such as existing library catalogues and institutional repositories within the library, using protocols such as OAI-PMH and FTP.
Pre-harvesting eliminates the need to merge results, as was the case with federated search, which in turn makes de-duplication and relevancy ranking easier.
Users can search all available metadata, but authentication is needed to get access to full text. In this way, Google-like functionality is provided to a delimited collection of resources.
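
The pre-harvesting idea can be sketched in Python as follows: parse OAI-PMH `ListRecords` responses (Dublin Core metadata) from two sources and fold them into one index, de-duplicating on a normalized title. The sample responses, the `example.org` identifiers and the title-based de-duplication key are assumptions made for illustration; a real harvester would fetch the XML over HTTP, follow resumption tokens, and match duplicates on richer metadata.

```python
import re
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def sample_response(records):
    """Build a tiny, made-up ListRecords response (stands in for a real HTTP reply)."""
    items = "".join(
        f"<record><header><identifier>{ident}</identifier></header>"
        f"<metadata><oai_dc:dc><dc:title>{title}</dc:title></oai_dc:dc></metadata></record>"
        for ident, title in records
    )
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" '
        'xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/">'
        f"<ListRecords>{items}</ListRecords></OAI-PMH>"
    )


def parse_records(xml_text):
    """Extract identifier and title from each record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    parsed = []
    for rec in root.iterfind(".//oai:record", NS):
        ident = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        parsed.append({"id": ident, "title": title})
    return parsed


def dedup_key(record):
    """Normalize the title so trivially different copies share one key (an assumption)."""
    return re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()


def build_index(responses):
    """Fold all harvested records into one index, merging duplicates as they arrive."""
    index = {}
    for xml_text in responses:
        for record in parse_records(xml_text):
            entry = index.setdefault(dedup_key(record), {"title": record["title"], "ids": []})
            entry["ids"].append(record["id"])
    return index


catalogue = sample_response([
    ("oai:catalogue.example.org:1", "Discovery Systems in Libraries"),
    ("oai:catalogue.example.org:2", "Faceted Navigation Basics"),
])
repository = sample_response([
    ("oai:repository.example.org:9", "Discovery systems in libraries."),
])

index = build_index([catalogue, repository])
for key, entry in index.items():
    print(key, "->", entry["ids"])
```

De-duplication happens once, at harvest time, which is why a central index avoids the result-merging step that slowed federated search at query time.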

PROBLEMS
The system vendors agree that there will still be a need for direct access to specialised search interfaces because:
 • Some resources are not indexed
 • Some resources are not full-text indexed
 • Some resources are not available
 • Some databases might offer specialised search tools not available in web-scale discovery systems

SYSTEMS
«The big 4»
 • OCLC's WorldCat Local
 • Ex Libris' Primo
 • Serials Solutions' Summon
 • EBSCO Discovery Service
Find links to the systems here:
https://sites.google.com/site/urd2comparison/home

USER OPINION
Several surveys at the library show that users want a simple, Google-like interface that will quickly provide them with relevant results.
It seems probable that any of these systems has enough coverage to satisfy most users.
Relevance ranking in these systems cannot be compared with Google's PageRank; providing good relevance ranking may be a challenge in a service that aggregates such diverse metadata.

SOURCES
A brief overview of three web-scale-discovery systems: Summon, Primo and OCLC WorldCat. NTNU University Library, 2011.
Jason Vaughan. Web Scale Discovery: What and Why? Library Technology Reports, 2011.