Printed catalogues
- Author browse
- Title browse
- Series browse
- Call Number browse
- Subject browse
- Shelf list (inventory)

Traditional (Web)OPAC
[Architecture diagram: (Web) Server; Application; ILS Database (Bibs)]

Traditional (Web)OPAC: Pros and Cons
Pros:
- Keyword search!
  - Author, title, subject
  - ISBN/LCCN search
  - Boolean queries
  - Proximity search
- Browse index
  - Authority headings
  - Title, Call Number
- Real-time item status!
  - Copies & availability info
- Link to URL (tag 856)
Cons:
- Uses database queries
  - 'LIKE' statements
  - Exact/partial match
- Limited use of search algorithm
- No relevance ranking
- Only physical collection and e-books

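The 'LIKE' statements listed under Cons can be illustrated with a small sketch. It assumes a made-up, single-table bibliographic schema in an in-memory SQLite database (the table name `bibs`, its columns, and the sample records are hypothetical; a real ILS schema is far richer). It shows that a partial-match query returns rows in storage order with no notion of relevance, and that a word variant such as "fished" matches nothing.

```python
import sqlite3

# Hypothetical, simplified bibliographic table with invented sample records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bibs (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO bibs (title, author) VALUES (?, ?)",
    [
        ("Fishing in Kerala", "Nair, R."),
        ("A history of fish farming", "Menon, S."),
        ("The fisher king", "Powers, T."),
        ("Library automation", "Rao, K."),
    ],
)


def keyword_search(term):
    """Partial-match title search using a parameterised LIKE query."""
    rows = conn.execute(
        "SELECT title FROM bibs WHERE title LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [title for (title,) in rows]


results = keyword_search("fish")   # substring match, no ranking
no_match = keyword_search("fished")  # no stemming, so nothing is found
print(results)
print(no_match)
```

Every row containing the substring is returned with equal standing; the query has no way to say that one title is a better match than another.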
Integrated OPAC Portal
[Architecture diagram: Enrichment Services; Web services; Web Server; Application; Website content; ILS Database (Bibs); ILS Database (Patrons)]

Integrated OPAC Portal: Pros and Cons
Pros:
- All WebOPAC features
  - Keyword search
  - Headings browse
  - Availability info
- Library website integration
- Patron empowerment
  - Circ/Account details
  - Online renewal
  - Online hold placement
  - SDI services
  - New arrivals
- OPAC enrichment
  - Book cover/reviews
  - Thesaurus integration
Cons:
- Uses database queries
  - 'LIKE' statements
  - Exact/partial match
- Limited use of search algorithms
- No relevance ranking
- Still limited to only physical collection & e-books

Federated Search Service
360 Search, dbWiz, Research Pro, Pazpar2
[Architecture diagram: Web Server; Application; Full-text links; targets: Library Catalog, Digital Repository, Science Direct, ProQuest, EBSCO, PubMed, …, Emerald]

Federated Search Service
Muse Content Architecture
Supports 6300+ databases!
http://www.museglobal.com/technology/contentIntegration.html

Federated Search Service: Pros and Cons
Pros:
- Single search broadcast
- Real-time search results
- Based on standards
  - Z39.50, SRU/W
  - MARC, ISO2709, XML
- Supports large set of databases
  - 7000+ in “360 Search”
  - 6300+ in Muse platform
- Merging and sorting
- No local index (maintenance free!)
Cons:
- Not all databases are standards compliant
  - Requires custom search scripts
  - Requires metadata crosswalk
- Network intensive
- Performance issues
- Mostly available as hosted service
- Annual subscription

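The "single search broadcast" and "merging and sorting" steps can be sketched as follows. This is a minimal illustration, not how any of the named products work: the three targets are hypothetical in-memory stand-ins (real connectors would speak Z39.50 or SRU and crosswalk each source's metadata to a common schema), and the deduplication key of lower-cased title plus year is an assumption chosen for brevity.

```python
from concurrent.futures import ThreadPoolExecutor


def make_target(name, records):
    """Build a hypothetical search target over in-memory sample records."""
    def search(query):
        q = query.lower()
        # Tag each hit with its source so the merged list stays traceable.
        return [dict(r, source=name) for r in records if q in r["title"].lower()]
    return search


# Invented sample data standing in for remote databases.
TARGETS = [
    make_target("catalog", [{"title": "Digital Libraries", "year": 2010}]),
    make_target("repository", [{"title": "Digital libraries", "year": 2010},
                               {"title": "Digital Preservation", "year": 2015}]),
    make_target("publisher", [{"title": "Metadata Basics", "year": 2012}]),
]


def federated_search(query):
    """Broadcast the query to all targets in parallel, then merge and sort."""
    with ThreadPoolExecutor(max_workers=len(TARGETS)) as pool:
        # pool.map keeps the target order, so earlier targets win on duplicates.
        result_sets = list(pool.map(lambda target: target(query), TARGETS))

    merged = {}
    for records in result_sets:
        for rec in records:
            key = (rec["title"].lower(), rec["year"])  # naive duplicate key
            merged.setdefault(key, rec)

    # Newest first, then alphabetical by title.
    return sorted(merged.values(), key=lambda r: (-r["year"], r["title"]))


hits = federated_search("digital")
for hit in hits:
    print(hit["year"], hit["title"], hit["source"])
```

Because every query waits on the slowest target and all merging happens at request time, the sketch also hints at why the slide lists network load and performance among the drawbacks.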
Discovery Interface
[Architecture diagram: Enrichment Services; Web services; Web Server; Application; Full-text link; Availability/Holds; Central Index (Solr/Lucene); Digital Repository (DC XML data); ILS Database (MARC Bib data)]

Discovery Interface
- Word stemming
  - 'fishing', 'fished', 'fish', 'fisher' => 'fish'
- Fuzzy search
  - insertion: cot => coat
  - deletion: coat => cot
  - substitution: coat => cost
- Auto-suggest
  - N-gram, Edge N-gram analysis
- Phrase query
- 'Did you mean?' spell checker
- Relevance ranking
  - TF-IDF / Term Vector
  - Term weights
  - Lucene scores
- Faceted browsing
  - Who are the main authors, and what are their counts?
  - What are the main subjects, and what are their counts?

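The fuzzy-search and auto-suggest items above can be made concrete with a short sketch. It uses plain Levenshtein edit distance for the insertion, deletion, and substitution cases on the slide, and a simple prefix index built from edge n-grams for type-ahead. The function names and the tiny vocabulary are illustrative assumptions; Solr/Lucene implement these ideas with their own analyzers and data structures.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca from a
                            curr[j - 1] + 1,      # insert cb into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edge_ngrams(word, min_len=1):
    """All prefixes of a word, e.g. 'coat' -> c, co, coa, coat."""
    return [word[:n] for n in range(min_len, len(word) + 1)]


def build_prefix_index(vocabulary):
    """Map every edge n-gram to the terms that start with it."""
    index = {}
    for term in vocabulary:
        for gram in edge_ngrams(term.lower()):
            index.setdefault(gram, []).append(term)
    return index


def suggest(prefix, index):
    """Type-ahead lookup: a single dictionary access per keystroke."""
    return sorted(index.get(prefix.lower(), []))


index = build_prefix_index(["coat", "cost", "cot", "fish"])
print(edit_distance("cot", "coat"))   # one insertion
print(edit_distance("coat", "cost"))  # one substitution
print(suggest("co", index))
```

A fuzzy search would then accept any indexed term within a small edit distance (commonly 1 or 2) of the typed word.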
Discovery Interface: Pros and Cons
Pros:
- Google-like search box
- Advanced features
  - Fuzzy searching
  - Relevance ranking
  - Word stemming algorithms
  - Social tagging/reviews
  - “Did you mean?” feature
  - Auto-suggest (type ahead)
- Faceted browsing
- Availability/Hold requests
- Metadata enrichment
  - Linking Amazon/Google/Wikipedia
- Digital repository integration
Cons:
- Searches only locally hosted collections

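Relevance ranking and faceted browsing, both listed among the pros, can be sketched in a few lines. The example below uses a textbook TF-IDF score (term frequency times log of inverse document frequency) over invented sample records; it is not Lucene's actual scoring formula, and the field names are assumptions.

```python
import math
from collections import Counter

# Invented sample records standing in for an indexed local collection.
DOCS = [
    {"title": "fish and fishing", "author": "Nair", "subject": "Fisheries"},
    {"title": "fish farming", "author": "Menon", "subject": "Aquaculture"},
    {"title": "library automation", "author": "Nair", "subject": "Libraries"},
]


def tokenize(text):
    return text.lower().split()


def tf_idf_rank(query, docs):
    """Rank docs by the summed TF-IDF weight of the query terms in the title."""
    n = len(docs)
    tokens = [tokenize(d["title"]) for d in docs]
    # Document frequency: in how many titles does each term occur?
    df = Counter(term for toks in tokens for term in set(toks))
    scored = []
    for doc, toks in zip(docs, tokens):
        tf = Counter(toks)
        score = sum((tf[t] / len(toks)) * math.log(n / df[t])
                    for t in tokenize(query) if t in tf)
        if score > 0:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


def facet_counts(docs, field):
    """Count distinct values of a field, most frequent first."""
    return Counter(d[field] for d in docs).most_common()


ranked = tf_idf_rank("fish farming", DOCS)
print([d["title"] for d in ranked])
print(facet_counts(DOCS, "author"))
```

The rarer term "farming" carries more weight than "fish", which appears in two of the three titles, so the record matching both terms ranks first. The facet counts answer the "main authors and their count" question from the feature slide.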
Can we combine the two?
- Modern discovery interface
- Local collections + Remote databases
- Unified search result

Web-scale Discovery Services
[Architecture diagram: Web Server; Application; Availability; Full-text link; Central Index (full-text and metadata); sources: EBSCO, ProQuest, ABI Inform, PubMed, Science Direct, …, Lexis-Nexis; Library Catalog (MARC data); Digital Repository (DC data)]

Web-scale Discovery Services
Summon Service
Content types include:
- Library catalog records
- E-journal articles
- Institutional repositories
- Newspaper articles
- E-books
- Dissertations
- Conference proceedings
- Grey literature
- Cited references
- Reports
- Digital library
- Databases
- and more.

Web-scale Discovery Services: Pros and Cons
Pros:
- Google-like single search box
- Pre-indexed licensed content
- Inclusion of local collection
  - OAI-PMH, MARC updates
- Advanced features
  - Relevance ranking
  - “Did you mean?”
  - Auto-suggest (type ahead)
  - Faceted navigation
- Availability/Full-text links
- Mobile friendly
- Web-service APIs
- Easier off-campus access
- No installation/maintenance
Cons:
- Supports limited number of databases (1000-1500)
- Requires huge investment to maintain centralized index
  - Publisher partnerships (Licensing/legal issues)
  - Regular pre-publication indexing
- Mostly hosted-only service
- Content bias? (ranking)
- Vendor lock-in?
- Annual subscription

Can we have the best of both worlds?
- Modern discovery interface
- Local collections + Remote databases
- Unified search result
- Supports large number of databases
- Based on open standards (extensible)
- Can be maintained locally (No subscription!)
[Architecture diagram: Web Server; Application; ILS database (Bibs); Digital Repository; eight remote databases]

Integrated Discovery Platform
Pazpar2 Architecture
- Open source (GPL)
- Build your own connector!
https://www.indexdata.com/pazpar2

Conclusion
Each platform has its own goals:
- A pure library catalog can provide expressive search (high precision)
- Federated search improves content coverage in a single search
- Discovery interfaces are designed to improve user experience for local collections
- Web-scale discovery provides a unified search experience for local and remote collections (still well short in content coverage)
- An integrated platform provides extensibility (but requires significant effort in development and maintenance)
One size does not fit all. No single system is perfect.
As content becomes more open, the focus of discovery solutions should be on open platforms that are extensible as well as affordable.