20091120 Vlengel Maastricht



  1. Lucene @ Ghent @ Lund. Vlengel - November 2009, Maastricht
  2. http://lib.ugent.be
  3. http://elin.ugent.be
  4. The Numbers (Ghent)
     • 5.000.000 bibliographic records
     • Full-text: ca. 20%
         • 490.000 Google Books Hathi
         • 136.000 18th Cent. Coll. Online
         • 100.000 Early English Books
         • 32.000 Google Books Gent
         • 82.000 Gutenberg, DBNL, SFX, …
     • 16 collections
     • 120.000 visits/month, 34% via search engines
  5. The Numbers (ELIN)
     • 54.000.000 bibliographic records
     • Full-text: 100%
         • 17.000.000 electronic journals
         • 25.000.000 Ebsco
         • 4.000.000 JSTOR
         • 3.400.000 Proquest ABI
         • 1.670.000 IEE/IEEE standards/proceedings
         • 1.300.000 e-print archives
     • 29 customers worldwide, 6 time zones
  6. The Parts
     • Layers: Searching/Portal, Indexing
     • Products named: Verity, Sesat, Endeca, Fast, Autonomy, Zebra, Lucene/Solr, Sphinx, Drupal, Liferay, JBoss, Zope, Primo, Aquabrowser, VuFind
  7. Indexing
     • ALEPHSEQ, OAI-PMH, MySQL dump
     • XSL, indexML
     • rug01_xml, rug02_xml, hath01_xml, dbnl_xml
     • Cmdline tools, Tomcat servlet, tuned Solr
     • Perl MVC, Java/Spring
  8. Searching/Portal
     • Search engine plugins: Lucene, SOLR, ISI/WOS, YouTube, OpenSearch, SRU
     • Models: KeyVal, XML, MARC
     • Configuration files, I18N, props
     • Default, OpenSearch, UnAPI, HTML, Velocity, JS
     • Meercat: controllers, views
  9. The ‘Haves’: Facets, Filters, RSS, OpenSearch, OpenURL, OAI-PMH, Mobile, TicToc, Cover Art, Statistics, Cool URIs, unAPI, Zotero, Google Maps, Stemming, Flexible Sort, Image Browsing, Zoomers, Pagers, Basket, Plugin/Integration, libX, Real-Time Availability Check, Requesting, Global Holdings, Full-Text Links, Lists (Journals, Databases, Collections), Diacritic translation
  10. The ‘Have Nots’
      • Nice administrative interface
      • ILS integration (requests, renewals, …)
      • Personalization (saved searches, alerts, …)
      • Tagging, rating, user-contributed content
      • Deduplication
      • Excerpts, table of contents
      • Word clouds
      • Expand searches (see also)
      • Highlighting
      • Federated search
      • Browsing
      • Advanced search
      • Extended FRBR
  11. The Characteristics
      • Lightweight, tunable
          • In Lund: 54.000.000 records indexed on one 4-core Linux machine with 16 GB RAM, at roughly 2000 records/second
          • In Ghent: indexing runs on the Aleph server during business hours
          • Continuous load of 100 simultaneous users on one 2-core Linux machine with 4 GB RAM
          • Simple, easy web interface. Less is more
  12. The Characteristics
      • Flexible
          • Used in 6 different projects in Gent, 2 in Lund
          • KeyVal, XML, and MARC models can be used internally
          • Indexes anything that can be turned into our XML index format
          • Total control over every aspect of the interface. We do text, images, video, mobile, RSS, …
  13. The Characteristics
      • Very large developer community
          • Open source, used in thousands of projects worldwide in all major (computer) languages
          • Extensive documentation, many articles, presentations, research
          • Books, user group, conferences, social networks, …
  14. But…
  15. Acknowledgements
      • Kjell Lotigiers (UGent) - Java/Spring development
      • Salam Baker Shanawa (Lund) - Perl/ELIN development, system tuning
      • Nicolas Steenlant (UGent) - Ajax/CSS development
      • Geert Roels (UGent) - Web design
      • Paul Bastijns (UGent) - SFX integration
  16. Refs
      • Calhoun, K., & Cellentani, D. (2009). Online catalogs: What users and librarians want: an OCLC report. Dublin, Ohio: OCLC.
      • http://lib.ugent.be
      • http://www.lub.lu.se
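
A back-of-envelope check of the Lund figures on slide 11: the slides quote 54.000.000 records and a rate of about 2000 records/second, but do not state how long a full index run takes. The sketch below derives that duration under the assumption (not made in the slides) that the quoted rate is sustained for the whole run. The function name `full_index_hours` is hypothetical and only used for illustration.

```python
def full_index_hours(records: int, records_per_second: float) -> float:
    """Hours needed to index `records` at a constant `records_per_second`.

    Assumes the rate is sustained for the entire run, which the slides
    do not claim; treat the result as a rough estimate.
    """
    if records_per_second <= 0:
        raise ValueError("records_per_second must be positive")
    seconds = records / records_per_second
    return seconds / 3600


# Figures quoted on slide 11 for the Lund (ELIN) index.
lund_hours = full_index_hours(54_000_000, 2_000)
print(f"Estimated full index run: {lund_hours:.1f} hours")
```

Under that assumption a complete rebuild fits within a working day on a single machine, which is consistent with the "lightweight, tunable" claim.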