Adding browse to Koha using Solr


  1. KohaCon12: Adding browse to Koha using Solr
     <http://tinyurl.com/solr-browse>
     Stefano Bargioni, Pontifical University Santa Croce – Rome
     bargioni@pusc.it
  2. The PUSC Library
     ● 160,000 volumes
       – 147,000 bibs
       – 111,300 auth
     ● Aleph 300; Amicus 3.5
     ● Koha 3.2.7 from May 1st, 2011
     ● PUSC belongs to the URBE Network
       – 17 academic libraries
       – 2 of them using Koha
  3. Why do we need browse at PUSC?
     ● Aleph 300 and Amicus offered it
     ● Our users and cataloguers frequently used it
     ● We have a lot of ancient authors, Popes, …, requiring “seen from”, “see also”
     ● We started to add subjects to our bibliographic records
  4. How do you say?
     ● Alighieri, Dante or
     ● Dante Alighieri or
     ● Allighieri, Dante ?
     ● Ratzinger, Joseph, 1927- or
     ● Benedictus PP. XVI, 1927- or
     ● Papi (2005- : Benedictus XVI) ?
     We have to help users and cataloguers to use the correct form.
  5. Grouping
     ● Uniform Titles
     ● Dewey
     ● Series, ...
  6. Browse Functionalities
     ● Headings from authority as well as bibliographic records
     ● Starting from
     ● Previous Headings, Next Headings
     ● Number of documents
     ● Related headings (see, see also, seen from)
     ● Go to authority record, if any
     ● Additional Links
  7. Browse Requirements
     ● Indexes fed by headings coming from
       – more than one auth tag
       – more than one bib tag
     ● Sort form for Latin-1 (non-Latin scripts?)
     ● Consider non-filing characters
     ● Synchronize frequently
     ● Integrated in the Koha OPAC
     ● MARC flavour independence
  8. The engine
     ● Why Solr?
       – Schema flexibility
       – Facets
       – High performance in update and query
       – Better than MySQL
       – Will be part of Koha, maybe replacing Zebra
  9. The architecture
     [Diagram; components shown: Web browser, Perl CGI, Koha SQL tables, loader.pl, cron job, Solr db]
  10. The Solr Document (1)
      Field name                                    Value
      id                                            unique identifier
      authid | sysno                                int
      au | tl | se ...                              string (display form)
      sortform_au | sortform_tl | sortform_se ...   string
      timestamp                                     ISO 8601
      type                                          acc | see | also ...
  11. The Solr Document (2)
      Field name     Example auth
      id             au_a_1234_100_0
      authid         1234
      au             Alighieri, Dante
      sortform_au    alighieri.dante
      timestamp      2012-05-23T19:10:54Z
      type           acc
  12. The Solr Document (3)
      Field name     Example bib
      id             tl_b_5678_245_0
      sysno          5678
      tl             Gesù Cristo secondo la dottrina di S. Tommaso d'Aquino
      sortform_tl    gesu.cristo.secondo.la.dottrina.di.s.tommaso.d.aquino
      timestamp      2012-05-23T18:15:44Z
      type           acc
  13. The Solr Document (4)
      Id structure
      – List name: au | tl | se ...
      – Source: a | b
      – Source authid or sysno: nnn
      – Tag: ttt
      – Occurrence #: n (0 based)
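The id structure on this slide can be illustrated with a small sketch. It is written in Python for brevity, not the Perl of the original loader, and the names `HeadingId`, `build_id` and `parse_id` are hypothetical. It assumes the five parts are joined with underscores, as in the examples `au_a_1234_100_0` and `tl_b_5678_245_0`.

```python
from typing import NamedTuple


class HeadingId(NamedTuple):
    """Parts of a heading id, in the order given on the slide (names are illustrative)."""
    list_name: str   # au, tl, se, ...
    source: str      # "a" (authority record) or "b" (bibliographic record)
    record_id: int   # authid or sysno
    tag: str         # MARC tag, e.g. "100"
    occurrence: int  # 0-based occurrence of the tag in the record


def build_id(parts: HeadingId) -> str:
    """Join the five parts with underscores."""
    return "_".join(str(p) for p in parts)


def parse_id(doc_id: str) -> HeadingId:
    """Split an id back into its parts; assumes no part contains an underscore."""
    list_name, source, record_id, tag, occurrence = doc_id.split("_")
    return HeadingId(list_name, source, int(record_id), tag, int(occurrence))


print(build_id(HeadingId("au", "a", 1234, "100", 0)))
print(parse_id("tl_b_5678_245_0"))
```

Keeping the tag as a string rather than an integer preserves leading zeros in tags such as `020`.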
  14. The Solr Document (5)
      The sort form:
      – Diacritics to simple letter (àÀ to aA, ...)
        ● use Text::Unidecode;
      – Lowercase
      – Strip out non-filing characters (titles)
      – Replace non a-z0-9 with dot
      Used for facets
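A rough Python sketch of the sort-form steps follows; the original uses Perl with Text::Unidecode, so this is an approximation, not the author's code. The function name `sort_form` and the `skip` parameter (number of leading non-filing characters) are assumptions. Collapsing runs of punctuation into a single dot and trimming leading and trailing dots are also assumptions, chosen because they reproduce the two example values shown on the earlier slides. The non-filing characters are dropped first here because the count refers to the original string.

```python
import re
import unicodedata


def sort_form(heading: str, skip: int = 0) -> str:
    """Approximate the slide's sort form for Latin-script headings."""
    # Strip out non-filing characters (e.g. a leading article in a title).
    text = heading[skip:]
    # Diacritics to simple letters: decompose, then drop combining marks.
    # Unlike Text::Unidecode, letters with no decomposition (ø, ß) are not
    # transliterated and end up replaced by a dot below.
    decomposed = unicodedata.normalize("NFKD", text)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Lowercase, then replace every run of non a-z0-9 characters with one dot.
    dotted = re.sub(r"[^a-z0-9]+", ".", plain.lower())
    return dotted.strip(".")


print(sort_form("Alighieri, Dante"))
print(sort_form("Gesù Cristo secondo la dottrina di S. Tommaso d'Aquino"))
```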
  15. Loading & Synchronizing (1)
      ● The same cron-based Perl script loads the Solr db for the first time and updates it
        – use C4::Context;
        – use C4::AuthoritiesMarc;
        – use WebService::Solr;
      ● 2000: number of docs modified before issuing a commit
      ● 5: number of commits before issuing an optimize
      ● 38 minutes to load 662,400 headings
      ● Configured through an XML file
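The commit and optimize thresholds above could be applied along the following lines. This is a Python sketch, not the original Perl loader: `RecordingClient` is a hypothetical stand-in that only records calls, and `load` is an illustrative name. Committing the last partial batch at the end is an assumption.

```python
class RecordingClient:
    """Hypothetical stand-in for a Solr client; it records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def add(self, doc):
        self.calls.append("add")

    def commit(self):
        self.calls.append("commit")

    def optimize(self):
        self.calls.append("optimize")


def load(client, docs, docs_per_commit=2000, commits_per_optimize=5):
    """Add docs, committing every docs_per_commit docs and optimizing
    every commits_per_optimize commits. Returns the number of commits."""
    pending = 0
    commits = 0

    def flush():
        nonlocal pending, commits
        client.commit()
        pending = 0
        commits += 1
        if commits % commits_per_optimize == 0:
            client.optimize()

    for doc in docs:
        client.add(doc)
        pending += 1
        if pending == docs_per_commit:
            flush()
    if pending:
        flush()  # assumption: the last partial batch is committed too
    return commits


client = RecordingClient()
print(load(client, range(25), docs_per_commit=10, commits_per_optimize=2))
```

With the slide's values, an optimize would be issued once every 10,000 modified docs.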
  16. Loading & Synchronizing (2)
      ● The XML config file (XML::Simple → YAML?):
        – Two main sections: auth and bib
        – Each section lists tags that feed indexes
      Two example entries, shown side by side on the slide:
        <tag>
          <code>400</code>
          <list>au</list>
          <type>see</type>
          <subfields>*</subfields>
          <suffix>.</suffix>
        </tag>
        <tag> ... </tag>

        <tag>
          <code>130</code>
          <list>tl</list>
          <type>acc</type>
          <subfields>*</subfields>
          <skip_indicator>2</skip_indicator>
        </tag>
        <tag> ... </tag>
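To show how such a file might be read, here is a Python sketch using the standard library (the original uses Perl's XML::Simple). The root element name `config`, the function name `load_tags`, and the placement of the two example tags under `auth` and `bib` are guesses for illustration; the slide only says that the file has an auth section and a bib section.

```python
import xml.etree.ElementTree as ET

# Sample input; the <config> root and the section assigned to each tag are assumptions.
SAMPLE = """
<config>
  <auth>
    <tag>
      <code>400</code>
      <list>au</list>
      <type>see</type>
      <subfields>*</subfields>
      <suffix>.</suffix>
    </tag>
  </auth>
  <bib>
    <tag>
      <code>130</code>
      <list>tl</list>
      <type>acc</type>
      <subfields>*</subfields>
      <skip_indicator>2</skip_indicator>
    </tag>
  </bib>
</config>
"""


def load_tags(xml_text):
    """Return {"auth": [...], "bib": [...]}, one dict of settings per <tag>."""
    root = ET.fromstring(xml_text)
    config = {}
    for section in ("auth", "bib"):
        node = root.find(section)
        tags = node.findall("tag") if node is not None else []
        config[section] = [{child.tag: child.text for child in tag} for tag in tags]
    return config


for section, tags in load_tags(SAMPLE).items():
    for tag in tags:
        print(section, tag["code"], "feeds list", tag["list"], "as type", tag["type"])
```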
  17. Loading & Synchronizing (3)
      ● Special records in Solr, type:system
        – Created if they do not exist, otherwise incremented
        – A usage counter for each index
        – Last update timestamps
      ● Search for new, modified or deleted records
        – MySQL tables auth_header, biblioitems, deletedbiblioitems, deleted_auth_header
        – Modified AuthoritiesMarc.pm to fill deleted_auth_header for auth deletion
      ● Cron once a minute, using a lock file
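A once-a-minute cron job needs the lock file so that a slow run is not overlapped by the next one. A minimal Python sketch of that idea follows; the function name `run_exclusively` and the lock path are hypothetical, and the slides do not say how the original Perl script implements its lock.

```python
import os


def run_exclusively(lock_path, job):
    """Run job() only if no other run holds the lock file.
    Returns True if the job ran, False if it was skipped."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # a previous run is still in progress
    try:
        os.write(fd, str(os.getpid()).encode())
        job()
        return True
    finally:
        os.close(fd)
        os.remove(lock_path)


if __name__ == "__main__":
    # Hypothetical path; a real script would sync Solr inside the job.
    ran = run_exclusively("/tmp/solr-browse-loader.lock", lambda: print("syncing"))
    print("ran" if ran else "skipped")
```

One limitation of this approach: if the process is killed before the cleanup runs, the stale lock file has to be removed by hand or detected via the stored PID.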
  18. Querying (1)
      A new page in Koha: Browse list of indexes
      [Screenshot of the browse page; overlaid callout text not recoverable]
  19. Querying (2)
      [Screenshot with callouts:]
      ● # of documents: C4::AuthoritiesMarc::CountUsage
      ● Related headings
      ● Search VIAF
      ● Show Koha auth record
  20. Querying (3)
      [Screenshot with callouts:]
      ● Titles list contains standard titles and series titles
      ● Multivolume work
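The slides do not show the queries behind the "starting from", "next headings" and "previous headings" functions. One plausible approach, sketched here in Python as an assumption, is a Solr range query on the sort-form field, sorted ascending for the next page and descending for the previous one. The function name `browse_params` is hypothetical; `q`, `sort` and `rows` are standard Solr request parameters. The sketch also assumes the sort form contains only a-z, 0-9 and dots, so it needs no escaping inside the range.

```python
def browse_params(list_name, start, rows=20, direction="next"):
    """Build Solr request parameters for one page of a browse list.

    list_name: au, tl, se, ...   start: sort form to start from.
    """
    field = "sortform_" + list_name
    if direction == "next":
        # Inclusive lower bound: the starting heading and everything after it.
        query = field + ":[" + start + " TO *]"
        sort = field + " asc"
    else:
        # Exclusive upper bound: headings strictly before the starting one.
        # Results come back nearest-first and must be reversed for display.
        query = field + ":{* TO " + start + "}"
        sort = field + " desc"
    return {"q": query, "sort": sort, "rows": rows}


print(browse_params("au", "alighieri.dante"))
print(browse_params("au", "alighieri.dante", direction="previous"))
```

Sorting on the field this way presumes it is a single-valued, untokenized string field in the Solr schema.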
  21. Statistics
      [Screenshot with callouts:]
      ● Only for PUSC
      ● Public
      ● Staff only
      ● Will be public
      ● We started some weeks ago
  22. Security
      ● Solr db can be erased with a single HTTP request
      ● Many ways to add admin security
      ● For instance, modify
        – jetty.xml
        – webdefault.xml
        – realm.properties
  23. License and portability
      ● The same as Koha
      ● Tested on Koha 3.2 and Koha 3.6
      ● Needs work to be included in Koha
        – I18N
        – .tt instead of AJAX
        – Branches?
        – Integration with Koha system preferences
      ● … Solr experts... (BibLibre?)
  24. Thank you – Grazie!
