RUGCombine & Livetrix


Presentation & workshop at
Norwegian Knowledge Centre for the Health Services, Oslo, January 15th 2007 &
NTNU Library (UBiT) Trondheim, January 17th & 18th 2007

Guus van den Brekel
Coördinator Electronic Services,
Central Medical Library
University Medical Center Groningen

Published in: Technology, Health & Medicine
  • Building different modules around Metalib and ILS configurations, combining data, exchanging and interacting
  • All searches (Combine users and LiveTrix users = Development group) that run through Metalib’s X-server are processed in the background as follows. The search terms from every query are sent to all databases. The first 30 titles from each database are retrieved. These titles are then analyzed and indexed both as a string and word for word, and counted. The count is normalized to take the size of the database into account. Relations between all the words in a phrase are also indexed, so that a relation network is built up. Filler words such as particles, prepositions and articles are filtered out. Words that occur only once are thrown away, to keep typos from seeping into the indexes. So now we know in which databases these terms occur. These indexes are expanding all the time, obviously. When a new query is performed, we can make several recommendations and suggestions: which databases are the best to use for that specific query (“likely sources”), and ditto for word relations. The indexes are also used for correcting typos and for word/phrase suggestions. Suggestions for similar-sounding phrases are based on phonetic matches (“Soundex”).
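The indexing steps described in this note can be sketched roughly as follows. This is only an illustration, not the LiveTrix code: the stop-word list, the sample titles, the database sizes, all function names, and the choice to normalize by dividing a count by the database size are assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical stop-word list; the real filter list is not given in the talk.
STOP_WORDS = {"a", "an", "and", "as", "for", "in", "of", "on", "the", "to"}


def tokenize(title):
    """Lower-case a title, strip punctuation and drop filler words."""
    words = [w.strip(".,:;!?()\"'").lower() for w in title.split()]
    return [w for w in words if w and w not in STOP_WORDS]


def build_index(titles_by_db, db_sizes):
    """Map each word to {database: count normalized by database size}."""
    raw = defaultdict(Counter)
    for db, titles in titles_by_db.items():
        for title in titles:
            for word in tokenize(title):
                raw[word][db] += 1
    index = {}
    for word, per_db in raw.items():
        if sum(per_db.values()) < 2:
            continue  # seen only once: possibly a typo, so it is discarded
        # Assumed normalization: divide the count by the database size.
        index[word] = {db: n / db_sizes[db] for db, n in per_db.items()}
    return index


def build_relations(titles_by_db):
    """Count how often two words occur together in the same title."""
    relations = defaultdict(Counter)
    for titles in titles_by_db.values():
        for title in titles:
            for a, b in combinations(sorted(set(tokenize(title))), 2):
                relations[a][b] += 1
                relations[b][a] += 1
    return relations


def likely_sources(index, query, limit=3):
    """Rank databases by the summed normalized counts of the query words."""
    scores = Counter()
    for word in tokenize(query):
        for db, score in index.get(word, {}).items():
            scores[db] += score
    return [db for db, _ in scores.most_common(limit)]


# Made-up sample data, for illustration only.
titles_by_db = {
    "PubMed": ["Anatomy of the heart", "Heart anatomy lesson", "Cardiac anatmy"],
    "ArtIndex": ["Rembrandt and the anatomy lesson", "Rembrandt etchings"],
}
db_sizes = {"PubMed": 1000, "ArtIndex": 100}

index = build_index(titles_by_db, db_sizes)
relations = build_relations(titles_by_db)
print(likely_sources(index, "the anatomy lesson"))
```

In this toy data the misspelling "anatmy" occurs only once and therefore never reaches the index, which is the effect the note describes for typos.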
  • Start screen. This is where users enter the system: a simple screen, without too many frills, buttons and the like. I can fill in my search terms here and I don’t have to select any databases if I don’t want to. If the system recognises my search term, it will start its search in the resources most likely to give a result. If it has not yet indexed my search term, it will search a predefined set of large generic catalogues and bibliographic databases, including our own library catalogue (and also the Library of Congress, Web of Science and Academic Search Premier).
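A minimal sketch of this choice between likely sources and the predefined default set, assuming the index is a simple lookup from a term to a list of databases. The names `choose_resources`, `DEFAULT_SET` and `known_sources` are hypothetical, and the sample index entry is made up.

```python
# Default set named in the note; the label for our own catalogue is a placeholder.
DEFAULT_SET = [
    "Library catalogue",
    "Library of Congress",
    "Web of Science",
    "Academic Search Premier",
]


def choose_resources(term, known_sources, default_set=DEFAULT_SET):
    """Return the likely sources for an indexed term, else the default set."""
    sources = known_sources.get(term.strip().lower())
    return list(sources) if sources else list(default_set)


# Made-up index content, for illustration only.
known_sources = {"anatomy lesson": ["Science Direct", "COPAC"]}

print(choose_resources("Anatomy lesson", known_sources))
print(choose_resources("smultronstället", known_sources))
```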
  • However, I can select a discipline if I want to, with primary, secondary and even tertiary resources. I can select from these sets, throw away resources if I think I don’t need them, and add resources from another discipline for a multidisciplinary approach.
  • Suppose I need information about a film by Ingmar Bergman, more specifically the notorious film Wild Strawberries, or, in Swedish, Smultronstället. I select the discipline Art. - The LiveTrix database may know the word, or similar words and phrases, and may suggest completions while I type. - A message warns that too many databases are selected (max. 10). - I can click away databases that I think are unsuited to my purpose, and database number eleven then moves up. - I can also add databases.
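The ten-database limit and the "number eleven moves up" behaviour can be pictured as an ordered list of which only the first ten entries are searched. This is a sketch under that assumption. The function names and the placeholder database names are hypothetical.

```python
MAX_ACTIVE = 10  # limit mentioned in the note


def active_resources(selected, max_active=MAX_ACTIVE):
    """Only the first `max_active` entries of the ordered list are searched."""
    return selected[:max_active]


def remove_resource(selected, name):
    """Clicking a database away lets the next one in line move up."""
    return [r for r in selected if r != name]


def add_resource(selected, name, position=0):
    """Insert a database, by default at the top so that it is searched."""
    without = [r for r in selected if r != name]
    return without[:position] + [name] + without[position:]


# Twelve placeholder names standing in for the databases of a discipline.
selected = [f"db{i}" for i in range(1, 13)]

selected = remove_resource(selected, "db3")  # db11 now moves into the top ten
selected = add_resource(selected, "Libris")  # Libris goes to the top
print(active_resources(selected))
```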
  • Using the Search By Name mode (Page divider), I can search for Libris to add a Swedish resource and, once I have found it, click on it to add it to my resource list. I make sure that I have it in my top ten, so that it is searched. I may just as well remove the Art Index and RILA from my list, since I know they don’t cover film, and add the Library of Congress and MUSE, for instance. I fill in smultronstället and click Search.
  • Libris, the Swedish “Union catalogue”, gives some recall, as expected, and I click on it to get a table view. Not all of this is about Bergman’s film, since smultronstället means not only a place where wild strawberries grow but also a nice, favourite little spot, a personal paradise, a haunt. (In the first title, for example, Smultronstället Gotska Sandön, it is used in this sense, although Gotska Sandön is the islet north of Gotland where Bergman lives.)
  • If we click on the title Smultronstället och dödens ekipage by Margareta Wirmark (Wild strawberries and the horse and carriage of death), we get a full view, and here we have some new features as well. Using the ISBN we fetch a book cover from Amazon, which in this case is not available, but there is a click-through to Amazon that one may use. We also use for the same purpose and we do a search for other editions at Amazon, if available, for instance to draw your attention to a newer or older edition, a paperback or a hardback edition etc. In some cases this may lead to a screen full of book covers, for instance when there is a real classic among the retrieved hits. Here, at the RUGlinks icon, we have a full-text lookup, if available (in the case of electronic journal articles), using SFX inline; I will show you an example later. Then we have functions like Print record, View raw XML record, and This view in new window, and Search for additional title information in: A9, Google, Google Scholar. Of course this can be extended to other search engines. Keywords, fetched from the source in question, can be used for a click-through to A9 and to our Catalogue. Example: Bergman, Ingmar --> Catalogue. Now, the success of such an action of course depends very much on the structure, correctness, language etc. of the keywords used (Swedish keywords in a Dutch OPAC).
  • Another example. I’m not going to select any databases beforehand and I will search for anatomy … lesson. Since the search term is known, the most likely databases are now searched, that is, the databases that LiveTrix knows contain these terms.
  • Here is the result; let’s try Science Direct. Try this title: Rembrandt’s anatomy lesson as a metaphor for education, 2003, KA De Ville, LM Kopelman. Now we see an inline full-text lookup, with no new windows or pop-ups, and an inline Catalogue lookup. These are well-known SFX services, though the inline element is new, as is the possibility to create a persistent tiny URL for later use.
  • Let us see what COPAC (a union catalogue of large British and Irish libraries) has in store: The anatomy lesson, a novel by Philip Roth. Obviously, with such a query one may expect hits from both art and medical databases and, as in this case, a hit from the field of American literature. Full view: here you see an example of the editions lookup in Amazon.
  • Amazon editions
  • Query analysis. Now this may give some pretty good results. Maybe not in this case, if the string is not yet known; or, as in this case, it gives likely resources [depending on what LiveTrix is doing]. It dissects a phrase into substrings. We are also working on a graphical representation, a so-called word cloud. I cannot demonstrate this live, because it somewhat destabilised the system, but here is a screen dump of what it looked like when it was working.
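One plausible reading of "dissects a phrase into substrings" is that the query is split into all of its contiguous word sequences, longest first, each of which can then be looked up in the indexes. The sketch below follows that assumption. The function name is hypothetical and the talk does not say how LiveTrix actually does this.

```python
def phrase_substrings(phrase):
    """Return all contiguous word sequences of a phrase, longest first."""
    words = phrase.split()
    n = len(words)
    return [
        " ".join(words[start:start + size])
        for size in range(n, 0, -1)
        for start in range(n - size + 1)
    ]


for part in phrase_substrings("the anatomy lesson"):
    print(part)
```

Trying the longest substrings first means a known phrase such as "anatomy lesson" can be matched before falling back to its individual words.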
  • This is the statistics home page. It is not very fancy, but we have the possibility to get statistical information about Searches, our Users, the use of Resources, and the types of searches and search behaviour in Combine, and we can generate Error reports.
  • An example from the time before we made some improvements and when we weren’t yet experimenting with our X-server (which disrupted the validity of the figures). Searches, 2005-10-01 until 2005-10-15: 3975 searches. - (1) picking AATA, the first database alphabetically - (1) searching with medical terms in AATA (Art and Archeology). Zero results (checkbox): 1690 searches. - (2) searching journal title + author - (3) when this doesn’t work, doing complex things like using quotation marks (quotes) - (1288-1300) all sorts of things going wrong: trying all sorts of name forms, using a Dutch search term, phrases that are not handled nicely by either Pubmed or Metalib or the parsers we used. Unsuccessful searches: 309 searches. - All sorts of things are going wrong, not all of which can be blamed on the users.
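To put these counts in proportion, the shares can be worked out from the three figures in this note. The helper name `share` is made up for the example, and the note does not say whether the zero-result and unsuccessful searches overlap, so the two shares are computed separately.

```python
def share(part, total):
    """Percentage of `total`, rounded to one decimal place."""
    return round(100 * part / total, 1)


TOTAL_SEARCHES = 3975  # 2005-10-01 until 2005-10-15
ZERO_RESULTS = 1690
UNSUCCESSFUL = 309

print("zero results:", share(ZERO_RESULTS, TOTAL_SEARCHES), "%")
print("unsuccessful:", share(UNSUCCESSFUL, TOTAL_SEARCHES), "%")
```

That is roughly 42.5% of the searches returning nothing and about 7.8% counted as unsuccessful in this two-week sample.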

    1. 1. Presentation & workshop at Norwegian Knowledge Centre for the Health Services Oslo, January 15th 2007 & NTNU Library (UBiT) Trondheim, January 17th & 18th 2007 Guus van den Brekel Coördinator Electronic Services, Central Medical Library University Medical Center Groningen Website: Blog: RUGCombine & Livetrix the search for a perfect search interface …?
    2. 2. RUGCombine : Library Resources Portal <ul><li>Federated search (Quick & Advanced) </li></ul><ul><li>All databases </li></ul><ul><li>All E-Journals </li></ul><ul><li>Personalization : My Library </li></ul>
    3. 3. Our Metalib: RUG Combine <ul><li>Introduced v2 as pilot 2004, Faculty of Medicine and University Medical Centre (Hospital staff) </li></ul><ul><li>Migrated to v3 in January 2005, completing introduction to the whole of the University September 2005 </li></ul><ul><li>NOT only gateway (OPAC, Guide, Lists); usage modest so far </li></ul><ul><li>Our SFX: RUGlinks </li></ul><ul><li>Introduced 2002, v3 since 2005 </li></ul>
    4. 4. RUGCombine = Metalib out of the box <ul><li>But with extensive modifications IN the box </li></ul><ul><li>Layout and restricted functionality </li></ul><ul><li>Problem: no content </li></ul>
    5. 5. Shocking Statistics (1) <ul><li>15% errors: </li></ul><ul><li>technical measures > reducing to 1.5% errors/flaws </li></ul><ul><li>50% zero or false results: </li></ul><ul><li>Misspellings and typos in search terms </li></ul><ul><li>Picking databases at random </li></ul><ul><li>Unable to understand QuickSearch, MetaSearch, Find Database </li></ul><ul><li>Using the wrong search keys </li></ul>Metalib statistics
    6. 6. Shocking Statistics (2) <ul><li>Using search keys wrong </li></ul><ul><li>Using Dutch search terms in English language databases </li></ul><ul><li>Using non-specific terms, phrases that are too broad </li></ul><ul><li>Lack of understanding of Boolean logic or database peculiarities </li></ul>
    7. 7. LiveTrix = X-Server +Metalib <ul><li>xml </li></ul><ul><li>Working outside the box </li></ul><ul><li>Modular, you can add/connect stuff, flexible </li></ul><ul><li>Harvesting & analysing records </li></ul><ul><li>Statistics modules (from logfiles) </li></ul><ul><ul><li>Metalib statistics </li></ul></ul><ul><ul><li>SFX-statistics </li></ul></ul>
    8. 8. A Short Term example <ul><li>“ Building around the box” </li></ul><ul><li>From statistics to new services </li></ul><ul><li>From Metalib -> LiveTrix </li></ul><ul><li> </li></ul>
    9. 9. Already implemented in LiveTrix (1) <ul><li>Discovery/suggestion tool (no need to select resource/database or subject first!) </li></ul><ul><li>Spellcheck/adviser </li></ul><ul><li>Translation </li></ul><ul><li>Query Analysis </li></ul><ul><li>Inline- SFX & lending info with OPAC records </li></ul><ul><li>Impact factor info with Journals </li></ul><ul><li>Related strings info </li></ul>
    10. 10. Already implemented in LiveTrix (2) <ul><li>Relevant help at point of need </li></ul><ul><li>Alert service (RSS & Email) </li></ul><ul><li>Relation Databases </li></ul><ul><li>WorkBench </li></ul><ul><li>Bookmarks </li></ul><ul><li>TinyUrl creation </li></ul><ul><li>Just in production: </li></ul><ul><ul><li>Cumulated lending figures OPAC </li></ul></ul><ul><ul><li>Separate record selection to WorkBench </li></ul></ul><ul><ul><li>Fulltext availability on First Selection Screen </li></ul></ul>
    11. 11. LiveTrix: under the hood <ul><li>Search terms are sent to all databases in the background. </li></ul><ul><li>The first 300 titles from each database are retrieved. These titles are indexed both as a string and word for word, as well as counted. A relation network is built up. From then on, the system knows in which databases these terms occur. </li></ul><ul><li>In the next query, suggestions about databases (“likely sources”) are given on the basis of these ever expanding indexes. </li></ul><ul><li>Ditto for suggestions about word relations. </li></ul><ul><li>The indexes are also used for correcting typos and word/phrase suggestions. </li></ul><ul><li>Suggestions about similar sounding phrases are given upon phonetic matches (“Soundex”). </li></ul>Source: Redesigning our Combine Harvester (MetaLib) (Ane W. van der Leij, University of Groningen, NL)
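The phonetic matching mentioned on this slide can be illustrated with the classic American Soundex code, in which words that sound alike share a four-character key. The slide does not say which Soundex variant LiveTrix uses, so the variant chosen here, the `suggest` helper and the sample terms are assumptions made for the sketch.

```python
# Letter groups of American Soundex; vowels and Y get no digit.
CODES = {}
for letters, digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the four-character American Soundex key of a word."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (key + "000")[:4]


def suggest(query, known_terms):
    """Return the indexed terms that sound like the query."""
    target = soundex(query)
    return [t for t in known_terms if soundex(t) == target]


# Made-up index terms, for illustration only.
print(suggest("anatomi", ["anatomy", "lesson", "strawberries"]))
```

Because only the key has to match, a misspelling such as "anatomi" still leads to the indexed term "anatomy".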
    12. 27. Complex phrases Dutch ! (obesity) Quotes Name handling Name handling
    13. 28. <ul><li>new central information system: component-based software, no monolithic ILS </li></ul><ul><li>Open standards and protocols: inter-operability </li></ul><ul><li>modular web-based library services </li></ul><ul><li>Web 2.0 social software </li></ul>On The Long Term (1) On The SHORT Term (1)
    14. 29. On The SHORT Term (2) <ul><li>use of logs stats, evaluate and re-use </li></ul><ul><li>“ Make the data work harder”. Combinations of data-sets can be used to create new services. </li></ul><ul><li>in more ways than one; platform independent </li></ul>