Rise presentation-2012-01
 


Presentation given at O

  • If we go back to 2009, it became obvious that library search simply didn’t work as well as users expected it to. We were getting the sort of comments you see on screen, which showed that library users were struggling with the federated search system that we were using. So the library embarked on some work to improve search, with a new discovery search system and other changes.
  • We changed the search system to a new-generation library search system from EBSCO. Instead of searching library resources individually and telling you how many results are in each database, it now searches one index and shows the results in a single list.
  • We started thinking about whether there was more that we could do to improve the user experience. For a while we’d been following with interest some JISC work, in projects such as TILE and MOSAIC, looking at whether activity data could be used by libraries to improve services. So we started to consider whether using activity data could improve the user experience of library search.
  • So when we knew that JISC were going to be funding some more work on activity data, we thought about what we’d want to do, and came up with this hypothesis.
  • The project we came up with was RISE: Recommendations Improve the Search Experience. We set out to test two things. Can you use search data to make recommendations? Are recommendations useful for these new systems?
  • RISE was funded as part of the Activity Data strand of the JISC Infrastructure for Education and Research programme. It was a very short project, just six months, with a small team: a developer and a project manager. There were seven other projects in the programme. Some, such as SALT and LIDP, were working with libraries; others were looking at activity data in a range of other areas, from VLEs through repositories to student systems and video-conferencing data, and including the UCIAD project in KMi looking at a user-centred approach to web clickstream data.
  • The business sector, particularly companies such as Tesco, Amazon and Wal-Mart, exploits the data it has about customer activities to support decision making. Some early research by JISC, in the TILE and MOSAIC projects, identified that the HE sector also had extensive user data and some potential to make use of it, but that it was greatly underused. So this JISC programme has set out to explore this area in more detail. Across the sector we are being told to be more business-like, and the use of customer data is one of the areas that businesses seem to be exploiting far more than we do.
  • For a traditional ‘bricks and mortar’ university these are some of the ways that you’d typically interact with your customers. Well, for the OU things are a bit different.
  • We don’t really loan many books to students or have many accessing the library. All our students are distance learners, so they interact with us online and use our resources electronically. And with more than 450,000 unique users of our website and over 100,000 unique users of our e-resources each year, there’s a fair amount of activity data for us to use.
  • So, if we are concentrating on our e-resources, the systems we use are: SAMS single sign-on; the EZProxy system from OCLC, which allows students to access our resources as if they were locally within the library; SFX from ExLibris as our resources knowledge base and OpenURL link resolver; and finally the Ebsco Discovery Solution in place of an older federated search system.
  • The stages of the project were to build the database, fill it with activity data, write some software to create the recommendations, create a search interface to show the recommendations, and test it with some users.
  • We push as much as possible through EZProxy: we use it for access through our discovery solution, for links from SFX, and for links placed in our VLE. So it seemed the obvious place to start looking at e-resource activity data. We didn’t have access to the Ebsco Discovery log files and we hadn’t been using that system for long, whereas we did have a few months of log files from EZProxy. So we started with the EZProxy log files as the core dataset.
  • So when we start to look in detail at what data is contained within the log files, you’ve got some useful data and other data that isn’t so useful for activity data purposes. We know the user name: that’s the oucu, the Open University Computer User account name. We know the request, that is, the website that is being accessed. So when you look at the detail of the record what you get is…
  • Something that looks like this (we’ve anonymised the oucu for obvious reasons). This is one record out of tens of thousands of rows, but with a bit of work you can break it down.
  • So you’ve got the date and time – useful to be able to know when something happened
  • And the oucu of the user
  • And the request that has been made – in this case an ebsco host search
  • So our database starts to build up with details of users and resources.
  • So we can get data about the courses that students were studying from our internal student information system
  • So that added a bit more to the mix
  • So, the data we have so far can tell us which courses people are on, so we can make recommendations based on that, i.e. these are the most popular resources that people on your course are looking at. We can also start to say that if you looked at resource C and then straightaway looked at resource D, there is a likelihood that there is some relationship between resource C and resource D. And we can also say which overall are the most popular articles or journals. But there are limitations to the EZProxy data: we don’t have the search terms that are used to find these resources.
  • But there are limitations. From the logs you don’t always know what search terms were used or have much information about the item that is being accessed. And if you want to make a recommendation you don’t even have an article or journal title to show as the recommendation. So we looked at how we could improve the data. At the moment we use another EDS API call to retrieve bibliographic details, which are then used to look up data from Crossref that we can store in the database.
  • So we decided that we could use the EDS API to retrieve some bibliographic data. Originally we’d hoped that we would be able to store basic metadata from EBSCO in the system, but after discussion with them we realised that the license terms wouldn’t let us do that. So we had to look for other metadata sources that we could use. We set the system up to retrieve data keys from EBSCO and use them to search Crossref. The Crossref data license allows you to store that data locally.
  • We created a test search interface to test recommendations with users using the Ebsco Discovery Solution API.
  • And when you get your search results, you also get recommendations based on the articles viewed by people who used similar search terms.
  • If you view one of the recommended resources it will open the record in another window and you are given the chance to rate the usefulness of the recommendation.
  • We also built a second interface – this one is a Google Gadget version with pretty much the same functions as the main interface.
  • Login was sorted out by working with the SocialLearn team.
  • We also then started to capture search terms used in the RISE interface
  • Now we can add search terms that are being used
  • So we’ve ended up with a set of data that can give us a range of different types of recommendations, from ‘people on your course are looking at these articles’ through ‘people who looked at this article also looked at this article’ to ‘people using this search term looked at these resources’. And we are sure that you could put the data to other types of use.
  • When we were looking at recommendations we thought that the simplest approach was just to start with something very basic. What drives the recommendations is a set of relationship values. Values are assigned based on resource views and subsequent ratings by users. The relationships are ranked according to value, so the top ones get shown as recommendations.
  • Each relationship starts at value 0: +1 each time the resource is viewed; +1 each time the recommendation is viewed; +1 each time the recommendation is rated as ‘Useful’; -2 each time the recommendation is rated as ‘Not Useful’. Recommendations are displayed in value order.
  • Any system that deals with personal data has to be mindful of privacy and data protection requirements. After discussion within the Activity Data programme and some helpful information, particularly from EDINA’s OpenURL project, we put together a specific privacy policy and discussed it with our data protection people at the University. The policy explicitly covered activity data and we have linked to it from the RISE interfaces, from our main EDS page and from SFX. The policy gives people an opt-out to have their data removed from the recommendations, even though they aren’t identified personally in any of the recommendations. With the new EU ‘cookies’ legislation we are doing some more work to ensure that we are legally compliant. Ideally we would want any institutional ‘cookie’ policy and agreement to cover permission to use data for this type of activity.
  • The original plan with the project was to be able to release an open data set of search data. And we spent quite a lot of time looking at methods of anonymising the data, by removing oucus, genericising courses to broad subjects and looking at whether there was a threshold of students that we needed on a course to be able to release any data from that course. We faced a major challenge because the activity data we had was fairly meaningless without some article metadata, and at the time we could only find data we could use ourselves and nothing we could make available in an open data set. So unfortunately it wasn’t possible to release the data. But others at EDINA, LIDP and SALT were able to do so.
  • The Google Gadget will go into the list of tools for students alongside those being developed by DOULS. We are migrating the database so we can use it for more mainstream use. We plan to use it for the new MACON mobile search project. And we’re interested in how this data could be used by Learning Analytics.
  • We are also looking at how we can use these approaches to provide personalised services to users through the library website, so have been looking at being able to show people what articles are being looked at and have been developing some beta services to demonstrate this
  • EZProxy data: on its own, there are limits to the recommendations you can make; they would mostly be about which are the most popular resources. Our main issue is to get access to bibliographic data about the articles being accessed and recommended. You need to combine the EZProxy data with other sources, such as CIRCE data. The more data you get hold of, the better you can make the recommendations. License restrictions on article-level metadata limit what you can store in your database.
  • I’m now going to hand over to Liz who will take you through the findings of the testing with users
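
The relationship-value scheme described in the notes above (start at 0, +1 per resource view, +1 per recommendation view, +1 for a ‘Useful’ rating, -2 for ‘Not Useful’, then rank by value) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the RISE code: the class name, the event names and the idea of keying each relationship by a (context, resource) pair are hypothetical, and the starting value of 14 is taken from the worked example on the ‘Getting a recommendation’ slide.

```python
from collections import defaultdict

# Adjustments as listed in the notes; the event names are invented for this sketch.
ADJUSTMENTS = {
    "resource_viewed": 1,
    "recommendation_viewed": 1,
    "rated_useful": 1,
    "rated_not_useful": -2,
}


class RelationshipValues:
    """Keeps one integer value per (context, resource) relationship.

    `context` is whatever the resource is related to: a module code,
    another resource or a search term (an assumption of this sketch).
    """

    def __init__(self):
        self._values = defaultdict(int)  # every relationship starts at 0

    def record(self, context, resource, event):
        """Apply the adjustment for one event and return the new value."""
        delta = ADJUSTMENTS[event]  # raises KeyError for an unknown event
        self._values[(context, resource)] += delta
        return self._values[(context, resource)]

    def top(self, context, limit=5):
        """Return up to `limit` (resource, value) pairs, highest value first."""
        candidates = [
            (res, val) for (ctx, res), val in self._values.items() if ctx == context
        ]
        candidates.sort(key=lambda pair: pair[1], reverse=True)
        return candidates[:limit]


values = RelationshipValues()

# Bring Resource B on module A123 up to the slide's starting value of 14.
for _ in range(14):
    values.record("A123", "Resource B", "resource_viewed")
values.record("A123", "Resource D", "resource_viewed")

after_view = values.record("A123", "Resource B", "resource_viewed")
after_recommendation = values.record("A123", "Resource B", "recommendation_viewed")
after_useful = values.record("A123", "Resource B", "rated_useful")

print(after_view, after_recommendation, after_useful)
print(values.top("A123"))
```

Because a ‘Not Useful’ rating costs two points while every other event adds one, a single negative rating outweighs a view plus nothing else, which is one simple way of letting user feedback push weak recommendations down the list.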

Rise presentation-2012-01 Presentation Transcript

  • Recommendations Improve the Search Experience Project Outline Richard Nurse http://www.open.ac.uk/blogs/rise
  • Why? “The search engine on the library is not very user friendly. I had to find a specific article recommended in the text and it took several attempts to locate it.” “The search facility is poor and doesn’t find stuff that is supposed to be there” http://www.flickr.com/photos/james_lumb/3921968993/sizes/z/in/photostream
  • New search system New generation Discovery System from EBSCO http://www.flickr.com/photos/jiscimages/435135071/sizes/m/in/photostream/
  • Could we do more? http://www.flickr.com/photos/davepattern/5808712333/sizes/z/in/photostream/
  • Recommendations Improve the Search Experience? “That recommender systems can enhance the student experience in new generation e-resource discovery services”
  • Recommendations Improve the Search Experience? Can you use search data to make recommendations? Are recommendations useful in Discovery systems? http://www.flickr.com/photos/davepattern/3473326634/sizes/z/in/photostream/
  • JISC Activity Data Programme JISC funded project February – July 2011 One of eight projects [list at http://bit.ly/gwCmNS] http://www.open.ac.uk/blogs/rise
  • Why activity data? "Every day I wake up and ask, how can I flow data better, manage data better, analyse data better?" Rollin Ford, the CIO of Wal-Mart http://www.flickr.com/photos/zerimski/5215633183/sizes/z/in/photostream/
  • Library activity data: Loans; Holds; Computer bookings; Library access; e-resources http://www.open.ac.uk/blogs/rise
  • Library activity data: Loans; Holds; Computer bookings; Library access; e-resources
  • Library systems environment: Athens DA authentication built into local (SAMS) login system; EZProxy remote resource access; SFX knowledge base and OpenURL link resolver; Ebsco Discovery Solution
  • Scope of the project: Activity data; Algorithms & recommender code; Search interface
  • What data is RISE using? bookmarklet
  • So what is in the EZProxy logs? Remote host; Date/Time; Oucu; Request; Status; Size of response; Referrer; User agent; Session http://www.flickr.com/photos/vixon/116447718/sizes/m/in/photostream/
  • So what is in the EZProxy logs? "0"|||"137.108.143.168"|||20110115235421|||"nn1234"|||"GET http://libezproxy.open.ac.uk:80/connect?Session=st3ShtizgtrS7tU5&url=http://search.ebscohost.com/login.aspx?direct=true&site=edslive&scope=site&type=0&cli0=FT&clv0=Y&cli1=FT1&clv1=Y&authtype=ip&group=VCStud&bquery=War%20Against%20the%20Panthers HTTP/1.1"|||302|||0|||http://library.open.ac.uk/|||"Mozilla/5.0 (X11; U; Linux i686; en-US;rv:1.9.2.13) Gecko/20101206 Ubuntu/10.10(maverick) Firefox/3.6.13"|||"t3ShtizgtrS7tU5"
  • So what is in the EZProxy logs? (same record as the previous slide) Callout, date and time: 20110115235421
  • So what is in the EZProxy logs? (same record as the previous slide) Callout, user name: "nn1234"
  • So what is in the EZProxy logs? (same record as the previous slide) Callout, request: "GET http://libezproxy.open.ac.uk:80/connect?Session=st3ShtizgtrS7tU5&url=http://search.ebscohost.com/login.aspx?direct=true&site=edslive&scope=site&type=0&cli0=FT&clv0=Y&cli1=FT1&clv1=Y&authtype=ip&group=VCStud&bquery=War%20Against%20the%20Panthers HTTP/1.1"
  • RISE database
  • RISE database. EZProxy: Remote host | Date/Time | Oucu | request | status | size of response | referrer | user agent | session. CIRCE: user type | course code(s)
  • RISE database
  • What can the data tell us? People on course ‘A’ viewed resource ‘B’. People who looked at resource ‘C’ also looked at resource ‘D’. Which are the most popular resources. This resource is being used by people studying this course.
  • But what isn’t there? ISSNs; DOI; Article information; Subject terms http://www.flickr.com/photos/kevharb/5466661946/sizes/z/in/photostream/
  • So how do you improve your data? EZProxy: Remote host | Date/Time | Oucu | request | status | size of response | referrer | user agent | session. CIRCE: user type | course code(s). EDS and Crossref: Bibliographic data matching
  • So what about collecting more data? http://library.open.ac.uk/rise www.open.ac.uk/libraryservices/rise/ http://www.open.ac.uk/blogs/rise
  • http://www.google.com/ig/directory?type=gadgets&url=library.open.ac.uk/rise/google_gadget/risesearch.xml
  • So how do you improve your data? EZProxy: Remote host | Date/Time | Oucu | request | status | size of response | referrer | user agent | session. CIRCE: user type | course code(s). EDS and Crossref: Bibliographic data matching. RISE: Searches in RISE
  • RISE database
  • What can the data tell us? People on course ‘A’ viewed resource ‘B’. People who looked at resource ‘C’ also looked at resource ‘D’. People who searched for subject ‘E’ looked at resource ‘F’. People are looking at resources on this subject. This resource is being used by people studying this course.
  • Getting a recommendation
  • Getting a recommendation. User A views Resource B (Module A123): views +1, RV=14 becomes RV=15. User C views recommended Resource B (Module A123): views +1, RV=15 becomes RV=16. User C rates it Useful: +1, RV=17. User C rates it Not Useful: -2, RV=14.
  • Data Protection and privacy. Added a privacy policy to RISE, EDS and SFX interfaces. Provided an opt-out feature. Privacy and opt-out URL http://library.open.ac.uk/rise/?page=privacy
  • Open Data http://www.flickr.com/photos/okfn/6262973028/sizes/z/in/photostream/
  • What next? • Google Gadget search tool • Recommendations database • MACON • Learning Analytics http://www.flickr.com/photos/shandrew/2102808886/sizes/m/in/photostream/
  • What next?
  • Findings and lessons learnt • EZProxy data • Use other data sources • Search terms • Need more data • License restrictions on metadata
  • Resources Blog: www.open.ac.uk/blogs/RISE Code: http://code.google.com/p/rise-project/source/browse/trunk/rise/ http://www.flickr.com/photos/madeleinerobertson/5612180756/sizes/z/in/photostream/
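
The ‘So what is in the EZProxy logs?’ slides break one record down into date and time, oucu and request. A minimal Python sketch of that breakdown follows, using the anonymised sample record from the slides. Several things here are assumptions rather than the project’s own code: the function and field names are invented, the leading "0" field is not named on the slides and is labelled `unknown`, the quotes are normalised to straight quotes, and a space is assumed between the search URL and `HTTP/1.1`. Reading the search terms from the `bquery` parameter is also a guess based on this one record.

```python
from datetime import datetime
from urllib.parse import parse_qs, urlsplit

# Field order follows the slide's list; "unknown" covers the unnamed leading field.
FIELD_NAMES = [
    "unknown", "remote_host", "timestamp", "oucu", "request",
    "status", "size", "referrer", "user_agent", "session",
]

# The anonymised sample record from the slides, with straight quotes.
SAMPLE = (
    '"0"|||"137.108.143.168"|||20110115235421|||"nn1234"|||'
    '"GET http://libezproxy.open.ac.uk:80/connect?Session=st3ShtizgtrS7tU5'
    '&url=http://search.ebscohost.com/login.aspx?direct=true&site=edslive'
    '&scope=site&type=0&cli0=FT&clv0=Y&cli1=FT1&clv1=Y&authtype=ip'
    '&group=VCStud&bquery=War%20Against%20the%20Panthers HTTP/1.1"|||'
    '302|||0|||http://library.open.ac.uk/|||'
    '"Mozilla/5.0 (X11; U; Linux i686; en-US;rv:1.9.2.13) Gecko/20101206 '
    'Ubuntu/10.10(maverick) Firefox/3.6.13"|||"t3ShtizgtrS7tU5"'
)


def parse_ezproxy_line(line):
    """Split one '|||'-delimited record into a dict of named fields."""
    # Strip surrounding whitespace plus straight or curly double quotes.
    parts = [part.strip().strip('"\u201c\u201d') for part in line.strip().split("|||")]
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(parts)}")
    record = dict(zip(FIELD_NAMES, parts))

    record["timestamp"] = datetime.strptime(record["timestamp"], "%Y%m%d%H%M%S")
    record["status"] = int(record["status"])
    record["size"] = int(record["size"])

    # The request field looks like "<METHOD> <URL> <PROTOCOL>".
    method, rest = record["request"].split(" ", 1)
    url, protocol = rest.rsplit(" ", 1)
    record["method"], record["url"], record["protocol"] = method, url, protocol

    # parse_qs percent-decodes values; bquery is absent on non-search requests.
    query = parse_qs(urlsplit(url).query)
    record["search_terms"] = query.get("bquery", [None])[0]
    return record


record = parse_ezproxy_line(SAMPLE)
print(record["timestamp"], record["oucu"], record["search_terms"])
```

Rows parsed this way could then be loaded into a database table alongside the course data, which is roughly the first stage the notes describe.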