Too Difficult


Published on

Presentation for Gaining Business Intelligence from User Activity Data 14 July 2010

Published in: Education, Technology
  • I thought I’d start with a couple of quotes to illustrate how other sectors view the importance of data and data analysis. In retail, data analysis has long been embedded as a critical business tool. Take Wal-Mart, one of the largest US retailers (and owner of Asda in the UK): 2m staff, 1m transactions per hour and revenue of $400 billion. Their CIO has this to say…
  • Craig Mundie of Microsoft and Eric Schmidt, the boss of Google, sit on a presidential task force to reform American health care. They see data as absolutely central to improving health care. I think that second quote holds a key lesson for libraries and the HE sector: we often seem to view user activity data as a by-product of our activities, something that tells us what we have produced and sometimes what impact it has had, but we don’t use the data to continually improve the design, delivery and customisation of our services.
  • I thought I’d do a quick run through of what might constitute success from the perspective of students, researchers and institutions, then look at the sort of solution we might need, consider some of the barriers and challenges, and finish with some thoughts from an OU perspective.
  • I’ve broken down the stages for students using our Student Journey model. Step one takes the student through to deciding where and what to study. This is quite an obvious area for recommendations to help with finding the right course, and I thought it was interesting that the two OU entries to the MOSAIC competition last year (from Tony Hirst and Owen Stephens) were in this area: either matching you up with a suitable course based on what you like to read, or showing you what students on a course are reading.
  • Moving on to the time when students are studying, you’ve got recommendations based on what other students are doing, and maybe comparisons across institutions.
  • And then the final step of the user journey: looking at next steps and building longer-term affiliation as students follow a career path. That’s particularly important to the OU, where students sign up a module at a time. There’s the old retail adage that it’s cheaper to keep your existing customers than to find new ones.
  • Looking at how it can help researchers, you could include resources in their field that are newly published, or data about what is being cited.
  • For institutions there is a big element around decision making: being able to target your marketing effectively, being able to measure the impact of your work, and being able to use resources that are likely to become increasingly scarce to the best effect.
  • OK, so that may be what success looks like, but what sort of solution do we want? Are we looking for a solution that links everything together: one that pulls data from all relevant sources and makes connections between course codes, students and their activity? Should that be sector-wide rather than institution-based? One that then feeds that data out to where the student is (in the VLE or library search systems) or to staff via dashboards or portals? And what tools, standards and applications will be needed to do this? I’ve flagged up EBSM, which is Evidence-Based Stock Management. It has been developed and adopted in some public libraries to take library loans data and produce a suite of reports to help decision-making on what stock to buy or how to rotate stock around libraries.
  • So, as a crude model, what you end up with as a diagrammatic representation is something like this: a great pool of institutional systems feeding data into some form of business intelligence platform, using appropriate data standards and processes, and then feeding the results out as recommender systems and visualizations onto front-end systems. And that leads to a question: how comprehensive does this system need to be? Is anything less than a comprehensive view of the world inadequate?
  • So, let’s start with the view that ‘only a comprehensive view of the world of data is acceptable’. Why might you say that? No one system has sufficient data in it to uncover new insights: for example, until Huddersfield matched up loan data with student achievement data, no one had the evidence of the impact of library loans on student grades. If there are gaps in the data, how do we know we have an accurate picture? Some students may prefer print, others electronic versions, so if you only have one dataset you have a distorted view. And I think there’s a parallel with CRM systems: organisations have increasingly invested in Customer Relationship Management systems to pull together all their customer contacts, and data transactions need to come into the same picture. So I’d suggest that we want a comprehensive view.
  • This is what in the Business Intelligence world is often called “a single version of the truth”: one representation of critical data that is unique, complete and consistent, the most reliable and authoritative information for the entire organisation. It has become something akin to a search for the holy grail. But it is a view that has been challenged, not least because different users will have different perspectives.
  • Turning to some of the challenges and barriers, and looking first at cultural and institutional barriers. The first is making the case: can key decision makers be convinced that this should be a high priority? Can you demonstrate the benefits clearly, in terms of improving efficiency, saving money, improving services or providing unique value? Can you show some exemplars of how the data can be used and what the impact is? The second barrier is the data itself. Do you know which system has the data? Can you get access to it? Can you convince data owners and your institution to share it? The third area is the view that universities are competing against each other. One answer to that is to point to the retail sector, which sees data as a tool to increase competitive advantage. I would also say that as a sector we need to make the best use of diminishing resources, and sharing this data across the sector would be more cost-effective than every institution doing its own thing. Finally, cost: probably the biggest barrier, so we have to be clear about the benefits and articulate them clearly.
  • Looking at the technical challenges: can you get the data out of the systems? Do you have people with the right technical skills? Often you will need programmers and developers rather than library systems administrators. How far back does the data go? Often log files aren’t kept, or data isn’t migrated when people change systems. Can you feed recommendations back into your systems, and can you customise them? And finally, there’s the sheer number of different systems, with different data, built for different purposes. You may want not only to extract data from those systems but also to feed it back in to OPACs, VLEs and eportfolio systems, maybe using gadgets, widgets or RSS feeds.
  • Once you have the data there are still more challenges. You’ve got to make sure that individual students can’t be identified, so the data needs to be anonymous. You can relate it to courses, but you need to be careful that it isn’t possible to deduce information about individuals. You need to make sure that you aren’t recording the same data in different systems and duplicating it. You may have to rethink what data you store where, and which fields in a record you use to match data from one source against another. And then how do you cope with potentially huge volumes of data?
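As a rough illustration of the processing steps just described (de-duplication, hiding identities, and dropping records that might identify an individual), here is a minimal Python sketch. It is not the OU's actual process: the field names, the secret key and the `min_group_size` threshold are all assumptions made for the example. Note too that a keyed hash gives pseudonymised rather than fully anonymous data.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored apart from the data.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymise(user_id: str) -> str:
    """Replace a login with a keyed hash so records can still be linked."""
    digest = hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def prepare(records, min_group_size=5):
    """De-duplicate records, pseudonymise users, and drop small course groups."""
    # De-duplicate: the same event may have been recorded by two systems.
    unique = {
        (r["user"], r["course"], r["resource"], r["timestamp"]) for r in records
    }
    cleaned = [
        {"user": pseudonymise(u), "course": c, "resource": res, "timestamp": ts}
        for (u, c, res, ts) in sorted(unique)
    ]
    # Count distinct users per course; very small groups could reveal individuals.
    users_per_course = {}
    for row in cleaned:
        users_per_course.setdefault(row["course"], set()).add(row["user"])
    return [
        row for row in cleaned
        if len(users_per_course[row["course"]]) >= min_group_size
    ]
```

The threshold is a simple guard against deducing information about individuals from small courses; what value is appropriate is a policy decision, not something the code can settle.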
  • Are there suitable standards to help with extracting and processing this data? Are there suitable IMS standards? IMS now have a Libraries Project Group under formation, which is looking at how library systems and VLEs are integrated. Currently its priority has been the adoption of standards such as Basic Learning Tools Interoperability, but there’s the opportunity for other UK HE libraries to get involved. The MOSAIC documentation helps, but more will need to be done. And finally, there is the whole issue of data protection, data ownership and permissions that we will hear about later.
  • So, that’s a quick run through some of the potential benefits, solutions and challenges, and in the last few minutes I want to go through some of the OU approaches
  • As a distance-learning university there are some big differences. The LMS isn’t central to the student experience. Students rarely visit the library or borrow books, and only register in the LMS if they want a SCONUL card to use another library.
  • And we are accelerating a move to e-resources. OU students sign up to individual modules rather than a full degree course in its entirety. So we don’t have much LMS data.
  • But we have a vast array of data from students’ online visits, as students engage with the OU through course websites in the VLE, the library website and Student Home.
  • But different systems are run by different departments. The VLE is run by a large Learning and Teaching Solutions department that creates new course websites in the VLE for Faculties. Student systems are run by Student Services, who are responsible for contact with students. The Library, Online Services and AACS (Academic and Administrative Computer Services) are also involved in providing systems for students.
  • And all the systems have their own reporting tools, which don’t make it easy to connect up the data. So instead of being able to get a single view of all the data, you rely on programmatically extracting data from the systems into another database and matching the data together.
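The "extract into another database and match" approach can be sketched in a few lines of Python using the standard-library `sqlite3` module. This is only an illustration: the table names, column names and sample extracts are hypothetical, and it assumes both systems record the same login (plausible where there is single sign-on).

```python
import sqlite3


def build_combined_view(vle_rows, search_rows):
    """Load two extracts into one SQLite database and match them on user id."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables standing in for extracts from two separate systems.
    conn.execute("CREATE TABLE vle_visits (user_id TEXT, course_code TEXT)")
    conn.execute("CREATE TABLE library_searches (user_id TEXT, query TEXT)")
    conn.executemany("INSERT INTO vle_visits VALUES (?, ?)", vle_rows)
    conn.executemany("INSERT INTO library_searches VALUES (?, ?)", search_rows)
    # Match searches to courses through the login shared by both systems.
    rows = conn.execute(
        """
        SELECT DISTINCT v.course_code, s.query
        FROM vle_visits AS v
        JOIN library_searches AS s ON s.user_id = v.user_id
        ORDER BY v.course_code, s.query
        """
    ).fetchall()
    conn.close()
    return rows
```

Searches by users who never appear in the VLE extract simply drop out of the join, which is one way gaps between systems distort the combined picture.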
  • Some of the early work we’ve been doing includes trialling the bX recommender service from ExLibris. This collects SFX search data from users across the world, but it’s buried quite deep within systems. There’s an API available which we need to investigate.
  • We’ve also been collecting records of searches via our federated search system, so we are collecting a time-stamp, the type of search, the words used in the search and the user login. The image is a sample of 1,000 searches carried out in the system. When we first looked at the data it surprised us how often students just typed in their course code and expected relevant search results. But thinking about it, why wouldn’t it be logical for a student to expect that typing in a course code would bring back results relevant to that course, maybe organised according to what they need to read each week? We’ve branded the system One-Stop search and put ‘One-Stop search’ in the search box to explain what it is, so the most common search is ‘One-stop search’. So always beware of the quality of your data.
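The data-quality point about the search box text can be made concrete with a short Python sketch that separates the default search-box text from real queries and counts course-code searches. The course-code pattern is an assumption made for the example (a few letters followed by three digits), not a definitive description of OU course codes, and the function name is hypothetical.

```python
import re
from collections import Counter

# Default text shown in the search box, as described in the talk.
PLACEHOLDER = "one-stop search"
# Assumed course-code format: one to four letters followed by three digits.
COURSE_CODE = re.compile(r"^[A-Za-z]{1,4}\d{3}$")


def summarise_searches(queries):
    """Separate placeholder searches from real ones and count course codes."""
    normalised = [q.strip().lower() for q in queries if q.strip()]
    real = [q for q in normalised if q != PLACEHOLDER]
    return {
        "placeholder_count": len(normalised) - len(real),
        "course_code_count": sum(1 for q in real if COURSE_CODE.match(q)),
        "top_queries": Counter(real).most_common(3),
    }
```

Filtering the placeholder before ranking stops the default text from showing up as the most common search.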
  • A final couple of examples. As part of the TELSTAR project we’ve built a reference management tool into the VLE that we’ve called MyReferences. MyReferences uses SFX to link to resources, and by using this tool for course resources we can track which resources are being used in which courses.
  • The last example is a brand new project that we’ve just started, working with the Knowledge Media Institute at the OU. Project Lucero is about exposing OU data as Linked Data, and we are planning to build a couple of prototypes that will include recommender systems. As a future direction, Linked Data may uncover some interesting possibilities for user activity data.
  • So, to conclude: we want a comprehensive view of the world … if we can get it. The main challenges to be overcome are: a comprehensive set of tools, standards and case studies; acquiring the necessary skills; a commitment that this is the way forward; and a change in culture towards ‘open’ data. The slides will be up on SlideShare shortly. Thank you.

    1. Richard Nurse, The Open University Library. Gaining Business Intelligence from User Activity Data, London, 14 July 2010. Too difficult? Content Management Perspective
    2. "Every day I wake up and ask, 'how can I flow data better, manage data better, analyse data better?'" Rollin Ford, the CIO of Wal-Mart. A special report on managing information: Data, data everywhere. The Economist (London, England), February 27, 2010, page 71
    3. “Look, if you really want to transform health care, you basically build a sort of health-care economy around the data that relate to people” Eric Schmidt, Google. "You would not just think of data as the ‘exhaust’ of providing health services, but rather they become a central asset in trying to figure out how you would improve every aspect of health care. It’s a bit of an inversion." Craig Mundie, Microsoft. A special report on managing information: Data, data everywhere. The Economist (London, England), February 27, 2010, page 71
    4. • What does success look like?
       • A comprehensive solution?
       • Barriers
       • An Open University perspective
    5. What does success look like?
       • for students?
       • for researchers?
       • for institutions?
    6. How can User Activity Data help students?
       • Recruitment
       • What is the right course for me?
       • ‘Read to Learn’
       • ‘Course Compatibility indicator’
       • Find a course based on your interests
       • Loans data, search data, student profile data, open data
    7. How can User Activity Data help students?
       • Progression
       • What else are students on my course reading?
       • What are students on my course searching for?
       • What are students citing?
       • What did students who got high grades on my course read last year?
       • What is on course reading lists in my subject at other institutions?
       • Loans data, search data, reading lists, reference management
    8. How can User Activity Data help students?
       • Post-course
       • What courses have people who have done my course done next?
       • What courses might I be interested in hearing about?
       • What new publications, journal articles have been published in my area of interest?
       • Course data, student profiles, E-resources data, Institutional repository
    9. How can User Activity Data help researchers?
       • What have other researchers in my field written?
       • What resources are being cited in a particular field?
       • E-resources data, search data, Institutional repository, citation data
    10. How can User Activity Data help institutions?
       • Stock management, e-resource decision making
       • Who is using (and not using) resources?
       • Marketing
       • Academics, managers and librarians
       • ‘Collection Development Dashboard’
       • Achievement and retention
       • Service design
       • Loans data, search data, ERM data, E-resources data, search data, Institutional repository, student data
    11. What sort of solution do we want?
       • local, national?
       • a platform we can submit data to?
       • a set of standards, case studies and guidelines?
       • some tools, APIs and applications?
       • some of these? all of these?
       • open source, commercial?
       • EBSM?
    12. What sort of solution? [Diagram labels: User activity data portal, APIs, Web Services, Data standards, Processes, Visualizations, Recommendations, LMS, Link resolver, ERM, VLE, Search systems, E-portfolio, Student registry, Finance, websites, E-resources, IR, Dashboard]
    13. Anything less than a comprehensive view of the world is inadequate
       • No one system has sufficient data in it to uncover new insights
       • an accurate picture?
       • User preferences
       • Customer Relationship Management
    14. Is anything less than a comprehensive view of the world inadequate? “A single version of the truth” [Diagram labels: One representation, reliable, authoritative, Consistent, Assets, Complete, Products, Unique, Customers]
    15. Barriers and challenges – Cultural and Institutional
       • Making the case
       • Data
       • Competition
       • Cost
    16. Barriers and Challenges – Technical
       • Can you extract the data?
       • Do you have the skills and access?
       • Historical data
       • Publishing recommendations
       • Wide range of different systems
    17. Barriers and Challenges – Data processing
       • What steps do you need to take with the data once you have it?
           • Anonymisation
           • De-duplication
           • Removing records that might identify an individual
           • Do you need to change your practices about what data you store in your systems?
           • How do you cope with the large volume of data?
    18. Barriers and Challenges – Standards
       • Are there suitable standards to help with extracting and processing data from a range of different systems?
           • IMS?
           • MOSAIC data spec and case studies
        Barriers and Challenges – Rights
       • Data protection
       • Data ownership
       • Permission from owners/users
    19. An Open University perspective
    20. An Open University perspective
       • Most Open University students never visit the Open University library
       • Most OU students never borrow a book from the OU library
       • Most OU students aren’t on the OU Library Management System
       • So the LMS isn’t very useful for user activity data
    21. An Open University perspective
       • Our print collection is being reduced in size
       • OU students sign up to a module rather than a full-degree course
    22. An Open University perspective
       • However
       • The OU has over 120,000 visits a month to the Moodle VLE
       • 100,000+ visits to the library website a month
       • 170,000 visitors to Student Home each month
       • And a single sign-on solution
    23. An Open University perspective
       • But - learning systems are run by different departments
       • VLE – Learning and Teaching Solutions
       • Student registry – Student Services
       • Library – SFX and search systems
       • Public and student search – Online Services and Academic and Administrative Computer Services
    24. An Open University perspective
       • And – different systems have their own reporting tools
       • VLE bespoke reporting tool
       • Site Intelligence
       • Google Analytics
       • SFX, Crystal reports
    25. Some early steps
       • bX from ExLibris
    26. Some early steps
       • Collecting search data
    27. Some early steps
       • TELSTAR JISC project
       • Integrating reference management into the VLE
       • Using SFX to link to resources
       • Usage data collected by SFX that can identify which resources are being used in which courses
    28. Developments
       • LUCERO
       • JISC-funded project
       • Linking University Content for Education and Research Online
       • Expose Open University content as linked data
       • Will include recommender features
    29. Finally
       • We want a comprehensive view of the world … if we can get it
       • The main challenges to be overcome are:
           • A comprehensive set of tools, standards and case studies
           • Acquiring the necessary skills
           • A commitment that this is the way forward
           • A change in culture towards ‘open’ data
    30. Image Credits: Clevercupcakes, IRRI Images, Paolo Margari, Scorpions and Centaurs, Blprnt_van
    31. Image Credits: Patrick Hoesley, Juliette Culver, MOSAIC final report, Cushing Memorial Library and Archives, CompoundEye, Ian S’ photostream