Usage Statistics & Information Behaviors: Understanding User Behavior with Quantitative Indicators
Presentation at the NISO usage data forum 2007

  • Good morning, and I’d like to thank you all for coming. My name is John McDonald and I am the Assistant Director for User Services & Technology Innovation at the Libraries of the Claremont Colleges. I’m pleased that the forum organizers have invited me to present today. I’m going to talk about the Library’s Dilemma: The Future of Innovative Library Services in the Academic Environment. I’ll be covering a lot here today, but feel free to ask questions at any time.
  • The Library’s Dilemma is a phrase I’ve co-opted from The Innovator’s Dilemma, a book by Clayton Christensen, in which he argues that disruptive technologies help new companies bring about products that can challenge and eventually replace established products and businesses. The dilemma is how an established company innovates, while still producing its current products or services, in order to avoid being replaced by new companies that have no established product to keep selling. This chart is one I’ve adjusted from a presentation by Gary Flake, a Microsoft researcher, who was writing about Internet singularity. He was comparing the offline world to the online world, but I’ve adjusted it to compare the library environment to the Internet environment, relative to academic research. You can see the dichotomy between each of these problematic points. For libraries to innovate, we have huge costs in real dollars, personnel, or space; web applications have no such costs. The tail of the library’s content is limited by our space, our selection criteria and mission statements, and our organization schemes. On the web, the tail is theoretically unlimited, even if it is functionally limited by the available retrieval systems and the abilities of users. In libraries, to innovate we usually need to work harder, since new and better services or systems require more employees or require those employees to develop new skills. On the web, innovation is driven by working smarter: using technological advances to improve the efficiency of the worker or user. To provide quality products and services, libraries typically require high intervention: much manual labor and interaction with our users or our resources to design innovative services. On the web, quality is improved through technology developed outside the information world or through integrating tools and services. And finally, in libraries innovation follows demand.
And quite often that’s great demand, or demand followed by financial support, or the demand is so great that it imposes innovation. On the web, innovation predicts and precedes demand. Innovation is typically driven by single users and can be developed, tested, and then accepted or rejected with few costs to the network. The solution to this dilemma is for the library to move along a continuum on each of these points, from the left to the right. And the challenge is how to do this while still serving our users to the same standard as we have in the past.
  • Starting: activities associated with the initiation of information seeking
    Browsing: scanning for information in areas of interest or near relevant items
    Accessing: the physical and intellectual act of locating and acquiring information
    Chaining: following chains of citations and hyperlinks
    Differentiating: using differences between resources to filter information
    Extracting: systematically identifying and selecting information from a source
    Verifying: using other resources to establish the authenticity of information
    Networking: communicating and interacting with others to build information archives, gather information, and share information
    Monitoring: keeping up to date with information in the area
    Managing: filing, organizing, and storing information for later use or re-use
    Manipulating: re-using and re-purposing data, or in Web 2.0 terms, mash-ups
    Ending: making sure nothing was missed earlier, or finalizing the project
  • One set of methods is to match our information systems, and the metrics they produce, to these behaviors, since we’ve designed these systems based on research, observation, and analysis of our researchers’ needs and expectations.
  • The first question is “What was the impact of a new access tool on resource usage?” Specifically, I wondered what effect SFX had on the usage of journals in our collection. I speculated that the increase in discoverability and accessibility provided by SFX led to our faculty and students using journals more heavily than they had in the past. The null hypothesis for this test is that there is no relationship between the release of the resolver and journal usage.
  • This table shows the results of this analysis. Journals were grouped into 9 broad subject categories, and I analyzed publisher-provided online journal usage for our collection from before we had the resolver (in 2000) to after we had it (2002). The results were significant. The large negative z-scores indicate that usage was higher after SFX was released, and the results were significant for almost all subject areas at the .05 or .01 level. The three disciplines where the increase in usage was not significant were the three with the fewest journals with publisher-provided usage. It is clear that SFX led to more article downloads by our faculty and students.
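The before/after comparison described above can be sketched as a Wilcoxon signed-rank test with a normal approximation, reported as a z-score as in the table on the slide. The per-journal download counts below are invented placeholders, not the actual publisher data.

```python
# Wilcoxon signed-rank test (normal approximation), reported as a z-score.
# A large negative z for (before - after) means usage rose after SFX.

def wilcoxon_z(before, after):
    """z-score for the signed-rank statistic of (before - after) pairs.

    Simplified sketch: assumes no zero differences and no tied
    absolute differences.
    """
    diffs = [b - a for b, a in zip(before, after) if b != a]
    abs_sorted = sorted(abs(d) for d in diffs)
    rank = {v: i + 1 for i, v in enumerate(abs_sorted)}
    w_plus = sum(rank[abs(d)] for d in diffs if d > 0)
    n = len(diffs)
    mu = n * (n + 1) / 4
    sigma = (n * (n + 1) * (2 * n + 1) / 24) ** 0.5
    return (w_plus - mu) / sigma

use_2000 = [120, 45, 300, 8, 76, 210, 33, 150, 90, 60]     # downloads before SFX (placeholder)
use_2002 = [180, 70, 420, 15, 110, 261, 50, 230, 140, 95]  # downloads after SFX (placeholder)

z = wilcoxon_z(use_2000, use_2002)
print(f"z = {z:.2f}")  # negative z: usage higher after SFX
```

In practice a statistics package would handle ties and exact p-values; the point here is only the shape of the test behind the z-scores in the table.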
  • Further extending the theme of the effect of one decision on other resources, I wondered what impact a new access tool might have on the usage of other access tools. Specifically, I wondered what impact grouping resources into quicksets and subject categories might have on the usage of those resources. Metalib is a powerful discovery and access tool, and its metasearch functionality exposes more content to the user and promotes resources that might otherwise have been overlooked.
  • This table shows the statistics for Metalib since its public release. It’s interesting to note that the search counts for WoS, CLAS, and WorldCat are virtually identical. As you know, those are the three databases grouped into the default “Basic Resources” quickset. So it led me to wonder how the use of Metalib affected searching in those individual databases.
  • Here is an illustration of the number of searches in CLAS in three-month blocks, beginning with the month on the x-axis. The blue line is the searches in 2005, while the orange line is 2006. Generally, this shows a drop in total searches in CLAS, perhaps explainable by the Google effect or lower reliance on books by our community. More interestingly, the trendlines indicate that searches in 2006 lagged well behind the 2005 rate until an increase in August and a spike in September. By November, the search rate had returned to previous levels. We released Metalib in September, and this could explain some of the effect.
  • Here is an illustration of the number of searches in WorldCat for 2005 and 2006. The blue line is the searches in 2005, while the orange line is 2006. Generally, this shows an increase in searches for 2006, with a drastic jump starting in September and continuing into October, November, and December. Again, this mirrors the release of Metalib and indicates the positive effect of recommending both CLAS and WorldCat as part of the default Basic Quickset.

    WorldCat searches by month:
    Month  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep   Oct   Nov  Dec
    2006   413  672  701  478  677  497  662  659  1331  2041  979  1398
    2005   207  336  323  503  295  244  422  566  457   533   408  288
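The trendline comparison above can be sketched as an ordinary least-squares slope over the monthly counts given on the slide (Jan through Dec):

```python
# Least-squares slope of monthly WorldCat searches against month number,
# using the counts from the slide. A much steeper 2006 slope reflects the
# jump that begins with Metalib's September release.
searches_2005 = [207, 336, 323, 503, 295, 244, 422, 566, 457, 533, 408, 288]
searches_2006 = [413, 672, 701, 478, 677, 497, 662, 659, 1331, 2041, 979, 1398]

def trend_slope(counts):
    """Slope of the least-squares line fit to (month, count) pairs."""
    n = len(counts)
    xs = range(1, n + 1)
    mx = sum(xs) / n
    my = sum(counts) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, counts))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

print(f"2005 slope: {trend_slope(searches_2005):.1f} searches/month")
print(f"2006 slope: {trend_slope(searches_2006):.1f} searches/month")
```

The 2006 slope comes out several times larger than the 2005 slope, which is the quantitative version of the "drastic jump" visible in the chart.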
  • The final example that I wanted to discuss today is the effect of adding JSTOR on our faculty’s information usage behavior, specifically its effect on citations in Caltech-authored publications. We bought JSTOR in 2002, so I selected articles published in 2000 (in blue) and compared them to those published in 2004 (in orange). Overall, the test of the entire set of journals in the group was not significant, since there were only 51 articles in each sample year, with about 200 citations to JSTOR journals each year. There were a total of 27 journals from JSTOR cited in both samples, and this chart illustrates the ranked values of the citations plus a power-law trend line. This illustration indicates that the most popular journals were positively affected by the provision of JSTOR, while there was no effect on the little-used titles.
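A power-law trend line like the one mentioned above can be fit by linear regression in log-log space. This is a sketch only; the citation counts below are invented placeholders ranked from most- to least-cited journal, not the actual JSTOR citation data.

```python
import math

# Fit citations ~ C * rank^(-alpha) by regressing log(citations) on log(rank).
# Ranked citation counts are invented placeholders (not the JSTOR sample).
citations = [60, 32, 21, 15, 12, 9, 7, 6, 5, 4, 3, 3, 2, 2, 1]

xs = [math.log(r) for r in range(1, len(citations) + 1)]
ys = [math.log(c) for c in citations]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
alpha = -slope  # power-law exponent: larger alpha means usage concentrates in the head
print(f"fitted exponent: {alpha:.2f}")
```

A heavier "head" of the ranked curve after adding JSTOR would show up as a larger fitted exponent, matching the observation that the most popular journals benefited while the tail did not.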
  • Example 2 is the effect that metadata notation has on information usage. In July 2006, we concluded negotiations with Wiley for our site license to their journals for 2006 and 2007. We had access to a shared title list through the SCELC consortium, but our new license was considered outside the consortium by Wiley, and thus our shared-title access ended. We changed the SFX thresholds on July 1st, 2006. Users would no longer be presented with a fulltext link for the 200+ titles that we no longer had 2006 access to, although links to the 1997-2005 material were still presented. Fortunately for us, Wiley never stopped online access to most titles. So this provides an interesting live experiment in how library service decisions affect usage behaviors, particularly through the previously mentioned access and discovery tools. I would expect, from the prior research, that reducing the active presentation of links to 2006 material reduced the number of downloads of Wiley articles and the amount of use recorded through SFX.
  • This chart shows our usage of Wiley journals in 2005 and 2006, as reported by Wiley. It is limited to just the SCELC shared list that we no longer have a valid subscription for. It’s a little hard to see, but the blue lines are the 2005 usage and a linear trendline. Compare that with the 2006 usage, in orange. You’ll notice a huge spike in 2006 usage in July. That’s contrary to our theory that use should have dropped. This was because Melody individually checked access to each journal to verify that Wiley was not reducing our valid access. More important is the trend: while usage does trend down, it should be dropping faster, and 2006 should be less than 11% above 2005 (we expect usage to increase about 11% per year as additional years of online material are added to the database). Therefore, our lack of access to 2006 material should have affected our overall stats. It didn’t, because Wiley didn’t really reduce our access.
  • This is just 2006, with blue being the SCELC titles and orange our own titles. Here we see a similar trend, but without the spike in July (since Melody wasn’t using SFX to get to the PDFs). We see a general downward trend in the clickthrough rate. I used the clickthrough rate because it signifies a user clicking from the SFX menu to the fulltext. We would expect this to go down, since there were fewer fulltext presentations once 2006 wasn’t being offered. And indeed, the rate dropped in July and never returned to previous levels. This indicates that the lack of fulltext presentation affected SFX users and their usage of these titles. Contrast this with the previous slide, where we found that Wiley-reported usage didn’t suffer to the extent we thought, and a clearer picture emerges: users are still getting articles, even when we don’t indicate ownership.
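The clickthrough rate used here is simply fulltext clicks divided by SFX menu presentations. A minimal sketch, with invented monthly counts (not the actual SFX logs):

```python
# Clickthrough rate = fulltext clicks / SFX menu presentations.
# Monthly counts are invented placeholders illustrating a July drop.
presentations = {"May": 1200, "Jun": 1150, "Jul": 900, "Aug": 880}
clickthroughs = {"May": 540, "Jun": 506, "Jul": 288, "Aug": 273}

rates = {m: clickthroughs[m] / presentations[m] for m in presentations}
for month, rate in rates.items():
    print(f"{month}: {rate:.1%}")
```

A sustained fall in this ratio after a threshold change, as in the slide, signals that fewer menu visits ended in a fulltext click once the links were withdrawn.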
  • Any questions, thoughts, or reasons to call me crazy?

Presentation Transcript

    • Usage Statistics
    • &
    • Information Behaviors:
    • Understanding User Behavior with Quantitative Indicators
    • John McDonald
    • Assistant Director for User Services & Technology Innovation
    • The Libraries of the Claremont Colleges
    • November 2, 2007
    • NISO Usage Data Forum
  • Correlation: Boba Fett and Ladybugs
  • We have the data, now what do we do?
    • What we have done:
    • Cancel journals
    • Inform purchase decisions
    • What we should do:
    • Understand usage behaviors
    • Guide our decision making processes
    • Understand our impact on our patrons
  • Information Usage Behaviors
    • Starting
    • Browsing
    • Accessing
    • Chaining
    • Differentiating
    • Extracting
        • Ellis (1993); Ellis & Haugan (1997); Meho & Tibbo (2003); McDonald (2007)
    • Verifying
    • Networking
    • Monitoring
    • Managing
    • Manipulating
    • Teaching
    • Ending
  • [Slide diagram mapping tools to behaviors: Accessing; Managing & Ending; Chaining & Differentiating; Accessing & Browsing]
  • How do we observe & measure?
      • Pose a Question
          • How will a new service affect our users?
      • Develop a Theory
          • Explain what you think happened.
      • Test the Theory
          • Develop metrics, collect data, analyze.
  • Example 1: Starting & Accessing
      • Question : How will a new service affect our users?
      • Theory: If we improve the user’s ability to identify relevant material (starting) and retrieve it (accessing), we either save them time or effort and allow them to access more material.
      • Test: There will be a significant increase in the usage of material.
  • Starting & Accessing: Use Before & After OpenURL
    (Wilcoxon signed-rank test; *significant at .05 level, **significant at .01 level; publisher-reported use, mean (SD))

    Subjects      Journals   Use 2000         Use 2001         Use 2002         z        P>z
    Astronomy     1          347 (0)          813 (0)          1408 (0)         -1.00    0.32
    Biology       104        638 (1625)       847 (2079)       957 (2351)       -5.88    0.00**
    Chemistry     42         1388 (3248)      1553 (3889)      2542 (7294)      -4.85    0.00**
    Comp. Sci.    14         197 (429)        224 (490)        175 (239)        -1.63    0.10
    Engineering   20         92 (200)         164 (310)        174 (312)        -2.41    0.02*
    Gen. Sci.     3          16243 (15571)    20938 (20345)    26553 (26506)    -1.39    0.17
    Geology       22         46 (183)         44 (143)         144 (374)        -3.10    0.00**
    Mathematics   29         59 (155)         80 (153)         121 (182)        -3.68    0.00**
    Physics       28         198 (313)       1081 (2107)       1526 (2933)      -4.00    0.00**
    Total         263        701 (2730)       975 (3527)       1301 (4953)      -10.39   0.00**
  • Example 2: Differentiating
      • Question : Do our choices affect our users ability to differentiate between resources?
      • Theory: If we group resources together, we allow users to identify relevant resources and provide efficient methods to differentiate between resources.
      • Test: There will be a significant increase in searches across common resource groupings.
  • Differentiating: Federated Search Statistics

    Database                   Searches
    Web of Science             3823
    OPAC                       3314
    WorldCat                   3267
    PubMed                     238
    INSPEC                     233
    MathSciNet                 183
    Faculty of 1000 Biology    176
    Compendex                  132
  • Differentiating: OPAC Searches (2005 v. 2006)
  • Differentiating: WorldCat Searches
  • Example 3: Chaining
      • Question : Do our users move from one information resource to another?
      • Theory: If users are moving from resource to resource, usage of resources in the same environment (one provider) and results of that usage (citations) will increase.
      • Test: There will be a significant increase in the usage and/or results of usage of a resource’s material.
  • Chaining: JSTOR Citations (2000 v. 2004)
  • Example 4: Managing, Teaching
      • Question : Are our users managing or utilizing content differently?
      • Theory: A stable online archive allows users to re-access or re-use content more efficiently (utility usage or virtual vertical file), or utilize it for instructional purposes in different ways (virtual syllabus).
      • Test: There will be a significant increase in the systematic re-use of current, locally produced content.
  • Managing, Teaching: Use of local content
  • Example 5: Service Effects
      • Question : How do our choices in libraries affect user behavior?
      • Theory : When we change the display options (e.g. cataloging) for journals, did that affect either publisher usage or SFX usage?
      • Test : Changing cataloging results in decreased local journal usage as measured by the publisher and SFX.
  • Service Effects: Usage of Journals (2005 v. 2006)
  • Service Effects: SFX Clickthrough Rate (Local v. Shared)
  • Example 6: Services-Related Behaviors
    • What else do users want or need?
      • Are there services related behaviors that we can observe? Providing content is one option, but how are researchers using associated information services?
      • If we provide them the article they want in fulltext, we see that sometimes they ask for other types of things.
      • Can we match these things to those user behaviors?
  • Services Related Behaviors

    Information Service                    Requests
    Order Article via Document Delivery    951
    See References for this Article        790
    Search Library Catalog                 580
    Read Abstract                          283
    Search Article Title on the Web        170
    Send Feedback to Library               15
    See Articles citing this Article       11
  • What else could we be studying?
    • Monitoring
      • Many information providers have e-alerts, repeat saved searches, etc.
    • Networking
      • Users may want to email a citation to a colleague or another student.
    • Extracting
      • Passing the bibliographic information to another database to search.
    • Analyzing
      • Including user behavior information in the statistical measurement tools.
  • Questions? John McDonald November 2, 2007