CARLI Usage Stats Keynote 20130325

20130325 mcdonald price carli usage statistics final McDonald and Price


1. 1. What's the Use: A Symposium on Usage Statistics. John McDonald, CIO & AVP, and Jason Price, PhD, Interim Library Director, Claremont Colleges Library. March 25, 2013. CARLI Electronic Resources and Collections Working Groups
2. 2. Overview: a Keynote in three parts 1. Broad perspective: Where are we now? 2. Detailed perspective: Addressing the challenges of usage statistics 3. The latest: our present & future projects
3. 3. Where are we now?
4. 4. The Promise & The Peril
5. 5. How many people do you need in a room before it is highly likely that two share a birthday? a) Less than 30 b) 30 – 60 c) More than 60 d) 367
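The birthday question above can be checked numerically. The following is a minimal sketch, assuming 365 equally likely birthdays and no leap day; the function names and the thresholds are illustrative and are not from the slides. What counts as "highly likely" is a choice, so the threshold is a parameter.

```python
def shared_birthday_probability(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday."""
    if n > days:
        return 1.0  # pigeonhole: more people than possible birthdays
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def smallest_group(threshold: float, days: int = 365) -> int:
    """Smallest group size whose shared-birthday probability reaches threshold."""
    n = 1
    while shared_birthday_probability(n, days) < threshold:
        n += 1
    return n


if __name__ == "__main__":
    for threshold in (0.5, 0.9, 0.99):
        print(f"{threshold:.0%}: {smallest_group(threshold)} people")
```

Certainty only arrives once the group is larger than the number of possible birthdays, which is why 367 appears among the options.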
6. 6. Which treatment for kidney stones is more successful?

    | | Treatment A | Treatment B |
    |---|---|---|
    | Small stones | Group 1: 93% (81/87) | Group 2: 87% (234/270) |
    | Large stones | Group 3: 73% (192/263) | Group 4: 69% (55/80) |
    | Success rates (overall) | 78% (273/350) | 83% (289/350) |

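The kidney stone figures above show a reversal when groups are pooled: Treatment A does better within each stone size, yet Treatment B does better overall. A minimal sketch that recomputes the rates from the counts on the slide (the variable and function names are illustrative):

```python
from fractions import Fraction

# (successes, patients) per treatment and stone size, as given on the slide.
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes: int, patients: int) -> Fraction:
    """Exact success rate for one group."""
    return Fraction(successes, patients)


def overall(treatment: str) -> Fraction:
    """Success rate after pooling both stone sizes for one treatment."""
    groups = list(results[treatment].values())
    return rate(sum(s for s, _ in groups), sum(n for _, n in groups))


if __name__ == "__main__":
    for size in ("small", "large"):
        a = rate(*results["A"][size])
        b = rate(*results["B"][size])
        print(f"{size} stones: A {float(a):.0%} vs B {float(b):.0%}")
    print(f"overall: A {float(overall('A')):.0%} vs B {float(overall('B')):.0%}")
```

The reversal comes from the group sizes: Treatment A was applied mostly to large stones (263 of 350 patients), where success is lower for both treatments.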
7. 7. So what? What will the data tell us…
8. 8. • Harvesting the Crop: Implementing a Usage Statistics Management System at Georgia State • Social Media, ROI and Cookie Day • How Do E-Resources Contribute to Teaching and Learning? Findings from the Lib-Value Project • Using Data Visualization Tools for Collection Analysis • To Keep, or Not to Keep: The Effect of Discovery Tools on Licensed Resources • Everything That's Wrong with E-book Statistics - A Comparison of E-book Packages • Discovery & Usage: The Foundation of a Powerful Collection
9. 9. Standards
10. 10. COUNTER at 10
11. 11. Progress • Commonly agreed upon measures • Routine methods of transmission • Regular formatting of files • Standard dates for delivery • Audits of reports • Certification of compliant vendors • Established process for refinement
12. 12. Still evolving • Comprehensive coverage of publishers • Sophistication on Ebooks & Databases • Automation • Further granularity • Measures for non-text usage (article parts) • Article level metrics
13. 13. Usage in Practice
14. 14. Usage Statistics informing decisions about acquisitions
15. 15. Cost Projections - GVSU

    | | # of Ebooks Purchased | Total \$ of Ebooks not Purchased | Additional STL Costs | Total Savings over Existing Plan |
    |---|---|---|---|---|
    | Purchase on 4th Loan | 89 | \$17,382.31 | \$3,327.20 | \$14,055.11 |
    | Purchase on 5th Loan | 58 | \$24,512.55 | \$4,621.09 | \$19,891.46 |
    | Purchase on 6th Loan | 34 | \$25,722.11 | \$5,041.64 | \$20,680.47 |
    | Purchase on 7th Loan | 22 | \$26,899.83 | \$5,324.84 | \$21,579.99 |

    Doug Way and Julie Garrison, “Financial Implications of Demand-Driven Acquisition,” in David Swords (ed.) Patron-Driven Acquisitions: History and Best Practices. (Berlin: De Gruyter Saur, 2011), p. 148.

16. 16. Usage Statistics informing decisions about print collection management
17. 17. DU Storage study. Levine-Clark, Michael, “Analyzing and Describing Collections Use: Strategies for Managing a Library Move,” LYRASIS Ideas and Insights, Webinar, May 4, 2012. http://www.slideshare.net/MichaelLevineClark/
18. 18. Usage Statistics informing decisions about shared print projects
19. 19. Each “Title-Holding” has different characteristics

    | | Dominguez Hills | Fullerton | Long Beach | Los Angeles | Northridge | Pomona |
    |---|---|---|---|---|---|---|
    | Total Circulations | 0 circs | 19 circs | 16 circs | 12 circs | 13 circs | 8 circs |
    | Last Circulation Date | -none- | 11/30/11 | 12/16/08 | 5/30/07 | 4/27/07 | 3/11/08 |
    | Date added to Collection | 6/27/02 | 4/23/02 | 9/21/01 | 5/03/00 | 11/11/02 | 8/11/00 |

    Sustainable Collections Services, Maine Shared Collections Strategy Planning Meeting, http://www.slideshare.net/Maine_SharedCollections/mscs-scs-planning-meeting-rick-lugg-andy-breeding

20. 20. Sample Pilot Group - Title-Holdings by Holdings Level [Stacked bar chart, y-axis 0 to 2,000,000 title-holdings; x-axis: # of Pilot Group Libraries Holding Title (1, 2, 3-6); series: 0 circs, 1-3 circs, 4+ circs; data labels: 779,756; 305,438; 539,718; 257,739; 311,240; 220,071; 560,107; 362,050; 239,202]
21. 21. Resource Sharing: CAMINO Collections [Stacked bar chart, x-axis 0 to 1,200,000 books; libraries: CUC, LMU, Oxy, Pep, UOP, CST, Wstmt, CalArts, CBU, Dom, WJU, WUHS, AJU, HNU; series: Books held only by library, Books held by BOTH library and the rest of Camino, Books held only by the rest of Camino]
22. 22. Usage Statistics informing decisions about print & online resources
23. 23. Holy grail: Understanding User Behavior
26. 26. Tracking Impact Beyond Articles
27. 27. http://www.zazzle.com/statistics_means_never_having_to_say_youre_certai_tshirt-235669028746970031 March 25, 2013
28. 28. Part 2: Addressing the Challenges of Usage Stats 1. Comparability • Package price per use • Defining the appropriate range(s) of cost per use • Practical applications 2. Reliability • Impact of mobile, discovery & harvesters 3. Prediction • Demand Driven Acquisition • Number of books available <> Size of budget 4. Context – Data about our data
29. 29. Apples and oranges are both round(ish)…
30. 30. Challenge 1: Comparing Package Price Per View. Cross-package comparison:

    | pkgID | Total Use | SubsCost | UnSubsCost | Overall PPV |
    |---|---|---|---|---|
    | 1 | 140048 | \$1,652,000 | \$182,000 | \$13.10 |
    | 2 | 20341 | \$333,000 | \$10,000 | \$16.86 |
    | 3 | 13572 | \$282,000 | \$21,000 | \$22.33 |

    So Pkg 1 is a better value than Pkg 3? It might not be…

31. 31. html to pdf ratios vary widely for these packages [Bar chart: html views and pdf downloads (# of views, 0 to 50000) for Packages 1, 2 and 3; data labels: 48047, 32688, 13004, 4066, 352, 568; ratio labels: 1:1.3, 1:23, 1:12] How many pdfs in Pkg 1 are duplicates of html views? (fmi: see Davis & Price, 2006, JASIST 57(9))
32. 32. Getting a pdf from Package 1… ‘Get article’ links directly to the html version… then the user downloads the pdf… …2 uses are recorded for 1 pdf
33. 33. Total full text views suffer from duplication issues
34. 34. Package value revisited

    | pkgID | Total Use | SubsCost | UnSubsCost | Overall PPV |
    |---|---|---|---|---|
    | 1 | 140048 | \$1,652,000 | \$182,000 | \$13.10 |
    | 2 | 20341 | \$333,000 | \$10,000 | \$16.86 |
    | 3 | 13572 | \$282,000 | \$21,000 | \$22.33 |

    vs. pdf requests only, which tell a different story!

    | pkgID | Est. pdf Use | SubsCost | UnSubsCost | Overall PPP |
    |---|---|---|---|---|
    | 1 | 83469 | \$1,652,000 | \$182,000 | \$21.97 |
    | 2 | 18734 | \$333,000 | \$10,000 | \$18.31 |
    | 3 | 13287 | \$282,000 | \$21,000 | \$22.80 |

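The two sets of package figures above can be recomputed from the slide's own numbers. This is a minimal sketch, assuming that overall price per view (PPV) and price per pdf (PPP) are both total package cost (subscribed plus unsubscribed) divided by the relevant use count, which is consistent with the figures shown; the field and function names are illustrative.

```python
# Package figures as given on the slide.
packages = [
    {"pkg": 1, "total_use": 140048, "est_pdf_use": 83469,
     "subs_cost": 1_652_000, "unsubs_cost": 182_000},
    {"pkg": 2, "total_use": 20341, "est_pdf_use": 18734,
     "subs_cost": 333_000, "unsubs_cost": 10_000},
    {"pkg": 3, "total_use": 13572, "est_pdf_use": 13287,
     "subs_cost": 282_000, "unsubs_cost": 21_000},
]


def price_per_use(package: dict, use_field: str) -> float:
    """Total package cost divided by the chosen use count."""
    total_cost = package["subs_cost"] + package["unsubs_cost"]
    return total_cost / package[use_field]


def rank_by_value(use_field: str) -> list[int]:
    """Package IDs ordered from lowest to highest price per use."""
    ordered = sorted(packages, key=lambda p: price_per_use(p, use_field))
    return [p["pkg"] for p in ordered]


if __name__ == "__main__":
    for p in packages:
        ppv = price_per_use(p, "total_use")
        ppp = price_per_use(p, "est_pdf_use")
        print(f"Pkg {p['pkg']}: PPV ${ppv:.2f}, PPP ${ppp:.2f}")
    print("Rank by total views:", rank_by_value("total_use"))
    print("Rank by pdf requests:", rank_by_value("est_pdf_use"))
```

Ranking on both measures side by side makes the change in Package 1's position visible without relying on either number alone.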
35. 35. Addressing Challenge 1: Comparing Package Price Per View. When comparing packages, both total views and PDF downloads should be compared. Extension of principle: Journal Report 1b [Diagram labels: JR 1a, JR 1b, ARCHIVE, FRONTFILE]
36. 36. Challenge 2: Defining acceptable range(s) of cost per use • Among packages • Within packages
37. 37. Reality Check: Should we expect cost per use to be equivalent among packages? • Content Quality • Business Model: For Profit vs Cost Recovery • Exposure in Discovery tools • Title list accuracy • Backfile access (ASSUMPTIONS)
38. 38. Reality! Acceptable CPU range? a) 0-\$6 b) 0-\$12 c) 0-\$24 d) 0-\$50 e) It depends on _________ f) Can’t say / Don’t know
39. 39. Consortial Benchmarks: SCELC Package W, Overall Price per Use [Bar chart: price per full text article view (\$0.00 to \$50.00) for consortium members 1 to 24, sorted by decreasing spend; some members marked “Use data not available”]
40. 40. Consortial Benchmarks: SCELC Package W, Overall Price per Use [Bar chart: price per full text article view (\$0.00 to \$50.00) for consortium members 1 to 24, sorted by decreasing spend; some members marked “Use data not available”]
41. 41. Consortial Benchmarks: SCELC Package W, Overall Price per Use [Bar chart: price per full text article view (\$0.00 to \$50.00) for consortium members 1 to 24, sorted by decreasing spend; some members marked “Use data not available”]
42. 42. Consortial Benchmarks: SCELC Package W, Overall Price per Use [Bar chart: price per full text article view (\$0.00 to \$50.00) for consortium members 1 to 24, sorted by decreasing spend; some members marked “Use data not available”]
43. 43. Subscribed titles w/in a pkg - Apples to apples? [Chart: Price Per Use (PPU), \$0 to \$1,200, by Subscribed Title # (0 to 50, ordered by PPU); annotated titles: \$4300/year, 11 uses; \$8800/year, 33 uses; \$3400/year, 17 uses]
44. 44. Strict title level cost per view is misleading [Bar chart: Cost per view by access type, \$0 to \$35; access types: All Titles [n=537], Subscribed Titles [n=192], Unsubscribed (Leased) Titles [n=345]; data labels: \$30.51, \$13.41, \$0.81]
45. 45. Best practices for usage comparison tasks 1. Goal - Identify pricing inequity a. Best accomplished by consortial benchmarking b. Requires readily available package level cost per use across consortial participants c. Leverage COUNTER consortial reports and economy of scale of consortial specialist 2. Goal - Identify lower value packages a. Use both total views and pdf download comparisons 3. Goal - Identify lower value titles a. Only after targeting specific lower value packages b. Recognize that by title price per use comparison is only valid within a package
46. 46. Challenge 2: The convenience/reliability trade off

    | Case | Search inflation impact | Full text inflation impact |
    |---|---|---|
    | Direct from Google IP access | Low? [Cost is granularity of usage stats] | Low? |
    | Mobile devices | Unlikely to be significant | Low to None |
    | Federated search engines (built into some discovery tools) automate searches | Significant, but COUNTER rules require separate reporting & number of searches has always had dubious meaning | None |
    | Harvesters (e.g. Quosa) automate article downloads from search results | Same as federated search | Potentially very high |

    COUNTER R4: Search activity generated by federated search engines and other automated search agents should be included in separate “Searches_federated and automated” counts …and are NOT to be included in the “Regular Searches” counts.

47. 47. Harvesters (like Quosa): the real threat?
48. 48. Usage factor may address the harvester challenge
49. 49. Challenge 3: Prediction – Coming soon! Observations: • Libraries prefer predictability over savings! • Title level journal usage is remarkably predictable year on year • Usage driven purchasing is ripe for modelling based on this predictability
50. 50. Example: Demand driven ebook forecasting. Estimated List Size -OR- Estimated Annual Expenditure = List Size × (% visible list purchased × mean book price) + (% visible list w/ STL × mean cost per STL × mean STL per title)
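The forecasting relationship above can be turned into a small calculator. This is a sketch only, assuming that list size is meant to scale both the purchase term and the short-term loan (STL) term, so that the bracketed sum is an expected cost per visible title. All function names, parameter names and input values below are hypothetical and are not taken from the slides.

```python
def cost_per_listed_title(pct_purchased: float, mean_book_price: float,
                          pct_with_stl: float, mean_cost_per_stl: float,
                          mean_stl_per_title: float) -> float:
    """Expected annual spend per visible title (purchases plus STLs)."""
    purchase_cost = pct_purchased * mean_book_price
    stl_cost = pct_with_stl * mean_cost_per_stl * mean_stl_per_title
    return purchase_cost + stl_cost


def estimated_annual_expenditure(list_size: int, **rates: float) -> float:
    """Forecast spend for a visible list of the given size."""
    return list_size * cost_per_listed_title(**rates)


def estimated_list_size(annual_budget: float, **rates: float) -> int:
    """Largest visible list the budget is expected to support."""
    return int(annual_budget // cost_per_listed_title(**rates))


if __name__ == "__main__":
    # Hypothetical rates, for illustration only.
    rates = dict(pct_purchased=0.02, mean_book_price=100.0,
                 pct_with_stl=0.10, mean_cost_per_stl=10.0,
                 mean_stl_per_title=1.5)
    print(estimated_annual_expenditure(10_000, **rates))
    print(estimated_list_size(35_000, **rates))
```

Running the same rates in both directions reflects the "-OR-" on the slide: fix the list and forecast the spend, or fix the budget and size the list.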
51. 51. Challenge 4 – Context = metadata! • We do need good data about our data • Data quality is more than just accuracy • Retrospective studies require history! (Circulation Statistics; Dates of profile changes; Cross library comparisons) • In an ideal world we’d share datasets with rich metadata • Library science is far from this ideal world • An example of the power of good retrospective data…
52. 52. Total Books & Usage

    | Library | Model | User-Selected | Pre-Selected | Usage by Download | Usage Read Online |
    |---|---|---|---|---|---|
    | A | MIX | 1131 | 552 | 6773 | 9888 |
    | B | MIX | 5246 | 2612 | 42880 | 38329 |
    | C | USER | 2198 | 102 | 0 | 11801 |
    | D | USER | 3010 | 48 | 697 | 15126 |
    | E | MIX | 4159 | 909 | 17396 | 25604 |
    | F | PRE | 0 | 1451 | 4905 | 3082 |
    | G | PRE | 31 | 2154 | 7001 | 4459 |
    | H | USER | 801 | 0 | 556 | 415 |
    | I | MIX | 305 | 336 | 3334 | 2568 |
    | J | USER | 2799 | 53 | 5 | 13349 |
    | K | MIX | 147 | 276 | 2436 | 2283 |
    | TOTAL | | 19,831 | 8,496 | 85,983 | 126,904 |

53. 53. Total Books & Usage [same table as slide 52]
54. 54. Total Books & Usage [same table as slide 52]
55. 55. Total Books & Usage [same table as slide 52]
56. 56. Librarian Acquired
57. 57. Data required • Book purchase date • Book purchase type • Many years of use • Different types of use • Library purchasing profile • Library list profile (what content was excluded) • Individual user IDs (anonymized) • Came from 4 files per library with a total of 69 data elements… • We found one vendor that has invested in library-facing reports with the level of data needed; there are few others… • Addressing the challenge: a consortial solution?
58. 58. Part 3: Our present & future 1. Improving usage stats collection a. (External) Consortial PaperStats b. (Internal) Dublin Six AUDITOR 2. Improving usage stats visualization a. Excel Conditional formatting b. Splunk for Dashboard Creation… 3. Better database metrics 4. Improving on Journal number comparisons 5. Usage Factor for Journal Evaluation
59. 59. Consortia: Enhancements • Track stats for each member • Automatic import of consortia stats
60. 60. SCELC PaperStats by the numbers • Total number of full text downloads tracked for SCELC: 312,908,657 • Total COUNTER reports downloaded: 2000+ • Total number of logins: 387 • Number of month records: 20.3M • Earliest year covered: 2003 • Total number of reports being harvested: 15 • Total number of institutions covered: 95 • Total number of participants: 14
61. 61. Better technology
62. 62. Click through to article and user level detail!!!
63. 63. Visualization (Excel conditional formatting)
64. 64. Visualization: Splunk
65. 65. Splunk for dashboard visualization
66. 66. Better database metrics (beyond searches & sessions)
67. 67. # of Online Journal Subscriptions: meaningful? [Line chart, 2004 to 2009, y-axis 0 to 50000: Claremont Colleges compared with lines labeled 1st Quartile, Median and 2nd Quartile]
68. 68. Beyond numbers of journals & total usage • Knowledge base & Usage statistics comparisons • Selected group of peers with same knowledgebase & stats consolidation vendor • Run comparisons in Access & Excel
69. 69. Usage Factor Formula: Usage Factor = (Total usage over period ‘x’ of articles published during period ‘y’) ÷ (Total articles published during period ‘y’)
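The Usage Factor formula above can be sketched in a few lines. The record layout (a publication year plus yearly usage counts per article) and the sample data are assumptions made for illustration; they are not from the slides or from any particular vendor report.

```python
def usage_factor(articles: list[dict], publication_period: set[int],
                 usage_period: set[int]) -> float:
    """Mean usage during usage_period of articles published in publication_period."""
    in_scope = [a for a in articles if a["published"] in publication_period]
    if not in_scope:
        raise ValueError("no articles published in the publication period")
    total_usage = sum(
        count
        for article in in_scope
        for year, count in article["usage_by_year"].items()
        if year in usage_period
    )
    return total_usage / len(in_scope)


if __name__ == "__main__":
    # Hypothetical article records, for illustration only.
    sample = [
        {"published": 2011, "usage_by_year": {2011: 40, 2012: 60}},
        {"published": 2012, "usage_by_year": {2012: 20}},
        {"published": 2009, "usage_by_year": {2011: 500}},  # outside period 'y'
    ]
    print(usage_factor(sample, publication_period={2011, 2012},
                       usage_period={2011, 2012}))
```

Articles published outside period ‘y’ are excluded from both the numerator and the denominator, so heavy use of older content does not raise the result.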
70. 70. Impact and usage factor ranks are not related
71. 71. [Chart: journal ranks, axis (lower)-->RANK-->(higher), 0 to 140; series: IF_rank, UF_rank(All), UF_rank(All)--not ISI rated; additional label: “num art?”]