# Meeting the Challenge / NISO update

Oliver Pesch presentation on SUSHI and IOTA projects, part of the LITA/ALCTS Electronic Resources Management Interest Group meeting held at ALA Midwinter in Seattle, WA, on January 27, 2013

• Let's run through a quick example. This table shows the core elements for an article link. For the simplicity of this example, we will assume all elements are equally important, so each gets a weight of 1, and a perfect OpenURL will get the maximum score of 8.
• Now let's look at some OpenURL elements. In this OpenURL we have Date, so we add one point; ISSN, add another point; Volume, another point; Issue, another; and Article Title, another point. The result is a total of 5 points. The calculation is the sum of the weights for this OpenURL divided by the total of all weights, which is 5 divided by 8, or .625.
• We needed a better way of determining the element weights, so we sought help from Phil Davis, a researcher with some experience in statistical modeling. Phil’s suggestion was to perform stepwise regression to see the effect of individual elements on a sample of OpenURLs, and that is what we did. We started with a set of “perfect” OpenURLs: ones that not only included all core data elements, but that also resolved to match a full text target on both LinkSource and 360 Link. We used a set of 1,500. We then ran several series of tests in which we ran the OpenURLs through the link resolver with a different element removed for each test series. We recorded the success (or rather failure) rates associated with each element. The elements with the higher failure rates are more important to the success of the OpenURL than the ones with lower failure rates. We then used the failure rates as a basis for weights. Then we used the weights and re-ran our 15,000-OpenURL sample test.
• So how’d it turn out? Again, here are numbers for LinkSource. You can see Volume was a key element, with 74% of OpenURLs failing when it was removed. Author last name was not very important, with a failure rate of less than a tenth of a percent. Date was surprisingly low too. This could be for a few reasons: the level of forgiveness in the holdings matching logic (e.g. treating no date as “any date”), and the ability of the link resolver to discover the date by looking up the article citation in the knowledge base using volume/issue/start page, coupled with the fact that a lot of full text providers don’t use date explicitly in the outbound links.
• We created article weights. Rather than use raw failure rates, we used logarithmic values of the failure rates, expressed as the number of failures per 10,000.
• Then we ran our 15,000-record sample again. You can see from the graph that the average completeness score and average success score for the OpenURL providers align very closely, and the correlation coefficient of these two values across all 15,000 test OpenURLs is .80, which indicates a strong correlation. Good news for the test. This tells us that the Completeness Index can be used as a predictor of OpenURL success from a particular content provider: a low Completeness Index is a good indicator that there is a problem.
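
The equal-weight example in the notes above can be sketched in a few lines of Python. This is only an illustration, not IOTA's implementation: the function name `completeness_score`, the lowercase element keys, and the use of `urllib.parse.parse_qs` to read the query string are assumptions made for the sketch. The sample OpenURL and the eight core elements are the ones shown on the slides.

```python
from urllib.parse import parse_qs

# Core article-level elements from the example table (keys lowercased here
# for matching; this naming is an assumption of the sketch).
CORE_ELEMENTS = ["atitle", "aulast", "date", "issn", "issue", "spage", "title", "volume"]


def completeness_score(query, weights):
    """Sum of the weights of the elements present, divided by the sum of all weights."""
    params = parse_qs(query.lstrip("?"))  # blank values are dropped by default
    present = {key.lower() for key in params}
    earned = sum(weight for element, weight in weights.items() if element in present)
    return earned / sum(weights.values())


# Equal weights of 1, as in the simple example.
equal_weights = {element: 1.0 for element in CORE_ELEMENTS}

# Sample OpenURL data from the slide.
sample = "?date=2/4/2008&issn=1083-3013&volume=13&issue=20&atitle=the+casualties+of+war"

score = completeness_score(sample, equal_weights)
print(score)  # 0.625 (5 of 8 elements present)
```

Passing a different `weights` dictionary is all that is needed to move from the equal-weight example to weighted scoring.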

1. Meeting the Challenge: Successful Electronic Resources Management in the Absence of a Perfect System. NISO Update on IOTA and SUSHI. Oliver Pesch, Chief Strategist, E-Resources, EBSCO Information Services
2. Overview
   • SUSHI
   • IOTA
   • Other NISO Initiatives
3. SUSHI: Standardized Usage Statistics Harvesting Initiative
4. SUSHI: WHAT IT IS…
   • An ANSI/NISO Standard (NISO Z39.93-2007)
   • Defines automated request and response model for harvesting e-resource usage data
   • Designed to work with COUNTER, the most frequently retrieved usage reports
5. SUSHI: HOW TO USE IT…
   • Works behind the scenes
   • It is a client-server technology used by usage consolidation solutions (e.g. ERM systems) and content providers
   • Content providers develop a SUSHI Server to deliver COUNTER statistics
   • Usage consolidation solutions include a SUSHI client to automatically retrieve usage on a scheduled basis or on demand
6. SUSHI: WHY YOU SHOULD USE IT…
   • It replaces the time-consuming user-mediated collection of usage data reports
   • The protocol is generalized and extensible, meaning it can be used to retrieve a variety of usage reports
7. SUSHI: CURRENT STATUS…
   • Many resources available on SUSHI web site: http://www.niso.org/workrooms/sushi
   • 40+ content providers support SUSHI (SUSHI Server Registry: https://sites.google.com/site/sushiserverregistry)
   • Works with all COUNTER reports
   • Ready for COUNTER Release 4
   • SUSHI support is an enforced requirement for COUNTER compliance with Release 4
8. SUSHI: THE COMMITTEE…
   • Bob McQuillan, Innovative Interfaces Inc. (Co-chair)
   • Oliver Pesch, EBSCO Information Services (Co-chair)
   • Marie Kennedy, Loyola Marymount University
   • Chan Li, California Digital Library
   • John Milligan, ScholarlyIQ
   • Paul Needham, Cranfield University
   • James Van Mil, University of Cincinnati Libraries
9. SUSHI: CURRENT ACTIVITIES…
   ◦ Continued education and awareness
   ◦ Renovating the web site
   ◦ Exploring “SUSHI Lite”, a protocol that would be based on JSON
10. IOTA: Improving OpenURL Through Analytics
11. IOTA: WHAT IS IT…
    ◦ A working group focused on OpenURL quality
    ◦ Using analytics to provide a quantitative measure of quality of OpenURLs provided by “Sources”
    ◦ Created the Completeness Index as a measure of quality
    ◦ Developed an interactive online tool to provide analysis and reporting on real OpenURL log files
    ◦ Producing a Technical Report and Recommended Practice related to OpenURL quality
12. IOTA: COMPLETENESS INDEX…
    • Based on premise that the success of a link can be affected by the data provided in the OpenURL
    • Identify the required metadata elements
    • Determine a “weight” for each element to reflect importance
    • Score an OpenURL by adding weights for all elements provided divided by the total if all elements appeared
13. IOTA: Simple example assuming equal element weights

    | Element | Description | Weight | This OpenURL |
    |---|---|---|---|
    | ATitle | Article title | 1 | |
    | AuLast | Author’s last name | 1 | |
    | Date | Date of publication | 1 | |
    | ISSN | ISSN | 1 | |
    | Issue | Issue number | 1 | |
    | SPage | Start page | 1 | |
    | Title | Journal Title | 1 | |
    | Volume | Volume number | 1 | |
    | TOTAL | | 8 | |
14. IOTA: Simple example assuming equal element weights

    SAMPLE OPEN URL DATA: `?date=2/4/2008&issn=1083-3013&volume=13&issue=20&atitle=the+casualties+of+war`

    | Element | Description | Weight | This OpenURL |
    |---|---|---|---|
    | ATitle | Article title | 1 | 1 |
    | AuLast | Author’s last name | 1 | |
    | Date | Date of publication | 1 | 1 |
    | ISSN | ISSN | 1 | 1 |
    | Issue | Issue number | 1 | 1 |
    | SPage | Start page | 1 | |
    | Title | Journal Title | 1 | |
    | Volume | Volume number | 1 | 1 |
    | TOTAL | | 8 | 5 |

    Completeness Score = (Total for This OpenURL) / (Total Weights) = 5 / 8 = .625
15. IOTA: RECOMMENDED PRACTICE…
    • Defines a technique for determining element weights
    • Tested with real link resolvers and real OpenURLs
    • Based on research which looked for a correlation with data elements on the OpenURL and “success” of the OpenURL
16. A Statistical Approach to Determining Element Weights
    • Select a set of “perfect” OpenURLs
      ◦ include all key data elements and resolve to full text
    • Perform step-wise regression
      ◦ Test failure rates for each element by removing that element
    • Use failure rates as basis for weights
    • Use weights to calculate Completeness Scores and to test for correlation between weights and success for larger sample
17. Failure Rates from 1500 OpenURL test sample

    | Element removed from the OpenURL | Description | Failure Percentage |
    |---|---|---|
    | ATitle | Article title | .74% |
    | AuLast | Author’s last name | .07% |
    | Date | Date of publication | .4% |
    | ISSN | ISSN (either online or print ISSN) | 22.02% |
    | Issue | Issue number | 20.27% |
    | SPage | Start page | 33.27% |
    | Title | Journal Title (either Title or Jtitle) | .61% |
    | Volume | Volume number | 74.14% |

    Callouts on the slide: Author’s last name is least important; Date is surprisingly low; Volume is most critical.
18. Calculated Element Weights

    | Element | Description | Weight* |
    |---|---|---|
    | ATitle | Article title | 1.87 |
    | AuLast | Author’s last name | 0.83 |
    | Date | Date of publication | 1.61 |
    | ISSN | ISSN (either online or print ISSN) | 3.34 |
    | Issue | Issue number | 3.31 |
    | SPage | Start page | 3.52 |
    | Title | Journal Title (either Title or Jtitle) | 1.78 |
    | Volume | Volume number | 3.87 |

    *Element weight calculation: log10 (failure-rate-per-10,000 OpenURLs)
19. Results
    [Chart: Average of Completeness Score and Average of Success Score, plotted on a scale from 0.0000 to 1.2000]
    • Correlation Coefficient .80
    • Tests conducted on sample of 15,000 OpenURLs randomly pulled from IOTA database
20. IOTA: INTERACTIVE ONLINE TOOL…
    • 23.3+ million OpenURLs processed
    • Reporting interface
      ◦ Analyze data elements (metrics) across vendors or database (Source)
      ◦ Analyze (Source) for all data elements
21. Analysis of vendors by element (metric)
22. Analysis of elements by vendor
23. IOTA: HOW TO USE IT…
    ◦ The Technical Report provides suggestions for improving OpenURLs
    ◦ The interactive tool offers a means to pinpoint irregularities in data provided on OpenURLs
    ◦ The Recommended Practice describes how to create a Completeness Index
    ◦ Completeness Index allows OpenURL quality problems to be quantified
24. IOTA: WHY YOU SHOULD USE IT…
    ◦ Link resolver vendors can implement the Completeness Index in their products to help identify problematic OpenURL sources
    ◦ Librarians can use suggestions and Completeness Index to more effectively communicate quality problems to content providers
    ◦ Content providers can use the online interactive tool to identify problems with the data they provide
25. IOTA: THE WORKING GROUP…
    • Adam Chandler (Chair), Database Management and E-Resources Librarian, Cornell University Library
    • Rafal Kasprowski, Electronic Resources Librarian, Rice University
    • Susan Marcin, Licensed Electronic Resources Librarian, Continuing & Electronic Resources Management Division, Butler Library, Columbia University
    • Oliver Pesch, Chief Strategist, E-Resource Access and Management Services, EBSCO Information Services
    • Clara Ruttenberg, Electronic Resources Librarian, University of Maryland
    • Elizabeth Winter, Electronic Resources Coordinator, Georgia Tech Library, Collection Acquisitions & Management Department
    • Jim Wismer, Manager, Software Engineering, Thomson Reuters
    • Aron Wolf, Data Program Analyst, Serials Solutions
26. IOTA: CURRENT STATUS…
    ◦ Technical Report in final draft
    ◦ Recommended Practice has been submitted to NISO
    ◦ Interactive Online Tool remains available
27. Active NISO Initiatives
    • DAISY Standards
    • Demand-Driven Acquisition (DDA) of Monographs
    • Digital Bookmarking and Annotation
    • E-book Special Interest Group (SIG)
    • IOTA: OpenURL Quality Metrics
    • I2 (Institutional Identifiers)
    • ISO Project 25964
    • JATS: Journal Article Tag Suite (also known as Standardized Markup for Journal Articles)
    • KBART (Knowledge Base and Related Tools) (NISO/UKSG)
    • NCIP (NISO Circulation Interchange Protocol) Standing Committee
    • Open Discovery Initiative
    • PIE-J (Presentation & Identification of E-Journals)
    • ResourceSync
    • SERU Standing Committee
    • Standard Interchange Protocol (SIP)
    • Supplemental Journal Article Materials (NISO/NFAIS)
    • SUSHI Standing Committee and SUSHI Servers
    • Z39.7 (Data Dictionary) Standing Committee
28. References
    • Active NISO Groups: http://www.niso.org/workrooms/#active
    • SUSHI Web Site: http://www.niso.org/workrooms/sushi
    • IOTA Web Site: http://www.niso.org/workrooms/openurlquality
    • SUSHI Server Registry: https://sites.google.com/site/sushiserverregistry
29. Have an idea for a standard or recommended practice? Email Nettie Lagace, Associate Director for Programs, NISO: nlagace@niso.org. THANK YOU!
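
The element-weight calculation footnoted on slide 18, log10 of the failure rate per 10,000 OpenURLs, can be sketched as follows. This is a rough illustration under stated assumptions, not the working group's code: the function name `element_weight` and the dictionary layout are invented for the sketch, and the inputs are the rounded failure percentages printed on slide 17. Because those percentages are rounded, a few recomputed weights differ from the published ones in the second decimal place.

```python
import math

# Failure percentages as printed on slide 17 (1,500-OpenURL test sample).
FAILURE_PERCENT = {
    "atitle": 0.74, "aulast": 0.07, "date": 0.4, "issn": 22.02,
    "issue": 20.27, "spage": 33.27, "title": 0.61, "volume": 74.14,
}

# Weights as printed on slide 18, kept here only for comparison.
PUBLISHED_WEIGHTS = {
    "atitle": 1.87, "aulast": 0.83, "date": 1.61, "issn": 3.34,
    "issue": 3.31, "spage": 3.52, "title": 1.78, "volume": 3.87,
}


def element_weight(failure_percent):
    """Return log10 of the number of failures per 10,000 OpenURLs."""
    if failure_percent <= 0:
        # log10 is undefined at zero; how to treat an element that never
        # causes a failure is left open in this sketch.
        raise ValueError("failure rate must be positive")
    failures_per_10k = failure_percent * 100  # 1% of 10,000 is 100
    return math.log10(failures_per_10k)


for element, percent in FAILURE_PERCENT.items():
    recomputed = element_weight(percent)
    print(f"{element:7s} recomputed={recomputed:.2f} published={PUBLISHED_WEIGHTS[element]:.2f}")
```

The logarithm compresses the range: Volume fails roughly a thousand times more often than author's last name when removed, yet its weight is only about four to five times larger.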