Michael Levine-Clark, University of Denver, Jason Price, SCELC Library Consortium
As transformative agreements emerge as a new standard, it is critical for libraries, consortia, publishers, and vendors to have consistent and comprehensive data. Yet data on publication profiles, authorship, and readership has been shown to vary widely in availability and accuracy. Building on prior research into frameworks for assessing the combined value of open publishing and comprehensive read access that these deals provide, we will offer multi-dimensional perspectives on the challenges the industry faces in disseminating, collecting, and analyzing data about authorship, readership, and value.
UKSG 2024 Plenary 2 - What did we Read, What did we Publish: Distilling the data that librarians need to manage transformative agreements
4. What did we read?
Before OA / Trans Agreements
Basic
● Full Text Downloads: html/pdf
Advanced/Derived
● What journals did we cite?
Assessment
● Cost per use
After OA / Trans Agreements
Basic
● Controlled use: Unique item requests
Advanced/Derived
● Use of articles by their year of publication
● Use of articles not freely available
Assessment
● Cost per controlled use
○ by YOP?
○ Not freely available?
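The assessment measures on this slide all reduce to dividing the agreement cost by a usage count. A minimal sketch, assuming invented figures and a hypothetical helper name (`cost_per_use`); none of the numbers come from the presentation.

```python
def cost_per_use(cost: float, uses: int) -> float:
    """Return cost divided by uses; refuse to divide when there is no usage."""
    if uses <= 0:
        raise ValueError("uses must be positive")
    return cost / uses


# Hypothetical figures for one agreement year (not from the presentation).
annual_cost = 50_000.00
controlled_unique_item_requests = 20_000

# Hypothetical breakdown of controlled use by year of publication (YOP).
controlled_use_by_yop = {2021: 4_000, 2022: 6_000, 2023: 10_000}

overall = cost_per_use(annual_cost, controlled_unique_item_requests)
current_year_only = cost_per_use(annual_cost, controlled_use_by_yop[2023])

print(f"Cost per controlled use: ${overall:.2f}")
print(f"Cost per controlled use (YOP 2023 only): ${current_year_only:.2f}")
```

Restricting the denominator to a single year of publication always raises the cost per use, which is why the two figures should be reported side by side rather than interchangeably.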
5. What did we Publish?
Before OA / Trans Agreements
Basic
● ‘Any Author’ Articles
● All article types
Advanced/Derived
● Faculty on editorial boards?
After OA / Trans Agreements
Basic
● Corresponding Author Articles
● Controlled | Hybrid OA Jnl | Fully OA Jnl
Advanced/Derived
● Before & After
● Grant-funded
● Discipline/Subject
Assessment
● Savings - APC Cost Avoidance ($)
● Impact - OA Uptake (%)
● Impact - Global open article readership
● Values - Sustainability, Diversity, Author rights
6. The questions we settled on:
Can institutions effectively predict and/or evaluate corresponding author publishing patterns using one or more of the commonly available article data sources?
How consistent are the results across these four data sources?
7. Building a Data Set from SCELC Consortium Agreements
● Six transformative agreements starting between 2021 and 2024
● Four institutions
○ Examples from Chapman (Doctoral High Research) and Univ of Southern California (Doctoral Very High Research)
● Scopus
● Web of Science
● Dimensions
● Publisher Reports
8. SCELC Transformative Agreement timeline
2020 2021 2022 2023
ACM (*July Start) Year -1 Year 0 Year “½”
ACS Year -1 Year 0 Year 1* Year 2
Cambridge Year 0 Year 1 Year 2 Year 3
IOP Year -1 Year 0 Year 1
Springer Year -1 Year 0 Year 1 (Hybrid OA)
Wiley Year -1 Year 0 Year 1 (Hybrid OA) Year 2 (Hybrid + Fully OA)
9. Scopus & Web of Science Data challenges
● Need to consolidate publisher data
● Corresponding author:
○ WoS - “reprint address” contains corresponding author info [“(corresponding author), Institution.edu”]
○ Scopus - corresponding author address field (need to limit to the university’s email - will miss anyone using another email)
○ Publisher inconsistencies
● Open Access
○ WoS: Open Access designations
○ Scopus: Open Access [“all open access (with subcategories)”]
● Document Type
○ What is an article?
11. Corresponding Author
Scopus: “Correspondence Address”
K. Diki; Schmid College of Science and Technology, Chapman University, Orange, One University Drive, 92866, United States; email: diki@chapman.edu
WoS: “Reprint Addresses”
Diki, K (corresponding author), Chapman Univ, Schmid Coll Sci & Technol, One Univ Dr, Orange, CA 92866 USA.
12. Corresponding Author
Scopus: “Correspondence Address”
T.D. Simon; Children's Hospital Los Angeles, Los Angeles, United States; email: tsimon@chla.usc.edu
WoS: “Reprint Addresses”
Simon, TD (corresponding author), Childrens Hosp Los Angeles, Los Angeles, CA 90027 USA.
In Scopus, University of Southern California has 38,130 publications 2019-2023. Of these, 7,631 (20%) have an empty Correspondence Address field.
In WoS, USC has 41,292 publications. Of these, 6,207 (15%) have an empty Reprint Addresses field.
13. Corresponding Author Discrepancies
Scopus: “Correspondence Address”
M.T. Ballew; Yale Program on Climate Change Communication, Yale School of the Environment, Yale University, New Haven, 06511, United States; email: mballew@chapman.edu
WoS: “Reprint Addresses”
Ballew, MT (corresponding author), Yale Univ, Yale Sch Environm, Yale Program Climate Change Commun, New Haven, CT 06511 USA.
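A discrepancy like this one can be flagged automatically by testing the email domain and the affiliation text separately. The sketch below is illustrative only: the function names, the list of institution name variants, and the domain are assumptions, and the two address strings are the records quoted on this slide.

```python
import re

# Matches the address that follows "email:" in a Scopus Correspondence Address.
EMAIL_RE = re.compile(r"email:\s*([^\s;]+@[^\s;]+)", re.IGNORECASE)


def scopus_email_domains(correspondence_address: str) -> list[str]:
    """Return the lower-cased email domains found in the address string."""
    return [m.split("@", 1)[1].lower() for m in EMAIL_RE.findall(correspondence_address)]


def matches_institution(text: str, name_variants: list[str]) -> bool:
    """True if any institution name variant appears in the address text."""
    lowered = text.lower()
    return any(variant.lower() in lowered for variant in name_variants)


# Assumed name variants and email domain for the institution being checked.
CHAPMAN_NAMES = ["Chapman University", "Chapman Univ"]
CHAPMAN_DOMAIN = "chapman.edu"

scopus = ("M.T. Ballew; Yale Program on Climate Change Communication, Yale School of the "
          "Environment, Yale University, New Haven, 06511, United States; email: "
          "mballew@chapman.edu")
wos = ("Ballew, MT (corresponding author), Yale Univ, Yale Sch Environm, Yale Program "
       "Climate Change Commun, New Haven, CT 06511 USA.")

email_match = any(
    d == CHAPMAN_DOMAIN or d.endswith("." + CHAPMAN_DOMAIN)
    for d in scopus_email_domains(scopus)
)
scopus_affil_match = matches_institution(scopus, CHAPMAN_NAMES)
wos_affil_match = matches_institution(wos, CHAPMAN_NAMES)

# A record is worth manual review when the email and the affiliation disagree.
needs_review = email_match != scopus_affil_match
print(email_match, scopus_affil_match, wos_affil_match, needs_review)
```

Filtering on email alone would count this article for Chapman, while filtering on affiliation in either source would not, so the choice of test changes the totals.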
14. USC Publishing 2019-2023
Publisher        All Publications   Corresponding Author   Corr Author + Doc Type: Article
Springer Nature  5,256 (2)          2,113 (2)    40.2%     1,317 (2)    25.1%
Wiley            3,240 (3)          1,283 (3)    39.6%     1,037 (3)    32.0%
ACS              695 (11)           441 (7)      63.5%     427 (6)      61.4%
ACM              584 (12)           37 (T34)     6.3%      3 (T112)     0.5%
CUP              403 (14)           149 (12)     37.0%     121 (13)     30.0%
IOP              254 (18)           85 (20)      33.5%     69 (20)      27.2%
38,130 total publications over five years
Percentages are of the All Publications total for each publisher
Number in parentheses is the rank order (with T indicating a tie)
15. More Corresponding Author Issues
In Scopus and WoS, ACS publications routinely list multiple corresponding authors:
W. Li; School of Chemistry and Materials Science, Hunan Agricultural University, Changsha, 410128, China; email: weili@hunau.edu.cn; S. Giannini; Laboratory for Chemistry of Novel Materials, University of Mons, Mons, Place du Parc, 20, B-7000, Belgium; email: samuele.giannini@umons.ac.be; O.V. Prezhdo; Department of Chemistry, University of Southern California, Los Angeles, 90089, United States; email: prezhdo@usc.edu
In Scopus, the Correspondence Address field is blank for 87% of ACM publications (not the case in WoS), but WoS indexes far fewer ACM publications (584 USC articles in Scopus, 120 USC articles in WoS).
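When a Scopus Correspondence Address holds several authors, the field has to be split before any institution test is applied. A minimal sketch, assuming every entry follows the "name; address; email: ..." pattern of the record quoted above; the regular expression and function name are illustrative, and entries without an email would need separate handling.

```python
import re

# One entry = name, then address, then "email: <address>", separated by semicolons.
ENTRY_RE = re.compile(
    r"(?P<name>[^;]+);\s*(?P<address>.*?);\s*email:\s*(?P<email>[^\s;]+)"
)


def split_corresponding_authors(field: str) -> list[dict[str, str]]:
    """Split a multi-author Correspondence Address into name/address/email records."""
    return [
        {key: value.strip() for key, value in match.groupdict().items()}
        for match in ENTRY_RE.finditer(field)
    ]


field = (
    "W. Li; School of Chemistry and Materials Science, Hunan Agricultural University, "
    "Changsha, 410128, China; email: weili@hunau.edu.cn; S. Giannini; Laboratory for "
    "Chemistry of Novel Materials, University of Mons, Mons, Place du Parc, 20, B-7000, "
    "Belgium; email: samuele.giannini@umons.ac.be; O.V. Prezhdo; Department of "
    "Chemistry, University of Southern California, Los Angeles, 90089, United States; "
    "email: prezhdo@usc.edu"
)

entries = split_corresponding_authors(field)

# Keep entries whose email is at usc.edu or a subdomain such as chla.usc.edu.
usc_entries = [
    e for e in entries
    if e["email"].lower().endswith("@usc.edu") or e["email"].lower().endswith(".usc.edu")
]

for e in entries:
    print(e["name"], "|", e["email"])
```

Whether an article with three corresponding authors counts as one, a third, or zero for a given institution is a policy decision the code cannot make; it only exposes the individual entries.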
16. Chapman University Publishing 2019-2023
Publisher        All Publications   Corresponding Author   Corr Author + Doc Type: Article
Springer Nature  477 (1)            225 (1)      47.2%     147 (2)      30.8%
Wiley            205 (4)            72 (5)       35.1%     67 (5)       32.7%
ACS              67 (11)            52 (6)       77.6%     45 (6)       67.2%
CUP              38 (12)            11 (T11)     28.9%     7 (14)       18.4%
ACM              29 (15)            2 (T37)      6.9%      0            0.0%
IOP              26 (16)            8 (T13)      30.8%     6 (16)       23.1%
2,827 total publications over five years
Percentages are of the All Publications total for each publisher
Number in parentheses is the rank order (with T indicating a tie)
17. Chapman University (2019-2023) Web of Science vs Scopus
All Publications
Publisher        Scopus      Web of Science
Springer Nature  477 (1)     343 (2)
Wiley            205 (4)     266 (3)
ACS              67 (11)     79 (9)
CUP              38 (12)     59 (12)
ACM              29 (15)     3 (T63)
IOP              26 (16)     23 (16)
Articles + CA
Publisher        Scopus      Web of Science
Springer Nature  147 (2)     150 (2)
Wiley            67 (5)      80 (4)
ACS              45 (6)      46 (6)
CUP              7 (14)      20 (11)
ACM              0           1 (T58)
IOP              6 (16)      10 (13)
20. More data!
- What data can be added from Dimensions for this analysis?
- How does it compare to Scopus and Web of Science data?
- How does it compare to Publisher reported data?
- Is publisher reported data reliable and accurate?
- Is the data accurate enough to draw confident conclusions?
31. [Chart: 1267 Articles in 2023 for USC & Chapman. Categories: Only 2, All Four, Only 3, Unique. Values: 251, 332, 275, 418.]
32. [Chart: 1267 Articles in 2023 for USC & Chapman. Categories: Only 2, All Four, Only 3, Unique. Values: 20%, 26%, 22%, 33%.]
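The categories on these two slides (records found in all four sources, in only three, in only two, or presumably in a single source) can be tallied by counting how many sources contain each article identifier. A minimal sketch with invented DOIs; the function name and sample sets are hypothetical and do not reproduce the presenters' data.

```python
from collections import Counter


def source_overlap(sources: dict[str, set[str]]) -> Counter:
    """Tally articles by the number of sources in which each one appears."""
    per_article = Counter()
    for ids in sources.values():
        # Normalise case and whitespace so the same DOI matches across sources.
        per_article.update({doi.strip().lower() for doi in ids})
    return Counter(per_article.values())


# Hypothetical DOI sets for the four data sources.
sources = {
    "Scopus": {"10.1/a", "10.1/b", "10.1/c"},
    "Web of Science": {"10.1/A", "10.1/b"},
    "Dimensions": {"10.1/a", "10.1/b", "10.1/c", "10.1/d"},
    "Publisher": {"10.1/a", "10.1/c", "10.1/e"},
}

tally = source_overlap(sources)
total = sum(tally.values())

for n_sources in (4, 3, 2, 1):
    count = tally[n_sources]
    print(f"In {n_sources} source(s): {count} ({count / total:.0%})")
```

Matching on DOI assumes every record carries one; records without a DOI would need a fallback key such as a normalised title, which is outside this sketch.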
33. OA Models by Data Source
In this dataset, across the 251 articles that matched, the data sources generally agree on the OA model.
34. Unmatched Records - Dimensions v. Publisher
When the records don’t match, you have to inspect the data. In this example, Dimensions labels the articles Bronze, but the publisher did not report them as Bronze.
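Listing the disagreements is the first step of that inspection. The sketch below compares the OA label each source assigns to the same DOI; the function name, DOIs, and labels are invented for illustration, including the publisher's "Closed" label.

```python
def oa_disagreements(
    dimensions: dict[str, str], publisher: dict[str, str]
) -> list[tuple[str, str, str]]:
    """Return (doi, dimensions_label, publisher_label) where the two labels differ."""
    shared = sorted(dimensions.keys() & publisher.keys())
    return [
        (doi, dimensions[doi], publisher[doi])
        for doi in shared
        if dimensions[doi].strip().lower() != publisher[doi].strip().lower()
    ]


# Hypothetical OA labels keyed by DOI.
dimensions = {"10.1/a": "Gold", "10.1/b": "Bronze", "10.1/c": "Hybrid"}
publisher = {"10.1/a": "Gold", "10.1/b": "Closed", "10.1/c": "hybrid", "10.1/z": "Closed"}

mismatches = oa_disagreements(dimensions, publisher)
for doi, dim_label, pub_label in mismatches:
    print(f"{doi}: Dimensions={dim_label}, Publisher={pub_label}")
```

Only DOIs present in both sources are compared here; articles that appear in just one source are a separate problem and are deliberately left out of the result.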
41. Challenges with Publisher-provided data
Different Datasets provided
● All Articles (very rare)
● All Corresponding Author Articles
● All Corresponding Author Articles that are eligible
● Only articles made open by the agreement
Publication history incomplete
● Negotiation year missing
Differing terminology
● For Field Naming
● For categories within the matched fields
Wide diversity of data fields
● Across Publishers
● Within publishers across years
43. Distilling Publisher-Reported Publication Data
3+ Annual reports x 6 publishers x ~2 formats per pub > Lots of data to merge
[Institution Name Mapping] Cross-publisher translator
[Field Name Matching]
[Within Field Category Matching]
[Filling in missing core data]
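The four bracketed steps can be sketched as a set of lookup tables applied to each row of each publisher report. Everything in the example is hypothetical: the publisher keys, field names, institution variants, and category labels are invented to show the shape of a cross-publisher translator, not any real report layout.

```python
from typing import Optional

# Field name matching: publisher-specific column name -> common field name.
FIELD_MAP = {
    "publisher_a": {"DOI": "doi", "Institution": "institution", "OA Type": "access_type"},
    "publisher_b": {"Article DOI": "doi", "Affiliation": "institution",
                    "License Model": "access_type"},
}
# Institution name mapping: lower-cased variant -> preferred name.
INSTITUTION_MAP = {
    "univ of southern california": "University of Southern California",
    "chapman univ": "Chapman University",
}
# Within-field category matching: lower-cased publisher label -> common label.
CATEGORY_MAP = {"gold": "Gold OA", "fully oa": "Gold OA",
                "hybrid": "Hybrid OA", "subscription": "Closed"}
CORE_FIELDS = ("doi", "institution", "access_type", "article_title")


def normalize_row(publisher: str, row: dict[str, str]) -> dict[str, Optional[str]]:
    """Translate one publisher row into the common schema; absent core fields stay None."""
    out: dict[str, Optional[str]] = {field: None for field in CORE_FIELDS}
    for source_field, value in row.items():
        target = FIELD_MAP[publisher].get(source_field)
        if target is not None:  # unmapped columns are dropped
            out[target] = value.strip()
    if out["institution"] is not None:
        out["institution"] = INSTITUTION_MAP.get(out["institution"].lower(), out["institution"])
    if out["access_type"] is not None:
        out["access_type"] = CATEGORY_MAP.get(out["access_type"].lower(), out["access_type"])
    out["publisher"] = publisher
    return out


reports = {
    "publisher_a": [{"DOI": "10.1/a", "Institution": "Chapman Univ", "OA Type": "Hybrid"}],
    "publisher_b": [{"Article DOI": "10.2/b", "Affiliation": "Univ of Southern California",
                     "License Model": "Fully OA", "Notes": "n/a"}],
}

merged = [normalize_row(pub, row) for pub, rows in reports.items() for row in rows]
# Core fields still missing after translation, to be filled from another source.
missing_core = [(r["doi"], [f for f in CORE_FIELDS if r[f] is None]) for r in merged]
print(merged)
print(missing_core)
```

Keeping the mappings as data rather than code means a new annual report format only needs a new entry in `FIELD_MAP`, not a new parsing routine.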
44. Aligning Dimensions & Publisher Article Access Data
Dimensions Access Type Designation: Gold OA, Hybrid OA, Bronze OA, Green OA, Closed, Ineligible [Article Type]
Publisher Data Access Type Designation: Gold OA, Hybrid OA, Closed, Ineligible [Article Type or Agreement Structure]
48. Tentative answers:
Can institutions effectively predict and/or evaluate corresponding author publishing patterns using one or more of the commonly available article data sources?
● Yes, but it takes a great deal of effort and expertise
How consistent are the results across these four data sources?
● Patterns are relatively consistent, specifics (& therefore value) are not
49. Reflection: Context is Crucial
● Eligible and Ineligible
● Closed and Open
● Before and After the agreement
● Read value and Publish Value
● Cross-publisher averages
● Peer Comparisons
● Controlled Usage by year of publication: All years’ content & this year’s content
50. Reflection: The ideal transformative agreement dashboard / DSS will have:
○ Predictive & Evaluative views
○ Extensive time series (5 year minimum including before & after)
○ Single Agreement Specifics
■ All fields available from each publisher
■ Distinct & combined Read value & Publish value charts
○ Extensive Data Augmentation
■ Journal Subject Addition / Alignment
■ Missing data - e.g. closed articles may need to be added
■ Missing core fields - e.g. Subject/Discipline, article title, ISSN, Reuse License, etc.
○ Usage Data integration
■ Controlled unique item requests
■ Controlled unique use by year of publication (excluding usage year)
○ Sophisticated agreement comparison functionality
■ Number of articles toggles to proportion of articles
■ Ability to align agreements by start year rather than calendar year
■ Context from other/comparable institutions?
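Two of the comparison features listed above, aligning agreements by start year and toggling between counts and proportions, can be sketched in a few lines. The agreement names, start years, and article counts are invented, and treating the start year as "Year 0" is an assumption made for this example.

```python
# Hypothetical start years; Year 0 is assumed to be the agreement's start year.
START_YEAR = {"Agreement A": 2021, "Agreement B": 2022}

# Hypothetical (open_articles, all_articles) per calendar year.
ARTICLES = {
    "Agreement A": {2020: (2, 10), 2021: (8, 12), 2022: (12, 15)},
    "Agreement B": {2021: (5, 40), 2022: (22, 44), 2023: (40, 50)},
}


def aligned_series(agreement: str, as_proportion: bool = False) -> dict[int, float]:
    """Re-key an agreement's yearly open-article figures by years since its start."""
    start = START_YEAR[agreement]
    series = {}
    for year, (open_count, total) in sorted(ARTICLES[agreement].items()):
        value = open_count / total if as_proportion else open_count
        series[year - start] = value
    return series


counts_a = aligned_series("Agreement A")
shares_b = aligned_series("Agreement B", as_proportion=True)

print(counts_a)
print(shares_b)
```

After alignment both agreements share the keys -1, 0, and 1, so their before-and-after trajectories can be plotted on one axis even though they cover different calendar years.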