2. Specialists in developing and delivering projects including
large-scale online services, mobile apps and digital tools.
Based at Part of Information Services
Significant expertise in Library Support and Geospatial technology and services
3. Service Portfolio includes…
• SUNCAT
• Keepers Registry
• UK LOCKSS Alliance
• CLOCKSS
• Statistical Accounts of Scotland
• Digimap
• Agcensus
• New services in development for 2018…
• Entitlement Registry
• Site2Cite
• Noteable (UoE focused)
• Text and Data Mining Platform (UoE focused)
4. The Keepers Registry
Who is looking after what e-journals: http://thekeepers.org
• Extract information about title-level archiving
• Gather evidence for decisions
– Transition to e-only; Retention and disposal of
print collections.
• Generate coverage reports for lists of serials at
publisher and collection level.
• Understand the extent still "at risk of loss" and
segment responsibility for action.
• Progress at January 2014:
– 21,557 e-journals reported as archived
– 154,745 ISSN assigned to online serials
= Ratio of 14% as ingested and archived
• Progress at June 2018:
– 38,570 e-journals reported as archived
– 224,761 ISSN assigned to online serials
= Ratio of 17% as ingested and archived
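The two ratios above can be checked with a short calculation. This is only a sketch of the arithmetic using the figures quoted on the slide; the function name `archived_share` is illustrative and not part of the Keepers Registry.

```python
def archived_share(archived: int, issn_assigned: int) -> float:
    """Percentage of online serials (by ISSN) reported as archived."""
    return 100 * archived / issn_assigned


# Figures as quoted on the slide.
jan_2014 = archived_share(21_557, 154_745)
jun_2018 = archived_share(38_570, 224_761)

print(f"January 2014: {jan_2014:.1f}% archived")
print(f"June 2018:    {jun_2018:.1f}% archived")
print(f"Additional e-journals archived: {38_570 - 21_557}")
print(f"Additional ISSNs assigned:      {224_761 - 154_745}")
```

Rounded to whole percentages, the results match the 14% and 17% shown above. The number of assigned ISSNs grew alongside the number of archived titles, which is why the ratio rose by only a few points.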
5. Next generation tools for the e-Reader
• Text and Data Mining: Computational tools and processes to extract
information from unstructured text
– Automatically give structure and assist discovery and navigation
• Machine learning will open up new possibilities for content synthesis,
analysis and interaction
• Initiating a text mining Platform-as-a-Service to radically open
unstructured text collections
– Enable discovery of information previously buried in text corpora
– Create new knowledge sources for evidence-based decision making
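As a toy illustration of "giving structure" to unstructured text, the sketch below pulls ISSN-shaped strings and years out of a sentence using regular expressions. The sample sentence, the ISSN-shaped string and the function name are invented for illustration. The planned platform would presumably rely on far richer techniques than pattern matching.

```python
import re
from collections import Counter

# Invented sample text; the ISSN-shaped string is not a real identifier.
SAMPLE = (
    "The journal (ISSN 1234-5679) moved online in 2009. "
    "Print holdings for 2005 to 2008 were retained; "
    "the online archive begins in 2009."
)

# Four digits, a hyphen, three digits, then a digit or X check character.
ISSN_PATTERN = re.compile(r"\b\d{4}-\d{3}[\dXx]\b")
# Four-digit years starting 19 or 20. Note this would also match half of an
# ISSN that happens to start with 19 or 20; a real pipeline would guard
# against that.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def structure(text: str) -> dict:
    """Turn free text into a small structured record."""
    return {
        "issns": sorted(set(ISSN_PATTERN.findall(text))),
        "year_counts": Counter(YEAR_PATTERN.findall(text)),
    }


record = structure(SAMPLE)
print(record["issns"])
print(record["year_counts"].most_common())
```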
6. Researcher datasets
- Consultation outputs
- Clinical Trials and EHRs
University Instituted Data
- E-Thesis
- Open Access Outputs
- Research Proposals and Grants
- Special Collections
Edinburgh Region
- External interactions
- Regional datasets and local usage
Licensed Data
- Academic literature
- Geospatial Data
- National Library of Scotland
- Newspapers
TDM Platform
- 1. Infrastructure: Dataset management
- 2. Capabilities: Data pre-processing
- 3. Tools: Data analysis
- 4. Skills: for 100k data science training
By making generic tools available across a range of
stakeholders, we will increase competencies and
cater to more datasets and needs.
There is cumulative value in this approach:
• We avoid duplicated or siloed effort
• We retain knowledge once resources stop
• We maximise readability by reusing tools
• We develop unique organisational knowledge and strengths
• We develop staff skills as a valuable non-tangible asset
Virtual research environment
Secure Virtual research environment
- Offer a walled garden if required
Commercially-permitted Virtual research environment
Virtual research environment
Building a shared platform helps develop
practical skills and raise awareness of possible
applications.
7. Research and Education
- Significant time savings for the e-reader on analysis, leading to
completion of more research.
- ‘Drag and drop’ interface to lower
barriers to entry
- Search literature for molecules and genes
- Augment the systematic reviews process.
- Evolution of language and policy in e-thesis
- Identify problems and outcomes in Clinical
Trial data or Electronic Health Records.
University Corporate Structure
- Acquire rich Business Intelligence
- Improve Student Satisfaction by analysing
survey data and targeting comms to adjust
perceptions.
- Interrogate Pure and Worktribe to identify
research strengths and potential partnerships
- Transform access to the Library and
University’s digitized Special Collections.
- Connect text mining with speech-to-text
recognition systems.
Local Authorities and Regional SMEs
- Faster and richer response to
community and customers
- Local authorities collect case files, planning
applications or free text responses to
consultations:
- accurately gauge public sentiment
- identify unmet needs
- influence opinion
- shape public policy
- identify hidden risks
Use Cases across the Edinburgh region
8. Developing Partnerships
EDINA has an excellent track record in dataset management, digital
infrastructure and service delivery
• Research Libraries
– Supporting Open Science
• Academic Publishers (AAAS/Science, IoP, EUP, more)
– Supporting Scholarship: Content Distribution, Reproducibility
• Archiving Organisations (e.g. CLOCKSS, Portico)
9. Role of Entitlement Information
• Evidence that an institution has perpetual access to material
• Entitlement information used during renewals / cancellations
– What will we retain access to if we cancel subscriptions?
• Notable big-deal cancellations now taking place across Europe
– Libraries get broad current access under the Big Deal
– Should the model change, libraries will need to know what they purchased
11. Current Challenges of Entitlement Information
• Variable quality of records
– For both libraries and publishers
– Especially further back in time towards 2000
• No common location to record information
– Spreadsheets, emails, memories of key staff
• Library systems are not designed to capture this
• Publisher systems are not designed to export this
13. How this helps the sector
• "An accurate entitlement registry will reduce time spent agreeing renewals and alleviate
pressure on the team… Our busiest period is the first teaching block which coincides with
subscription renewals."
– Bristol University
• "Having an efficient way of checking our perpetual entitlements would be invaluable. We
could be confident that our users will have access to all the content we should have"
– Cardiff University
• "By minimising the time spent liaising with publishers to ascertain legacy access rights, our
department would be much more efficient and we could ensure our users have access to
everything they should have"
– University of the West of Scotland
14. Initial Publisher Response
• Significant publisher recognition of the problem
• Sales reps regularly hear library concerns
• They struggle themselves during journal transfers
– Difficult to get accurate entitlement information during transfers
– Responsibility for providing access to pre-transfer content can vary
15.
We tried to identify the most
common issues, or ‘flags’ that
might occur.
16.
T&F supplied data ->
CUP supplied data ->
Multiple publishers for a title
Indicates a journal transfer
Clarify who is providing
access before date of transfer:
T&F or CUP
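The "multiple publishers for a title" flag could be derived along the following lines. This is a hedged sketch only: the record layout, journal names and year ranges are hypothetical, and the Registry's actual data model is not described in these slides.

```python
from collections import defaultdict

# Hypothetical rows: (title, supplying publisher, first year, last year).
records = [
    ("Journal A", "T&F", 2000, 2010),
    ("Journal A", "CUP", 2011, 2018),
    ("Journal B", "CUP", 2005, 2018),
]


def titles_with_multiple_publishers(rows):
    """Map each title supplied by more than one publisher to those publishers."""
    publishers = defaultdict(set)
    for title, publisher, _first_year, _last_year in rows:
        publishers[title].add(publisher)
    return {
        title: sorted(names)
        for title, names in publishers.items()
        if len(names) > 1
    }


transfer_flags = titles_with_multiple_publishers(records)
print(transfer_flags)
```

A flagged title is only a prompt for a person to confirm who provides access to the pre-transfer content. The code cannot decide that by itself.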
17.
U.Edinburgh supplied data ->
CUP supplied data ->
Lost access rights:
EUP no longer host the content,
but the historic perpetual access
has not transferred to CUP
18.
U.Edinburgh supplied data ->
CUP supplied data ->
Non-matching entitlements
indicates discrepancies between
publisher and library supplied data
(In this case, UoE have only
supplied records back to 2009)
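A non-matching entitlement can be expressed as a difference between the years each party reports. The sketch below assumes entitlements are reduced to sets of years. The publisher range is invented, while the library range starting in 2009 mirrors the example on the slide.

```python
def compare_entitlements(library_years, publisher_years):
    """Return the years reported by only one of the two sources."""
    library, publisher = set(library_years), set(publisher_years)
    return {
        "library_only": sorted(library - publisher),
        "publisher_only": sorted(publisher - library),
    }


library_record = range(2009, 2019)    # library data back to 2009 only
publisher_record = range(2005, 2019)  # hypothetical publisher data

diff = compare_entitlements(library_record, publisher_record)
print(diff)
```

Here the mismatch sits entirely on the publisher side (2005 to 2008), which suggests incomplete library records, not a disputed entitlement.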
19.
U.Edinburgh supplied data ->
CUP supplied data ->
Gaps in entitlements are another
noteworthy flag
If this is a flaw with publisher data, we can
make corrections for all participating
institutions and save individual libraries
from trying to resolve issues
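Gaps can be found by sorting the entitled years and looking for jumps of more than one year. This is a minimal sketch with invented years and a hypothetical helper name.

```python
def find_gaps(years):
    """Return (first_missing, last_missing) pairs between entitled years."""
    ordered = sorted(set(years))
    gaps = []
    for previous, following in zip(ordered, ordered[1:]):
        if following - previous > 1:
            gaps.append((previous + 1, following - 1))
    return gaps


# Invented example: entitlement recorded for these years only.
gaps = find_gaps([2001, 2002, 2003, 2006, 2007, 2010])
print(gaps)
```

Running the same check on publisher-supplied data for every participating institution is one way to spot a flaw once and correct it for all.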
20.
CUP supplied data ->
We flag up titles where we only have
data from a single source.
Seek a shared view between libraries
and publishers: that comes either
from two sources of data, or sign-off
on the other view
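The single-source flag amounts to counting distinct data suppliers per title. The sketch below assumes a mapping from title to the list of suppliers. The titles and structure are hypothetical.

```python
def single_source_titles(sources_by_title):
    """Return titles for which only one distinct source has supplied data."""
    return sorted(
        title
        for title, sources in sources_by_title.items()
        if len(set(sources)) == 1
    )


# Hypothetical input: who has supplied entitlement data for each title.
supplied = {
    "Journal A": ["CUP", "U.Edinburgh"],
    "Journal B": ["CUP"],
    "Journal C": ["U.Edinburgh", "U.Edinburgh"],
}

needs_second_view = single_source_titles(supplied)
print(needs_second_view)
```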
21. Strategic Benefits
• Entitlement data is a foundation for strategic initiatives
• Institution-level: Improve the renewal cycle
• UK-level: support inter-library loan, print rationalisation, shared
subscription management
• Need to acquire this data at scale so that it is cost-effective to use
operationally
22. Pilot
• Small scale pilot (Winter 2017 – Summer 2018) to understand
– Access to data and scalability of data ingest workflows
– Priority features libraries and publishers need at service-launch
• Participating Universities
– Aberystwyth, Birmingham, Bristol, Brunel, Cardiff, Edinburgh, St Andrews,
University of the West of Scotland, York
• Main lesson: interface is simple, data is hard!
23. Publishers
– Discussions with Cambridge University Press, Emerald Publishing, IOP
Publishing, Oxford University Press, Sage, Springer, Taylor and Francis,
Wiley, AAAS, American Chemical Society, British Medical Journal,
Elsevier
– Spectrum of progress
• Some have now supplied data for pilot institutions
• Others are scheduling this into their activity plans for 2018
• Others need more demand from libraries before acting
24. Data Sources and Quality
• Library data
– Variable data structure: depends on local practices and procedures
– Makes scalability a challenge
– Staff time is limited, and this is a relatively low priority
– Less dependency on publishers
• Publisher data
– Standard data structure so a more scalable approach
– Lower cost
– Secures more buy-in from publishers that the data is accurate
25. Resolving data discrepancies
• What happens when library and publisher disagree?
– Modelled a dispute resolution process
– Use Registry to capture the conversations that already take place
• Help share data corrections out across the UK HE sector
• Data quality improvement process still in development
– First step is to capture information about discrepancies
– The entitlement registry gives a tool to streamline this
26. Supporting Team Communication
• Annotations and comments to support current workflows
– Establish good practice with a shared management tool
– Support record keeping by maintaining ‘receipts in a kitchen drawer’
– Inbox and alerting system so that colleagues can be notified and/or
given tasks
– Show a timeline of events: data additions, removals, comments,
queries, disputes
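One possible shape for such a timeline is a list of dated events sorted chronologically. This is a sketch under stated assumptions: the `Event` fields, the actors and the notes are hypothetical and do not describe the Registry's actual schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    when: date
    kind: str   # e.g. "addition", "removal", "comment", "query", "dispute"
    actor: str  # who recorded the event
    note: str


def timeline(events):
    """Return events in chronological order, oldest first."""
    return sorted(events, key=lambda event: event.when)


# Invented events for a single title.
events = [
    Event(date(2018, 3, 2), "query", "library", "Is 2004 covered?"),
    Event(date(2018, 1, 15), "addition", "publisher", "Loaded 2005 to 2018"),
    Event(date(2018, 3, 9), "comment", "publisher", "Checking archive"),
]

for event in timeline(events):
    print(event.when.isoformat(), event.kind, event.actor, event.note)
```

Keeping the events immutable (`frozen=True`) suits a record-keeping log, where entries are appended and not edited after the fact.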
27. Scaling Up
• Currently working with nine library institutions
– Growing pilot activity: would welcome your participation
• Increase publisher participation
– Contact 30 publishers by end of 2018
– Initiate positive discussions with subscription agents
– Use Jisc and community leverage to encourage faster adoption
28. Service launch
• Aiming for a service launch by end of 2018
• Interest received from a variety of European countries
• Initial objective is to provide a service for UK HE institutions