Charleston Conference 2017 - What's Past is Still Messing With Our Workflows

Scholars Portal has been aggregating locally loaded ebooks for Ontario universities on an eBrary-backed platform since 2009, eons ago in the world of library technology! Over the last year, Scholars Portal Books has received a rewrite from the ground up, and this time the focus is on building a platform that anticipates the future of ebook access and usage. After extensive community consultation, we knew we wanted a front-end that was accessible beyond legal requirements, a back-end where metadata was harmonized using the BITS standard, and an admin module where local eresource staff could generate usage stats, MARCs, and KBART title lists on the fly. Finally, we needed all of it set up to move through a Trustworthy Digital Repository. No surprise, there have been many challenges along the way, and none of them are unique to consortia: How do we handle corrections to old PDFs? What do we do with six ISBNs? More broadly, how do we support local scholarship at scale, and how can we make space for the open educational resources increasingly being integrated into higher education? This session will look at the complex ebook landscape through a consortial lens, from licensing and entitlements management, to wrangling a dozen XML schemas and implementing ever-changing DRM restrictions, towards the ultimate goal of preserving Ontario universities’ book content for the long term.

What’s past is…
still messing with our
workflows
Jacqueline Whyte Appleby
Scholars Portal
Ontario Council of University Libraries

What is the future
of ebooks management?

Context - Ontario
● 21 universities
● All are public, all have a research mandate.
● Range in size from 1,300 to 83,000 students
● Their libraries work together through OCUL,
the Ontario Council of University Libraries.

Context - Scholars Portal
● Scholars Portal builds & maintains digital services for Ontario university libraries.
● Mix of content & member services
● Locally loading journals since 2001, books since 2009
● Journals platform has been TDR certified since 2013, Books is next

Context - Books platform
● Hosts about 250,000 commercial texts, 400,000+ OA or public domain texts
● PDF & XML-based texts
● Platform released in 2009, software it’s built on sunsetted in 2011
● Platform redevelopment 2016-2018

1. Files will be sent in standard packages
& formats

2. We’ll get MARCs for everything
The dream:
1. Get 1000 PDFs
2. Get 1000 MARCs
3. Match
4. Load everything
5. Celebrate
The reality:
1. Get some PDFs
2. Maybe get some MARCs?
3. Try to match
4. ¯_(ツ)_/¯
5. Load what we can
6. Cry

In sum
● Lack of standardization
● Third party miscommunication
● Licenses are a wild ride

Harmonized metadata
● BITS (Books Interchange Tag Suite) is the sister
XML tag suite to JATS
● Allows for book-level and chapter-level metadata
● All publisher metadata is crosswalked to BITS,
then ingested into MarkLogic

Accessible, accessible, accessible
● ACE: respond to Accessibility for Ontarians with Disabilities Act, reduce duplication, offer our
students more.
● With a token, students can access scanned copies of books from their local collections
● Now: access the whole of their ebooks entitlements, request alternate formats on the fly.

Admin tool for all w/ hierarchical collections
● KBART on the fly
● MARC packages on the fly
● COUNTER 5 stats on the fly
...it’ll be pretty fly
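As a rough illustration of what "KBART on the fly" could mean, here is a minimal sketch that writes a tab-delimited title list from book records. The column names are a subset of the KBART fields; the `kbart_title_list` function and the shape of the `books` dictionaries are assumptions made for this example, not the platform's actual admin module.

```python
import csv
import io

# A subset of KBART column names; a full title list carries more fields.
KBART_COLUMNS = [
    "publication_title",
    "print_identifier",
    "online_identifier",
    "title_url",
    "first_author",
    "title_id",
    "publisher_name",
    "publication_type",
]


def kbart_title_list(books):
    """Return a tab-delimited title list for an iterable of book dicts.

    Each book is a hypothetical dict keyed by KBART column name;
    missing columns are written as empty strings.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=KBART_COLUMNS,
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    for book in books:
        writer.writerow({col: book.get(col, "") for col in KBART_COLUMNS})
    return buffer.getvalue()
```

Because the list is generated from whatever records are passed in, the same function could serve a whole collection or a single school's entitled titles.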

DRM - the friendly version
● The friendly version is no DRM
● The second friendliest version replicates the experience of
non-DRM-restricted content as closely as possible

Corrections
● There is no standard for how corrections to an already-loaded book are sent.
● If it’s the whole book - is it a duplicate?
● If it’s a page or chapter - how to integrate?
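One possible first step for the "is it a duplicate?" question is a checksum comparison. The sketch below assumes the platform keeps a SHA-256 digest for each loaded PDF, keyed by ISBN; the `classify_delivery` function and the `loaded` mapping are hypothetical, and this does nothing for the harder case of a corrected page or chapter.

```python
import hashlib


def classify_delivery(isbn, pdf_bytes, loaded):
    """Classify a re-sent whole-book PDF.

    loaded: hypothetical dict mapping ISBN -> SHA-256 hex digest of the
    PDF already on the platform.
    """
    digest = hashlib.sha256(pdf_bytes).hexdigest()
    if isbn not in loaded:
        return "new"
    if loaded[isbn] == digest:
        return "duplicate"  # byte-identical to what is already loaded
    return "possible-correction"  # same ISBN, different bytes: needs review
```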

Adding local content
● ETD
● Other IR content
● Other local publications
● Workflows?
● Metadata?
● Entitlements?

OER
● The development of a provincial OER strategy is a hot topic
● First of all: preservation
● But what about copy-editing, remixing, peer-review within the system?

Web archiving
● Archive-It use is on the upswing
● How can we make a home for non-PDF content?
● How can institutions contribute their own collections?

What is the future
of managing ebooks?
Stewardship

Thanks for listening!
Questions?
jacqueline@scholarsportal.info
The Scholars Portal Books team is:
Bartek Kawula, Sadia Khwaja, Ivan Jankovic,
Sunil Manikonda, Ravit David, Annie Thomas
Selvarajan, Jacqueline Whyte Appleby
With support from: Kate Davis, Amaz Taufique,
Bikram Singh, Harpinder Singh, and Carlos
McGregor Muro.


1. What’s past is… still messing with our workflows Jacqueline Whyte Appleby Scholars Portal Ontario Council of University Libraries

2. What is the future of ebooks management?

3. Context - Ontario ● 21 universities ● All are public, all have a research mandate. ● Range in size from 1,300 to 83,000 students ● Their libraries work together through OCUL, the Ontario Council of University Libraries.

4. Context - Scholars Portal ● Scholars Portal builds & maintains digital services for Ontario university libraries. ● Mix of content & member services ● Locally loading journals since 2001, books since 2009 ● Journals platform has been TDR certified since 2013, Books is next

5. Context - Books platform ● Hosts about 250,000 commercial texts, 400,000+ OA or public domain texts ● PDF & XML-based texts ● Platform released in 2009, software it’s built on sunsetted in 2011 ● Platform redevelopment 2016-2018

6. ~ ASSUMPTIONS ~

7. 1. Files will be sent in standard packages & formats

8. 1. Files will be sent in standard packages & formats

9. 2. We’ll get MARCs for everything The dream: 1. Get 1000 PDFs 2. Get 1000 MARCs 3. Match 4. Load everything 5. Celebrate The reality: 1. Get some PDFs 2. Maybe get some MARCs? 3. Try to match 4. ¯_(ツ)_/¯ 5. Load what we can 6. Cry

10. 3. Everyone will buy everything

11. 4. DRM will be loose or non-existent

12. In sum ● Lack of standardization ● Third party miscommunication ● Licenses are a wild ride

13. Now what?

14.

15. #BOOKSGOALS

16. Harmonized metadata ● BITS (Books Interchange Tag Suite) is the sister XML tag suite to JATS ● Allows for book-level and chapter-level metadata ● All publisher metadata is crosswalked to BITS, then ingested into MarkLogic

17. Accessible, accessible, accessible ● ACE: respond to Accessibility for Ontarians with Disabilities Act, reduce duplication, offer our students more. ● With a token, students can access scanned copies of books from their local collections ● Now: access the whole of their ebooks entitlements, request alternate formats on the fly.

18. Long-term preservation

19. Admin tool for all w/ hierarchical collections ● KBART on the fly ● MARC packages on the fly ● COUNTER 5 stats on the fly ...it’ll be pretty fly

20.

21. DRM - the friendly version ● The friendly version is no DRM ● The second friendliest version replicates the experience of non-DRM-restricted content as closely as possible

22.

23.

24. Unsolved challenges ~ an interlude ~

25. Corrections ● There is no standard for how corrections to an already-loaded book are sent. ● If it’s the whole book - is it a duplicate? ● If it’s a page or chapter - how to integrate?

26. Chapter-level mark-up vs

27. #BIGGERBOOKSGOALS

28. Adding local content ● ETD ● Other IR content ● Other local publications ● Workflows? ● Metadata? ● Entitlements?

29. OER ● The development of a provincial OER strategy is a hot topic ● First of all: preservation ● But what about copy-editing, remixing, peer-review within the system?

30. Web archiving ● Archive-It use is on the upswing ● How can we make a home for non-PDF content? ● How can institutions contribute their own collections?

31. What is the future of managing ebooks? Stewardship

32. Thanks for listening! Questions? jacqueline@scholarsportal.info The Scholars Portal Books team is: Bartek Kawula, Sadia Khwaja, Ivan Jankovic, Sunil Manikonda, Ravit David, Annie Thomas Selvarajan, Jacqueline Whyte Appleby With support from: Kate Davis, Amaz Taufique, Bikram Singh, Harpinder Singh, and Carlos McGregor Muro.

Editor's Notes

Good morning, my name is Jacqueline Whyte Appleby and I’m the Scholarly Resources Librarian with Scholars Portal, the service arm of the Ontario Council of University Libraries. Despite the title, I only plan to spend a little while talking about what’s messed up. The real question I want to address now isn’t why everything is so messy, but:
How do we anticipate the future of ebooks? My organization is building an ebooks platform now, fully aware that the ebooks landscape will be vastly different in five years. How do we get ready for what’s ahead? That’s not just a technology question, it’s a licensing question, it’s a budget question, it’s a staff development question. For today I’m mostly going to treat it as a technology question.
Scholars Portal is the service arm of OCUL.
Content: local hosting & discovery point for books, journals, microdata, geospatial data.
Member: ILL, chat reference.
Somewhere in the middle: accessible texts repository, research data management support.
Most of our members do not participate in Portico; we are the long-term preservation strategy of many of them. This means that almost every license that OCUL negotiates for journals and books has a local load clause that articulates the rights we need for long-term preservation.
Publishers loaded include IEEE, Wiley, Taylor & Francis, Springer, Morgan & Claypool, Oxford, Cambridge, all Canadian university presses, and many other UPs. The platform was released in late 2009, and the back end was built using software that was shortly thereafter purchased by a major library vendor and not developed any further after 2011. So this is olllld. And kind of a black box in a lot of ways. We began a redevelopment of the platform in the summer of 2016. It’s available to all of our library staff in beta right now, and its public beta is scheduled for release in January. Full release in April. This is a really exciting opportunity for us because
We made some assumptions way back in 2009. I do want to talk briefly about these assumptions, because I think a lot of us are still carrying around these assumptions, maybe not explicitly, but buried in our workflows.
When we get journals, they come to us as issues, with a whole bunch of PDFs, one for each article. They always come like this. The formats and structures in which books are packaged are much, much more diverse. These are some pretty standard formats. The blue represents a folder, pink and red represent different file types.
But there are also other ways that we might get the content. For stuff that crosses many years, different books might have different file types. So we might have really good XML mark-up, but only for books from 2016 onward. Or we might get some books with chapters, some books as single PDFs. Might get TEI. Might just get an Excel file. So what this means is that it’s impossible to write loaders that can account for all of these configurations. A loader needs to know where to find front matter, and it needs to know if it should be looking to concatenate chapters. And automation has been a real challenge too, because this will change from load to load: a new person will be packaging the files and will do it differently. It’s not to blame them; there’s no standard for how these should come.
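To make the loader problem concrete, here is a minimal sketch of the first decision a loader has to make: what kind of package just arrived? The `classify_package` function and its category labels are assumptions invented for this illustration; real deliveries vary far more than this covers.

```python
from pathlib import PurePosixPath


def classify_package(file_names):
    """Guess how a delivered book package is laid out from its file names."""
    suffixes = [PurePosixPath(name).suffix.lower() for name in file_names]
    pdf_count = suffixes.count(".pdf")
    has_xml = ".xml" in suffixes
    if pdf_count == 0:
        return "no-pdf"  # e.g. only a spreadsheet or XML arrived
    if pdf_count == 1:
        return "single-pdf+xml" if has_xml else "single-pdf"
    # More than one PDF: probably chapters that need concatenating
    return "chapter-pdfs+xml" if has_xml else "chapter-pdfs"
```

Even a crude check like this only helps if the result is reviewed per load, since the layout can change whenever a new person packages the files.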
We assumed this one so hard, we purchased a platform that is dependent on it. We cannot load a book without an associated MARC record. That’s the only place the software is capable of grabbing metadata from. But the workflow for sending PDFs and MARCs together is a dream and not a reality. The reality of course is that there are always going to be some records missing, or a delay in sending them. But the reality is also that often third parties are hired to create the records, so there’s a communication disconnect. And in these cases what happens is we get emails from librarians and faculty going, “my book just came out where is it!!!!” and we can’t do anything until we have a record.
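The "try to match" step can be sketched roughly as follows, assuming the ISBNs have already been pulled out of the file names and the MARC records. The `to_isbn13` and `match_pdfs_to_marc` functions and the record shape are hypothetical; normalizing every ISBN to 13 digits is one way to cope with a record that lists several ISBNs in mixed forms.

```python
def to_isbn13(raw):
    """Normalize an ISBN string to 13 digits; return None if it does not look like one."""
    chars = "".join(ch for ch in raw if ch.isdigit() or ch in "xX").upper()
    if len(chars) == 13 and chars.isdigit():
        return chars
    if len(chars) == 10 and chars[:9].isdigit():
        core = "978" + chars[:9]  # drop the ISBN-10 check digit
        total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
        return core + str((10 - total % 10) % 10)
    return None


def match_pdfs_to_marc(pdf_isbns, marc_records):
    """Match PDFs to records by normalized ISBN.

    pdf_isbns: hypothetical dict {file name: ISBN string}
    marc_records: hypothetical list of {"id": ..., "isbns": [...]}
    Returns (matched {file name: record id}, unmatched [file names]).
    """
    index = {}
    for record in marc_records:
        for raw in record["isbns"]:
            isbn = to_isbn13(raw)
            if isbn:
                index.setdefault(isbn, record["id"])
    matched, unmatched = {}, []
    for file_name, raw in pdf_isbns.items():
        record_id = index.get(to_isbn13(raw))
        if record_id is None:
            unmatched.append(file_name)  # load blocked until a record arrives
        else:
            matched[file_name] = record_id
    return matched, unmatched
```

The `unmatched` list is the part that generates the emails: those files sit and wait until a record shows up.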
Think back to the magical time that was 2009, when the big deal was still generally accepted as the way to get content. So our entitlements module was set up with the understanding that folks either bought the Oxford ebooks collection, or they didn’t. Fast forward to 2017 and we have:
● Schools buying incredibly granular packages.
● Schools dropping out of deals halfway through a year, which means breaking up those entitlements.
● Mergers and acquisitions, which mean metadata no longer distinguishes between two publishers, while backlist is still sold separately.
● Package configurations that change over time as publishers build or cease certain subject areas.
● Some schools buying through third-party vendors while other schools are buying directly through the publisher. Same books, but possibly sold in different configurations, and also possibly with different licensing terms.
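One way to picture the shift from "bought it or didn't" to granular, time-bounded entitlements is a record per school, per package, per date range. The sketch below is an assumption for illustration only: the `Entitlement` class, the `is_entitled` function, and the rule that a school is entitled to titles added to a package while it was in the deal are all invented here, and real license terms differ from deal to deal.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Entitlement:
    school: str
    package: str
    start: date
    end: Optional[date] = None  # None: the school is still in the deal


def is_entitled(entitlements, school, package, added_on):
    """True if the school held the package on the day the title was added.

    Assumed rule for this sketch; actual licenses may grant more or less.
    """
    for ent in entitlements:
        if ent.school != school or ent.package != package:
            continue
        if ent.start <= added_on and (ent.end is None or added_on <= ent.end):
            return True
    return False
```

A school that leaves a deal in June gets an `end` date rather than having its record deleted, which is what "breaking up those entitlements" amounts to in this model.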
Which leads me to my last assumption: we went many, many years only signing deals without concurrent use restrictions. And then one day a concurrent-use-restricted license showed up, and our software was not designed to deal with that. In many cases users are not allowed to print or copy even a single page from high-demand content. We wound up integrating Adobe Content Server into the system, which did not go over well. A major selling point of our platform is a consistent interface for many different ebook packages, and this broke that. It was a surprise and an annoyance to users who were used to always being able to access everything in the browser.
So we made some assumptions, and we were wrong, because of these things… And I don’t think any of these issues have really gone away, so they need to be at the front of our minds as we go forward.
So: we have an amazing opportunity to do things differently, to learn from past trials & tribulations. We got funding to hire two new programmers full time for two years. It has been fantastic to have new staff not only because it’s more people working on everything but because they are not bogged down in the history of the project. They are able to step back and say, “why?” Our new books platform is running on MarkLogic, which is the software we’ve been using for Journals for years. It’s also what Healthcare.gov runs on. It’s also what the NSA uses, I hear. It can handle a lot of data.
This is what the reading experience will be like on the new platform. All PDFs are rendered as HTML, which makes for a much more fluid reading experience. The bar on the left can be toggled closed.
In broad terms, we want a better user experience (like everyone). But we’re really trying to think broadly about who a user might be. It’s a user at an OCUL institution. It’s a user at an OCUL institution with a visual or perceptual disability. It’s a library staff member at an OCUL institution. It’s a publisher. It’s our own staff. It’s anyone in the world interested in the growing open access and public domain content we load.
BITS is the sister XML tag suite to JATS, which we use for journals. The plan is that at a certain point books and journals content can be more integrated. They currently sit on two different platforms. Having related metadata standards is one of the most important steps we can take in that direction. We don’t get any metadata in BITS, so we need to write crosswalks for 100% of the content we receive. We didn’t used to need to do that, because a MARC is a MARC: just ingest what you get. We’re now able to make use of much richer metadata... but it’s always got a unique DTD. ONIX 2 is not the same from publisher to publisher. Long term we hope BITS will be more widely adopted. JATS is pretty prevalent now; it’s been deemed pretty useful.
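As an illustration of what one of those crosswalks involves, here is a minimal sketch that maps a few fields from an ONIX 2.1-style record (reference tag names) into a BITS `book-meta` element. It is not our production crosswalk: the sample record and its values are invented, only three fields are mapped, and the output has not been validated against the BITS DTD. Real publisher feeds vary, which is exactly the problem described above.

```python
import xml.etree.ElementTree as ET

# Invented sample record using ONIX 2.1 reference tag names.
ONIX_SAMPLE = """
<Product>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>
    <IDValue>9780000000002</IDValue>
  </ProductIdentifier>
  <Title>
    <TitleType>01</TitleType>
    <TitleText>An Example Monograph</TitleText>
  </Title>
  <Publisher>
    <PublisherName>Example University Press</PublisherName>
  </Publisher>
</Product>
"""


def onix_to_bits(product: ET.Element) -> ET.Element:
    """Map title, ISBN-13 and publisher name into a BITS book-meta element."""
    book_meta = ET.Element("book-meta")

    title = product.findtext("Title/TitleText")
    if title:
        group = ET.SubElement(book_meta, "book-title-group")
        ET.SubElement(group, "book-title").text = title

    for ident in product.findall("ProductIdentifier"):
        if ident.findtext("ProductIDType") == "15":  # ONIX code 15 = ISBN-13
            ET.SubElement(book_meta, "isbn").text = ident.findtext("IDValue")

    name = product.findtext("Publisher/PublisherName")
    if name:
        publisher = ET.SubElement(book_meta, "publisher")
        ET.SubElement(publisher, "publisher-name").text = name

    return book_meta


bits = onix_to_bits(ET.fromstring(ONIX_SAMPLE.strip()))
bits_xml = ET.tostring(bits, encoding="unicode")
```

Each publisher-specific variant would need its own version of the lookups at the top of the function, while the BITS side stays the same. That is the appeal of harmonizing on one target format.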
As you probably know, students with disabilities can request that a print book be scanned for them, but this takes time, and once they have the file, where is it kept? The idea with ACE was to centralize that process: scan once, then make the scan available to all schools that have a local copy. And host it on the books platform so that students with disabilities can search and discover other works they have access to. BUT it was its own separate portal. It’s now been fully integrated into the Books platform. Authenticated students with a registered disability will have access to the ACE collection, but they will also be able to search and download any book they’re entitled to in the system – that’s 250,000 titles from most commercial publishers. And they’ll be able to request an alternate format if a standard PDF doesn’t work for them. Internet Archive has a tool that generates alternate formats, and turnaround is a couple of days. Just to be clear: our schools do not sign licenses that do not give us the right to transform content for users with disabilities.
Journals is a TDR; Books is not yet. So part of building the new platform has been figuring out the workflow for this. The workflow will be different from journals, because the landscape has changed since we began preserving that content. We will likely be using Archivematica, a tool for creating Archival Information Packages, and we’ll store the final product in the Ontario Library Research Cloud, our distributed cloud storage network, which has nodes at five universities across the province. Preservation is as much about policy as it is about technology, so we’re really confirming our rights with each renewal. We need the right to locally load, the right to perpetual access, and the right to transform content over time. We cannot preserve a collection without those three grants, and our schools do not sign licenses that do not allow for long-term preservation.
As I said, eresource staff are users too, and we want to build a library of MarkLogic queries so that staff can at any time grab a current list of titles loaded, can get MARCs if they use them (and a list of books that don’t have an associated MARC!), and can also get COUNTER 5 stats. It does not make sense for us to work in COUNTER 4 at this point, so some crosswalking will be necessary.
Collections are nested and hierarchical, so we can break them down as far as we need to. The admin tool is live, but the buttons are a mock-up.
As I said, we weren’t set up to monitor concurrent use in the browser. We now feel fairly confident that we can… but we don’t have good data on user behaviour, because no one’s had an option to not use ADE. Are there people who do actually prefer that format?
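The "books that don’t have an associated MARC" report is easy to state as logic, even though the real thing will be a MarkLogic query. Here is a minimal Python stand-in. The record layout (a dict with a `title` and a list of `isbns`) and the sample data are assumptions for illustration, not our schema. A title counts as covered if any one of its ISBNs appears in a MARC record.

```python
def titles_missing_marc(loaded_titles, marc_isbns):
    """Return loaded titles for which none of their ISBNs has a MARC record."""
    marc_isbns = set(marc_isbns)
    return [t for t in loaded_titles if marc_isbns.isdisjoint(t["isbns"])]


# Placeholder data: the first title matches on its second ISBN only.
loaded_titles = [
    {"title": "An Example Monograph", "isbns": ["9780000000002", "9780000000019"]},
    {"title": "Local Conference Proceedings", "isbns": ["9780000000026"]},
]
marc_isbns = ["9780000000019"]

missing = titles_missing_marc(loaded_titles, marc_isbns)
```

Matching on any ISBN rather than a single "primary" one matters for titles that arrive with several ISBNs for different formats.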
So we’re going to offer both. If you just want to browse a bit, or read a single chapter, you can do that in the browser. If you want to really engage with the book, you can download it using Adobe Digital Editions. And if a year from now we can see that no one is bothering to check out the books, we’ll probably cease using it.
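For a sense of what monitoring concurrent use in the browser can look like, here is a minimal in-memory sketch of a seat limiter with expiring leases. It is not how the platform implements it: the class and method names, the 15-minute default lease, and the idea that the reader page renews its lease by calling `open_book` again are all assumptions for illustration.

```python
import time


class ConcurrentUseLimiter:
    """Tracks who has a book open and refuses readers beyond the seat limit."""

    def __init__(self, max_users, lease_seconds=900, clock=time.monotonic):
        self.max_users = max_users
        self.lease_seconds = lease_seconds
        self.clock = clock      # injectable so the logic can be tested
        self._leases = {}       # (book_id, session_id) -> expiry time

    def _expire(self):
        """Drop leases whose expiry time has passed."""
        now = self.clock()
        self._leases = {k: v for k, v in self._leases.items() if v > now}

    def open_book(self, book_id, session_id):
        """Grant or renew a seat; return False if all seats are taken."""
        self._expire()
        key = (book_id, session_id)
        in_use = sum(1 for (b, _) in self._leases if b == book_id)
        if key not in self._leases and in_use >= self.max_users:
            return False
        self._leases[key] = self.clock() + self.lease_seconds
        return True

    def close_book(self, book_id, session_id):
        """Release a seat early, for example when the reader tab is closed."""
        self._leases.pop((book_id, session_id), None)
```

The lease expiry is the important design choice: browsers do not reliably tell the server when a reader walks away, so a seat that is never renewed has to free itself after a while.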
Again, just a mock-up.
Those are things we feel pretty confident about putting into place, these are challenges we’re
This is an extremely unsexy issue that we have got to figure out how to deal with, but even better would be for some standards organization to deal with it and demand everyone fall in line. Since we can’t get initial deliveries in a standard format, I’m not holding out a lot of hope, but... corrections are an issue.
Poor markup. It looks bad and is confusing, and is also an accessibility issue. To publishers’ credit, most are very receptive to feedback, and we expect 2018 and 2019 books to look better, but no one is going to go back and fix these. The work it would take to teach a computer to scan a book, recognize a chapter heading, and then replace what we’ve got is... a lot for a small issue, so it’s on the back burner.
So that’s the stuff we’ve done in the last year, or that we know we need to get done shortly. But we also know there are broader, bigger changes happening. We know that digital scholarship is pushing the boundaries of what a book is, and that the monograph read cover to cover or a chapter at a time is no longer the most useful unit of self-expression or study. So the bigger question is: how can we be adaptable to what comes next?
Right now we’re getting one-offs, often local faculty publications or local conference proceedings. And we deal with all of that by email. Could this act as a more repository-like tool? We have the underlying preservation infrastructure in place. Who’s in charge of making sure metadata is good and entitlements are accurate?
Long-term hosting, but also allowing for in-site remixing? Displaying reuse clauses? Notification of reuse? Versioning? Discovery? Take a situation where an instructor at one school wants to use a book in a certain configuration, and an instructor at another school wants to use it in an alternate way. Can we host both versions in a way that is useful? Preservation at the module level? In-site peer review? Can we integrate with tools like PressBooks to allow for copy editing, remixing, and peer review within the system?
Archive-It is a tool for capturing and preserving web data. There’s a lot of concern about the preservation of local history and municipal documents. At the federal and provincial levels there are some mandates in place, but nothing at the more local level. For instances where PDFs are available, there’s interest in creating metadata and hosting the PDFs locally. It’s a lot of work, but we’re well set up to host PDFs. But what about web data? Can we be flexible enough to display archived versions of websites? We think this will be an important piece of the preservation puzzle, long term. But to be useful, they should be indexed and searchable.