Historical Texts: Blending the old with the new – the challenges behind building a replacement for EEBO and ECCO
Scott Gibbens Senior Service Manager (Jisc eCollections) (JIBS presentation)
1. Historical Texts: Blending the old with the new – the challenges behind building
a replacement for EEBO and ECCO
28/02/2015
2. Background
• Jisc Collections purchased the rights to EEBO (Early English Books Online) and ECCO
(Eighteenth Century Collections Online) many years ago
• Those rights initially allowed academic institutions in the UK to access the
collections via the content providers, paying only a small access fee
• Crucially, we also have the rights to create our own version of the platforms if we
wish.
• As ProQuest and Cengage started to increase the platform fees, we decided to build
our own platform.
• In 2011 Jisc Historic Books was launched. It had a number of problems that were
difficult to resolve using the platform that had been chosen
• In the spring of 2013 our Advisory Board agreed to build a new platform from scratch
– this service, Historical Texts, is the one I intend to talk about today
3. What are the challenges?
»The big two
› The data
› The users
»Also…
› Hardware
› Software / Supplier
› Staffing
› Time and Budgets
4. The data
»We have three collections (EEBO, ECCO and the BL)
»Each collection has radically different data
› EEBO has MARC records to describe each publication
› EEBO also sometimes includes TCP text
› ECCO has quite clear metadata for each book. This also
includes some quite poor OCR
› BL has a different style of metadata, which does
include some errors. It also includes OCR, which is a
little better than ECCO's.
5. The data
»In native ECCO you cannot see the OCR text, but when
you look at the data the poor OCR becomes clear
6. The data
» We need to index all that poor OCR so it can be found.
» Some data is not supplied by the providers. Periodicals data in
ProQuest is a clear example. We have one MARC record
covering huge runs of titles.
» ProQuest seemed unable to provide us with the additional
data to describe each issue, so we had to get the data
ourselves
» We had to create a search engine that could search all these
collections including options like fuzzy search and variant
spelling search.
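As a rough illustration of what fuzzy search and variant spelling search mean over poor OCR, here is a minimal Python sketch. It is not the Historical Texts implementation: the function names, the spelling rules (long s, u/v, i/j, vv/w) and the 0.8 similarity cutoff are all assumptions chosen for the example, and it uses only Python's standard `difflib` module.

```python
import difflib


def normalise(word: str) -> str:
    """Fold a word to a crude 'variant spelling' key (illustrative rules only)."""
    w = word.lower().replace("ſ", "s").replace("vv", "w")
    # Treat u/v and i/j as interchangeable by mapping each pair to one letter.
    return w.translate(str.maketrans({"v": "u", "j": "i"}))


def build_index(tokens):
    """Map each normalised key to the set of original spellings seen in the text."""
    index = {}
    for token in tokens:
        index.setdefault(normalise(token), set()).add(token)
    return index


def search(query, index, cutoff=0.8):
    """Return every original spelling whose key is close to the query's key."""
    key = normalise(query)
    # Fuzzy step: tolerate small OCR errors such as a long s misread as 'f'.
    close_keys = difflib.get_close_matches(key, list(index), n=10, cutoff=cutoff)
    hits = set()
    for k in close_keys:
        hits |= index[k]
    return sorted(hits)


if __name__ == "__main__":
    words = ["loue", "love", "Loue", "haue", "iustice", "justice", "iuftice"]
    idx = build_index(words)
    print(search("love", idx))
    print(search("justice", idx))
```

In this sketch the variant spelling step happens at index time (every spelling is folded to one key), while the fuzzy step happens at query time. A production search engine would normally do both inside the index itself rather than comparing the query against every key.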
7. The users
»Users also wanted every *single* feature of their old
database made available in the new platform (even if
they didn’t use that feature)
»Different user groups had quite different requirements
»Sometimes they had unrealistic expectations (searchable
text for all EEBO, improved OCR in ECCO)
»They wanted to be consulted.
8. Hardware
»Initial build started just as Mimas was in the middle of a
major reorganisation
»Mimas was no longer going to host large numbers of
servers; instead everything would be in the cloud
»We had to get to grips with this new approach, configure
and set up cloud servers and transfer almost 40 TB of
information over the UoM internet connection (page
images, PDFs, thumbnails and text) to the Amazon cloud
servers in Ireland
9. Supplier / Software
»We had decided quite early on we wanted a system we
could adapt – we needed to know what was going on
under the lid!
»We had to get a supplier in to help us build the system (at
this time we had just one developer)
»We were keen to go with an open source solution so that
we could take over the solution in the long term
»We needed a supplier that could work with the complex
data and would work with us in a collaborative approach
10. Staffing
» When we started, the Historical Texts and Journal Archives
team had three staff (and that includes me)
» We had to keep the old service running as well as develop a
new service
» We needed to increase the team. We quickly managed to
recruit another support post (Mimas was ending some
contracts), but we had problems recruiting technical
staff
» It took us over 6 months to recruit another developer (to give
us two developers).
11. Time and Budgets
»A number of things we did were dictated by time and
budget restrictions
»Our old supplier demanded we turn off the old system by
the end of June 2014, or we would face massive
additional charges (they owned the IP of the site)
»We have had to pick and choose developments
12. We faced the challenges
» Data – We ended up having to “page scrape” some data from EEBO just
to make the periodicals searchable
» The users – We set out to involve the users from the start, from
procurement through to testing wireframes and selecting enhancements.
» Hardware – The cloud has been a huge benefit. It has meant we have
been able to adapt the hardware we need as the system develops
» Supplier & Software – we made the right choices, with the help of our
users
» Staffing – we are about to recruit another developer, and our team works
alongside our supplier – a true example of partnership working.
» Time and Budgets – still a challenge, even with the best project
planning!
13. From EEBO
17. Have we overcome the challenges?
» We are currently running a survey on our site.
» We have had many encouraging comments
› “It's excellent, a real improvement on EEBO.”
› “Nice new interface with some good features.”
› “The viewer has everything I need without being cluttered, and the search
can be as broad or as specific as I like”
› “The interface is VASTLY better than the old EEBO interface.”
» But, like any service, we still have some people who are not happy
› “I've only been using it for a month. EEBO was a site I recommended to all my
students. This new site is a nightmare that I've had to apologise to my
students for.”
» We clearly still have some work to do to convince everyone.
18. Find out more…
Scott Gibbens
Senior Service Manager (Jisc eCollections)
scott.gibbens@jisc.ac.uk
Brettenham House (South Entrance), 5 Lancaster
Place, London, WC2E 7EN
info@jisc.ac.uk jisc.ac.uk
Except where otherwise noted, this work is licensed under CC-BY-NC-ND