TOC Follow-Up Case Study: Digitizing Emerald’s Backlist Presentation


Presented at the 2008 O'Reilly Tools of Change for Publishing Conference.

Published in: Business, Education

  1. Digitizing Emerald’s Backlist
     Anna Torrance (Backfiles Project Manager)
     “The farther backward you can look, the farther forward you are likely to see” (Winston Churchill)
  2. Overview
     • About Emerald & the Backfiles
     • Rationale
     • Project processes and set-up
     • Salient points from last year’s session (Workshop with Rebecca Goldthwaite & David Durand)
     • Big Questions
     • Lessons learned
  3. About Emerald Group Publishing Limited
     • Academic journal publisher, established in 1967
     • 200 employees
     • Largest collection of management and LIS journals available today
         • 185+ business & library and information science
     • Recently acquired almost 2,000 Series, Serials and Books
     • Global business with customers in over 80 countries
     • Work with over 90% of the top business schools
     • “Research you can use”
  4. Online Usage and Dissemination
     • 20 million downloads in 2006
     • 1.5 million articles downloaded each month on average
     • 61,000 articles online over 13 years of content
     • 200,000 online abstracts
  5. What is Emerald Backfiles?
     • Digital archive back to Volume 1 Issue 1 with some articles dating back as far as 1899
     • Over 120 journal titles providing over 60,000 articles on key management disciplines
     • Each backfile transformed into a fully searchable PDF
     • Contains early articles from seminal publications (British Food Journal, European Journal of Marketing, Journal of Documentation)
  6. What is Emerald Backfiles?
  7. The Decision to Digitize
     • Users increasingly searching & accessing content online – Emerald received its 50 millionth article download in 2007
     • Over 12% of cited Emerald articles were not available online prior to the launch of Emerald Backfiles
     • Four of the five most cited papers in Emerald journals were published before 1994 (when we first started to capture content electronically)
     • Digitizing the archive will bring this knowledge to new readers and re-open historical scholarship
     • Contributing to preservation of content
  8. Pre-project preparation
     • Research – is this what our customers want?
     • Content analysis – commissioned initial inventory from British Library. Confirmation that 97.5% of content could be sourced.
     • Extended negotiations – by 2-3 months to ensure obligations spelled out for both parties & that journal list was accurate.
  9. Project Timeline
     2007
     • Apr: project received board approval
     • May: BL commissioned to carry out inventory analysis
     • Jun: attended ToC workshop
     • Jul: analysis complete & contract negotiations started
     • Aug: sales teams training
     • Sept: contract signed & work started
     • Oct: announcements made to contributor community and wider press
     2008
     • Jan: official launch
     • 1st quarter: delivery date (anticipated in March)
  10. Benefits of Partnering with the British Library
      • Experts in their field – a trusted partner
      • Extensive networks from which to source content that they did not hold
      • Digitisation on site meant no unnecessary damage to the collection or transportation costs
      • BL’s relationship with Innodata Isogen ensured a total service solution – critical due to Emerald’s tight delivery deadline
      • BL holds both common and hard to find content – 98% of Emerald articles were provided by the BL
  11. How was this project different from other British Library initiatives?
      • Co-ordination of the elements of the production team outputs
      • Emphasis on establishing content standards that could apply over the life of journals that in some cases have more than 100 years of production
  12. Digitization Process
      Main flow:
      • BL sources content
      • Content scanned & TIFF file sent to Innodata
      • Pass to BL for final QA check
      • Package sent from BL to Emerald
      Flowchart steps (as labelled on the slide):
      • Graphic creation as per specifications
      • Graphic QC
      • Renaming of graphics as per specifications
      • Upload the package
      • Make Corrections
      • Receipt of PDF / TIFF
      • Double Key (for 99.95% accuracy)
      • Compare
      • SGML Tagging
      • Visual QC
      • SGML Validation
      • Quality Checks (Yes / No)
      • Final Inspection
      • Errors reported
      • Rename the file as per the naming convention. Package SGML & Graphics.
      • Quality Audit
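
The "Double Key" and "Compare" steps on this slide can be illustrated with a short sketch. It assumes double keying means that two operators transcribe the same page independently and the two versions are then compared, with every disagreement flagged for correction. The slides do not describe the tooling Innodata actually used or how the 99.95% figure was measured, so the function name, the sample strings, and the character-level agreement score below are all hypothetical.

```python
from difflib import SequenceMatcher


def keying_discrepancies(first_pass: str, second_pass: str):
    """Compare two independently keyed transcriptions of the same text.

    Returns (agreement, diffs): a similarity score between 0 and 1, and a
    list of (operation, text_in_first, text_in_second) for every mismatch.
    """
    matcher = SequenceMatcher(None, first_pass, second_pass, autojunk=False)
    diffs = [
        (tag, first_pass[i1:i2], second_pass[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    return matcher.ratio(), diffs


# Hypothetical sample: the second operator keyed a digit "1" for the letter "l".
first = "The purity of butter sold in 1899"
second = "The purity of butter so1d in 1899"

agreement, diffs = keying_discrepancies(first, second)
for tag, left, right in diffs:
    print(f"{tag}: {left!r} vs {right!r}")
print(f"agreement: {agreement:.4f}")
```

Any flagged difference would go back for correction before SGML tagging; two passes that agree everywhere produce an empty list.
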
  13. Project Process
      Diagram labels: Journal Control List; Hard copy journal; Scanning; Publishing/QA; Digitization; Content load onto website; Web pages; Website; four CHECK points
  14. Project Team
      • Initial pre-project workshop included members from all departments to discover interdependencies – issues identified but not all resolved
      • Emerald Project Board – cross-functional
      • British Library Project team
      • Maintained 3 way communication throughout the project (Emerald, BL and Innodata). Regular face-to-face meetings of great benefit when discussing content selection. BL also held weekly tele-conferences with Innodata production team
      • 2 project managers – technical and non-technical
  15. Project Manager x2 – a balancing act
      • Coordination
      • Balancing tensions between business requirements and IT milestones
      • 2 parallel workstreams
      • Shared ownership (possible confusion)
  16. Lessons Learned from last year’s TOC presentation
      • Involved contributor community from the outset
      • Regular communication
      • Editorial network helped to source rare journal copies
      • Continuing involvement of editorial teams by linking Backfiles to other initiatives (leading journal campaigns)
      “Editorial is key – these are the people with detailed content knowledge”
  17. Lessons Learned from last year’s TOC presentation
      • Article audits
      • Time taken to ensure journal control list was accurate
      BUT
      • Could have taken a much bigger cross-section of content going further back
      “Even if you are only digitizing PDFs, think of them as XML just so that you have an in-depth knowledge of the content”
  18. Lessons Learned from last year’s TOC presentation
      • Every SGML file checked
      • Spot checking by all 30+ members of the editorial department in addition to regular QA processes
      • Regular meetings
      • Time commitment emphasised from the outset
      • Training to engage sales teams who were tasked with selling a product they could not demonstrate
      “Don’t short change QA”
      “Engage team members and make clear time commitment up front”
  19. Lessons Learned from last year’s TOC presentation
      • Not every article from every issue will be present at go-live
      • Some articles may never be sourced
      BUT
      • Fully searchable PDFs essential
      • SGML files must be accurate
      “Ask yourself: what is the minimum required to declare victory?”
  20. Big Questions
      • What is an article?
          • Older journals more like magazines.
          • Problems differentiating article and non article content (NAC).
      • At first, all pages scanned (adverts, NAC)
          • Product process stopped for 2-3 weeks to reassess content standards which were largely based on the formats of current publications.
          • Regular meetings with BL
          • Decided to scan whole issue of earlier journals
  22. British Food Journal – 1899 Back
  23. Big Questions
      • What are the copyright implications?
          • Emerald retains copyright of the Backfiles product but not every article within the Backfiles
          • Network of almost 50,000 authors notified
          • Notice put on each article with option to contact Emerald if authors feel copyright breached
          • Policy established to remove any articles at author’s request
      • To date, no removal requests received
  24. Big Questions
      • What should we do with articles without abstracts?
          • Where no abstract provided, first paragraph of the article used
      • How should we apply DOIs?
          • Could not apply in usual manner (including journal ISSN)
          • Decision made to apply sequentially without ISSNs
      • How should acquisitions be handled going forwards?
          • Cut-off point in July 2007
          • New acquisitions will not be included in the Backfiles product until a new version is released
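
Two of the decisions on this slide, using the first paragraph where no abstract exists and applying DOIs sequentially without ISSNs, can be sketched in a few lines. This is an illustration only: the slides do not give Emerald's DOI syntax, so the prefix `10.5555`, the `eb` suffix, the zero-padded width, and both function names are placeholders, and the sample articles are invented.

```python
from itertools import count


def abstract_or_first_paragraph(abstract, body):
    """Use the supplied abstract, else fall back to the article's first paragraph."""
    if abstract and abstract.strip():
        return abstract.strip()
    for paragraph in body.split("\n\n"):
        if paragraph.strip():
            return paragraph.strip()
    return ""


def sequential_dois(prefix, start=1, width=8):
    """Yield DOIs whose suffix is a running number, with no ISSN component."""
    for n in count(start):
        # "eb" is a hypothetical suffix label, not Emerald's real scheme.
        yield f"{prefix}/eb{n:0{width}d}"


dois = sequential_dois("10.5555")  # placeholder DOI prefix

# Invented sample records: the first has no abstract, the second has one.
articles = [
    {"abstract": "", "body": "Butter adulteration remains common.\n\nSecond paragraph."},
    {"abstract": "A study of retail pricing.", "body": "Introduction text."},
]

for article in articles:
    article["doi"] = next(dois)
    article["display_abstract"] = abstract_or_first_paragraph(
        article["abstract"], article["body"]
    )
    print(article["doi"], "|", article["display_abstract"])
```

A running counter keeps the identifier independent of journal metadata, which matters for older titles where the usual ISSN-based pattern could not be applied.
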
  25. Big Questions
      • How should we sell the Backfiles?
          • By journal?
          • By subject?
          • As a single collection?
          • Regional pricing – consistency with Emerald’s core product
          • Regional launches
      • When should we sell the Backfiles?
          • Danger of missing 2008 cycle
  26. Managing Expectations – externally
      • Clear web branding strategy:
          • Icons to identify content as Backfiles (different to archive content)
          • PDF-only stated on every abstract page
      • Inform the author community early on
      • Clear communication to customers about launch date and availability
  27. Managing Expectations – internally
      • Address tensions between IT/Production and Sales & Marketing
          • Sales – want to provide customers with a delivery date ASAP
          • Production – want as much time as possible to source all articles and ensure quality
          • IT – do not want to commit to a date too early due to number of unresolved requirements
          • Marketing – want product features and specification early in order to create material
  28. New Business Model
      • Providing flexible purchasing options for librarians who prefer to purchase access in perpetuity
      Diagram labels:
      • 1899-1993: Backfiles (access in perpetuity)
      • 1994-2001: content rented – access as long as institution is a subscriber
      • 2002-present: access in perpetuity
      • 2002: Emerald sub starts
      • Offer librarians option to purchase access in perpetuity to this content
  29. Lessons Learned
      • Choice of partner plays an important role – British Library made the whole process simple, efficient and cost-effective
      • Every journal is different – an in-depth knowledge of ALL content is essential
      • Obtain the widest/earliest possible cross-section of content
      • Don’t underestimate the importance of the content control list!
  30. Lessons Learned
      • Spend time identifying all issues up front but do not try to answer them all at once
      • File naming conventions need to take into account all eventualities over the full content range – it’s difficult to change conventions mid project
      • If outsourcing digitization, take as much time as you need to get the first batch perfect
      • Be aware of tensions between project delivery date and business requirements – managing expectations is key
  31. Conclusion
      • Official launch at American Library Association at end of January 2008 and product on schedule for delivery in 1st Quarter of 2008 as planned.
      • Pre-launch order list healthy – exceeded expectations.
      • New acquisitions in Series, Serials & Books could mean another Backfiles project in the not too distant future!
  32. Thank You
      Questions?