Ebooks without Vendors: Using Open Source Software to Create and Share Meaningful Ebook Collections - LITA 2015 edition

When you start building your own ebook collections from items in your community, you stop looking at them as licensed products and start seeing them as tools. In this talk I present the open source tools used to build The Community Cookbook website, which I created while working at Westlake Porter Public Library:
http://cooking.westlakelibrary.org

This presentation has been updated since the previous version.

I wrote about this project for the Code4Lib Journal. The article includes more technical details: http://journal.code4lib.org/articles/9911

Published in: Education

  1. EBOOKS WITHOUT VENDORS Using Open Source Tools to Create and Share Meaningful Ebook Collections
  2. Who am I? Matt Weaver Systems Medical Librarian – Cleveland Clinic
  3. Who am I? Matt Weaver Former IT Manager - Westlake Porter Public Library
  4. Not an alternative to Overdrive, ebrary, 3M, etc.
  5. EBOOKS AS TOOLS To be created by: • the library • the community Opportunities for: • collaboration • connection
  6. An Experiment: Library as publisher
  7. USAGE: late Oct. 2013 through Jul. 2015 More than 2,000 ebook downloads More than 60,000 recipes downloaded/printed 15% by cardholders
  8. Costs: Content: $0 Software licensing: $0 Staff time: 4-7 hours per ebook (estimated) Mostly editing
  9. An Experiment: Library as publisher
  10. SECURING ACCESS TO CONTENT
  11. DIY: Copyright Disclaimer: I am not now, nor have I ever been a lawyer. I am not a copyright expert. tap shoes
  12. DIY: Copyright - Resources • http://cocatalog.loc.gov/
  13. DIY: Copyright - Resources • http://collections.stanford.edu/copyrightrenewals/bin/page?forward=home
  14. DIY: Copyright - Resources Digital Copyright Slider http://librarycopyright.net/resources/digitalslider/
  15. DIY: Copyright - Resources Section 108 Spinner http://librarycopyright.net/resources/spinner/
  16. DIY: Copyright - Resources Copyright Genie http://librarycopyright.net/resources/genie/
  17. Orphan Works: http://bit.ly/1WRC8ck “…the Copyright Office rejects the idea that fair use can provide an adequate solution [to the problem of orphan works]”… Krista Cox, Association of Research Libraries
  18. DIY: Copyright Because of digital distribution, and because the library does not own titles to be digitized… • no Fair Use case • no Section 108 protections
  19. DIY: Copyright Documentation of copyright research Content Permission agreements
  20. EBOOKS DISSECTED & DIGITIZED
  21. ePub as zip file
  22. ePub as zip file
  23. ebook markup HTML & CSS
  24. Everything has been digitized, right? Bad OCR: hours, fractions Scanned ≠ Digitized Corrected WPPL Epub page
  25. Everything has been digitized, right? Curation/editing takes time. Who else would invest such time? Corrected WPPL Epub page
  26. PRODUCING EBOOK FILES
  27. Homer ebook project http://bookscanner.pbworks.com/w/page/40965440/FrontPage
  28. Homer The following tools are installed as part of the Homer Project: • ImageMagick (for manipulating images) • Jpegtran (lossless jpeg transformation) • JBIG2 encoder (compression tool for bi-level images) • Tesseract-OCR (optical character recognition) • RubyInstaller (installs the Ruby programming language) • Hpricot (HTML parser) • RMagick (interface between the Ruby programming language and ImageMagick) • Pdfbeads (to create searchable PDF) • Cmdow.exe (command-line utility used in Homer) • ScanTailor (post-processing tool) • Homer (command-line bash script)
  29. Homer The following tools are installed as part of the Homer Project: • ImageMagick (for manipulating images) • Jpegtran (lossless jpeg transformation) • JBIG2 encoder (compression tool for bi-level images) • Tesseract-OCR (optical character recognition) • RubyInstaller (installs the Ruby programming language) • Hpricot (HTML parser) • RMagick (interface between the Ruby programming language and ImageMagick) • Pdfbeads (to create searchable PDF) • Cmdow.exe (command-line utility used in Homer) • ScanTailor (post-processing tool) • Homer (command-line bash script)
  30. Ebook Production Workflow
  31. Ebook Production Workflow or
  32. Homer: ScanTailor • Preprocess tiff-format images of book pages • Deskewing • De-speckling • Correcting warp • Right-to-left language support • Outputs images for Homer
  33. Homer: ScanTailor
  34. Homer: ScanTailor
  35. OCR challenges
  36. HOMER BASH SCRIPT It looks like command-line…
  37. HOMER BASH SCRIPT but it’s drag-and-drop!!!
  38. Homer: tesseract-ocr Optical Character Recognition Multilingual support - From Afrikaans to Vietnamese
  39. Homer: pdfbeads Outputs a searchable PDF
  40. Homer & pdfbeads Outputs a searchable PDF
  41. Sigil https://code.google.com/p/sigil/
  42. Epub Validator http://validator.idpf.org/
  43. Calibre http://calibre-ebook.com/
  44. COMMUNITY COOKBOOK COOKING.WESTLAKELIBRARY.ORG
  45. Drupal • Open source content management system drupal.org
  46. Drupal Ability to create custom fields for metadata – can be hidden from users drupal.org
  47. Drupal – Controlling Access Private files vs public files drupal.org
  48. Drupal Controlling Access – ILS authentication module
  49. Drupal – Controlling Access Taxonomy Control Lite module: permissions based on taxonomy terms drupal.org
  50. Drupal – Recipe module
  51. 3 content types: • recipe • ebook • organization • Drupal 7 • “Responsive” layout
  52. Drupal - Omega 3 Responsive Theme
  53. PHASE TWO (WHAT MIGHT HAVE BEEN?)
  54. Drupal – ePub module
  55. Drupal – ePub module
  56. Drupal – PDF module
  57. Drupal – HTML import module
  58. Merging Content
  59. The Community Cookbook – mapping
  60. Bonus: Capturing Original Content …with one more open-source tool, we can even help them design print versions:
  61. Bonus: Capturing Original Content We can do everything but the printing.
  62. Further Reading http://journal.code4lib.org/articles/9911
  63. Further Reading • Jarret Buse, Epub from the Ground Up: A Hands-on Guide to EPUB2 and EPUB3 • Excellent guide to the guts of ebooks • Features many of the open-source programs I have discussed http://www.worldcat.org/oclc/837954536
  64. Further Reading Stanford University: Copyright & Fair Use – Charts and Tools http://fairuse.stanford.edu/charts-and-tools/
  65. mattrweaver
  66. Image credits Open Source Sign Timothy Appnel - https://www.flickr.com/photos/tappnel/5798812875/; “Librarian from Turn of the Century” - http://www.moyak.com/researcher/Clients/male_librarians/index.html?id=34; Ereaders - Michael Porter https://www.flickr.com/photos/libraryman/5052936803/; Apples & oranges http://mrg.bz/n1xLHg; Techno_background2.jpg (ones and zeroes) http://www.morguefile.com/creative/Grafixar; Ricoh Copier: http://www.itinstock.com/ekmps/shops/itinstock/images/ricoh-aficio-mp-4001-fast-photocopier-copier-printer-scan-fax-5598-p.jpg
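
The slides titled "ePub as zip file" and "ebook markup HTML & CSS" can be illustrated with a short Python sketch. It is not taken from the talk: the file names (OEBPS/recipe.xhtml, OEBPS/style.css), the sample recipe text, and the helper names are all hypothetical, and the package file (content.opf) that a real ebook needs is left out, so the result would not pass the Epub Validator. It only shows that the container is an ordinary zip archive whose first entry is an uncompressed "mimetype" file, with HTML and CSS files alongside it.

```python
import io
import zipfile

# Hypothetical contents, chosen only for illustration.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

PAGE_XHTML = """<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Sample recipe</title>
    <link rel="stylesheet" type="text/css" href="style.css"/>
  </head>
  <body><h1>Sample recipe</h1><p>Bake for 1 1/2 hours.</p></body>
</html>
"""

STYLE_CSS = "h1 { font-family: serif; }\n"


def build_sketch_epub() -> bytes:
    """Build an incomplete, EPUB-shaped zip archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # The EPUB container format expects "mimetype" to be the first
        # entry in the archive and to be stored without compression.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        # The content.opf package file named in container.xml is
        # deliberately omitted to keep the sketch short.
        zf.writestr("OEBPS/recipe.xhtml", PAGE_XHTML)
        zf.writestr("OEBPS/style.css", STYLE_CSS)
    return buf.getvalue()


def list_members(epub_bytes: bytes) -> list:
    """Return the archive's entry names in the order they were written."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        return zf.namelist()


def read_member(epub_bytes: bytes, name: str) -> str:
    """Return one entry of the archive decoded as UTF-8 text."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        return zf.read(name).decode("utf-8")


if __name__ == "__main__":
    data = build_sketch_epub()
    for member in list_members(data):
        print(member)
```

The same two reading helpers should work on any real .epub file read into bytes, which is one way to inspect the markup before editing it in a tool such as Sigil.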
