Goobi at the Bodleian
Published in: Education, Technology
2 Comments
  • @Dan Field Hi Dan, there was a UK Goobi users group meeting earlier this year [end of May], and I presented it there. I would be happy to answer any questions, if you have any. [email to my Bodleian address]
  • Would have liked to see this presentation, where did you give it?

Transcript

  1. Goobi at the Bodleian: Background and work so far
  2. Background
     • Existing long-running and very experienced digitisation studio.
     • Primarily low-volume, very high-quality work. Special collections material.
     • Some project-funded larger scale projects, but not in the recent past.
  3. Existing systems
     A mixture of bespoke applications and a diverse mix of technologies:
     • MySQL
     • MS Access
     • VBA
     • Perl
     • PHP
     • Python
     • Windows batch files
     • ImageMagick
     • Shell scripts / cron
  4. ‘Systems’ limitations
     • Physical hardware nearing end of lifetime.
     • Physical hardware performance inadequate for existing production volume.
     • Network limitations.
     • Commercially supported software at or past end of lifetime.
     • Bespoke or locally developed software past end of lifetime, and not suitable for incremental upgrade and revision.
     • Lack of in-house resources to build a completely new workflow system from scratch.
     • Poor or non-existent documentation.
  5. Project work and ‘mass’ digitisation
     Newly funded major digitisation projects:
     • Polonsky foundation: 500,000 images (3 years) – Greek & Hebrew manuscripts and incunabula.
     • Chinese: 1,000,000 images.
     Need to substantially increase production, while maintaining quality.
     Existing systems already inadequate for current production levels.
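To give a feel for what "substantially increase production" means, the Polonsky figures on the slide above (500,000 images over 3 years) can be turned into an average daily rate. This is only a rough sketch: the number of working days per year is an assumption and does not come from the slides, and no duration is given for the Chinese project, so it is left out.

```python
# Rough throughput estimate for the Polonsky project figures quoted on the slide.
# WORKING_DAYS_PER_YEAR is an assumption, not a figure from the presentation.
TOTAL_IMAGES = 500_000
PROJECT_YEARS = 3
WORKING_DAYS_PER_YEAR = 250  # assumed


def images_per_working_day(total_images: int, years: int, days_per_year: int) -> float:
    """Average number of images that must be produced on each working day."""
    return total_images / (years * days_per_year)


polonsky_rate = images_per_working_day(TOTAL_IMAGES, PROJECT_YEARS, WORKING_DAYS_PER_YEAR)
print(f"about {polonsky_rate:.0f} images per working day")
```

Changing the assumed working days shifts the result proportionally, so the figure is best read as an order of magnitude rather than a target.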
  6. Solution
     • Software workflow:
       ◦ Goobi – phased introduction. Phase 1: ‘large’ projects only, Phase 2: smaller commercial orders.
     • New hardware infrastructure:
       ◦ Dedicated server cluster (virtualised)
       ◦ Upgraded network infrastructure
       ◦ Custom built from the ground-up to support high-volume digitisation.
     • Repository:
       ◦ ‘Databank’
     • Delivery:
       ◦ Digital.Bodleian
       ◦ Viewer.Bodleian
  7. Current State of Play
     • Software workflow:
       ◦ Goobi – Entering final testing phase, prior to roll-out.
     • New hardware infrastructure:
       ◦ Dedicated server cluster (virtualised on dedicated hardware) – In build and test.
       ◦ Upgraded network infrastructure – Nov. 2014 [move to a new building]
       ◦ Custom built from the ground-up to support high-volume digitisation.
     • Repository:
       ◦ ‘Databank’ – In production.
     • Delivery:
       ◦ Digital.Bodleian – ‘Soft’ launch, not in full public launch.
       ◦ Viewer.Bodleian – In production. Version 1.
  8. Goobi workflow (1)
     • Create process
     • Insert UUID and export path [as process properties]
     • Order and check physical item
     • Photography
     • TIFF verification [JHOVE2]
     • JPEG generation
     • JPEG verification [JHOVE2]
     • QA
     • JPEG2000 creation [Kakadu + Python]
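The verification steps on this slide are done with JHOVE2. As a loose illustration of the idea only, the sketch below checks the leading "magic" bytes of a file against the well-known signatures for classic TIFF, JPEG and JP2. It is a cheap pre-check under stated assumptions, not a substitute for JHOVE2 and not the Bodleian's actual code; the names `SIGNATURES`, `sniff_format` and `check_file` are hypothetical, and BigTIFF is deliberately not covered.

```python
from pathlib import Path
from typing import Optional

# Leading bytes for the three formats named in the workflow.
# Classic TIFF only (little- and big-endian); BigTIFF is not handled here.
SIGNATURES = {
    "tiff": (b"II*\x00", b"MM\x00*"),
    "jpeg": (b"\xff\xd8\xff",),
    "jp2": (b"\x00\x00\x00\x0cjP  \r\n\x87\n",),  # JP2 signature box
}


def sniff_format(header: bytes) -> Optional[str]:
    """Return the format name whose signature matches the header, or None."""
    for name, magics in SIGNATURES.items():
        if header.startswith(magics):
            return name
    return None


def check_file(path: Path, expected: str) -> bool:
    """Read the first 12 bytes of a file and compare against the expected format."""
    with path.open("rb") as fh:
        return sniff_format(fh.read(12)) == expected
```

A check like this only confirms that a file claims to be the right format; structural validation of the kind JHOVE2 performs still has to happen separately.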
  9. Goobi workflow (2)
     • JPEG2000 verification [JHOVE2]
     • Metadata entry
     • Metadata QA
     • Export to DMS
     • UUID generation [for page/image level records]
     • Generate derivative metadata [Dublin Core, IIIF]
     • Extract EXIF/XMP technical metadata [Exempi / Python]
     • Send to queue/workers for upload to repository [RabbitMQ, Databank]
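The later export steps (page-level UUIDs, derivative Dublin Core, and a message handed to a queue) can be sketched with the Python standard library alone. Everything here is illustrative: the slides do not say which UUID version is used, what the Dublin Core records look like, or what the queue messages contain, so the random version 4 UUID, the `record` wrapper element, the payload fields and all function names are assumptions. Actually publishing to RabbitMQ would need a client library and is left out.

```python
import json
import uuid
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def new_page_uuid() -> str:
    """Generate an identifier for a page/image record (random UUID, assumed)."""
    return str(uuid.uuid4())


def dublin_core_record(identifier: str, title: str, mime_type: str) -> str:
    """Build a tiny Dublin Core record; the <record> wrapper is hypothetical."""
    root = ET.Element("record")
    for tag, value in (("identifier", identifier), ("title", title), ("format", mime_type)):
        ET.SubElement(root, f"{{{DC_NS}}}{tag}").text = value
    return ET.tostring(root, encoding="unicode")


def upload_message(process_uuid: str, page_uuid: str, image_path: str) -> bytes:
    """Serialise a hypothetical upload job as the JSON body of a queue message."""
    payload = {"process": process_uuid, "page": page_uuid, "path": image_path}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


# Example usage with made-up values.
page_id = new_page_uuid()
print(dublin_core_record(page_id, "Example page", "image/jp2"))
print(upload_message(new_page_uuid(), page_id, "/tmp/example.jp2"))
```

Keeping the message body as plain JSON means the workers that upload to the repository do not need to know anything about Goobi itself, only about the fields in the payload.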
  10. Problems / Lessons learned
     • Metadata ‘ruleset’:
       ◦ Difficulties getting consensus from disparate groups of stakeholders, e.g. curators and technical specialists.
       ◦ Information gathering / consultation time-consuming, and returns poor.
     • Systems integration:
       ◦ Difficulties integrating with elements of our own systems where no ‘out-of-the-box’ or standard solutions exist.
     • Systems performance:
       ◦ Networking bandwidth
       ◦ Server loads
       ◦ Working storage for ‘in-flight’ data.
       ◦ Efficient ‘pipe’ to final repository.
  11. Ongoing problems / work remaining
     • Goobi only replaces part of our existing workflow.
     • Further development needed to integrate with on-line ordering, order/customer tracking, and billing systems.
     • Further development needed to integrate with secure delivery mechanisms for commercial orders.
     • Possible integration with other library systems and resources.
