Improved validation and feature extraction for JPEG 2000 Part 1: the jpylyzer tool

Presentation on jpylyzer, a new tool that performs thorough validation of JPEG 2000 Part 1 (JP2) images. Presented at the IS&T "Archiving 2012" conference.


  1. SCAPE. Improved validation and feature extraction for JPEG 2000 Part 1: the jpylyzer tool. Johan van der Knijff (1, 2), René van der Ark (1), Carl Wilson (3). (1) Koninklijke Bibliotheek – National Library of the Netherlands; (2) Open Planets Foundation; (3) The British Library. IS&T, Archiving 2012, Copenhagen, 15.6.2012
  2. Metamorfoze: National Programme for preservation of paper heritage. Digitisation as a means to conserve threatened paper originals. 146 TB. Migrate by end 2012: TIFF to JP2.
  3. JP2 from JISC 1 Newspaper Collection (BL)
  4. JP2 from JISC 1 Newspaper Collection (BL): “Well-formed and valid”
  5. Hardware failure may result in corrupted images. Image source: http://img70.imageshack.us/img70/9950/serversnm2.jpg
  6. Not all encoders produce standard-compliant images
  7. Possible solutions. Option 1: Improve the JHOVE JPEG 2000 module. But no institutional support, superseded by JHOVE2 (?). Option 2: Develop JPEG 2000 module for JHOVE2. Not ready for operational use (yet). Option 3: Develop dedicated tool.
  8. Jpylyzer tool [figure: string of binary digits]
  9. Jpylyzer tool. First prototype: December 2011. Refactoring of original code: Jan 2012. Packaging (Debian): Mar 2012 (Univ. Southampton, KEEP Solutions, AIT Vienna). Add remaining functionality, bugfixes: Apr-May 2012 (current version: 1.5).
  10. JP2 file: JPEG 2000 Signature box; File Type box; JP2 Header box (superbox); Contiguous Codestream box 0; Contiguous Codestream box n; IPR box; XML box(es); UUID box(es); UUID Info box(es) (superbox)
  11. Command-line use
  12. Result
  13. Properties extraction (excerpt)
  14. Properties embedded ICC profile
  15. Documentation
  16. Example 1: detection of broken JP2s in JISC 1 Newspapers. Number of images: 2,152,116. Total size: 45 TB. Average image size: 21.8 MB. Number of threads: 1. Time: 21 days*. Images/day/thread: 100,000. TB/day/thread: 2. *Includes unzipping; actual time needed by jpylyzer much less!
  17. Results. 676 broken JP2s in JISC 1 collection (0.03 %); TIFF originals still available. JISC 2 (> 1 million images): 3 broken JP2s. 19th Century books (> 22 million images): no broken JP2s.
  18. Example 2: quality control Metamorfoze migration. 146 TB. Migrate by end 2012: TIFF to JP2.
  19. Quality-control workflow (flowchart): TIFF converted to JP2 with the Aware JP2K SDK; pixel compare (pixels identical? yes/no); jpylyzer* (valid JP2? yes/no); image properties compared against a properties profile (properties match? yes/no); outcome: pass or fail. *Imported as module in Python-based workflow
  20. Example 3: pre-ingest quality control Wellcome Library. JP2s produced in-house and by external suppliers. Use jpylyzer to validate against JP2 spec. Use extracted properties to validate against a profile (progression order, ratio, layers, ...). Profile coded as XML schema (so jpylyzer output can be validated against schema).
  21. Platforms and licensing stuff
  22. http://www.openplanetsfoundation.org/software/jpylyzer
  23. Community involvement
  24. Acknowledgements. Debian packages: Dave Tarrant (Uni Southampton/OPF); Miguel Ferreira, Rui Castro, Hélder Silva (KEEP Solutions); Rainer Schmidt (AIT). Feedback on early versions: Christy Henshaw (Wellcome Library); Ross Spencer (TNA); Wouter Kool (KB).
  25. Funding. This work was partially supported by the SCAPE Project. The SCAPE project is co-funded by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137). http://www.scape-project.eu #SCAPEProject
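
The box layout listed on slide 10 can be illustrated with a short Python sketch that walks the top-level boxes of a JP2 file. This is not jpylyzer's code and it performs no validation of box contents. It only relies on the box header layout defined in ISO/IEC 15444-1: a 4-byte big-endian length, a 4-byte type, and an optional 8-byte extended length. The function name `list_top_level_boxes` and the `BOX_NAMES` table are assumptions made for this illustration.

```python
import struct

# Four-character type codes for the boxes named on slide 10 (ISO/IEC 15444-1).
BOX_NAMES = {
    b"jP  ": "JPEG 2000 Signature box",
    b"ftyp": "File Type box",
    b"jp2h": "JP2 Header box (superbox)",
    b"jp2c": "Contiguous Codestream box",
    b"jp2i": "IPR box",
    b"xml ": "XML box",
    b"uuid": "UUID box",
    b"uinf": "UUID Info box (superbox)",
}


def list_top_level_boxes(data):
    """Return a list of (box type, total box length) for each top-level box.

    Hypothetical helper for illustration; it does not descend into
    superboxes and does not check box contents.
    """
    boxes = []
    offset = 0
    while offset < len(data):
        if offset + 8 > len(data):
            raise ValueError("truncated box header at offset %d" % offset)
        lbox, tbox = struct.unpack(">I4s", data[offset:offset + 8])
        header_size = 8
        if lbox == 1:
            # Length field of 1: the real length is in an 8-byte field
            # that follows the type field.
            if offset + 16 > len(data):
                raise ValueError("truncated extended length at offset %d" % offset)
            (length,) = struct.unpack(">Q", data[offset + 8:offset + 16])
            header_size = 16
        elif lbox == 0:
            # Length field of 0: the box runs to the end of the file.
            length = len(data) - offset
        else:
            length = lbox
        if length < header_size or offset + length > len(data):
            raise ValueError("bad box length at offset %d" % offset)
        boxes.append((tbox, length))
        offset += length
    return boxes
```

Passing the bytes of a JP2 file to `list_top_level_boxes` and looking each returned type up in `BOX_NAMES` gives a quick view of the top-level structure; a file that raises `ValueError` here is damaged at the box level, which is only one of the many checks a full validator has to make.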

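
The throughput and failure-rate figures on slides 16 and 17 can be checked with a few lines of Python. All input numbers are taken from the slides; the variable names are chosen for this sketch only. Note that the 21 days include unzipping, so the rates describe the whole run rather than jpylyzer alone.

```python
# Figures quoted on slides 16 and 17.
images = 2_152_116   # number of images in JISC 1
total_tb = 45        # total size in TB
days = 21            # elapsed time, single thread, including unzipping
broken = 676         # broken JP2s found in JISC 1

images_per_day = images / days
tb_per_day = total_tb / days
broken_pct = 100 * broken / images

print(round(images_per_day))   # about 100,000 images/day/thread on the slide
print(round(tb_per_day, 1))    # about 2 TB/day/thread on the slide
print(round(broken_pct, 2))    # 0.03 % on the slide
```

The unrounded results are roughly 102,482 images per day, 2.1 TB per day and 0.03 % broken files, which agrees with the rounded values shown in the presentation.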
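
The quality-control checks on slides 19 and 20 (pixel compare, JP2 validity, extracted properties against a profile) might be combined along the following lines. This is a minimal sketch under stated assumptions: the order of the checks is read off the flattened flowchart, the function names are hypothetical, and the property names `layers` and `order` are placeholders rather than jpylyzer output fields. The pixel comparison and the validation themselves are assumed to have been done elsewhere and are passed in as booleans.

```python
def check_against_profile(properties, profile):
    """Return the profile keys whose required value is not met.

    Both arguments are plain dicts; the keys are placeholders,
    not actual jpylyzer property names.
    """
    return [key for key, expected in profile.items()
            if properties.get(key) != expected]


def quality_check(pixels_identical, is_valid_jp2, properties, profile):
    """Return ("pass", "") or ("fail", reason) for one TIFF/JP2 pair."""
    if not pixels_identical:
        return ("fail", "pixel compare")
    if not is_valid_jp2:
        return ("fail", "validation")
    mismatches = check_against_profile(properties, profile)
    if mismatches:
        return ("fail", "properties: " + ", ".join(mismatches))
    return ("pass", "")


# Hypothetical profile and extracted properties for one image.
profile = {"layers": 8, "order": "RPCL"}
extracted = {"layers": 8, "order": "LRCP", "levels": 5}
print(quality_check(True, True, extracted, profile))
```

Slide 20 describes a different way to express the profile, as an XML schema against which the tool's output is validated; the dict comparison above is only a stand-in for that step.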