Jpylyzer, a validation and feature extraction tool developed in SCAPE project

Jpylyzer is a tool for validation and feature extraction for the JP2 (JPEG 2000 Part 1) still image format. The tool is being developed in the SCAPE Project and was presented by Johan van der Knijff at Archiving 2012 in Copenhagen.

Transcript of "Jpylyzer, a validation and feature extraction tool developed in SCAPE project"

  1. SCAPE. Improved validation and feature extraction for JPEG 2000 Part 1: the jpylyzer tool. Johan van der Knijff (1, 2), René van der Ark (1), Carl Wilson (3). (1) Koninklijke Bibliotheek – National Library of the Netherlands; (2) Open Planets Foundation; (3) The British Library. IS&T, Archiving 2012, Copenhagen, 15.6.2012
  2. Metamorfoze: National Programme for preservation of paper heritage. Digitisation as a means to conserve threatened paper originals. 146 TB. Migrate by end 2012: TIFF to JP2
  3. JP2 from JISC 1 Newspaper Collection (BL)
  4. JP2 from JISC 1 Newspaper Collection (BL): “Well-formed and valid”
  5. Hardware failure may result in corrupted images. Source: http://img70.imageshack.us/img70/9950/serversnm2.jpg
  6. Not all encoders produce standard-compliant images
  7. Possible solutions
     - Option 1: Improve JPEG 2000 module JHOVE. But no institutional support, superseded by JHOVE2 (?)
     - Option 2: Develop JPEG 2000 module for JHOVE2. Not ready for operational use (yet)
     - Option 3: Develop dedicated tool
  8. Jpylyzer tool [graphic: string of binary digits]
  9. Jpylyzer tool
     - First prototype: December 2011
     - Refactoring of original code: Jan 2012
     - Packaging (Debian): Mar 2012 (Univ. Southampton, KEEP Solutions, AIT Vienna)
     - Add remaining functionality, bugfixes: Apr-May 2012 (current version: 1.5)
  10. JP2 file
     - JPEG 2000 Signature box
     - File Type box
     - JP2 Header box (superbox)
     - Contiguous Codestream box 0
     - Contiguous Codestream box n
     - IPR box
     - XML box(es)
     - UUID box(es)
     - UUID Info box(es) (superbox)
  11. Command-line use
  12. Result
  13. Properties extraction (excerpt)
  14. Properties embedded ICC profile
  15. Documentation
  16. Example 1: detection of broken JP2s in JISC 1 Newspapers
     - Number of images: 2,152,116
     - Total size: 45 TB
     - Average image size: 21.8 MB
     - Number of threads: 1
     - Time: 21 days*
     - Images/day/thread: 100,000
     - TB/day/thread: 2
     *Includes unzipping, actual time needed by jpylyzer much less!
  17. Results
     - 676 broken JP2s in JISC 1 collection (0.03 %); TIFF originals still available
     - JISC 2 (> 1 million images): 3 broken JP2s
     - 19th Century books (> 22 million images): no broken JP2s
  18. Example 2: quality control Metamorfoze migration. 146 TB. Migrate by end 2012: TIFF to JP2
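  19. [Flow diagram] TIFF is converted to JP2 with the Aware JP2K SDK. Checks shown in the diagram:
     - Pixel compare of TIFF and JP2: pixels identical? (yes/no)
     - Jpylyzer*: valid JP2? (yes/no)
     - Compare image properties against a properties profile: properties match? (yes/no)
     - Outcomes: pass / fail
     *Imported as module in Python-based workflow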
  20. Example 3: pre-ingest quality control Wellcome Library
     - JP2s produced in-house and by external suppliers
     - Use jpylyzer to validate against JP2 spec
     - Use extracted properties to validate against a profile (progression order, ratio, layers, ...)
     - Profile coded as XML schema (so jpylyzer output can be validated against schema)
  21. Platforms and licensing stuff
  22. http://www.openplanetsfoundation.org/software/jpylyzer
  23. Community involvement
  24. Acknowledgements
     - Debian packages: Dave Tarrant (Uni Southampton/OPF); Miguel Ferreira, Rui Castro, Hélder Silva (KEEP Solutions); Rainer Schmidt (AIT)
     - Feedback on early versions: Christy Henshaw (Wellcome Library); Ross Spencer (TNA); Wouter Kool (KB)
  25. Funding. This work was partially supported by the SCAPE Project. The SCAPE project is co-funded by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137). http://www.scape-project.eu #SCAPEProject
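
The box layout listed on slide 10 can be illustrated with a short Python sketch. This is not jpylyzer's code and it covers only a small fraction of what a validator has to check. It walks the top-level boxes of a JP2 file and tests a few ordering rules from JPEG 2000 Part 1: Signature box first, File Type box second, JP2 Header box before the first Contiguous Codestream box. The names top_level_boxes, check_layout and make_box are hypothetical, and the demo file is synthetic (its JP2 Header box is empty, which a real validator would reject).

```python
import io
import struct


def top_level_boxes(stream):
    """Return a list of (box_type, offset, length) for each top-level box.

    Each box starts with a 4-byte big-endian length (LBox) and a 4-byte
    type (TBox). LBox == 1 means an 8-byte extended length follows;
    LBox == 0 means the box runs to the end of the file.
    """
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    pos = 0
    boxes = []
    while pos < size:
        stream.seek(pos)
        header = stream.read(8)
        if len(header) < 8:
            raise ValueError("truncated box header at offset %d" % pos)
        length, box_type = struct.unpack(">I4s", header)
        header_size = 8
        if length == 1:
            ext = stream.read(8)
            if len(ext) < 8:
                raise ValueError("truncated extended length at offset %d" % pos)
            (length,) = struct.unpack(">Q", ext)
            header_size = 16
        elif length == 0:
            length = size - pos
        # A box shorter than its own header, or one that runs past the end
        # of the file, points to a damaged or truncated image.
        if length < header_size or pos + length > size:
            raise ValueError("bad box length at offset %d" % pos)
        boxes.append((box_type.decode("latin-1"), pos, length))
        pos += length
    return boxes


def check_layout(boxes):
    """Return a list of problems found in the top-level box order."""
    types = [box_type for box_type, _, _ in boxes]
    problems = []
    if types[:1] != ["jP  "]:
        problems.append("first box is not the JPEG 2000 Signature box")
    if types[1:2] != ["ftyp"]:
        problems.append("second box is not the File Type box")
    if "jp2h" not in types:
        problems.append("no JP2 Header box")
    if "jp2c" not in types:
        problems.append("no Contiguous Codestream box")
    if ("jp2h" in types and "jp2c" in types
            and types.index("jp2h") > types.index("jp2c")):
        problems.append("JP2 Header box follows the codestream")
    return problems


def make_box(box_type, payload=b""):
    """Build a box with a plain 8-byte header (hypothetical test helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


if __name__ == "__main__":
    # Synthetic file: structurally ordered like a JP2, but not a real image.
    sample = (
        make_box(b"jP  ", b"\r\n\x87\n")
        + make_box(b"ftyp", b"jp2 " + b"\x00\x00\x00\x00" + b"jp2 ")
        + make_box(b"jp2h")
        + make_box(b"jp2c", b"\xff\x4f")
    )
    boxes = top_level_boxes(io.BytesIO(sample))
    print([box_type for box_type, _, _ in boxes])
    print(check_layout(boxes) or "box order looks plausible")
```

A file cut short by a storage or transfer fault (slide 5) typically fails the length check in top_level_boxes, because its last box claims more bytes than the file contains. Checking the contents of each box and of the codestream itself is the larger part of the work and is left out here.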