Using the JPEG2000 image format for storage and access in biodiversity collections
Chris Freeland, Missouri Botanical Garden

Overview of JPEG2000
• Wavelet-based compression
  • Different from JPEG
  • Can decompress without extracting the entire file
• Proposed in 2000 to supersede JPEG
  • It hasn’t
• Slow adoption in museums & libraries
  • Poor (no) native browser support
  • Few open source options
• Faster adoption in medical imaging and other commercial applications

Parts of the format
• Part 1, Core coding system (JP2): defines the format; adopted as a standard first
• Part 2, Extensions
• Part 3, Motion JPEG 2000
• Part 4, Conformance
• Part 5, Reference software
• Part 6, Compound image file format (JPM)
• Part 7 has been abandoned
• Part 8, Security (JPSEC)
• Part 9, Protocols and API (JPIP)
• Part 10, JP3D (volumetric imaging)
• Part 11, JPWL (wireless applications)
• Part 12, ISO Base Media File Format (common with MPEG-4)

Advantages of JPEG2000
• Region extraction
• Compression
  • Both lossless & lossy
• Self-containedness
  • XML metadata + image
  • Multiple objects can be bundled together
• Progressive transmission
  • Lower quality at early load
http://www.dlib.org/dlib/september08/chute/09chute.html

Region Extraction
“Give me x, y coordinates at z resolution.”
• 600ppi, 200MB TIF; encode to 100MB JP2
• 72ppi: 20KB JPG

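The "x, y at z resolution" request works because a JPEG 2000 codestream stores the image as a series of wavelet decomposition levels, each halving the width and height of the one above it. The Python sketch below only illustrates that arithmetic; it does not decode anything. The function names and the 4800 x 6000 pixel example image are hypothetical, not taken from the slides.

```python
import math

def level_dimensions(width, height, levels):
    """Pixel dimensions at each resolution level, full size first.

    Each wavelet decomposition level halves width and height (rounded up).
    """
    return [(math.ceil(width / 2**r), math.ceil(height / 2**r))
            for r in range(levels + 1)]

def levels_for_ppi(source_ppi, target_ppi):
    """Largest number of halvings that keeps resolution at or above target_ppi."""
    return max(0, math.floor(math.log2(source_ppi / target_ppi)))

def scale_region(x, y, w, h, reduce):
    """Map a full-resolution region onto the grid after `reduce` halvings."""
    f = 2 ** reduce
    x0, y0 = x // f, y // f
    x1, y1 = math.ceil((x + w) / f), math.ceil((y + h) / f)
    return x0, y0, x1 - x0, y1 - y0

if __name__ == "__main__":
    # Hypothetical 600ppi scan of an 8 x 10 inch page: 4800 x 6000 pixels.
    reduce = levels_for_ppi(600, 72)           # three halvings: 600 -> 75 ppi
    print(level_dimensions(4800, 6000, reduce))
    print(scale_region(1000, 2000, 512, 512, reduce))
```

A screen-resolution view of a 600ppi master therefore sits about three levels down, which is why a small JPG can be cut from a large JP2 without touching most of the file.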
“How many books in a ___?”
• Luis Soriano, with Alpha and Beto: 2 Biblioburros; 4,800 books*
• 1 Biblioburro = 2,400 books
• BHL to date = 9 Biblioburros!
* http://www.nytimes.com/2008/10/20/world/americas/20burro.html

Storage requirement for a digital Biblioburro
• 2,400 books / Biblioburro
• (9,238,295 pages / 22,118 books in BHL) = 418 pages / book
• 1,002,437 pages / Biblioburro
• Avg size of each image file: RAW/TIF: 24MB; JP2: 2MB
• Drive space needed / Biblioburro: TIF: 24TB; JP2: 2TB
2,400 books = 24 TB TIFs = 2 TB JP2

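The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch using only the figures quoted above; the constant and function names are illustrative, and it assumes decimal units (1 TB = 1,000,000 MB).

```python
# Figures quoted on the slide.
BHL_PAGES = 9_238_295
BHL_BOOKS = 22_118
BOOKS_PER_BIBLIOBURRO = 2_400
TIF_MB_PER_PAGE = 24
JP2_MB_PER_PAGE = 2

MB_PER_TB = 1_000_000  # assumption: decimal units

def biblioburro_storage():
    """Return pages per book, pages per Biblioburro, and TB needed per format."""
    pages_per_book = BHL_PAGES / BHL_BOOKS
    pages = pages_per_book * BOOKS_PER_BIBLIOBURRO
    return {
        "pages_per_book": pages_per_book,
        "pages": pages,
        "tif_tb": pages * TIF_MB_PER_PAGE / MB_PER_TB,
        "jp2_tb": pages * JP2_MB_PER_PAGE / MB_PER_TB,
    }

if __name__ == "__main__":
    for name, value in biblioburro_storage().items():
        print(f"{name}: {value:,.2f}")
```

The page count per Biblioburro uses the unrounded pages-per-book average, which is why it comes out at about 1,002,437 rather than 2,400 x 418.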
Self-containedness / metadata bundling
• Not just an image, but an image, its content & its context
  • Adobe XMP
  • Dublin Core
  • Your own XML
  • TIF headers & JPEG limit fields
• Can describe more than just an image
  • A whole web site

Barriers to adoption
• Lack of affordable, scalable serving options
  • Until recently, no open source server
  • Commercial options expensive
• No native browser support
  • Safari does, but via QuickTime
  • But why?? PNG? No motivation?
• Community skepticism

Encoding Software
• Commercial: Adobe Photoshop, LuraTech SDK, LizardTech
• Non-Commercial: Kakadu, ImageMagick, IrfanView

Decoding & Serving
• Commercial: LizardTech, Aware, LuraTech ICS FSIV
• Non-Commercial: Kakadu, GSIV, djatoka

Part 9: JPIP
• Protocol and API for transmitting JP2
• Designed for HTTP, but not restricted to that carrier
• Don’t need a browser
• Implementations are available, use is infrequent
  • HiRISE camera on Mars Reconnaissance Orbiter

Current use of JP2 in BHL
• Serve 85% (lossy) .jp2
• LizardTech decoder
  • Tiled on the fly
  • Cached for performance
• GSIV browser-based client viewer

A user requests Mushrooms of America, edible and poisonous, Plate X: http://www.biodiversitylibrary.org/page/1274907
[Diagram: Browser (GSIV.js); www.biodiversitylibrary.org (/page/1274907); BHLdb (pageid: 1274907; locate: http://www.archive.org/download/mushroomsofameri00palm/.../mushroomsofameri00palm_0010.jp2); Internet Archive (.jp2); LizardTech ExpressServer at images.mobot.org (.jp2, .jpg)]

The Future: djatoka
Developed at Los Alamos National Laboratory, Research Library
• Use of the ISO-standardized JPEG 2000 format [6] as the service format;
• Java-based open source solution built around the Kakadu JPEG 2000 library;
• Geared towards reuse through URI-addressability of all image disseminations including regions, rotations, and format transformations;
• Provision of a consistent, guessable URI pattern for image disseminations based on the ANSI/NISO OpenURL standard [7];
• Provision of an extensible service framework for image disseminations enabled by OCLC's Java OpenURL package;
• Availability of image disseminations in a range of image formats;
• Availability of image disseminations for locally stored JPEG 2000 files, as well as for Web-accessible images in a variety of formats;
• Configurable server-side, file-based caching;
• Ajax-based client reference implementation, based on the IIPImage JavaScript Viewer, which allows panning, zooming, and selecting the URI of the current view.
http://www.dlib.org/dlib/september08/chute/09chute.html

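To make the "consistent, guessable URI pattern" concrete, here is a hedged Python sketch that assembles an OpenURL-style region request. The parameter names (`svc_id`, `svc.level`, `svc.region` and so on) and the Y,X,H,W ordering of the region are recalled from djatoka's published examples and should be checked against the documentation of the installed version. The resolver host and the helper name `djatoka_region_url` are hypothetical.

```python
from urllib.parse import urlencode

def djatoka_region_url(resolver, image_id, y, x, h, w, level,
                       fmt="image/jpeg", rotate=0):
    """Build a getRegion request URL (parameter names are assumptions, see above)."""
    params = {
        "url_ver": "Z39.88-2004",                  # OpenURL version tag
        "rft_id": image_id,                        # identifier or URL of the JP2
        "svc_id": "info:lanl-repo/svc/getRegion",
        "svc_val_fmt": "info:ofi/fmt:kev:mtx:jpeg2000",
        "svc.format": fmt,                         # output format of the dissemination
        "svc.level": level,                        # resolution level to extract from
        "svc.rotate": rotate,
        "svc.region": f"{y},{x},{h},{w}",          # assumed order: Y,X,H,W
    }
    # Keep ':', '/' and ',' readable instead of percent-encoding them.
    return resolver + "?" + urlencode(params, safe=":/,")

if __name__ == "__main__":
    # Hypothetical resolver address and image location.
    print(djatoka_region_url(
        "http://localhost:8080/adore-djatoka/resolver",
        "http://example.org/images/plate_x.jp2",
        y=0, x=0, h=256, w=256, level=3,
    ))
```

Because every region, rotation, and format is expressed in the query string, any view of an image can be bookmarked, cached, or linked like an ordinary web resource.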
References
• djatoka: http://www.dlib.org/dlib/july08/buonora/07buonora.html
• HUL: Page Image Compression for Mass Digitization: http://preserve.harvard.edu/massdig/hul_study/
• JP2 in Libraries and Archives: http://j2karclib.info/taxonomy/term/2
• JPEG 2000 - a Practical Digital Preservation Standard?: http://www.dpconline.org/docs/reports/dpctw08-01.pdf
• JPEG2000 site: http://www.jpeg.org/jpeg2000/

Contact
Chris Freeland
Missouri Botanical Garden
4344 Shaw Blvd.
St. Louis, MO 63110
[email_address]
http://www.chrisfreeland.com
