Image Processing For Color Facsimile

A presentation given at the World Techno Fair Chiba '96, September 1996, Chiba (Japan)

Image Processing For Color Facsimile: Presentation Transcript

  • Image Processing For Color Facsimile
    Giordano Beretta, Hewlett-Packard Laboratories, Imaging Technology Department, 1501 Page Mill Road, Palo Alto, CA 94304–1126
    http://www.hpl.hp.com/personal/Giordano_Beretta
    © 1996 Hewlett-Packard Company. All rights reserved.
    HP Laboratories 10/17/96 Hiro:Documents:Giordano Beretta:Research:Chiba96:OHP:Chiba96
  • Joint Work With
    • Vasudev Bhaskaran
    • Konstantinos Konstantinides
    • Daniel T. Lee
    • Ho John Lee
    • Andrew H. Mutz
    • Balas K. Natarajan
  • Outline
    • Background for the standard
    • Discrete nature of visual perception
    • JPEG data compression
    • Optimizing the JPEG compression
    • Perceptually lossy compression
    • Text sharpening
    • Conclusions
  • Three Breakthroughs
    • Digital imaging: compression algorithms
    • Hardware cost / performance: SOHO (Small Office, Home Office) market
    • International standard: ITU-T T.42 Addendum
  • Project Goal
    Achieve the same transmission time for full color as for binary black & white, in the case of the 4CP01 test chart
  • The Color Facsimile Pipeline
    Sending side:
    • Input: A size page at 200 dpi, 24 bits/pixel, 11.22M bytes
    • RGB to CIELAB, with 4:1:1 subsampling: 12 bits/pixel, 5.6M bytes
    • JPEG coding (default DQT): 0.48 bits/pixel, ~226K bytes
    • G3 encapsulation
    Receiving side:
    • G3 decoding
    • JPEG decompression
    • CMYK from CIELAB
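The byte counts on the pipeline slide can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming "A size" means a US letter page of 8.5 × 11 inches (the slide does not spell out the dimensions) and that 4:1:1 means the two chroma channels each carry one sample per four pixels, so 8 + 2 + 2 = 12 bits/pixel. The names `page_bytes`, `WIDTH_IN`, and `HEIGHT_IN` are hypothetical.

```python
# Assumed page geometry: US letter ("A size") scanned at 200 dpi.
WIDTH_IN, HEIGHT_IN, DPI = 8.5, 11.0, 200


def page_bytes(bits_per_pixel: float) -> float:
    """Size in bytes of one page at the given average bit rate."""
    pixels = (WIDTH_IN * DPI) * (HEIGHT_IN * DPI)  # 1700 x 2200 pixels
    return pixels * bits_per_pixel / 8


raw = page_bytes(24)          # 8 bits for each of three channels
subsampled = page_bytes(12)   # 8 bits luminance + 2 x 2 bits chroma
compressed = page_bytes(0.48) # bit rate quoted for the default DQT

print(f"raw:        {raw / 1e6:.2f}M bytes")
print(f"subsampled: {subsampled / 1e6:.2f}M bytes")
print(f"compressed: {compressed / 1e3:.0f}K bytes")
```

The compressed figure comes out slightly below the slide's ~226K bytes because 0.48 bits/pixel is itself a rounded value.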
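  • Time Phases for T.30 Fax Transmission
    • Phase A (call setup): the calling fax sends the CNG beep; the called fax answers with CED
    • Phase B (pre-message): the called fax sends DIS; the calling fax sends DCS, then Training and TCF; the called fax sends CFR
    • Phase C (message): the calling fax sends Training, the Message, and RTC
    • Phase D (post-message): the calling fax sends EOP; the called fax sends MCF
    • Phase E (call release): the calling fax sends DCN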
  • New G3 Entries to DCS Frames
    Bit No.   DCS
    68        JPEG coding
    69        Full color mode
    70        Preferred Huffman tables
    71        12 bits/pel/component
    72        Extend field
    73        No subsampling (1:1:1)
    74        Custom illuminant (not used)
    75        Custom gamut range
  • Four Categories of Business Images
    1. Full color (pictorial, color photographs)
    2. Multi-color (color charts & graphs)
    3. Bi-color (documents marked up with red ink)
    4. Mixed color (combination of 1–3, such as color pages of magazines)
    Source: NTT
  • Color Space Selection
    17 spaces considered (Munsell, CMYK, YIQ, CIE colorimetric spaces)
    Evaluation criteria (source: Fuji-Xerox):
    • ability to represent all colors
    • numerical complexity
    • device independence
    • quantization error under compression
    • compatibility with compression algorithms
    • color stability with white point change
    • …
  • Perception is Discrete
    • When a stimulus is changed by only a small amount, an observer will not notice a difference
    • Psychophysical experiments determine the threshold at which an observer can detect a difference in stimuli
    • jnd: just noticeable difference
  • Model for Perception Discretization
    [Figure with labels: quantity, intensity, nature (continuum), available information, jnd, human visual system]
  • Digital System: Sampling
    [Figure with labels: nature (continuum), sampled information]
  • Poor Sampling
    [Figure with labels: digital (sampled), available information, jnd, human visual system]
  • Good Sampling
    [Figure with labels: digital (sampled), available information, jnd, human visual system]
  • Result of the Color Space Evaluation
    • 1931 Standard Colorimetric Observer
    • CIE Standard Illuminant D50
    • CIELAB color space
  • The CIELAB Color Space
    • Based on the CIE 1931 Standard Colorimetric Observer: device independent
    • Based on Munsell color tree, von Kries adaptation, CIE XYZ color space, and power law compression (n = 1/3): good perceptual uniformity
    • Easy to compute compared to other uniform spaces
    • Widely used in printing industry
    • Can be read directly with measurement instruments
    • Design issue: choice of the best range for the chromatic channels a* and b*
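The "power law compression (n = 1/3)" is the cube root in the standard CIE 1976 L*a*b* formulas. The sketch below converts CIE XYZ tristimulus values to CIELAB relative to illuminant D50, the white point selected in the evaluation. The D50 white point values (for the 1931 observer, Y normalized to 100) and the function names are assumptions added here, not taken from the slides, and the mapping of a* and b* into a fixed integer range for transmission is not covered.

```python
# D50 reference white for the CIE 1931 observer, Y normalized to 100 (assumed values).
D50_WHITE = (96.422, 100.0, 82.521)
DELTA = 6 / 29


def _f(t: float) -> float:
    """Cube-root compression with the standard linear segment near black."""
    if t > DELTA ** 3:
        return t ** (1 / 3)
    return t / (3 * DELTA ** 2) + 4 / 29


def xyz_to_lab(x: float, y: float, z: float, white=D50_WHITE):
    """Return (L*, a*, b*) for XYZ values relative to the given white point."""
    fx, fy, fz = (_f(v / w) for v, w in zip((x, y, z), white))
    lightness = 116 * fy - 16
    a_star = 500 * (fx - fy)
    b_star = 200 * (fy - fz)
    return lightness, a_star, b_star


print(xyz_to_lab(*D50_WHITE))        # the white point itself
print(xyz_to_lab(20.0, 18.0, 10.0))  # an arbitrary sample color
```

By construction the white point maps to L* = 100 with a* = b* = 0, which is what makes the encoding independent of any particular device.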
  • Data Compression
    ISO/IEC IS 10918–1 (a.k.a. JPEG)
    [Block diagram: image → raster to block translation → DCT → quantization → entropy coding → compressed stream. The quantization stage is driven by the quantization tables (the "critical knob"); the entropy coding stage is driven by the Huffman tables]
  • The DCT and its Kernels
    $$Y(k,l) = \frac{1}{4}\, C(k)\, C(l) \sum_{x=0}^{7} \sum_{y=0}^{7} S(x,y) \cos\frac{(2x+1)k\pi}{16} \cos\frac{(2y+1)l\pi}{16}$$
    $$[C_8]_{mn} = k_m \cos\frac{m\left(n + \frac{1}{2}\right)\pi}{8}$$
    [Figure: the 64 kernels of the discrete cosine transform]
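A direct, unoptimized evaluation of the forward 8 × 8 DCT formula can be sketched as follows. The slide does not define C(k); the code assumes the usual JPEG normalization, C(0) = 1/√2 and C(k) = 1 otherwise. The function names are hypothetical, and a real codec would use a fast factorization rather than this four-level loop.

```python
import math


def c(k: int) -> float:
    """Normalization factor; assumed to follow the usual JPEG convention."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct_8x8(block):
    """Forward DCT of an 8x8 block S(x, y), returning Y(k, l)."""
    out = [[0.0] * 8 for _ in range(8)]
    for k in range(8):
        for l in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * k * math.pi / 16)
                              * math.cos((2 * y + 1) * l * math.pi / 16))
            out[k][l] = 0.25 * c(k) * c(l) * total
    return out


# A flat block has all of its energy in the DC coefficient Y(0, 0).
flat = [[8.0] * 8 for _ in range(8)]
coeffs = dct_8x8(flat)
print(round(coeffs[0][0], 6), round(coeffs[0][1], 6))
```

Each output coefficient is the projection of the block onto one of the 64 kernels, which is why a block with no spatial variation projects only onto the constant kernel.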
  • The JPEG Compression
    Quantization tables (DQT) are the key parameter
    • Do not quantize where it can be seen
    • Default parameters are for images on CRT displays
      • spatial information in text is different
      • printers have a much higher resolution than CRTs
    • The three worlds of spatial information:
      1. Physical: energy in the signal
      2. Perceptual: sensitivity of the visual system
      3. Semantic: cognitive mechanisms
    Last step: design the Huffman tables
  • Physical World
    DCT transforms 2-dimensional data to a 64-dimensional space
    • Each dimension represents a spatial pattern
    • For a number of typical images:
      • measure the energy in each of the 64 dimensions
      • popular estimator for energy: statistical variance
      • average over the images in the test set
      • allocate bandwidth in proportion to the average energy
    • For text documents with images, a much better compression rate is achieved for a given image quality than with the default tables
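The measurement procedure listed on this slide (variance per DCT dimension, then an average over the test set) can be sketched as below. The input is assumed to be a list of 8 × 8 blocks of DCT coefficients that have already been computed for each image; the function names are hypothetical.

```python
def coefficient_variances(blocks):
    """Population variance of each of the 64 DCT coefficients over one image.

    blocks: list of 8x8 lists of DCT coefficients, one per image block.
    """
    n = len(blocks)
    var = [[0.0] * 8 for _ in range(8)]
    for k in range(8):
        for l in range(8):
            values = [b[k][l] for b in blocks]
            mean = sum(values) / n
            var[k][l] = sum((v - mean) ** 2 for v in values) / n
    return var


def average_energy(per_image_variances):
    """Average the per-image variance tables over the test set."""
    m = len(per_image_variances)
    return [[sum(v[k][l] for v in per_image_variances) / m
             for l in range(8)] for k in range(8)]


# Toy data: two blocks whose coefficients differ by 2 everywhere.
toy_blocks = [[[0.0] * 8 for _ in range(8)], [[2.0] * 8 for _ in range(8)]]
image_a = coefficient_variances(toy_blocks)
image_b = [[3.0] * 8 for _ in range(8)]  # pretend variances of a second image
energy = average_energy([image_a, image_b])
print(image_a[3][4], energy[3][4])
```

The resulting 8 × 8 energy table is the quantity that a variance-based bit allocation takes as input.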
  • L* Energy in Text vs. Pictorial Images
    [Figure: log-scale plot from 10^-2 to 10^6 over the range 0 to 64; series: Average text, Average picture, Text on fancy back, Fax test image, Topographic map]
  • L* Energy in Text vs. Pictorial Images
    [Figure: detail of the range 0 to 10, log scale from 100 to 1000000; series: Average text, Average picture, Text on fancy back, Fax test image, Topographic map]
  • Traditional Bit Allocation
    • Bit allocation based on variance: $N_{k,l} = \frac{1}{2} \cdot \log \frac{V_y[k,l]}{D}$
    • $V_y[k,l] = \operatorname{var}(Y[k,l]) = \frac{1}{B} \sum_{i=1}^{B} \left( Y_i[k,l] - M_y[k,l] \right)^2$
    To improve the bits-per-pixel rate:
    1. Brute force: uniform q-factor
    2. Perceptual: increase DQT elements based on HVS
    [Figure: contrast sensitivity (1 to 300, log scale) versus spatial frequency (0.03 to 100 cycles/degree, log scale); curves: red–green (chromatic), green (monochromatic)]
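The variance-based allocation rule can be tried out numerically. In this sketch the logarithm is taken to base 2 so that the result is in bits, D is read as a target distortion, and coefficients whose variance falls below D are clamped to zero bits; all three are assumptions, since the slide shows only the bare formula. The function name is hypothetical.

```python
import math


def allocate_bits(variances, distortion):
    """Bits per coefficient: N = 1/2 * log2(V / D), clamped at zero (assumed)."""
    return [[max(0.0, 0.5 * math.log2(v / distortion)) if v > 0 else 0.0
             for v in row]
            for row in variances]


# Three sample variances against a distortion target of 1.0.
sample = [[16.0, 1.0, 0.25]]
print(allocate_bits(sample, 1.0))
```

Every factor of four in variance is worth one extra bit, which is why the steep energy fall-off seen in pictorial images translates into very few bits for the high-frequency coefficients.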
  • Perceptual World: Simple Method
    Image quality depends on the contrast visible at a given spatial resolution: contrast sensitivity function (CSF)
    • Discard spatial information above the CSF: it cannot be seen anyway
    • Standard method: weight the DQT elements by the CSF
    • Printer resolution is 600 dpi, same as visual system
      • improved compression of images
      • for text, pattern recognition may be more important than CSF
    • Not necessarily a good model for what happens in the visual system
  • Compression Ratio
    Image (10.7 MB)                   Default DQT      Custom DQT
    Real estate flier with photo      52:1 (211 KB)    82:1 (134 KB)
    Book page with photos and text    53:1 (207 KB)    63:1 (174 KB)
    4CP01 test chart                  47:1 (233 KB)    63:1 (174 KB)
  • Perceptual World: Complex Method
    When two structures (textures) are present in an image, one structure may hide the other
    • Principle of visual masking
    • When a new structure is added to an existing structure, the new structure may mask the old structure or vice versa
    • Quantization noise is a structure that is added to the image
    • Noise that is masked by the image is perfectly acceptable; this allows for higher compression ratios
    • This method is used mostly for image-dependent (adaptive) compression
  • Results
    1. If the method is applied to several images of the same type, there is little variation in the obtained DQTs
       • Hence, in the case of color facsimile we can use one DQT for all images
    2. Adjust each DQT element to reach the threshold
    3. The iterative method converges faster when the previous steps are performed
    4. Lossy compression: any jnd value can be targeted
  • Semantic World
    Reading performance of text is the speed at which text can be read without errors
    • When compressing, discard information that does not impact reading performance
    • Identify the parts of characters in fonts that affect reading performance
    • Discard mainly the spatial information not related to these character parts
  • Visual Impact of Quantization Depends on Image and Sub-Space
    [Figure panels: fine, coarse, mixed]
  • Some Typeface Parts in Times Roman
    [Figure: sample glyphs ("itag") labeled with the parts bar, ear, stress, stem, serif, terminal]
    X-height in the Times New Roman typeface:
    Point size   X-height
    7 point      1.12 mm
    9 point      1.44 mm
    11 point     1.76 mm
    12 point     1.95 mm
    16 point     2.58 mm
  • Critical Feature Sizes in Color Facsimile
    Number of pixels   200 dpi     300 dpi     400 dpi
    1                  0.127 mm    0.085 mm    0.063 mm
    2                  0.254 mm    0.169 mm    0.127 mm
    3                  0.381 mm    0.254 mm    0.191 mm
    4                  0.508 mm    0.339 mm    0.254 mm
    5                  0.635 mm    0.423 mm    0.318 mm
    6                  0.762 mm    0.508 mm    0.381 mm
    7                  0.889 mm    0.593 mm    0.445 mm
    8                  1.016 mm    0.677 mm    0.508 mm
    Cell highlighting on the slide:
    • Red: limit for visual acuity in the luminance channel
    • Green: limit for visual acuity in the chrominance channels
    • Blue: peak contrast sensitivity in the luminance channel
    • Pink: typical character part sizes (11 point Times New Roman)
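The entries in the feature-size table follow from the pixel pitch, 25.4 mm divided by the resolution in dpi. A minimal sketch that regenerates the table (the function name is hypothetical):

```python
MM_PER_INCH = 25.4


def feature_size_mm(pixels: int, dpi: int) -> float:
    """Physical width of a run of pixels at the given resolution."""
    return pixels * MM_PER_INCH / dpi


print("pixels   200 dpi    300 dpi    400 dpi")
for n in range(1, 9):
    sizes = "   ".join(f"{feature_size_mm(n, dpi):.3f} mm" for dpi in (200, 300, 400))
    print(f"{n:<8} {sizes}")
```

Values that fall exactly on a half unit in the third decimal (for example one pixel at 400 dpi, 0.0635 mm) may print differently from the slide in the last digit, depending on the rounding rule.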
  • Modified Bit Allocation Equation
    • Weight w based on visual quality: $N_{k,l} = \frac{1}{2} \cdot \log \frac{w[k,l]\, V_y[k,l]}{D}$
    • Separate provisions for text, graphics, and pictorial images (smoothing)
    • Detail of spatial frequency in image data, independent of phase angle, resides almost totally within 3 coefficients in the transform domain when a DCT is applied
    • Rudimentary example of a weight table:
      1     1     1     3/4   1     3/4   1     1
      1     3/4   1/2   1/4   1/2   1/4   1/2   1/2
      1     1/2   1/4   1/8   1/4   1/8   1/4   1/4
      3/4   1/4   1/8   1/8   1/8   1/8   1/8   1/8
      1     1/2   1/4   1/8   1/8   1/8   1/8   1/8
      3/4   1/4   1/8   1/8   1/8   1/8   1/8   1/8
      1     1/2   1/4   1/8   1/8   1/8   1/8   1/8
      1     1/2   1/4   1/8   1/8   1/8   1/8   1/8
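Because the weight sits inside the logarithm, each entry of the weight table shifts the allocation of its coefficient by a fixed amount, (1/2) · log w, regardless of the variance. The sketch below loads the 64 weights from the slide and evaluates that shift, taking the logarithm to base 2 (an assumption, so that the result is in bits). The table is symmetric, so reading it in row or column order gives the same matrix. The function names are hypothetical.

```python
import math
from fractions import Fraction

# The 64 weights from the slide, eight per row.
ROWS = [
    "1 1 1 3/4 1 3/4 1 1",
    "1 3/4 1/2 1/4 1/2 1/4 1/2 1/2",
    "1 1/2 1/4 1/8 1/4 1/8 1/4 1/4",
    "3/4 1/4 1/8 1/8 1/8 1/8 1/8 1/8",
    "1 1/2 1/4 1/8 1/8 1/8 1/8 1/8",
    "3/4 1/4 1/8 1/8 1/8 1/8 1/8 1/8",
    "1 1/2 1/4 1/8 1/8 1/8 1/8 1/8",
    "1 1/2 1/4 1/8 1/8 1/8 1/8 1/8",
]
WEIGHTS = [[float(Fraction(token)) for token in row.split()] for row in ROWS]


def weighted_bits(variance: float, weight: float, distortion: float) -> float:
    """N = 1/2 * log2(w * V / D); base 2 is an assumption."""
    return 0.5 * math.log2(weight * variance / distortion)


def bit_change(weight: float) -> float:
    """Change in allocated bits caused by the weight alone."""
    return 0.5 * math.log2(weight)


for w in (1.0, 0.75, 0.5, 0.25, 0.125):
    print(f"w = {w:<5} -> {bit_change(w):+.3f} bits")
```

A weight of 1/8, used for most of the high-frequency coefficients, removes one and a half bits from the unweighted allocation, while the weights of 1 leave the first row and column largely untouched.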
  • Preliminary Experimental Results
    • Compress two typical 300 dpi images
    • Comparable visual quality
    • q factor 200; use K.2 for the chrominance channels
                               Text                  4CP01
                               K.1        NEW        K.1        NEW
    Bitmap size                2550 × 3300
    Compressed size (bytes)    338,776    277,617    707,819    258,765
    Bits per pixel             0.32       0.26       0.67       0.25
    Compression ratio          1 : 75     1 : 91     1 : 36     1 : 98
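The bits-per-pixel and compression-ratio rows can be recomputed from the compressed sizes. This sketch assumes the uncompressed bitmap holds 3 bytes per pixel; that figure is not stated on the slide, but it reproduces the published ratios. The names are hypothetical.

```python
WIDTH, HEIGHT = 2550, 3300  # bitmap size from the slide
COMPRESSED_BYTES = {
    ("Text", "K.1"): 338_776,
    ("Text", "NEW"): 277_617,
    ("4CP01", "K.1"): 707_819,
    ("4CP01", "NEW"): 258_765,
}


def bits_per_pixel(nbytes: int) -> float:
    return nbytes * 8 / (WIDTH * HEIGHT)


def compression_ratio(nbytes: int, bytes_per_pixel: int = 3) -> float:
    """Uncompressed size over compressed size; 3 bytes/pixel is assumed."""
    return WIDTH * HEIGHT * bytes_per_pixel / nbytes


for (image, table), size in COMPRESSED_BYTES.items():
    print(f"{image:6} {table:4} {bits_per_pixel(size):.2f} bpp   "
          f"1 : {compression_ratio(size):.0f}")
```

On these numbers the new table shrinks the text image by roughly a fifth and the 4CP01 chart by well over half relative to K.1.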
  • The Fuzzy Text Problem
    Cost reduction introduces problems such as
    • sensor misalignment
    • optical blur and electric cross-talk
    • halftoning at low resolution
  • Comparison of L* Energy in all Images
    [Figure: log-scale plot from 10^-2 to 10^6 over the range 0 to 64; series: 4CP01 test image, Average pictorial, Average text, Text on fancy back, 150 lpi brochure, Newspaper, Group portrait, Realtor flier, 10 pt sans text, 12 pt sans text, 10 pt serif text, 12 pt serif text, Synthetic text, Topographic map]
  • Comparison of Low Frequency Energy
    [Figure: detail of the range 0 to 15, log scale from 10^2 to 10^6; series: 4CP01 test image, Average pictorial, Average text, Text on fancy back, 150 lpi brochure, Newspaper, Group portrait, Realtor flier, 10 pt sans text, 12 pt sans text, 10 pt serif text, 12 pt serif text, Synthetic text, Topographic map]
  • L* Energy in Pictorial Images
    [Figure: log-scale plot over the range 0 to 64; series: Average, 150 lpi brochure, Newspaper, Group portrait, Realtor flier, Average text; annotation: "Text is different"]
  • L* Energy in Text Images
    [Figure: log-scale plot from 10^-2 to 10^6 over the range 0 to 64; series: Average, 10 pt sans, 12 pt sans, 10 pt serif, 12 pt serif (synthetic); annotation: "Robustness vs. font"]
  • Enhancement of 10 Point Serif Text
    [Figure: variance (log scale, 10^-2 to 10^6) versus basis (0 to 64); series: Unprocessed, Convolution Filter, New Algorithm, Synthetic]
  • Sharpening an Image in the JPEG Domain During the Encoding
    [Block diagram: generate a reference image and compute its average energy; compute the average energy of a typical scanned image; derive a scaling matrix; scale the quantization table]
    • Edge sharpening is achieved by using the original DQT for the encoding, while at the receiving fax machine the decoder uses the scaled DQT matrix
    • The scaled matrix is the one included in the JPEG file
    • Since only the scaled DQT is transmitted, the original matrix and the actual factors can remain a trade secret
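The effect of encoding with one table and decoding with another can be seen on a pair of coefficients. In this sketch the encoder quantizes with the original DQT, while the decoder multiplies the quantized levels by the scaled DQT found in the file, so each coefficient is amplified by its scale factor. The scale factors here are made-up illustration values, the clamp to 1..255 assumes 8-bit baseline table entries, and the function names are hypothetical.

```python
def quantize(coeffs, dqt):
    """Encoder side: divide by the original table and round."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, dqt)]


def dequantize(levels, dqt):
    """Decoder side: multiply by whatever table is stored in the file."""
    return [[n * q for n, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, dqt)]


def scale_dqt(dqt, scale):
    """Table to transmit: original entries times the scaling matrix (clamped)."""
    return [[max(1, min(255, round(q * s))) for q, s in zip(qrow, srow)]
            for qrow, srow in zip(dqt, scale)]


# Two coefficients: a low frequency left alone and a higher one boosted by 1.5.
coeffs = [[100.0, 40.0]]
original_dqt = [[10, 10]]
scaling = [[1.0, 1.5]]  # illustrative values only

levels = quantize(coeffs, original_dqt)
transmitted_dqt = scale_dqt(original_dqt, scaling)
decoded = dequantize(levels, transmitted_dqt)
print(levels, transmitted_dqt, decoded)
```

The sharpening costs nothing in the bit stream: the quantized levels are identical to those of an unsharpened encoding, and only the 64 table entries differ.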
  • Huffman Tables
    • The example Huffman table is for perceptually lossless compression
    • Color facsimile based on CIELAB color space
    • Start with test chart
    • Ad hoc technique: all symbol probabilities less than 2^-16 are set to 2^-16
    • Improvement: 8% to 14%
    • Only 0.5% compression rate loss for other images
    • Average improvement: 11%
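The ad hoc probability floor can be sketched together with a plain Huffman construction. Presumably the floor keeps symbols that never occur in the training chart from being left without a code; that reading is an assumption. The code below is a generic textbook Huffman length computation, not the table design procedure used in the project, and it does not enforce JPEG's 16-bit limit on code length. The function names are hypothetical.

```python
import heapq
from itertools import count

FLOOR = 2.0 ** -16


def floor_probabilities(probs):
    """Raise every probability below 2^-16 to 2^-16."""
    return {symbol: max(p, FLOOR) for symbol, p in probs.items()}


def huffman_code_lengths(weights):
    """Code length per symbol from a standard Huffman merge."""
    tie = count()  # tie-breaker so the heap never compares symbol lists
    heap = [(w, next(tie), [s]) for s, w in weights.items()]
    heapq.heapify(heap)
    lengths = {s: 0 for s in weights}
    if len(heap) == 1:
        return {s: 1 for s in weights}
    while len(heap) > 1:
        w1, _, group1 = heapq.heappop(heap)
        w2, _, group2 = heapq.heappop(heap)
        for s in group1 + group2:
            lengths[s] += 1  # every merge adds one bit to the merged symbols
        heapq.heappush(heap, (w1 + w2, next(tie), group1 + group2))
    return lengths


# Symbol "d" never occurs in the training data.
measured = {"a": 0.5, "b": 0.25, "c": 0.25, "d": 0.0}
floored = floor_probabilities(measured)
lengths = huffman_code_lengths(floored)
print(lengths)
```

With the floor in place the unseen symbol still receives a (long) code, at the price of slightly longer codes for some of the symbols that were observed.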
  • Conclusions
    • Digital images with text are not very robust with respect to quantization errors
    • Sufficient information should be preserved in documents to preserve image quality when they are printed
    • If the bandwidth is used judiciously, documents can be compressed to a higher degree, allowing the use of better resolutions or shorter transmission times
    1. Encode color in a perceptually uniform color space such as CIELAB
    2. Compress the spatial information using JPEG
    3. Design custom DQT tables for your document type
    4. Design custom Huffman tables