Lifecycle of a pixel

From a photon to JPEG

Three basic principles of light sensing
● photochemistry: light renders silver halide grains in film “emulsion” “developable”
● thermal physics: incident light heats a sensor that measures temperature
● photophysics: interaction of light with matter frees electrons
○ light absorbed by metal surfaces causes electrons to be ejected from them
○ light absorbed by semiconductors causes their conductivity to increase

Photoelectric Effect
● Explaining it earned Einstein the 1921 Nobel Prize in Physics
● Light hits a conductor or semiconductor, electrons are freed, and a voltage is generated
● Principle behind solar cells and photosensitive diodes

Semiconductor Technologies
● CCD (“charge coupled device”), mostly obsolete
● CMOS (“complementary metal oxide semiconductor”)
○ originally: naked memory chips
○ currently: “camera on a chip” designs
● CID (“charge injection device”)
○ Mostly used in speciality cameras, by NASA and similar

CMOS imager array
● Two dimensional array of photosensitive diodes is
organized into rows and columns
● Charges are moved to a bus one row at a time
● Color separation by filters and prisms
● Compared with CCD: faster, cheaper, and uses less power; early designs generated more noise, while modern ones also work well in low light

Bayer filter
● Color filter array (CFA) for arranging RGB color filters on
a square grid of photosensors
● 50% green, 25% blue, 25% red
● GRGB arrangement
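
A minimal sketch of how a Bayer mosaic samples a scene. It assumes the 2x2 tile is laid out as G R on the first row and B G on the second, which is only one of several layouts used in practice. The names BAYER_TILE, filter_color, mosaic, and filter_fractions are illustrative, not taken from any library.

```python
from collections import Counter

# Assumed 2x2 tile, repeated across the sensor: G R / B G.
BAYER_TILE = (("G", "R"),
              ("B", "G"))

def filter_color(row, col):
    """Return the color of the filter over the photosite at (row, col)."""
    return BAYER_TILE[row % 2][col % 2]

def mosaic(rgb_image):
    """Keep only the one color sample each photosite actually measures.

    rgb_image is a list of rows of (r, g, b) tuples; the result is a
    list of rows holding a single value per photosite.
    """
    channel_index = {"R": 0, "G": 1, "B": 2}
    return [[pixel[channel_index[filter_color(r, c)]]
             for c, pixel in enumerate(row)]
            for r, row in enumerate(rgb_image)]

def filter_fractions(height, width):
    """Fraction of photosites covered by each filter color."""
    counts = Counter(filter_color(r, c)
                     for r in range(height) for c in range(width))
    total = height * width
    return {color: counts[color] / total for color in "RGB"}

print(filter_fractions(4, 4))  # {'R': 0.25, 'G': 0.5, 'B': 0.25}
```

Each photosite ends up with one number instead of three, which is why the missing two color values have to be estimated later.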

Conversion to RGB
● Need to convert to RGB or similar color space
● Need to interpolate missing pixel values (demosaicing)
● Interpolation is prone to errors
● Moiré effect on fine repeating patterns
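
The interpolation step can be sketched as a simple neighbor average. This is an illustrative bilinear-style demosaic, not the method any particular camera uses. It assumes a G R / B G tile layout and an image of at least 2x2 photosites; BAYER_TILE, filter_color, and demosaic are hypothetical names.

```python
# Assumed 2x2 filter layout, repeated across the sensor: G R / B G.
BAYER_TILE = (("G", "R"),
              ("B", "G"))

def filter_color(row, col):
    """Return the color of the filter over the photosite at (row, col)."""
    return BAYER_TILE[row % 2][col % 2]

def demosaic(raw):
    """Rebuild an (r, g, b) tuple at every photosite of a Bayer mosaic.

    raw is a list of rows with one measured value per photosite.
    A measured channel is copied as-is; a missing channel is the average
    of the samples of that color in the surrounding 3x3 window.
    """
    height, width = len(raw), len(raw[0])
    out = []
    for r in range(height):
        out_row = []
        for c in range(width):
            pixel = []
            for color in "RGB":
                if filter_color(r, c) == color:
                    pixel.append(float(raw[r][c]))
                    continue
                samples = [raw[rr][cc]
                           for rr in range(max(r - 1, 0), min(r + 2, height))
                           for cc in range(max(c - 1, 0), min(c + 2, width))
                           if filter_color(rr, cc) == color]
                pixel.append(sum(samples) / len(samples))
            out_row.append(tuple(pixel))
        out.append(out_row)
    return out
```

A flat, single-color area is reconstructed exactly by this averaging. Detail that changes from one photosite to the next is not, and that is where the interpolation errors and color moiré come from.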

What is JPEG?
● JPEG is NOT an image format, it is a compression standard
● JPEG stands for Joint Photographic Experts Group
● JFIF (JPEG File Interchange Format) is the standard file format that holds JPEG-compressed images
● Exif (Exchangeable image file format) is the format modern cameras actually use

JPEG compression assumptions
● The human eye resolves grayscale (brightness) detail better than color detail
● The human eye is more sensitive to green than to red or blue
● The human eye does not see high frequency changes well

Transformation to YCbCr space
● Three channels: luma (Y), blue-difference chroma (Cb), and red-difference chroma (Cr)
● Unlike in RGB, each channel is stored separately
● Luma holds brightness and is weighted most heavily toward green
● YCbCr color space separates luminance and chrominance
● Chroma channels are downsampled
● Usually by a factor of 2 in each direction, so each chroma channel takes 4x less space
● Before downsampling, each channel is an array of bytes, same as an RGB pixel component, so the conversion alone does not change the amount of data
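
A short sketch of the color transform and chroma downsampling, using the full-range conversion constants from the JFIF convention. The function names rgb_to_ycbcr and subsample_2x2 are illustrative, and the 2x2 averaging is one simple way to downsample, assumed here for clarity.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF convention)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each component back into the 0..255 byte range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))

def subsample_2x2(channel):
    """Average each 2x2 block of a channel (even dimensions assumed)."""
    return [[(channel[r][c] + channel[r][c + 1]
              + channel[r + 1][c] + channel[r + 1][c + 1]) / 4
             for c in range(0, len(channel[0]), 2)]
            for r in range(0, len(channel), 2)]

print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128)
print(rgb_to_ycbcr(0, 255, 0))      # (150, 44, 21)
```

The green weight of 0.587 in the first line is why luma carries most of the green information. Applying subsample_2x2 to Cb and Cr leaves each with a quarter of its samples, while Y stays at full resolution.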

Discrete Cosine Transform (DCT)
● Locally, an image is a combination of high and low frequency cosine functions
● For example, the shape of a hand is a low frequency component, while the knitting pattern on a sweater is a high frequency component
● The image is divided into 8x8 pixel blocks, and each block is encoded as a sum of 64 preset cosine functions
● This sorts the data according to its frequency
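
The 8x8 transform can be written directly from its definition. This is a slow reference sketch of the 2D DCT-II with the usual JPEG normalization; real encoders use fast factorizations instead. The name dct_2d is illustrative.

```python
import math

N = 8  # JPEG block size

def dct_2d(block):
    """2D DCT-II of an 8x8 block with JPEG normalization.

    block[x][y] holds level-shifted samples (pixel value minus 128).
    Returns coeffs[u][v]; coeffs[0][0] is the DC term, and larger u, v
    correspond to higher frequencies.
    """
    def alpha(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = 0.25 * alpha(u) * alpha(v) * total
    return coeffs

flat = [[10] * N for _ in range(N)]
print(round(dct_2d(flat)[0][0], 6))  # 80.0
```

A perfectly flat block puts all of its energy into the single DC coefficient, and every other coefficient is zero up to rounding error.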

DCT calculation and compression

Divide by quantization table value and round to nearest integer

DCT quantization
● Large values are concentrated in the low frequency area
● On most images, high frequency cosines don’t contribute much, and our eye can’t see them well
● After quantization most values will be zero, only those that contribute a lot to the image will stay
● High frequency values in the quantization table are big, so those coefficients will almost surely be zeroed out
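
The divide-and-round step can be sketched as follows. The table is the example luminance table from Annex K of the JPEG standard; encoders are free to use other tables. The names LUMA_QTABLE, quantize, and dequantize are illustrative.

```python
# Example luminance quantization table (JPEG standard, Annex K).
# Divisors grow toward the bottom right, i.e. toward high frequencies.
LUMA_QTABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(coeffs, qtable=LUMA_QTABLE):
    """Divide each DCT coefficient by its table entry and round.

    Python's round() rounds exact halves to the nearest even integer,
    which is close enough for this illustration.
    """
    return [[round(coeffs[u][v] / qtable[u][v]) for v in range(8)]
            for u in range(8)]

def dequantize(levels, qtable=LUMA_QTABLE):
    """Approximate inverse used by the decoder: multiply back."""
    return [[levels[u][v] * qtable[u][v] for v in range(8)]
            for u in range(8)]

# Synthetic block: a strong DC term and weak values everywhere else.
coeffs = [[4.0] * 8 for _ in range(8)]
coeffs[0][0] = 160.0
levels = quantize(coeffs)
print(levels[0][0], sum(abs(x) for row in levels for x in row))  # 10 10
```

Only the DC term survives in this synthetic block. The small values are rounded to zero and cannot be recovered by dequantize, which is where the loss in lossy compression happens.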

Compression
● Run-length and Huffman encoding are very good at compacting repeated values, such as the long runs of zeros left by quantization
● We store the quantization tables used and the encoded DCT coefficients in the JFIF file so the process can be reversed during decompression
● Chroma quantization tables penalize detail much more heavily than the luma table
● The decompressed image differs relatively little from the original, while the file size is reduced significantly
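
To illustrate why a block full of zeros compresses well, here is a generic Huffman coder applied directly to quantized coefficients. This is a simplification: baseline JPEG first reorders coefficients in zigzag order and Huffman-codes run-length/size symbols rather than raw values. The name huffman_code and the sample coefficient list are made up for the example.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:                      # degenerate case: one symbol
        return {next(iter(freq)): "0"}
    tiebreak = count()                      # keeps heap entries comparable
    heap = [(n, next(tiebreak), {sym: ""}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        n0, _, codes0 = heapq.heappop(heap)
        n1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (n0 + n1, next(tiebreak), merged))
    return heap[0][2]

# Hypothetical quantized block: a few nonzero values, then zeros.
coeffs = [10, -3, 2, 0, 0, 1] + [0] * 58
code = huffman_code(coeffs)
bits = sum(len(code[c]) for c in coeffs)
print(bits, "bits instead of", 8 * len(coeffs))  # 72 bits instead of 512
```

The very frequent zero gets a one-bit code and the rare values get longer ones, so the more coefficients quantization zeroes out, the shorter the encoded block becomes.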

Disadvantages of JPEG
● It is lossy, so not ideal for scientific images
● Blocks are encoded independently, with no coherence between them -> visible JPEG block artifacts
● Not good for text, signs, and other images with sharp edges and high contrast, due to JPEG artifacts
● Text violates our assumption that high frequencies do not contribute much to the image
● When the DCT is applied to an image of text, the coefficients are not concentrated in the low frequency area, so a lot gets lost

Steganography
● Neat trick of encoding secret messages in images
● Simplest approach: encode in least significant bits
○ Can be detected, because the least significant bits will look like random noise
● More complex approach: encode in DCT coefficients
○ Not detectable by visual inspection
○ Can be detected by statistics, the coefficient distribution will be off
○ JSteg, F5, etc.; many are open source
● Modern algorithms embed in DCT coefficients but take the statistics into account; machine learning has had some success detecting them
● Also used for protecting media by watermarking
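
The simplest approach above can be sketched in a few lines. This is an illustrative least-significant-bit scheme over raw bytes, such as uncompressed pixel data; it would not survive JPEG's lossy quantization, which is one reason JPEG tools embed in DCT coefficients instead. The names embed_lsb and extract_lsb are hypothetical.

```python
def embed_lsb(pixels, message):
    """Hide message bytes in the least significant bits of pixel bytes."""
    # Unpack the message into bits, most significant bit first.
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message does not fit in the cover image")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # overwrite only the lowest bit
    return bytes(stego)

def extract_lsb(pixels, length):
    """Read `length` message bytes back from the least significant bits."""
    out = bytearray()
    for i in range(length):
        byte = 0
        for bit_index in range(8):
            byte = (byte << 1) | (pixels[8 * i + bit_index] & 1)
        out.append(byte)
    return bytes(out)

cover = bytes(range(64))          # stand-in for 64 pixel values
stego = embed_lsb(cover, b"hi")
print(extract_lsb(stego, 2))      # b'hi'
```

Each pixel value changes by at most 1, which is invisible to the eye, but the altered low bits no longer follow the statistics of a natural image.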
