Stereoscopic 3D: Generation Methods and Display Technologies for Industry and Home Entertainment

Raymond Phan

Ph.D. Candidate
Multimedia and Distributed Computing (MDC) Research Laboratory
Department of Electrical and Computer Engineering
Ryerson University
Human Computer Interaction Guest Lecture
Thursday, March 8th, 2012

Outline of Presentation
• Introduction
– Stereoscopy / 3D Vision
• What is 3D all about??
• Depth and Disparity
• Some methods on generating 3D content
– Conversion from 2D imagery / video to 3D
• Cut and Paste Technique
• Depth-Image-Based Rendering – Recover Depth Maps
– Automated Methods – Using motion, focus cues, perspective
– Semi-Automated Methods


Outline of Presentation – (2)
– Acquiring 3D content directly
• Stereo Rigs, Multi-camera Setup
• 3D cameras
• Displaying 3D Content
– Anaglyphs (very retro)
– 3D Theatres with polarized glasses
• RealD (most popular), IMAX
– Shutter glass technology
• nVidia 3D Vision, XpanD 3D, DLP projection systems,
DLP TVs

Outline of Presentation – (3)
– Interference Filter Technology
• Based on projecting colours of different wavelengths to
each eye → Dolby 3D, Panavision 3D
– Autostereoscopic Systems
• Technology without the use of glasses
– Parallax Barriers, Lenticular Arrays
– Single view vs. Multi-view systems
• Seen in the Nintendo 3DS, Fujifilm FinePix Real 3D
cameras, etc.
• Applications
• Conclusions

Introduction
• So… what is stereoscopy / 3D vision?
– Creating the illusion of depth in an image or video
– Take images on flat displays, and make them “look real”


Introduction – (2)
• Need to know some basic things first:
– An object seen by the left eye appears horizontally shifted
relative to where the right eye sees it → disparity
– The greater/smaller this shift, the closer/farther the
object → depth
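
The inverse link between disparity and depth can be made concrete with the standard pinhole stereo relation Z = f * B / d. This is only a minimal sketch, assuming two parallel cameras; the function name, the focal length in pixels and the baseline are illustrative assumptions, not values from the lecture.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth (metres) of a point seen by two parallel pinhole cameras.

    Standard stereo relation Z = f * B / d: larger disparity means a closer point.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_px * baseline_m / disparity_px


# Assumed values: 1000 px focal length, 6.5 cm baseline (roughly eye spacing).
near = depth_from_disparity(50.0, focal_px=1000.0, baseline_m=0.065)
far = depth_from_disparity(5.0, focal_px=1000.0, baseline_m=0.065)
```

With these assumed numbers, the point with ten times the disparity comes out ten times closer.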


Generating 3D: 2D – 3D – (1)
• 1st Method: Cut and Paste Technique
– Used in IMAX’s 3D
DMR process

– 35 mm frames → High-res. digital → Left-eye frames
– Right-eye frames → Objects in the left frames are manually
shifted horizontally to create each new frame

– Remember disparity (close/far)! The closer the object,
the larger the shift needs to be
– Main Disadvantage: Very time consuming!
• Currently done on a frame-by-frame basis
• Due to this, only ~10 minutes of 35 mm video is converted
to 3D → Takes ~1 month to complete the whole process
• Our MDC Project with IMAX: Goal → Perform 2D to 3D
movie conversion faster
• Use a semi-automatic process to extract objects, and do
this every 10, or 20 frames or so
• In between frames, “guess” the best estimate of where the
objects are
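
One simple way to "guess" an object's location between user-marked frames is linear interpolation. The sketch below is only an illustration of that idea; it is not the method used in the MDC project, and the function name and keyframe format are hypothetical.

```python
def interpolate_position(key_a, key_b, frame):
    """Linearly interpolate an object's (x, y) position between two keyframes.

    key_a and key_b are (frame_index, (x, y)) pairs with key_a before key_b.
    """
    (fa, (xa, ya)), (fb, (xb, yb)) = key_a, key_b
    if fa == fb or not fa <= frame <= fb:
        raise ValueError("frame must lie between two distinct keyframes")
    t = (frame - fa) / (fb - fa)  # 0 at key_a, 1 at key_b
    return (xa + t * (xb - xa), ya + t * (yb - ya))


# Hypothetical object marked by the user at frames 0 and 10.
mid = interpolate_position((0, (100.0, 40.0)), (10, (120.0, 50.0)), frame=5)
```

Real footage usually needs something smarter than a straight line (objects accelerate, deform and get occluded), which is why this is only a starting point.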

• 2nd Method: Depth-Image-Based Rendering (DIBR)
– 3D Content → One 2D Image + Depth Map
– Depth Map: Image containing the depth of each pixel
• Closer Pixels == Light values, Farther Pixels == Dark values
– Orig. Image → Left View. Right view → Use depth
map (d(x,y)) to calculate shifted pixel from left view
– Equation to generate the right view:
Right(x,y) = Left(x + d(x,y), y)
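
The view-generation equation can be sketched directly in code. This is a minimal illustration, assuming d(x,y) has already been converted from depth-map values to an integer pixel shift, and assuming out-of-range source columns are clamped to the image border; real renderers also have to fill holes, which is not shown.

```python
def render_right_view(left, disparity):
    """Build the right view with Right(x, y) = Left(x + d(x, y), y).

    left and disparity are row-major lists of rows with the same size.
    Source columns that fall outside the image are clamped to the border.
    """
    height, width = len(left), len(left[0])
    right = []
    for y in range(height):
        row = []
        for x in range(width):
            src_x = x + disparity[y][x]
            src_x = max(0, min(width - 1, src_x))  # clamp at the image edge
            row.append(left[y][src_x])
        right.append(row)
    return right


# One-row toy image with a uniform shift of 2 pixels.
left = [[10, 20, 30, 40, 50]]
disparity = [[2, 2, 2, 2, 2]]
right = render_right_view(left, disparity)
```

The last two pixels of the toy result repeat the border value, which is exactly where a real system would need hole filling.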

• Commonly known as 2D to 3D Conversion
– Goal of 2D to 3D Conversion: Use an image and
determine what the best depth map is
– We use this depth map for conversion
– Use original single view image / frame as the left
view, and the depth map to create the right view
• There are two main methods to do this:
– Automated Methods → Automatically examine
features in an image or frame and infer depth


– Semi-Automated Methods → User-guided
• Mark certain areas of the image / frame with what you
think the depths should be at these locations
• Algorithm determines the rest of the depths
• Question: How do we know for sure that
we’re marking the proper depths?
– It has been shown that as long as you mark depths in a
perceptually consistent way, the resulting depth perception is good
• Automated Methods:
– Popular Methods: Motion, Focus and Perspective

• Motion: Main Principle
– Objects that are closer appear to move faster
– Objects that are farther away appear to move slower
• Find motion vectors
– Find how much a pixel moves from one frame to
the next → Calculate displacement vector
– Larger vector == Closer depth and vice-versa
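
Turning motion vectors into a depth map can be sketched as follows. This assumes the motion vectors have already been estimated by some other step, and it uses the light-is-closer depth-map convention; the function name and the linear scaling to 0..255 are illustrative choices, not the lecture's.

```python
import math


def depth_from_motion(vectors):
    """Map per-pixel motion vectors (dx, dy) to 8-bit depth values.

    Larger displacement -> lighter value (closer). A static frame maps to
    all zeros.
    """
    magnitudes = [[math.hypot(dx, dy) for dx, dy in row] for row in vectors]
    peak = max(max(row) for row in magnitudes)
    if peak == 0:
        return [[0 for _ in row] for row in magnitudes]
    return [[round(255 * m / peak) for m in row] for row in magnitudes]


# Hypothetical motion field: the right pixel moves 5 px, the left one only 1 px.
depth = depth_from_motion([[(0, 1), (3, 4)]])
```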

• Potential Problems:
– Sometimes, far away objects move just as fast too
– Motion estimation (calculating motion vectors)
can be subject to error (e.g. very fast motion)
– If the image / frame is noisy, the noise will corrupt
the measurements
• Depth from focus: Main Principle
– Take multiple pictures of the same scene
– Each is taken with different camera parameters

• We basically change the focus setting of the
camera
– Focus distance: Distance from the camera
to the surface that is in sharp focus
– Crudely, we can change this by
adjusting the focus ring of your lens
• After, we find the amount of blur of an object
– In this aspect, sharper surfaces are closer, and
farther objects are more blurry

• We find a correlation between the depth, and
the amount of blur over the surfaces
– Having multiple images at different focus settings
is a must!
• Problems:
– Needs more than one image of the same shot
• May not have such info
– Math is just too crazy
– Method rarely used now!
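
To give a feel for the idea, the sketch below picks, for every pixel, the image in a stack where that pixel looks sharpest. It assumes each stack index corresponds to a known camera setting that can later be mapped to a depth, and it uses a 4-neighbour Laplacian as the sharpness measure, which is a common choice but an assumption here, not something stated in the lecture.

```python
def sharpness(img, y, x):
    """Absolute 4-neighbour Laplacian at (y, x); border pixels are clamped."""
    h, w = len(img), len(img[0])
    up, down = img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x]
    left, right = img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]
    return abs(4 * img[y][x] - up - down - left - right)


def sharpest_index_map(stack):
    """For each pixel, return the index of the stack image where it is sharpest."""
    h, w = len(stack[0]), len(stack[0][0])
    return [
        [max(range(len(stack)), key=lambda i: sharpness(stack[i], y, x))
         for x in range(w)]
        for y in range(h)
    ]


# Toy stack of two 1x4 images: image 0 has detail on the left, image 1 on the right.
stack = [
    [[0, 9, 5, 5]],
    [[5, 5, 0, 9]],
]
index_map = sharpest_index_map(stack)
```

The resulting index map is not yet a depth map; it only says which capture was sharpest at each pixel.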

• Depth from perspective: Main Principle
– We use parallel lines and vanishing points in an
image or frame to give us a sense of depth
– Examples: Railroad Tracks, Tunnels, Roadsides
– These entities give us a sense of depth where they
appear to converge at a single point
– This single point would be the farthest point in the
image and the farthest depth


• Problems:
– Only a subset of images / frames fall into this category
– Can only deal with outdoor scenes, or with scenes that
have perspective within them
• Not all images belong here!
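
Once a vanishing point is known, a very crude depth map can be built by making depth fall off towards it. The sketch assumes the vanishing point has already been found (detecting it is the hard part and is not shown) and assumes a linear falloff with distance, which is an illustrative choice only.

```python
import math


def depth_from_vanishing_point(width, height, vp_x, vp_y):
    """8-bit depth map that is darkest (farthest) at the vanishing point.

    Values grow linearly with Euclidean distance from (vp_x, vp_y), so pixels
    far from the vanishing point are treated as closer to the camera.
    """
    dist = [[math.hypot(x - vp_x, y - vp_y) for x in range(width)]
            for y in range(height)]
    peak = max(max(row) for row in dist)
    if peak == 0:  # single-pixel image
        return [[0] * width for _ in range(height)]
    return [[round(255 * d / peak) for d in row] for row in dist]


# Hypothetical 3x1 image with the vanishing point at its centre column.
depth = depth_from_vanishing_point(3, 1, vp_x=1, vp_y=0)
```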

• Semi-automatic methods:
– Mark some areas in an image / frame with what you
think the best depth should be
– Use this info to determine the rest of the depths
– This is the area that I am focusing on right now
• We can consider this as a case of multiple
object image segmentation
– Each “object” is a user-marked depth
– We decompose the rest of the image into
different objects → i.e. different depths


• This method allows the user to fully control
the depth perception and experience
• Potential Problem:
– Takes more time because of user interaction, and
computational complexity increases
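
To illustrate how a few user-marked depths can determine the rest, the sketch below spreads the marked values by repeated neighbour averaging. This simple diffusion is a stand-in chosen for brevity; it is not the segmentation-based approach described in this lecture, and the function name, the None-for-unmarked convention and the iteration count are all assumptions.

```python
def propagate_depths(marks, iterations=200):
    """Fill unmarked pixels (None) by repeatedly averaging their 4-neighbours.

    User-marked pixels keep their depth; unmarked pixels start at the mean of
    the marked depths and relax towards a smooth interpolation.
    """
    h, w = len(marks), len(marks[0])
    known = [v for row in marks for v in row if v is not None]
    if not known:
        raise ValueError("at least one pixel must be marked")
    start = sum(known) / len(known)
    depth = [[start if v is None else float(v) for v in row] for row in marks]
    for _ in range(iterations):
        nxt = [row[:] for row in depth]
        for y in range(h):
            for x in range(w):
                if marks[y][x] is not None:
                    continue  # user strokes are hard constraints
                nbrs = [depth[j][i]
                        for j, i in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                        if 0 <= j < h and 0 <= i < w]
                nxt[y][x] = sum(nbrs) / len(nbrs)
        depth = nxt
    return depth


# Hypothetical strokes: far (0) on the left edge, near (100) on the right edge.
result = propagate_depths([[0, None, 100]])
```

Plain diffusion ignores object edges, so depths bleed across boundaries; respecting those boundaries is what the segmentation view of the problem is for.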

• Another way to generate depth maps:
– Specialized hardware
• Example: ZCam → Measures depth by bouncing infra-red
light off objects and reading it back with a camera sensor

– Problem: Hardware is expensive!

Direct 3D Acquisition – (1)
• Can directly acquire 3D information:
– Grabbing both left and right eye images / video
• 1st Method: Stereo Rigs
– Tripod with 2 cameras, separated by eye distance

– Drawbacks:
• Need 2 cameras! Synchronization!
• Difficult to separate cameras by eye distance

• We can also use multi-camera stereo rigs
– Each pair of cameras is positioned at a different
point to capture the same scene

– Each viewpoint captures the objects in a different
way so that we can assemble all these together to
view a 3D object without glasses (more later)


• Example: MERL 3DTV system (w/o glasses)

– 16 cameras and projectors for 16 viewpoints
– Depending on where you stand, you see a
different viewpoint → Just like in real-life!

• 2nd Method: 3D Cameras
– Specialized cameras specifically designed to take left
and right eye images


• Non-digital 3D cameras take left and right
images on two separate rolls of film
• Digital 3D Cameras (e.g. Fujifilm’s W1) take
left and right images and generate two
separate image files
• IMAX and specialized 3D video cameras
operate in the same way
– Two separate rolls of film
– For IMAX, the cameras are large as the film is
larger. Why? For higher resolution

Displaying 3D Content – (1)
• Left & Right eye images are created
– How do we display these so we can perceive 3D?
– Many technologies exist to display 3D imagery and
video
• Let’s start off with the most basic one: Anaglyphs
– Left & right images are each filtered with a separate colour filter
– Example: If you had a red colour filter, you determine
how much red a pixel has and that’s the output
– Each colour filter is chromatically different
• One filter cannot have any similarity in colour to the other

• When one side is filtered with one colour, you
must choose the other filter to be a
contrasting colour
– How do we choose? Trichromacy theory states
that all colours are made up of Red, Green & Blue
– We basically choose the colour filters from this set
• Examples:
– Red and Cyan (Green + Blue) Filters
– Red and Green Filters
– Red and Blue Filters, etc.

• After you filter each image separately, you
superimpose the results onto one image
• To view the images, you use anaglyph glasses,
where each side is of the same filter you used
– i.e. if you used Red for the left, and Cyan for the
right, we use anaglyph glasses that are of the
same order
– Here, the image with the red filter goes to the left
eye, and the cyan image goes to the right eye

• As such, because we’re seeing two separate
images for two eyes, we thus perceive 3D
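
The filter-and-superimpose steps for a red/cyan anaglyph can be sketched as below: keep only the red channel of the left view, only green and blue of the right view, and combine them into one image. The function name and the list-of-tuples image format are illustrative assumptions.

```python
def red_cyan_anaglyph(left, right):
    """Combine two RGB images into a red-cyan anaglyph.

    The red channel comes from the left view; green and blue (cyan) come from
    the right view. Images are row-major lists of (r, g, b) tuples.
    """
    if len(left) != len(right) or len(left[0]) != len(right[0]):
        raise ValueError("left and right views must have the same size")
    return [
        [(l_px[0], r_px[1], r_px[2]) for l_px, r_px in zip(l_row, r_row)]
        for l_row, r_row in zip(left, right)
    ]


# One-pixel toy views with made-up colours.
anaglyph = red_cyan_anaglyph([[(200, 10, 20)]], [[(30, 150, 160)]])
```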


• Advantages:
– Great for viewing without 3D technology
– Anaglyph glasses are pretty cheap
• Problems:
– Range of colours can be limited, as the
predominant colours in the images are of the
colour filters you applied
– Doesn’t work well if the range of colours in the
image is limited


• 2nd method: 3D films in theatres with polarized
glasses
– 2 projectors → Left & Right video projected
simultaneously on the theatre screen
– Views filtered with orthogonal polarizing filters
– Viewers wear low-cost polarized eyeglasses
– The two lenses are polarized orthogonally to each other


• What’s polarization!?
– Light can be viewed as a propagating wave
– Polarization determines the orientation of a wave’s
oscillations
– When passed through a
polarizing filter, only the oscillations
aligned with the filter get through,
as if the light were forced through a slit
– Consequence – Not all light passes through
– Left view passed through a horizontal polarized filter
– Right view passed through a vertical polarized filter

– Both views are shown simultaneously on a silver
perforated screen to preserve polarization
– Glasses → Left lens has a horizontal filter,
right lens has a vertical filter
– Left lens blocks the right view, and
right lens blocks the left view!
• Drawbacks:
– Need to keep your head level
– Tilting your head causes the left and right views to
bleed into each other
– Image is darker, as only some of the light is sent
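
Why tilting the head causes bleeding can be worked out with Malus's law, which gives the fraction cos²(θ) of linearly polarized light that passes an ideal polarizer at angle θ to it. The sketch assumes ideal filters and the horizontal-left / vertical-right arrangement described above; the function names are illustrative.

```python
import math


def transmitted_fraction(light_angle_deg, filter_angle_deg):
    """Malus's law: fraction of linearly polarized light passing an ideal filter."""
    theta = math.radians(light_angle_deg - filter_angle_deg)
    return math.cos(theta) ** 2


def left_eye_mix(head_tilt_deg):
    """Fractions of the left (0 deg) and right (90 deg) views reaching the left eye.

    The left lens is a horizontal filter that rotates with the viewer's head.
    """
    wanted = transmitted_fraction(0.0, head_tilt_deg)
    leaked = transmitted_fraction(90.0, head_tilt_deg)
    return wanted, leaked


level = left_eye_mix(0.0)    # head level: essentially no leakage
tilted = left_eye_mix(30.0)  # 30 degree tilt: part of the right view leaks in
```

Under these ideal assumptions, a 30 degree tilt lets a quarter of the wrong view through while dimming the correct one to three quarters.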

• There is a way to combat “head level” issue
– Circular Polarization → Used in RealD technology
– IMAX used the former (linear) method → Now they have changed
– RealD is used in standard 3D theatres
– IMAX has the bigger screen, and better sound!
• What is circular polarization?
– The oscillation direction of the
wave rotates in a circle as the
wave propagates


• Each lens of the 3D projector continuously
changes polarization direction
• 3D glasses: Circularly polarized lenses that keep
separating the views even when the head is tilted
– How is this possible?
– One lens is circularly polarized clockwise, while
the other is polarized counter-clockwise
– One lens passes only the clockwise-polarized image,
and the other only the counter-clockwise one
– Each lens receives correct corresponding image

• 3rd Method: Using shutter glasses
– Most popular in current 3DTVs on the market
– Also used in DLP Projection Systems
• Shutter glasses principle:
– Lenses are usually made of LCDs
– Used to separate the left and right views
– Lenses contain liquid crystals that block or pass
light in sync with the display, via an IR sensor
– Voltages are applied to the lenses so that one eye
blocks light, but the other one allows it through

– Alternate this shutting off in sync with the image
displayed on the screen to show 3D, via IR sensor
– TV / monitor displays the left image, right lens is
blocked → Allows left eye image to be seen
– After, we do the same for the right image, with the left
lens blocked → Allows right eye image to be seen
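
The alternation between displayed view and blocked lens can be written out as a small schedule. This is only an illustration of the timing logic; the function names and the 120 Hz refresh rate are assumptions, and no real glasses or display API is involved.

```python
def shutter_schedule(num_frames, first_eye="left"):
    """List (displayed_view, open_lens, blocked_lens) for each displayed frame.

    Views alternate every frame; the lens matching the displayed view is open
    and the other lens is blocked.
    """
    if first_eye not in ("left", "right"):
        raise ValueError("first_eye must be 'left' or 'right'")
    order = ["left", "right"] if first_eye == "left" else ["right", "left"]
    schedule = []
    for n in range(num_frames):
        shown = order[n % 2]
        blocked = "right" if shown == "left" else "left"
        schedule.append((shown, shown, blocked))
    return schedule


def per_eye_rate(display_hz):
    """Each eye sees every other frame, so it gets half the display rate."""
    return display_hz / 2


frames = shutter_schedule(4)
rate = per_eye_rate(120)  # 120 Hz is an assumed display refresh rate
```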


• Is used in nVidia 3D Vision Kit & XpanD 3D
– XpanD 3D: Company that markets shutter glass 3D
technology to homes and theatres
• Currently > 1000 theatres with shutter glass tech.
– nVidia 3D Vision: Kit for an nVidia video card
• IR sensor connected to video card to control views
• Only works with a compatible 3D monitor
• Advantages:
– No silver screen needed, and no need to keep your head level
• Disadvantages: Shutter glasses are expensive!
– Need to replace batteries, high maintenance

• DLP 3DTV technology further explained
– DLP: Digital Light Processing
– Backbone: Digital Micromirror Device
• Tiny mirrors direct light
• Device can have over 1 million mirrors!
– Each micromirror is either ON or OFF
• ON → Reflects light out towards the screen
• OFF → Does not reflect light out towards the screen
(it is absorbed instead)
– Each mirror in the DLP 3DTV is
controlled by a pixel in the image
to display on the screen


• For DLP 3DTVs, mirrors are in a diamond configuration
– One mirror displays two pixels of input data: How!?
• Each mirror shows one pixel, then does a half-pixel shift
downwards and shows the other pixel immediately below
• Done at twice the normal frame rate so you can’t see the change

• Wait! Aren’t we losing 50% of the data?

– No! The half-pixel shifting ensures same resolution
• Called SmoothPicture algorithm
– Saves bandwidth: Use same bandwidth for 3D images
– For a 2D image, the input data is the image itself
– For showing 3D, the left-eye image is shown first,
then the right-eye image is shown after ½-pix shifting
– LCD shutter glasses are in sync during each shift
• Drawbacks:
– Obviously, the TV is expensive
– Shutter glasses are high maintenance, and expensive

• Next method: Interference Filter Technology
– Used in Dolby 3D and Panavision 3D systems
– A multispectral colour filter is used to filter
specific wavelengths of red, green and blue,
directed to the left eye
– Another colour filter used to filter different
wavelengths of red, green and blue, directed to
the right eye
– This uses glasses too → Designed to filter the
same wavelengths in tune with each colour filter


• This process is called: wavelength
multiplex visualization
• Advantages:
– No silver screen required
– Works with conventional screens
– Is not restricted to just theatres
• Disadvantages:
– Glasses are more expensive
– Colour filters must be very accurate

• Last but not least: Autostereoscopic Displays
– View 3D content without glasses
– Currently seen in small gaming systems and small
commercial 3D cameras
• Nintendo 3DS and view screen of the Fujifilm W1
– Currently not available publicly for larger screens
– Common problem with autostereoscopic: Good
for viewing over small screens, but larger screens
tend to make people dizzy or cause discomfort
– Research currently performed to minimize this

• Principle: Uses either lenticular sheets or
parallax barrier sheets
– Impose the left and right images
onto narrow alternating strips
– Half the columns show the
corresponding columns of the left
image, and the other half show the
corresponding columns of the right image
– In the figure, they’re represented
as green and pink respectively

– After, we use a sheet that either
blocks every other strip →
Parallax Barrier
– Or uses lenses of the same size
as the strips to bend the light from
the left and right strips so that each
appears to fill the entire image → Lenticular Array
– Either of these will allow the left
and right images to be directed
to the correct eye
– You just need to stand in the right spot!
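
The strip layout can be sketched as a simple column interleave of the two views. Which column parity goes to which eye depends on the actual panel and barrier, so the even-left / odd-right choice here is an assumption, as is the function name.

```python
def interleave_columns(left, right):
    """Interleave two views column by column for a two-view autostereoscopic panel.

    Even columns take the left view, odd columns take the right view.
    """
    if len(left) != len(right) or len(left[0]) != len(right[0]):
        raise ValueError("left and right views must have the same size")
    return [
        [(l_row if x % 2 == 0 else r_row)[x] for x in range(len(l_row))]
        for l_row, r_row in zip(left, right)
    ]


# Toy one-row views: the labels make the origin of each column obvious.
panel = interleave_columns([["L0", "L1", "L2", "L3"]], [["R0", "R1", "R2", "R3"]])
```

Each eye ends up with only half of the panel's columns, which is the horizontal-resolution cost of this approach.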

• This can work with multi-view systems too
– The technology can be modified to
display a different viewpoint of the
scene
• Remember multi-view stereo rigs?
– When you stand in a different
position, you will get a different
perspective of the scene
• Just like what would happen in real-life!
– Achieve this by directing the view of a particular
perspective to the right pairs of strips / lenses
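
For more than two viewpoints, the same idea can be sketched by cycling through the views across the columns. The column-x-shows-view-x-mod-N layout is an illustrative assumption; real multi-view panels often use slanted or sub-pixel layouts that are not modelled here.

```python
def interleave_views(views):
    """Interleave N same-sized views column by column: column x shows view x mod N."""
    n = len(views)
    if n == 0:
        raise ValueError("need at least one view")
    height, width = len(views[0]), len(views[0][0])
    if any(len(v) != height or len(v[0]) != width for v in views):
        raise ValueError("all views must have the same size")
    return [
        [views[x % n][y][x] for x in range(width)]
        for y in range(height)
    ]


# Three hypothetical viewpoints of a one-row, six-column scene.
views = [[[f"V{k}c{x}" for x in range(6)]] for k in range(3)]
panel = interleave_views(views)
```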

• Current advocates for autostereoscopic tech.
– Sharp designed their first
autostereoscopic LCD monitor
in 2004 → Discontinued in 2007
– Similar → Philips WOWvx series
• Discontinued in 2009
– Hitachi → Designed autostereoscopic
mobile phone in 2009
– Nintendo 3DS → Uses parallax barrier
– Fujifilm W1 Viewscreen →
Uses lenticular sheets

• Advantages:
– Glasses-free: No maintenance req’d on equipment
– Ideal for delivering to a large group of people
• Co-ordination is required for glasses-based technology
– Proven good for small screens / mobile phones
• Disadvantages:
– Larger screens still experimental and expensive
– Larger screens require you to stand far back to
appreciate 3D content


Applications
• What can 3D be used for?
– Entertainment and Gaming (obviously!)
– Real-time 3D Video Teleconferencing
– Interactive Medical Surgery
– Interactive Training Sessions
– Virtual Model Exploration 
– Robot Navigation
– Fine Art Appreciation


Conclusions
• This presentation gave a basic overview of how
3D is made, and how we display 3D
• This presentation is not exhaustive!
– Many other methods to generate 3D material
• Much research performed in this area
– Several technical conferences in 3D: IEEE 3DTVCON,
IEEE 3DIM, SPIE Electronic Imaging
– Research group in Europe working on
standardizing 3D for mobile phones:
http://sp.cs.tut.fi/mobile3dtv/

Thank You!
Questions?

