Three-dimensional video

Three-dimensional video:
Trends and challenges
Marco Cagnazzo
Maître de conférences
Télécom-ParisTech

Télécom-ParisTech
 Founded in 1878 as Ecole Supérieure de Télégraphie
 The place where the word Telecommunications (Télécommunications) was born
 Ecole Nationale Supérieure des Télécommunications from 1943 to 2008, Télécom-ParisTech since then
• Member of Institut Mines-Telecom since March 2012

2

26/02/2014

Marco Cagnazzo

3D Video: Trends and Challenges

Télécom-ParisTech
 7 Masters of Science in Telecommunications
 Active research
• More than 220 researchers
• ≈400 PhD students
• 50 doctorates awarded per year
• Dozens of post-doc positions opened every year
• Over 600 scientific publications per year
 CNRS Mixed Research Unit
• Signal and Image Processing
• Computer Science and Networks
• Electronics and Communications


Summary
 Introduction
 3D scene acquisition and formats
 3D geometry
 3D representation: coding
 3D services
 Conclusions


Summary
 Introduction
• 3D representation: an old new story?
• Depth perception

 3D geometry
 3D services
 Conclusions


3D imaging: an old new story?

 Flat images: the weighing of the heart scene from the Papyrus of Ani, ca. 1200 B.C.
 Perspective, distance fog: Masaccio, Trinità (1425-1427), Santa Maria Novella, Firenze; Piero della Francesca?, Leon Battista Alberti?, Città ideale (1470-1475 circa), Galleria Nazionale delle Marche, Urbino
 Multi-view: P. Picasso, Les Demoiselles d'Avignon, 1907, MOMA, NYC
• "Cubism was based on the idea of incorporating multiple points of view in a painted image, as if to simulate the visual experience of being physically in the presence of the subject, and seeing it from different angles" (Wikipedia)


Stereoscopic imaging

 As soon as photography was born, stereoscopic devices
were created
 1844: Stereoscope by David Brewster, a device that could
take photographic pictures in 3D.

 1851: Improvement by Louis Jules Duboscq (picture of
Queen Victoria displayed at The Great Exhibition)



Example

Stereoscopic view of Manhattan, 1909



Anaglyph image



3D Movies
 1855: the Kinematoscope was invented, ie the Stereo Animation
Camera.
 1915: The first anaglyph movie was produced in
 1922: the first public 3D movie was displayed - The Power of Love
 1935: the first 3D color movie was produced
 1947: Soviet Union developed 3D films: Robinson Crusoe
 1950s: many 3D movies were produced: Bwana Devil, House of Wax, Alfred Hitchcock’s Dial M for Murder (the movie was released in 2D because not all cinemas were able to display 3D films).
 2000s: Computer graphics and 3D Renaissance (Avatar, etc.)
• 3D video channels, 3D TV
• 3D video standards
• Multi-view, super-multiview, holoscopy… holography?


3D Television





 2008: 3D broadcast on Japanese cable channel BS 11
 01/01/2010: SKY 3D started broadcasting in S. Korea
 24/03/2010: Cablevision (USA) launched a 3D version of its MSG channel
 03/04/2010: British Sky Broadcasting launched a limited 3D TV broadcast service
 18/05/2010: Spanish Canal+ started 3D broadcast
 28/09/2010: Virgin Media launched a 3D TV on Demand service
...
 November 2010: 8 3D channels in Europe
 April 2011: HIGH TV, a 3D family entertainment channel, launched
 2012: 3D TV launched in China, Italy, and other countries
 2013: New 3D programs in Brazil; BBC suspends 3D programs


3D Television
Channel              Country(s)
HIGH TV 3D           Worldwide
n3D                  United States
Cinema 3D            United States
3net                 United States
Eurosport 3D         Europe
Sky 3D               United Kingdom and Ireland
Foxtel 3D            Australia
HD1                  Belgium (and other European countries)
Sky 3D               Germany and Austria
Anixe 3D             German-speaking countries
3D-TV                Finland
Sport 5 3D           Israel
MSG 3D               United States
nShow 3D             Poland
Canal+ 3D            France
Canal+ 3D España     Spain
NEXT Man 3D          Poland
NEXT Lejdis 3D       Poland
NEXT Young 3D        Poland
Active 3D            India
BS11                 Japan
RedeTV!              Brazil
Viasat 3D            Sweden
Brava3D              Europe
Teledünya 3D         Turkey
Sky 3D               South Korea
Sukachan 3D169       Japan
ESPN 3D              United States
Xfinity 3D           United States
Penthouse 3D         Europe
TV Azteca 3D         Mexico


Depth perception
 Monocular cues
• Perspective
• Motion parallax
• Depth from motion
• Distance fog and texture degradation
• Object sizes
• Illumination and shadows
• Blur
• Occlusions


Monocular cues
 Perspective, distance fog and texture degradation



Monocular cues
 Depth from motion



Monocular cues
 Illumination and shadows



Monocular cues
 Defocus blur, occlusions



Binocular cues
 Stereovision: vergence
• Disparity perception

 Accommodation (focus)



3D Video
 Today, 3D video mostly relies on a very simple representation of a 3D scene: a stereoscopic (two-view) representation
 However, more complete and flexible representations exist, as we will see
 Ideally, one would like to reproduce the light field of the original scene


3D Video Systems
[Block diagram. 3D Content Production: stereo camera, depth camera, multi-camera setup, 2D/3D conversion. Signals: video, depth / geometry, meta data. 3D Video Coding: Multiview + Depth (MVD). Receivers: DVB decoder (2D TV); 3D video decoder + DIBR (single-user 3D TV, multi-user 3D TV).]

Video services evolution
[Chart: video services by number of views (# views) and resolution (# pixels)]
# views    720×576    1920×1080    7680×4320
1 view     TV         HDTV         UDTV
2 views    3DTV       HD3DTV       UD3DTV
N views    FTV        HD-FTV       UD-FTV


Summary
 Introduction
• Plenoptic function
• Stereo, Multiview, MVD, LDV, holoscopy

 3D geometry
 3D services
 Conclusions


3D Video capture
 Stereoscopy: 2 cameras mounted side by side, separated by the same distance as between a person's pupils
 Multi-view capture uses arrays of many cameras to capture a 3D scene through multiple independent videos
 Color+depth camera: captures normal video and a depth map, estimated with radar-like techniques (using infrared) or structured light
 Multiview+depth (MVD): N color+depth cameras
• MVD: the most flexible format (view synthesis at user side)
 Holoscopy


3D Video representation: plenoptic function
 The plenoptic function, or light field, of a scene is the complete information about what can be seen from any angle, at any position, at any time, at any frequency (color)
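In its usual formulation (an assumption here, since the slide gives no formula), the plenoptic function depends on seven variables: viewing position, viewing direction, wavelength and time. The symbols below are illustrative:

```latex
\[
  P = P(x, y, z, \theta, \phi, \lambda, t)
\]
```

Here $(x, y, z)$ is the position of the observer, $(\theta, \phi)$ the viewing direction, $\lambda$ the wavelength (color) and $t$ the time. Stereo, multiview and holoscopic formats can then be read as different ways of sampling this function.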





3D video representation




From the plenoptic function to the stereo video



From the plenoptic function to the multiview video



From the plenoptic function to the super multiview video


3D Video Acquisition: stereo camera



3D Video Acquisition: color + depth



3D Video Acquisition: MVD



3D rendering: anaglyph



3D rendering: polarized glasses



3D rendering: alternate-frame sequencing
 Frames alternate between the left and the right view
 Video is projected at twice the frame rate
 Viewers wear glasses that alternately shutter the left and the right eye


Autostereoscopic display


Traditional 3D rendering: problems
 Accommodation (focus) / vergence (disparity) conflict
 Cross-talk


From the plenoptic function to the holoscopy



From the plenoptic function to the holoscopic format
 New format: holoscopy, or integral imaging
 Glasses-free 3D, promising no visual discomfort
[Diagram. Acquisition: 3D scene, microlens array, camera. Display: 2D screen, microlens array, 3D rendering.]
 Holoscopic images and videos
• Data redundancy
• Grid-shaped pattern
• Compression?


Holoscopy



Other formats: Layered Depth Video and Images



3D scene representation: summary
[Chart: scene representations placed by number of depths (geometrical information), from 0 depths to ∞ depths, and by number of views, from 1 view to ∞ views. Entries: 2D TV, 1 view + 1 depth, 1 view + multi depth, LDV, 3D model + texture, Multiview, Super Multiview, Holoscopy, Light field.]


Summary
 Introduction

 3D geometry
• Pin-hole camera model
• Stereoscopy and disparity

 3D services
 Conclusions


Pin-hole camera model
C : optical center
f : focal length
c : principal point
Using the image plane we avoid the image inversion of the
retinal plane



 Coordinate systems:
• W.r.t. the optical center (XC,YC,ZC)
• W.r.t. the image plane (x,y)
• W.r.t. the principal point (xc,yc)
• Real world (X,Y,Z)


[Figure: pin-hole projection of the points M, M’ of the object plane (at depth Zc) onto the points m, m’ of the image plane (at focal length f), through the optical center C]


Homogeneous coordinates
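A common way to write the pin-hole projection in homogeneous coordinates is sketched below. It assumes coordinates $(X_C, Y_C, Z_C)$ taken w.r.t. the optical center and image coordinates $(x_c, y_c)$ taken w.r.t. the principal point; the matrix layout is the usual textbook convention, not necessarily the notation of the slide:

```latex
\[
  x_c = f\,\frac{X_C}{Z_C}, \qquad y_c = f\,\frac{Y_C}{Z_C}
  \quad\Longleftrightarrow\quad
  Z_C \begin{pmatrix} x_c \\ y_c \\ 1 \end{pmatrix}
  =
  \begin{pmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}
  \begin{pmatrix} X_C \\ Y_C \\ Z_C \\ 1 \end{pmatrix}
\]
```

The division by $Z_C$, which is non-linear in Cartesian coordinates, becomes a linear (matrix) operation defined up to a scale factor.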




Intrinsic parameters


[Figure: image plane with the points m and m’]


Image coordinates




Extrinsic parameters




Image and real coordinates
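The chain from real-world coordinates to image coordinates, through extrinsic and then intrinsic parameters, can be sketched as follows. This is a minimal illustration under simplifying assumptions that are not taken from the slides: a single focal length `f` expressed in pixels, no skew, and a principal point at the hypothetical pixel position `(u0, v0)`. The function name and argument layout are also illustrative.

```python
def project_point(M_world, R, t, f, u0, v0):
    """Project a 3D world point to image coordinates with a pin-hole model.

    R (3x3 nested lists) and t (length 3) are the extrinsic parameters,
    mapping world coordinates to camera coordinates: M_c = R * M_world + t.
    f is the focal length in pixels and (u0, v0) the principal point.
    """
    # Extrinsic step: real-world frame -> camera frame (XC, YC, ZC).
    Xc, Yc, Zc = (
        sum(R[i][j] * M_world[j] for j in range(3)) + t[i] for i in range(3)
    )
    if Zc <= 0:
        raise ValueError("point is not in front of the camera")
    # Perspective division: coordinates w.r.t. the principal point (xc, yc).
    xc = f * Xc / Zc
    yc = f * Yc / Zc
    # Intrinsic step: move the origin from the principal point to the image origin.
    return xc + u0, yc + v0
```

With an identity rotation and a null translation, the camera frame coincides with the world frame, so only the perspective division and the principal-point shift act on the point.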




Stereoscopy

 The two projections of the same point onto the two image planes are called homologous points
 The stereo matching problem consists in finding the correspondence between homologous points


Epipolar geometry




Parallel cameras





 It is a case of particular interest
 It corresponds to human vision (frontal vision)
 Parallel optical axes and same focal length
 In this case the epipolar lines are parallel to the baseline, and the images are co-planar
 Homologous points differ only by a horizontal component, called disparity
 It is possible to rectify a pair of cameras, i.e. to produce the images corresponding to the co-planar case


Disparity and depth

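[Figure: two parallel cameras with optical centers Cl and Cr, baseline B and focal length f; a point M at depth Z, with lateral position X (X-B w.r.t. the right camera), projects to x in the left image and x' in the right image]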
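The relation between disparity and depth can be derived from the quantities named on this slide, assuming rectified parallel cameras with baseline $B$ and focal length $f$, and a point $M$ at depth $Z$ whose lateral position is $X$ in the left camera frame and $X - B$ in the right one. The symbol $d$ for the disparity is introduced here for illustration:

```latex
\[
  x = f\,\frac{X}{Z}, \qquad x' = f\,\frac{X - B}{Z}
  \quad\Longrightarrow\quad
  d = x - x' = \frac{f\,B}{Z}, \qquad Z = \frac{f\,B}{d}
\]
```

Disparity is thus inversely proportional to depth: near objects have a large disparity, far objects a small one.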


Disparity estimation



The disparity field




The disparity field: example



The disparity estimation problem




Difficulties of stereo matching

 Occlusions: not all the left image points are visible in the right image
 Non-identical cameras and noise give homologous points different luminance/colour
 Untextured regions make the data attachment term difficult to evaluate
 Complexity of the minimization problem
• Full search
• Convex minimization
• Parallel algorithms
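As an illustration of the full-search option, here is a minimal block-matching sketch for a rectified pair. It is not the method of the talk: the data attachment term is assumed to be a sum of absolute differences (SAD) over a small window, no regularization is used, and the function name, the `half_win` parameter and the sign convention (a left pixel at `x` matches a right pixel at `x - d`) are choices made for this example.

```python
def block_matching(left, right, max_disp, half_win=1):
    """Full-search disparity estimation for a rectified grayscale pair.

    left, right: images as lists of rows with identical sizes.
    Returns a map d such that left[y][x] is matched with right[y][x - d].
    Border pixels, where the window does not fit, keep disparity 0.
    """
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(half_win, h - half_win):
        for x in range(half_win, w - half_win):
            best_cost, best_d = None, 0
            # Only test disparities that keep the window inside the right image.
            for d in range(0, min(max_disp, x - half_win) + 1):
                cost = 0  # SAD data attachment term over the window
                for dy in range(-half_win, half_win + 1):
                    for dx in range(-half_win, half_win + 1):
                        cost += abs(left[y + dy][x + dx]
                                    - right[y + dy][x + dx - d])
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y][x] = best_d
    return disp
```

The cost of this exhaustive search grows with the image size, the window size and the disparity range, which is one reason why convex formulations and parallel implementations are of interest.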


Post-processing
 Often the disparity field can be enhanced using post-processing
• Cross-checking helps in finding occlusion points
• Interpolation: it makes it possible to “fill in” occlusions
• Median filtering: removes estimated values that differ too much from their neighborhood
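The first two post-processing steps can be sketched as follows. This is a simplified illustration, not the procedure of the talk: the tolerance `tol`, the function names and the filling rule (copy the nearest reliable value found on the left of the same row) are assumptions made for this example.

```python
def cross_check(disp_left, disp_right, tol=1):
    """Flag left-view pixels whose disparity is confirmed by the right view.

    disp_left[y][x] = d means left pixel x matches right pixel x - d;
    disp_right[y][xr] = d means right pixel xr matches left pixel xr + d.
    Returns a mask that is True where the two fields agree within tol.
    """
    h, w = len(disp_left), len(disp_left[0])
    valid = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xr = x - disp_left[y][x]
            if 0 <= xr < w and abs(disp_left[y][x] - disp_right[y][xr]) <= tol:
                valid[y][x] = True
    return valid


def fill_from_left(disp, valid):
    """Replace unreliable values by the nearest reliable one on their left."""
    filled = [row[:] for row in disp]
    for y, row in enumerate(filled):
        last = None
        for x in range(len(row)):
            if valid[y][x]:
                last = row[x]
            elif last is not None:
                row[x] = last
    return filled
```

Pixels that fail the check are typically occluded in the other view or wrongly matched; more careful interpolation schemes are possible.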



Summary
 Introduction
 3D scene acquisition and formats
 3D geometry
 3D representation: coding
• Multiview video coding
• MVD video coding
• Holoscopy coding
 3D services
 Conclusions


Coding of 3D video
 Encode each view separately (simulcast)
 Encode views jointly
• Use other views to predict the current image
 Encode one or more views and depth maps
• Joint or separate coding of view and depth


Compression standards
 Frame compatible stereo interleaving

 MPEG-2 Multiview Profile
[Figure: prediction structure with I, P and B frames]


Compression standards: H.264/MVC
 H.264 MVC extension
 Base view + dependent views
 Disparity compensated prediction
[Figure: MVC prediction structure across views and time, with frames labelled I0, P0 and B0 to B4]


3D video coding
 3D Video Coding (3DVC)
 New phase of standardization in MPEG
 Objectives:
• Display-independent representation
• Advanced stereoscopic display processing: e.g. adjust depth
perception by controlling baseline distance
• High quality auto-stereoscopic multiview displays: many
views with limited bit rate



MVV vs. MVD




3D Video Coding (3DVC)



3D-HEVC: Coding structure
 Coding by access units
[Figure: access-unit structure. Coder labels: HEVC; HEVC + additional tools. Prediction types: temporal, inter-component, inter-view (texture), inter-view (depth).]


Standardization is on-going
 Inter-view tools
• Disparity compensated prediction
• Inter-view motion prediction
• …
 Inter-component tools
• Quad-tree initialization/limitation
• Motion parameter inheritance
• Intra-prediction inheritance
• …


Enhancing the use of DCP
 DV: 9%
 MV: 91%
[Chart: mode categories Intra, Temporal Skip, Temporal Inter, Inter-view Inter, Inter-view Skip]


Conditional mode inheritance



Criteria for inheritance
[Figures: Sobel module maps and angle histograms (axes: Angle, Nbr occurrences)]
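The Sobel module and angle statistics named on this slide could be computed along the following lines. This sketch only shows how such quantities are usually obtained; the actual inheritance criterion is not given in the slides and is not reproduced here. The function names, the number of bins and the choice of folding angles into [-π/2, π/2] are assumptions.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_module_and_angle(img):
    """Return the Sobel gradient module and angle (radians) per pixel.

    img is a list of rows; border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    module = [[0.0] * w for _ in range(h)]
    angle = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            module[y][x] = math.hypot(gx, gy)
            if gx != 0:
                angle[y][x] = math.atan(gy / gx)
            elif gy != 0:
                angle[y][x] = math.pi / 2  # purely vertical gradient
    return module, angle


def angle_histogram(angle, module, n_bins=8):
    """Count gradient angles over [-pi/2, pi/2], ignoring flat pixels."""
    hist = [0] * n_bins
    for row_a, row_m in zip(angle, module):
        for a, m in zip(row_a, row_m):
            if m > 0:
                b = min(int((a + math.pi / 2) / math.pi * n_bins), n_bins - 1)
                hist[b] += 1
    return hist
```

A block whose histogram is concentrated in one bin contains a dominant edge direction, while a low module everywhere indicates a smooth region.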


Non-standard approach:
Depth Coding Based on Elastic Deformations
Base tool: a tool that can find an intermediate contour between an initial and a final one, by generating the geodesic (series of elastic deformations) between the two curves.

Depth compression: impact on image synthesis



DIBR: Depth-image based rendering
 Given a view, how to synthesize a virtual viewpoint?
 It is possible if depth is known:
• Linear operation (homography) in homogeneous coordinates
• Further simplified in the rectified case: disparity compensation
 VSRS: view synthesis reference software
[Figure: projections m and m' of a point M on the reference and virtual image planes, with optical centers C1 and C2]


VSRS: global scheme
[Block diagram. Inputs: two reference homography matrices and the synthesis homography matrix. Blocks: single view processing (one per reference view), merging, holes filling.]


VSRS: single view processing
[Block diagram. Inputs: reference homography matrix, synthesis homography matrix, reference depth, reference view. Blocks: depth map synthesis, homography matrix, view synthesis. Output: synthesized view.]


Depth map synthesis
 Mapping of depth values on the image plane
 When two points are associated with the same coordinates, only the nearest is kept (occlusion)
 Some coordinates may have no depth value (disocclusion)
 Median filtering removes “small” holes
[Figures: synthesized depth; result of median filtering]
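In the rectified case, the mapping of depth values reduces to a horizontal shift of each pixel. The sketch below illustrates the occlusion and disocclusion rules on a single row; it is not the VSRS implementation. It assumes a virtual camera shifted by a baseline `baseline` so that a pixel at depth Z moves by `f * baseline / Z` pixels to the left, rounded to the nearest integer, and it marks holes with `None`.

```python
def warp_depth_row(depth_row, f, baseline):
    """Forward-warp one row of a depth map to a virtual view (rectified case).

    depth_row holds positive depths Z; pixel x moves to x - round(f * B / Z).
    Returns the synthesized row, with None where no pixel landed
    (disocclusion).
    """
    w = len(depth_row)
    out = [None] * w
    for x, z in enumerate(depth_row):
        xv = x - round(f * baseline / z)
        if 0 <= xv < w:
            # Occlusion: several pixels reach the same position,
            # only the nearest one (smallest depth) is kept.
            if out[xv] is None or z < out[xv]:
                out[xv] = z
    return out
```

For example, a foreground object at depth 5 in front of a background at depth 10 shifts further than the background, covering part of it and uncovering a hole on its other side.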


View synthesis
 Mapping of texture values of the reference image using the synthesized depth
 Depth knowledge makes it possible to resolve some occlusion conflicts
[Figures: synthesis from the left; synthesis from the right]


Contour correction
 False contours may appear in the synthesized view
 This can be mitigated if filled regions are artificially increased by one pixel


View merging
 Left and right images are merged, averaging pixels where both views are
available
 As a consequence, only small holes remain in the merged image
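The merging rule can be sketched on one row of pixels as follows. This is a plain-average illustration of the rule stated above, with `None` used as a hypothetical hole marker; actual software may weight the two views differently.

```python
def merge_views(from_left, from_right):
    """Merge two synthesized rows of the same virtual view.

    None marks a hole. Pixels available in both rows are averaged,
    pixels available in one row are copied, the others remain holes.
    """
    merged = []
    for a, b in zip(from_left, from_right):
        if a is not None and b is not None:
            merged.append((a + b) / 2)
        elif a is not None:
            merged.append(a)
        else:
            merged.append(b)  # may still be None: remaining hole
    return merged
```

Since the holes of the two synthesized views usually lie on opposite sides of foreground objects, few positions are empty in both, which is why only small holes remain.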



Holes filling: inpainting




Holes filling: inpainting



Encoding holoscopic video
 The holoscopic videos (HV) have a lot of redundancy…
 … but also a large high-frequency content (grid)
• Grid removal?

 Benchmark: “2D coding”, i.e. plain HEVC on the HV



Encoding holoscopic video
 Ad hoc techniques
• Self-similarities: intra-image motion estimation
• View extraction + multi-view coding
• Scalable coding
[Block diagram: view extraction, 2D encoder, inter-view prediction, holoscopic prediction, residual encoders, multiplexer]


Summary
 Introduction
 3D geometry
 3D services
• FTV and IMVS

 Conclusions


FTV System
[Block diagram: video capture, preprocessing, encoding, decoding, view generation, 2D/3D display]


FTV Display
[Block diagram: FTV data, viewpoint control, view synthesis, 3D display, 2D/3D display]


FTV interactive streaming
 FTV data can be very heavy, even after compression
 In the interactive framework, only 2 views + 2 depths could be sent
 The current view is synthesized using the encoded views
 Problem: view switching (among encoded views) affects temporal prediction


FTV interactive streaming
 Multiview video for free viewpoint TV services
 Several views are available: the user interactively switches from one view to another
 View pattern unknown at encoding time


Interactive Multiview Video Streaming
 All frames are Intra coded
 Each image is coded and stored only once
 Large bandwidth required
 Relatively low server space required
[Figure: frames arranged by view (Views) and time (Time)]


 P-frames are used: all possible frame dependencies are coded
 Each image is coded many times
 Smallest bandwidth required
 Very large server space required
[Figure: frames arranged by view (Views) and time (Time)]


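Distributed video coding: principle
[Block diagram. Encoder: Wyner-Ziv frames (WZ) go through a quantizer (Q) and a turbo encoder with buffer (Slepian-Wolf coder); key frames (KF) go through an intra coder. Decoder: the intra decoder gives the decoded KFs; image interpolation produces the side information (SI); the turbo decoder and the minimum-distortion reconstruction (Q’) give the decoded WZFs.]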


 WZ-frames are used: only parity bits are coded
 Each image is coded and stored only once
 Trade-off between server space and bandwidth
[Figure: frames arranged by view (Views) and time (Time)]


Application to IMVS:
[Chart: bandwidth versus server space. Labels: only Intra; WZ coding; predictive coding (each image coded many times); ideal case (path known at encoding time); operation region]


Conclusions
 3D video has periodically experienced waves of excitement and disappointment
 A main problem is the visual discomfort related to the stereoscopic representation
 The emerging formats may solve this problem
• Super-multiview, holoscopy
 Many problems are yet to be solved
• Effective compression
• Quality evaluation (objective and subjective metrics)
• Transmission
 Is holography the future?


Conclusions
Contact: cagnazzo@telecom-paristech.fr
Bibliography:
[1] M. Tanimoto, Overview of free viewpoint television, Signal Processing: Image Communication, 21(6), pp. 454-461, July 2006
[2] A. Smolic and P. Kauff, Interactive 3-D video representation and coding technologies, Proc. IEEE, 93(1), pp. 98-110, Jan. 2005
[3] G. Cheung, A. Ortega and N. Cheung, Interactive Streaming of Stored Multiview Video Using Redundant Frame Structures, IEEE Transactions on Image Processing, 20(3), pp. 744-761, March 2011
[4] F. Dufaux, B. Pesquet-Popescu and M. Cagnazzo (eds.), Emerging Technologies for 3D Video, Wiley, 2013
[5] O. Faugeras, Three-dimensional computer vision: a geometric viewpoint, MIT Press, Cambridge, MA, 1994
[6] C. Fehn, Depth-Image-Based Rendering (DIBR), Compression and Transmission for a New Approach on 3D-TV, SPIE Electronic Imaging, 2004
[7] M. Bertalmio, G. Sapiro, C. Ballester and V. Caselles, Image inpainting, Computer Graphics, SIGGRAPH 2000, pp. 417-424, July 2000


THANKS FOR YOUR ATTENTION!

