Published in MMSys'17 in June 2017.
360° videos and Head-Mounted Displays (HMDs) are getting increasingly popular. However, streaming 360° videos to HMDs is challenging. This is because only video content in viewers' Field-of-Views (FoVs) is rendered, and thus sending complete 360° videos wastes resources, including network bandwidth, storage space, and processing power. Optimizing 360° video streaming to HMDs is, however, highly data and viewer dependent, and thus dictates real datasets. However, to the best of our knowledge, such datasets are not available in the literature. In this paper, we present our datasets of both content data (such as image saliency maps and motion maps derived from 360° videos) and sensor data (such as viewer head positions and orientations derived from HMD sensors). We put extra effort into aligning the content and sensor data using the timestamps in the raw log files.
The resulting datasets can be used by researchers, engineers, and hobbyists to either optimize existing 360° video streaming applications (like rate-distortion optimization) or develop novel applications (like crowd-driven camera movements). We believe that our dataset will stimulate more research activities along this exciting new research direction.
This document discusses optimizing 360-degree video streaming to head-mounted virtual reality. It covers challenges like existing codecs only supporting 2D videos and 360 videos having wider views than conventional videos. Approaches proposed include fixation prediction to avoid streaming unwatched parts, QoE modeling designed for 360 videos to improve user experience, and an adaptive streaming platform to select and transmit tiles based on fixation prediction while allocating bitrates based on the QoE model. Part I discusses fixation prediction, including using neural networks trained on viewing features. Part II covers QoE modeling, noting limitations of existing metrics and factors that affect QoE like content and bitrates, and constructs a logarithmic linear QoE model. Part III outlines an adaptive streaming platform that selects and transmits tiles based on the fixation prediction and allocates bitrates based on the QoE model.
Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality, by Wen-Chih Lo
Published in NOSSDAV'17 in June 2017.
We study the problem of predicting the Field-of-Views (FoVs) of viewers watching 360° videos using commodity Head-Mounted Displays (HMDs). Existing solutions either use the viewer's current orientation to approximate the future FoVs, or extrapolate future FoVs using historical orientations and dead-reckoning algorithms. In this paper, we develop fixation prediction networks that concurrently leverage sensor- and content-related features to predict future viewer fixations, which is quite different from the solutions in the literature. The sensor-related features include HMD orientations, while the content-related features include image saliency maps and motion maps. We build a testbed for streaming 360° videos to HMDs, and recruit twenty-five viewers to watch ten 360° videos. We then train and validate two design alternatives of our proposed networks, which allows us to identify the better-performing design with the optimal parameter settings.
Trace-driven simulation results show the merits of our proposed fixation prediction networks compared to the existing solutions, including: (i) lower consumed bandwidth, (ii) shorter initial buffering time, and (iii) short running time.
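To make the idea concrete, here is a minimal, hypothetical sketch (in PyTorch) of how sensor- and content-related features could be fused for fixation prediction. It is not the authors' actual network; all layer sizes, feature choices, and names are illustrative assumptions.

# Hypothetical sketch of a fixation-prediction network combining
# sensor features (HMD orientation) with content features (pooled
# saliency/motion statistics). Not the paper's actual architecture.
import torch
import torch.nn as nn

class FixationNet(nn.Module):
    def __init__(self, feat_dim=5, hidden=128, n_tiles=200):
        super().__init__()
        # feat_dim: e.g., yaw, pitch, roll + mean saliency + mean motion
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_tiles)

    def forward(self, x):
        # x: (batch, time, feat_dim) sequence of past observations
        out, _ = self.lstm(x)
        # Predict per-tile viewing probability at the last time step.
        return torch.sigmoid(self.head(out[:, -1]))

# Example: 1-second history at 30 fps, predicting the next fixation.
model = FixationNet()
history = torch.randn(4, 30, 5)    # batch of 4 viewers
tile_probs = model(history)        # (4, 200) per-tile probabilities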
Performance Measurements of 360° Video Streaming to Head-Mounted Displays Ove..., by Wen-Chih Lo
Published in APNOMS'17 in October 2017.
Watching 360° videos using a Head-Mounted Display (HMD) allows users to see only a part of the whole 360° video. With this feature, tiled videos become a potential solution for aggressively reducing the required bandwidth for 360° video streaming, turning it into a reality in cellular networks. In this paper, we design several experiments for quantifying the performance of tile-based 360° video streaming over a real cellular network on our campus. In particular, we empirically investigate the impacts of tile streaming over 4G networks, such as coding efficiency, bandwidth saving, and scalability.
Our experiments lead to interesting findings, for example: (i) streaming only the tiles viewed by the viewer achieves bitrate reductions of up to 80%, and (ii) the coding efficiency of 3x3 tiled videos may be higher than that of non-tiled videos at higher bitrates.
We believe this work will stimulate more studies in the emerging area of mobile AR/VR (Augmented Reality and Virtual Reality) over 4G networks.
The media landscape has changed significantly over the last few years, driven by new content formats, new service offerings, additional consumption devices, and new monetization models. Think of Netflix, DAZN, Mediatheks, mobile devices, interactive content, smart TVs, Virtual and Augmented Reality, and so on. Many of these efforts have been realized with only limited usage of standards, but are standards irrelevant? Secondly, more and more services are enabled by the latest mobile compute platforms, enabling new experiences. This presentation will provide an overview of some of these trends and will motivate the development of global interop standards. Specific aspects will include the move of linear TV services to the Internet (both mobile and fixed) as well as recent advances in Extended Reality and immersive media trends.
For the full video of this presentation, please visit:
https://www.embedded-vision.com/platinum-members/embedded-vision-alliance/embedded-vision-training/videos/pages/dec-2017-alliance-vitf-khronos
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Neil Trevett, President of the Khronos Group, delivers the presentation "Update on Khronos Standards for Vision and Machine Learning" at the Embedded Vision Alliance's December 2017 Vision Industry and Technology Forum. Trevett shares updates on recent, current and planned Khronos standardization activities aimed at streamlining the deployment of embedded vision and AI.
The document introduces CoaXPress, a high-speed video interface standard for computer vision applications. Key points include:
- CoaXPress uses a single coaxial cable to transmit video data at rates up to 6.25 Gbps, camera control signals, power, and real-time triggers. This simplifies cabling compared to alternatives.
- It supports cable lengths up to 130m, longer than competing standards, and aggregates multiple cables for even higher bandwidth up to 50 Gbps.
- Euresys offers CoaXPress frame grabbers with up to four 6.25 Gbps connections and features like I/O lines, camera control, and event logging software.
The document discusses the concept of the Tactile Internet with Human-in-the-Loop. It aims to democratize access to skills and expertise for people of all backgrounds and abilities. This goes beyond the current Internet's goal of providing access to information regardless of location or time. The document outlines a vision for two-way skills transfer between humans and machines using multimodal feedback over 5G networks. It discusses challenges like differing neural time delays for multisensory perception and individual differences in processing that affect perception and action. The Center for Tactile Internet's research agenda involves understanding multisensory goal-directed processing neurocognitively, modeling perception and action, and drawing on expertise in related fields to advance human-technology interaction.
Tile-based Streaming of 8K Omnidirectional Video: Subjective and Objective Qo..., by Alpen-Adria-Universität
Omnidirectional video (ODV) streaming applications are becoming increasingly popular. They enable a highly immersive experience as the user can freely choose her/his field of view within the 360-degree environment. Current deployments are fairly simple but viewport-agnostic, which inevitably results in high storage/bandwidth requirements and low Quality of Experience (QoE). A promising solution is referred to as tile-based streaming, which allows higher quality within the user's viewport while quality outside the user's viewport can be lower. However, empirical QoE assessment studies in this domain are still rare. Thus, this paper investigates the impact of different tile-based streaming approaches and configurations on the QoE of ODV. We present the results of a lab-based subjective evaluation in which participants evaluated 8K omnidirectional video QoE as influenced by different (i) tile-based streaming approaches (full vs. partial delivery), (ii) content types (static vs. moving camera), and (iii) tile encoding quality levels determined by different quantization parameters. Our experimental setup is characterized by high reproducibility since relevant media delivery aspects (including the user's head movements and dynamic tile quality adaptation) are already rendered into the respective processed video sequences. Additionally, we performed a complementary objective evaluation of the different test sequences focusing on bandwidth efficiency and objective quality metrics. The results, presented and discussed in detail in this paper, confirm that tile-based streaming of ODV improves visual quality while reducing bandwidth requirements.
A major challenge for the next decade is to design virtual and augmented reality systems (VR at large) for real-world use cases such as healthcare, entertainment, e-education, and high-risk missions. This requires VR systems to operate at scale, in a personalized manner, remaining bandwidth-tolerant whilst meeting quality and latency criteria. One key challenge to reach this goal is to fully understand and anticipate user behaviours in these mixed reality settings.
This can be accomplished only by a fundamental revolution of the network and VR systems, which have to put the interactive user at the heart of the system rather than at the end of the chain. With this goal in mind, in this talk, we describe our current research on user-centric systems. First, we describe our viewport-based streaming strategies for 360-degree video. Then, we present in more detail our research on users' behaviour analysis when users interact with 360-degree content. Specifically, we describe a set of metrics that allows us to identify key behaviours among users and quantify the level of similarity of these behaviours, and we present our clique-based clustering methodology, information theory, and trajectory-based in-depth analysis. Finally, we conclude with an overview of the extension of this work to navigation within volumetric video sequences.
Internet data volume almost doubles every year. Multimedia communication needs less storage space and fast transmission, so the large volume of video data has become the reason for video compression. The aim of this paper is to achieve temporal compression for three-dimensional (3D) videos using motion estimation-compensation and wavelets. Instead of performing a two-dimensional (2D) motion search, as is common in conventional video codecs, the use of a 3D motion search has been proposed, which is able to better exploit the temporal correlations of 3D content. This leads to more accurate motion prediction and a smaller residual. The discrete wavelet transform (DWT) compression scheme has been added for a better compression ratio. The DWT has a high energy-compaction property and has thus greatly impacted the field of compression. The quality parameters peak signal-to-noise ratio (PSNR) and mean square error (MSE) have been calculated. The simulation results show that the proposed work improves the PSNR over existing work.
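For reference, the PSNR and MSE quality parameters mentioned above can be computed as in the following generic NumPy sketch; this is not the paper's code, and the 8-bit peak value of 255 is an assumption.

# Generic MSE / PSNR computation between a reference frame and a
# reconstructed (compressed) frame; not tied to any specific codec.
import numpy as np

def mse(ref, rec):
    ref = ref.astype(np.float64)
    rec = rec.astype(np.float64)
    return np.mean((ref - rec) ** 2)

def psnr(ref, rec, max_val=255.0):
    m = mse(ref, rec)
    if m == 0:
        return float("inf")    # identical frames
    return 10.0 * np.log10(max_val ** 2 / m)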
Session 10 in module 3 from the Master in Computer Vision by UPC, UAB, UOC & UPF.
This lecture provides an overview of state of the art applications of convolutional neural networks to the problems in video processing: semantic recognition, optical flow estimation and object tracking.
"Future Internet enablers for VGI applications" presentation from ENVIROINFO 2013, Sept. 02-04 2013
Shows the ENVIROFI results relevant to crowdsourcing and crowdtasking.
This document provides a survey of adaptive 360-degree video streaming solutions, challenges, and opportunities. It discusses current solutions for streaming 360-degree video over dynamic networks in a viewport-independent, viewport-dependent, and tile-based manner. It also analyzes research challenges for on-demand and live 360-degree video streaming and discusses standardization efforts to ensure interoperability and deployment at scale. The document concludes by outlining future research opportunities enabled by 360-degree video streaming.
1. The document discusses analyzing videos with convolutional neural networks (CNNs). It covers techniques like video recognition using CNNs to classify video content at the clip level.
2. DeepVideo and C3D are discussed as approaches for video recognition using CNNs. DeepVideo employs 2D CNNs on multiple video frames while C3D uses 3D CNNs to learn spatiotemporal features directly from video data.
3. Optical flow estimation techniques like DeepFlow are also covered, which uses a deep matching approach to compute dense correspondences between video frames for large displacements.
This document summarizes recent trends in the spatial information field and their implications. It discusses the convergence of ICT and spatial IT, the evolution of location technology towards real-time dynamic mapping of both indoor and outdoor spaces, and the rise of digital twins, big data, AI and IoT. It also introduces mago3D as a platform that can visualize massive 3D models and seamlessly integrate BIM and 3D GIS for applications like facility management, cultural heritage preservation and live drone mapping.
Secure IoT Systems Monitor Framework using Probabilistic Image Encryption, by IJAEMSJORNAL
In recent years, the modeling of human behaviors and patterns of activity for recognition or detection of special events has attracted considerable research interest. Various methods abound for building intelligent vision systems aimed at understanding the scene and making correct semantic inferences from the observed dynamics of moving targets. Many systems include detection, storage of video information, and human-computer interfaces. Here we present not only an update that expands previous similar surveys but also an emphasis on contextual abnormal detection of human activity, especially in video surveillance applications. The main purpose of this survey is to identify existing methods extensively, and to characterize the literature in a manner that brings key challenges to attention.
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.
The ubiquitous and connected nature of camera-loaded mobile devices has greatly elevated the value and importance of the visual information they capture. Today, videos uploaded from the camera phones of unknown users are aired on news networks, and banking customers expect to be able to deposit checks using mobile devices. In this paper we present Movee, a system that addresses the fundamental question of whether the visual stream uploaded by a user has been captured live on a mobile device and has not been tampered with by an adversary. Movee leverages the mobile device's motion sensors and the inherent user movements during the shooting of the video. It exploits the observation that the movement of the scene recorded on the video stream should be related to the movement of the device simultaneously captured by the accelerometer. We model the distribution of the correlation of temporal noise residue in a forged video as a Gaussian mixture model (GMM), propose a two-step scheme to estimate the model parameters, and use a Bayesian classifier to find the optimal threshold value based on the estimated parameters. Cyrus Deboo | Shubham Kshatriya | Rajat Bhat, "Video Liveness Verification," published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2, Issue-3, April 2018. URL: http://www.ijtsrd.com/papers/ijtsrd12772.pdf http://www.ijtsrd.com/computer-science/other/12772/video-liveness-verification/cyrus-deboo
This document provides a user guide for open-source geospatial tools to extract building footprints and define homogeneous zones from imagery for the purpose of developing exposure data. It describes software tools like Quantum GIS and GRASS for pre-processing imagery, as well as algorithms for automatically extracting building footprints. Protocols are provided for modifying GIS data and manually delineating homogeneous land use zones using QGIS or Google Earth. Sample data is included to allow users to test the extraction methods.
Call for papers - 9th International Conference on Signal, Image Processing an..., by sipij
9th International Conference on Signal, Image Processing and Pattern Recognition (SIPP 2021) is a forum for presenting new advances and research results in the fields of Signal and Image Processing.
Technological advances in dental implant surgery, by Periowiki.com
This document discusses recent technological advances in dental implant surgery, including computer-aided design/computer-aided manufacturing (CAD/CAM) technology and computer-guided implant surgery techniques. It describes computerized tomography (CT) imaging and how CT data can be used for virtual surgical planning and fabrication of surgical guides. The document compares computer-guided implant surgery (CGIS), which uses static surgical guides, to computer-navigated implant surgery (CNIS), which allows for intraoperative modification of the surgical plan. Both techniques aim to increase the accuracy and predictability of dental implant placement.
This document discusses augmented reality and its applications. It begins with an abstract and introduction on mediated reality and augmented reality. It then provides details on the concept of mediated reality including Milgram's reality-virtuality continuum. Next, it discusses the working of mediated reality including registration, tracking, and displays. It provides examples of applications of mediated reality in various fields such as military, industries, automotive, and education. In the conclusion, it states that as technology advances, the usage of augmented reality will increase with other technologies.
This document provides an overview of a course on augmented reality (AR). The course will cover introductions to AR technology and interaction techniques, AR authoring tools, and research directions in AR. Students will learn about AR and complete a simple AR project. They will be assessed through a research project, assignments, and a final exam. The document outlines the weekly topics and provides background on AR applications, history, and the importance of user experience design.
4. A 360° video is a video in which every direction is recorded at the same time.
Watching it on planar monitors is a passive experience.
Head-Mounted Displays (HMDs) offer more immersive experiences.
5. VR/AR delivered a total of $3.9 billion in revenue in 2016, including $2.7 billion from VR and $1.2 billion from AR [1].
[1] After mixed year, mobile AR to drive $108 billion VR/AR market by 2021, Digi-Capital, Jan 2017. https://goo.gl/Blcv2f
6. Challenges:
Latency
Extremely high resolution
Distortion while stitching or projecting videos/images
Compressing a tremendous amount of video data in real time
Reducing the computational cost using a tile-based viewing method
8. We recruit 50 subjects; each of them is asked to watch ten 360° videos.
52% are male.
Most of them are in their early twenties.
56% of them are watching 360° videos with an HMD for the first time.
[Pie chart: "How often do you watch 360° video using HMDs?" Never 56%, Seldom 42%, Often 2%]
11. We collect ten 360° videos from YouTube.
Each is 1 minute long, 4K resolution, and 30 fps.
[Dataset structure diagram: Dataset → Content Trace (Video Trace, Saliency Map, Motion Map) and Sensor Trace (Raw, Orientation, Tile)]
12. Saliency maps are generated with a Convolutional Neural Network (CNN),
based on a pre-trained VGG-16 network [1].
Each saliency map is a gray-scale image (values from 0 to 255).
[1] M. Cornia, L. Baraldi, G. Serra, and R. Cucchiara. 2016. A Deep Multi-Level Network for Saliency Prediction. In International Conference on Pattern Recognition (ICPR'16).
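As a hedged illustration only: the sketch below produces a comparable 0-255 gray-scale saliency map using OpenCV's spectral-residual saliency (from opencv-contrib-python) as a lightweight stand-in for the VGG-16-based network of [1]; it will not reproduce the dataset's actual maps, and the file names are assumptions.

# Illustrative stand-in: OpenCV spectral-residual saliency instead of
# the VGG-16-based network; output is a 0-255 gray-scale map.
import cv2
import numpy as np

def saliency_map(frame_bgr):
    sal = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, smap = sal.computeSaliency(frame_bgr)   # smap values in [0, 1]
    assert ok, "saliency computation failed"
    return (smap * 255).astype(np.uint8)

frame = cv2.imread("frame_0001.png")            # one equirectangular frame
cv2.imwrite("saliency_0001.png", saliency_map(frame))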
13. Motion maps capture the relative motions between objects in the video and the viewer.
They are computed with Lucas-Kanade optical flow [2].
Each motion map is a black-and-white image (values 0 or 1).
[2] B. Lucas and T. Kanade. 1981. An iterative image registration technique with an application to stereo vision. In Proc. of the International Joint Conference on Artificial Intelligence (IJCAI'81).
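A minimal sketch of how a binary motion map might be derived with Lucas-Kanade optical flow in OpenCV; the dataset's actual pipeline may differ, and the file names and the 1-pixel motion threshold are assumptions.

# Sketch: binary motion map from sparse Lucas-Kanade optical flow.
# Pixels near feature points that moved more than a threshold are set
# to 1, everything else stays 0.
import cv2
import numpy as np

prev = cv2.imread("frame_0001.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_0002.png", cv2.IMREAD_GRAYSCALE)

pts = cv2.goodFeaturesToTrack(prev, maxCorners=500,
                              qualityLevel=0.01, minDistance=7)
nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr, pts, None)

motion = np.zeros_like(prev)
for p0, p1, ok in zip(pts.reshape(-1, 2), nxt.reshape(-1, 2),
                      status.ravel()):
    if ok and np.linalg.norm(p1 - p0) > 1.0:    # moved more than 1 pixel
        cv2.circle(motion, (int(p1[0]), int(p1[1])), 5, 1, -1)

cv2.imwrite("motion_0002.png", motion * 255)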
14. We collect sensor data from HMDs while viewers are watching 360° videos.
Frame Capturer: GamingAnywhere [1] records video frames with timestamps at 30 Hz.
Sensor Logger: OpenTrack [2] records sensor data with timestamps at 250 Hz from the Oculus DK2.
[1] http://gaminganywhere.org/
[2] https://github.com/opentrack/opentrack
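A minimal sketch of the kind of alignment this setup enables: match each 30 Hz frame timestamp to the nearest 250 Hz sensor sample. The array layout is an assumption about the traces, not the dataset's documented format.

# Sketch: for each 30 Hz video-frame timestamp, pick the 250 Hz sensor
# sample whose timestamp is closest. Timestamps are assumed to be epoch
# seconds in sorted 1-D NumPy arrays.
import numpy as np

def align(sensor_ts, frame_ts):
    idx = np.searchsorted(sensor_ts, frame_ts)
    idx = np.clip(idx, 1, len(sensor_ts) - 1)
    # Choose the nearer of the two neighboring sensor samples.
    left_closer = (frame_ts - sensor_ts[idx - 1]) < (sensor_ts[idx] - frame_ts)
    return np.where(left_closer, idx - 1, idx)

sensor_ts = np.arange(0.0, 60.0, 1 / 250)   # 250 Hz sensor clock
frame_ts = np.arange(0.0, 60.0, 1 / 30)     # 30 Hz frame clock
matches = align(sensor_ts, frame_ts)        # sensor index per frame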
16. Raw sensor data from HMDs contains seven fields:
Timestamp (epoch time)
Position (x, y, and z)
Orientation (yaw, pitch, and roll)
20. We align the sensor data and video frames using their timestamps.
Different viewers introduce different amounts of bias.
We design a 35-sec calibration procedure to compensate.
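A hedged sketch of the kind of per-viewer bias correction such a calibration enables: estimate the viewer's mean orientation offset over the calibration window (assumed here to show a front-facing target at yaw = pitch = 0) and subtract it from the rest of the trace. The dataset's actual calibration procedure may differ.

# Sketch: per-viewer orientation bias correction. We assume the first
# 35 s (the calibration video) shows a front-facing target, so the mean
# orientation over that window estimates the viewer's bias.
import numpy as np

def calibrate(ts, yaw, pitch, roll, calib_end=35.0):
    calib = ts < calib_end                      # calibration window
    bias = np.array([yaw[calib].mean(),
                     pitch[calib].mean(),
                     roll[calib].mean()])
    # Subtract the bias and wrap yaw back into [-180, 180).
    yaw_c = (yaw - bias[0] + 180.0) % 360.0 - 180.0
    return yaw_c, pitch - bias[1], roll - bias[2]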
25. The Field-of-View (FoV) of the HMD is a 100° x 100° circle.
We divide each frame into 192x192-pixel tiles.
We number the tiles from upper-left to lower-right (from 0 to 199).
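A simplified sketch of mapping a viewing orientation to viewed tile numbers, assuming a 3840x1920 equirectangular frame (20x10 tiles of 192x192 pixels, numbered 0-199) and treating the 100°x100° FoV as an axis-aligned box in the equirectangular domain; it ignores distortion near the poles, so it only approximates the dataset's own tile processing.

# Simplified sketch: which 192x192 tiles of a 3840x1920 equirectangular
# frame (20 columns x 10 rows, numbered 0..199 from upper-left) fall in
# a 100 x 100 degree FoV centered at (yaw, pitch)?
COLS, ROWS, FOV = 20, 10, 100.0

def col_of(yaw_deg):
    # Yaw 0 maps to the horizontal center of the frame.
    return int((((yaw_deg + 180.0) % 360.0) / 360.0) * COLS) % COLS

def viewed_tiles(yaw_deg, pitch_deg):
    half = FOV / 2.0
    top = min(90.0, pitch_deg + half)           # pitch +90 = top of frame
    bot = max(-90.0, pitch_deg - half)
    row_lo = min(ROWS - 1, int((90.0 - top) / 180.0 * ROWS))
    row_hi = min(ROWS - 1, int((90.0 - bot) / 180.0 * ROWS))
    col_lo, col_hi = col_of(yaw_deg - half), col_of(yaw_deg + half)
    tiles, c = set(), col_lo
    while True:                                  # walk columns, wrapping at 180°
        for r in range(row_lo, row_hi + 1):
            tiles.add(r * COLS + c)
        if c == col_hi:
            break
        c = (c + 1) % COLS
    return sorted(tiles)

print(viewed_tiles(0.0, 0.0))                    # tiles around frame center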
28. Comparison of published 360° viewing datasets along five dimensions: head movement, eye movement, content-driven data, open-source software, and application-driven evaluation.
Datasets compared: Lo et al. [1], Rai et al. [2], Corbillon et al. [3], Wu et al. [4].
[1] W. Lo, C. Fan, J. Lee, C. Huang, K. Chen, and C. Hsu. "360° Video Viewing Dataset in Head-Mounted Virtual Reality." In Proc. of the 8th ACM on Multimedia Systems Conference (MMSys'17). 2017.
[2] Y. Rai, J. Gutiérrez, and P. Callet. "A Dataset of Head and Eye Movements for 360 Degree Images." In Proc. of the 8th ACM on Multimedia Systems Conference (MMSys'17). 2017.
[3] X. Corbillon, F. Simone, and G. Simon. "360-Degree Video Head Movement Dataset." In Proc. of the 8th ACM on Multimedia Systems Conference (MMSys'17). 2017.
[4] C. Wu, Z. Tan, Z. Wang, and S. Yang. "A Dataset for Exploring User Behaviors in VR Spherical Video Streaming." In Proc. of the 8th ACM on Multimedia Systems Conference (MMSys'17). 2017.
30. NOSSDAV'17
Tomorrow (6/23), 2:10pm - 3:10pm, at the 2nd Conference Room:
C. Fan, J. Lee, W. Lo, C. Huang, K. Chen, and C. Hsu, "Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality"
Hello everyone, I am Wen-Chih Lo from NTHU.
Today I am going to present a 360-degree video dataset.
This is joint work with my collaborators from NTHU, Prof. Huang from NCTU, and Prof. Chen from Academia Sinica.
In this talk, I will briefly describe our dataset structure and the two major components in our 360-degree video viewing dataset.
Besides, I will also show the basic statistics of our dataset and a sample application using our dataset.
A 360-degree video is known as spherical video or immersive video.
Watching these videos using traditional planar monitors gives viewers a passive experience.
Nowadays, HMDs are widely available.
Using HMDs gives viewers a more immersive experience.
In a Digi-Capital report, they say the AR/VR market will keep growing.
In the future, watching 360-degree video using HMDs might be a common experience.
However, it is challenging to deliver such a high resolution video over today’s network.
For example... bla bla bla.
More and more researchers and engineers are jumping into this topic.
At that time, there was no public standard dataset that could be used to evaluate system performance and develop new algorithms.
But I guess we don't need to worry about this problem anymore; several datasets have been published here.
At the beginning, I want to show you some basic statistics of our dataset.
We recruit 50 subjects in our dataset.
(description)
Each of them is asked to watch ten videos.
During the experiments, all of them are asked to stand when watching videos.
Our dataset is unique, because it contains not only the sensor-driven data, but also the content-driven data.
First, I want to show you the content-driven data, which we call the content trace.
The content trace captures the impact of the video content on viewers' attention.
(click)
For example, the video traces, the saliency maps, and the motion maps of the 360-degree videos.
I will explain each of them later.
Here, I want to show you the video traces we collected from YouTube.
We divide the 10 videos into three groups:
fast-paced Natural Image (NI), slow-paced NI, and fast-paced Computer-Generated (CG) videos.
All of them are 1-minute, 4K-resolution videos.
The other one is saliency map.
We develop a deep neural network based on pre-trained VGG networks.
We use this network to produce the saliency map.
(demo video)
A saliency map indicates the attraction levels within the video,
i.e., which parts of the video attract viewers' attention.
As many previous presenters mentioned, the saliency map helps decide
which FoV we should stream to meet the viewer's needs in the next moment.
The last one is motion map.
We analyze the optical flow of the video frames to produce the motion map using the OpenCV library.
(demo video)
The motion map indicates the relative motions between the objects in the video and the viewers.
Second, I want to show you the sensor-driven data.
It is the viewing orientation data from HMDs, recorded while a viewer watches videos.
(click) In our testbed, we render the videos to the Oculus DK2 and Oculus Video using the Oculus SDK.
(click) We use GamingAnywhere as our frame capturer. It records the video from HMDs at a frame capture rate of 30 Hz.
(click) We modified and enhanced OpenTrack to be our sensor logger. OpenTrack tracks the user's head movements.
It captures and timestamps the orientation data from HMDs at a logging rate of 250 Hz.
(click) The sensor trace contains 3 different data: the raw data from HMDs, the processed viewing orientation data, and the processed tile data.
Here, I want to show you the raw sensor data from HMDs.
There are 7 fields in the raw sensor data,
including timestamp (epoch time), position (x, y, and z), and orientation (yaw, pitch, and roll).
To simplify the usage of our dataset, we align the timestamps in the raw sensor data from the HMD with the video frames captured by GamingAnywhere.
However, in our pilot experiments, we find that different viewers tend to introduce different amounts of bias.
We therefore insert a 35-sec calibration video before each viewer starts watching 360° videos.
This is not only a calibration procedure; it also helps viewers become familiar with how to watch 360-degree videos.
There are 10 fields in the orientation data,
including index, position (x, y, and z), raw orientation (yaw, pitch, and roll), and the calibrated orientation data.
We measure that the FoV of the HMD is about a 100°x100° circle.
Therefore, we process the viewing orientation data and generate viewed-tile data to further simplify the usage of our dataset.
There are 2 fields in the tile data: index and the tiles watched by the viewer.
Now, several datasets are available.
Here is a table summarizing different dataset efforts.
Each of them is unique and contains different features.
For example, if you need to take content-driven data into consideration, you can choose the first dataset.
If you need to take eye movement into consideration, you can choose the second one.
I need to emphasize that this is preliminary, and if you think differently, please don't hesitate to let us know.
Our dataset can be used in various 360-degree video applications. (click) For example, when a viewer with an HMD rotates his/her head to watch some new tiles that have not been requested, (click)
it may take several seconds to deliver these new tiles.
To reduce the buffering time, (click) predicting which tiles a viewer of a tile-based 360-degree video will watch is important.
Our dataset can be used for developing and evaluating new algorithms for viewed-tile prediction.
Besides, our dataset also has video content with diverse characteristics; for example, it can also be used for bitrate allocation for 360° video streaming to HMDs.
Here, I want to give a quick teaser of our prediction work.
Tomorrow we will present our fixation prediction network using our dataset.
If you are interested in this topic or how to use our dataset,
Please feel free to join us tomorrow afternoon at the 2nd conference room.
See you tomorrow.
That is all. Thank you. I am ready to take questions.
I am not quite sure. I guess… blablabla. Here is what I propose to do for deeper investigation, and I will let you know my findings.
I hope that we can stay in touch with each other.
1. We tiled the video in the equirectangular domain.
2. We provide the original video sequences in our dataset, and they are protected with a password to avoid copyright issues.
3. It is our individual judgment.
4. Saliency maps are computed on equirectangular images using a classical image-based saliency mapping approach. There is no research that indicates with sufficient confidence that this can be done. If we tiled the video into multi-level sphere windows, I guess it would reduce the influence of distortion.
5. We could use the cube map projection and process the 6 sides of the cube separately using the same saliency mapping software, but it seems to me that it won't achieve the same result.