The consumer market for High Dynamic Range (HDR) displays and cameras is growing rapidly with the advent of 3D video and display technologies. Standardization bodies such as the Moving Picture Experts Group and the International Telecommunication Union are driving standardization of the latest display advancements. The lack of sufficient experimental data is a major bottleneck for preliminary research efforts in 3D HDR video technology. We propose to make publicly available to the research community a diversified database of stereoscopic 3D HDR images and videos, captured within the beautiful campus of the Indian Institute of Technology Madras, which is blessed with rich flora and fauna and is home to several rare wildlife species. Further, we describe the procedure for capturing, aligning, calibrating, and post-processing the 3D images and videos. We also discuss research opportunities, challenges, and potential use cases of stereo 3D HDR applications and depth-from-HDR aspects.
A Rich Stereoscopic 3D High Dynamic Range Image & Video Database of Natural Scenes
1. A Rich Stereoscopic 3D High Dynamic Range Image & Video Database of Natural Scenes
Aditya Wadaskar, Mansi Sharma, Rohan Lal
Department of Electrical Engineering
Indian Institute of Technology Madras, India
Wednesday, 11 December 2019
2. 3D HDR Image and Video database
3. Need for such a database
• Scarcity of publicly available 3D HDR image & video datasets [1].
• Absence of 3D HDR datasets featuring natural scenes [2].
• Facilitate & expedite R&D in HDR computational photography, HDR video
compression, HDR quality assessment, etc.
• Extend applications for 3D Displays & VR/AR HMDs
4. • Synchronized dual sensors for capturing left and right views.
• 4 MP resolution per sensor (2 µm pixel size)
• Horizontal separation (baseline) between the sensors: 12 cm
• Depth range of the camera: 0.5 m to 20 m
ZED Stereoscopic Camera
5. ZED Stereoscopic Camera
• The ZED SDK stitches the left and right views into one contiguous side-by-side frame.
• Camera calibration is done with the SDK prior to capturing.
• No external time-synchronization is needed.
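Since each capture arrives as one contiguous side-by-side frame, downstream code first has to separate the two views. A minimal sketch, assuming frames are loaded as NumPy arrays; the function name `split_sbs_frame` is illustrative, not part of the ZED SDK:

```python
import numpy as np

def split_sbs_frame(frame):
    """Split a side-by-side stereo frame into (left, right) views.

    Assumes the left view occupies the left half of the stitched
    frame, as produced by the ZED SDK.
    """
    h, w = frame.shape[:2]
    assert w % 2 == 0, "side-by-side frame width must be even"
    half = w // 2
    return frame[:, :half], frame[:, half:]

# Example: a 1242 x 4416 side-by-side frame yields two 1242 x 2208 views.
sbs = np.zeros((1242, 4416, 3), dtype=np.uint8)
left, right = split_sbs_frame(sbs)
```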
6. • Each scene has been captured under 3−4 different exposure settings.
• Camera frame held fixed (perfect frame alignment).
• Left and right views combined into single contiguous image.
Capturing Procedure
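The frame-aligned, multi-exposure captures described above can be merged into an HDR radiance map per view. A minimal sketch assuming a linear camera response (a full pipeline would first recover the response curve, e.g. with the Debevec–Malik method); `merge_exposures` is an illustrative helper, not part of our toolchain:

```python
import numpy as np

def merge_exposures(images, exposure_times):
    """Merge frame-aligned multi-exposure captures into a radiance map.

    Each pixel's radiance is a weighted average of (value / exposure
    time), with a triangle weight that trusts mid-range values most
    and discounts under- and over-exposed pixels.
    """
    num = np.zeros(images[0].shape, dtype=np.float64)
    den = np.zeros(images[0].shape, dtype=np.float64)
    for img, t in zip(images, exposure_times):
        z = np.asarray(img, dtype=np.float64) / 255.0  # normalize to [0, 1]
        w = 1.0 - np.abs(2.0 * z - 1.0)                # triangle weight, peak at 0.5
        num += w * (z / t)
        den += w
    return num / np.maximum(den, 1e-8)
```

For a perfectly static scene, the same radiance value is recovered from every exposure, which is why the fixed camera frame matters.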
7. • Natural scenes – forests & trees, sky-scapes, surface and water reflections and
low-lit/indoor scenes
• Good variation of depth profile, colour, texture, complexity and illumination
across exposures
• Fixed camera frame – perfect frame alignment
• Slight object motion – unavoidable with natural scenes
• These characteristics make our dataset unique and open new challenges to researchers:
– More efficient tone mapping and depth estimation algorithms
– Deployment of new deep learning algorithms for 3D HDR processing
Dataset Attributes
8. • Number of scenes – 30 (each under 3-4 different exposures)
• Resolution of each view (L & R): 2208 × 1242 pixels (2.2K)
• Scenes chosen – forests, buildings, water bodies, waterfalls, roads, sky-
scapes, indoor and low-lit scenes.
• Perfectly steady camera frame (frame aligned multi-exposure views)
• Slight object motion between successive exposure captures (e.g. swaying trees, flowing water)
• High spatial complexity, medium-high depth bracket.
3D HDR Image Dataset
9.
High depth profile; high spatial complexity. Sky and clouds visible at low exposures; lotuses visible at high exposure.
High depth profile; high spatial complexity. Rich shades of green, interplay of light and shadow, motion of branches.
10.
Indoor scene: medium depth profile; medium spatial complexity. Contrasting lighting, floor reflections.
High depth profile; high spatial complexity. The building reflects the sky; clouds visible at low exposures.
11. • Contains short fixed-frame captures of natural scenes
• Number of scenes - 10 (each scene captured under 3-4 exposures)
• Resolution of each view (L&R) : 1920×1080
• Frame rate : 30 fps
• Perfectly steady camera frame (perfect frame alignment)
• Multi-exposure captures are done sequentially – it is difficult to reproduce the same object motion between exposures.
• The multi-exposure stereo videos are therefore not identical – slight to medium, partially traceable motion of objects at varying depth ranges.
• Opportunity for researchers to develop efficient tone mapping and depth estimation algorithms.
3D HDR Video Database
12. Classification of data based on extent of object motion
• Static scene with partially traceable object motion –
Objects undergo gentle, partially traceable motion, e.g. a fountain, swaying trees in a forest, a flowing stream.
HDR video conversion can use one video as reference, identify scene objects, and apply deep learning to learn object motion and colour variation across exposures.
• Static scene with large object motion –
Objects undergo large motion and often change between exposure captures, e.g. moving traffic on a road, people talking, deer feeding on grass.
A more difficult problem due to changing objects – it requires sophisticated learning algorithms.
3D HDR Video Database
15. • The captured dataset requires deep-learning-based algorithms to account for intricate object motion.
• It also requires tone mapping operators that take complex object motion into account [3].
Post processing and potential use cases
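For reference, a global tone mapping operator in its simplest form looks like the sketch below (the classic Reinhard global operator; it ignores object motion entirely, which is exactly the limitation this dataset is meant to stress):

```python
import numpy as np

def reinhard_tonemap(radiance, key=0.18):
    """Global Reinhard operator: map an HDR radiance map into [0, 1).

    Scales the image so its log-average luminance sits at `key`,
    then compresses with L / (1 + L). A practical operator for this
    dataset would also need to handle inter-exposure object motion.
    """
    eps = 1e-6
    log_avg = np.exp(np.mean(np.log(radiance + eps)))  # log-average luminance
    scaled = key * radiance / log_avg                  # exposure normalization
    return scaled / (1.0 + scaled)                     # range compression
```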
19. Left view + depth map
Depth Video
20. • Depth-from-stereo –
Multiple view synthesis from the computed depth map.
Extension to VR, AR and MR (XR) technologies.
• We are working on deep-learning-based methods to meet 3D HDR challenges.
Potential applications & use cases
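Given the 12 cm baseline and a calibrated focal length, depth follows from disparity by triangulation, Z = f·B/d. A minimal sketch (`depth_from_disparity` and the example focal length are illustrative, not values from our calibration):

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m=0.12):
    """Triangulate depth from stereo disparity: Z = f * B / d.

    baseline_m defaults to the 12 cm sensor separation of the rig;
    focal_px is the focal length in pixels from calibration.
    Non-positive disparities (no match) map to infinity.
    """
    d = np.asarray(disparity_px, dtype=np.float64)
    return np.where(d > 0,
                    focal_px * baseline_m / np.maximum(d, 1e-9),
                    np.inf)

# Example: with a (hypothetical) 1400 px focal length, a disparity of
# 84 px corresponds to 1400 * 0.12 / 84 = 2.0 m.
z = depth_from_disparity(84.0, 1400.0)
```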
21. Conclusion
• Proposed database presents 3D HDR Image and Video datasets of natural
scenes.
• Facilitates investigation of the challenges involved in 3D HDR video depth estimation, tone mapping, encoding, and quality assessment.
• Establishes the need for developing robust deep learning and neural network
based training models.
• Opens research avenues to address the challenges of creating a backward-compatible, end-to-end production pipeline for 3D HDR video.
22. The database will be available for download upon request at
https://sites.google.com/view/hdr-dataset-aditya-wadaskar/home
Where to find the data
23. References
[1] C. Bist, R. Cozot, G. Madec and X. Ducloux. QoE-based brightness control for HDR displays. QoMEX, Erfurt, 2017, pp. 1-6.
[2] A. Banitalebi-Dehkordi. Introducing a Public Stereoscopic 3D High Dynamic Range (SHDR) Video Database. 3D Research, 8(1), 2017.
[3] S. Wu, J. Xu, Y.-W. Tai and C.-K. Tang. Deep high dynamic range imaging with large foreground motions. Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 117-132.
[4] A. Saxena, J. Schulte and A. Y. Ng. Depth Estimation Using Monocular and Stereo Cues. IJCAI International Joint Conference on Artificial Intelligence, 2007, pp. 2197-2203.
[5] S. Im, H.-G. Jeon, S. Lin and I. S. Kweon. DPSNet: End-to-end Deep Plane Sweep Stereo. arXiv:1905.00538, 2019.