Published in MMSys'17 in June 2017.
360° videos and Head-Mounted Displays (HMDs) are getting increasingly popular. However, streaming 360° videos to HMDs is challenging. This is because only video content in viewers' Field-of-Views (FoVs) is rendered, and thus sending complete 360° videos wastes resources, including network bandwidth, storage space, and processing power. Optimizing 360° video streaming to HMDs is, however, highly data and viewer dependent, and thus dictates real datasets. However, to the best of our knowledge, such datasets are not available in the literature. In this paper, we present our datasets of both content data (such as image saliency maps and motion maps derived from 360° videos) and sensor data (such as viewer head positions and orientations derived from HMD sensors). We put extra effort into aligning the content and sensor data using the timestamps in the raw log files.
The resulting datasets can be used by researchers, engineers, and hobbyists to either optimize existing 360° video streaming applications (like rate-distortion optimization) or develop novel applications (like crowd-driven camera movements). We believe that our dataset will stimulate more research activities along this exciting new research direction.
This document discusses optimizing 360-degree video streaming to head-mounted virtual reality. It covers challenges like existing codecs only supporting 2D videos and 360 videos having wider views than conventional videos. Approaches proposed include fixation prediction to avoid streaming unwatched parts, QoE modeling designed for 360 videos to improve user experience, and an adaptive streaming platform to select and transmit tiles based on fixation prediction while allocating bitrates based on the QoE model. Part I discusses fixation prediction, including using neural networks trained on viewing features. Part II covers QoE modeling, noting limitations of existing metrics and factors that affect QoE like content and bitrates, and constructs a logarithmic linear QoE model. Part III outlines an adaptive streaming platform that selects and transmits tiles based on the fixation prediction and allocates bitrates based on the QoE model.
Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality, by Wen-Chih Lo
Published in NOSSDAV'17 in June 2017.
We study the problem of predicting the Field-of-Views (FoVs) of viewers watching 360° videos using commodity Head-Mounted Displays (HMDs). Existing solutions either use the viewer's current orientation to approximate the future FoVs, or extrapolate future FoVs using historical orientations and dead-reckoning algorithms. In this paper, we develop fixation prediction networks that concurrently leverage sensor- and content-related features to predict future viewer fixations, which is quite different from the solutions in the literature. The sensor-related features include HMD orientations, while the content-related features include image saliency maps and motion maps. We build a testbed for streaming 360° videos to HMDs, and recruit twenty-five viewers to watch ten 360° videos. We then train and validate two design alternatives of our proposed networks, which allows us to identify the better-performing design with the optimal parameter settings.
Trace-driven simulation results show the merits of our proposed fixation prediction networks compared to the existing solutions, including: (i) lower consumed bandwidth, (ii) shorter initial buffering time, and (iii) short running time.
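To make the idea concrete, here is a minimal, hypothetical sketch (in PyTorch) of how sensor- and content-related features could be fused for fixation prediction. It is not the authors' actual network; all layer sizes, feature choices, and names are illustrative assumptions.

# Hypothetical sketch of a fixation-prediction network combining
# sensor features (HMD orientation) with content features (pooled
# saliency/motion statistics). Not the paper's actual architecture.
import torch
import torch.nn as nn

class FixationNet(nn.Module):
    def __init__(self, feat_dim=5, hidden=128, n_tiles=200):
        super().__init__()
        # feat_dim: e.g., yaw, pitch, roll + mean saliency + mean motion
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_tiles)

    def forward(self, x):
        # x: (batch, time, feat_dim) sequence of past observations
        out, _ = self.lstm(x)
        # Predict per-tile viewing probability at the last time step.
        return torch.sigmoid(self.head(out[:, -1]))

# Example: 1-second history at 30 fps, predicting the next fixation.
model = FixationNet()
history = torch.randn(4, 30, 5)    # batch of 4 viewers
tile_probs = model(history)        # (4, 200) per-tile probabilities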
Performance Measurements of 360° Video Streaming to Head-Mounted Displays Ove..., by Wen-Chih Lo
Published in APNOMS'17 in October 2017.
Watching 360° videos using a Head-Mounted Display (HMD) allows users to see only a part of the whole 360° video. With this feature, tiled videos become a potential solution for aggressively reducing the required bandwidth for 360° video streaming, turning it into a reality in cellular networks. In this paper, we design several experiments for quantifying the performance of tile-based 360° video streaming over a real cellular network on our campus. In particular, we empirically investigate the impacts of tile streaming over 4G networks, such as coding efficiency, bandwidth saving, and scalability.
Our experiments lead to interesting findings, for example: (i) streaming only the tiles viewed by the viewer achieves bitrate reductions of up to 80%, and (ii) the coding efficiency of 3x3 tiled videos may be higher than that of non-tiled videos at higher bitrates.
We believe this work will stimulate more studies in the emerging area of mobile AR/VR (Augmented Reality and Virtual Reality) over 4G networks.
The media landscape has changed significantly over the last few years, driven by new content formats, new service offerings, additional consumption devices, and new monetization models. Think of Netflix, DAZN, Mediatheks, mobile devices, interactive content, smart TVs, Virtual and Augmented Reality, and so on. Many of these efforts have been realized with only limited usage of standards, but are standards irrelevant? Secondly, more and more services are enabled by the latest mobile compute platforms, enabling new experiences. This presentation will provide an overview of some of these trends and will motivate the development of global interop standards. Specific aspects will include the move of linear TV services to the Internet (both mobile and fixed) as well as recent advances in Extended Reality and immersive media trends.
For the full video of this presentation, please visit:
https://www.embedded-vision.com/platinum-members/embedded-vision-alliance/embedded-vision-training/videos/pages/dec-2017-alliance-vitf-khronos
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Neil Trevett, President of the Khronos Group, delivers the presentation "Update on Khronos Standards for Vision and Machine Learning" at the Embedded Vision Alliance's December 2017 Vision Industry and Technology Forum. Trevett shares updates on recent, current and planned Khronos standardization activities aimed at streamlining the deployment of embedded vision and AI.
The document introduces CoaXPress, a high-speed video interface standard for computer vision applications. Key points include:
- CoaXPress uses a single coaxial cable to transmit video data at rates up to 6.25 Gbps, camera control signals, power, and real-time triggers. This simplifies cabling compared to alternatives.
- It supports cable lengths up to 130m, longer than competing standards, and aggregates multiple cables for even higher bandwidth up to 50 Gbps.
- Euresys offers CoaXPress frame grabbers with up to four 6.25 Gbps connections and features like I/O lines, camera control, and event logging software.
The document discusses the concept of the Tactile Internet with Human-in-the-Loop. It aims to democratize access to skills and expertise for people of all backgrounds and abilities. This goes beyond the current Internet's goal of providing access to information regardless of location or time. The document outlines a vision for two-way skills transfer between humans and machines using multimodal feedback over 5G networks. It discusses challenges like differing neural time delays for multisensory perception and individual differences in processing that affect perception and action. The Center for Tactile Internet's research agenda involves understanding multisensory goal-directed processing neurocognitively, modeling perception and action, and drawing on expertise in related fields to advance human-technology interaction.
Tile-based Streaming of 8K Omnidirectional Video: Subjective and Objective Qo..., by Alpen-Adria-Universität
Omnidirectional video (ODV) streaming applications are becoming increasingly popular. They enable a highly immersive experience as the user can freely choose her/his field of view within the 360-degree environment. Current deployments are fairly simple but viewport-agnostic, which inevitably results in high storage/bandwidth requirements and low Quality of Experience (QoE). A promising solution is referred to as tile-based streaming, which allows higher quality within the user's viewport while quality outside the user's viewport can be lower. However, empirical QoE assessment studies in this domain are still rare. Thus, this paper investigates the impact of different tile-based streaming approaches and configurations on the QoE of ODV. We present the results of a lab-based subjective evaluation in which participants evaluated 8K omnidirectional video QoE as influenced by different (i) tile-based streaming approaches (full vs. partial delivery), (ii) content types (static vs. moving camera), and (iii) tile encoding quality levels determined by different quantization parameters. Our experimental setup is characterized by high reproducibility since relevant media delivery aspects (including the user's head movements and dynamic tile quality adaptation) are already rendered into the respective processed video sequences. Additionally, we performed a complementary objective evaluation of the different test sequences focusing on bandwidth efficiency and objective quality metrics. The results, presented and discussed in detail in this paper, confirm that tile-based streaming of ODV improves visual quality while reducing bandwidth requirements.
A major challenge for the next decade is to design virtual and augmented reality systems (VR at large) for real-world use cases such as healthcare, entertainment, e-education, and high-risk missions. This requires VR systems to operate at scale, in a personalized manner, remaining bandwidth-tolerant whilst meeting quality and latency criteria. One key challenge to reach this goal is to fully understand and anticipate user behaviours in these mixed reality settings.
This can be accomplished only by a fundamental revolution of the network and VR systems, which have to put the interactive user at the heart of the system rather than at the end of the chain. With this goal in mind, in this talk, we describe our current research on user-centric systems. First, we describe our viewport-based streaming strategies for 360-degree video. Then, we present in more detail our research on users' behaviour analysis when users interact with 360-degree content. Specifically, we describe a set of metrics that allows us to identify key behaviours among users and quantify the level of similarity of these behaviours, and we present our clique-based clustering methodology, information theory, and trajectory-based in-depth analysis. Finally, we conclude with an overview of the extension of this work to navigation within volumetric video sequences.
Internet data volume almost doubles every year. Multimedia communication needs less storage space and fast transmission, so the large volume of video data has become the reason for video compression. The aim of this paper is to achieve temporal compression for three-dimensional (3D) videos using motion estimation-compensation and wavelets. Instead of performing a two-dimensional (2D) motion search, as is common in conventional video codecs, the use of a 3D motion search has been proposed, which is able to better exploit the temporal correlations of 3D content. This leads to more accurate motion prediction and a smaller residual. The discrete wavelet transform (DWT) compression scheme has been added for a better compression ratio. The DWT has a high energy-compaction property and has thus greatly impacted the field of compression. The quality parameters peak signal-to-noise ratio (PSNR) and mean square error (MSE) have been calculated. The simulation results show that the proposed work improves the PSNR over existing work.
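For reference, the PSNR and MSE quality parameters mentioned above can be computed as in the following generic NumPy sketch; this is not the paper's code, and the 8-bit peak value of 255 is an assumption.

# Generic MSE / PSNR computation between a reference frame and a
# reconstructed (compressed) frame; not tied to any specific codec.
import numpy as np

def mse(ref, rec):
    ref = ref.astype(np.float64)
    rec = rec.astype(np.float64)
    return np.mean((ref - rec) ** 2)

def psnr(ref, rec, max_val=255.0):
    m = mse(ref, rec)
    if m == 0:
        return float("inf")    # identical frames
    return 10.0 * np.log10(max_val ** 2 / m)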
Session 10 in module 3 from the Master in Computer Vision by UPC, UAB, UOC & UPF.
This lecture provides an overview of state of the art applications of convolutional neural networks to the problems in video processing: semantic recognition, optical flow estimation and object tracking.
"Future Internet enablers for VGI applications" presentation from ENVIROINFO 2013, Sept. 02-04 2013
Shows the ENVIROFI results relevant to crowdsourcing and crowdtasking.
This document provides a survey of adaptive 360-degree video streaming solutions, challenges, and opportunities. It discusses current solutions for streaming 360-degree video over dynamic networks in a viewport-independent, viewport-dependent, and tile-based manner. It also analyzes research challenges for on-demand and live 360-degree video streaming and discusses standardization efforts to ensure interoperability and deployment at scale. The document concludes by outlining future research opportunities enabled by 360-degree video streaming.
1. The document discusses analyzing videos with convolutional neural networks (CNNs). It covers techniques like video recognition using CNNs to classify video content at the clip level.
2. DeepVideo and C3D are discussed as approaches for video recognition using CNNs. DeepVideo employs 2D CNNs on multiple video frames while C3D uses 3D CNNs to learn spatiotemporal features directly from video data.
3. Optical flow estimation techniques like DeepFlow are also covered, which uses a deep matching approach to compute dense correspondences between video frames for large displacements.
This document summarizes recent trends in the spatial information field and their implications. It discusses the convergence of ICT and spatial IT, the evolution of location technology towards real-time dynamic mapping of both indoor and outdoor spaces, and the rise of digital twins, big data, AI and IoT. It also introduces mago3D as a platform that can visualize massive 3D models and seamlessly integrate BIM and 3D GIS for applications like facility management, cultural heritage preservation and live drone mapping.
Secure IoT Systems Monitor Framework using Probabilistic Image Encryption, by IJAEMSJORNAL
In recent years, the modeling of human behaviors and patterns of activity for recognition or detection of special events has attracted considerable research interest. Various methods abound for building intelligent vision systems aimed at understanding the scene and making correct semantic inferences from the observed dynamics of moving targets. Many systems include detection, storage of video information, and human-computer interfaces. Here we present not only an update that expands previous similar surveys but also an emphasis on contextual abnormal detection of human activity, especially in video surveillance applications. The main purpose of this survey is to identify existing methods extensively, and to characterize the literature in a manner that brings key challenges to attention.
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.
The ubiquitous and connected nature of camera-loaded mobile devices has greatly elevated the value and importance of the visual information they capture. Today, videos uploaded from the camera phones of unknown users are aired on news networks, and banking customers expect to be able to deposit checks using mobile devices. In this paper we present Movee, a system that addresses the fundamental question of whether the visual stream uploaded by a user has been captured live on a mobile device and has not been tampered with by an adversary. Movee leverages the mobile device's motion sensors and the inherent user movements during the shooting of the video. It exploits the observation that the movement of the scene recorded on the video stream should be related to the movement of the device simultaneously captured by the accelerometer. We model the distribution of the correlation of temporal noise residue in a forged video as a Gaussian mixture model (GMM), propose a two-step scheme to estimate the model parameters, and use a Bayesian classifier to find the optimal threshold value based on the estimated parameters. Cyrus Deboo | Shubham Kshatriya | Rajat Bhat, "Video Liveness Verification," published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2, Issue-3, April 2018. URL: http://www.ijtsrd.com/papers/ijtsrd12772.pdf http://www.ijtsrd.com/computer-science/other/12772/video-liveness-verification/cyrus-deboo
This document provides a user guide for open-source geospatial tools to extract building footprints and define homogeneous zones from imagery for the purpose of developing exposure data. It describes software tools like Quantum GIS and GRASS for pre-processing imagery, as well as algorithms for automatically extracting building footprints. Protocols are provided for modifying GIS data and manually delineating homogeneous land use zones using QGIS or Google Earth. Sample data is included to allow users to test the extraction methods.
Call for papers - 9th International Conference on Signal, Image Processing an..., by sipij
9th International Conference on Signal, Image Processing and Pattern Recognition (SIPP 2021) is a forum for presenting new advances and research results in the fields of Signal and Image Processing.
Technological advances in dental implant surgery, by Periowiki.com
This document discusses recent technological advances in dental implant surgery, including computer-aided design/computer-aided manufacturing (CAD/CAM) technology and computer-guided implant surgery techniques. It describes computerized tomography (CT) imaging and how CT data can be used for virtual surgical planning and fabrication of surgical guides. The document compares computer-guided implant surgery (CGIS), which uses static surgical guides, to computer-navigated implant surgery (CNIS), which allows for intraoperative modification of the surgical plan. Both techniques aim to increase the accuracy and predictability of dental implant placement.
This document discusses augmented reality and its applications. It begins with an abstract and introduction on mediated reality and augmented reality. It then provides details on the concept of mediated reality including Milgram's reality-virtuality continuum. Next, it discusses the working of mediated reality including registration, tracking, and displays. It provides examples of applications of mediated reality in various fields such as military, industries, automotive, and education. In the conclusion, it states that as technology advances, the usage of augmented reality will increase with other technologies.
This document provides an overview of a course on augmented reality (AR). The course will cover introductions to AR technology and interaction techniques, AR authoring tools, and research directions in AR. Students will learn about AR and complete a simple AR project. They will be assessed through a research project, assignments, and a final exam. The document outlines the weekly topics and provides background on AR applications, history, and the importance of user experience design.
4. A 360° video is a video in which every direction is recorded at the same time.
Watching it on planar monitors is a passive experience.
Head-Mounted Displays (HMDs) offer more immersive experiences.
5. VR/AR delivered a total of $3.9 billion in revenue in 2016, including $2.7 billion from VR and $1.2 billion from AR [1].
[1] After mixed year, mobile AR to drive $108 billion VR/AR market by 2021, Digi-Capital, Jan 2017. https://goo.gl/Blcv2f
6. Challenges:
Latency
Extremely high resolution
Distortion while stitching or projecting videos/images
Compressing a tremendous amount of video data in real time
Reducing the computational cost using a tile-based viewing method
8. We recruit 50 subjects; each of them is asked to watch ten 360° videos.
52% are male.
Most of them are in their early twenties.
56% of them are watching 360° videos with an HMD for the first time.
[Pie chart: "How often do you watch 360° video using HMDs?" Never 56%, Seldom 42%, Often 2%]
11. We collect ten 360° videos from YouTube.
Each is 1 minute long, 4K resolution, and 30 fps.
[Dataset structure diagram: Dataset → Content Trace (Video Trace, Saliency Map, Motion Map) and Sensor Trace (Raw, Orientation, Tile)]
12. Saliency maps are generated with a Convolutional Neural Network (CNN),
based on a pre-trained VGG-16 network [1].
Each saliency map is a gray-scale image (values from 0 to 255).
[1] M. Cornia, L. Baraldi, G. Serra, and R. Cucchiara. 2016. A Deep Multi-Level Network for Saliency Prediction. In International Conference on Pattern Recognition (ICPR'16).
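As a hedged illustration only: the sketch below produces a comparable 0-255 gray-scale saliency map using OpenCV's spectral-residual saliency (from opencv-contrib-python) as a lightweight stand-in for the VGG-16-based network of [1]; it will not reproduce the dataset's actual maps, and the file names are assumptions.

# Illustrative stand-in: OpenCV spectral-residual saliency instead of
# the VGG-16-based network; output is a 0-255 gray-scale map.
import cv2
import numpy as np

def saliency_map(frame_bgr):
    sal = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, smap = sal.computeSaliency(frame_bgr)   # smap values in [0, 1]
    assert ok, "saliency computation failed"
    return (smap * 255).astype(np.uint8)

frame = cv2.imread("frame_0001.png")            # one equirectangular frame
cv2.imwrite("saliency_0001.png", saliency_map(frame))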
13. Motion maps capture the relative motions between objects in the video and the viewer.
They are computed with Lucas-Kanade optical flow [2].
Each motion map is a black-and-white image (values 0 or 1).
[2] B. Lucas and T. Kanade. 1981. An iterative image registration technique with an application to stereo vision. In Proc. of the International Joint Conference on Artificial Intelligence (IJCAI'81).
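A minimal sketch of how a binary motion map might be derived with Lucas-Kanade optical flow in OpenCV; the dataset's actual pipeline may differ, and the file names and the 1-pixel motion threshold are assumptions.

# Sketch: binary motion map from sparse Lucas-Kanade optical flow.
# Pixels near feature points that moved more than a threshold are set
# to 1, everything else stays 0.
import cv2
import numpy as np

prev = cv2.imread("frame_0001.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_0002.png", cv2.IMREAD_GRAYSCALE)

pts = cv2.goodFeaturesToTrack(prev, maxCorners=500,
                              qualityLevel=0.01, minDistance=7)
nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr, pts, None)

motion = np.zeros_like(prev)
for p0, p1, ok in zip(pts.reshape(-1, 2), nxt.reshape(-1, 2),
                      status.ravel()):
    if ok and np.linalg.norm(p1 - p0) > 1.0:    # moved more than 1 pixel
        cv2.circle(motion, (int(p1[0]), int(p1[1])), 5, 1, -1)

cv2.imwrite("motion_0002.png", motion * 255)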
14. We collect sensor data from HMDs while viewers are watching 360° videos.
Frame Capturer: GamingAnywhere [1] records video frames with timestamps at 30 Hz.
Sensor Logger: OpenTrack [2] records sensor data with timestamps at 250 Hz from the Oculus DK2.
[1] http://gaminganywhere.org/
[2] https://github.com/opentrack/opentrack
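A minimal sketch of the kind of alignment this setup enables: match each 30 Hz frame timestamp to the nearest 250 Hz sensor sample. The array layout is an assumption about the traces, not the dataset's documented format.

# Sketch: for each 30 Hz video-frame timestamp, pick the 250 Hz sensor
# sample whose timestamp is closest. Timestamps are assumed to be epoch
# seconds in sorted 1-D NumPy arrays.
import numpy as np

def align(sensor_ts, frame_ts):
    idx = np.searchsorted(sensor_ts, frame_ts)
    idx = np.clip(idx, 1, len(sensor_ts) - 1)
    # Choose the nearer of the two neighboring sensor samples.
    left_closer = (frame_ts - sensor_ts[idx - 1]) < (sensor_ts[idx] - frame_ts)
    return np.where(left_closer, idx - 1, idx)

sensor_ts = np.arange(0.0, 60.0, 1 / 250)   # 250 Hz sensor clock
frame_ts = np.arange(0.0, 60.0, 1 / 30)     # 30 Hz frame clock
matches = align(sensor_ts, frame_ts)        # sensor index per frame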
16. Raw sensor data from HMDs contains seven fields:
Timestamp (epoch time)
Position (x, y, and z)
Orientation (yaw, pitch, and roll)
20. We align the sensor data and video frames using their timestamps.
Different viewers introduce different amounts of bias.
We design a 35-sec calibration procedure to compensate.
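A hedged sketch of the kind of per-viewer bias correction such a calibration enables: estimate the viewer's mean orientation offset over the calibration window (assumed here to show a front-facing target at yaw = pitch = 0) and subtract it from the rest of the trace. The dataset's actual calibration procedure may differ.

# Sketch: per-viewer orientation bias correction. We assume the first
# 35 s (the calibration video) shows a front-facing target, so the mean
# orientation over that window estimates the viewer's bias.
import numpy as np

def calibrate(ts, yaw, pitch, roll, calib_end=35.0):
    calib = ts < calib_end                      # calibration window
    bias = np.array([yaw[calib].mean(),
                     pitch[calib].mean(),
                     roll[calib].mean()])
    # Subtract the bias and wrap yaw back into [-180, 180).
    yaw_c = (yaw - bias[0] + 180.0) % 360.0 - 180.0
    return yaw_c, pitch - bias[1], roll - bias[2]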
25. The Field-of-View (FoV) of the HMD is a 100° x 100° circle.
We divide each frame into 192x192-pixel tiles.
We number the tiles from upper-left to lower-right (from 0 to 199).
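A simplified sketch of mapping a viewing orientation to viewed tile numbers, assuming a 3840x1920 equirectangular frame (20x10 tiles of 192x192 pixels, numbered 0-199) and treating the 100°x100° FoV as an axis-aligned box in the equirectangular domain; it ignores distortion near the poles, so it only approximates the dataset's own tile processing.

# Simplified sketch: which 192x192 tiles of a 3840x1920 equirectangular
# frame (20 columns x 10 rows, numbered 0..199 from upper-left) fall in
# a 100 x 100 degree FoV centered at (yaw, pitch)?
COLS, ROWS, FOV = 20, 10, 100.0

def col_of(yaw_deg):
    # Yaw 0 maps to the horizontal center of the frame.
    return int((((yaw_deg + 180.0) % 360.0) / 360.0) * COLS) % COLS

def viewed_tiles(yaw_deg, pitch_deg):
    half = FOV / 2.0
    top = min(90.0, pitch_deg + half)           # pitch +90 = top of frame
    bot = max(-90.0, pitch_deg - half)
    row_lo = min(ROWS - 1, int((90.0 - top) / 180.0 * ROWS))
    row_hi = min(ROWS - 1, int((90.0 - bot) / 180.0 * ROWS))
    col_lo, col_hi = col_of(yaw_deg - half), col_of(yaw_deg + half)
    tiles, c = set(), col_lo
    while True:                                  # walk columns, wrapping at 180°
        for r in range(row_lo, row_hi + 1):
            tiles.add(r * COLS + c)
        if c == col_hi:
            break
        c = (c + 1) % COLS
    return sorted(tiles)

print(viewed_tiles(0.0, 0.0))                    # tiles around frame center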
28. Comparison of published 360° viewing datasets along five dimensions: head movement, eye movement, content-driven data, open-source software, and application-driven evaluation.
Datasets compared: Lo et al. [1], Rai et al. [2], Corbillon et al. [3], Wu et al. [4].
[1] W. Lo, C. Fan, J. Lee, C. Huang, K. Chen, and C. Hsu. "360° Video Viewing Dataset in Head-Mounted Virtual Reality." In Proc. of the 8th ACM on Multimedia Systems Conference (MMSys'17). 2017.
[2] Y. Rai, J. Gutiérrez, and P. Callet. "A Dataset of Head and Eye Movements for 360 Degree Images." In Proc. of the 8th ACM on Multimedia Systems Conference (MMSys'17). 2017.
[3] X. Corbillon, F. Simone, and G. Simon. "360-Degree Video Head Movement Dataset." In Proc. of the 8th ACM on Multimedia Systems Conference (MMSys'17). 2017.
[4] C. Wu, Z. Tan, Z. Wang, and S. Yang. "A Dataset for Exploring User Behaviors in VR Spherical Video Streaming." In Proc. of the 8th ACM on Multimedia Systems Conference (MMSys'17). 2017.
30. NOSSDAV'17
Tomorrow (6/23), 2:10pm - 3:10pm, at the 2nd Conference Room:
C. Fan, J. Lee, W. Lo, C. Huang, K. Chen, and C. Hsu, "Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality"
Hello everyone, I am Wen-Chih Lo from NTHU.
Today I am going to present a 360-degree video dataset.
This is joint work with my collaborators from NTHU, Prof. Huang from NCTU, and Prof. Chen from Academia Sinica.
In this talk, I will briefly describe our dataset structure and the two major components in our 360-degree video viewing dataset.
Besides, I will also show the basic statistics of our dataset and a sample application using our dataset.
A 360-degree video is known as spherical video or immersive video.
Watching these videos using traditional planar monitors gives viewers a passive experience.
Nowadays, HMDs are widely available.
Using HMDs gives viewers a more immersive experience.
In a Digi-Capital report, they say the AR/VR market will keep growing.
In the future, watching 360-degree video using HMDs might be a common experience.
However, it is challenging to deliver such a high resolution video over today’s network.
For example... bla bla bla.
More and more researchers and engineers are jumping into this topic.
At that time, there was no public standard dataset that could be used to evaluate system performance and develop new algorithms.
But I guess we don't need to worry about this problem anymore; several datasets have been published here.
At the beginning, I want to show you some basic statistics of our dataset.
We recruit 50 subjects in our dataset.
(description)
Each of them is asked to watch ten videos.
During the experiments, all of them are asked to stand when watching videos.
Our dataset is unique, because it contains not only the sensor-driven data, but also the content-driven data.
First, I want to show you the content-driven data, which we call the content trace.
The content trace captures the impact of the video content on viewers' attention.
(click)
For example, the video traces, the saliency maps, and the motion maps of the 360-degree videos.
I will explain each of them later.
Here, I want to show you the video traces we collected from YouTube.
We divide the 10 videos into three groups:
fast-paced Natural Image (NI), slow-paced NI, and fast-paced Computer-Generated (CG) videos.
All of them are 1-minute, 4K-resolution videos.
The other one is saliency map.
We develop a deep neural network based on pre-trained VGG networks.
We use this network to produce the saliency map.
(demo video)
A saliency map indicates the attraction levels within the video,
i.e., which parts of the video attract viewers' attention.
As many previous presenters mentioned, the saliency map helps decide
which FoV we should stream to meet the viewer's needs in the next moment.
The last one is motion map.
We analyze the optical flow of the video frames to produce the motion map using the OpenCV library.
(demo video)
The motion map indicates the relative motions between the objects in the video and the viewers.
Second, I want to show you the sensor-driven data.
It is the viewing orientation data from HMDs, recorded while a viewer watches videos.
(click) In our testbed, we render the videos to the Oculus DK2 and Oculus Video using the Oculus SDK.
(click) We use GamingAnywhere as our frame capturer. It records the video from HMDs at a frame capture rate of 30 Hz.
(click) We modified and enhanced OpenTrack to be our sensor logger. OpenTrack tracks the user's head movements.
It captures and timestamps the orientation data from HMDs at a logging rate of 250 Hz.
(click) The sensor trace contains 3 different data: the raw data from HMDs, the processed viewing orientation data, and the processed tile data.
Here, I want to show you the raw sensor data from HMDs.
There are 7 fields in the raw sensor data,
including timestamp (epoch time), position (x, y, and z), and orientation (yaw, pitch, and roll).
To simplify the usage of our dataset, we align the timestamps in the raw sensor data from the HMD with the video frames captured by GamingAnywhere.
However, in our pilot experiments, we find that different viewers tend to introduce different amounts of bias.
We therefore insert a 35-sec calibration video before each viewer starts watching 360° videos.
This is not only a calibration procedure; it also helps viewers become familiar with how to watch 360-degree videos.
There are 10 fields in the orientation data,
including index, position (x, y, and z), raw orientation (yaw, pitch, and roll), and the calibrated orientation data.
We measure that the FoV of the HMD is about a 100°x100° circle.
Therefore, we process the viewing orientation data and generate viewed-tile data to further simplify the usage of our dataset.
There are 2 fields in the tile data: index and the tiles watched by the viewer.
Now, several datasets are available.
Here is a table summarizing different dataset efforts.
Each of them is unique and contains different features.
For example, if you need to take content-driven data into consideration, you can choose the first dataset.
If you need to take eye movement into consideration, you can choose the second one.
I need to emphasize that this is preliminary, and if you think differently, please don't hesitate to let us know.
Our dataset can be used in various 360-degree video applications. (click) For example, when a viewer with an HMD rotates his/her head to watch some new tiles that have not been requested, (click)
it may take several seconds to deliver these new tiles.
To reduce the buffering time, (click) predicting which tiles a viewer of a tile-based 360-degree video will watch is important.
Our dataset can be used for developing and evaluating new algorithms for viewed-tile prediction.
Besides, our dataset also has video content with diverse characteristics; for example, it can also be used for bitrate allocation for 360° video streaming to HMDs.
Here, I want to give a quick teaser of our prediction work.
Tomorrow we will present our fixation prediction network using our dataset.
If you are interested in this topic or how to use our dataset,
Please feel free to join us tomorrow afternoon at the 2nd conference room.
See you tomorrow.
That is all. Thank you. I am ready to take questions.
I am not quite sure. I guess… blablabla. Here is what I propose to do for deeper investigation, and I will let you know my findings.
I hope that we can stay in touch with each other.
1. We tiled the video in the equirectangular domain.
2. We provide the original video sequences in our dataset, and they are protected with a password to avoid copyright issues.
3. It is our individual judgment.
4. Saliency maps are computed on equirectangular images using a classical image-based saliency mapping approach. There is no research that indicates with sufficient confidence that this can be done. If we tiled the video into multi-level sphere windows, I guess it would reduce the influence of distortion.
5. We could use the cube map projection and process the 6 sides of the cube separately using the same saliency mapping software, but it seems to me that it won't achieve the same result.