This document presents a method for region of interest (ROI) determination in videos using visual rhythm analysis and user attention models. Visual rhythm captures temporal information from videos through diagonal and anti-diagonal pixel sampling. This is used to determine six attention models and identify ROIs. ROIs are then classified into slices for flexible macroblock ordering video coding to prioritize important regions. Experimental results show the method identifying faces and hands as ROIs in example videos. The visual rhythm analysis runs in real-time and improves ROI video compression when integrated with H.264/AVC coding.
Robust region of interest determination based on user attention model through visual rhythm analysis
2. Outline
Introduction
Visual Rhythm And User Attention Model
ROI Determination Through User Attention Model
FMO-aware ROI Determination For H.264/AVC Video Coding
Experimental Results
Conclusion
4. Introduction
ROI determination is required for video data transmission.
Moving objects catch users’ focus as ROIs in consecutive frames, but detecting them is computationally intensive.
Visual rhythm can describe the characteristics of video content.
ROI determination is therefore based on attention models through visual rhythm analysis.
6. Visual Rhythm
Visual rhythm can efficiently capture the temporal information of a video.
7. Visual Rhythm
[Figure: an m × n video frame with its diagonal and anti-diagonal sampling lines]
m : width of a video frame
n : height of a video frame
rd : the pixel sampling ratio for the diagonal
ra : the pixel sampling ratio for the anti-diagonal
• Sampling lines:
Diagonal (D), Anti-diagonal (A), Vertical (V), Horizontal (H).
8. Visual Rhythm
Di represents the gray scale value of the diagonal sampling pixels in the ith frame.
Ai represents the gray scale value of the anti-diagonal sampling pixels in the ith frame.
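As a rough sketch of how the rhythm images might be assembled from Di and Ai, the snippet below samples each grayscale frame along its diagonal and anti-diagonal and stacks the samples as one column per frame. The names `sample_line`, `visual_rhythms`, and `num_samples` are hypothetical; `num_samples` stands in for the sampling ratios rd and ra, and nearest-pixel sampling is an assumption rather than something the slides specify.

```python
import numpy as np


def sample_line(frame, start, end, num_samples):
    """Sample gray values along the straight line from start to end.

    start and end are (row, col) pairs; nearest-pixel sampling is assumed.
    """
    rows = np.linspace(start[0], end[0], num_samples).round().astype(int)
    cols = np.linspace(start[1], end[1], num_samples).round().astype(int)
    return frame[rows, cols]


def visual_rhythms(frames, num_samples):
    """Return (diagonal, anti_diagonal) rhythm images.

    Each image has shape (num_samples, num_frames): column i holds the
    samples taken from frame i.
    """
    n, m = frames[0].shape  # n: frame height, m: frame width
    diag = [sample_line(f, (0, 0), (n - 1, m - 1), num_samples) for f in frames]
    anti = [sample_line(f, (0, m - 1), (n - 1, 0), num_samples) for f in frames]
    return np.stack(diag, axis=1), np.stack(anti, axis=1)
```

Calling `visual_rhythms(frames, num_samples)` on a list of equally sized grayscale frames gives two 2-D arrays whose horizontal axis is time, which is what makes temporal changes visible as patterns in a single image.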
16. User Attention Models
Possible events (attention models):
• Horizontal attention model
• Vertical attention model
• Expanding attention model
• Absorbing attention model
• Diagonal attention model
• Anti-diagonal attention model
Sampling lines:
• Diagonal sampling
• Anti-diagonal sampling
• Horizontal sampling
• Vertical sampling
19. ROI Determination
Using four sampling lines yields an attention model that characterizes the events in a video efficiently and avoids false alarms.
The center-crossed diagonal and anti-diagonal sampling lines are first used to analyze the attention model of the current frame. The vertical and horizontal sampling lines are then integrated to derive the final user attention model, from which the ROI is obtained.
20. ROI Determination
1) Visual Rhythm Creation
2) Difference calculation
3) Visual rhythm history
4) Binary thresholding
5) Morphological merging
21. ROI Determination
Fig. 4. Visual rhythms of diagonal and anti-diagonal sampling lines acquired from Salesman QCIF sequence with 176 frames. (a) Diagonal and (b) anti-diagonal.
22. Difference calculation
• The variation of the visual rhythms embeds significant information about object movement.
Fig. 5. Visual rhythm difference images acquired from Fig. 4. (a) Diagonal and (b) anti-diagonal.
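The difference formula itself did not survive in this transcript, so the following is only a plausible sketch: it assumes the difference image is the absolute change between the rhythm columns of consecutive frames. The function name `rhythm_difference` is hypothetical.

```python
import numpy as np


def rhythm_difference(rhythm):
    """Absolute change between consecutive frames (columns) of a rhythm image.

    Assumes rhythm has shape (num_samples, num_frames); the result has one
    column fewer than the input.
    """
    # Widen the dtype first so that uint8 subtraction cannot wrap around.
    widened = rhythm.astype(np.int32)
    return np.abs(np.diff(widened, axis=1))
```

Static background produces values near zero, while samples crossed by a moving object produce large values, which is the property the later thresholding step relies on.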
23. Visual rhythm history
• Computed according to the variation of the visual rhythm.
Fig. 6. Visual rhythm historical images acquired from Fig. 5. (a) Diagonal and (b) anti-diagonal.
24. Binary thresholding
The threshold is calculated by averaging the historical values, which stand
for the variation of the visual rhythm.
bi(z) represents the binary image according to the magnitudes of the variations.
Fig. 7. Binarized images derived from Fig. 6 by the thresholding process of
the historical statistics. (a) Diagonal and (b) anti-diagonal.
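The thresholding step can be sketched as below. The slide states only that the threshold is the average of the historical values, so the use of a single global mean over a non-empty history image, the strict "greater than" comparison, and the function name are assumptions.

```python
def binarize_history(history):
    """Binarize a history image using the mean of its values as threshold.

    history[i][z] is the historical value of sample z in frame i
    (assumed layout, assumed non-empty).
    """
    values = [v for column in history for v in column]
    threshold = sum(values) / len(values)
    # Samples whose variation exceeds the average are marked 1.
    binary = [[1 if v > threshold else 0 for v in column] for column in history]
    return binary, threshold
```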
26. Morphological merging
Images of the scopes of user attention in the diagonal and anti-diagonal
visual rhythms. (a) Diagonal and (b) anti-diagonal.
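A common way to merge nearby binary segments is a morphological closing (dilation followed by erosion). The sketch below applies it to the binary samples of one frame. The slide does not specify the operator or the structuring element, so the closing, the `radius` parameter, and the function names are assumptions.

```python
def dilate(bits, radius):
    """1 wherever any sample within `radius` positions is 1."""
    n = len(bits)
    return [1 if any(bits[max(0, i - radius):min(n, i + radius + 1)]) else 0
            for i in range(n)]


def erode(bits, radius):
    """1 only where every sample within `radius` positions is 1."""
    n = len(bits)
    return [1 if all(bits[max(0, i - radius):min(n, i + radius + 1)]) else 0
            for i in range(n)]


def merge_segments(bits, radius=1):
    """Morphological closing (dilation, then erosion) of one binary column."""
    return erode(dilate(bits, radius), radius)
```

With `radius=1`, a one-sample gap between two segments is filled, while a wider gap is left open, so the merged run can be read as one scope of user attention.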
30. FMO-AWARE ROI DETERMINATION
FOR H.264/AVC VIDEO CODING
Flexible macroblock ordering (FMO) was introduced in
H.264/AVC as a new error resilience tool and can be
used for ROI video coding as well.
In the H.264/AVC reference software JM 13.2, the FMO
functionality supports eight slice ordering numbers, from
0 to 7, with 0 having the highest priority. Thus, the ROI
determination, which is followed by the FMO technique in
H.264/AVC, classifies the macroblocks (MBs) into three slices from 0 to 2.
31. Skin Color Extraction and Visual
Rhythm ROI Determination
Since human faces are usually the loci of attention in
conversations, human faces should be regarded as the ROI
regions in the implementation.
Here, both skin color extraction and visual rhythm ROI
determination schemes can detect ROI areas.
Fig. 16 shows the results of each step in the proposed
FMO-aware ROI determination.
32. FMO-AWARE ROI DETERMINATION
In Fig. 16(b) and (d), the skin color pixels are
extracted and then categorized into a
macroblock-based image, respectively.
Then Fig. 16(e) sketches the contour of
the user attention region from the result
of Fig. 16(c).
Fig. 16(d) and (e) illustrate the
individual ROI results in terms of white
and black macroblocks, where white
macroblocks represent the ROI region.
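The conversion from a pixel-level mask to a macroblock-based image can be sketched as follows. H.264/AVC macroblocks cover 16x16 luma pixels. The rule that a macroblock counts as ROI when at least half of its pixels are set (`min_fraction`), the function name, and the assumption that the frame dimensions are multiples of 16 (as with QCIF, 176x144) are illustrative choices, not taken from the slides.

```python
MB_SIZE = 16  # H.264/AVC macroblocks cover 16x16 luma pixels


def to_macroblock_map(pixel_mask, min_fraction=0.5):
    """Mark a macroblock as ROI (1) when enough of its pixels are set."""
    height, width = len(pixel_mask), len(pixel_mask[0])
    mb_rows, mb_cols = height // MB_SIZE, width // MB_SIZE
    mb_map = []
    for r in range(mb_rows):
        row = []
        for c in range(mb_cols):
            # Count the set pixels inside this 16x16 block.
            count = sum(
                pixel_mask[y][x]
                for y in range(r * MB_SIZE, (r + 1) * MB_SIZE)
                for x in range(c * MB_SIZE, (c + 1) * MB_SIZE)
            )
            row.append(1 if count >= min_fraction * MB_SIZE * MB_SIZE else 0)
        mb_map.append(row)
    return mb_map
```

In the returned map, 1 corresponds to a white (ROI) macroblock and 0 to a black (non-ROI) macroblock.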
33. Extended ROI Macroblocks
In implementations, ROI regions do not always stay in the same
position in a consecutive sequence, and a macroblock may change its
ROI status between two consecutive frames.
Therefore, the variation in generated bits increases when a
macroblock changes from a non-ROI region in the
previous frame to an ROI region in the current frame.
Moreover, the visual quality suffers from obvious artifacts at the
boundary between ROI macroblocks and non-ROI ones. However, it is
observed in [24], [25] that an extended region around the ROI regions
helps reduce the artifacts while ensuring that regions with targets
are not missed.
Therefore, in our implementation the extended ROI macroblocks are
centered on the ROI regions obtained above. Fig. 16(f) and (g)
illustrate the extended ROI regions, marked in gray.
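The spatial part of the extension can be sketched as a neighborhood search on the macroblock map. The `radius` of one macroblock, the square neighborhood, and the function name are assumptions. The slides do not give the size of the extended region, and the temporal extension is not covered here.

```python
def extend_roi(roi_map, radius=1):
    """Return a map of the non-ROI macroblocks within `radius` of an ROI one."""
    rows, cols = len(roi_map), len(roi_map[0])
    extended = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if roi_map[r][c]:
                continue  # ROI macroblocks are not part of the extension
            # Look for any ROI macroblock in the surrounding square window.
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    if roi_map[rr][cc]:
                        extended[r][c] = 1
    return extended
```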
34. ROI Scoreboard for FMO
To create a scoreboard of ROI macroblocks, points are given to classify
the category of each macroblock.
A macroblock located in the background gets two points. If a
macroblock belongs to an extended region in either the spatial or the temporal
domain, it gets one point. Otherwise, when it belongs to the ROI region,
a macroblock gets zero points.
As illustrated in Fig. 16(h), each macroblock has its score from the
lookup table in Table IV, and then it is arranged into five distinct
ordered slices. Fig. 16(i) shows the original frame with the result of ROI
scoreboard in Fig. 16(h) to demonstrate the location of the
corresponding slices in a frame.
35. ROI Scoreboard for FMO
The higher the score, the less important a
macroblock is in a frame.
Corresponding score lookup table
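The point assignment described for the scoreboard can be sketched as follows. The point values (background 2, extended 1, ROI 0) come from the slides. The function name and the two input maps are assumptions, and the lookup in Table IV that maps scores to slices is not modeled.

```python
BACKGROUND_POINTS = 2
EXTENDED_POINTS = 1
ROI_POINTS = 0


def score_macroblocks(roi_map, extended_map):
    """Give each macroblock 0 (ROI), 1 (extended), or 2 (background) points."""
    scores = []
    for roi_row, ext_row in zip(roi_map, extended_map):
        row = []
        for is_roi, is_ext in zip(roi_row, ext_row):
            if is_roi:
                row.append(ROI_POINTS)
            elif is_ext:
                row.append(EXTENDED_POINTS)
            else:
                row.append(BACKGROUND_POINTS)
        scores.append(row)
    return scores
```

Lower scores mark more important macroblocks, which matches the priority order of the FMO slices, where slice 0 comes first.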
40. Experimental Results
Two staff members walking in the office room. Panels (a) to (h).
41. Time Consumption Analysis of Visual
Rhythm ROI Determination
Evaluated on a 1.5 GHz Pentium-M laptop with 512 MB of DDR RAM
42. Implementation of H.264/AVC ROI
Video Coding
Importance of each slice in FMO:
Ii : the importance factor of slice i
Ni : the number of macroblocks in slice i
n : the number of slices in a frame
Target bits bppi:
B : the target bits used for the current frame, estimated
by the JM encoder
QPi for the FMO:
a and b are recommended as 14 and −0.32
47. Conclusion
This paper has presented a robust ROI determination
method based on user attention models through visual
rhythm analysis.
It has investigated the visual rhythm
concept for analyzing video content to facilitate
ROI determination.
Through visual rhythm, the proposed algorithm can
determine the highest potential ROI area in a fast,
simple, and robust way.
48. Future Work
An FMO-aware ROI determination has been proposed
for H.264/AVC video coding to enhance the quality of
ROI regions.
Based on the concept proposed in this paper, integrated
applications could be developed by combining
the proposed scheme with
chrominance information analysis.