The document summarizes a thesis defense on video summarization in video sensor networks. It proposes a distributed on-line multi-view video summarization algorithm that reduces storage space, transmission data, and power consumption while increasing usability. Key components are an on-line single-view summarization stage based on Gaussian mixture models and an inter-view stage that performs view selection by exchanging features and scores between sensors. Experiments show the algorithm achieves higher precision and recall than the alternatives and cuts power consumption by over 70% compared to no summarization. The algorithm is implemented on a wireless video sensor network built from Raspberry Pi boards.
Defense_20140625
1. Video Summarization in Video Sensor Networks
Presenter: Shun-Hsing Ou (歐順興)
Advisor: Dr. Shao-Yi Chien (簡韶逸)
Media IC & System Lab
Graduate Institute of Electronics Engineering
National Taiwan University
2. Video Sensor Network (1/2)
• Widely applied in our daily life: traffic, security, environment monitoring
Media IC & System Lab Shun-Hsing Ou 2
3. Video Sensor Network (2/2)
• The EYEs of Machine-to-Machine (M2M)
or Internet-of-Things (IoT)
Plenty of video sensor companies in M2M or IoT applications were shown at Computex 2014.
Goal-line technology in the FIFA World Cup 2014
4. Problems
• Video data is usually very large
– Large storage space
– Large transmission data
• Watching video is usually time-consuming
5. Wireless Video Sensor Network (1/2)
• Streaming videos through wireless
communication
– Without wire = more flexible
• Wider coverage
• Better view angles
6. Wireless Video Sensor Network (2/2)
• Power is the key
– Powered by
• Batteries
• Energy harvesting devices
– Streaming video requires significant power
7. An efficient video management and filtering method is required
8. Redundancy of Video Data
• Video usually contains redundant data
– Repeated events
– Overlapping fields of view
9. Automatic Video Summarization
• Generating a short representation of the original video
• Providing an excellent solution for video management
10. Our Idea
• Applying multi-view video summarization
in video sensor networks
– Saving storage space
– Saving transmission data
– Saving power
– Increasing usability
[Block diagram: video sensor (sensor → summarization unit → encoder → transceiver) sends data to the server-side analyzer and receives info back]
11. Contributions
• Propose to apply video summarization algorithms
in (wireless) video sensor networks
– Saving 60% ~ 90% storage space & transmission data
– Saving 50% ~ 80% power
– Increasing usability
• Propose an efficient video summarization
algorithm
– Multi-view
– Distributed
– On-line
• Implement a real wireless video sensor network with the summarization system
12. Outline
• Background
• Proposed summarization algorithm
• Experiments
• Implementations
• Conclusion
14. Requirements (1/2)
• Multi-view
• On-line
• Distributed
• Low-complexity
15. Requirements (2/2)
• 28 summarization methods were surveyed
– Only 4 on-line approaches
– Only 7 multi-view approaches
– No multi-view AND on-line approach
– Existing on-line approaches require large memory and computing
power
– Existing multi-view approaches are centralized
[Pie chart: venues of the surveyed references. TMM: 4, CVPR: 5, ICIP: 2, ACM MM: 4, ICME: 4, CSVT: 1, ICCV: 2, other: 6]
• As a result, a new summarization algorithm is required
17. System Structure
• Two stages design
– Intra-view stage
– Inter-view stage
[Block diagram: sensors 1-3 each run on-line single-view summarization (intra-view stage) followed by content matching & view selection (inter-view stage); sensors exchange features with each other and send video to the server]
18. Intra-view Stage: Overview
• On-line single-view video summarization
– Clustering
• A common technique of video summarization
• Applied to reduce redundancy
– On-line clustering is applied in our system
[Pipeline: input frame → feature extraction → on-line clustering (GMM: cluster 1 ... cluster n) → frame selection → summarization]
19. Intra-view Stage: Feature Extraction
• Frame representative feature is required
• The MPEG-7 color layout descriptor is applied
– Simple
– Good representative ability
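The color layout descriptor of slide 19 is, in essence, an 8 x 8 DCT of a downscaled frame. A minimal grayscale sketch (assumed simplification: it omits the YCbCr conversion, zig-zag scan, and coefficient quantization of the real MPEG-7 descriptor):

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Average-pool a w x h grayscale image into an 8 x 8 grid.
std::array<double, 64> downscale8x8(const std::vector<double>& img, int w, int h) {
    std::array<double, 64> out{};
    std::array<int, 64> cnt{};
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            int cell = (y * 8 / h) * 8 + (x * 8 / w);
            out[cell] += img[y * w + x];
            ++cnt[cell];
        }
    for (int i = 0; i < 64; ++i) out[i] /= cnt[i];
    return out;
}

// 2-D DCT-II of the 8 x 8 grid; the low-frequency coefficients form the
// per-frame feature.
std::array<double, 64> dct8x8(const std::array<double, 64>& g) {
    const double pi = std::acos(-1.0);
    std::array<double, 64> F{};
    for (int u = 0; u < 8; ++u)
        for (int v = 0; v < 8; ++v) {
            double s = 0.0;
            for (int y = 0; y < 8; ++y)
                for (int x = 0; x < 8; ++x)
                    s += g[y * 8 + x] * std::cos((2 * x + 1) * u * pi / 16.0)
                                      * std::cos((2 * y + 1) * v * pi / 16.0);
            double cu = (u == 0) ? std::sqrt(0.125) : 0.5;
            double cv = (v == 0) ? std::sqrt(0.125) : 0.5;
            F[v * 8 + u] = cu * cv * s;
        }
    return F;
}
```

A flat image yields only a DC coefficient, so similar frames produce nearby feature vectors, which is what the clustering stage relies on.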
20. Intra-view Stage: Clustering (1/2)
• Gaussian Mixture Model
– Each cluster has three parameters
• Mean
• Covariance
• Weighting
– At time t, the probability of feature x_t can be represented as
p(x_t) = Σ_k ω_k N(x_t; μ_k, C_k)
(ω_k: weighting, μ_k: mean, C_k: covariance of component k)
21. Intra-view Stage: Clustering (2/2)
• Parameter estimation
– EM is usually applied in off-line applications
– On-line estimation
• Step 1: Matching
• Step 2: Updating, with a pre-defined learning rate and a match indicator (1 for the matched component, 0 otherwise)
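The matching/updating steps above follow the familiar on-line (Stauffer-Grimson-style) GMM update. A sketch under assumed notation (alpha as the pre-defined learning rate, a fixed match threshold in standard deviations, one rate shared by weights, means, and variances; the thesis' exact rules are not shown on the slide):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One mixture component with a diagonal covariance.
struct Component {
    std::vector<double> mean, var;
    double weight;
};

// One on-line update step: match the feature to a component, then update.
void updateGmm(std::vector<Component>& gmm, const std::vector<double>& x,
               double alpha, double matchSigmas = 2.5) {
    // Step 1: matching -- a component matches when the feature lies within
    // matchSigmas standard deviations in every dimension.
    int match = -1;
    for (std::size_t k = 0; k < gmm.size() && match < 0; ++k) {
        bool ok = true;
        for (std::size_t d = 0; d < x.size(); ++d) {
            double diff = x[d] - gmm[k].mean[d];
            if (diff * diff > matchSigmas * matchSigmas * gmm[k].var[d]) {
                ok = false;
                break;
            }
        }
        if (ok) match = static_cast<int>(k);
    }
    // No match: start a new component at the feature (initial variance is
    // an arbitrary prior).
    if (match < 0) {
        gmm.push_back({x, std::vector<double>(x.size(), 25.0), 0.0});
        match = static_cast<int>(gmm.size()) - 1;
    }
    // Step 2: updating -- M is 1 for the matched component, 0 otherwise.
    double wsum = 0.0;
    for (std::size_t k = 0; k < gmm.size(); ++k) {
        double M = (static_cast<int>(k) == match) ? 1.0 : 0.0;
        gmm[k].weight = (1.0 - alpha) * gmm[k].weight + alpha * M;
        if (M > 0.0)
            for (std::size_t d = 0; d < x.size(); ++d) {
                double diff = x[d] - gmm[k].mean[d];
                gmm[k].mean[d] += alpha * diff;
                gmm[k].var[d] = (1.0 - alpha) * gmm[k].var[d] + alpha * diff * diff;
            }
        wsum += gmm[k].weight;
    }
    for (auto& c : gmm) c.weight /= wsum;  // keep weights summing to 1
}
```

Unlike EM, each frame is touched exactly once, which is what makes the method on-line and buffer-free.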
22. Intra-view Stage: Frame Selection
• Using clustering parameters
– Low-weighting cluster: rare events
– High-variance cluster: high-activity events
• Algorithm:
– Step 1: Sort clusters in ascending order of the selection criterion
– Step 2: Keep frames until the pre-defined summarization rate is reached
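The selection step could be sketched as follows, assuming clusters are ranked by mixture weight and frames are kept from the lowest-weight (rare-event) clusters until the summarization rate r is reached; the slide's exact sort key and threshold are elided, so this is only an illustrative reading:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical cluster summary: its mixture weight and the indices of the
// frames currently assigned to it.
struct Cluster {
    double weight;
    std::vector<int> frames;
};

// Keep frames from low-weight clusters while the cumulative kept weight
// stays below the summarization rate r.
std::vector<int> selectFrames(std::vector<Cluster> clusters, double r) {
    std::sort(clusters.begin(), clusters.end(),
              [](const Cluster& a, const Cluster& b) { return a.weight < b.weight; });
    std::vector<int> kept;
    double cum = 0.0;
    for (const auto& c : clusters) {
        cum += c.weight;
        if (cum > r) break;  // summarization rate reached
        kept.insert(kept.end(), c.frames.begin(), c.frames.end());
    }
    return kept;
}
```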
23. Intra-view Stage: Another Point of View (1/2)
• The difficulty of on-line summarization
– Partial Information
[Illustration: an off-line process sees the whole video; an on-line process sees only frames up to the current time; an on-line process with memory limitation sees only a short window of them]
24. Intra-view Stage: Another Point of View (2/2)
• The Gaussian mixture model retains the information of previous frames
– A model of what is redundant and what is active
• No frame buffer is required
25. Inter-view Stage: Overview
• View selection
• Distributed view selection
– Exchange features & scores between sensors
26. Inter-view Stage: Overview
• Step 1: Extract an inter-view feature and a score for each frame
– The color layout descriptor is not suitable for this
• Step 2: Exchange features and scores with the other sensors
• Step 3: If there is a "matched" feature with a higher score, drop the current frame
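The drop rule of step 3 could look like this sketch, assuming an L1 histogram distance and a pre-defined matching threshold tau (both assumptions; the slide does not specify the matching metric):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Per-frame inter-view data exchanged between sensors.
struct FrameInfo {
    std::vector<double> feature;
    double score;
};

// L1 distance between two histograms (assumed matching metric).
double dist(const std::vector<double>& a, const std::vector<double>& b) {
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) d += std::fabs(a[i] - b[i]);
    return d;
}

// Drop the local frame if any remote sensor reports a matched feature
// (distance below tau) with a higher score.
bool keepFrame(const FrameInfo& local, const std::vector<FrameInfo>& remote,
               double tau) {
    for (const auto& r : remote)
        if (dist(local.feature, r.feature) < tau && r.score > local.score)
            return false;
    return true;
}
```

Because every sensor applies the same deterministic rule to the same exchanged data, exactly one view keeps each matched event, with no central coordinator.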
27. Inter-view Stage: Feature Extraction
• Step 1: Compute the foreground mask
– Using the color layout feature & GMM
• Step 2: Extract the HSV histogram of the foreground pixels (H: 16, S: 2, V: 2 bins) as the inter-view feature
• Step 3: The mask size is used as the frame score
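A sketch of this inter-view feature, with assumed bin boundaries (uniform hue bins, 0.5 splits for saturation and value) and an assumed normalization of the histogram:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Pixel { double h, s, v; };  // h in [0, 360), s and v in [0, 1)

// 16x2x2 HSV histogram over the foreground pixels; the mask size (number
// of foreground pixels) serves as the frame score.
void interViewFeature(const std::vector<Pixel>& px, const std::vector<bool>& mask,
                      std::vector<double>& hist, int& score) {
    hist.assign(16 * 2 * 2, 0.0);
    score = 0;
    for (std::size_t i = 0; i < px.size(); ++i) {
        if (!mask[i]) continue;  // background pixel: ignored
        int hb = std::min(15, static_cast<int>(px[i].h / 360.0 * 16));
        int sb = px[i].s < 0.5 ? 0 : 1;
        int vb = px[i].v < 0.5 ? 0 : 1;
        ++hist[(hb * 2 + sb) * 2 + vb];
        ++score;
    }
    if (score > 0)
        for (auto& b : hist) b /= score;  // normalize for cross-view matching
}
```

The 64-bin histogram is tiny compared to a frame, which is why exchanging it between sensors is far cheaper than streaming video.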
30. Dataset (1/2)
• Three datasets are applied
– BL-7F: 19 videos, 320 x 240, 30 FPS
– Office [1]: 4 videos, 640 x 480, 30 FPS
– Lobby [1]: 3 videos, 640 x 480, 30 FPS
[1] Yanwei Fu, et al., "Multi-view Video Summarization," TMM 2010
31. Dataset (2/2)
• Ground truth
– People with no knowledge of our project were asked to mark the time periods of events in each video
– They were also asked to flag segments from different views that show the same event
33. Intra-view Stage: Evaluation
• Single-View Video Summarization
– Frame-level precision & recall are applied
• Precision: the ability of the algorithm to remove useless content
• Recall: the ability of the algorithm to keep important events
34. Intra-view Stage: Baseline
• Tree-based [1]
– D = 30
– D = 90
• Compressed domain [2]
[1] Víctor Valdés, et al., "Binary Tree Based On-line Video Summarization," TVS 2008
[2] J. Almeida, et al., "Online Video Summarization on Compressed Domain," JVCIR 2012
37. Inter-view Stage: Evaluation
• Multi-View Video Summarization
– Cross-view redundant frames are counted as false positives
38. Inter-view Stage: Baseline
• Baseline
– Concatenate the results of single-view
methods
• Tree-based
• Compressed domain
• The proposed GMM
– Graph-based [1]
• The results are provided by the authors
[1] Yanwei Fu, et al., "Multi-view Video Summarization," TMM 2010
41. Complexity (1/2)
• Tested on an EeePC
– CPU: Atom N570
– RAM: 2 GB
• Dataset: Office
– 640 x 480
• All methods are implemented in C++
42. Video Skimming: Complexity (2/2)

                   Tree-Based, D=30  Tree-Based, D=90  Compressed Domain  GMM
FPS (f/s)          21.8              18.8              9.3                34.7
Latency (s)        30                90                ~200               ~0
# Buffered Frames  900               2700              ~6000              1
Memory             > 414.7 MB        > 1244.1 MB       > 2764.8 MB        474.6 KB
44. Power Analysis
• We compare the power consumption
– With/Without summarization
• Platform: EeePC
– Battery power is measured
– DVC is applied as the encoder
[1] S.-Y. Chien, et al., "Power consumption analysis for distributed video sensors in machine-to-machine networks," JETCAS 2013
45. Without Summarization
• Total power
– Encoding power (Pc)
– Transmission power (Pt)
[Block diagram: wireless video sensor (sensor → encoder → transceiver) sends data to the server-side analyzer and receives info back]
46. With Summarization
• Total power
– Encoding power (Pc)
– Video transmission power (Pt)
– Feature transmission power (Pf)
– Summarization power (Ps)
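The two power models above can be compared with a toy calculation (assumptions: transmission power scales linearly with the kept fraction of frames, encoding power is unchanged, and all numbers are hypothetical):

```cpp
#include <cassert>

// Slide 45: without summarization, total power is encoding plus
// transmission for the whole stream.
double totalPowerNoSum(double Pc, double Pt) { return Pc + Pt; }

// Slide 46: with summarization, only the kept fraction of frames is
// transmitted, but feature transmission (Pf) and summarization (Ps)
// power are added.
double totalPowerWithSum(double Pc, double Pt, double keepRatio,
                         double Pf, double Ps) {
    return Pc + keepRatio * Pt + Pf + Ps;
}
```

Summarization pays off whenever the saved transmission power (1 - keepRatio) * Pt exceeds the added Pf + Ps, which is the trade-off the following measurements quantify.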
47. [Bar chart: power (mW, 0-120) of DVC, DVC + intra-view stage, and DVC + inter-view stage, broken into Pc (encoding), Pt (transmission), Ps (summarization), and Pf (feature transmission); BL-7F, processor-based; the full system saves 73.5% power]
49. Implementation
• We use Raspberry Pi to implement our
wireless video sensor network
50. Raspberry Pi
• Spec
– SoC: Broadcom BCM2835
– CPU: 700 MHz ARM11
– GPU: Broadcom VideoCore IV @ 250 MHz
– Memory: 512 MB
– Power: 5V x 700mA = 3.5W
• Related I/O
– 5V Micro USB power input
– Two USB I/O
– Camera Serial Interface (CSI)
52. Video Acquisition and Encoding (1/2)
• We need raw RGB from camera module
– Color space conversion is slow
• We need to encode video after
summarization
– Encoding is a high-complexity task
53. Video Acquisition and Encoding (2/2)
• Hardware Acceleration: Broadcom
VideoCore IV
– Hardware camera pipeline
– Hardware H.264 encoder/decoder
– OpenMAX API
54. Synchronization
• Network Time Protocol (NTP)
– Error may be large across network domains (> 100 ms)
– Error is small on a local network (< 1 ms)
• We run an NTP server on our own server
58. Conclusion
• In this thesis, we propose to apply summarization in video sensor networks
– Saving 60% ~ 90% storage space & transmission data
– Saving 50% ~ 80% power
• A distributed on-line multi-view summarization algorithm is proposed
– Low complexity, low memory requirement
– Generates results comparable to other methods
• A wireless video sensor network is implemented to validate the concept
60. Appendix: Proposed System II - Distributed On-line Multi-view Keyframe Extraction
61. Representation of Video Summarization (1/3)
• Video Skimming: A short video highlight
– More enjoyable to watch
– Better for further vision processing
• Keyframe Extraction: Representative
keyframes
– More compact representation
– Better for video browsing, surveillance, etc.
62. Representation of Video Summarization (2/3)
• Storyboard: Arranged keyframes
• Fast forwards: Smart video player
• Video Synopsis: Retargeting in time
domain
[1] Y. Pritch, et al., "Webcam Synopsis: Peeking Around the World," ICCV 2007
63. Representation of Video Summarization (3/3)
• “Video skimming” and “Keyframe
extraction” are better for video sensor
networks
– The results are more suitable for other vision
processing
– We focus on data filtering instead of summary
representation
64. Video-MMR [1] (1/2)
• Video maximum marginal relevance
• Iterative algorithm
– Select the one frame with the maximum Video-MMR score at a time
– The score balances representative ability against redundancy, computed over a candidate frame, the set of all frames, and the frames already in the summary
[1] Yingbo Li, et al., "Multi-video Summarization Based on Video-MMR," WAMIAS 2010
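Greedy Video-MMR selection might be sketched as follows; the similarity function and the trade-off weight lambda are assumptions, since the slide's formula is an image:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Similarity between two feature vectors (assumed: exp of negative
// Euclidean distance, so identical frames score 1).
double sim(const std::vector<double>& a, const std::vector<double>& b) {
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) d += (a[i] - b[i]) * (a[i] - b[i]);
    return std::exp(-std::sqrt(d));
}

// Greedy selection of k frames: each step picks the frame with the best
// trade-off between representing the remaining frames (average similarity)
// and redundancy with the summary (maximum similarity).
std::vector<int> videoMmr(const std::vector<std::vector<double>>& frames,
                          int k, double lambda) {
    std::vector<int> summary;
    std::vector<bool> used(frames.size(), false);
    while (static_cast<int>(summary.size()) < k) {
        int best = -1;
        double bestScore = -1e300;
        for (std::size_t f = 0; f < frames.size(); ++f) {
            if (used[f]) continue;
            double rep = 0.0;
            int n = 0;
            for (std::size_t g = 0; g < frames.size(); ++g)
                if (g != f && !used[g]) { rep += sim(frames[f], frames[g]); ++n; }
            if (n > 0) rep /= n;
            double red = 0.0;
            for (int s : summary) red = std::max(red, sim(frames[f], frames[s]));
            double score = lambda * rep - (1.0 - lambda) * red;
            if (score > bestScore) { bestScore = score; best = static_cast<int>(f); }
        }
        used[best] = true;
        summary.push_back(best);
    }
    return summary;
}
```

The redundancy term is what keeps the greedy loop from picking two near-identical frames, which is the point of the "marginal" in MMR.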
65. Video-MMR [1] (2/2)
• Centralized algorithm
• Off-line algorithm
[1] Yingbo Li, et al., "Multi-video Summarization Based on Video-MMR," WAMIAS 2010
66. Distributed On-line Video-MMR (1/2)
• Perform the operation once every fixed time period T
– The set of frames captured from t to t + T is used in place of the set of all frames
– This avoids buffering all frames
• With M cameras, the MMR score is extended to cover all views
68. Distributed On-line Video-MMR (3/3)
• The first term can be calculated at each sensor
• The second term can be calculated by sending the features of the summary frames from the server to the sensors
– Large data overhead
• We therefore send only a reduced representation of the frames
69. Data Overhead
• There is large data overhead if we want to send all features belonging to the summary to all sensors
• MsWAVE [1] is applied
– MsWAVE is a distributed kNN/kFN algorithm
– MsWAVE greatly reduces the amount of data exchanged
[1] J.-P. Wang, et al., "Communication-efficient distributed multiple reference pattern matching for M2M systems," ICDM 2013
70. MsWAVE
• Distributed kNN/kFN search algorithm between a group of sensors and a server
• A Haar transform is applied to generate a coarse-level feature
– Upper and lower bounds on the distance are estimated using the coarse feature
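The usefulness of the Haar coarse feature comes from a property of orthonormal transforms: the distance between coarse features never exceeds the full distance, so far-away candidates can be pruned before full features are exchanged. A sketch of just this lower bound (MsWAVE's actual protocol also derives upper bounds and refines level by level):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One level of the orthonormal Haar transform, keeping only the coarse
// (average) coefficients; the feature length is halved.
std::vector<double> haarCoarse(const std::vector<double>& x) {
    std::vector<double> c(x.size() / 2);
    for (std::size_t i = 0; i < c.size(); ++i)
        c[i] = (x[2 * i] + x[2 * i + 1]) / std::sqrt(2.0);
    return c;
}

// Euclidean distance between two equal-length vectors.
double l2(const std::vector<double>& a, const std::vector<double>& b) {
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) d += (a[i] - b[i]) * (a[i] - b[i]);
    return std::sqrt(d);
}
```

If the coarse-level distance to a query already exceeds the current k-th nearest distance, the candidate can be discarded without ever transmitting its full feature.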
73. Keyframe Extraction: Baseline
• Single-view
– Uniform sampling (US)
– Random sampling (RS)
– Visual attention based [1] (VA)
• Multi-view
– MMR [2]
– K-means (KM)
[1] Y.-F. Ma, "A Generic Framework of User Attention Model and Its Application in Video Summarization," TMM 2005
[2] Yingbo Li, et al., "Multi-video Summarization Based on Video-MMR," WAMIAS 2010
74. Keyframe Extraction: Extra Data
• Since keyframes are much smaller than a video skim
– The extra data becomes relatively large
• We compare the extra data with the centralized method, in which the features of all frames are sent
75.
                    Single-view       Multi-view
                    RS    US    VA    KM    MMR   Ours
BL-7F (19 videos)
  Keyframes         77    77    82    77    77    77
  Recall (%)        22    30    74    74    67    74
  Redundant frames  1     3     64    38    36    32
  Data sent (%)     0     0     0     100   100   33
Office (4 videos)
  Keyframes         94    94    116   94    94    94
  Recall (%)        13    18    52    52    66    63
  Redundant frames  2     0     44    45    38    21
  Data sent (%)     0     0     0     100   100   26
Lobby (3 videos)
  Keyframes         70    70    117   70    70    70
  Recall (%)        66    63    72    72    70    76
  Redundant frames  8     11    69    29    28    14
  Data sent (%)     0     0     0     100   100   16
77. On-line Summarization (1/3)
• Tree-based method [1]
– Type: video skimming
– Method:
• On-line decision tree
– Cons:
• Long latency
• Large memory required
[1] Víctor Valdés, et al., "Binary Tree Based On-line Video Summarization," TVS 2008
78. On-line Summarization (2/3)
• Summarization in the compressed domain [1]
– Type: video skimming
– Method:
• On-line shot detection: calculate differences between frames
• Redundancy removal
– Cons:
• Long latency
• Large memory required
[1] J. Almeida, et al., "Online Video Summarization on Compressed Domain," JVCIR 2012
79. On-line Summarization (3/3)
• Visual attention model [1]
– Type: keyframe
– Method:
• Visual attention index
• Attention-curve peak detection
– Cons:
• Not able to remove redundant frames
[1] Y.-F. Ma, "A Generic Framework of User Attention Model and Its Application in Video Summarization," TMM 2005
80. Multi-view Summarization (1/2)
• Clustering [1]
– Type: video skimming
– Method:
• Shot detection
• Graph
• Clustering
– Cons:
• Centralized
• High complexity
[1] Yanwei Fu, et al., "Multi-view Video Summarization," TMM 2010
81. Multi-view Summarization (2/2)
• MMR [1]
– Type: keyframe extraction
– Method:
• Video maximum marginal relevance (balancing representative ability against redundancy)
– Cons:
• Centralized
• Large memory required
[1] Yingbo Li, et al., "Multi-video Summarization Based on Video-MMR," WAMIAS 2010
83. Video Skimming
• The result is like video skimming
– Parameter updating is smooth
84. Tree-based, D=30
85. Tree-based, D=90
86. Compressed Domain
87. The Proposed GMM Approach
88. Video Skimming: Packet Loss
• Dataset: BL-7F
• Each sensor has a uniform probability of failing to receive a feature
89. Platform
• Processor-based
– EeePC
– Battery power is measured
• ASIC-based [1]
– Transmission power is estimated
– H.264 power is estimated
– Summarization power is estimated
[1] S.-Y. Chien, et al., "Power consumption analysis for distributed video sensors in machine-to-machine networks," JETCAS 2013
90. BL-7F, ASIC-based
[Bar chart: power (mW, 0-25) of no motion, DVC, DVC + intra stage, and DVC + inter stage, broken into Pc (encoding), Pt (transmission), Ps (summarization), and Pf (feature transmission); 83.4% power saving]
93. Communication Issues
• Feature broadcasting
– Only need to broadcast to nearby sensors
• Communication latency
– An additional buffer is needed
• Synchronization
– Clocks of all sensors are synchronized
94. Wireless Video Sensor Network
• Connected by a single Wi-Fi AP
95. Communication Channel
• Three TCP channels connect each sensor to the server
– Video channel: streaming video
– Feature channel: exchanging features
– Control channel: control signals and time information