SlideShare a Scribd company logo
1
2020/12/15
Summarizing Videos with Attention
Arithmer Dynamics Div. Christian Saravia
Self-Introduction
Christian Martin Saravia Hernandez
● Graduate School
○ Universidad Peruana de Ciencias Aplicadas
○ Bs. Software Engineering
● Former Job
○ Ficha Inc
■ Algorithm Development Engineer
● Research & application of deep learning in computer vision problems
● Current Job
○ Application of machine learning / deep learning to computer vision problems
■ Object detection
■ Object classification
○ Research topics
■ Attention & Transformers
Purpose
Purpose of this material
● Explore a solution to the task of video summarization using attention.
Agenda
Contents
● Introduction
○ Motivation
○ Contributions
● Dataset
● VASNet
○ Feature Extraction
○ Attention Network
○ Regressor Network
● Inference
○ Changepoint Detection
○ Kernel Temporal Segmentation
● Results
○ Measuring method
○ Dataset Results
Introduction
Motivation
● Early video summarization methods were based on unsupervised methods, leveraging low level spatio-temporal features and
dimensionality reduction with clustering techniques.Success of these methods solely stands on the ability to define distance/cost
functions between the keyshots/frames with respect to the original video.
● Current state of the art methods for video summarization are based on recurrent encoder-decoder architectures, usually with bi-
directional LSTM or GRU and soft attention. They are computationally demanding, especially in the bi-directional configuration.
Introduction
Contribution
1. A novel approach to sequence to sequence transformation for video summarization based on soft, self-attention mechanism. In
contrast, current state of the art relies on complex LSTM/GRU encoder-decoder methods.
1. A demonstration that a recurrent network can be successfully replaced with simpler, attention mechanism for the video
summarization.
Dataset
Dataset
● TVSum Dataset: https://github.com/yalesong/tvsum
● SumMe Dataset: https://gyglim.github.io/me/vsum/index.html
VASNet
Feature Extraction
● Given a time interval t, every 15 frames are collected in an ordered set X
● Each set then is used as input to GoogLeNet for feature extraction
● Then we extract the Pool 5 layer of GoogLeNet, which is a 1024 dimensional array (D = 1024).
VASNet
Attention Network
● Unnormalized self-attention weight et,i is calculated as an alignment between input feature Xt and the entire input sequence
Where,
N: Number of frames
U, V: Network weight matrices estimated together with other parameters of the network during optimization
s: Scale paramenter
● The attention weights αt are true probabilities representing the importance of input features with respect to the
desired frame level score at the time t
VASNet
Attention Network
VASNet
Regressor Network
● Linear transformation C is then applied to each input and the results then weighted with attention vector αt and averaged.
● The output is a context vector ct which is used for the final frame score regression.
● The context vector ct is then projected by a single layer, fully connected network with linear activation and residual sum followed by
dropout and layer normalization.
VASNet
Regressor Network
Inference
Inference
● The output of the model VASNet is a probability of importance per frame
● This probability must be analyzed in the range of the scene it corresponds
● However to get the number of frames is relative per video
● The problem to find the frames where a change a scene exist is called changepoint detection.
● For the datasets used, the changepoints (cps) are already calculated by using KTS algorithm with hyperparameter tuning
Inference
Changepoint detection
● In statistical analysis, change detection or change point detection
tries to identify times when the probability distribution of a
stochastic process or time series changes. In general the problem
concerns both detecting whether or not a change has occurred, or
whether several changes might have occurred, and identifying the
times of any such changes.
Inference
Kernel Temporal Segmentation (KTS)
● Kernel Temporal Segmentation (KTS) method splits the video into
a set of non-intersecting temporal segments.
● It treats the cps detection as a dynamic programming problem.
● The method is fast and accurate when combined with high-
dimensional descriptors.
Results
Measuring method
P: Precision
R: Recall
F Score: [2 * P * R / (P + R)] * 100
Results
Dataset Results
Results
Dataset Results
● Long video: https://youtu.be/873CBVbPJVE
● Summarize: https://youtu.be/weW4memH3Dg
● Full playlist: https://www.youtube.com/playlist?list=PLEdpjt8KmmQMfQEat4HvuIxORwiO9q9DB
References
Reference
● VASNet: https://arxiv.org/pdf/1812.01969.pdf
● VASNet official implementation: https://github.com/ok1zjf/VASNet
● KTS implementation: https://github.com/TatsuyaShirakawa/KTS
● Video summarization datasets and review: https://hal.inria.fr/hal-01022967/PDF/video_summarization.pdf
● Issue on testing on own videos: https://github.com/ok1zjf/VASNet/issues/2

More Related Content

What's hot

Deep Learning for Computer Vision: Unsupervised Learning (UPC 2016)
Deep Learning for Computer Vision: Unsupervised Learning (UPC 2016)Deep Learning for Computer Vision: Unsupervised Learning (UPC 2016)
Deep Learning for Computer Vision: Unsupervised Learning (UPC 2016)
Universitat Politècnica de Catalunya
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Universitat Politècnica de Catalunya
 
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Universitat Politècnica de Catalunya
 
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
Universitat Politècnica de Catalunya
 
Improving access to satellite imagery with Cloud computing
Improving access to satellite imagery with Cloud computingImproving access to satellite imagery with Cloud computing
Improving access to satellite imagery with Cloud computing
RAHUL BHOJWANI
 
Perceptrons (D1L2 2017 UPC Deep Learning for Computer Vision)
Perceptrons (D1L2 2017 UPC Deep Learning for Computer Vision)Perceptrons (D1L2 2017 UPC Deep Learning for Computer Vision)
Perceptrons (D1L2 2017 UPC Deep Learning for Computer Vision)
Universitat Politècnica de Catalunya
 
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Universitat Politècnica de Catalunya
 
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Universitat Politècnica de Catalunya
 
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Universitat Politècnica de Catalunya
 
Deep Learning for Computer Vision: Memory usage and computational considerati...
Deep Learning for Computer Vision: Memory usage and computational considerati...Deep Learning for Computer Vision: Memory usage and computational considerati...
Deep Learning for Computer Vision: Memory usage and computational considerati...
Universitat Politècnica de Catalunya
 
Joint unsupervised learning of deep representations and image clusters
Joint unsupervised learning of deep representations and image clustersJoint unsupervised learning of deep representations and image clusters
Joint unsupervised learning of deep representations and image clusters
Universitat Politècnica de Catalunya
 
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Universitat Politècnica de Catalunya
 
Optimization for Deep Networks (D2L1 2017 UPC Deep Learning for Computer Vision)
Optimization for Deep Networks (D2L1 2017 UPC Deep Learning for Computer Vision)Optimization for Deep Networks (D2L1 2017 UPC Deep Learning for Computer Vision)
Optimization for Deep Networks (D2L1 2017 UPC Deep Learning for Computer Vision)
Universitat Politècnica de Catalunya
 
Deep Neural Networks (D1L2 Insight@DCU Machine Learning Workshop 2017)
Deep Neural Networks (D1L2 Insight@DCU Machine Learning Workshop 2017)Deep Neural Networks (D1L2 Insight@DCU Machine Learning Workshop 2017)
Deep Neural Networks (D1L2 Insight@DCU Machine Learning Workshop 2017)
Universitat Politècnica de Catalunya
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Universitat Politècnica de Catalunya
 
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Preferred Networks
 
Deep Learning for Computer Vision: Data Augmentation (UPC 2016)
Deep Learning for Computer Vision: Data Augmentation (UPC 2016)Deep Learning for Computer Vision: Data Augmentation (UPC 2016)
Deep Learning for Computer Vision: Data Augmentation (UPC 2016)
Universitat Politècnica de Catalunya
 
Graph Neural Network - Introduction
Graph Neural Network - IntroductionGraph Neural Network - Introduction
Graph Neural Network - Introduction
Jungwon Kim
 
Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)
Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)
Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)
Universitat Politècnica de Catalunya
 
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Universitat Politècnica de Catalunya
 

What's hot (20)

Deep Learning for Computer Vision: Unsupervised Learning (UPC 2016)
Deep Learning for Computer Vision: Unsupervised Learning (UPC 2016)Deep Learning for Computer Vision: Unsupervised Learning (UPC 2016)
Deep Learning for Computer Vision: Unsupervised Learning (UPC 2016)
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
 
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
 
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
 
Improving access to satellite imagery with Cloud computing
Improving access to satellite imagery with Cloud computingImproving access to satellite imagery with Cloud computing
Improving access to satellite imagery with Cloud computing
 
Perceptrons (D1L2 2017 UPC Deep Learning for Computer Vision)
Perceptrons (D1L2 2017 UPC Deep Learning for Computer Vision)Perceptrons (D1L2 2017 UPC Deep Learning for Computer Vision)
Perceptrons (D1L2 2017 UPC Deep Learning for Computer Vision)
 
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
 
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
 
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
 
Deep Learning for Computer Vision: Memory usage and computational considerati...
Deep Learning for Computer Vision: Memory usage and computational considerati...Deep Learning for Computer Vision: Memory usage and computational considerati...
Deep Learning for Computer Vision: Memory usage and computational considerati...
 
Joint unsupervised learning of deep representations and image clusters
Joint unsupervised learning of deep representations and image clustersJoint unsupervised learning of deep representations and image clusters
Joint unsupervised learning of deep representations and image clusters
 
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
 
Optimization for Deep Networks (D2L1 2017 UPC Deep Learning for Computer Vision)
Optimization for Deep Networks (D2L1 2017 UPC Deep Learning for Computer Vision)Optimization for Deep Networks (D2L1 2017 UPC Deep Learning for Computer Vision)
Optimization for Deep Networks (D2L1 2017 UPC Deep Learning for Computer Vision)
 
Deep Neural Networks (D1L2 Insight@DCU Machine Learning Workshop 2017)
Deep Neural Networks (D1L2 Insight@DCU Machine Learning Workshop 2017)Deep Neural Networks (D1L2 Insight@DCU Machine Learning Workshop 2017)
Deep Neural Networks (D1L2 Insight@DCU Machine Learning Workshop 2017)
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
 
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
 
Deep Learning for Computer Vision: Data Augmentation (UPC 2016)
Deep Learning for Computer Vision: Data Augmentation (UPC 2016)Deep Learning for Computer Vision: Data Augmentation (UPC 2016)
Deep Learning for Computer Vision: Data Augmentation (UPC 2016)
 
Graph Neural Network - Introduction
Graph Neural Network - IntroductionGraph Neural Network - Introduction
Graph Neural Network - Introduction
 
Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)
Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)
Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)
 
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
Visualization of Deep Learning Models (D1L6 2017 UPC Deep Learning for Comput...
 

Similar to Summarizing videos with Attention

Video Annotation for Visual Tracking via Selection and Refinement_tran.pptx
Video Annotation for Visual Tracking via Selection and Refinement_tran.pptxVideo Annotation for Visual Tracking via Selection and Refinement_tran.pptx
Video Annotation for Visual Tracking via Selection and Refinement_tran.pptx
AlyaaMachi
 
CA-SUM Video Summarization
CA-SUM Video SummarizationCA-SUM Video Summarization
CA-SUM Video Summarization
VasileiosMezaris
 
Future semantic segmentation with convolutional LSTM
Future semantic segmentation with convolutional LSTMFuture semantic segmentation with convolutional LSTM
Future semantic segmentation with convolutional LSTM
Kyuri Kim
 
IRJET- Study of SVM and CNN in Semantic Concept Detection
IRJET- Study of SVM and CNN in Semantic Concept DetectionIRJET- Study of SVM and CNN in Semantic Concept Detection
IRJET- Study of SVM and CNN in Semantic Concept Detection
IRJET Journal
 
Time series analysis : Refresher and Innovations
Time series analysis : Refresher and InnovationsTime series analysis : Refresher and Innovations
Time series analysis : Refresher and Innovations
QuantUniversity
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
Sanghamitra Deb
 
Machine Learning approaches at video compression
Machine Learning approaches at video compression Machine Learning approaches at video compression
Machine Learning approaches at video compression
Roberto Iacoviello
 
Biomedical Signal and Image Analytics using MATLAB
Biomedical Signal and Image Analytics using MATLABBiomedical Signal and Image Analytics using MATLAB
Biomedical Signal and Image Analytics using MATLAB
CodeOps Technologies LLP
 
IMPROVEMENT IN IMAGE DENOISING OF HANDWRITTEN DIGITS USING AUTOENCODERS IN DE...
IMPROVEMENT IN IMAGE DENOISING OF HANDWRITTEN DIGITS USING AUTOENCODERS IN DE...IMPROVEMENT IN IMAGE DENOISING OF HANDWRITTEN DIGITS USING AUTOENCODERS IN DE...
IMPROVEMENT IN IMAGE DENOISING OF HANDWRITTEN DIGITS USING AUTOENCODERS IN DE...
IRJET Journal
 
Dynamic Adaptive Point Cloud Streaming
Dynamic Adaptive Point Cloud StreamingDynamic Adaptive Point Cloud Streaming
Dynamic Adaptive Point Cloud Streaming
Alpen-Adria-Universität
 
Survey paper on image compression techniques
Survey paper on image compression techniquesSurvey paper on image compression techniques
Survey paper on image compression techniques
IRJET Journal
 
Ajila (1)
Ajila (1)Ajila (1)
Ajila (1)
akanksha kunwar
 
Video Denoising using Transform Domain Method
Video Denoising using Transform Domain MethodVideo Denoising using Transform Domain Method
Video Denoising using Transform Domain Method
IRJET Journal
 
H43044145
H43044145H43044145
H43044145
IJERA Editor
 
Near Reversible Data Hiding Scheme for images using DCT
Near Reversible Data Hiding Scheme for images using DCTNear Reversible Data Hiding Scheme for images using DCT
Near Reversible Data Hiding Scheme for images using DCT
IJERA Editor
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Ganesan Narayanasamy
 
Pipeline anomaly detection
Pipeline anomaly detectionPipeline anomaly detection
Pipeline anomaly detection
GauravBiswas9
 
"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...
"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ..."Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...
"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...
Edge AI and Vision Alliance
 
CARI-2020, Application of LSTM architectures for next frame forecasting in Se...
CARI-2020, Application of LSTM architectures for next frame forecasting in Se...CARI-2020, Application of LSTM architectures for next frame forecasting in Se...
CARI-2020, Application of LSTM architectures for next frame forecasting in Se...
Mokhtar SELLAMI
 
“Stream loss”: ConvNet learning for face verification using unlabeled videos ...
“Stream loss”: ConvNet learning for face verification using unlabeled videos ...“Stream loss”: ConvNet learning for face verification using unlabeled videos ...
“Stream loss”: ConvNet learning for face verification using unlabeled videos ...
Elaheh Rashedi
 

Similar to Summarizing videos with Attention (20)

Video Annotation for Visual Tracking via Selection and Refinement_tran.pptx
Video Annotation for Visual Tracking via Selection and Refinement_tran.pptxVideo Annotation for Visual Tracking via Selection and Refinement_tran.pptx
Video Annotation for Visual Tracking via Selection and Refinement_tran.pptx
 
CA-SUM Video Summarization
CA-SUM Video SummarizationCA-SUM Video Summarization
CA-SUM Video Summarization
 
Future semantic segmentation with convolutional LSTM
Future semantic segmentation with convolutional LSTMFuture semantic segmentation with convolutional LSTM
Future semantic segmentation with convolutional LSTM
 
IRJET- Study of SVM and CNN in Semantic Concept Detection
IRJET- Study of SVM and CNN in Semantic Concept DetectionIRJET- Study of SVM and CNN in Semantic Concept Detection
IRJET- Study of SVM and CNN in Semantic Concept Detection
 
Time series analysis : Refresher and Innovations
Time series analysis : Refresher and InnovationsTime series analysis : Refresher and Innovations
Time series analysis : Refresher and Innovations
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
 
Machine Learning approaches at video compression
Machine Learning approaches at video compression Machine Learning approaches at video compression
Machine Learning approaches at video compression
 
Biomedical Signal and Image Analytics using MATLAB
Biomedical Signal and Image Analytics using MATLABBiomedical Signal and Image Analytics using MATLAB
Biomedical Signal and Image Analytics using MATLAB
 
IMPROVEMENT IN IMAGE DENOISING OF HANDWRITTEN DIGITS USING AUTOENCODERS IN DE...
IMPROVEMENT IN IMAGE DENOISING OF HANDWRITTEN DIGITS USING AUTOENCODERS IN DE...IMPROVEMENT IN IMAGE DENOISING OF HANDWRITTEN DIGITS USING AUTOENCODERS IN DE...
IMPROVEMENT IN IMAGE DENOISING OF HANDWRITTEN DIGITS USING AUTOENCODERS IN DE...
 
Dynamic Adaptive Point Cloud Streaming
Dynamic Adaptive Point Cloud StreamingDynamic Adaptive Point Cloud Streaming
Dynamic Adaptive Point Cloud Streaming
 
Survey paper on image compression techniques
Survey paper on image compression techniquesSurvey paper on image compression techniques
Survey paper on image compression techniques
 
Ajila (1)
Ajila (1)Ajila (1)
Ajila (1)
 
Video Denoising using Transform Domain Method
Video Denoising using Transform Domain MethodVideo Denoising using Transform Domain Method
Video Denoising using Transform Domain Method
 
H43044145
H43044145H43044145
H43044145
 
Near Reversible Data Hiding Scheme for images using DCT
Near Reversible Data Hiding Scheme for images using DCTNear Reversible Data Hiding Scheme for images using DCT
Near Reversible Data Hiding Scheme for images using DCT
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUS
 
Pipeline anomaly detection
Pipeline anomaly detectionPipeline anomaly detection
Pipeline anomaly detection
 
"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...
"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ..."Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...
"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...
 
CARI-2020, Application of LSTM architectures for next frame forecasting in Se...
CARI-2020, Application of LSTM architectures for next frame forecasting in Se...CARI-2020, Application of LSTM architectures for next frame forecasting in Se...
CARI-2020, Application of LSTM architectures for next frame forecasting in Se...
 
“Stream loss”: ConvNet learning for face verification using unlabeled videos ...
“Stream loss”: ConvNet learning for face verification using unlabeled videos ...“Stream loss”: ConvNet learning for face verification using unlabeled videos ...
“Stream loss”: ConvNet learning for face verification using unlabeled videos ...
 

More from Arithmer Inc.

コーディネートレコメンド
コーディネートレコメンドコーディネートレコメンド
コーディネートレコメンド
Arithmer Inc.
 
Test for AI model
Test for AI modelTest for AI model
Test for AI model
Arithmer Inc.
 
最適化
最適化最適化
最適化
Arithmer Inc.
 
Arithmerソリューション紹介 流体予測システム
Arithmerソリューション紹介 流体予測システムArithmerソリューション紹介 流体予測システム
Arithmerソリューション紹介 流体予測システム
Arithmer Inc.
 
Arithmer NLP 自然言語処理 ソリューション紹介
Arithmer NLP 自然言語処理 ソリューション紹介Arithmer NLP 自然言語処理 ソリューション紹介
Arithmer NLP 自然言語処理 ソリューション紹介
Arithmer Inc.
 
Arithmer Robo Introduction
Arithmer Robo IntroductionArithmer Robo Introduction
Arithmer Robo Introduction
Arithmer Inc.
 
Arithmer AIチャットボット
Arithmer AIチャットボットArithmer AIチャットボット
Arithmer AIチャットボット
Arithmer Inc.
 
Arithmer R3 Introduction
Arithmer R3 Introduction Arithmer R3 Introduction
Arithmer R3 Introduction
Arithmer Inc.
 
Arithmer Inspection Introduction
Arithmer Inspection IntroductionArithmer Inspection Introduction
Arithmer Inspection Introduction
Arithmer Inc.
 
全力解説!Transformer
全力解説!Transformer全力解説!Transformer
全力解説!Transformer
Arithmer Inc.
 
Arithmer NLP Introduction
Arithmer NLP IntroductionArithmer NLP Introduction
Arithmer NLP Introduction
Arithmer Inc.
 
Introduction of Quantum Annealing and D-Wave Machines
Introduction of Quantum Annealing and D-Wave MachinesIntroduction of Quantum Annealing and D-Wave Machines
Introduction of Quantum Annealing and D-Wave Machines
Arithmer Inc.
 
Arithmer OCR Introduction
Arithmer OCR IntroductionArithmer OCR Introduction
Arithmer OCR Introduction
Arithmer Inc.
 
Arithmer Dynamics Introduction
Arithmer Dynamics Introduction Arithmer Dynamics Introduction
Arithmer Dynamics Introduction
Arithmer Inc.
 
ArithmerDB Introduction
ArithmerDB IntroductionArithmerDB Introduction
ArithmerDB Introduction
Arithmer Inc.
 
3D human body modeling from RGB images
3D human body modeling from RGB images3D human body modeling from RGB images
3D human body modeling from RGB images
Arithmer Inc.
 
Object Pose Estimation
Object Pose EstimationObject Pose Estimation
Object Pose Estimation
Arithmer Inc.
 
Recommendation algorithm using reinforcement learning
Recommendation algorithm using reinforcement learningRecommendation algorithm using reinforcement learning
Recommendation algorithm using reinforcement learning
Arithmer Inc.
 
Survey on self supervised image segmentation
Survey on self supervised image segmentationSurvey on self supervised image segmentation
Survey on self supervised image segmentation
Arithmer Inc.
 
dataScienceofPhysics
dataScienceofPhysicsdataScienceofPhysics
dataScienceofPhysics
Arithmer Inc.
 

More from Arithmer Inc. (20)

コーディネートレコメンド
コーディネートレコメンドコーディネートレコメンド
コーディネートレコメンド
 
Test for AI model
Test for AI modelTest for AI model
Test for AI model
 
最適化
最適化最適化
最適化
 
Arithmerソリューション紹介 流体予測システム
Arithmerソリューション紹介 流体予測システムArithmerソリューション紹介 流体予測システム
Arithmerソリューション紹介 流体予測システム
 
Arithmer NLP 自然言語処理 ソリューション紹介
Arithmer NLP 自然言語処理 ソリューション紹介Arithmer NLP 自然言語処理 ソリューション紹介
Arithmer NLP 自然言語処理 ソリューション紹介
 
Arithmer Robo Introduction
Arithmer Robo IntroductionArithmer Robo Introduction
Arithmer Robo Introduction
 
Arithmer AIチャットボット
Arithmer AIチャットボットArithmer AIチャットボット
Arithmer AIチャットボット
 
Arithmer R3 Introduction
Arithmer R3 Introduction Arithmer R3 Introduction
Arithmer R3 Introduction
 
Arithmer Inspection Introduction
Arithmer Inspection IntroductionArithmer Inspection Introduction
Arithmer Inspection Introduction
 
全力解説!Transformer
全力解説!Transformer全力解説!Transformer
全力解説!Transformer
 
Arithmer NLP Introduction
Arithmer NLP IntroductionArithmer NLP Introduction
Arithmer NLP Introduction
 
Introduction of Quantum Annealing and D-Wave Machines
Introduction of Quantum Annealing and D-Wave MachinesIntroduction of Quantum Annealing and D-Wave Machines
Introduction of Quantum Annealing and D-Wave Machines
 
Arithmer OCR Introduction
Arithmer OCR IntroductionArithmer OCR Introduction
Arithmer OCR Introduction
 
Arithmer Dynamics Introduction
Arithmer Dynamics Introduction Arithmer Dynamics Introduction
Arithmer Dynamics Introduction
 
ArithmerDB Introduction
ArithmerDB IntroductionArithmerDB Introduction
ArithmerDB Introduction
 
3D human body modeling from RGB images
3D human body modeling from RGB images3D human body modeling from RGB images
3D human body modeling from RGB images
 
Object Pose Estimation
Object Pose EstimationObject Pose Estimation
Object Pose Estimation
 
Recommendation algorithm using reinforcement learning
Recommendation algorithm using reinforcement learningRecommendation algorithm using reinforcement learning
Recommendation algorithm using reinforcement learning
 
Survey on self supervised image segmentation
Survey on self supervised image segmentationSurvey on self supervised image segmentation
Survey on self supervised image segmentation
 
dataScienceofPhysics
dataScienceofPhysicsdataScienceofPhysics
dataScienceofPhysics
 

Recently uploaded

Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Webinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data WarehouseWebinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data Warehouse
Federico Razzoli
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 

Recently uploaded (20)

Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Webinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data WarehouseWebinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data Warehouse
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 

Summarizing videos with Attention

  • 1. 1 2020/12/15 Summarizing Videos with Attention Arithmer Dynamics Div. Christian Saravia
  • 2. Self-Introduction Christian Martin Saravia Hernandez ● Graduate School ○ Universidad Peruana de Ciencias Aplicadas ○ Bs. Software Engineering ● Former Job ○ Ficha Inc ■ Algorithm Development Engineer ● Research & application of deep learning in computer vision problems ● Current Job ○ Application of machine learning / deep learning to computer vision problems ■ Object detection ■ Object classification ○ Research topics ■ Attention & Transformers
  • 3. Purpose Purpose of this material ● Explore a solution to the task of video summarization using attention.
  • 4. Agenda Contents ● Introduction ○ Motivation ○ Contributions ● Dataset ● VASNet ○ Feature Extraction ○ Attention Network ○ Regressor Network ● Inference ○ Changepoint Detection ○ Kernel Temporal Segmentation ● Results ○ Measuring method ○ Dataset Results
  • 5. Introduction Motivation ● Early video summarization methods were based on unsupervised methods, leveraging low level spatio-temporal features and dimensionality reduction with clustering techniques.Success of these methods solely stands on the ability to define distance/cost functions between the keyshots/frames with respect to the original video. ● Current state of the art methods for video summarization are based on recurrent encoder-decoder architectures, usually with bi- directional LSTM or GRU and soft attention. They are computationally demanding, especially in the bi-directional configuration.
  • 6. Introduction Contribution 1. A novel approach to sequence to sequence transformation for video summarization based on soft, self-attention mechanism. In contrast, current state of the art relies on complex LSTM/GRU encoder-decoder methods. 1. A demonstration that a recurrent network can be successfully replaced with simpler, attention mechanism for the video summarization.
  • 7. Dataset Dataset ● TVSum Dataset: https://github.com/yalesong/tvsum ● SumMe Dataset: https://gyglim.github.io/me/vsum/index.html
  • 8. VASNet Feature Extraction ● Given a time interval t, every 15 frames are collected in an ordered set X ● Each set then is used as input to GoogLeNet for feature extraction ● Then we extract the Pool 5 layer of GoogLeNet, which is a 1024 dimensional array (D = 1024).
  • 9. VASNet Attention Network ● Unnormalized self-attention weight et,i is calculated as an alignment between input feature Xt and the entire input sequence Where, N: Number of frames U, V: Network weight matrices estimated together with other parameters of the network during optimization s: Scale paramenter ● The attention weights αt are true probabilities representing the importance of input features with respect to the desired frame level score at the time t
  • 11. VASNet Regressor Network ● Linear transformation C is then applied to each input and the results then weighted with attention vector αt and averaged. ● The output is a context vector ct which is used for the final frame score regression. ● The context vector ct is then projected by a single layer, fully connected network with linear activation and residual sum followed by dropout and layer normalization.
  • 13. Inference Inference ● The output of the model VASNet is a probability of importance per frame ● This probability must be analyzed in the range of the scene it corresponds ● However to get the number of frames is relative per video ● The problem to find the frames where a change a scene exist is called changepoint detection. ● For the datasets used, the changepoints (cps) are already calculated by using KTS algorithm with hyperparameter tuning
  • 14. Inference Changepoint detection ● In statistical analysis, change detection or change point detection tries to identify times when the probability distribution of a stochastic process or time series changes. In general the problem concerns both detecting whether or not a change has occurred, or whether several changes might have occurred, and identifying the times of any such changes.
  • 15. Inference Kernel Temporal Segmentation (KTS) ● Kernel Temporal Segmentation (KTS) method splits the video into a set of non-intersecting temporal segments. ● It treats the cps detection as a dynamic programming problem. ● The method is fast and accurate when combined with high- dimensional descriptors.
  • 16. Results Measuring method P: Precision R: Recall F Score: [2 * P * R / (P + R)] * 100
  • 18. Results Dataset Results ● Long video: https://youtu.be/873CBVbPJVE ● Summarize: https://youtu.be/weW4memH3Dg ● Full playlist: https://www.youtube.com/playlist?list=PLEdpjt8KmmQMfQEat4HvuIxORwiO9q9DB
  • 19. References Reference ● VASNet: https://arxiv.org/pdf/1812.01969.pdf ● VASNet official implementation: https://github.com/ok1zjf/VASNet ● KTS implementation: https://github.com/TatsuyaShirakawa/KTS ● Video summarization datasets and review: https://hal.inria.fr/hal-01022967/PDF/video_summarization.pdf ● Issue on testing on own videos: https://github.com/ok1zjf/VASNet/issues/2