SlideShare a Scribd company logo
1 of 13
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
Unsupervised video summarization framework using
keyframe extraction and video skimming
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
Video Summarization
Video is one of the robust sources of information and the consumption of online and offline videos has
reached an unprecedented level in the last few years.
Apart from context understanding, it almost impossible to create a universal summarized video for
everyone, as everyone has their own bias of keyframe.
Example: In a soccer game, a coach person might consider those frames which consist of information on
player placement, techniques, etc; however, a person with less knowledge about a soccer game, will focus
more on frames which consist of goals and score-board.
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
Video Summarization problems
A Summarized video must have the following properties:
• It must contain the high priority entities and events from the video.
• The summary should be free of repetition and redundancy.
Failure to exclude these components might lead to misinterpretation of the video from its summarized version.
A summary of video also varies from person to person, so a supervised video summarization can be seen as
personalized recommendation approach, but it requires extensive amount of data labeling by a person and yet
the trained model isn’t a generalized model among mass.
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
Our Approach & Contributions:
● We attempt to solve video summarization through unsupervised learning by employing traditional vision-
based algorithmic methodologies for accurate feature extraction from video frames.
● We have also proposed a deep learning-based feature extraction followed by multiple clustering methods
to find an effective way of summarizing a video by interesting keyframe extraction.
● We have compared the performance of these approaches on the SumMe dataset and showcased that using
deep learning-based feature extraction has been proven to perform better in case of dynamic viewpoint
videos.
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
Our Approach mainly have 2 key parts:
1. Keyframe Extraction via analysing features/interest points from different video frames.
1. Video Skimming: Add +2 and -2 seconds of points around interesting frame.
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
Methods for KeyFrame Extraction
1. Uniform Sampling
1. Image Histogram
1. SIFT(Scale Invariant Feature Transform)
1. VSUMM
1. ResNet16 pre-trained on ImageNet.
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
EXPERIMENTS AND RESULTS
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
Dataset
SumMe dataset which was created to be used as a benchmark for video summarization. The dataset contains
25 videos with the length ranging from one to six minutes. Each of these videos are annotated by at least 15
humans with a total of 390 human summaries. The annotations were collected by crowd sourcing.
The length of all the human generated summaries are restricted to be within 15% of the original video.
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
Evaluation Method
The SumMe dataset provides individual scores to each annotated frames. We evaluate our method by
measuring the F-score from the set of frames that have been selected by our method.
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
Results
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
Conclusion
We can conclude that
1. Gaussian Clustering along with Convolutional Networks can give better performance than other methods
with moving point camera videos.
2. Even Uniform Sampling is giving better result for videos which have stable camera view point and very
less motion.
We can conclude that one single algorithm can’t be solution of video summarization, it is dependent of the
type of video, the motion inside video.
2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020)
[Shruti Jadon] [IEEE Member] [FD5][144]
Thank you!!!
For further queries you can email me at sjadon@umass.edu

More Related Content

Similar to Unsupervised video summarization framework using keyframe extraction and video skimming

Key frame extraction methodology for video annotation
Key frame extraction methodology for video annotationKey frame extraction methodology for video annotation
Key frame extraction methodology for video annotation
IAEME Publication
 
Iaetsd arm based remote surveillance and motion detection
Iaetsd arm based remote surveillance and motion detectionIaetsd arm based remote surveillance and motion detection
Iaetsd arm based remote surveillance and motion detection
Iaetsd Iaetsd
 
A Study on FFmpeg Multimedia Framework
A Study on FFmpeg Multimedia FrameworkA Study on FFmpeg Multimedia Framework
A Study on FFmpeg Multimedia Framework
ijtsrd
 

Similar to Unsupervised video summarization framework using keyframe extraction and video skimming (20)

A survey on Measurement of Objective Video Quality in Social Cloud using Mach...
A survey on Measurement of Objective Video Quality in Social Cloud using Mach...A survey on Measurement of Objective Video Quality in Social Cloud using Mach...
A survey on Measurement of Objective Video Quality in Social Cloud using Mach...
 
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
 
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
 
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
 
Comparison of Cinepak, Intel, Microsoft Video and Indeo Codec for Video Compr...
Comparison of Cinepak, Intel, Microsoft Video and Indeo Codec for Video Compr...Comparison of Cinepak, Intel, Microsoft Video and Indeo Codec for Video Compr...
Comparison of Cinepak, Intel, Microsoft Video and Indeo Codec for Video Compr...
 
IRJET - Smart Parking Guidance System
IRJET -  	  Smart Parking Guidance SystemIRJET -  	  Smart Parking Guidance System
IRJET - Smart Parking Guidance System
 
C44081316
C44081316C44081316
C44081316
 
Video Summarization
Video SummarizationVideo Summarization
Video Summarization
 
Video Stabilization using Python and open CV
Video Stabilization using Python and open CVVideo Stabilization using Python and open CV
Video Stabilization using Python and open CV
 
Key frame extraction methodology for video annotation
Key frame extraction methodology for video annotationKey frame extraction methodology for video annotation
Key frame extraction methodology for video annotation
 
Iaetsd arm based remote surveillance and motion detection
Iaetsd arm based remote surveillance and motion detectionIaetsd arm based remote surveillance and motion detection
Iaetsd arm based remote surveillance and motion detection
 
Real Time Head Generation for Video Conferencing
Real Time Head Generation for Video ConferencingReal Time Head Generation for Video Conferencing
Real Time Head Generation for Video Conferencing
 
M.tech Third progress Presentation
M.tech Third progress PresentationM.tech Third progress Presentation
M.tech Third progress Presentation
 
IRJET- Post Silicon Validation of In-Vehicle Infotainment ECU
IRJET- Post Silicon Validation of In-Vehicle Infotainment ECUIRJET- Post Silicon Validation of In-Vehicle Infotainment ECU
IRJET- Post Silicon Validation of In-Vehicle Infotainment ECU
 
IRJET - Automated Fraud Detection Framework in Examination Halls
 IRJET - Automated Fraud Detection Framework in Examination Halls IRJET - Automated Fraud Detection Framework in Examination Halls
IRJET - Automated Fraud Detection Framework in Examination Halls
 
Virtual Classroom(Android Application for Accessing Server using Wi-Fi Services)
Virtual Classroom(Android Application for Accessing Server using Wi-Fi Services)Virtual Classroom(Android Application for Accessing Server using Wi-Fi Services)
Virtual Classroom(Android Application for Accessing Server using Wi-Fi Services)
 
40120130405002
4012013040500240120130405002
40120130405002
 
IoT Based Advertising System
IoT Based Advertising SystemIoT Based Advertising System
IoT Based Advertising System
 
A Study on FFmpeg Multimedia Framework
A Study on FFmpeg Multimedia FrameworkA Study on FFmpeg Multimedia Framework
A Study on FFmpeg Multimedia Framework
 
Real-Time Video Copy Detection in Big Data
Real-Time Video Copy Detection in Big DataReal-Time Video Copy Detection in Big Data
Real-Time Video Copy Detection in Big Data
 

Recently uploaded

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Recently uploaded (20)

Quantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation ComputingQuantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation Computing
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Choreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software EngineeringChoreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software Engineering
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
API Governance and Monetization - The evolution of API governance
API Governance and Monetization -  The evolution of API governanceAPI Governance and Monetization -  The evolution of API governance
API Governance and Monetization - The evolution of API governance
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Modernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using BallerinaModernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using Ballerina
 
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
 

Unsupervised video summarization framework using keyframe extraction and video skimming

  • 1. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144] Unsupervised video summarization framework using keyframe extraction and video skimming
  • 2. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144] Video Summarization Video is one of the robust sources of information and the consumption of online and offline videos has reached an unprecedented level in the last few years. Apart from context understanding, it almost impossible to create a universal summarized video for everyone, as everyone has their own bias of keyframe. Example: In a soccer game, a coach person might consider those frames which consist of information on player placement, techniques, etc; however, a person with less knowledge about a soccer game, will focus more on frames which consist of goals and score-board.
  • 3. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144] Video Summarization problems A Summarized video must have the following properties: • It must contain the high priority entities and events from the video. • The summary should be free of repetition and redundancy. Failure to exclude these components might lead to misinterpretation of the video from its summarized version. A summary of video also varies from person to person, so a supervised video summarization can be seen as personalized recommendation approach, but it requires extensive amount of data labeling by a person and yet the trained model isn’t a generalized model among mass.
  • 4. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144] Our Approach & Contributions: ● We attempt to solve video summarization through unsupervised learning by employing traditional vision- based algorithmic methodologies for accurate feature extraction from video frames. ● We have also proposed a deep learning-based feature extraction followed by multiple clustering methods to find an effective way of summarizing a video by interesting keyframe extraction. ● We have compared the performance of these approaches on the SumMe dataset and showcased that using deep learning-based feature extraction has been proven to perform better in case of dynamic viewpoint videos.
  • 5. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144] Our Approach mainly have 2 key parts: 1. Keyframe Extraction via analysing features/interest points from different video frames. 1. Video Skimming: Add +2 and -2 seconds of points around interesting frame.
  • 6. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144]
  • 7. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144] Methods for KeyFrame Extraction 1. Uniform Sampling 1. Image Histogram 1. SIFT(Scale Invariant Feature Transform) 1. VSUMM 1. ResNet16 pre-trained on ImageNet.
  • 8. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144] EXPERIMENTS AND RESULTS
  • 9. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144] Dataset SumMe dataset which was created to be used as a benchmark for video summarization. The dataset contains 25 videos with the length ranging from one to six minutes. Each of these videos are annotated by at least 15 humans with a total of 390 human summaries. The annotations were collected by crowd sourcing. The length of all the human generated summaries are restricted to be within 15% of the original video.
  • 10. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144] Evaluation Method The SumMe dataset provides individual scores to each annotated frames. We evaluate our method by measuring the F-score from the set of frames that have been selected by our method.
  • 11. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144] Results
  • 12. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144] Conclusion We can conclude that 1. Gaussian Clustering along with Convolutional Networks can give better performance than other methods with moving point camera videos. 2. Even Uniform Sampling is giving better result for videos which have stable camera view point and very less motion. We can conclude that one single algorithm can’t be solution of video summarization, it is dependent of the type of video, the motion inside video.
  • 13. 2020 5th IEEE International Conference on Computing, Communication and Automation (ICCCA 2020) [Shruti Jadon] [IEEE Member] [FD5][144] Thank you!!! For further queries you can email me at sjadon@umass.edu