Mtech First progress PRESENTATION ON VIDEO SUMMARIZATION

SEPTEMBER 26th, 2017
Neeraj Baghel
M.Tech, 178150005
Under the Supervision of
Prof. Charul Bhatnagar
Professor, Dept. of CEA
GLA University, Mathura

OUTLINE
 Video Summarization
 Types of Video Summarization
 Applications
 Issues & Challenges
 Tools & Datasets
 Journals & Conferences
 Researchers & Groups
 References

Video Summarization
Video
• Video data is a great asset for information extraction and knowledge discovery.
• Due to its size and variability, it is extremely hard for users to monitor. [4]
Video Summarization
• Intelligent video summarization algorithms allow us to quickly browse a lengthy video by capturing the essence and removing redundant information. [4]
Fig 1: Video Summarization Work Flow [9]
[4] Sharghi, Aidean, Jacob S. Laurel, and Boqing Gong. "Query-focused video summarization: Dataset, evaluation, and a memory network based approach." The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017.
[9] https://www.slideshare.net/MikolajLeszczuk/results-on-video-summarization (D.L.V 01/09/18)

Types of Video Summarization
A video can be summarized in two different ways: key frame extraction and video skims.
Fig 2: Video Summarization Technique Classification [7]
[7] Mundur, Padmavathi, Yong Rao, and Yelena Yesha. "Keyframe-based video summarization using Delaunay clustering." International Journal on Digital Libraries 6.2 (2006): 219-232. (D.L.V 20/08/18)

Key Frame Extraction
Fig 3: Key Frame Extraction [8]
[8] Souza, Celso L. de, et al. "A unified approach to content-based indexing and retrieval of digital videos from
television archives." (2014). (D.L.V 05/09/18)
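
Key frame extraction can be illustrated with a deliberately simple rule: keep a frame when it differs enough from the last frame that was kept. The sketch below is an assumption made for illustration, not the method of [7] or [8]. Frames are toy lists of grayscale pixel values, and the names `frame_distance`, `extract_key_frames`, and the `threshold` value are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[float]  # a frame flattened to grayscale pixel intensities


def frame_distance(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def extract_key_frames(frames: List[Frame], threshold: float) -> List[int]:
    """Return indices of frames that differ from the last kept frame by more than threshold."""
    if not frames:
        return []
    key_indices = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        last_kept = frames[key_indices[-1]]
        if frame_distance(frames[i], last_kept) > threshold:
            key_indices.append(i)
    return key_indices


# Toy video: two static "scenes" of three frames each (assumed data).
video = [[0, 0, 0, 0]] * 3 + [[200, 200, 200, 200]] * 3
key_frames = extract_key_frames(video, threshold=50.0)
print(key_frames)
```

With real video, the pixel lists would come from a decoder, and the distance would more likely be computed on color histograms or learned features than on raw pixels.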

Video Skims
• A video skim is also called a moving-image abstract, moving storyboard, or summary sequence.
• The original video is segmented into parts, each of which is a shorter video clip.
[11] https://www.cs.cmu.edu/~msmith/skim_homepage.html
Fig 4: Automated Video Skimming Informedia Digital Video Library Project [11]
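
As a rough illustration of how a skim could be assembled from segments, the sketch below greedily picks the highest-scoring segments that fit a duration budget and plays them in their original order. This is an assumed selection rule, not the Informedia method of [11]. The `Segment` class, the `score` values, and `build_skim` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    score: float  # hypothetical importance score, higher is better

    @property
    def duration(self) -> float:
        return self.end - self.start


def build_skim(segments: List[Segment], budget: float) -> List[Segment]:
    """Greedily pick the highest-scoring segments that fit in the budget (seconds)."""
    chosen: List[Segment] = []
    remaining = budget
    for seg in sorted(segments, key=lambda s: s.score, reverse=True):
        if seg.duration <= remaining:
            chosen.append(seg)
            remaining -= seg.duration
    # Play the chosen clips in their original temporal order.
    return sorted(chosen, key=lambda s: s.start)


# Assumed toy segmentation of a 50-second video.
segments = [
    Segment(0, 10, 0.2),
    Segment(10, 25, 0.9),
    Segment(25, 30, 0.5),
    Segment(30, 50, 0.7),
]
skim = build_skim(segments, budget=20)
print([(s.start, s.end) for s in skim])
```

A greedy pick is not guaranteed to be optimal; it is used here only because it keeps the idea of "shorter clips under a length limit" easy to follow.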

Applications
The applications of video summarization fall into three main categories:
1) Consumer Video Applications
 Browsing the recorded content
 View the interesting parts quickly
Fig 4: View The Interesting Parts Quickly [12]
[12] https://www.youtube.com/watch?v=OHAWwaYu2H0&t=46s (D.L.V 20/09/18)

Cont…
2) Image-Video Databases Management
 Video search engine
 Digital video library
 Object indexing and retrieval
 Automatic object labeling
Fig 5: Digital video library [13]
[13] https://www.searchenginejournal.com/deep-learning-powers-video-seo/175145/ (D.L.V 21/09/18)

Cont…
3) Surveillance
 Outdoor Perimeter Security
 Internet Security Systems
 Parking Lots
 Traffic Monitoring
Fig 6: Traffic Monitoring [14]
Fig 7: Outdoor Perimeter Security [14]
[14] https://www.framos.com/en/solutions/mobility/ (D.L.V 21/09/18)

Issues and Challenges
Some general issues and challenges related to video summarization:
 Loss of information
 Computationally expensive
 Difficulty of evaluating the performance of a video summarizer
 No single video summarizer fits all users
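
To make the evaluation issue concrete, one commonly used option is to compare the frames a summarizer selects with the frames a user selected, using precision, recall, and F-measure. The sketch below assumes that setup; the slides do not prescribe a metric, and the function name and frame sets are hypothetical.

```python
from typing import Set, Tuple


def summary_f_measure(predicted: Set[int], reference: Set[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) for two sets of selected frame indices."""
    overlap = len(predicted & reference)
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Assumed toy data: frame indices chosen by a summarizer and by one user.
predicted = {0, 1, 2, 3}
reference = {2, 3, 4, 5, 6, 7}
precision, recall, f_measure = summary_f_measure(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
```

Because different users pick different frames, a score against a single reference says little on its own, which is one reason no single summarizer fits all users.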

Tools
 Matlab
Matlab is a commercial product that is widely used in the image/video processing community. It has an adequate image processing toolbox, as well as toolboxes for Kalman filters, neural networks, genetic algorithms, and so on. It runs on most Unix variants, including Linux, and on Windows 95/NT. For people researching vision algorithms, the lack of source code is a serious drawback.
 OpenCV
OpenCV is a library of programming functions mainly aimed at real-time computer vision. It was originally developed by Intel. The library is cross-platform and free for use under the open-source BSD license.

Datasets
 UT Egocentric (UTE)
The dataset contains 4 videos from head-mounted cameras, each about 3-5 hours long. (Size: 1.4 GB)
 SumMe
The dataset consists of 25 videos which are single-shot and range in length
from 1-6 minutes. The dataset contains summaries created by 15 to 18
users with the constraint in length being that the summaries should be 5%
to 15% of the original video. (Size: 2.2 GB)
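
The SumMe length constraint is simple arithmetic, worked through below for the shortest and longest videos mentioned (1 and 6 minutes). The helper name `summary_length_bounds` is hypothetical; the 5% and 15% limits are the ones stated above.

```python
from typing import Tuple


def summary_length_bounds(video_seconds: float, low_pct: int = 5, high_pct: int = 15) -> Tuple[float, float]:
    """Return the shortest and longest allowed summary length in seconds."""
    return video_seconds * low_pct / 100, video_seconds * high_pct / 100


for minutes in (1, 6):
    shortest, longest = summary_length_bounds(minutes * 60)
    print(f"{minutes}-minute video: summary between {shortest:.0f} s and {longest:.0f} s")
```

So a 1-minute video allows a summary of roughly 3 to 9 seconds, and a 6-minute video allows roughly 18 to 54 seconds.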

Datasets Cont…
 YouTube-8M
YouTube-8M is a large-scale labeled video dataset that consists of millions of YouTube video IDs and associated labels from a diverse vocabulary of 4700+ visual entities. Videos are selected by the following criteria:
• Each video must be public and have at least 1000 views
• Each video must be between 120 and 500 seconds long
• Each video must be associated with at least one entity from the target vocabulary
• Adult & sensitive content is removed (as determined by automated classifiers)
May 2018 version (current): 6.1M videos, 3862 classes, 3.0 labels/video, 2.6B audio-visual features
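
The selection criteria listed for YouTube-8M can be read as a filter over candidate videos. The sketch below expresses them that way; it is not the actual YouTube-8M pipeline. The `CandidateVideo` record, its field names, and the tiny vocabulary are hypothetical, and the sensitive-content flag stands in for the automated classifiers.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class CandidateVideo:
    """Hypothetical record for one candidate video; not the real YouTube-8M schema."""
    video_id: str
    is_public: bool
    views: int
    duration_seconds: int
    entities: List[str] = field(default_factory=list)
    flagged_sensitive: bool = False  # stand-in for an automated classifier's decision


def meets_criteria(video: CandidateVideo, target_vocabulary: Set[str]) -> bool:
    """Apply the four selection rules listed on the slide."""
    has_target_entity = any(e in target_vocabulary for e in video.entities)
    return (
        video.is_public
        and video.views >= 1000
        and 120 <= video.duration_seconds <= 500
        and has_target_entity
        and not video.flagged_sensitive
    )


vocabulary = {"guitar", "football", "cooking"}  # assumed toy vocabulary
candidates = [
    CandidateVideo("a", True, 5000, 300, ["guitar"]),
    CandidateVideo("b", True, 200, 300, ["guitar"]),     # too few views
    CandidateVideo("c", True, 5000, 900, ["football"]),  # too long
    CandidateVideo("d", False, 5000, 300, ["cooking"]),  # not public
    CandidateVideo("e", True, 5000, 300, ["knitting"]),  # no target entity
]
selected = [v.video_id for v in candidates if meets_criteria(v, vocabulary)]
print(selected)
```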

Datasets Cont…
 MED Summaries
The "MED Summaries" dataset is intended for the evaluation of dynamic video summaries. It contains annotations of 160 videos: a validation set of 60 videos and a test set of 100 videos. There are 10 event categories in the test set. The currently available dataset is from 235 users; all images are in bitmap (*.bmp) format. The resolution of these images is 800 x 600 pixels. (Size: 12 GB)

Journals
 IEEE Transactions on Pattern Analysis and Machine Intelligence
 IEEE Transactions on Image Processing
 SPRINGER - IPSJ Transactions on Computer Vision and Applications (CVA)
 ELSEVIER- Computer Vision and Image Understanding
 ELSEVIER-Pattern Recognition
 IJCV - International Journal of Computer Vision
 IJIPA- International Journal of Image Processing and Applications
 IET- The Institution of Engineering and Technology

Conferences
 IEEE/CVF Conference on Computer Vision and Pattern Recognition
(CVPR)
 IEEE International Conference on Image Processing (ICIP)
 IEEE/CVF International Conference on Computer Vision (ICCV)
 IEEE Winter Conference on Applications of Computer Vision (WACV)
 ACCV - Asian Conference on Computer Vision
 ECCV - European Conference on Computer Vision
 CVIP - International Conference on Computer Vision and Image Processing, India
 NCVPRIPG - National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics, India

Research Group
Stanford Computer Vision Lab
Fei-Fei Li
Professor; Director, Stanford AI Lab
Computer Science Department
Feifeili@cs.stanford.edu
Animesh Garg
Professor, Stanford AI Lab
Computer Science Department
garg@cs.stanford.edu

Research Group
Center for Research in Computer Vision
Aidean Sharghi
Center for Research in Computer Vision,
University of Central Florida
aidean.sharghi@gmail.com
Boqing Gong
Assistant Professor
Center for Research in Computer Vision
Department of Computer Science
boqingGo@outlook.com

Research Group
Abhishek Sarkar
Senior Research Scientist
International Institute of Information Technology
Hyderabad, INDIA
Abhishek.sarkar@iiit.ac.in
Dr. C. V. Jawahar
Researcher,
Hyderabad, INDIA
jawahar@iiit.ac.in

References
[1] Song, Yale, et al. "TVSum: Summarizing web videos using titles." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
[2] Zhuang, Yueting, Ruogui Xiao, and Fei Wu. "Key issues in video summarization and its application." Information, Communications and Signal Processing, 2003 and Fourth Pacific Rim Conference on Multimedia. Proceedings of the 2003 Joint Conference of the Fourth International Conference on. Vol. 1. IEEE, 2003.
[3] Kansagara, Ravi, Darshak Thakore, and Mahasweta Joshi. "A study on video summarization techniques." International Journal of Innovative Research in Computer and Communication Engineering 2 (2014).
[4] Sharghi, Aidean, Jacob S. Laurel, and Boqing Gong. "Query-focused video summarization: Dataset, evaluation, and a memory network based approach." The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017.
[5] Ramesh, Animesh, et al. "Video Summarization: An Overview of Techniques."
[6] Sabbar, W., A. Chergui, and A. Bekkhoucha. "Video summarization using shot segmentation and local motion estimation." Innovative Computing Technology (INTECH), 2012 Second International Conference on, pp. 190-193, 18-20 Sept. 2012.
[7] Mundur, Padmavathi, Yong Rao, and Yelena Yesha. "Keyframe-based video summarization using Delaunay clustering." International Journal on Digital Libraries 6.2 (2006): 219-232.
[8] Souza, Celso L. de, et al. "A unified approach to content-based indexing and retrieval of digital videos from television archives." (2014).
[9] https://www.slideshare.net/MikolajLeszczuk/results-on-video-summarization
[10] Landy, Michael S., Yoav Cohen, and George Sperling. "HIPS: A Unix-based image processing system." Computer Vision, Graphics, and Image Processing 25.3 (1984): 331-347.
[11] https://www.cs.cmu.edu/~msmith/skim_homepage.html
[12] https://www.youtube.com/watch?v=OHAWwaYu2H0&t=46s
[13] https://www.searchenginejournal.com/deep-learning-powers-video-seo/175145/
[14] https://www.framos.com/en/solutions/mobility/