5. Detection of “Talking Head” Shots (1/2)
» Based on Mouth Region of Interest processing
» Processed shot-by-shot
[Diagram: Face detection, Mouth movement detection, Cascade classifier; outputs: Not Talking Head / Talking Head]
6. Detection of “Talking Head” Shots (2/2)
» Face detection using Haar Cascades
» Sensitivity 88%, Specificity 100%
» Integrated
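As a reminder of what the two reported figures measure, here is a minimal sketch of how sensitivity and specificity can be computed from per-shot labels. The function name and the sample labels are made up for illustration; they are not the project's code or data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    True means "talking head", False means "not talking head".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # talking heads found
    fn = sum(1 for t, p in pairs if t and not p)      # talking heads missed
    tn = sum(1 for t, p in pairs if not t and not p)  # other shots rejected
    fp = sum(1 for t, p in pairs if not t and p)      # false alarms
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Invented labels for eight shots (illustration only).
truth = [True, True, True, True, False, False, False, False]
pred = [True, True, True, False, False, False, False, False]
sens, spec = sensitivity_specificity(truth, pred)
print(sens, spec)
```

A specificity of 100% means no non-talking-head shot was labelled as a talking head; a sensitivity below 100% means some talking-head shots were missed.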
7. Detection of Day & Night Shots
» Based on a neural network
» Tested on >2000 photos
» Efficiency >90%
» Integrated
8. Video Quality Indicators
» Video quality assessment system for video sequences
» Quality of Experience (QoE)
» 13 quality parameters
» Temporal Activity (TA)
» Spatial Activity (SA)
» Integrated
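The slides do not define the Spatial Activity and Temporal Activity indicators. The sketch below shows one common reading, in the spirit of the spatial and temporal information measures of ITU-T P.910: SA as the standard deviation of the Sobel gradient magnitude of a frame, TA as the standard deviation of the pixel-wise difference between consecutive frames. Frames are assumed to be small 2D lists of luminance values; the function names are hypothetical, and the project's actual indicators may be computed differently.

```python
from statistics import pstdev


def sobel_magnitude(frame):
    """Sobel gradient magnitude at every interior pixel of a 2D luma frame."""
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (frame[y - 1][x + 1] + 2 * frame[y][x + 1] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y][x - 1] - frame[y + 1][x - 1])
            gy = (frame[y + 1][x - 1] + 2 * frame[y + 1][x] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y - 1][x] - frame[y - 1][x + 1])
            out.append((gx * gx + gy * gy) ** 0.5)
    return out


def spatial_activity(frame):
    """Assumed SA: spread of edge strength within one frame."""
    return pstdev(sobel_magnitude(frame))


def temporal_activity(prev_frame, curr_frame):
    """Assumed TA: spread of the pixel-wise difference between two frames."""
    diffs = [c - p
             for row_p, row_c in zip(prev_frame, curr_frame)
             for p, c in zip(row_p, row_c)]
    return pstdev(diffs)


# Tiny synthetic frames (illustration only).
flat = [[10] * 5 for _ in range(5)]                 # no detail at all
edge = [[0, 0, 0, 255, 255] for _ in range(5)]      # one vertical edge
moved = [[0, 0, 255, 255, 255] for _ in range(5)]   # edge shifted by one pixel

print(spatial_activity(flat), spatial_activity(edge))
print(temporal_activity(edge, edge), temporal_activity(edge, moved))
```

Under this reading a flat frame has zero SA and a static shot has zero TA, while detailed frames and moving content score higher.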
9. Recognition of Events for the Purpose of Summarizing Video Sequences
» Creation & implementation of algorithms to recognize motions/gestures & other events in video sequences
» Pending
By Comixboy at English Wikipedia, CC BY 2.5,
https://commons.wikimedia.org/w/index.php?curid=9672553
10. Database Statistics
» Number of videos indexed – 5423
» Number of frames indexed – 27 384 115
» Features indexed:
– Shot Boundary Detection
– 13 Video Quality Indicators
– Spatial Activity
– Temporal Activity
» Features pending (expected May 2017):
– Automatic Speech Recognition
– Day/Night
15. Evaluation of Multimedia Content Summarisation Algorithms
» Together with DEUSTO
» Review of State-of-the-Art
» Collaboration with
Video Quality Experts Group
– Project: Quality Assessment for Recognition and Task-based multimedia applications (QART)
– Meeting in May 2017
» Pending
Editor's Notes
Face should be large enough (at least 3% of the scene size)
One face only (exactly one face in at least 90% of the frames in the scene)
Mouth open/closed ratio of 20% or higher
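The three criteria above can be combined into a per-shot decision rule. The following is a minimal sketch under stated assumptions, not the project's implementation: `FrameObservation` and `is_talking_head` are hypothetical names, the face-size check uses the mean face area over single-face frames, and the open/closed ratio is read as the number of mouth-open frames divided by the number of mouth-closed frames. The notes do not say how either quantity is actually measured.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameObservation:
    # Area of each detected face divided by the frame area (hypothetical input).
    face_area_fractions: List[float]
    # Whether the mouth is judged open in this frame (hypothetical input).
    mouth_open: bool


def is_talking_head(frames,
                    min_face_fraction=0.03,       # face at least 3% of the scene
                    min_single_face_share=0.90,   # one face in at least 90% of frames
                    min_open_closed_ratio=0.20):  # open/closed ratio of 20% or higher
    """Apply the three thresholds from the notes to one shot."""
    if not frames:
        return False
    single = [f for f in frames if len(f.face_area_fractions) == 1]
    if len(single) / len(frames) < min_single_face_share:
        return False
    # Assumption: judge face size by its mean over the single-face frames.
    mean_area = sum(f.face_area_fractions[0] for f in single) / len(single)
    if mean_area < min_face_fraction:
        return False
    # Assumption: ratio = mouth-open frames / mouth-closed frames.
    open_count = sum(1 for f in single if f.mouth_open)
    closed_count = len(single) - open_count
    if closed_count == 0:
        return open_count > 0
    return open_count / closed_count >= min_open_closed_ratio


# Invented shots (illustration only).
speaker = [FrameObservation([0.05], i < 3) for i in range(10)]
crowd = [FrameObservation([0.05, 0.04], True) for _ in range(10)]
distant = [FrameObservation([0.01], i < 3) for i in range(10)]
print(is_talking_head(speaker), is_talking_head(crowd), is_talking_head(distant))
```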
Suppose we have a video of a real event, for example the election in France. The entire recording is 15 minutes long and we want to shorten it to 1.5 minutes. The algorithm cuts and processes the video. Now we want to measure how much of the content of the original video made it into the summary. Some people (for example, researchers from the project) watch the full 15 minutes and write a summary stating the most important things they learned. It would be better if they were journalists rather than engineers. We can then ask viewers to write down what they learned from the summaries and apply text mining, or check whether what they wrote agrees with the facts described by the professionals, or ask specialists to prepare questions and have viewers take a test after watching. In each of these cases there is the problem of what viewers already knew before watching the summary, which needs to be addressed in some way.
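One way to make this concrete is sketched below. It is only an illustration of the idea, not an agreed evaluation protocol: reference facts are represented as keyword lists, a viewer's free-text answer "covers" a fact when all its keywords appear, and facts the viewer already mentioned in a questionnaire filled in before watching are excluded as a crude correction for prior knowledge. The function names, the keyword matching, and the sample texts are all hypothetical.

```python
import re


def covered_facts(reference_facts, answer_text):
    """Ids of reference facts whose keywords all occur in the answer text."""
    words = set(re.findall(r"\w+", answer_text.lower()))
    return {fact_id
            for fact_id, keywords in reference_facts.items()
            if all(k.lower() in words for k in keywords)}


def summary_coverage(reference_facts, after_text, before_text=""):
    """Share of not-previously-known reference facts reported after viewing."""
    known_before = covered_facts(reference_facts, before_text)
    known_after = covered_facts(reference_facts, after_text)
    learnable = set(reference_facts) - known_before
    if not learnable:
        return 0.0
    return len(known_after - known_before) / len(learnable)


# Invented reference facts and answers (illustration only).
facts = {
    "f1": ["candidate", "won"],
    "f2": ["turnout", "fell"],
    "f3": ["second", "round"],
}
before = "I heard that turnout fell."
after = "The candidate won, and turnout fell."
print(summary_coverage(facts, after, before))
```

Exact keyword matching is deliberately simplistic; a real study would need a more robust text-mining step and a careful design for the pre-viewing questionnaire.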