Joint Action Recognition and Summarization
by Submodular Inference
Fairouz Hussein, Massimo Piccardi
University of Technology Sydney
UTS MMSP workshop, 12 April 2017
Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference
Aims
• Action recognition in video and video summarization are
two well-established research areas
• In this work, we explore the benefits of performing them
jointly
• We leverage submodular inference throughout
• Work published at ICASSP 2016 and in press in ACM
TOMM
Action recognition
• Action recognition in video is a well-established research
area
• Leverages local features (STIP, DTF, ITF, MBH, bags of
concepts, deep learning . . . )
• Has reached remarkable accuracy, even in realistic
scenarios
• At short distance, depth cameras are helping achieve
greater accuracy (e.g., Stereolabs ZED)
Video summarization
• Video summarization is also a well-established research
area
• Summary as a set or sequence of frames
• Leverages clustering, shot detection or “key” frame
detection
Joint action recognition and video summarization
• Action recognition and video summarization can be
performed independently, in cascade . . .
• Recognising actions from a subset of key frames is an
established practice
• However, do the key frames meet the requirements of a
good summary, i.e. coverage and non-redundancy?
• Here, we attempt to perform them jointly with a single,
unified scoring function
Our graphical model
[Figure: graphical model with three layers of nodes: action (y), summary (h_1 ... h_T) and measurements (x_1 ... x_T)]
• $y$: action class; $h$: binary variables, with $h_t = 1$ → frame $t$ is in
the summary; $x$: frame measurements
• Scoring function:
$$w^\top \psi(x, y, h) = w^\top \sum_{i,j=1,\, j \neq i}^{T} \phi(x_i, x_j, h_i, h_j, y)$$
Inference
• Given a trained model, w, and a measurement sequence,
x, how to find the best y, h?
• Inference:
$$(y^*, h^*) = \arg\max_{y,h} \; w^\top \psi(x, y, h)$$
• Inference is not just left-to-right because nodes are all
connected
• Number of possible summaries = $2^T$; 1 min video
→ $T \approx 1{,}800$
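To get a feel for the size of this search space, the short Python check below counts the decimal digits of 2^T for T = 1,800. It is only an illustration of why exhaustive search is ruled out; the variable names are ours, not from the slides.

```python
# Size of the summary search space for a T-frame video: each frame is
# either in or out of the summary, giving 2**T candidate summaries.
T = 1800
num_summaries = 2 ** T                  # exact, thanks to Python big integers
num_digits = len(str(num_summaries))    # how many decimal digits that number has
print(f"2^{T} has {num_digits} decimal digits")
```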
Submodularity
• Inference is rescued by submodularity
• A submodular function abides by the “law of diminishing
returns”: for nested sets $B \subseteq A$ and any additional set $V$,
$$f(A \cup V) - f(A) \le f(B \cup V) - f(B)$$
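The diminishing-returns property can be checked by brute force on a small example. The sketch below assumes the convention B ⊆ A (the smaller set gains at least as much from a new element) and uses a toy coverage function over made-up data; `covers`, `f` and `is_submodular` are illustrative names, not part of the presented method.

```python
from itertools import combinations

# Toy ground set: each "frame" covers a set of visual concepts (made-up data).
covers = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}, 3: {"a", "d", "e"}}

def f(selected):
    """Coverage function: number of distinct concepts covered by `selected`."""
    covered = set()
    for i in selected:
        covered |= covers[i]
    return len(covered)

def is_submodular(func, ground):
    """Check func(A + v) - func(A) <= func(B + v) - func(B) for all B subset of A, v not in A."""
    subsets = [frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    for A in subsets:
        for B in subsets:
            if not B <= A:          # only nested pairs B ⊆ A are relevant
                continue
            for v in ground - A:
                if func(A | {v}) - func(A) > func(B | {v}) - func(B):
                    return False
    return True

ground = frozenset(covers)
print(is_submodular(f, ground))
```

The exhaustive check is exponential in the size of the ground set, so it is only meant for toy cases like this one.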
Submodular inference
• The maximum achieved by a greedy algorithm over a
monotonic submodular function is at least $(1 - 1/e) \approx 0.632$ times the actual
maximum [Nemhauser et al. 1978]
• The complexity of the greedy algorithm is $O(T^2)$ vs $O(2^T)$
• We further constrain the summary to a maximum number
of frames (budget):
$$(y^*, h^*) = \arg\max_{y,h} \; w^\top \psi(x, y, h) \quad \text{s.t. } |h| = B$$
• Complexity drops to O(BT)
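The budgeted greedy search can be sketched as follows. This is a minimal illustration on a generic set function, not the authors' implementation: `greedy_select`, `coverage` and the toy `covers` data are assumptions made for the example.

```python
def greedy_select(score, T, B):
    """Pick B of the T frames, each round adding the frame with the largest marginal gain."""
    selected = []
    for _ in range(B):
        current = score(selected)
        candidates = [t for t in range(T) if t not in selected]
        # B rounds x at most T candidates = O(BT) score evaluations
        best = max(candidates, key=lambda t: score(selected + [t]) - current)
        selected.append(best)
    return sorted(selected)

# Toy monotone submodular score: number of distinct "concepts" covered (made-up data).
covers = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}]

def coverage(selected):
    return len(set().union(*(covers[t] for t in selected)))

summary = greedy_select(coverage, T=len(covers), B=2)
print(summary, coverage(summary))
```

Ties are broken in favour of the lowest frame index, which is a choice of this sketch only.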
Submodular score function: summary
• For the summary, we use a submodular score function:
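$$\phi(x_i, x_j, h_i, h_j) = \lambda(h_i, h_j)\, s(x_i, x_j)$$
$$\lambda(h_i, h_j) = \begin{cases} \lambda_1, & h_i = 1,\ h_j = 0 \quad \text{(coverage)} \\ -\lambda_2, & h_i = 1,\ h_j = 1 \quad \text{(non-redundancy)} \\ 0, & h_i = 0,\ h_j = 0 \end{cases}$$
$$\lambda_1,\ \lambda_2,\ s(x_i, x_j) \ge 0$$
• $s(x_i, x_j)$ is a similarity between frames $i$ and $j$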
• It is easy to prove that this function is submodular
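As an illustration, the summary score can be evaluated directly from a similarity matrix. The sketch below is an assumption-laden example: `S` is a made-up similarity matrix, the default weights are arbitrary, and pairs with h_i = 0, h_j = 1 are given weight 0 because the slide does not list that case.

```python
def summary_score(S, h, lam1=1.0, lam2=1.0):
    """Sum of lambda(h_i, h_j) * s(x_i, x_j) over all ordered pairs with j != i."""
    T = len(h)
    total = 0.0
    for i in range(T):
        for j in range(T):
            if i == j or h[i] == 0:
                continue                     # h_i = 0 contributes nothing (assumption for h_j = 1)
            if h[j] == 0:
                total += lam1 * S[i][j]      # coverage: selected frame i represents unselected j
            else:
                total -= lam2 * S[i][j]      # non-redundancy: penalise similar selected frames
    return total

# Made-up symmetric similarities between three frames; frames 0 and 1 are near-duplicates.
S = [[0.0, 0.9, 0.1],
     [0.9, 0.0, 0.2],
     [0.1, 0.2, 0.0]]

print(summary_score(S, [1, 1, 0]))   # two near-duplicate frames: penalised
print(summary_score(S, [1, 0, 1]))   # two dissimilar frames: rewarded
```

Selecting the two near-duplicate frames scores lower than selecting two dissimilar ones, which is the intended effect of the non-redundancy term.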
Submodular score function: summary + action
• For action recognition, we add a unary term:
$$\psi(x, y, h) = \sum_{i,j=1,\, j \neq i}^{T} \phi(x_i, x_j, h_i, h_j, y) + \underbrace{\sum_{i=1}^{T} \lambda_3\, I[h_i = 1, y]\, x_i}_{\text{action}}$$
• In the ICASSP paper, we prove that this function is still
submodular
• $w \ge 0$ → non-negative weighted sum of submodular functions, which is itself submodular
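Joint inference over the class and the summary can be sketched by running the greedy search once per class and keeping the best-scoring pair. In this hedged example `unary[y][i]` stands in for the class-dependent action term of frame i, `S` is a made-up similarity matrix, and pairs with h_i = 0 are given zero weight; none of these names come from the slides.

```python
def joint_score(S, u, selected, lam1, lam2):
    """Pairwise summary term plus unary action term for the frames in `selected`."""
    chosen = set(selected)
    total = sum(u[i] for i in chosen)                      # unary (action) term
    for i in chosen:
        for j in range(len(S)):
            if j == i:
                continue
            total += -lam2 * S[i][j] if j in chosen else lam1 * S[i][j]
    return total

def joint_inference(S, unary, B, lam1=1.0, lam2=1.0):
    """Return (best class, summary frames, score), running one greedy search per class."""
    best = None
    for y, u in enumerate(unary):
        selected = []
        for _ in range(B):
            candidates = [t for t in range(len(S)) if t not in selected]
            selected.append(max(
                candidates,
                key=lambda t: joint_score(S, u, selected + [t], lam1, lam2)))
        score = joint_score(S, u, selected, lam1, lam2)
        if best is None or score > best[2]:
            best = (y, sorted(selected), score)
    return best

S = [[0.0, 0.9, 0.1],
     [0.9, 0.0, 0.2],
     [0.1, 0.2, 0.0]]
unary = [[0.0, 0.0, 0.0],      # class 0: no frame supports it
         [2.0, 0.0, 2.0]]      # class 1: frames 0 and 2 support it
y_star, h_star, score_star = joint_inference(S, unary, B=2)
print(y_star, h_star, round(score_star, 3))
```

The cost is one greedy pass per class, so it grows linearly with the number of action classes.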
Learning
• Learning: structural SVM
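• Given a training set $\{x^n, h^n, y^n\}$, $n = 1 \dots N$, we find:
$$\arg\min_{w,\xi} \; \frac{1}{2}\|w\|^2 + C \sum_{n=1}^{N} \xi_n$$
$$\text{s.t. } w^\top \psi(x^n, y^n, h^n) - w^\top \psi(x^n, y, h) \ge \Delta(y^n, y) - \xi_n, \quad n = 1 \dots N,\ \forall (y, h) \in W$$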
• W is a small (i.e., polynomial) set of constraints that
guarantees a solution arbitrarily close to the full solution
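To make the margin constraint concrete, the sketch below computes the smallest slack that satisfies it for one competing labelling. It assumes a 0-1 loss on the class label for Δ, which the slides do not specify, and uses made-up feature vectors; `required_slack` is a hypothetical helper.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def zero_one_loss(y_true, y):
    """Assumed form of Delta(y^n, y): 0 if the class is right, 1 otherwise."""
    return 0.0 if y == y_true else 1.0

def required_slack(w, psi_true, psi_other, y_true, y_other):
    """Smallest xi >= 0 with w.psi_true - w.psi_other >= Delta - xi."""
    margin = dot(w, psi_true) - dot(w, psi_other)
    return max(0.0, zero_one_loss(y_true, y_other) - margin)

w = [1.0, 2.0]
psi_true = [1.0, 1.0]                                     # features of the ground-truth labelling
print(required_slack(w, psi_true, [2.0, 0.25], 0, 1))     # margin 0.5 < 1: slack needed
print(required_slack(w, psi_true, [0.0, 0.0], 0, 1))      # margin 3.0 >= 1: no slack
```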
Learning: populating the constraint set
• In the structural SVM framework, the constraint set is
populated with the “most violating” labelings
• For every sample, the “most violating” labeling is given
by:
$$(\bar{y}^n, \bar{h}^n) = \arg\max_{y,h} \left[ w^\top \psi(x^n, y, h) + \Delta(y^n, y) \right]$$
• “Loss-augmented inference”: same greedy algorithm as
the regular inference
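Because Δ(y^n, y) as written depends only on the class, it adds a per-class constant to the score, so the summary search within each class is unchanged. The sketch below illustrates this under stated assumptions: `class_scores` is a hypothetical table holding, for each class, the summary found by regular inference and its score, and the loss is taken to be 0-1.

```python
def most_violating(class_scores, y_true, loss):
    """Pick the class maximising score + loss; class_scores[y] = (summary h, score)."""
    best_y = max(class_scores, key=lambda y: class_scores[y][1] + loss(y_true, y))
    h, score = class_scores[best_y]
    return best_y, h, score + loss(y_true, best_y)

def zero_one(y_true, y):
    """Assumed loss: 0 for the correct class, 1 otherwise."""
    return 0.0 if y == y_true else 1.0

# Made-up per-class results of the regular (greedy) inference.
class_scores = {0: ([0, 2], 3.0), 1: ([1, 2], 2.5), 2: ([0, 1], 1.0)}
y_bar, h_bar, augmented = most_violating(class_scores, y_true=0, loss=zero_one)
print(y_bar, h_bar, augmented)
```

In this toy case the wrong class 1 becomes the most violating labelling once the loss is added, although the correct class 0 has the highest plain score.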
Learning: latent variables
• The ground truth for h is unknown! → the algorithm
alternates between these two steps until convergence:
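Step 1:
$$\min_{w,\xi} \; \frac{1}{2}\|w\|^2 + C \sum_{n=1}^{N} \xi_n$$
$$\text{s.t. } w^\top \left[ \psi(x^n, y^n, h^{n*}) - \psi(x^n, y, h) \right] \ge \Delta(y^n, y) - \xi_n, \quad \forall n,\ (y, h) \in W$$
Step 2:
$$h^{n*} = \arg\max_{h} \; w^\top \psi(x^n, y^n, h)$$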
• At the first iteration, h is initialized arbitrarily
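The alternation can be sketched as a simple loop. This is only a skeleton under stated assumptions: `fit_weights` (Step 1, the structural SVM solve) and `infer_summary` (Step 2, inference of h for sample n) are hypothetical callables supplied by the caller, and convergence is taken to mean that no summary changes between iterations.

```python
def latent_training(h_init, fit_weights, infer_summary, max_iters=20):
    """Alternate Step 1 (fit w given h) and Step 2 (re-infer h given w); max_iters >= 1."""
    h_all = [list(h) for h in h_init]          # arbitrary initial summaries
    w = None
    for it in range(1, max_iters + 1):
        w = fit_weights(h_all)                                      # Step 1
        new_h = [infer_summary(w, n) for n in range(len(h_all))]    # Step 2
        if new_h == h_all:                                          # nothing changed: stop
            break
        h_all = new_h
    return w, h_all, it

# Toy stand-ins, just to exercise the control flow (not a real SVM solver).
def toy_fit(h_all):
    return float(sum(h[0] for h in h_all))

def toy_infer(w, n):
    return [1] if w >= 1 else [0]

w, h_all, iters = latent_training([[1], [0]], toy_fit, toy_infer)
print(w, h_all, iters)
```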
Method: recap
1. A submodular scoring function for summary + action:
$$w^\top \psi(x, y, h), \qquad \psi(x, y, h) = \sum_{i,j=1,\, j \neq i}^{T} \phi(x_i, x_j, h_i, h_j, y) + \sum_{i=1}^{T} \lambda_3\, I[h_i = 1, y]\, x_i$$
2. A greedy inference algorithm with performance guarantees
3. Learning by latent structural SVM from an arbitrary
initialisation of the summary variables
V-JAUNE: a measure for the quality of a video
summary
• Video summarization lacks a generally-accepted
performance measure which accounts for both the content
and the frame order
• We propose V-JAUNE:
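$$\Delta(h, \bar{h}) = \sum_{i=1}^{B} \delta(h_i, \bar{h}_i)$$
$$\delta(h_i, \bar{h}_i) = \min_j \left\{ \|x_{h_j} - x_{\bar{h}_i}\|_2 \right\}, \quad \text{s.t. } i - \epsilon \le j \le i + \epsilon$$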
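A possible reading of the measure is sketched below. Several points are assumptions of this example rather than facts from the slides: summaries are given as lists of B frame indices, the norm is an unsquared Euclidean distance, and the window half-width is a parameter called `eps`.

```python
import math

def v_jaune(X, h, h_bar, eps=1):
    """Sum over positions i of the smallest distance between predicted frame h_bar[i]
    and the reference frames h[j] whose position j lies within eps of i."""
    total = 0.0
    for i in range(len(h_bar)):
        lo = max(0, i - eps)                    # clip the window to valid positions
        hi = min(len(h) - 1, i + eps)
        total += min(math.dist(X[h[j]], X[h_bar[i]]) for j in range(lo, hi + 1))
    return total

# Made-up 2-D frame measurements for a 5-frame video.
X = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [10.0, 0.0]]
h = [0, 2, 4]        # reference summary (frame indices)
h_bar = [1, 2, 3]    # predicted summary

print(v_jaune(X, h, h_bar, eps=1))
print(v_jaune(X, h, h_bar, eps=0))
```

A wider window tolerates small shifts in frame order, so the value with `eps=1` is no larger than the value with `eps=0`.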
Multi-annotator V-JAUNE
• Multi-annotator version:
$$\Delta(h^{1:M}, \bar{h}) = \sum_{m=1}^{M} \Delta(h^m, \bar{h})$$
• We normalize it by the disagreement between the
annotators:
$$D = \frac{2}{M(M-1)} \sum_{p,q} \Delta(h^p, h^q), \quad p = 1 \dots M,\ q = p+1 \dots M$$
$$\Delta'(h^{1:M}, \bar{h}) = \Delta(h^{1:M}, \bar{h}) / D$$
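The normalization can be illustrated with a few lines of Python. The sketch takes the pairwise measure as a callable `delta`; the toy `index_gap` used here (sum of absolute differences between frame indices) is only a stand-in for V-JAUNE, and the annotations are made up.

```python
from itertools import combinations

def normalized_multi_annotator(annotations, h_bar, delta):
    """Sum of delta(h^m, h_bar) over annotators, divided by their mean pairwise disagreement D.
    Requires at least two annotators and D > 0."""
    M = len(annotations)
    total = sum(delta(h_m, h_bar) for h_m in annotations)
    pair_sum = sum(delta(h_p, h_q) for h_p, h_q in combinations(annotations, 2))
    D = 2.0 / (M * (M - 1)) * pair_sum
    return total / D

def index_gap(a, b):
    """Toy stand-in for the summary measure: sum of absolute index differences."""
    return float(sum(abs(x - y) for x, y in zip(a, b)))

annotations = [[0, 5, 10], [1, 5, 9], [0, 6, 12]]   # three annotators' summaries
h_bar = [0, 5, 11]                                   # predicted summary
print(normalized_multi_annotator(annotations, h_bar, index_gap))
```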
Experiments
• Dataset: MSR Daily Activities 3D: A dataset with 16 action
classes (drink, eat, read book, call cellphone, write on a
paper, use laptop, use vacuum cleaner, cheer up, sit still,
toss paper, play game, lie down on sofa, walk, play guitar,
stand up, sit down), 320 instances from 10 actors
• RGB, depth and skeletal data from MS Kinect
Experiments
• Dataset: Actions for Cooking Eggs (ACE, also known as KSCGR): A
dataset where actors cook egg recipes using 8 atomic
actions (cutting, seasoning, peeling, boiling, turning,
baking, mixing and breaking); 161 instances for training,
95 for testing (different actors)
Experiments
• Measurements: STIP features only from depth frames,
encoded as VLAD (no RGB, no skeletons)
• Action recognition comparison:
• standard SVM
• the proposed method with no summary features and all
frames
• the proposed method with budget B = 10
• the proposed method on the RGB data
• results from the literature
• Summary comparison: sum of absolute differences (SAD)
Results: action recognition on MSR
Method                                      Accuracy
SVM                                         34.4%
Dynamic time warping                        54.0%
Proposed method (no summary, all frames)    48.8%
Proposed method                             60.6%
Proposed method (RGB videos)                46.3%
• The summary component helps the action recognition
accuracy!
• Accuracy from depth frames is higher than from RGB, even
without dedicated features (skeletal data would deliver
higher accuracy, but are available in limited situations)
Results: action recognition on ACE
Method                                      Accuracy
SVM                                         62.1%
PA-Pooling                                  72.2%
Proposed method (no summary, all frames)    54.7%
Proposed method (no summary, 10 frames)     66.3%
Proposed method                             77.9%
• The summary component helps, again
• Selecting the frames helps
Results: MSR summarization
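• The V-JAUNE value of the proposed method is also better (lower) than SAD’s: 5.22 vs 5.65
• In many cases, the summaries from the proposed method
(top) appear more appealing than SAD’s (bottom):
[Figure: example summaries, proposed method (top) vs SAD (bottom)]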
Results: ACE summarization
• Here the V-JAUNE value is slightly worse (higher) than SAD’s: 0.947 vs
0.927
• 10% summary supervision improves both the accuracy
and V-JAUNE: 81.1%, 0.926
• Example with the proposed method:
Conclusions
• The action recognition accuracy is higher than that of
comparable methods (depth data only)
• We obtain a summary as a by-product! Action recognition
and video summarization proved synergistic
• In many cases, the summaries are more appealing than
with a low-level method (SAD)
• Efficient inference and loss-augmented inference
(Hamming loss, V-JAUNE loss under review)
Any questions?
• Thank you very much for your attention!
• Any questions?
Fairouz Hussein, Massimo Piccardi
University of Technology Sydney, NSW, Australia
Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference

More Related Content

Similar to Mmsp2017 piccardi

MLHEP Lectures - day 2, basic track
MLHEP Lectures - day 2, basic trackMLHEP Lectures - day 2, basic track
MLHEP Lectures - day 2, basic trackarogozhnikov
 
Reject Inference in Credit Scoring
Reject Inference in Credit ScoringReject Inference in Credit Scoring
Reject Inference in Credit ScoringAdrien Ehrhardt
 
Options Portfolio Selection
Options Portfolio SelectionOptions Portfolio Selection
Options Portfolio Selectionguasoni
 
Some Studies on Multistage Decision Making Under Fuzzy Dynamic Programming
Some Studies on Multistage Decision Making Under Fuzzy Dynamic ProgrammingSome Studies on Multistage Decision Making Under Fuzzy Dynamic Programming
Some Studies on Multistage Decision Making Under Fuzzy Dynamic ProgrammingWaqas Tariq
 
Neural Processes Family
Neural Processes FamilyNeural Processes Family
Neural Processes FamilyKota Matsui
 
Introduction
IntroductionIntroduction
Introductionbutest
 
Optimization Techniques.pdf
Optimization Techniques.pdfOptimization Techniques.pdf
Optimization Techniques.pdfanandsimple
 
Digital Image Processing - Image Restoration
Digital Image Processing - Image RestorationDigital Image Processing - Image Restoration
Digital Image Processing - Image RestorationMathankumar S
 
Hyperparameter optimization with approximate gradient
Hyperparameter optimization with approximate gradientHyperparameter optimization with approximate gradient
Hyperparameter optimization with approximate gradientFabian Pedregosa
 
Uncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game TheoryUncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game TheoryRikiya Takahashi
 
Probability Collectives
Probability CollectivesProbability Collectives
Probability Collectiveskulk0003
 
Numerical integration based on the hyperfunction theory
Numerical integration based on the hyperfunction theoryNumerical integration based on the hyperfunction theory
Numerical integration based on the hyperfunction theoryHidenoriOgata
 
Computing near-optimal policies from trajectories by solving a sequence of st...
Computing near-optimal policies from trajectories by solving a sequence of st...Computing near-optimal policies from trajectories by solving a sequence of st...
Computing near-optimal policies from trajectories by solving a sequence of st...Université de Liège (ULg)
 
Hanie Sedghi, Research Scientist at Allen Institute for Artificial Intelligen...
Hanie Sedghi, Research Scientist at Allen Institute for Artificial Intelligen...Hanie Sedghi, Research Scientist at Allen Institute for Artificial Intelligen...
Hanie Sedghi, Research Scientist at Allen Institute for Artificial Intelligen...MLconf
 
Conjugate Gradient method for Brain Magnetic Resonance Images Segmentation
Conjugate Gradient method for Brain Magnetic Resonance Images SegmentationConjugate Gradient method for Brain Magnetic Resonance Images Segmentation
Conjugate Gradient method for Brain Magnetic Resonance Images SegmentationEL-Hachemi Guerrout
 
Optimization Methods in Finance
Optimization Methods in FinanceOptimization Methods in Finance
Optimization Methods in Financethilankm
 

Similar to Mmsp2017 piccardi (20)

MLHEP Lectures - day 2, basic track
MLHEP Lectures - day 2, basic trackMLHEP Lectures - day 2, basic track
MLHEP Lectures - day 2, basic track
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Reject Inference in Credit Scoring
Reject Inference in Credit ScoringReject Inference in Credit Scoring
Reject Inference in Credit Scoring
 
Options Portfolio Selection
Options Portfolio SelectionOptions Portfolio Selection
Options Portfolio Selection
 
Some Studies on Multistage Decision Making Under Fuzzy Dynamic Programming
Some Studies on Multistage Decision Making Under Fuzzy Dynamic ProgrammingSome Studies on Multistage Decision Making Under Fuzzy Dynamic Programming
Some Studies on Multistage Decision Making Under Fuzzy Dynamic Programming
 
Aggregation operator for image reduction
Aggregation operator for image reductionAggregation operator for image reduction
Aggregation operator for image reduction
 
Neural Processes Family
Neural Processes FamilyNeural Processes Family
Neural Processes Family
 
Introduction
IntroductionIntroduction
Introduction
 
Optimization Techniques.pdf
Optimization Techniques.pdfOptimization Techniques.pdf
Optimization Techniques.pdf
 
Digital Image Processing - Image Restoration
Digital Image Processing - Image RestorationDigital Image Processing - Image Restoration
Digital Image Processing - Image Restoration
 
Hyperparameter optimization with approximate gradient
Hyperparameter optimization with approximate gradientHyperparameter optimization with approximate gradient
Hyperparameter optimization with approximate gradient
 
Uncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game TheoryUncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game Theory
 
Probability Collectives
Probability CollectivesProbability Collectives
Probability Collectives
 
Numerical integration based on the hyperfunction theory
Numerical integration based on the hyperfunction theoryNumerical integration based on the hyperfunction theory
Numerical integration based on the hyperfunction theory
 
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
 
Session 4 .pdf
Session 4 .pdfSession 4 .pdf
Session 4 .pdf
 
Computing near-optimal policies from trajectories by solving a sequence of st...
Computing near-optimal policies from trajectories by solving a sequence of st...Computing near-optimal policies from trajectories by solving a sequence of st...
Computing near-optimal policies from trajectories by solving a sequence of st...
 
Hanie Sedghi, Research Scientist at Allen Institute for Artificial Intelligen...
Hanie Sedghi, Research Scientist at Allen Institute for Artificial Intelligen...Hanie Sedghi, Research Scientist at Allen Institute for Artificial Intelligen...
Hanie Sedghi, Research Scientist at Allen Institute for Artificial Intelligen...
 
Conjugate Gradient method for Brain Magnetic Resonance Images Segmentation
Conjugate Gradient method for Brain Magnetic Resonance Images SegmentationConjugate Gradient method for Brain Magnetic Resonance Images Segmentation
Conjugate Gradient method for Brain Magnetic Resonance Images Segmentation
 
Optimization Methods in Finance
Optimization Methods in FinanceOptimization Methods in Finance
Optimization Methods in Finance
 

Recently uploaded

In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabiaahmedjiabur940
 
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di Bontang
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di  Bontangobat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di  Bontang
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di Bontangsiskavia95
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...Elaine Werffeli
 
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTSDBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTSSnehalVinod
 
Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?RemarkSemacio
 
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...ThinkInnovation
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRajesh Mondal
 
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...Voces Mineras
 
Introduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptxIntroduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptxAniqa Zai
 
Solution manual for managerial accounting 8th edition by john wild ken shaw b...
Solution manual for managerial accounting 8th edition by john wild ken shaw b...Solution manual for managerial accounting 8th edition by john wild ken shaw b...
Solution manual for managerial accounting 8th edition by john wild ken shaw b...rightmanforbloodline
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Klinik kandungan
 
DAA Assignment Solution.pdf is the best1
DAA Assignment Solution.pdf is the best1DAA Assignment Solution.pdf is the best1
DAA Assignment Solution.pdf is the best1sinhaabhiyanshu
 
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptxClient Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptxStephen266013
 
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...ThinkInnovation
 
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...yulianti213969
 
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...Klinik Aborsi
 
How to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data AnalyticsHow to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data AnalyticsBrainSell Technologies
 
Simplify hybrid data integration at an enterprise scale. Integrate all your d...
Simplify hybrid data integration at an enterprise scale. Integrate all your d...Simplify hybrid data integration at an enterprise scale. Integrate all your d...
Simplify hybrid data integration at an enterprise scale. Integrate all your d...varanasisatyanvesh
 

Recently uploaded (20)

In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di Bontang
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di  Bontangobat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di  Bontang
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di Bontang
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTSDBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
 
Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?Case Study 4 Where the cry of rebellion happen?
Case Study 4 Where the cry of rebellion happen?
 
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
 
Introduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptxIntroduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptx
 
Solution manual for managerial accounting 8th edition by john wild ken shaw b...
Solution manual for managerial accounting 8th edition by john wild ken shaw b...Solution manual for managerial accounting 8th edition by john wild ken shaw b...
Solution manual for managerial accounting 8th edition by john wild ken shaw b...
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
DAA Assignment Solution.pdf is the best1
DAA Assignment Solution.pdf is the best1DAA Assignment Solution.pdf is the best1
DAA Assignment Solution.pdf is the best1
 
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptxClient Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
 
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
 
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
 
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
 
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get CytotecAbortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
 
How to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data AnalyticsHow to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data Analytics
 
Simplify hybrid data integration at an enterprise scale. Integrate all your d...
Simplify hybrid data integration at an enterprise scale. Integrate all your d...Simplify hybrid data integration at an enterprise scale. Integrate all your d...
Simplify hybrid data integration at an enterprise scale. Integrate all your d...
 

Mmsp2017 piccardi

  • 1. 1/26 Joint Action Recognition and Summarization by Submodular Inference Fairouz Hussein, Massimo Piccardi University of Technology Sydney UTS MMSP workshop, 12 April 2017 Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference
  • 2. 2/26 Aims • Action recognition in video and video summarization are two well-established research areas • In this work, we explore the benefits of performing them jointly • We leverage submodular inference throughout • Work published at ICASPP 2016 and in-press on ACM TOMM Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference
  • 3. 3/26 Action recognition • Action recognition in video is a well-established research area • Leverages local features (STIP, DTF, ITF, MBH, bags of concepts, deep learning . . . ) • Has reached remarkable accuracy also in realistic scenarios • At short distance, depth cameras are helping achieve greater accuracy (e.g., Stereolabs ZED) Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference
  • 4. 4/26 Video summarization • Video summarization is, too, a well-established research area • Summary as a set or sequence of frames • Leverages clustering, shot detection or “key” frame detection Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference
  • 5. 5/26 Joint action recognition and video summarization • Action recognition and video summarization can be performed independently, cascaded . . . • Recognising actions from a sub-set of key frames is a practiced idea • However, do the key frames meet the requirements of a good summary, i.e. coverage and non-redundancy? • Here, we attempt to perform them jointly with a single, unified scoring function Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference
  • 6. 6/26 Our graphical model h1 h2 ht... hT... x1 x2 xt... xT... ht-1 xt-1 y action summary measurements • y: action class; h: binary variables: ht = 1 → frame t is in the summary; x: frame measurements • Scoring function: w ψ(x, y, h) = w T i,j=1,j=i φ(xi, xj, hi, hj, y) Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference
  • 7. 7/26 Inference • Given a trained model, w, and a measurement sequence, x, how to find the best y, h? • Inference: y∗ , h∗ = argmaxy,h w ψ(x, y, h) • Inference is not just left-to-right because nodes are all connected • Number of possible summaries = 2T ; 1 min video → T ≈ 1, 800 Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference
  • 8. 8/26 Submodularity • Inference is rescued by submodularity • A submodular function abides by the “law of diminishing returns”: A V B f(A U V) - f(A) ≤ f(B U V) - f(B) Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference
  • 9. 9/26 Submodular inference • The maximum achieved by a greedy algorithm over a monotonic submodular function is ≥ 0.632 the actual maximum [Nemhauser et al. 1978] • The complexity of the greedy algorithm is O(T2) vs O(2T ) • We further constrain the summary to a maximum number of frames (budget):. y∗ , h∗ = argmaxy,h w ψ(x, y, h) s.t. |h| = B • Complexity drops to O(BT) Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference
  • 10. 10/26 Submodular score function: summary • For the summary, we use a submodular score function: φ(xi, xj, hi, hj) = λ(hi, hj)s(xi, xj) λ(hi, hj) =    λ1, hi = 1, hj = 0 (coverage) −λ2, hi = 1, hj = 1 (non-redundancy) 0, hi = 0, hj = 0 λ1, λ2, s(xi, xj) ≥ 0 • s(xi, xj) is a similarity between frames i and j • It is easy to prove that this function is submodular Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference
  • 11. 11/26 Submodular score function: summary + action • For action recognition, we add a unary term: ψ(x, y, h) = T i,j=1,j=i φ(xi, xj, hi, hj, y) + T i=1 λ3I[hi = 1, y]xi action • In the ICASSP paper, we prove that this function is still submodular • w ≥ 0 → positive sum of submodular functions Fairouz Hussein, Massimo Piccardi Joint Action and Summary by Submodular Inference
  • 12. Learning
      • Learning: structural SVM
      • Given a training set $\{x^n, h^n, y^n\},\ n = 1 \dots N$, we find:
        $\operatorname{argmin}_{w, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_{n=1}^{N} \xi_n$
        s.t. $w^\top \psi(x^n, y^n, h^n) - w^\top \psi(x^n, y, h) \ge \Delta(y^n, y) - \xi_n, \quad n = 1 \dots N,\ \forall (y, h) \in W$
      • W is a small (i.e., polynomial) set of constraints that guarantees a solution arbitrarily close to the full solution
  • 13. Learning: populating the constraint set
      • In the structural SVM framework, the constraint set is populated with the “most violating” labelings
      • For every sample, the “most violating” labeling is given by:
        $\bar{y}^n, \bar{h}^n = \operatorname{argmax}_{y,h} \left[ w^\top \psi(x^n, y, h) + \Delta(y^n, y) \right]$
      • “Loss-augmented inference”: same greedy algorithm as the regular inference
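The effect of adding the loss inside the argmax can be seen on a toy example. In the sketch below, `class_scores[y]` stands for the best score over summaries h for class y, as a greedy pass would return; the numbers are hypothetical, and a 0/1 loss on the action label is assumed for Δ.

```python
def zero_one_loss(y_true, y):
    """Assumed form of the loss: 0 for the correct class, 1 otherwise."""
    return 0.0 if y == y_true else 1.0


def most_violating_label(class_scores, y_true, loss=zero_one_loss):
    """Loss-augmented argmax over classes: score plus loss."""
    return max(sorted(class_scores),
               key=lambda y: class_scores[y] + loss(y_true, y))


# Hypothetical per-class scores, already maximised over the summary h.
scores = {"drink": 2.0, "eat": 1.6, "read book": 0.3}

predicted = max(sorted(scores), key=scores.get)    # regular inference
violator = most_violating_label(scores, "drink")   # loss-augmented inference
print(predicted, violator)
```

Here the prediction is already correct, yet the wrong class "eat" is within the margin of 1 and is therefore returned as the most violating labeling.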
  • 14. Learning: latent variables
      • The ground truth for h is unknown, so the algorithm alternates between these two steps until convergence:
        Step 1: $\min_{w, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_{n=1}^{N} \xi_n$ s.t. $w^\top \left[ \psi(x^n, y^n, h^{n*}) - \psi(x^n, y, h) \right] \ge \Delta(y^n, y) - \xi_n \quad \forall n,\ (y, h) \in W$
        Step 2: $h^{n*} = \operatorname{argmax}_{h} \; w^\top \psi(x^n, y^n, h)$
      • At the first iteration, h is initialized arbitrarily
  • 15. Method: recap
      1. A submodular scoring function for summary + action:
         $w^\top \psi(x, y, h) = w^\top \left( \sum_{i,j=1,\, j \neq i}^{T} \phi(x_i, x_j, h_i, h_j, y) + \sum_{i=1}^{T} \lambda_3\, I[h_i = 1, y]\, x_i \right)$
      2. A greedy inference algorithm with performance guarantees
      3. Learning by latent structural SVM from an arbitrary initialisation of the summary variables
  • 16. V-JAUNE: a measure for the quality of a video summary
      • Video summarization lacks a generally accepted performance measure that accounts for both the content and the frame order
      • We propose V-JAUNE:
        $\Delta(h, \bar{h}) = \sum_{i=1}^{B} \delta(h_i, \bar{h}_i)$
        $\delta(h_i, \bar{h}_i) = \min_j \left\{ \| x_{h_j} - x_{\bar{h}_i} \|_2 \right\}$ s.t. $i - \epsilon \le j \le i + \epsilon$
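A direct transcription of the measure is sketched below, under a few assumptions: summaries are given as ordered lists of B frame indices, the window half-width is exposed as a parameter named `eps` (a name chosen here), and the window is clipped at the ends of the summary, which the slide does not specify.

```python
import math


def v_jaune(x, h_ref, h_pred, eps=1):
    """Sum over summary positions i of the smallest Euclidean distance between
    the predicted frame at position i and the reference frames at positions
    i - eps .. i + eps (clipped to the valid range; clipping is an assumption)."""
    B = len(h_ref)
    total = 0.0
    for i in range(B):
        lo, hi = max(0, i - eps), min(B - 1, i + eps)
        total += min(math.dist(x[h_ref[j]], x[h_pred[i]])
                     for j in range(lo, hi + 1))
    return total


# Hypothetical frame measurements: five frames on a line.
x = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]

same = v_jaune(x, [0, 2, 4], [0, 2, 4])                  # identical summaries
shifted_strict = v_jaune(x, [0, 2, 4], [2, 4, 4], eps=0) # positions must match
shifted_window = v_jaune(x, [0, 2, 4], [2, 4, 4], eps=1) # one position of slack
print(same, shifted_strict, shifted_window)
```

The window makes the measure tolerant to small displacements in frame order while still penalising frames whose content has no nearby match.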
  • 17. Multi-annotator V-JAUNE
      • Multi-annotator version: $\Delta(h^{1:M}, \bar{h}) = \sum_{m=1}^{M} \Delta(h^m, \bar{h})$
      • We normalize it by the disagreement between the annotators:
        $D = \frac{2}{M(M-1)} \sum_{p,q} \Delta(h^p, h^q), \quad p = 1 \dots M,\ q = p+1 \dots M$
        $\Delta'(h^{1:M}, \bar{h}) = \Delta(h^{1:M}, \bar{h}) / D$
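The normalization can be sketched as follows. The single-annotator measure is re-implemented compactly so that the block is self-contained; the names, the `eps` parameter, the window clipping and the error raised when all annotators agree (D = 0) are assumptions made for illustration.

```python
import math


def v_jaune(x, h_ref, h_pred, eps=0):
    """Single-annotator measure (window clipped at the ends; an assumption)."""
    B = len(h_ref)
    return sum(
        min(math.dist(x[h_ref[j]], x[h_pred[i]])
            for j in range(max(0, i - eps), min(B - 1, i + eps) + 1))
        for i in range(B))


def normalized_v_jaune(x, annotations, h_pred, eps=0):
    """Sum of the measure over annotators, divided by their mean pairwise
    disagreement D = 2 / (M (M - 1)) * sum over p < q of Delta(h^p, h^q)."""
    M = len(annotations)
    total = sum(v_jaune(x, h_m, h_pred, eps) for h_m in annotations)
    pair_sum = sum(v_jaune(x, annotations[p], annotations[q], eps)
                   for p in range(M) for q in range(p + 1, M))
    D = 2.0 / (M * (M - 1)) * pair_sum
    if D == 0:
        raise ValueError("annotators agree exactly; normalization undefined")
    return total / D


# Hypothetical 1-D frame measurements and three annotators' summaries.
x = [[0.0], [1.0], [2.0], [3.0], [4.0]]
annotations = [[0, 2, 4], [1, 2, 4], [0, 2, 3]]

value = normalized_v_jaune(x, annotations, [0, 2, 4])
print(value)
```

With these toy inputs the pairwise disagreements are 1, 1 and 2, so D = 4/3; the prediction is at total distance 2 from the annotators, giving a normalized value of 1.5.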
  • 18. Experiments
      • Dataset: MSR Daily Activities 3D, a dataset with 16 action classes (drink, eat, read book, call cellphone, write on a paper, use laptop, use vacuum cleaner, cheer up, sit still, toss paper, play game, lie down on sofa, walk, play guitar, stand up, sit down); 320 instances from 10 actors
      • RGB, depth and skeletal data from MS Kinect
  • 19. Experiments
      • Dataset: Actions for Cooking Eggs (ACE, aka KSCGR), a dataset where actors cook egg recipes using 8 atomic actions (cutting, seasoning, peeling, boiling, turning, baking, mixing and breaking); 161 instances for training, 95 for testing (different actors)
  • 20. Experiments
      • Measurements: STIP features only from depth frames, encoded as VLAD (no RGB, no skeletons)
      • Action recognition comparison:
          • standard SVM
          • the proposed method with no summary features and all frames
          • the proposed method with budget B = 10
          • the proposed method on the RGB data
          • results from the literature
      • Summary comparison: sum of absolute differences (SAD)
  • 21. Results: action recognition on MSR
        Method                                      Accuracy
        SVM                                         34.4%
        Dynamic time warping                        54.0%
        Proposed method (no summary, all frames)    48.8%
        Proposed method                             60.6%
        Proposed method (RGB videos)                46.3%
      • The summary component helps the action recognition accuracy!
      • Accuracy from depth frames is higher than from RGB, even without dedicated features (skeletal data would deliver higher accuracy, but are available only in limited situations)
  • 22. Results: action recognition on ACE
        Method                                      Accuracy
        SVM                                         62.1%
        PA-Pooling                                  72.2%
        Proposed method (no summary, all frames)    54.7%
        Proposed method (no summary, 10 frames)     66.3%
        Proposed method                             77.9%
      • The summary component helps, again
      • Selecting the frames helps
  • 23. Results: MSR summarization
      • The V-JAUNE value of the proposed method is also better (lower) than SAD’s: 5.22 vs 5.65
      • In many cases, the summaries from the proposed method appear more appealing than SAD’s
      [Figure: example summaries from the proposed method (top) and from SAD (bottom)]
  • 24. Results: ACE summarization
      • Here the V-JAUNE value of the proposed method is slightly worse (higher) than SAD’s: 0.947 vs 0.927
      • 10% summary supervision improves both the accuracy and V-JAUNE: 81.1%, 0.926
      [Figure: example summary obtained with the proposed method]
  • 25. Conclusions
      • The action recognition accuracy is higher than that of comparable methods (depth data only)
      • We obtain a summary as a by-product! Action recognition and video summarization proved synergistic
      • In many cases, the summaries are more appealing than with a low-level method (SAD)
      • Efficient inference and loss-augmented inference (Hamming loss, V-JAUNE loss under review)
  • 26. Any questions?
      • Thank you very much for your attention!
      Fairouz Hussein, Massimo Piccardi, University of Technology Sydney, NSW, Australia