The presentation explores different techniques for assigning confidence scores at the single-feature level and at the node level (a set of features). These steps are crucial for constructing a graph representation of a video segment and for determining node cost/penalty values before applying the graph-cut method.
2. Intro
• We propose recognizing human actions in video
using a Graph-Cut approach.
• Algorithmic stages can be defined as follows:
– Pre-GraphCut: The input video segment S is
converted into a graph representation Gs(V,E).
– Post-GraphCut: Given Gs(V,E) and the optimal
action-category labels Lv for its nodes V (the output
of Graph-Cut), select the set of sub-graphs where
the action(s) of interest might have happened.
3. Some Notation
• “Video segment”, S, refers to a set of local feature points
extracted at locations X and described by descriptors D:
• “Confidence score” refers to the likelihood of class label l
given observation o:
4. Pre-GraphCut
• Converting a video segment into a graph representation
requires:
(1) Dividing the whole video segment S into a spatio-temporal
grid. Each grid volume is a node Vi, connected
to its neighbors by edges Ei in the graph Gs.
(2) Assigning a confidence score to each node Vi.
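Step (1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the presentation's implementation: the helper names `assign_to_grid` and `grid_edges`, the cell size, and the 6-connected neighborhood are all hypothetical choices.

```python
import numpy as np

def assign_to_grid(points, cell_size):
    """Map each (x, y, t) feature location to the index of its grid cell."""
    points = np.asarray(points, dtype=float)
    return np.floor(points / np.asarray(cell_size, dtype=float)).astype(int)

def grid_edges(grid_shape):
    """List edges between 6-connected neighboring cells (assumed neighborhood)."""
    edges = []
    for idx in np.ndindex(*grid_shape):
        for axis in range(3):
            nxt = list(idx)
            nxt[axis] += 1
            if nxt[axis] < grid_shape[axis]:
                edges.append((idx, tuple(nxt)))
    return edges

# Three hypothetical feature points (x, y, t) and cells of 40 x 40 pixels x 10 frames.
cells = assign_to_grid([(5, 12, 3), (45, 12, 3), (45, 50, 14)], (40, 40, 10))
edges = grid_edges((2, 2, 2))
```

Each row of `cells` is the node a feature point falls into, and `edges` lists the neighbor links of a 2 x 2 x 2 grid.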
5. Node Confidence-Score
• The most challenging problem is (2), assigning a
confidence score to each node:
– A node is simply the set of feature points within
its grid volume:
– Therefore, the node confidence can be defined by
an unknown function, g, over the feature
points inside the node.
6. Node Confidence-Score
• The naïve approach is to find the confidence score of
each feature point inside the node and accumulate
these scores to obtain the node score:
Then, let us find the feature confidence score, i.e.,
the likelihood of class l given the local feature fj.
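The naïve accumulation described above can be sketched as a per-node sum. The function name `node_scores` and the example values are hypothetical; the only assumption carried over from the slides is that "accumulate" means summation.

```python
import numpy as np

def node_scores(feature_scores, node_ids, n_nodes):
    """Sum per-feature confidence scores into the node each feature belongs to."""
    return np.bincount(node_ids, weights=feature_scores, minlength=n_nodes)

# Hypothetical scores for five features spread over three nodes.
scores = node_scores(np.array([0.5, -0.25, 1.0, 0.25, 0.5]),
                     np.array([0, 0, 2, 2, 2]), n_nodes=3)
```

A node that contains no feature points (node 1 here) receives a score of zero under this scheme.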
8. Feature Confidence-Score (2)
• Constructed a BOV histogram for each test video
segment, with centroids C:
• Trained a binary linear SVM to produce a weight
vector for class label l:
9. Feature Confidence-Score (3)
• Given a feature point from a test segment, its
confidence score is:
(1) Hard Assignment:
(2) N-Soft Assignment:
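The exact formulas for the two assignments are not preserved here, so the following is only one plausible reading. It assumes that the linear SVM for class l holds one weight per visual word, that hard assignment scores a feature by the weight of its nearest centroid, and that N-soft assignment takes a distance-weighted mean over the N nearest centroids. The Gaussian weighting, `sigma`, and the function names are hypothetical, and the SVM bias term is ignored.

```python
import numpy as np

def hard_score(feature, centroids, w):
    """Score a feature by the SVM weight of its nearest centroid."""
    dists = np.linalg.norm(centroids - feature, axis=1)
    return float(w[np.argmin(dists)])

def soft_score(feature, centroids, w, n=2, sigma=1.0):
    """Score a feature by a distance-weighted mean over its n nearest centroids."""
    dists = np.linalg.norm(centroids - feature, axis=1)
    nearest = np.argsort(dists)[:n]
    weights = np.exp(-dists[nearest] ** 2 / (2 * sigma ** 2))
    weights /= weights.sum()
    return float(weights @ w[nearest])

centroids = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
w = np.array([2.0, -1.0, 0.5])   # hypothetical SVM weights, one per centroid

hard = hard_score(np.array([0.2, 0.0]), centroids, w)       # nearest: centroid 0
soft = soft_score(np.array([0.5, 0.0]), centroids, w, n=2)  # equidistant from 0 and 1
```

Soft assignment smooths the score of features that lie between visual words, at the cost of an extra parameter N.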
12. Node Cost-Value (1)
• The Graph-Cut framework minimizes the total penalty/cost
of individual nodes and of neighboring nodes, given a node
label configuration L:
• The node cost score is inversely related to the likelihood,
or confidence score:
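As an illustration of the quantity being minimized, the sketch below evaluates a standard pairwise MRF energy, E(L) = sum_i D_i(l_i) + lambda * sum_(i,j) [l_i != l_j]. The Potts smoothness term, the weight `smoothness`, and the function name `total_cost` are assumptions; the presentation's own energy may differ.

```python
import numpy as np

def total_cost(unary, edges, labels, smoothness=1.0):
    """Energy of a labelling: node costs plus a Potts penalty on disagreeing edges."""
    labels = np.asarray(labels)
    node_term = unary[np.arange(len(labels)), labels].sum()
    edge_term = sum(smoothness for i, j in edges if labels[i] != labels[j])
    return float(node_term + edge_term)

# Three nodes in a chain, two labels; unary[i, l] is the cost of label l at node i.
unary = np.array([[0.0, 2.0], [1.0, 1.0], [3.0, 0.5]])
edges = [(0, 1), (1, 2)]
energy = total_cost(unary, edges, labels=[0, 0, 1])
```

Graph-Cut searches for the labelling with the lowest such energy; this function only scores one candidate labelling.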
13. Node Cost-Score (2)
• Assume the node confidence score is the sum of its feature
point scores (using hard assignment):
• The following inverse relationships were considered for
deriving the node cost score:
(1) NLog (Negative Log-Likelihood)
(2) Norm (Negative Normalized Confidence Score)
(3) Naive (Negative Raw Confidence Score)
14. Method 1: NLog
• Probabilistic interpretation: Platt [1] showed that an SVM
confidence score can be interpreted probabilistically by fitting
a parametric sigmoid to it:
• Negative log-likelihood: In an MRF (Graph-Cut), cost values are
often associated with the negative log of the measurement noise
likelihood. Similarly, once the confidence values are translated
into probabilities, the negative log is applied to derive the cost score:
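A minimal sketch of the NLog cost, assuming Platt's sigmoid P(l | f) = 1 / (1 + exp(A*f + B)) with cost = -log P(l | f). The default values of A and B below are placeholders only; in Platt's method they are fitted on held-out data, and A is typically negative so that larger scores map to larger probabilities.

```python
import numpy as np

def platt_probability(score, A=-1.0, B=0.0):
    """Platt's sigmoid: P(l | f) = 1 / (1 + exp(A * f + B))."""
    return 1.0 / (1.0 + np.exp(A * np.asarray(score, dtype=float) + B))

def nlog_cost(score, A=-1.0, B=0.0):
    """Negative log of the Platt probability, computed in a numerically stable way."""
    # -log(1 / (1 + exp(z))) = log(1 + exp(z)) = logaddexp(0, z)
    return np.logaddexp(0.0, A * np.asarray(score, dtype=float) + B)

costs = nlog_cost([-2.0, 0.0, 2.0])
```

With these placeholder parameters, a higher confidence score yields a lower cost, and a score of zero costs log 2.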
15. Method 2: Norm
• The confidence score is scaled to the range [0, 1]. The cost
value is then the negative of the scaled score:
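A minimal sketch of the Norm cost, assuming min-max scaling over the set of scores being compared. The scaling actually used in the presentation is not preserved, so both the normalization and the name `norm_cost` are assumptions.

```python
import numpy as np

def norm_cost(scores):
    """Min-max scale scores to [0, 1], then negate (assumed normalization)."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)   # all scores equal: no label is preferred
    return -(scores - scores.min()) / span

costs = norm_cost([-1.0, 0.0, 3.0])
```

Because the scaling depends on the minimum and maximum of the set, a single outlying score compresses the costs of all the others.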
16. Method 3: Naive
• The cost value is directly associated with the negative of
the raw confidence score:
18. Experiment: Node Cost Score (1)
• With default parameters, the Naive approach surprisingly outperforms the other two methods. The
worst performance is observed with the Norm method.
• The NLog approach performed worse than I expected. The reason may lie in
the tuning parameters, A and B, of the sigmoid equation:
• In particular, the parameter A controls the slope. Let's inspect this parameter's effect on
performance.
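To see why A matters, the sketch below compares the NLog cost of a low score and a high score for several values of A, again assuming Platt's form P = 1 / (1 + exp(A*f + B)). The values of A are hypothetical and the output says nothing about recognition accuracy; it only shows how the slope changes the spread of costs.

```python
import numpy as np

def nlog_cost(score, A, B=0.0):
    """Negative log of the Platt sigmoid probability."""
    return np.logaddexp(0.0, A * np.asarray(score, dtype=float) + B)

scores = np.array([-1.0, 1.0])       # one low and one high confidence score
gaps = {}
for A in (-0.5, -1.0, -4.0):         # hypothetical slope values
    low, high = nlog_cost(scores, A)
    gaps[A] = low - high             # cost difference between the two scores
    print(f"A={A:5.1f}  cost(-1)={low:.3f}  cost(+1)={high:.3f}  gap={gaps[A]:.3f}")
```

A steeper sigmoid (larger |A|) widens the gap between the costs of low and high scores, which changes the balance between the node term and the neighborhood term in the Graph-Cut energy.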
19. Experiment: Node Cost Score (2)
• NLog approach: Sigmoid parameter A's effect on the performance
21. Conclusion
• These slides explored two main questions, both related to
constructing the video graph G, proposed a few methods for
each, and reported an experiment on the KTH dataset.
– (i) Assigning a confidence score at the feature level
● Soft assignment
● Hard assignment
– (ii) Assigning a confidence score at the node level
● NLog (Negative log-likelihood)
● Norm
● Naive
22. Future Work
• Future work will explore:
– Alternative construction of the video graph:
● Instead of a predefined grid, use supervoxels to
choose node regions.
– Single-feature confidence score:
● Instead of BOV, use the VLAD descriptor to
obtain a more discriminative feature
representation.