FAST OBJECT INSTANCE
SEARCH FROM ONE EXAMPLE
NAYAN SETH
JINGJING MENG, JUNSONG YUAN, YAP-PENG TAN, GANG WANG, "FAST OBJECT INSTANCE SEARCH IN VIDEOS FROM
ONE EXAMPLE"
ARIZONA STATE UNIVERSITY
OBJECTIVE
▸ To locate the object jointly across video frames using
spatiotemporal search.
Source: Object Localization Via Deep Neural Network
WHY?
▸ Surveillance
▸ Combining with AI technologies to help improve crop
productivity [Video]
▸ Autonomous cars [Video]
▸ Robots that can catch moving objects
PREVIOUS WORK
▸ Bounding Box (made efficient using Branch & Bound)
▸ Max Path (uses Sliding Window)
Tran, D., Yuan, J., & Forsyth, D. (2014). Video Event Detection: From Subvolume Localization to
Spatiotemporal Path Search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(2),
404-416.
FEW MORE METHODS
▸ Video Google
▸ TRECVID
RANDOMISED VISUAL PHRASE
▸ Generates frame-wise confidence maps
▸ The method works on individual frames
▸ Why?
▸ Handles scale variations of the object and provides robustness
▸ No need to rely on image segmentation
RANDOMISED VISUAL PHRASE (CONTD …)
▸ Local invariant features are extracted
▸ The image is randomly partitioned into patches
▸ Each patch is bundled with visual phrases
▸ For each RVP, a similarity score is computed with respect to the
query object independently
▸ The score is used as the voting weight for the corresponding patch
▸ The final confidence score of each pixel is computed from all the
voting weights cast for that pixel
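The voting scheme above can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the paper's implementation: `patch_score` is a hypothetical callback standing in for the RVP similarity computation, and each "random partition" here is a simple randomly placed grid.

```python
import random

def rvp_confidence_map(width, height, patch_score, rounds=10, grid=4):
    """Sketch of randomized-visual-phrase voting (hypothetical helper names).

    patch_score(x0, y0, x1, y1) -> similarity of a patch to the query object.
    Each round randomly partitions the image into a grid of patches; each
    patch's similarity score is cast as a vote for every pixel it covers.
    """
    conf = [[0.0] * width for _ in range(height)]
    for _ in range(rounds):
        # One random partition: grid boundaries at random positions
        xb = [0] + sorted(random.sample(range(1, width), grid - 1)) + [width]
        yb = [0] + sorted(random.sample(range(1, height), grid - 1)) + [height]
        for gy in range(grid):
            for gx in range(grid):
                x0, x1 = xb[gx], xb[gx + 1]
                y0, y1 = yb[gy], yb[gy + 1]
                w = patch_score(x0, y0, x1, y1)  # voting weight for this patch
                for y in range(y0, y1):
                    for x in range(x0, x1):
                        conf[y][x] += w
    # Average over rounds -> final per-pixel confidence
    return [[v / rounds for v in row] for row in conf]
```

Averaging votes across many random partitions is what gives the method its robustness: a pixel on the object keeps landing in high-similarity patches regardless of how the partition boundaries fall.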
Yuning Jiang, Jingjing Meng, Junsong Yuan, and Jiebo Luo, “Randomized spatial context for object
search,” IEEE Transactions on Image Processing, to appear, 2015.
LOCALISATION
▸ Object localisation is done using RVP
▸ Drawbacks
▸ First, RVP depends on a heuristic segmentation coefficient
α (alpha) to locate target objects in an image
▸ Second, its performance drops with insufficient rounds of
partition, as the confidence map is not salient enough
▸ Hence RVP is used along with Max Path for better accuracy
ALGORITHM
▸ A video V = {F1, F2, …, Fn}, where Fi is the i-th frame
▸ Assumption: trajectories are non-overlapping
▸ V is split into chunks {V1, V2, …, Vm}
▸ Ti = {Ti1, Ti2, …, Til}, where l is the total number of object trajectories in chunk Vi
▸ Tij = {Bij1, Bij2, …, Bijk}, where k is the total number of frames in trajectory Tij
▸ To find the best trajectory: Ti* = argmax s(Tij), where Tij ∈ Ti [equation 1]
▸ s(Tij) = ∑ s(B), where B ∈ Tij [equation 2]
▸ Once we have the best trajectory Ti* for each video chunk, we can then return
the ranked results of all trajectories.
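Equations 1 and 2 amount to scoring each trajectory by summing its per-box confidences and taking the argmax. A minimal sketch, assuming trajectories are encoded as lists of (box, score) pairs (an illustrative encoding, not the paper's data structure):

```python
def trajectory_score(traj):
    """s(Tij) = sum of the per-box confidence scores s(B)  [equation 2]."""
    return sum(score for _box, score in traj)

def best_trajectory(trajectories):
    """Ti* = argmax over Tij in Ti of s(Tij)  [equation 1]."""
    return max(trajectories, key=trajectory_score)
```

The hard part is not this argmax itself but enumerating candidate trajectories efficiently, which is where Max Path comes in.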
WHERE DOES MAX PATH COME
INTO PLAY…
to solve equation 1
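A simplified view of the dynamic program behind a Max-Path search: the real algorithm tracks 2-D bounding boxes across frames, but the same recurrence can be shown in 1-D, where a path picks one position per frame and may move at most `step` positions between consecutive frames. This is a sketch of the idea, not the paper's implementation.

```python
def max_path(score_maps, step=1):
    """Max-scoring path through per-frame confidence maps (1-D positions).

    score_maps[t][x] is the confidence at position x in frame t.
    Returns (path, score): one position per frame and the total score.
    """
    T, W = len(score_maps), len(score_maps[0])
    best = list(score_maps[0])   # best[x]: max path score ending at x so far
    back = [[-1] * W]            # backpointers for traceback
    for t in range(1, T):
        cur, bk = [], []
        for x in range(W):
            lo, hi = max(0, x - step), min(W, x + step + 1)
            p = max(range(lo, hi), key=lambda i: best[i])  # best predecessor
            cur.append(best[p] + score_maps[t][x])
            bk.append(p)
        best = cur
        back.append(bk)
    # Trace back from the best end position
    x = max(range(W), key=lambda i: best[i])
    path = [x]
    for t in range(T - 1, 0, -1):
        x = back[t][x]
        path.append(x)
    path.reverse()
    return path, max(best)
```

The DP visits each position once per frame, which is what makes the joint spatiotemporal search tractable compared with scoring every possible path.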
ARIZONA STATE UNIVERSITY
THERE’S A CATCH
COARSE TO FINE SEARCH
▸ Build two file indexes
▸ Coarsely sampled frames (used to filter out low-confidence
video chunks)
▸ Full dataset frames (computationally expensive)
▸ A ranking is generated from the confidence scores of the
coarsely sampled frames
▸ For the top-ranking chunks, per-frame confidence maps are generated
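The two-stage flow above can be sketched as follows; `coarse_score` and `fine_search` are hypothetical placeholders for the cheap ranking over coarsely sampled frames and the expensive full-frame search, respectively.

```python
def coarse_to_fine_search(chunks, coarse_score, fine_search, top_k=5):
    """Sketch of the two-stage search (hypothetical helper names).

    coarse_score(chunk) ranks chunks cheaply on coarsely sampled frames;
    only the top-k chunks pay for the expensive fine_search on all frames.
    """
    ranked = sorted(chunks, key=coarse_score, reverse=True)
    results = []
    for chunk in ranked[:top_k]:   # low-confidence chunks are filtered out
        results.extend(fine_search(chunk))
    return results
```

The design point is that the expensive per-frame voting and Max-Path search run only on the few chunks that survive the coarse filter, which is what keeps query time low on hours of video.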
OTHER METHODS DEPLOYED
▸ Hessian Affine Detectors [4]
▸ FLANN [5]
RESULTS
▸ K = 200 rounds of partition for the coarse round, with α = 3
▸ K = 50 rounds of partition for the fine round, with α = 3
▸ Max Path is run at a 1:1 aspect ratio with a 3×3 local
neighborhood, a spatial step size of 10, and a temporal step of 1 frame
▸ β = −2 (the negative coefficient)
EFFICIENCY
▸ Dual quad-core processors at 2.3 GHz, 32 GB RAM (no GPU)
▸ Coarse ranking & filtering of 10 objects takes 0.833 seconds
on average
▸ 28.738 seconds to obtain the top 100 trajectories for each
query (3.833 seconds for the frame-wise voting maps and 24.866
seconds for the Max-Path search)
▸ 29.57 seconds in total, excluding I/O, on a 5.5-hour video dataset
OBJECTS QUERIED
Jingjing Meng, Junsong Yuan, Yap-Peng Tan, Gang Wang, "Fast Object Instance Search In
Videos From One Example"
ACCURACY
REFERENCES
1. Tran, D., Yuan, J., & Forsyth, D. (2014). Video Event Detection: From Subvolume
Localization to Spatiotemporal Path Search. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 36(2), 404-416.
2. Yuning Jiang, Jingjing Meng, Junsong Yuan, and Jiebo Luo, “Randomized spatial context
for object search,” IEEE Transactions on Image Processing, to appear, 2015.
3. Jingjing Meng, Junsong Yuan, Yap-Peng Tan, Gang Wang, "Fast Object Instance Search In
Videos From One Example"
4. Michal Perd’och, Ondrej Chum, and Jiri Matas, “Efficient representation of local
geometry for large scale object retrieval,” in Proc. IEEE Conf. on Computer Vision and
Pattern Recognition. IEEE, 2009, pp. 9–16.
5. Marius Muja and David G. Lowe, “Fast approximate nearest neighbors with automatic
algorithm configuration,” in International Conference on Computer Vision Theory and
Applications (VISAPP’09), 2009, pp. 331–340, INSTICC Press.
