Vehicle Re-Identification in Camera Networks with Spatial-temporal Constraints
Yanzi Jin, Vladimir Kozitsky, Peter Paul (PARC, a Xerox company, Webster, NY)
Poster outline:
• Background Study: Motivation, Related Work, Challenges, Problem Definition
• Experiment Preparation: Experiment for Data Collection, Dataset
• Video Processing: Preprocessing Tools, Matching Results
➢ Motivation
Ubiquitous camera networks enable:
• Criminal investigation
• Retail analytics
Unique features of vehicles:
• Rigid shape without joints.
• Severe deformation across different views.
• Regular temporal and spatial movement patterns.
Figure: separation by feature matching only.
Figure: matching results by spatial-temporal constraints only.
64 pairs of ground truth, 30,780 pairs of false matches.
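The "feature matching only" separation relies on some appearance descriptor; the poster does not say which one, so the following is a minimal illustrative sketch using per-channel color histograms compared by histogram intersection (function and parameter names are assumptions, not from the poster):

```python
import numpy as np

def hist_feature(img, bins=16):
    """Per-channel color histogram, L1-normalized.

    A deliberately simple appearance feature: one histogram per
    color channel, concatenated and normalized to sum to 1.
    """
    h = [np.histogram(img[..., c], bins=bins, range=(0, 256))[0]
         for c in range(3)]
    h = np.concatenate(h).astype(np.float64)
    return h / h.sum()

def match_score(h1, h2):
    """Histogram intersection: 1.0 means identical distributions."""
    return float(np.minimum(h1, h2).sum())
```

Ranking all detections in another camera by `match_score` against the query vehicle yields the candidate list that the spatial-temporal constraints then prune.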
➢ Related Work
Current work:
• Human re-identification: feature matching.
• Current vehicle re-identification: inductive loops.
Related computer-vision topics:
• Object tracking: object association across videos.
• Object co-detection: view change.
• Image retrieval: view change.
➢ Dataset
•16 cameras covering Xerox campus, 9 on road
and 7 in parking lot.
•11 routes for participants plus extra 2 for
calibration.
•GPS trace logged by mobile app for ground
truth annotation.
Route design and camera location GPS log visualization
➢ Challenges
• Similar appearance among different vehicles.
• Different appearance of the same vehicle due to view change.
• Lighting conditions (reflections) and occlusion.
➢ Problem Definition
Assumption: road structure and camera locations/orientations are known.
In a camera network with non-overlapping views, given a vehicle observed in one camera, generate a candidate list in every other camera by appearance-feature matching, then reduce the candidate set using spatial-temporal constraints.
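The spatial-temporal pruning step can be sketched as follows. The poster gives no formulas, so the speed bounds, distances, and function names below are illustrative assumptions: a candidate in another camera survives only if its observation time is consistent with a plausible travel time between the two cameras.

```python
def prune_candidates(query_time, candidates, road_dist_m,
                     v_min=5.0, v_max=30.0):
    """Keep candidates whose timestamps are consistent with travel
    between the two cameras.

    query_time   -- timestamp (s) of the query vehicle
    candidates   -- list of (vehicle_id, timestamp) in the other camera
    road_dist_m  -- road distance between the cameras in meters
    v_min, v_max -- assumed speed bounds in m/s (illustrative values)
    """
    t_fast = road_dist_m / v_max   # earliest plausible arrival offset
    t_slow = road_dist_m / v_min   # latest plausible arrival offset
    return [(vid, t) for vid, t in candidates
            if t_fast <= t - query_time <= t_slow]
```

With cameras 500 m apart and speeds between 5 and 30 m/s, only candidates seen roughly 17 to 100 seconds after the query remain, which is how the candidate set shrinks from thousands of appearance matches to a handful.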
➢ Preprocessing Tools

Tool                  | Question it answers                              | Approach
----------------------|--------------------------------------------------|---------------------------------------------------------
Object Extractor      | Which frames contain objects in a 2-hour video?  | Foreground-aware MoG background model with 'ghost' elimination.
Video Player          | How to see the object without waiting?           | Frame jumping between sequences with moving objects.
Vehicle Annotator     | How to get the object of interest?               | Saves the user's drawing and frame number.
Groundtruth Annotator | How to match objects across cameras?             | Reduces annotation candidates by GPS timestamp matching.
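The poster does not detail its foreground-aware MoG variant, so here is a simplified single-Gaussian-per-pixel stand-in that illustrates the "foreground aware" idea: pixels classified as foreground are excluded from the model update, so a slow or stopped vehicle is not absorbed into the background and does not later leave a 'ghost' (all parameter names and values are assumptions):

```python
import numpy as np

def update_background(frame, mean, var, alpha=0.01, k=2.5):
    """One Gaussian per pixel, updated only at background pixels.

    frame      -- grayscale frame (2-D array)
    mean, var  -- per-pixel background mean and variance (updated in place)
    alpha      -- learning rate
    k          -- foreground threshold in standard deviations
    Returns the boolean foreground mask.
    """
    frame = frame.astype(np.float64)
    fg = np.abs(frame - mean) > k * np.sqrt(var)   # foreground mask
    bg = ~fg
    # foreground-aware update: only background pixels adapt the model
    mean[bg] += alpha * (frame[bg] - mean[bg])
    var[bg] += alpha * ((frame[bg] - mean[bg]) ** 2 - var[bg])
    return fg
```

A full MoG keeps several Gaussians per pixel (as in OpenCV's `BackgroundSubtractorMOG2`); the single-Gaussian version above is just the smallest model that exhibits the foreground-aware update.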
➢ Matching Results
Can we identify the same vehicle in different videos?
• 16 videos in total, with different viewpoints.
• All videos start at the same time and last two hours (about 216,000 frames each).
• 16 GPS traces, 2 of which are invalid.
• Three cameras along San Jose Road annotated:
  118_SE_Exterior: 86 objects; 111_South_East: 153 objects; Euston_Rd_San_Jose: 74 objects.
• 64 pairs of annotated ground truth.
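The GPS-timestamp matching used by the Groundtruth Annotator can be sketched as follows: from a participant's GPS trace, recover the times at which the trace passes near a camera, so the annotator only needs to inspect frames around those times. The radius, camera coordinates, and function names are illustrative assumptions.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def passing_times(gps_trace, cam_lat, cam_lon, radius_m=50.0):
    """Timestamps at which a GPS trace [(t, lat, lon), ...] comes
    within radius_m of a camera -- the only moments an annotator
    needs to review in that camera's video."""
    return [t for t, lat, lon in gps_trace
            if haversine_m(lat, lon, cam_lat, cam_lon) <= radius_m]
```

Matching these timestamps against detections in each camera is what reduces the annotation candidates from a two-hour video to a few short windows.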
