The 2012 ICSI/Berkeley Video Location Estimation System

  1. 1. The 2012 ICSI / Berkeley Location Estimation System. Jaeyoung Choi, Venkatesan Ekambaram, Gerald Friedland and Kannan Ramchandran. ICSI / UC Berkeley, USA. October 4th, 2012.
  2. 2. Agenda • Baseline Approach • Drawbacks • Graphical Model Framework • Result
  3. 3. Baseline Approach • Investigate the ‘spatial variance’ of a feature: • if the spatial variance is small, the feature is likely location-indicative • if the spatial variance is large, the feature is likely not location-indicative
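
The spatial-variance test could be sketched as follows. This is only an illustration under stated assumptions: the slides do not define the variance precisely, so here it is taken as the mean squared great-circle distance (km²) from a tag's centroid, and the names `spatial_variance`, `is_location_indicative` and the `threshold` value are hypothetical.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(min(1.0, h)))

def spatial_variance(points):
    """Mean squared distance (km^2) of the points from their centroid.

    Returns None when the tag has no training matches (the "N/A" case).
    The centroid is a plain average of latitude and longitude, an
    assumption that only holds away from the poles and the date line.
    """
    if not points:
        return None
    lat = sum(p[0] for p in points) / len(points)
    lon = sum(p[1] for p in points) / len(points)
    return sum(haversine_km(p, (lat, lon)) ** 2 for p in points) / len(points)

def is_location_indicative(points, threshold=100.0):
    """Hypothetical rule: a small variance marks a location-indicative tag."""
    var = spatial_variance(points)
    return var is not None and var < threshold

# Illustrative coordinates: a tag seen only around one campus versus a
# tag seen on three continents.
campus = [(37.8719, -122.2585), (37.8724, -122.2595), (37.8702, -122.2590)]
spread = [(37.87, -122.26), (48.85, 2.35), (-33.87, 151.21)]
print(is_location_indicative(campus), is_location_indicative(spread))
```

A tag with no matches returns `None` rather than zero, so an unseen tag is never mistaken for a perfectly localized one.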
  4. 4. Example
       Tag               Matches in training set   Spatial variance
       pavement          2                         5.739
       ucberkeley        4                         0.132
       berkeley          14                        68.138
       greek             0                         N/A
       greektheatre      0                         N/A
       spitonastranger   0                         N/A
       live              91                        6453.109
       video             2967                      6735.844
  5. 5. Problem: sparsity caused by a biased dataset
  6. 6. The effect of sparsity. [Figure: percentage [%] against distance error (e) between ground truth and estimation [km]; legend entries 100, 400, 1600, 6400 and >6400.] * A test video from a dense area has a higher chance of being estimated with a lower distance error.
  7. 7. Geo-tagging: an estimation-theoretic viewpoint. Observations: images with tags {berkeley, sathergate, campanile}, {berkeley, haas}, {campanile}, {campanile, haas}, written $\{t_1^k\}, \{t_2^k\}, \{t_3^k\}, \{t_4^k\}$. Estimate: the geo-locations $x_1, x_2, x_3, x_4$.
  11. 11. Interpreting traditional approaches. Locations are random variables: $\{x_1, x_2, \ldots, x_N\}$. Traditional approaches estimate the probability of a location given its tags:
       $p(x_i \mid \{t_i^k\}) \propto \prod_k p(x_i \mid t_i^k)$
       where $p(x_i \mid t_i^k)$ is obtained from the training set. Example: the distribution for the tag “washington” [figure]. Location estimate:
       $\hat{x}_i = \int x_i \, p(x_i \mid \{t_i^k\}) \, dx_i$
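
A minimal sketch of this per-image estimate, assuming the location space is discretised into a handful of cells so that the product becomes a cell-wise multiplication and the integral becomes a weighted sum. The cell coordinates, tag names and probabilities below are invented for illustration and do not come from the slides.

```python
def combine_tag_posteriors(per_tag):
    """Multiply per-tag distributions over the same cells and renormalise."""
    cells = list(per_tag[0])
    joint = {c: 1.0 for c in cells}
    for dist in per_tag:
        for c in cells:
            joint[c] *= dist.get(c, 0.0)
    total = sum(joint.values())
    if total == 0.0:
        raise ValueError("the tag distributions share no cell")
    return {c: v / total for c, v in joint.items()}

def posterior_mean(dist):
    """Discrete counterpart of the integral: probability-weighted mean cell."""
    lat = sum(c[0] * p for c, p in dist.items())
    lon = sum(c[1] * p for c, p in dist.items())
    return (lat, lon)

# Hypothetical cells, keyed by (lat, lon).
BERKELEY, DC, SEATTLE = (37.87, -122.26), (38.90, -77.04), (47.61, -122.33)

# Hypothetical per-tag distributions p(x | tag) learned from training data.
p_washington = {BERKELEY: 0.0, DC: 0.5, SEATTLE: 0.5}
p_capitol = {BERKELEY: 0.1, DC: 0.8, SEATTLE: 0.1}

posterior = combine_tag_posteriors([p_washington, p_capitol])
estimate = posterior_mean(posterior)
print(posterior[DC], estimate)
```

The example also shows a weakness of the mean as a point estimate: when the posterior keeps mass on two distant cells, the mean lands between them.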
  12. 12. Drawbacks. Data sparsity: not all tags in the test set are available in the training set, hence the estimate of $p(x_i \mid t_i^k)$ can be bad. Sub-optimality: the approaches are suboptimal given the data. What we ideally want:
       $p(x_1, x_2, \ldots, x_N \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$
       The mean of the above distribution gives the best estimate of the locations, i.e. for each image we want
       $p(x_i \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$
       Traditional algorithms only give
       $p(x_i \mid \{t_i^k\})$
  13. 13. Bayesian graphical framework. Example nodes: {berkeley, sathergate, campanile}, {berkeley, haas}, {campanile}, {campanile, haas}. Node: geo-location of the image, with potentials $p(x_i \mid \{t_i^k\})$ and $p(x_j \mid \{t_j^k\})$. Edge: correlated locations (e.g. a common tag). Edge potential: strength of an edge (e.g. the posterior distribution of the locations given the common tags), $p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$.
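
The graph structure described here might be built as in the sketch below, which connects two images whenever their tag sets intersect. The function name `build_tag_graph` and the numbering of the four example images are assumptions made for illustration.

```python
from itertools import combinations

def build_tag_graph(tags_by_image):
    """Return {(i, j): common_tags} for every image pair sharing a tag."""
    edges = {}
    for i, j in combinations(sorted(tags_by_image), 2):
        common = tags_by_image[i] & tags_by_image[j]
        if common:
            edges[(i, j)] = common
    return edges

# The four example tag sets from the slides, numbered 1 to 4 here.
images = {
    1: {"berkeley", "sathergate", "campanile"},
    2: {"berkeley", "haas"},
    3: {"campanile"},
    4: {"campanile", "haas"},
}

edges = build_tag_graph(images)
for pair, common in sorted(edges.items()):
    print(pair, sorted(common))
```

Only images 2 and 3 stay unconnected, since they have no tag in common.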
  17. 17. Cooperative geo-tagging. Intuition: images in the training set that have common tags have correlated geo-locations, which the joint distribution captures. Joint probability modeling:
       $p(x_1, x_2, \ldots, x_N \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\}) \propto \prod_i p(x_i \mid \{t_i^k\}) \prod_{(i,j)} p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$
       The second product runs over pairs with at least one common tag. $p(x_i \mid \{t_i^k\})$ is obtained from the training set as before. $p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$ is modeled as an indicator function $I(x_i = x_j)$ if the common tag has low spatial variance or occurs infrequently; e.g. if the common tag is “haas”, it is very likely that the locations are the same. Question: how to estimate the optimal marginal distribution $p(x_i \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$?
  18. 18. Belief propagation updates. An iterative algorithm to approximate the posterior distribution $p(x_i \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$. Gaussian modeling: $p(x_i \mid \{t_i^k\}) \sim N(\mu_i, \sigma_i^2)$. At iteration 0 each node calculates $(\mu_i, \sigma_i^2)$. At iteration t each node updates its location as a weighted mean of its previous location and that of its neighbors:
       $\mu_i^{(t)} = \dfrac{\dfrac{1}{(\sigma_i^{(t-1)})^2}\,\mu_i^{(t-1)} + \sum_{k \in N(i)} \dfrac{1}{(\sigma_k^{(t-1)})^2}\,\mu_k^{(t-1)}}{\dfrac{1}{(\sigma_i^{(t)})^2}}$
       $\dfrac{1}{(\sigma_i^{(t)})^2} = \dfrac{1}{(\sigma_i^{(t-1)})^2} + \sum_{k \in N(i)} \dfrac{1}{(\sigma_k^{(t-1)})^2}$
       The weights reflect the confidence in the measurements, i.e. the higher the spatial variance, the lower the weight.
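
One synchronous round of this precision-weighted update could look like the sketch below. It works on a single scalar coordinate for brevity (latitude and longitude would each be updated the same way), and the function name `bp_step` and the dictionary layout are assumptions rather than the authors' implementation.

```python
def bp_step(mu, var, neighbors):
    """One synchronous update of every node's (mean, variance).

    Each node takes a precision-weighted mean of its own previous mean
    and its neighbors' previous means; precisions (1 / variance) add up.
    """
    new_mu, new_var = {}, {}
    for i in mu:
        precision = 1.0 / var[i] + sum(1.0 / var[k] for k in neighbors[i])
        weighted = mu[i] / var[i] + sum(mu[k] / var[k] for k in neighbors[i])
        new_var[i] = 1.0 / precision
        new_mu[i] = weighted / precision
    return new_mu, new_var

# Two connected nodes: node 1 is confident (variance 1), node 2 is not
# (variance 4), so both means move most of the way towards node 1.
mu = {1: 0.0, 2: 10.0}
var = {1: 1.0, 2: 4.0}
neighbors = {1: [2], 2: [1]}

new_mu, new_var = bp_step(mu, var, neighbors)
print(new_mu, new_var)
```

Because precisions only accumulate, the variances shrink on every round, so a practical run would likely cap the number of iterations; the slides do not say how many were used.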
  19. 19. Belief propagation. Posterior mean and variance assuming Gaussian beliefs: $(\mu_1, \sigma_1^2)$, $(\mu_2, \sigma_2^2)$, $(\mu_3, \sigma_3^2)$. Audio-visual features are incorporated in modeling the edge and node potentials.
  20. 20. Incorporating audio-visual features • GIST features are extracted for the images. • MFCC features are extracted for the audio. • These are now incorporated into the node and edge potentials as exponential distributions:
       $p(x_i, x_j \mid a_i, a_j) \propto \exp\left(-\dfrac{\|x_i - x_j\|}{\|a_i - a_j\|}\right)$
       where $a_i$ are the audio features associated with image i. The intuition is that the closer the audio features are, the higher the probability that the geo-locations are close. Similarly, this can be included in the node potentials, and for the visual features as well.
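
The exponential edge potential can be illustrated with a few lines of code. This sketch assumes plain Euclidean distances for both the locations and the feature vectors, and adds a small `eps` to avoid division by zero for identical features; neither choice is stated in the slides.

```python
import math

def edge_potential(x_i, x_j, a_i, a_j, eps=1e-9):
    """Unnormalised exp(-||x_i - x_j|| / ||a_i - a_j||)."""
    d_loc = math.dist(x_i, x_j)    # distance between candidate locations
    d_feat = math.dist(a_i, a_j)   # distance between audio feature vectors
    return math.exp(-d_loc / (d_feat + eps))

# Same pair of candidate locations, scored with similar and with
# dissimilar (made-up) audio features.
x_i, x_j = (0.0, 0.0), (1.0, 0.0)
similar = edge_potential(x_i, x_j, (0.0, 0.0), (0.5, 0.0))
dissimilar = edge_potential(x_i, x_j, (0.0, 0.0), (2.0, 0.0))
print(similar, dissimilar)
```

With similar features the potential falls off faster as the locations separate, which is how the model favours nearby locations for videos that sound alike.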
  21. 21. Result • Percentage of test videos (out of 4182 videos) estimated within the distances in the top row from the ground-truth location. – run1 - baseline approach without using gazetteer – run2 - graphical model based approach with gazetteer – run3 - baseline approach with gazetteer – run4 - k-NN with GIST visual feature • The graphical model approach with gazetteer outperforms the baseline approaches in the range above 1 km.
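
The evaluation measure described here, the share of test videos whose estimate falls within a given distance of the ground truth, might be computed as below. The thresholds and the two sample videos are placeholders; the results table itself is not part of this transcript.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(min(1.0, h)))

def percent_within(estimates, truths, thresholds_km):
    """Percentage of videos whose error is at most each threshold."""
    errors = [haversine_km(e, t) for e, t in zip(estimates, truths)]
    return {r: 100.0 * sum(err <= r for err in errors) / len(errors)
            for r in thresholds_km}

# Placeholder data: one exact estimate, one that confuses Paris with London.
truths = [(37.87, -122.26), (48.85, 2.35)]
estimates = [(37.87, -122.26), (51.51, -0.13)]

result = percent_within(estimates, truths, [1, 10, 100, 1000])
print(result)
```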
  22. 22. Conclusion • The graphical model framework can achieve a performance improvement over the baseline approach by incorporating results from the test data • Various issues remain to be explored – the modeling of the edge potential • text: hard threshold (current) --> soft • visual/audio features – the assumption of conditional independence of the location distribution given multiple tags
  23. 23. Thank You! Questions? http://mmle.icsi.berkeley.edu Work together with: Venkatesan Ekambaram, Kannan Ramchandran, Giulia Fanti, Howard Lei, Adam Janin, and Gerald Friedland