Face recognition
for augmented reality and
   media management
Viewdle




Face recognition based
media indexing, search,
navigation and editing
Viewdle: center of excellence for video
       A Silicon Valley start-up in Ukraine and California

 VC-backed company
 Specialists in video analysis
 World-renowned scientist in structural recognition: Professor Schlesinger
 4 patents filed in the US
 40 engineers, 8 PhDs, all with advanced degrees in mathematics and computer science
 Culture of innovation: an environment that fosters creativity and dedication
 Headquartered in Palo Alto
Media indexing



    We find
   people in
photos & videos
Performance



 Accurate
 Fast

 Scalable

 Easy to use
Viewdle Video Indexing


   We find
  people in
    video
Why do we search for people in video?

   People's names are among the
    most common terms used in
    search queries
Performance


 Accurate
 Fast

 Scalable

 Easy to use
Video indexing workflow

 Video 1: indexing & agglomerative clustering of faces
 Manual face tagging
 Person DB population
 Video 2: indexing & agglomerative clustering of faces
 Automatic face suggestion/tagging (face recognition)
 Other modalities
 Fusion & search index creation
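
The agglomerative clustering step named in this workflow can be illustrated with a minimal sketch. The slides do not say which linkage, distance, or features Viewdle uses, so average linkage, Euclidean distance, the `max_distance` cut-off, and the toy 2D feature vectors below are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative_cluster(features, max_distance):
    """Average-linkage agglomerative clustering with a distance cut-off.

    Starts with one cluster per face and repeatedly merges the two closest
    clusters until the closest pair is farther apart than max_distance.
    Returns a list of clusters, each a list of indices into `features`.
    """
    clusters = [[i] for i in range(len(features))]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(euclidean(features[i], features[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > max_distance:
            break  # remaining clusters are too far apart to merge
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical face feature vectors: two people, five detected faces.
faces = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0), (0.05, 0.05)]
groups = agglomerative_cluster(faces, max_distance=1.0)
```

Each resulting group can then be tagged once (manually or by recognition) instead of tagging every face separately.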
Search workflow

 Search within index of tagged or recognized videos
 Play video from moment of person appearance
 Provide navigation by person appearances
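
A rough sketch of how such a search could work: an index maps each person to the videos and time spans where they appear, so a query returns the videos, the moment to start playback, and the spans for navigation. The data layout, function names, and sample records are hypothetical and are not taken from Viewdle's product.

```python
from collections import defaultdict


def build_person_index(appearances):
    """Build person -> {video_id: sorted [(start_sec, end_sec), ...]}.

    `appearances` is an iterable of (person, video_id, start_sec, end_sec).
    """
    index = defaultdict(lambda: defaultdict(list))
    for person, video_id, start, end in appearances:
        index[person][video_id].append((start, end))
    for videos in index.values():
        for spans in videos.values():
            spans.sort()
    return index


def search(index, person):
    """Return [(video_id, first_appearance_sec, all_spans)] for a person."""
    videos = index.get(person, {})
    return [(vid, spans[0][0], spans) for vid, spans in sorted(videos.items())]


# Hypothetical tagged or recognized appearances.
appearances = [
    ("Alice", "v1", 42.0, 50.5),
    ("Bob", "v1", 10.0, 12.0),
    ("Alice", "v2", 3.0, 9.0),
    ("Alice", "v1", 5.0, 8.0),
]
index = build_person_index(appearances)
results = search(index, "Alice")
```

The second element of each result is the position at which playback would start; the list of spans supports jumping between a person's appearances.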
Technology applications

 Video & photo archive search
 Personal media search
 Improved media sharing experience for desktop and mobile
 Augmented reality
 Video editing
Technology applications (continued)

 Search engine optimization
 Content filtering
 Security
Face based search advantages

 More relevant results than from text
 Pointing to exact position in video
Not relevant OCR
Problems in video

   Low video quality (YouTube 420x300)
   Low-quality faces even in high-quality videos
   Complex lighting, facial expression,
     pose, occlusion (microphone or cup)
   Large number of people to recognize
   Similar-looking people
Digital TV not ideal
Faces clustering challenge
How do we solve these problems?

 Cascaded detectors and recognizers
 Large set of manually created ground truth data (> 100k images)
 Automated optimization of detector and recognizer thresholds
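
One way automated threshold optimization against ground truth data could look is sketched below: try every candidate threshold and keep the one that gives the highest recall while precision stays above a target. The selection criterion, the function name `pick_threshold`, and the sample scores are assumptions; the slides do not describe the actual procedure.

```python
def pick_threshold(scored, min_precision):
    """Pick the score threshold with the best recall at a precision floor.

    `scored` is a list of (score, is_correct) pairs from ground truth data.
    Matches with score >= threshold are accepted.
    Returns (threshold, precision, recall), or None if no threshold
    reaches min_precision.
    """
    total_positive = sum(1 for _, ok in scored if ok)
    best = None
    for threshold in sorted({s for s, _ in scored}):
        accepted = [ok for s, ok in scored if s >= threshold]
        true_positives = sum(accepted)
        precision = true_positives / len(accepted)
        recall = true_positives / total_positive if total_positive else 0.0
        if precision >= min_precision and (best is None or recall > best[2]):
            best = (threshold, precision, recall)
    return best


# Hypothetical recognizer scores labeled against ground truth.
scored = [
    (0.95, True), (0.90, True), (0.80, True),
    (0.70, False), (0.60, True), (0.40, False),
]
choice = pick_threshold(scored, min_precision=0.95)
```

With a large labeled set, the same search can be repeated per detector or recognizer stage whenever the models change.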
Video Processing Performance

 Accuracy: recall 65+%, precision 95+%
 Realtime processing
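
For readers unfamiliar with the two metrics: precision is the share of reported face matches that are correct, and recall is the share of true appearances that were found. The sketch below computes both from counts; the counts are invented to land near the quoted figures and are not Viewdle measurements.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN).
    A ratio with an empty denominator is reported as 0.0.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 1000 true appearances, 650 found, 30 wrong matches.
precision, recall = precision_recall(
    true_positives=650, false_positives=30, false_negatives=350
)
```

A high precision with a moderate recall means the system rarely mislabels a person but misses some appearances.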
Mobile targets

   Realtime 2D face tracking
   Online and offline face recognition
   Realtime 3D face tracking (future)
Teaser video
Mobile Challenges

   Realtime video stream processing
   Less powerful processors (1 GHz ARM)
   Less capable OS (Android)
Desktop vs Mobile

   Every target platform requires
    specific tuning and optimization for
    best user experience
   Single algorithmic C++ code base
Future mobile hardware

   Qualcomm quad-core 2.5 GHz
    Snapdragon ARM processor
   Texas Instruments OMAP 5 platform
   NVIDIA quad-core ARM with GPU
   AMD dual-core Bobcat with GPU
   Intel Atom for phones
Reuters Demo

   http://reuters.viewdle.com
Search page
Exact person selection
Video selection
Opening video at position
Video navigation by people
Photo vs Video challenges

   Face detection is slower (images have larger
    dimensions, and usable faces can be smaller)
   Rotated and profile faces are harder to pick up
    (in videos they can frequently be tracked from
    frontal views)
   More reliable face feature detection is required
   Fewer faces require a better face similarity
    function
   Face clustering heuristics are more important
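
As an illustration of what a face similarity function does, the sketch below compares two feature vectors with cosine similarity and applies a threshold. The slides do not say which similarity measure Viewdle uses; cosine similarity, the 0.8 threshold, and the toy vectors are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def same_person(a, b, threshold=0.8):
    """Decide whether two face feature vectors likely belong to one person."""
    return cosine_similarity(a, b) >= threshold


# Hypothetical feature vectors for three detected faces.
face_a = (0.9, 0.1, 0.3)
face_b = (0.8, 0.2, 0.35)
face_c = (0.1, 0.9, 0.0)
```

In photos there are fewer samples per person to average over than in video, so a single pairwise comparison like this has to be more discriminative.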
Photo improvements done

   Added profiles
   Improved speed
   Multi-cascaded scheme for different rotations
   Improved general recall/precision
   Added face grouping heuristics
   Brought the Photo-Video SDK to production
   Mobile-specific technology tuning
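
The multi-cascaded scheme for different rotations might be pictured as one detector cascade per rotation angle, where each cascade runs cheap checks first and rejects a candidate window at the first failing stage. The stage functions, the window representation, and the angles below are placeholders invented for this sketch, not Viewdle's detectors.

```python
def run_cascade(window, stages):
    """Return True only if every stage accepts; stop at the first rejection."""
    for stage in stages:
        if not stage(window):
            return False
    return True


def detect_with_rotations(window, cascades_by_angle):
    """Run one cascade per rotation and return the angles that accept."""
    return [
        angle
        for angle, stages in cascades_by_angle.items()
        if run_cascade(window, stages)
    ]


def min_contrast(limit):
    """Cheap first stage: reject flat, low-contrast windows."""
    return lambda w: w["contrast"] >= limit


def min_angle_score(angle, limit):
    """Costlier later stage: a stand-in for a rotation-specific classifier."""
    return lambda w: w["angle_scores"].get(angle, 0.0) >= limit


# One hypothetical cascade per rotation, cheapest stage first.
cascades = {
    angle: [min_contrast(0.3), min_angle_score(angle, 0.5)]
    for angle in (0, 90, 180, 270)
}

upright_face = {"contrast": 0.6, "angle_scores": {0: 0.9, 90: 0.2}}
flat_patch = {"contrast": 0.1, "angle_scores": {0: 0.9}}
hits = detect_with_rotations(upright_face, cascades)
```

Early rejection keeps the cost per window low, which matters when every window has to be tested at several rotations.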
Photo SDK demo
Mobile Demo
Things to improve

   Face detection recall/precision (improving
    training set)
   Feature detection improvement
   Profile recognition
   Clustering heuristics tuning
   Speed improvement
Conclusion

 Viewdle provides high-quality
  products & technology for media
  indexing, search and navigation
 Viewdle creates new user
  experiences that never existed
  before
Looking for talent

   Viewdle invites young scientists
    and engineers interested in image
    processing, machine learning, and
    object tracking to work at Viewdle
   Send applications to
    jobs@viewdle.com or
    ysm@viewdle.com
Thank you for your attention!




    Yuriy Musatenko, Ph.D., CTO.
 Ivan Kovtun, Ph.D., Head of research.
             Viewdle, Inc.

Face recognition for augmented reality and media management. Viewdle. 2011.
