Lumos: A Self-Serve Computer Vision Platform at AI NEXT Conference

AI NEXT Conference 2017 Seattle by Fei Yang
Video: https://www.youtube.com/channel/UCj09XsAWj-RF9kY4UvBJh_A

Published in: Technology

  1. Lumos: A Self-Serve Computer Vision Platform. Fei Yang, Research Scientist, Computer Vision, AML, Facebook
  2. Why Computer Vision?
  3. Why Computer Vision?
     • Enhanced photo / video search, e.g. "Search photos posted by my friends containing a black bear"
  4. Why Computer Vision?
     • Enhanced photo / video search
     • Detecting malicious content
  5. Why Computer Vision?
     • Enhanced photo / video search
     • Detecting malicious content
     • Helping visually impaired people
  6. Why Computer Vision?
     • Enhanced photo / video search
     • Detecting malicious content
     • Helping visually impaired people
     • Smart camera
  7. Challenges of a CV platform
     • Large scale
     • Low latency
     • Reliability
     • Flexibility
  8. Lumos: Facebook’s Self-Serve Computer Vision Platform
  9. Runs on billions of images
     • Describes photos to the blind
     • Resurfaces notable memories
     • Provides better image and video search results
     • Protects people from objectionable content
     More than 200 visual models
     • Currently trained and deployed
     • Dozens of teams across the company build their own models self-serve
     100+ million examples in Lumos datasets, and growing fast
  10. [Diagram: deep residual network for Task T1]
  11. [Diagram: deep residual network for Task T1. Training: weeks]
  12. [Diagram: two deep residual networks, one for Task T1 and one for Task T2. Training: weeks]
  13. [Diagram: deep residual network for Task T1 with layers 1, 2, ..., n-1, n; Task T2 shown against those layers on a scale from "less compute / less accuracy" to "more compute / more accuracy"]
  14. [Same diagram as slide 13. Compute time: 1-2 days. Accuracy: less]
  15. [Same diagram as slide 13. Compute time: 1 month. Accuracy: more]
  16. [Same diagram as slide 13, with Tasks T1, T2, T3, T4, ..., Tm]
  17. [Plot: accuracy vs. compute for Tasks T1, T2, T3, T4, ..., Tm]
  18. Lumos allows everyone at Facebook to build and deploy new computer vision models on the fly
      • Collect training data for your new model
      • Train your new model at the right accuracy/computational cost tradeoff
      • Refine your model based on live performance
      • Deploy your model to production
  19. [Diagram: Lumos and the teams and products shown with it: On This Day, Accessibility, 360 Media Team, Connectivity Lab, Protect and Care, Moments, News Feed, Photo Search]
  20. Continuous Stream of Photos
  21. Automatic Alt Text
  22. Connectivity: original GPW4 map vs. Facebook high-res map
  23. Detect Houses
  24. Binary encoding: compact representations
      • Indexing billions of photos
      • Finding similar photos in microseconds
      Example codes: 1011001011…0101, 1110101101…0010, 0001111000…1010, 1111111001…0001, 1010101010…1001, 0001111110…1010, 0101101001…1111, 1001111000…1010, 0001001001…0010
  25. [Figure: query image and similar images]
  26. Clusters hundreds of millions of photos into millions of clusters
      • Approach: a fast binary k-means algorithm
        – Works directly on similarity-preserving binary hashes of images.
        – Clusters image hashes into binary centers.
        – Builds hash indexes of binary centers to speed up computation.
  27. Video Understanding
      • Objects: dog, cat, ...
      • Action: chasing
      • Scene: garden
      • Caption: "Dog chasing cat in garden while people are laughing"
      • Shot boundary detection
      • Summarization
      • Saliency detection
      • Dynamic compression
      • Future prediction
      • Video Q&A
  28. Beating humans on identifying sports
  29. Continuous stream of videos
  30. Mobile Vision: accuracy, speed, size. Small, fast, accurate models
  31. Mobile Vision
  32. Pose estimation
  33. 3D Point Cloud
  34. Thank you!
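
Slides 24 and 25 describe encoding photos as compact binary codes and retrieving similar photos from those codes. The slides do not show how this is implemented, so the following is only a minimal sketch of the general idea, assuming similarity is measured by Hamming distance (the number of differing bits). The function names, the 10-bit example codes, and the brute-force scan are illustrative assumptions; a system at the scale the slides mention would need an index rather than a linear scan.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two equal-length binary codes differ."""
    return bin(a ^ b).count("1")


def nearest(query: int, index: List[int], k: int = 3) -> List[Tuple[int, int]]:
    """Return the k closest codes as (distance, position-in-index) pairs.

    Brute-force scan over every stored code (an assumption made for
    clarity, not the method used by Lumos).
    """
    scored = sorted((hamming(query, code), i) for i, code in enumerate(index))
    return scored[:k]


# Hypothetical 10-bit codes; real codes would be longer.
index = [0b1011001011, 0b1110101101, 0b0001111000, 0b1111111001]
query = 0b1011001111

print(nearest(query, index, k=2))  # [(1, 0), (4, 1)]
```

Because each comparison is an XOR followed by a bit count, binary codes are cheap to store and compare, which is presumably what makes them attractive for indexing very large photo collections.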
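
Slide 26 names a fast binary k-means algorithm that clusters image hashes into binary centers. The slide gives no further detail, so the sketch below is one plausible reading under stated assumptions: points are assigned to the nearest center by Hamming distance, and each center is updated by a bitwise majority vote over its members. The hash index over centers mentioned on the slide is not implemented here (assignment is brute force), and all names and the 8-bit example hashes are hypothetical.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two binary hashes differ."""
    return bin(a ^ b).count("1")


def majority_center(members: List[int], n_bits: int) -> int:
    """Bitwise majority vote: bit j is set if more than half the members set it."""
    center = 0
    for j in range(n_bits):
        ones = sum((m >> j) & 1 for m in members)
        if 2 * ones > len(members):
            center |= 1 << j
    return center


def assign(hashes: List[int], centers: List[int]) -> List[int]:
    """Label each hash with the position of its nearest center (ties go to the first)."""
    return [
        min(range(len(centers)), key=lambda c: hamming(h, centers[c]))
        for h in hashes
    ]


def binary_kmeans(
    hashes: List[int], initial_centers: List[int], n_bits: int, max_iters: int = 10
) -> Tuple[List[int], List[int]]:
    """Alternate assignment and majority-vote updates until the centers stop changing."""
    centers = list(initial_centers)
    for _ in range(max_iters):
        labels = assign(hashes, centers)
        updated = []
        for c in range(len(centers)):
            members = [h for h, label in zip(hashes, labels) if label == c]
            # Keep the old center if a cluster ends up empty (an assumption).
            updated.append(majority_center(members, n_bits) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return centers, assign(hashes, centers)


# Hypothetical 8-bit hashes forming two obvious groups.
hashes = [0b00000001, 0b00000010, 0b00000011, 0b11111110, 0b11111101, 0b11111100]
centers, labels = binary_kmeans(hashes, [hashes[0], hashes[3]], n_bits=8)
print([format(c, "08b") for c in centers], labels)
```

Keeping the centers binary means the same XOR-and-count distance works for both points and centers, so the centers themselves can be hashed and indexed, which is consistent with the speed-up the slide describes.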
