
WannaEat: A computer vision-based, multi-platform restaurant lookup app



"Developing a new product nowadays can be a challenge. A service cannot survive without supporting multiple platforms, having a smooth user experience, and providing features that just work. This presentation shows how such a prototype was built in 10 days, rethinking best practices and introducing a novel approach to restaurant suggestion. A number of technologies and platforms are covered, including server-side development with NodeJS, iOS and Android development with React-Native, and image processing with TensorFlow."

Published in: Technology


  1. 1. Oct. 28, 2017. Steve, Cedric, Liboon. Rakuten, Inc.
  3. 3. Stevche Radevski: MSc Software Engineering. Working on internal business support/analytics tools. Mainly working with JavaScript (React, React Native, NodeJS). Cedric Konan: MSc Ubiquitous Computing. Working on Dining services tools. Mainly working with Java, but a PHP lover. Tan Li Boon: BSc Computer Science. Working on big data with DevOps. Mainly working with Java (Spark, Couchbase, Hadoop).
  5. 5. Given 3 months to do anything we want! Every Friday. Any topic. Work however we want. FREEDOM!
  6. 6. Play around with Machine Learning. Do something differently. Utilize everyone’s specialized skills and learn from each other. Multiplatform mobile app development.
  13. 13. Fast development. Multi-platform. Easily interfaceable. Plug and play.
  14. 14. A JavaScript-based framework to build native, multi-platform mobile applications. 1487 official contributors on GitHub. 1000+ mobile apps created with it.
  15. 15. Easy to use (especially if you know React). Based on JavaScript. Uses markup language syntax. Reusable components. Very well supported (documentation, active community).
  16. 16. Diagram labels: App, Result list, Star Rating Component, Row component, List View.
  19. 19. Recognize what a restaurant sells based on its food pictures!
  22. 22. No billing surprises. Demonstration of expertise. Specialized models. A technology company should maintain in-house infrastructure.
  23. 23. UEC FOOD-256: 256 food categories, 100 images per category. Reference: Yoshiyuki Kawano and Keiji Yanai, The University of Electro-Communications, Tokyo, Japan.
  24. 24. Inception-v3 is a convolutional neural network (ConvNet). It takes 2 to 3 weeks on multiple GPUs to train a ConvNet from scratch! If you need to tweak the network parameters, you have to re-train the whole thing. Clearly this is not reasonable. Enter Transfer Learning…
  26. 26. Use the outputs of another trained network as generic image feature detectors, and train a new shallow model using these outputs. Diagram labels (Original and Target networks, each): Images and tags, conv1, conv2, softmax, loss. Pre-trained.
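
A minimal sketch of the idea on this slide, in plain Python so it runs without TensorFlow. It is not the presenters' code. `pretrained_features` is a hypothetical stand-in for the frozen pre-trained network (in the talk, presumably Inception-v3), and the toy "images" are made-up lists of pixel values. Only the new softmax layer is trained; the feature extractor is never updated.

```python
import math
import random

def pretrained_features(image):
    """Hypothetical stand-in for a frozen, pre-trained network.

    A real pipeline would return activations from a late layer of a
    trained ConvNet; here two fixed statistics play that role.
    """
    mean = sum(image) / len(image)
    spread = max(image) - min(image)
    return [mean, spread, 1.0]  # the constant 1.0 serves as a bias input

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(weights, x):
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def train_head(features, labels, num_classes, epochs=300, lr=0.5):
    """Fit only the new softmax layer; the feature extractor is never updated."""
    weights = [[0.0] * len(features[0]) for _ in range(num_classes)]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax(class_scores(weights, x))
            for c in range(num_classes):
                # Cross-entropy gradient with respect to the class-c score.
                error = probs[c] - (1.0 if c == y else 0.0)
                for j, value in enumerate(x):
                    weights[c][j] -= lr * error * value
    return weights

def predict(weights, image):
    scores = class_scores(weights, pretrained_features(image))
    return scores.index(max(scores))

# Toy data (made up): class 0 = dark 8-pixel "images", class 1 = bright ones.
rng = random.Random(0)
images, labels = [], []
for _ in range(10):
    images.append([rng.uniform(0.0, 0.3) for _ in range(8)])
    labels.append(0)
    images.append([rng.uniform(0.7, 1.0) for _ in range(8)])
    labels.append(1)

# Features are computed once by the frozen extractor, then reused every epoch.
features = [pretrained_features(image) for image in images]
weights = train_head(features, labels, num_classes=2)
```

Because the expensive part is frozen, its outputs can be cached and only the small head is re-fitted when labels or categories change, which is what makes retraining cheap compared with training a ConvNet from scratch.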
  29. 29. Mine Restaurants Data. Diagram components: Google Vision API, Backend API, Vision API, WannaEat Vision, Restaurants.json, Image–Tag Dictionary. Step labels (the diagram numbers its steps 1 to 8): Get 1000 restaurants from Google Maps. Generate tags list per image. Get tags per image per restaurant. Store reduced tags per restaurant.
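
The "store reduced tags per restaurant" step could look roughly like the sketch below. The slide does not say how tags were reduced, so keeping the most frequent tags across a restaurant's images is an assumption, and the function name, restaurant names, file names, and tags are all hypothetical.

```python
from collections import Counter

def reduce_tags(restaurant_images, image_tags, top_k=3):
    """Collapse per-image tag lists into a short tag list per restaurant.

    Assumption: "reduced" means keeping the top_k tags that appear in
    the most images of that restaurant.
    """
    reduced = {}
    for restaurant, images in restaurant_images.items():
        counts = Counter()
        for image in images:
            # Count each tag at most once per image.
            counts.update(set(image_tags.get(image, [])))
        # Most frequent first; ties broken alphabetically for stable output.
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        reduced[restaurant] = [tag for tag, _ in ranked[:top_k]]
    return reduced

# Hypothetical image-tag dictionary, as a vision service might return it.
image_tags = {
    "a.jpg": ["sushi", "rice"],
    "b.jpg": ["sushi", "fish"],
    "c.jpg": ["ramen", "noodle"],
}
# Hypothetical mapping from restaurant to its mined pictures.
restaurant_images = {
    "Restaurant A": ["a.jpg", "b.jpg"],
    "Restaurant B": ["c.jpg"],
}

restaurant_tags = reduce_tags(restaurant_images, image_tags, top_k=2)
```

The resulting dictionary is small enough to be stored in a single JSON file, which fits the Restaurants.json component named on the slide.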
  30. 30. Diagram components: Mobile, Google Vision API, Backend API, Vision API, WannaEat Vision, Restaurants.json. Flow labels: User uploaded image. Image or tag. Top tag for image. Matching restaurants.
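
A rough sketch of the lookup flow on this slide, under stated assumptions: the request carries either a tag or an image, an image is first turned into its top tag, and restaurants whose stored tags contain that tag are returned. `classify_image` is a hypothetical callable standing in for whichever vision service is used; the request shape and the data are made up for illustration.

```python
def find_restaurants(query_tag, restaurant_tags):
    """Return names of restaurants whose stored tags include the query tag."""
    tag = query_tag.strip().lower()
    return sorted(name for name, tags in restaurant_tags.items() if tag in tags)

def lookup(request, restaurant_tags, classify_image):
    """Handle a request that carries either a 'tag' or an 'image'."""
    if "tag" in request:
        tag = request["tag"]
    else:
        # The vision component reduces the uploaded image to its top tag.
        tag = classify_image(request["image"])
    return find_restaurants(tag, restaurant_tags)

# Hypothetical stored data: reduced tags per restaurant.
stored_tags = {
    "Restaurant A": ["sushi", "fish"],
    "Restaurant B": ["ramen", "noodle"],
    "Restaurant C": ["sushi", "tempura"],
}

def fake_classifier(image_bytes):
    """Stand-in for a real vision call; always answers 'ramen'."""
    return "ramen"

by_tag = lookup({"tag": " Sushi "}, stored_tags, fake_classifier)
by_image = lookup({"image": b"\x89PNG"}, stored_tags, fake_classifier)
```

Passing the classifier in as a parameter keeps the backend indifferent to which vision service produces the tag, in the spirit of the clear interfaces the deck describes.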
  33. 33. Co-location is great (no communication friction). Freedom: possibility to choose topic and tech stack. Small team, so no need for long meetings, tickets, wikis (Trello and p2p talk). Clear system boundaries with clear interfaces between them. Well-defined responsibilities (but still helping each other).
  35. 35. Nothing, we are that good!
  36. 36. Took some time to remember what we did the previous week. Collecting data for restaurants was time-consuming. Vision processing took equally long. Spent half the time figuring out what problem to tackle.
  38. 38. Of course we did! Work in small teams! Have clearly defined responsibilities. Do one thing at a time (switching between projects takes mental effort). Machine Learning is not difficult anymore (many API-as-a-service providers). Rethink best practices (both development and UI/UX).