Build and Host Real-world Machine Learning Services from Scratch @ pycontw2019

Chun-Yu Tseng, Backend Engineer at Umbo Computer Vision
Joe | AI Engineer
Building and Hosting Real-World Machine Learning Services from Scratch
How to make A.I. work?
Joe Tseng
• AI engineer
• Tainan.py - organizer
• PyConTW 15/16/18/19 - speaker
• Our booth
Why this talk?
😎 Cool stuff to share
🛠 An extended version of the previous talk
November 2014
@AppWorks
Conventional 2D Trip Zone
Umbo 3D Virtual Tripwire
Home Surveillance: Intruder Detection
pedestrian
camera
intruder
cat
Object/Event Recognition
High precision / low false-alarm performance
• 3D localized detection
• precise dimension measurement
• machine recognition
DEMO1
DEMO2
DEMO3
Advantage
November 2015
@Songde Rd.
June 2016
@Songde Rd.
[Figure: DQTN network diagram. Inputs: Image-LQ (t+1), Image-LQ (t+2), Image-HQ (t), Image-HQ (t+3). Blocks: DQTN, CNN-3x3(3), CNN-1x1(16), CNN-3x3(16), Fusion. Outputs: Reconstructed HQ(t+1), Reconstructed HQ(t+2).]
ACCV 2016
@Songde Rd.
Demo Workshop in CVPR 2016
@Songde Rd.
Convergence
• Human segmentation + others
• Cloud-based solution
I usually recommend to brainstorm at least
six different projects …
source
Idea: send camera streams to the cloud,
and then do cool stuff 🐣
Autonomous video security platform
Our Solution
Umbo Learning
Cameras
Real-time alerts
Umbo Light
Issues That Need to Be Solved
Cameras - hardware/firmware/design
Cloud - provider/streaming/application
A.I. - PoC => service
WebSocket
TruePlatform and Umbo Light
• In production in 2016
• From PoC projects to online services
🚒 🔥 Service 🔥 🚒
• Bugs
• Feature improvement
• Infrastructure improvement
• Deployment
• Health check
• Monitoring
When Real-World Services Meet Real-World Challenges
Factory
95% accuracy 70% 45%
80%
Parking Lot
55% 40%
How to improve this?
Our A.I. solution consists of
heuristic algorithms,
machine learning models,
and data processing pipelines.
Which one should we improve first? 🧐
Collect More Data Data Data
and tweak the machine learning models in the meantime.
How and where to collect data?
Your data processing pipeline should support data collection tasks:
• customer feedback
• sampling algorithms
• false-negative miner
• …
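The data collection tasks named on this slide could look roughly like the sketch below. It is only an illustration: the `Frame` record, its fields, the 0.5 threshold, and the function names are all assumptions, not the pipeline described in the talk.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    """Hypothetical record for one analyzed frame."""
    frame_id: str
    confidence: float                    # model score for the alert class (assumed)
    customer_flag: Optional[str] = None  # "missed" or "false_alarm" if a customer reported it


def mine_false_negatives(frames: List[Frame], threshold: float = 0.5) -> List[Frame]:
    """Frames a customer reported as missed that the model scored below the alert threshold."""
    return [f for f in frames if f.customer_flag == "missed" and f.confidence < threshold]


def sample_uncertain(frames: List[Frame], k: int, threshold: float = 0.5) -> List[Frame]:
    """Pick the k frames whose score lies closest to the decision threshold."""
    return sorted(frames, key=lambda f: abs(f.confidence - threshold))[:k]


def sample_random(frames: List[Frame], k: int, seed: int = 0) -> List[Frame]:
    """Uniform random sample, seeded so a labeling batch can be reproduced."""
    rng = random.Random(seed)
    return rng.sample(frames, min(k, len(frames)))
```

Each function returns a list of frames that could then be sent for labeling; combining customer feedback with uncertainty sampling is one common way to spend a labeling budget on the cases the model handles worst.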
Labeling Platform
• Can dispatch tasks to
• Our own labelers
• Third-party labelling services
• AWS Mechanical Turk
• 😃 integration welcome!
• Audit mechanism
• Backend:
• Flask + plugins
• NumPy
• Celery/Redis/PyMongo
• uWSGI
source: https://github.com/nightrome/cocostuff
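The slide does not say how the audit mechanism works. One simple possibility, sketched here purely as an assumption, is to send each task to several labelers and accept a label only when enough of them agree; everything else goes back for manual review. The function name, the vote format, and the two-thirds threshold are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def audit_labels(
    votes: Dict[str, List[str]],
    min_agreement: float = 2 / 3,
) -> Tuple[Dict[str, str], List[str]]:
    """Split tasks into accepted labels and tasks that need a human audit.

    votes maps a task id to the labels submitted by different labelers.
    """
    accepted: Dict[str, str] = {}
    needs_review: List[str] = []
    for task_id, labels in votes.items():
        if not labels:
            # No labeler answered: nothing to accept.
            needs_review.append(task_id)
            continue
        label, count = Counter(labels).most_common(1)[0]
        if count / len(labels) >= min_agreement:
            accepted[task_id] = label
        else:
            needs_review.append(task_id)
    return accepted, needs_review
```

In a Flask + Celery setup like the one listed above, a function of this kind could run as a background task once all labelers have submitted, but that wiring is not shown in the talk.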
The Simplified CV Architecture
[Diagram components: Media Server, CV Forwarder, Load Balancer, Stream Router, Stream Manager (×3), Load Balancer, CV Worker (×3), Alert Endpoint. Connection labels: ZMQ, HTTP.]
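The diagram shows a Stream Router in front of several Stream Managers but does not say how streams are assigned. As a sketch under that stated gap, the snippet below pins each stream to one manager with a stable hash, so the same camera always lands on the same manager. The function names and the hashing scheme are assumptions for illustration; the real router may work differently.

```python
import hashlib
from typing import Dict, Iterable, List


def pick_manager(stream_id: str, managers: List[str]) -> str:
    """Map a stream id to one manager using a hash that is stable across processes."""
    if not managers:
        raise ValueError("at least one manager is required")
    digest = hashlib.sha256(stream_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(managers)
    return managers[index]


def route(stream_ids: Iterable[str], managers: List[str]) -> Dict[str, List[str]]:
    """Group stream ids by the manager they are assigned to."""
    assignment: Dict[str, List[str]] = {m: [] for m in managers}
    for stream_id in stream_ids:
        assignment[pick_manager(stream_id, managers)].append(stream_id)
    return assignment
```

A plain modulo scheme like this reshuffles most streams when the number of managers changes; consistent hashing would reduce that churn at the cost of more code.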
Python 2 to 3
• Stream Router/Manager was written in Python 2 ❤ ❤
• Python 2 => Python 3
• byte string / decode
• isinstance
• format
• API changes
• Results
• Python 3 ❤ ❤ ❤
• Enables a lot of cool features
• f-strings, typing, asyncio, tracemalloc, etc.
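The byte string / decode, isinstance, and format items above are typical of the changes a Python 2 to 3 port needs. A minimal sketch follows; the helper names and the `stream_id`/`fps` fields are made up for illustration.

```python
from typing import Union


def to_text(value: Union[bytes, str], encoding: str = "utf-8") -> str:
    """Return text whether the input arrived as bytes (e.g. off a socket) or as str.

    In Python 2, str was a byte string; in Python 3, bytes and str are distinct
    types, so the conversion has to be explicit.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, str):
        return value
    raise TypeError(f"expected bytes or str, got {type(value).__name__}")


def describe(stream_id: Union[bytes, str], fps: float) -> str:
    """Format a log line with an f-string (Python 3.6+) instead of % or str.format()."""
    return f"stream={to_text(stream_id)} fps={fps:.1f}"
```

Decoding once at the boundary, where bytes enter the program, keeps the rest of the code working with str only.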
Adopt Python asyncio
• Refactor Stream Manager
• Happy to use async/await
• Use pipeline pattern
• Still use run_in_executor/ThreadPoolExecutor if needed
• Results
• Cleaner architecture
• Performance boost
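A pipeline built on async/await, with `run_in_executor` for the blocking parts, might look like the sketch below. The stage names, the queue sizes, and the `decode_frame` stand-in are assumptions; the actual Stream Manager is not shown in the talk.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List

_STOP = object()  # sentinel marking the end of the stream


def decode_frame(raw: bytes) -> str:
    """Stand-in for blocking work such as video decoding (hypothetical)."""
    return raw.decode("utf-8").upper()


async def producer(frames: Iterable[bytes], out_q: asyncio.Queue) -> None:
    """Stage 1: push raw frames into the pipeline, then signal the end."""
    for raw in frames:
        await out_q.put(raw)
    await out_q.put(_STOP)


async def decoder(in_q: asyncio.Queue, out_q: asyncio.Queue, pool: ThreadPoolExecutor) -> None:
    """Stage 2: run the blocking call in a thread so the event loop stays responsive."""
    loop = asyncio.get_running_loop()
    while True:
        raw = await in_q.get()
        if raw is _STOP:
            await out_q.put(_STOP)
            return
        await out_q.put(await loop.run_in_executor(pool, decode_frame, raw))


async def collector(in_q: asyncio.Queue) -> List[str]:
    """Stage 3: gather results until the end-of-stream sentinel arrives."""
    results: List[str] = []
    while True:
        item = await in_q.get()
        if item is _STOP:
            return results
        results.append(item)


async def run_pipeline(frames: Iterable[bytes]) -> List[str]:
    """Wire the stages together with bounded queues to apply backpressure."""
    q1: asyncio.Queue = asyncio.Queue(maxsize=8)
    q2: asyncio.Queue = asyncio.Queue(maxsize=8)
    with ThreadPoolExecutor(max_workers=2) as pool:
        _, _, results = await asyncio.gather(
            producer(frames, q1), decoder(q1, q2, pool), collector(q2)
        )
    return results
```

Calling `asyncio.run(run_pipeline([b"a", b"b"]))` runs all three stages concurrently. The bounded queues mean a slow stage slows the producer instead of letting frames pile up in memory.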
Torch to PyTorch
• Refactor CV Worker
• Lua => Python
• Torch => PyTorch
• Use aiohttp server
• Results
• Easier to maintain/upgrade
• High performance
• GPU resources are the bottleneck
• Python package ecosystem 👍
Fighting Bugs
• Debug
• Use pyflame to profile the program
• Use tracemalloc to trace memory usage
• Use Valgrind to check your C++ code
• Results
• Understand more and gotcha!
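For the tracemalloc step, comparing two snapshots is a common way to see which source lines are responsible for memory growth. The sketch below uses only the standard library; `leaky` is a made-up stand-in for a suspected leak, and `top_growth` is a hypothetical helper name.

```python
import tracemalloc


def leaky(cache: list, n: int) -> None:
    """Stand-in for a suspected leak: keeps appending to a long-lived list."""
    cache.extend(bytearray(1024) for _ in range(n))


def top_growth(func, *args, limit: int = 3):
    """Run func and return the source lines whose allocations grew the most."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        func(*args)
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to returns StatisticDiff entries sorted by largest change first.
    return after.compare_to(before, "lineno")[:limit]
```

Printing each entry, for example `for stat in top_growth(leaky, [], 1000): print(stat)`, shows the file, line number, and size difference, which is usually enough to locate the offending allocation.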
Outcomes
🏗 Boost maintainability
🕵 Better infrastructure monitoring
🐞 Have more features
Machine Learning Pipelines
Good examples
Uber: Michelangelo
Airbnb: Bighead
Please make sure you have clear goals, practical user stories, and enough resources before you start building the pipeline.
Behind the Machine Learning Pipelines
• Goals
• Timeline
• Outcome
• User stories
• Blueprint
• Concerns
• Resource
• Infra eng
• Data eng
• Researcher
Our Progress
• ✅ Data collection
• ✅ Model training
• ✅ Model evaluation
• 🧐 Model deployment
Cost
• Cloud
• Computing/Network => architecture
• GPU => throughput + algorithm
• ML pipeline
• Computing
• GPU => provider + integration
• Labeling => automation
Service Level Objectives and Maintainability
• SLO
• Monitor first
• Set up alerts
• On-call process
• Maintainability
• DevOps
• Adopt engineering best practices
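To make "monitor first, then set up alerts" concrete, here is a small sketch that turns request counts into an availability figure and an alert decision. The 99.9% target, the 25% error-budget threshold, and all names are assumptions chosen for illustration, not values from the talk.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    availability: float      # fraction of successful requests in the window
    budget_remaining: float  # fraction of the error budget left (negative if overspent)
    should_alert: bool


def evaluate_slo(total: int, failed: int, target: float = 0.999, alert_at: float = 0.25) -> SloReport:
    """Compare observed failures in a window against an availability target."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    availability = 1 - failed / total
    allowed_failures = (1 - target) * total  # the error budget, in requests
    budget_remaining = 1 - failed / allowed_failures
    return SloReport(availability, budget_remaining, budget_remaining < alert_at)
```

With 10,000 requests and a 99.9% target, the budget is about 10 failures: 5 failures leaves roughly half the budget, while 9 failures leaves about 10% and would trigger the alert in this sketch.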
Why Edge?
Cost / Performance / Latency / Network Traffic / Security
Our AiCamera
• Launched in June 2019
• From Chaos to Stability 🚒
• Edge + Cloud solution
• ML pipeline is necessary
• Data
• Model
This is our story, how about yours?
🤔
The takeaway
Product mindset matters
After all, you have to serve your ML service to the real world 😏
What’s next?
Thank you
Join Our Story
• AI Engineer
• Backend Engineer
• Front-end Engineer
• https://umbocv.ai/join_us
• Our booth:
