15. Retraining workflow
[Figure: retraining workflow diagram. Labels: P40 GPU Server; Pass new data (request); Updated data; Rendering (response); Get most recent data; New trained model; Get most recent model; Packing; Docker build & push; Rolling update with zero downtime; Inference (six instances); Load balancing (Request, Response).]
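
The "rolling update" step can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the deployment code behind the slide: the `Replica` class, the `rolling_update` function, the version strings, and the choice of six replicas (matching the six Inference boxes in the diagram) are all hypothetical. It shows the core idea that replicas are replaced one at a time, so the load balancer always has ready instances to route to.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    """One inference container behind the load balancer (hypothetical model)."""
    name: str
    model_version: str
    ready: bool = True


def rolling_update(replicas: List[Replica], new_version: str) -> List[int]:
    """Switch replicas to new_version one at a time.

    Returns the number of ready replicas observed while each replica was
    being replaced, so the caller can check that capacity never hit zero.
    """
    ready_during_update = []
    for replica in replicas:
        replica.ready = False  # drain: the load balancer stops routing here
        ready_during_update.append(sum(r.ready for r in replicas))
        replica.model_version = new_version  # stand-in for starting the new image
        replica.ready = True  # assumed health check passed, back in rotation
    return ready_during_update


# Assumption: six replicas, mirroring the six Inference boxes in the diagram.
replicas = [Replica(f"inference-{i}", "v1") for i in range(6)]
ready_counts = rolling_update(replicas, "v2")
```

In this sketch at most one replica is out of rotation at any moment, which is what keeps the service available during the update. A real orchestrator would also wait on health checks and could replace several replicas per step.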
7. Architecture Design