How to fully automate a store.pptx

How to fully automate a store
A Made in Italy use case

Alessio Elmi
Artificial Intelligence Engineer
Shruti Verma
Michele Toni
Bruno Abbate
Machine Learning Engineer
linkedin.com/in/alessioelmi
linkedin.com/in/shrutiverma2
linkedin.com/in/bruno-abbate
linkedin.com/in/michele-toni
Naser Derakhshan
Computer Scientist
linkedin.com/in/naser-derakhshan-51951828
Pietro Tortella
Mathematician
linkedin.com/in/pietro-tortella-976839ab
Luca Lulleri
Industrial Designer
linkedin.com/in/lucalulleri
Alessandro Re
linkedin.com/in/akiross
Riccardo Di Guida
linkedin.com/in/riccardo-di-guida-005764124
Davide Mazzini
Deep Learning Engineer
linkedin.com/in/davidemazzini
Mattia Santachiara
AR/VR Engineer
linkedin.com/in/mattia-santachiara-90a1b379
Igor Moiseev
Crazy CTO
linkedin.com/in/moiseevigor
4 PhD
8 MSc

Automated checkout
What a beast?

Three main ML problems
1. Object tracking and anomaly detection
2. Pose estimation and people tracking
3. Assignment problem

Hardware Design
The system had to validate the quantity of goods that a user picks up or drops at the same time.

Scales PCB/Firmware
The PICK action (bottle of water)
The DROP action (bottle of water)

Camera Positioning Study
1. Retrieve Cad Drawing of the space
2. 3D modeling of the space
3. Define camera position and direction
4. Grasshopper algorithm to make

Camera Positioning Study
Evolutionary Algorithm and Particle Swarm Optimization to optimize camera positioning.
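A minimal sketch of how Particle Swarm Optimization could be applied to camera positioning. Everything specific here is an assumption made for illustration and is not taken from the slides: the floor is a flat rectangle sampled on a grid, each ceiling camera sees a disc of fixed radius on the floor, and the objective is the fraction of grid points seen by at least one camera. The function names and parameter values are hypothetical.

```python
import random

def coverage(cams, points, radius):
    """Fraction of floor points inside at least one camera footprint (a disc)."""
    r2 = radius * radius
    seen = 0
    for px, py in points:
        if any((px - cx) ** 2 + (py - cy) ** 2 <= r2 for cx, cy in cams):
            seen += 1
    return seen / len(points)

def pso_place_cameras(n_cams, width, depth, radius,
                      n_particles=20, iters=60, seed=0):
    """Search (x, y) positions for n_cams cameras that maximize floor coverage."""
    rng = random.Random(seed)
    points = [(x + 0.5, y + 0.5) for x in range(width) for y in range(depth)]
    dim = 2 * n_cams                      # one (x, y) pair per camera
    upper = [width, depth] * n_cams       # per-dimension upper bound

    def score(vec):
        return coverage(list(zip(vec[0::2], vec[1::2])), points, radius)

    pos = [[rng.uniform(0, upper[d]) for d in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]        # personal best of each particle
    best_val = [score(p) for p in pos]
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # global best so far
    initial_best = g_val

    w, c1, c2 = 0.7, 1.5, 1.5             # inertia, cognitive and social weights
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                # Keep every camera inside the room.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], 0.0), upper[d])
            val = score(pos[i])
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val > g_val:
                    g_val, g_pos = val, pos[i][:]
    return list(zip(g_pos[0::2], g_pos[1::2])), g_val, initial_best

cams, final_cov, initial_cov = pso_place_cameras(3, 10, 6, 3.0)
print(cams, final_cov, initial_cov)
```

A real study would replace the disc footprint with the visibility computed from the 3D model of the space, including occlusions by racks.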

Pose-estimation
● Train 2D pose estimation model using a top-view dataset including renderings from the synthetic datasets
● GPU version of the upsampling model (main bottleneck right now)
● Cameras “software” synchronization
● Reduce CPU and GPU load
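One possible reading of "software" synchronization, sketched under assumptions: each camera delivers frames with its own timestamps, and frames are grouped when their timestamps fall within a tolerance of a reference camera. The data layout, the choice of reference camera and the tolerance are hypothetical and not taken from the slides.

```python
from bisect import bisect_left

def nearest(sorted_ts, t):
    """Return the timestamp in the non-empty sorted list closest to t."""
    i = bisect_left(sorted_ts, t)
    candidates = sorted_ts[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda c: abs(c - t))

def synchronize(streams, tolerance):
    """Group frames across cameras by timestamp.

    streams: dict camera_id -> sorted list of frame timestamps (seconds).
    Returns one dict per reference frame for which every camera has a
    frame within `tolerance` seconds.
    """
    ref_id = min(streams)                 # arbitrary but deterministic reference
    groups = []
    for t in streams[ref_id]:
        group = {ref_id: t}
        for cam, ts in streams.items():
            if cam == ref_id or not ts:
                continue
            c = nearest(ts, t)
            if abs(c - t) <= tolerance:
                group[cam] = c
        if len(group) == len(streams):    # keep only complete groups
            groups.append(group)
    return groups

streams = {"cam0": [0.00, 0.10, 0.20], "cam1": [0.01, 0.11, 0.35]}
print(synchronize(streams, tolerance=0.02))
```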

Tracking: The problem
Match 2D-pose detections from different cameras to create 3D-tracks.
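To illustrate how two 2D detections of the same joint can yield a 3D point, here is a sketch of midpoint triangulation. It assumes each detection has already been back-projected to a viewing ray (camera centre plus direction) using the camera calibration, a step not shown here. The slides do not say which triangulation method was used, so this is a stand-in.

```python
import math

def _sub(a, b): return tuple(x - y for x, y in zip(a, b))
def _add(a, b): return tuple(x + y for x, y in zip(a, b))
def _mul(a, s): return tuple(x * s for x in a)
def _dot(a, b): return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between rays o1 + t*d1 and o2 + t*d2.

    Returns (point, gap), where gap is the length of that segment, or None
    if the rays are (nearly) parallel. A large gap suggests that the two
    detections do not belong to the same person.
    """
    w = _sub(o1, o2)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        return None
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    p1 = _add(o1, _mul(d1, t1))           # closest point on ray 1
    p2 = _add(o2, _mul(d2, t2))           # closest point on ray 2
    return _mul(_add(p1, p2), 0.5), math.dist(p1, p2)

# Two ceiling cameras at height 3 both looking at a joint at (2, 1, 1).
point, gap = triangulate_midpoint((0, 0, 3), (2, 1, -2), (4, 0, 3), (-2, 1, -2))
print(point, gap)
```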

Tracking: The glossary
Detection: one pose in a given frame at a given time
Reconstruction: many detections from different frames at a given time
Track: many reconstructions at different times
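The glossary maps naturally onto three small data types. The sketch below is only one way to write them down; the field names (`camera_id`, `keypoints`, `position`, `track_id`) are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

@dataclass(frozen=True)
class Detection:
    """One 2D pose in a given frame at a given time."""
    camera_id: str
    timestamp: float
    keypoints: Tuple[Tuple[float, float], ...]   # 2D joints, in pixels

@dataclass
class Reconstruction:
    """Several detections from different frames at one time, merged into 3D."""
    timestamp: float
    detections: List[Detection]
    position: Tuple[float, float, float]

    def cameras(self) -> Set[str]:
        return {d.camera_id for d in self.detections}

@dataclass
class Track:
    """Reconstructions of the same person at different times."""
    track_id: int
    reconstructions: List[Reconstruction] = field(default_factory=list)

    def append(self, rec: Reconstruction) -> None:
        # Enforce strictly increasing time along the track.
        if self.reconstructions and rec.timestamp <= self.reconstructions[-1].timestamp:
            raise ValueError("reconstructions must be appended in time order")
        self.reconstructions.append(rec)
```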

Tracking: The approach
Hypergraphs for Joint Multi-view Reconstruction and Multi-object Tracking
by M. Hofmann, D. Wolf, G. Rigoll, 2013

Tracking: The approach
● Construct all possible reconstructions and links
● Associate probabilities to reconstructions
● Associate probabilities to links
● Create Hypergraph
● Reduce to BIP problem
Binary Integer Programming
Minimize cost with integer variables satisfying given constraints
● Boolean variable per vertex
● Boolean variable per link
● Two constraints per vertex
○ Incoming flow = vertex variable value
○ Outgoing flow = vertex variable value
● Additional constraints from detections
○ Each detection may belong to at most one flow
● Cost per vertex variable from reconstruction probability
● Cost per link variable from link probability
● Minimize cost of flow
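The BIP formulation above can be made concrete on a toy graph. The sketch below is illustrative only: it enumerates every binary assignment, which is feasible for a handful of variables but is not how a real system would solve the problem (an integer programming solver would be used). The graph, the artificial source `S` and sink `T` vertices, and all cost values are made up; lower cost stands for higher probability.

```python
from itertools import product

def solve_bip(vertex_cost, link_cost, vertex_dets):
    """Brute-force the min-cost flow BIP.

    vertex_cost: dict vertex -> cost (one Boolean variable per vertex)
    link_cost:   dict (src, dst) -> cost (one Boolean variable per link)
    vertex_dets: dict vertex -> set of detection ids used by that reconstruction
    """
    vertices, links = list(vertex_cost), list(link_cost)
    names = vertices + links
    best_cost, best = float("inf"), None
    for bits in product((0, 1), repeat=len(names)):
        val = dict(zip(names, bits))
        # Two constraints per vertex: incoming flow = outgoing flow = vertex value.
        if not all(sum(val[l] for l in links if l[1] == v) == val[v]
                   and sum(val[l] for l in links if l[0] == v) == val[v]
                   for v in vertices):
            continue
        # Each detection may belong to at most one selected reconstruction.
        used = {}
        for v in vertices:
            if val[v]:
                for det in vertex_dets[v]:
                    used[det] = used.get(det, 0) + 1
        if any(n > 1 for n in used.values()):
            continue
        cost = (sum(vertex_cost[v] * val[v] for v in vertices)
                + sum(link_cost[l] * val[l] for l in links))
        if cost < best_cost:
            best_cost, best = cost, val
    return best_cost, best

# A and B are competing reconstructions at time 0 (both use detection a1);
# C is a reconstruction at time 1.
vertex_cost = {"A": -2.0, "B": -0.5, "C": -2.0}
vertex_dets = {"A": {"a1", "b1"}, "B": {"a1"}, "C": {"a2", "b2"}}
link_cost = {("S", "A"): 1.0, ("S", "B"): 1.0, ("S", "C"): 1.0,
             ("A", "C"): -1.0, ("B", "C"): 0.5,
             ("A", "T"): 1.0, ("B", "T"): 1.0, ("C", "T"): 1.0}

cost, sol = solve_bip(vertex_cost, link_cost, vertex_dets)
track = [l for l in link_cost if sol[l]]
print(cost, track)
```

On this toy input the cheapest flow is the single track S, A, C, T: reconstruction B is rejected because it shares detection a1 with A.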

Tracking: The approach online
[Figure: time windows Window 0, Window 1, Window 2, Window 3]

● Stable tracks
● Flexible
● No ID switch
● CPU expensive
● Complexity
● Sensitive to parameter calibration
● BIP is NP-hard

Tracking: The doing
Introduced the 3D geometry of the store.
● Use geometric information on cameras and obstacles to filter reconstructions
● Make all parameters position-dependent
RESULTS:
➔ Lighter graph (-50% variables, -20% equations)
➔ Reduced complexity → Better scalability to bigger stores
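The slides do not detail how the store geometry filters reconstructions. One plausible and deliberately simple reading is sketched below: a candidate reconstruction is discarded when its 3D position lies outside the store volume or inside an obstacle such as a rack. The axis-aligned boxes and all coordinates are hypothetical.

```python
def inside_box(p, box):
    """True if point p lies inside the axis-aligned box (min_corner, max_corner)."""
    lo, hi = box
    return all(l <= x <= h for x, l, h in zip(p, lo, hi))

def filter_reconstructions(points, store_box, obstacles):
    """Keep only 3D points inside the store and outside every obstacle."""
    return [p for p in points
            if inside_box(p, store_box)
            and not any(inside_box(p, o) for o in obstacles)]

store = ((0.0, 0.0, 0.0), (10.0, 6.0, 3.0))
racks = [((4.0, 0.0, 0.0), (5.0, 6.0, 2.0))]
candidates = [(2.0, 1.0, 1.5),    # plausible
              (4.5, 3.0, 1.0),    # inside a rack
              (12.0, 1.0, 1.0)]   # outside the store
print(filter_reconstructions(candidates, store, racks))
```

Fewer candidate reconstructions means fewer vertices and links in the graph, which is consistent with the lighter graph reported in the results.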

Object tracking and Anomaly detection

● Detect misplaced products on the scales
● Detect extraneous objects on the scales

1) ODIN (Out-of-distribution detection)
2) Reconstruction-based using Autoencoder
3) Object detection using ResNet-50 + Faster R-CNN
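As a reminder of what ODIN computes, here is a sketch of its scoring step only: the maximum softmax probability after temperature scaling, compared against a threshold. The full method also adds a small gradient-based perturbation to the input before scoring, which is omitted here because it needs the trained network. The logits, temperature and threshold below are made-up values.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def odin_score(logits, temperature=1000.0):
    """Maximum temperature-scaled softmax probability (higher = more in-distribution)."""
    return max(softmax(logits, temperature))

def is_out_of_distribution(logits, threshold, temperature=1000.0):
    return odin_score(logits, temperature) < threshold

known_product = [12.0, 1.0, 0.0]      # one class clearly dominates
unknown_object = [1.1, 1.0, 0.9]      # no class stands out
print(odin_score(known_product), odin_score(unknown_object))
```

In practice the threshold is chosen on validation data, for example to reach a target true-positive rate on in-distribution images.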

[Figure: anomaly classifier. Inputs: a query (input) image and a reference image (the diagram also shows a DB). Each image passes through a ResNet-18 backbone with shared weights. The input features and the reference features are concatenated and fed to a multilayer perceptron. Output: conform / anomaly.]

Assignment problem
The aim is to combine data from cameras and scales to predict events
e = (timestamp, action, scale, product, quantity, user)
[Diagram: 2 input sources (cameras, scales) → data processing (one per source) → data fusion (scale, action, product + quantity, user) → final output: carts. Additional label: trigger.]
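The event tuple e = (timestamp, action, scale, product, quantity, user) and the final carts output can be sketched as follows. The assumptions are labeled in the code: the action values are the strings "pick" and "drop", a pick adds to the user's cart and a drop removes from it. The slides name the PICK and DROP actions but do not spell out the cart update rule.

```python
from collections import defaultdict, namedtuple

Event = namedtuple("Event", "timestamp action scale product quantity user")

def apply_events(events):
    """Replay events in time order and return {user: {product: quantity}}."""
    carts = defaultdict(lambda: defaultdict(int))
    for e in sorted(events, key=lambda ev: ev.timestamp):
        # Assumption: "pick" takes goods off the shelf, "drop" puts them back.
        sign = 1 if e.action == "pick" else -1
        carts[e.user][e.product] += sign * e.quantity
        if carts[e.user][e.product] <= 0:
            del carts[e.user][e.product]
    return {user: dict(cart) for user, cart in carts.items()}

events = [
    Event(10.0, "pick", "scale-3", "water", 2, "u1"),
    Event(25.0, "drop", "scale-3", "water", 1, "u1"),
    Event(12.0, "pick", "scale-7", "pasta", 1, "u2"),
]
print(apply_events(events))
```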

Assignment problem
For each user we compute the trajectories of the distances between relevant joints (wrist, elbow, shoulder) and the scale, around the timestamp of the action. We train the model to classify the action on this data.
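A sketch of the feature described above: for each user, the distances between wrist, elbow and shoulder and the scale, collected in a time window around the action. The trained classifier is not shown in the slides, so a naive rule (pick the user whose wrist comes closest to the scale) stands in for it here. The track layout, window size and function names are hypothetical.

```python
import math

JOINTS = ("wrist", "elbow", "shoulder")

def distance_trajectory(track, scale_pos, t_event, half_window):
    """Rows of [wrist, elbow, shoulder] distances to the scale near t_event.

    track: list of (timestamp, {joint_name: (x, y, z)}) for one user.
    """
    return [[math.dist(joints[j], scale_pos) for j in JOINTS]
            for t, joints in track
            if abs(t - t_event) <= half_window]

def assign_user(tracks, scale_pos, t_event, half_window=1.0):
    """Stand-in for the trained model: the user whose wrist gets closest to the scale."""
    best_user, best_dist = None, float("inf")
    for user, track in tracks.items():
        rows = distance_trajectory(track, scale_pos, t_event, half_window)
        if not rows:
            continue
        closest_wrist = min(row[0] for row in rows)
        if closest_wrist < best_dist:
            best_user, best_dist = user, closest_wrist
    return best_user

scale = (5.0, 2.0, 1.2)
tracks = {
    "u1": [(9.5, {"wrist": (4.6, 2.0, 1.2), "elbow": (4.3, 2.0, 1.3), "shoulder": (4.0, 2.0, 1.5)}),
           (10.0, {"wrist": (4.9, 2.0, 1.2), "elbow": (4.5, 2.0, 1.3), "shoulder": (4.1, 2.0, 1.5)})],
    "u2": [(10.0, {"wrist": (8.0, 4.0, 1.0), "elbow": (8.2, 4.0, 1.2), "shoulder": (8.4, 4.0, 1.5)})],
}
print(assign_user(tracks, scale, t_event=10.0))
```

The rows returned by `distance_trajectory` are the kind of input a classifier would be trained on; the nearest-wrist rule only shows how the feature separates users.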

Metrics
We defined some metrics to evaluate how well the system is performing.
Workflow: record data → annotation tool → calculate, store and analyze metrics
The same metrics can be defined in spaces where we ignore either the user or the action variable.
We also evaluate these metrics on the space of the carts.
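The slides do not name the metrics. As an illustration of evaluating in a space where a variable is ignored, the sketch below uses precision and recall (an assumption) over event tuples, and projects away chosen fields before comparing predictions with annotations. It also assumes that predicted and annotated timestamps have already been aligned to comparable values.

```python
from collections import Counter

FIELDS = ("timestamp", "action", "scale", "product", "quantity", "user")

def project(events, ignore=()):
    """Drop the ignored fields from each event and count the remaining tuples."""
    keep = [i for i, name in enumerate(FIELDS) if name not in ignore]
    return Counter(tuple(e[i] for i in keep) for e in events)

def precision_recall(predicted, annotated, ignore=()):
    """Precision and recall of predicted events against annotated ones."""
    p, a = project(predicted, ignore), project(annotated, ignore)
    true_positives = sum((p & a).values())      # multiset intersection
    precision = true_positives / sum(p.values()) if p else 0.0
    recall = true_positives / sum(a.values()) if a else 0.0
    return precision, recall

annotated = [(10, "pick", "s1", "water", 1, "u1"),
             (20, "drop", "s2", "pasta", 1, "u2")]
predicted = [(10, "pick", "s1", "water", 1, "u1"),
             (20, "drop", "s2", "pasta", 1, "u1")]   # wrong user on the second event

print(precision_recall(predicted, annotated))                    # full event space
print(precision_recall(predicted, annotated, ignore=("user",)))  # user ignored
```

Comparing the two results separates errors in user assignment from errors in detecting the event itself.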

Annotation tool in collaboration with https://itrexgroup.com

[Architecture diagram]
● VideoCapture: Camera-0 ... Camera-n
● VideoManager: triangulation, track creation
● TensorRT InferenceServer: pose estimation
● Gateway-0 ... Gateway-n: Scale-0 ... Scale-9
● DataFusion: pick/drop classification, product classification, user assignment
● Dashboard: visualization, config change
● Backend: carts update, config management, check-in/out handling
● CheckInUI: authentication, check-in
● CheckOutUI: payment, check-out
● MongoDB
● Connection labels: MQTT p/s (publish/subscribe), Network, CANbus, dbactions

Content
Frontend/Backend
● Visualize what’s happening
○ Users & carts
○ 3D reconstructions
● High-level visualization of system’s status
○ Scale gateway status
○ Connection errors
● Interface for store configuration
○ Racks & scales spatial layout
○ Planograms
○ Products DB
○ Cameras configurations
[Diagram labels: Toolbar, App, Snackbar, Cameras, Carts, Products, Racks, Shelf, Tracking3D, MQTT]

Checkin / Checkout frontend
● AngularJS webapps for checkin and checkout UI
● Customer interaction
● Show information and feedback to the customer
