Developing Visual AI
Solutions for Online
Marketplaces
Mladen Fernežir, Lead Data Scientist & Co-Founder
Talk Contents:
Introduction
Velebit AI Agency
Visual Use-cases, Challenges, and Solutions for
Online Marketplaces
Tech Stack
Other AI Projects
AI Agency Business Approach
Introduction
● Velebit AI team: many years of experience building
visual solutions for online marketplaces
● A lot of research, paper reading, and custom
development
● Multiple practical challenges to move from papers to
production
● How to solve the challenges and make business
clients and customers happy?
Velebit AI Agency
Davor
CEO
Mladen
Data Scientist
Tomislav
Software
Engineer
Ivan
Machine Learning
Engineer
Co-founders
About us
● AI consultancy
● AI custom R&D
● Fast prototyping
● Images, text, tabular data
● 8 years of experience with
online classifieds
● Data engineering
● Deployment and
monitoring
Visual Use-cases, Challenges,
and Solutions for Online
Marketplaces
Image (and text) categorization
● Item categorization is a seemingly simple problem
● Just use a pre-trained CNN or Transformer network?
● Thousands of real-world, messy categories in online
marketplaces’ category trees
● Similar items in different parts of the category tree
● Suggestions to users on where to place the item
they are selling have to be accurate and specific
Image (and text) categorization
The model suggestion
can handle different
options in the client’s
category tree, and
offer different degrees
of specificity
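A minimal sketch of how a suggestion could trade specificity for confidence, assuming leaf probabilities are simply summed up the category tree and the deepest node above a threshold is returned. The function name `suggest_node`, the `parent` mapping, the example categories, and the 0.8 threshold are all hypothetical and are not the client algorithm described in the talk.

```python
from collections import defaultdict

def suggest_node(leaf_probs, parent, threshold=0.8):
    """Return the deepest category whose summed probability reaches the threshold."""
    node_prob = defaultdict(float)
    depth = {}
    for leaf, p in leaf_probs.items():
        # Walk from the leaf up to the root, collecting the path.
        path = []
        node = leaf
        while node is not None:
            path.append(node)
            node = parent.get(node)
        # Add the leaf probability to every node on the path (root has depth 0).
        for d, n in enumerate(reversed(path)):
            node_prob[n] += p
            depth[n] = d
    candidates = [n for n, p in node_prob.items() if p >= threshold]
    if not candidates:
        return None
    # Prefer the most specific (deepest) node; break ties by probability.
    return max(candidates, key=lambda n: (depth[n], node_prob[n]))

# Hypothetical category tree: child -> parent (None marks the root).
parent = {
    "Vehicles": None,
    "Cars": "Vehicles",
    "Motorcycles": "Vehicles",
    "Sedans": "Cars",
    "SUVs": "Cars",
}

uncertain = {"Sedans": 0.45, "SUVs": 0.40, "Motorcycles": 0.15}
confident = {"Sedans": 0.90, "SUVs": 0.05, "Motorcycles": 0.05}
print(suggest_node(uncertain, parent))
print(suggest_node(confident, parent))
```

When the model is split between two sibling leaves, the sketch backs off to their shared parent; when one leaf dominates, it returns that leaf.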
Image (and text) categorization
● We first take the raw
probability outputs from the
neural network model,
indicating how likely each
leaf in the tree is for the
given user item
Image (and text) categorization
● We calibrate the network
probabilities with so-called
“temperature” scaling 1
● Finally, we apply a custom
algorithm to balance node
suggestion specificity vs.
accuracy
1. “On Calibration of Modern Neural Networks”, https://arxiv.org/abs/1706.04599v2
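Temperature scaling divides the logits by a single scalar T before the softmax, with T chosen to minimize the negative log-likelihood on held-out data. The sketch below illustrates the idea in NumPy under stated assumptions: it uses a simple grid search instead of a gradient-based optimizer, and the synthetic logits, the grid range, and the function names are illustrative only.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits divided by the temperature."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, temperature):
    """Mean negative log-likelihood of the true labels."""
    probs = softmax(logits, temperature)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
    """Pick the temperature on the grid with the lowest validation NLL."""
    losses = [nll(logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])

# Synthetic, deliberately overconfident validation logits (assumption).
rng = np.random.default_rng(0)
labels = rng.integers(0, 5, size=500)
logits = rng.normal(size=(500, 5))
logits[np.arange(500), labels] += 1.0
logits *= 4.0

T = fit_temperature(logits, labels)
calibrated = softmax(logits, T)
print("fitted temperature:", T)
```

Because T rescales all logits of an item equally, the ranking of categories is unchanged; only the confidence values move.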
Real-time camera object
detection and classification
Babbly Mobile App Demo:
https://www.youtube.com/watch?v=__xichRmJ9w
The app detects objects
in real-time while looking
around with the camera
and offers translations
Find out how much your stuff is
worth
● Ad posting made faster &
easier
● Fun to play with
● In production on Njuškalo
https://www.youtube.com/watch?v=ogbN6H_0dMA
Real-time detection challenges
and solutions
● The models have to be fast and offer a smooth user
experience
● Interpolation between bounding box predictions at
different time steps to reduce neural network
prediction frequency
● We first use similar item matching to offer more
robust price prediction
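As an illustration of the interpolation idea, the sketch below linearly blends two bounding boxes predicted at different time steps so that intermediate camera frames can be drawn without running the network. It assumes boxes in `[x_min, y_min, x_max, y_max]` format and that both surrounding predictions are available (for example, by rendering slightly behind the latest detection); the function name and box format are hypothetical.

```python
import numpy as np

def interpolate_box(box_prev, box_next, t_prev, t_next, t):
    """Linearly interpolate a box for time t between two detection times."""
    if t_next <= t_prev:
        raise ValueError("t_next must be later than t_prev")
    # Fraction of the way from the previous to the next detection, kept in [0, 1].
    alpha = np.clip((t - t_prev) / (t_next - t_prev), 0.0, 1.0)
    prev = np.asarray(box_prev, dtype=float)
    nxt = np.asarray(box_next, dtype=float)
    return (1.0 - alpha) * prev + alpha * nxt

# Detections at t = 0.0 s and t = 1.0 s; draw a frame halfway between them.
mid_box = interpolate_box([0, 0, 10, 10], [10, 0, 20, 10], 0.0, 1.0, 0.5)
print(mid_box)
```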
Visual search and recommendation
Query image Image search results
Image Search - Fashion Cam on
willhaben.at
● We provide an API that turns images into
vector descriptors using neural
networks
● Vectors are binary and can be
compared fast for similarity
● The most similar target vectors are
picked and converted to original
images for recommendation
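Binary descriptors can be compared with the Hamming distance, which reduces to an XOR followed by a bit count. The sketch below shows one way to do this in NumPy. It assumes descriptors are binarized by thresholding at zero, which is an illustrative choice and not necessarily how the production vectors are built; all function names are hypothetical.

```python
import numpy as np

def binarize(descriptors):
    """Threshold real-valued descriptors at zero and pack the bits into bytes."""
    bits = (np.asarray(descriptors) > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)

def hamming_distances(query_code, db_codes):
    """Number of differing bits between one packed query and each database code."""
    xor = np.bitwise_xor(db_codes, query_code)
    return np.unpackbits(xor, axis=1).sum(axis=1)

def top_k(query_code, db_codes, k=5):
    """Indices and distances of the k most similar database items."""
    distances = hamming_distances(query_code, db_codes)
    order = np.argsort(distances, kind="stable")[:k]
    return order, distances[order]

# Tiny 8-dimensional example with three database items.
db = binarize([
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [-1, -1, -1, -1, -1, -1, -1, -1],
])
query = binarize([[1, -1, 1, -1, 1, -1, 1, 1]])[0]
indices, distances = top_k(query, db, k=2)
print(indices, distances)
```

The returned indices would then be looked up in the item store to fetch the original images for display.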
Visual recommendations
Visual search and recommendation
challenges and solutions
● Real-world item images can be messy: taken from
different angles, with varying lighting and quality
● Not as easy as recommending catalogue images
● Models require training on large amounts of labeled
user data
● How to construct good vector descriptors for visual
similarity?
Visual search and recommendation
challenges and solutions
● We combine different parts of the neural network to
construct better descriptors
● We add additional components related to attribute
prediction, e.g. vectors describing the color
● Triplet annotations for validation (tell which of A and
B is closer to query Q)
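Triplet annotations can be turned into a validation metric by checking how often the descriptor distances agree with the human answer. A minimal sketch, assuming Euclidean distance on real-valued descriptors and a boolean label per triplet saying whether A is closer to Q; the function name and inputs are hypothetical.

```python
import numpy as np

def triplet_accuracy(embed_q, embed_a, embed_b, a_is_closer):
    """Fraction of triplets where the descriptors agree with the annotation."""
    embed_q = np.asarray(embed_q, dtype=float)
    d_a = np.linalg.norm(embed_q - np.asarray(embed_a, dtype=float), axis=1)
    d_b = np.linalg.norm(embed_q - np.asarray(embed_b, dtype=float), axis=1)
    predicted_a_closer = d_a < d_b
    return float(np.mean(predicted_a_closer == np.asarray(a_is_closer, dtype=bool)))

# Two toy triplets with 2-dimensional descriptors.
q = [[0, 0], [0, 0]]
a = [[1, 0], [3, 0]]
b = [[2, 0], [1, 0]]
print(triplet_accuracy(q, a, b, [True, False]))
```

Comparing this number across descriptor variants gives a direct way to tell which construction matches human judgments of similarity better.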
Item attribute prediction
● Predicting different tags of interest, e.g.
color, material, style, brand, etc.
● Issues with positive-negative balance
and label quality
● Approaches such as Focal Loss
are helpful for training to focus
more on hard examples (typically
less represented)
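For reference, the binary form of Focal Loss scales the cross-entropy by (1 - p_t)^gamma, so well-classified examples contribute little and hard ones dominate. The NumPy sketch below is illustrative only: the defaults gamma = 2 and alpha = 0.25 follow the original Focal Loss paper and are not necessarily the values used in these projects.

```python
import numpy as np

def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Per-example focal loss for predicted probabilities of the positive class."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=float)
    # p_t is the probability assigned to the true class of each example.
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    # alpha_t weights positives by alpha and negatives by (1 - alpha).
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# One easy positive (p = 0.9) and one hard positive (p = 0.1).
losses = binary_focal_loss([0.9, 0.1], [1, 1])
print(losses)
```

Setting gamma to 0 recovers an alpha-weighted binary cross-entropy, which makes the effect of the focusing term easy to inspect.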
Tech Stack
Technology
● Python, PyTorch, TensorFlow
● Cloud deployment - Google Cloud,
Amazon AWS
● TensorFlow Serving or Triton -
model as a service
● NumPy & Cython
● ONNX and TensorRT for faster
model prediction
● NVIDIA GPU deployment
AI-facilitating plugins
Plugins for
● Elasticsearch
● Solr
● Postgres
● MySQL
Other AI Projects
Object detection in 3D digital twins
https://www.youtube.com/watch?v=TCvHThNNP-M
Short-term rainfall prediction
Medical segmentation
Oktay, Ozan et al. “Attention U-Net: Learning Where to Look for the
Pancreas.” arXiv:1804.03999 (2018).
● Important to involve a
domain expert
● Multiple medical specifics:
DICOM (.dcm) and NIfTI files,
MRI and CT scans
● 3D problem
● Iterative data quality
improvement
BERTić model self-supervised tuning
AI content generation by using text prompts
and Stable Diffusion
https://www.reddit.com/r/dalle2/comments/x401g0/dalle2_vs_stable_diffusion_prompt_in_comments/
AI Agency Business Approach
Typical struggles
● Do you have the data?
● Is your data informative?
● What kind of data pipeline do you need?
● Does the business use-case value support the
investment?
● Managing client expectations on what is realistic
● Lowering the risks for both sides
Approaches
● Business use-case first
● Understanding all
stakeholder needs
● Iterative approach and
frequent communication
● Development sprints
● Data-centric approach
“Data-centric AI is the
discipline of
systematically
engineering the data
used to build an AI
system.” Andrew Ng
Thank you for your interest!
Mladen Fernežir
Lead Data Scientist | Co-founder
mladen.fernezir@velebit.ai | velebit.ai
Velebit AI LLC

[DSC Europe 22] Developing Visual AI Solutions for Online Marketplaces - Mladen Fernezir
