Image Classification and Retrieval logic

IMAGE CLASSIFICATION PIPELINE
CF x IIF classification logic strategy

DONE
• Information Retrieval (query the dataset)
• TF-IDF
• Machine Learning (classify new instances)
• CF-IIF
• Technologies: OpenIMAJ / Spark
• Image Pipeline
• Classification logic
• Java implementation (with tests)

TO REVIEW
• KNN with KDTree
• Cosine similarity and distance metrics
• Improve the cf-iif extractor (logic in Spark)
• Hyperparameter tuning
• Reduce the feature space: SVD
(choosing 0.01% of clusters gives 6000+ features)

TODO
• Re-engineer in Spark
• Test different SIFT features (pyramid?)
• Replace KNN with CNeuralNetworks
• (GraphLab/Deep4J)

IMAGE PIPELINE
[Diagram: LOADER, FEATURE EXTRACTOR, MODEL, TEST, PREDICTION]
Build a classifier based on a salient-feature vocabulary
created from the dataset:
1. Load an image dataset covering 3 distinct classes
2. Extract local features from each image
3. Create the codebook for classification
4. Train and test the model

FEATURES EXTRACTION
• Feature extraction is where the interesting image
processing happens; it is the key part
of the pipeline.
• Feature extractors produce a featureVector that
represents an image in a vector space.
• Visual applications need robust
features for classification and search.

WHY FEATURES EXTRACTION
• Image features are typically numerical vectors
that can be used with ML techniques.
• FeatureVectors can be compared by measuring
a distance.
• It is useful to group similar features and
reduce the dimensionality of the space.
• Indexing for Information Retrieval
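
As an illustration of comparing featureVectors by distance, here is a minimal sketch of two common measures, Euclidean distance and cosine similarity. The function names and the sample vectors are hypothetical and are not taken from OpenIMAJ or Spark.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical feature vectors: v2 points in the same direction as v1,
# v3 is orthogonal to both.
v1 = [1.0, 0.0, 2.0]
v2 = [2.0, 0.0, 4.0]
v3 = [0.0, 3.0, 0.0]
```

Cosine similarity ignores vector length, so v1 and v2 count as identical in direction even though their Euclidean distance is not zero.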

IMAGE FEATURES
Image features can be:
• GLOBAL: a single featureVector per image
• GRID BASED: multiple featureVectors, one from
each image block
• LOCAL: multiple featureVectors from interest
points (which differ from image to image)
• SEGMENTED: multiple featureVectors from
regions (which differ from image to image)

SIFT FEATURES
Scale-Invariant Feature Transform (SIFT) is an advanced technique
for extracting local features from the interest points of an image;
the features are invariant to rotation, lighting changes…
It builds on the idea of a local gradient histogram by adding
spatial binning: it creates multiple gradient histograms around
each interest point and concatenates them.
Standard SIFT geometry uses a 4x4 spatial grid of histograms
with 8 orientations each,
giving a 128-dimensional feature vector (4 x 4 x 8 = 128) that is
highly discriminative and robust.
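
To make the 4 x 4 x 8 layout concrete, here is a heavily simplified sketch of a SIFT-style descriptor for a single 16x16 grayscale patch. It is an illustration only: it assumes the patch is already centred on an interest point, and it omits the Gaussian weighting, interpolation between bins, and rotation normalisation that real SIFT performs. The function name and the sample patch are hypothetical.

```python
import math

def sift_like_descriptor(patch):
    """Simplified SIFT-style descriptor for a 16x16 grayscale patch.

    Concatenates a 4x4 grid of 8-bin gradient-orientation histograms
    (4 * 4 * 8 = 128 values) and normalises the result to unit length.
    """
    size, cells, bins = 16, 4, 8
    cell = size // cells
    hist = [0.0] * (cells * cells * bins)
    # Skip the border pixels so central differences stay inside the patch.
    for y in range(1, size - 1):
        for x in range(1, size - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            idx = ((y // cell) * cells + (x // cell)) * bins + b
            hist[idx] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# Hypothetical patch: a horizontal intensity ramp, so every gradient points along +x.
patch = [[float(x) for x in range(16)] for _ in range(16)]
descriptor = sift_like_descriptor(patch)
```

With this ramp, all gradient energy falls into the first orientation bin of each of the 16 cells, which shows how the descriptor encodes both where and in which direction the gradients occur.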

OPENIMAJ
• A Java image-processing library that includes many
feature extractors and other utilities for visual applications:
• DoGSIFT
• DenseSIFT
• PyramidSIFT
• …

SPARK
• Spark is used to scale out the application: large
datasets, high-dimensional features…
• MLlib

IMAGE PIPELINE
[Diagram: DATASET -> sift EXTRACTOR -> KMEANS QUANTIZATOR -> cf-iif EXTRACTOR -> KNN -> PREDICTIONS]

IMAGE PIPELINE
1) SIFT features extraction
PATH   LABEL
img1   car
img3   bicycle
img17  car
[Figure: the sift EXTRACTOR adds a siftFEAT column to the PATH / LABEL table; figure annotations on siftFEAT: 128, 6000+]

IMAGE PIPELINE
2) features quantisation
[Figure annotations: siftFEAT (128, 6000+, #images); CLUSTER (128)]

IMAGE PIPELINE
2) features reduction
[Figure annotations: siftFEAT (128, 6000+, #images); CLUSTER (128); fixed K = 300]
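
The quantisation step can be sketched as plain k-means (Lloyd's algorithm) followed by assigning each descriptor to its nearest centroid. This is a toy illustration under stated assumptions: 2-dimensional points stand in for 128-dimensional SIFT descriptors, k = 2 stands in for the fixed K = 300, initial centroids are simply the first k points, and all names are hypothetical rather than OpenIMAJ or Spark MLlib calls.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda j: euclidean(point, centroids[j]))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the initial centroids are the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(c) / len(group) for c in zip(*group)]
    return centroids

# Toy descriptors forming two well-separated groups.
descriptors = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0),
               (10.0, 11.0), (1.0, 0.0), (11.0, 10.0)]
codebook = kmeans(descriptors, k=2)
# Each descriptor is replaced by the id of its cluster ("visual word").
words = [nearest(d, codebook) for d in descriptors]
```

After quantisation, an image is no longer a bag of raw descriptors but a bag of cluster ids, which is what the later weighting step operates on.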

IMAGE PIPELINE
At this point we associate a "cluster vector" with each image
(the analogue of the keyword vector in text):
Cw = <w1, w2, ..., wn>, n = |C|
For each image and for each descriptor cluster j,
the corresponding weight wj is the
product of two factors:
• Cluster Frequency: the fraction of that image's points
that were mapped to cluster j
• Inverse Image Frequency: the logarithm of the
ratio between the cardinality of the database and the number
of images that contain descriptors mapped to that
cluster
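
The weighting described above, wj = CF_j * log(N / n_j), can be sketched as follows. This is a minimal illustration, not the deck's Java or Spark implementation: each image is assumed to be given as a list of cluster ids (one per local descriptor), and the function name and sample data are hypothetical.

```python
import math
from collections import Counter

def cf_iif(images, n_clusters):
    """Compute one CF-IIF cluster vector per image.

    images: list of lists of cluster ids, one id per local descriptor.
    Returns a list of vectors of length n_clusters.
    """
    n_images = len(images)
    # For each cluster, the number of images in which it appears at least once.
    image_freq = Counter(c for words in images for c in set(words))
    vectors = []
    for words in images:
        counts = Counter(words)
        vec = []
        for j in range(n_clusters):
            # Cluster Frequency: share of this image's points mapped to cluster j.
            cf = counts[j] / len(words) if words else 0.0
            # Inverse Image Frequency: log(database size / images containing cluster j).
            iif = math.log(n_images / image_freq[j]) if image_freq[j] else 0.0
            vec.append(cf * iif)
        vectors.append(vec)
    return vectors

# Three hypothetical images, already quantised into cluster ids 0..3.
images = [[0, 0, 1, 2], [0, 1, 1, 1], [0, 3]]
vectors = cf_iif(images, n_clusters=4)
```

Cluster 0 appears in every image, so its Inverse Image Frequency is log(1) = 0 and it contributes nothing to any vector, whereas clusters 2 and 3, each present in a single image, receive the highest weight.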

IMAGE PIPELINE
The logic is borrowed from TF-IDF,
which is very common in text retrieval.
A cluster becomes discriminative
if it contains few images!
Feature space: we move from
SIFT descriptors to cluster vectors.

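IMAGE PIPELINE
[Figure: the table PATH / LABEL / siftFEAT (128), with rows "img can" and "img gatt", together with CLUSTER ("V images"), goes through "my EXTRACTOR" to give the table PATH / LABEL / cfiifFEAT with the same rows]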

IMAGE PIPELINE
3) features transformation
PATH  LABEL    myFEAT
img1  bicycle
img3  car
[Figure annotations on myFEAT: #clusters, #images]

Cell wj of image A's cluster vector
represents that image's
weight in cluster j.

All images are
represented by their cf-iif vectors!

This is our own vocabulary!

IMAGE PIPELINE
4) train a KNN classifier
[Figure: cfiifFEAT is split into TEST and TRAIN sets; KNN]

4) learn the model
[Figure: TEST and TRAIN sets; KNN]

4) test
[Figure: KNN takes myFEAT from the TEST and TRAIN sets and outputs PREDICTIONS; LABEL: car, bicycle, motorbike]

[Figure: among the LABEL values car, bicycle and motorbike, KNN marks one as "This is the most similar!" and predicts LABEL car]
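
The train and test steps can be sketched as a brute-force KNN classifier with majority voting. This is a minimal illustration that assumes Euclidean distance and tiny 3-dimensional stand-ins for cf-iif vectors; the function names and data are hypothetical and this is not the OpenIMAJ or Spark implementation.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3):
    """train: list of (vector, label) pairs.

    Returns the majority label among the k training vectors nearest to query.
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical training set: two vectors per class.
train = [
    ([0.9, 0.1, 0.0], "car"),
    ([0.8, 0.2, 0.1], "car"),
    ([0.1, 0.9, 0.0], "bicycle"),
    ([0.2, 0.8, 0.1], "bicycle"),
    ([0.0, 0.1, 0.9], "motorbike"),
    ([0.1, 0.0, 0.8], "motorbike"),
]
prediction = knn_predict(train, [0.85, 0.15, 0.05], k=3)
```

KNN has no real training phase: "learning the model" amounts to storing the labelled vectors, and all the work happens at query time, which is why efficient nearest-neighbour search matters for larger datasets.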

IMAGE PIPELINES: PAIN POINTS
• Image Features Extraction: find interest points and extract discriminative
and robust features
• OPENIMAJ: multiple algorithms
• Learn large codebooks from features
• Train the model (KNN)
• SPARK: scalable models (3 days to train a KNN model on 15 images with
OpenIMAJ)
• SPARK: multiple models (Bayes models, Neural Networks)

• Efficient Nearest Neighbour Search (test)
• KDTree
• Hyperparameter tuning (in our own pipeline, used
for KMeans, CF-IIF and KNN)
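
To show why a KD-tree speeds up nearest-neighbour search, here is a minimal sketch that builds a tree over a handful of 2-dimensional points and prunes branches that cannot contain a closer match. It is an illustration with hypothetical names and data; it assumes squared Euclidean distance and is not the OpenIMAJ KD-tree API.

```python
def build_kdtree(points, depth=0):
    """Recursively build a KD-tree as nested tuples (point, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def sq_dist(a, b):
    """Squared Euclidean distance (enough for comparing distances)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_neighbour(node, query, depth=0, best=None):
    """Return the stored point closest to query."""
    if node is None:
        return best
    point, left, right = node
    if best is None or sq_dist(point, query) < sq_dist(best, query):
        best = point
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest_neighbour(near, query, depth + 1, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < sq_dist(best, query):
        best = nearest_neighbour(far, query, depth + 1, best)
    return best

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
closest = nearest_neighbour(tree, (9, 2))
```

For the query (9, 2) the search returns (8, 1) without ever descending into the left half of the tree. Note that the pruning advantage shrinks as dimensionality grows, so on high-dimensional cluster vectors the gain over brute force should be measured rather than assumed.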

OPENIMAJ++
• Image features can be used to match music!
• Extractors can be used to find objects!
(Face Detection)

https://github.com/gianvi
Thanks!


More from Gianvito Siciliano
