Lessons learned using Machine Learning in Java
Jago de Vreede
# jfall.
PYTHON MACHINE LEARNING FRAMEWORKS
• Keras
• SciPy
• PyTorch
• TensorFlow
• Theano
• Pandas
JAVA MACHINE LEARNING FRAMEWORKS*
• deeplearning4j
• Elki
• DeepJavaLibrary (djl)
• Apache MXNet
• PyTorch
• TensorFlow
• ONNX Runtime
* With a release in the last year and free
BENCHMARKS
DJL DL4J
BENCHMARKS - TRAINING (LEGO)
TensorFlow 2.4 (py) PyTorch 2.0.1 (py) PyTorch 2.0.1 (Java) MXNet (Java)
Ran on AWS: g4dn.xlarge on GPU github.com/jagodevreede/ml-banchmark
BENCHMARKS - PREDICTION
Python Java
github.com/jagodevreede/ml-banchmark
Ran on laptop
BENCHMARKS - RESULTS
HUMAN
Chihuahua
Muffin
DATASET
$ pip install simple_image_download
import simple_image_download.simple_image_download as simp
my_downloader = simp.Downloader()
my_downloader.download('Chihuahua', limit=2000)
my_downloader.download('Muffin', limit=2000)
DATASET - CLEANUP
$ pip install ImageHash
86 images left…
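The cleanup step relies on perceptual hashing to spot near-duplicate downloads. The sketch below illustrates the idea behind an average hash in plain Python, using tiny grayscale grids instead of real image files. It is a simplified illustration, not the ImageHash implementation; the names `average_hash`, `hamming`, `drop_near_duplicates`, the sample file names, and the distance threshold are all hypothetical.

```python
from typing import Dict, List

Grid = List[List[int]]  # grayscale pixel values, 0..255


def average_hash(pixels: Grid) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the mean, else 0."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bits in which two hashes differ."""
    return bin(a ^ b).count("1")


def drop_near_duplicates(images: Dict[str, Grid], max_distance: int = 1) -> List[str]:
    """Keep an image only if its hash is far enough from every image kept so far."""
    kept: List[str] = []
    kept_hashes: List[int] = []
    for name, pixels in images.items():
        h = average_hash(pixels)
        if all(hamming(h, other) > max_distance for other in kept_hashes):
            kept.append(name)
            kept_hashes.append(h)
    return kept


# Hypothetical 2x2 "images"; a_copy.jpg is a slightly re-encoded copy of a.jpg.
images = {
    "a.jpg": [[10, 200], [220, 30]],
    "a_copy.jpg": [[12, 198], [221, 29]],
    "b.jpg": [[200, 10], [30, 220]],
}
print(drop_near_duplicates(images))
```

Comparing hashes by bit distance rather than by equality is what lets resized or re-compressed copies be caught, which exact file checksums would miss.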
DATASET - FLICKR
$ pip install flickrapi
DATASET - AUGMENTING
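$ pip install keras
# Import path as in Keras 2.x
from keras.preprocessing.image import ImageDataGenerator

# Random transformations applied on the fly to every training image
datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)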
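To make two of these options concrete, here is a minimal plain-Python sketch of a horizontal flip and a width shift with nearest-pixel fill on a tiny grid of pixel values. It only illustrates what the transformations do and is not how Keras implements them; the helper names `horizontal_flip`, `shift_width`, and `augment` are hypothetical.

```python
import random
from typing import List

Image = List[List[int]]  # rows of pixel values


def horizontal_flip(img: Image) -> Image:
    """Mirror every row left to right."""
    return [row[::-1] for row in img]


def shift_width(img: Image, offset: int) -> Image:
    """Shift columns right (positive offset) or left (negative offset).

    Pixels that would come from outside the image are filled with the
    nearest edge pixel, similar in spirit to fill_mode='nearest'.
    """
    width = len(img[0])
    return [
        [row[min(max(x - offset, 0), width - 1)] for x in range(width)]
        for row in img
    ]


def augment(img: Image, rng: random.Random, max_shift: int = 1) -> Image:
    """Apply a random flip and a random shift of at most max_shift pixels."""
    out = horizontal_flip(img) if rng.random() < 0.5 else img
    return shift_width(out, rng.randint(-max_shift, max_shift))


original = [[1, 2, 3], [4, 5, 6]]
rng = random.Random(42)
for _ in range(3):
    print(augment(original, rng))
```

Each call yields a slightly different variant of the same picture, which is how a small set of cleaned images can be stretched into a larger training set.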
TIME TO BUILD
BATCH SIZES
The number of samples that are processed at once
during each iteration of training.
The presented results confirm that using small
batch sizes achieves the best training stability and
generalization performance, for a given
computational cost, across a wide range of
experiments. In all cases the best results have
been obtained with batch sizes m = 32 or
smaller, often as small as m = 2 or m = 4.
arxiv.org/abs/1804.07612
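As a small illustration of the batch-size definition, the sketch below splits a dataset into batches and counts how many weight updates one epoch takes for each batch size. The dataset size of 100 and the helper name `batches` are assumptions made for the example, not values from the talk.

```python
import math
from typing import Iterator, List, Sequence


def batches(samples: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield list(samples[start:start + batch_size])


samples = list(range(100))  # hypothetical dataset of 100 samples

for batch_size in (1, 2, 4, 8, 16, 32, 64):
    steps = sum(1 for _ in batches(samples, batch_size))
    # One weight update per batch, so steps = ceil(samples / batch_size)
    assert steps == math.ceil(len(samples) / batch_size)
    print(f"batch size {batch_size:>2}: {steps} updates per epoch")
```

Smaller batches mean more updates per epoch, which is one reason they tend to train more slowly in wall-clock time even when they generalize better.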
BATCH SIZES - SPEED (SECONDS)
[Chart: training time in seconds for batch sizes 1, 2, 4, 8, 16, 32, 64]
ACCURACY VS ENTROPY LOSS
Accuracy measures the proportion of correctly classified
instances out of the total number of instances.
Entropy loss also takes the confidence of the predictions
into account: the lower its value, the better the model's
predictions align with the labels.
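The difference between the two metrics can be sketched in a few lines of Python. Both hypothetical models below classify every sample correctly, so their accuracy is identical, but the more confident one has the lower cross-entropy loss. The probabilities are made up for illustration, and `accuracy` and `cross_entropy` are illustrative helpers, not DJL or Keras functions.

```python
import math
from typing import Sequence


def accuracy(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of samples whose most probable class equals the label."""
    correct = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=lambda i: p[i])
        correct += predicted == y
    return correct / len(labels)


def cross_entropy(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


labels = [0, 1, 0, 1]
confident = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
hesitant = [[0.6, 0.4], [0.4, 0.6], [0.55, 0.45], [0.45, 0.55]]

for name, probs in (("confident", confident), ("hesitant", hesitant)):
    print(f"{name}: accuracy={accuracy(probs, labels):.2f} "
          f"loss={cross_entropy(probs, labels):.3f}")
```

This is why the loss can keep improving during training while the accuracy curve stays flat.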
CONFUSION MATRIX
           Muffin  Chihuahua
Muffin          5          3
Chihuahua       1          7
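The numbers in this matrix can be turned into metrics with a short calculation. The sketch assumes that rows are the actual classes and columns the predicted ones; the slide does not label the axes, so that orientation is an assumption (the overall accuracy does not depend on it, the per-class rates do).

```python
from typing import List

LABELS = ["Muffin", "Chihuahua"]
MATRIX = [
    [5, 3],  # row "Muffin"
    [1, 7],  # row "Chihuahua"
]


def overall_accuracy(matrix: List[List[int]]) -> float:
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_rate(matrix: List[List[int]]) -> List[float]:
    """Share of each row that lands on the diagonal (recall, if rows are actual classes)."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


print(f"accuracy: {overall_accuracy(MATRIX):.2%}")
for label, rate in zip(LABELS, per_class_rate(MATRIX)):
    print(f"{label}: {rate:.1%} correct")
```

With 12 of 16 images on the diagonal the overall accuracy is 75%, and under the stated assumption muffins are recognised less reliably than chihuahuas.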
VALIDATION ACCURACY
Only works in DJL >= 0.24.0
LEARNING RATE
LEARNING RATE - OPTIMIZERS
CONFUSION MATRIX
           Muffin  Chihuahua
Muffin          7          1
Chihuahua       0          8
TRANSFER LEARNING
Muffin Chihuahua
PYTHON AND JAVA TOGETHER
CONVERT THE MODEL
import tensorflow as tf
import tensorflow.keras as keras
loaded_model = keras.models.load_model("flowers.h5")
tf.saved_model.save(loaded_model, "model/1")
GENERATE IMAGES
Warning: next slide contains flashing images
ZOOM OR NOT TO ZOOM
SIM - VS - REAL
Sim Real
CONFUSION MATRIX EPOCH 17
       3001  3002  3004
3001     31     0     0
3002     28     0     0
3004     29     0     0
CONFUSION MATRIX EPOCH 47
       3001  3002  3004
3001      4    27     0
3002      5    23     0
3004      8    21     0
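Summing the columns of these two matrices shows what the model is doing at each epoch. The row sums (31, 28, 29) are identical in both matrices, which suggests that rows are the actual brick types and columns the predictions; that reading is an inference from the numbers, not something the slides state. The helper name `predicted_counts` is hypothetical.

```python
from typing import Dict, List

CLASSES = ["3001", "3002", "3004"]

EPOCH_17 = [
    [31, 0, 0],
    [28, 0, 0],
    [29, 0, 0],
]
EPOCH_47 = [
    [4, 27, 0],
    [5, 23, 0],
    [8, 21, 0],
]


def predicted_counts(matrix: List[List[int]]) -> Dict[str, int]:
    """Column sums: how often each class was predicted (assuming columns are predictions)."""
    return {label: sum(col) for label, col in zip(CLASSES, zip(*matrix))}


for name, matrix in (("epoch 17", EPOCH_17), ("epoch 47", EPOCH_47)):
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    print(f"{name}: predicted {predicted_counts(matrix)}, "
          f"accuracy {correct}/{total}")
```

Under that reading, at epoch 17 every image is predicted as 3001 and at epoch 47 most are predicted as 3002, while 3004 is never predicted, so the model is guessing a single class rather than telling the bricks apart.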
youtube.com/watch?v=3hOlZpZrsXY
VALIDATION ACCURACY
[Bar chart: Baseline, High-Res]
GRAYSCALE VS COLOR
Gray Color
Ran on AWS: g4dn.xlarge
VALIDATION ACCURACY
[Bar chart, built up over several slides: Baseline, High-Res, Grayscale, Low Contrast Validation only, Low Contrast, H/V flipping, Trainset 750 (Largest images), Actual images (12%)]
TAKEAWAYS
• Creating the dataset is the hardest part
• Java can be used for ML
• Java is a little bit faster than Python
• Python has a bigger ecosystem
• Models made with Python can be
used in Java
@jagovreede
github.com/jagodevreede/
java-djl-dog-vs-muffin
ml-banchmark
lego-sorter
slideshare.net/JagodeVreede1/
Javamllegojfall

Java-ML-lego-j-fall