Non-linear classification models commonly rely on kernel functions. These models are highly dependent on a training (labeled) data set, so a model, and therefore its underlying kernel, has to adapt to the most recent labeled observations.
This presentation describes a solution to automate the evaluation and selection of a kernel function appropriate to a specific training set in online training.
2. Introduction
The creation and update of models require constant monitoring of the quality of the prediction or classification. This entails re-evaluating the model parameters and, if necessary, the kernel functions against a new set of labeled data.
This presentation describes a solution to automate the
evaluation and selection of a kernel function appropriate to
a specific training set in online training.
3. Kernel functions
Kernel functions are widely used in machine learning to deal with non-linear models for which the dimension of the problem is not readily known:
• Kernel principal component analysis
• Kernelized multi-layer perceptron
• Support vector machines
• Kernelized clustering
• Bayesian kernel density estimation
• Kernelized ridge regression
4. Some background ….
Given a space of observations of dimension n, ℝⁿ, the feature space is an embedded manifold ℳᵖ of dimension p ≪ n.
A kernel function is defined through the mapping
φ: ℝⁿ → ℳᵖ, w′ = φ(w)
The Euclidean metric is defined as
‖w‖ = √(w · w)
5. … on kernel models.
[Diagram: feature vectors v and w, with similarity v·w in the observation space, are projected by φ onto a manifold where K(v, w) = φ(v)·φ(w)]
A kernel function K is represented as a projection of feature vectors v, w onto a manifold, on which the similarity is computed from the Riemannian metric.
6. Kernel function
Define a kernel function as the composition of two functions, g_λ ∘ h:
• A similarity function h
• A function g_λ from the exponential family
K_λ(x, y) = g_λ(Σᵢ h(xᵢ, yᵢ))
Radial basis function (RBF): K_σ(x, y) = e^(−((x−y)/σ)²), with g_σ(x) = e^(−(x/σ)²) and h(x, y) = x − y
Polynomial: K_d(x, y) = (1 + x·y)^d, with g_d(x) = (1 + x)^d and h(x, y) = x·y
Sigmoid: K₁(x, y) = tanh(1 + 2 x·y), with g₁(x) = tanh(1 + 2x) and h(x, y) = x·y
7. What challenge?
• Creating a model requires the evaluation of multiple kernel functions. This task is rarely automated and consumes a significant amount of human and computing resources.
• Time series or sequential data may require a “loose” kernel model definition which evolves over time (online training).
We need a mechanism to automate the generation, evaluation and optimization of kernel functions.
8. Challenge: online training
Online training requires a constant re-evaluation of the model and refinement of the underlying kernel function.
[Plot: F1 score against the size of the training/observation set, with a valid range and re-evaluation points marked]
Once the precision, recall or F1 score crosses a given threshold, the model has to be retrained with the same or a different kernel function.
9. An approach to automation
1. Automated generation of kernel functions: a kernel function is composed from three different functions that can be assembled through monadic composition.
2. Automated optimization of kernel functions: a genetic algorithm computes the loss function associated with each kernel candidate (fitness) and selects the most appropriate one for any specific dataset.
11. Kernel function
Define a kernel function as the composition of two functions, g ∘ h:
• A similarity function h
• A function g from the exponential family
K(x, y) = g(Σᵢ h(xᵢ, yᵢ))
… with its implementation in Scala
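The slide's Scala code figure is not reproduced in this transcript; a minimal sketch of the composition g ∘ h, with illustrative type aliases and names, could look like this:

```scala
object KernelFunction {
  type Similarity = (Double, Double) => Double
  type ExpMap = Double => Double

  // K(x, y) = g( sum_i h(x_i, y_i) ): an exponential map g composed
  // with a similarity function h applied pairwise to the features
  def kernel(g: ExpMap, h: Similarity)(x: Array[Double], y: Array[Double]): Double = {
    require(x.length == y.length, "Feature vectors must have the same dimension")
    g(x.zip(y).map { case (xi, yi) => h(xi, yi) }.sum)
  }
}
```

For instance, the polynomial kernel of degree 2 is obtained with h(x, y) = x·y and g(s) = (1 + s)².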
12. Similarity functions
The similarity function h is derived from the metric defined on the tangent space of a manifold. Here are a few examples of similarity functions.
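The slide's examples are shown in a code figure that did not survive extraction; a sketch of plausible similarity functions, assuming Euclidean-style metrics (the names are illustrative):

```scala
object SimilarityFunctions {
  type Similarity = (Double, Double) => Double

  // Inner-product similarity, used by the polynomial and sigmoid kernels
  val dot: Similarity = (x, y) => x * y

  // Squared-distance similarity, a natural choice for an RBF-style kernel
  val sqDistance: Similarity = (x, y) => (x - y) * (x - y)

  // Absolute-distance (Manhattan) similarity, e.g. for a Laplacian kernel
  val absDistance: Similarity = (x, y) => math.abs(x - y)
}
```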
13. Exponential map functions
The projection of the metric onto the manifold defines an exponential map or family of functions. Here are some examples from the exponential family {g}: the radial basis function kernel, the polynomial kernel and the sigmoid kernel.
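Following the decomposition on slide 6, each of these kernels is an exponential map g applied to the summed similarity. A minimal sketch (the factory names are illustrative):

```scala
object ExponentialMaps {
  type ExpMap = Double => Double

  // RBF: g_sigma(s) = exp(-(s / sigma)^2), per the slide 6 decomposition
  def rbf(sigma: Double): ExpMap = s => math.exp(-(s / sigma) * (s / sigma))

  // Polynomial: g_d(s) = (1 + s)^d
  def polynomial(d: Int): ExpMap = s => math.pow(1.0 + s, d)

  // Sigmoid: g_1(s) = tanh(1 + 2s)
  val sigmoid: ExpMap = s => math.tanh(1.0 + 2.0 * s)
}
```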
14. Parameterization
Kernel functions are parameterized with a single value or argument λ. For example, the parameter for the radial basis function is λ = 1/σ², where σ is the standard deviation, and the parameter for the polynomial kernel is λ = d, where d is the degree of the polynomial.
A parameterized kernel function is defined as
K_λ(x, y) = g_λ(Σᵢ h(xᵢ, yᵢ))
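A minimal sketch of such a parameterized kernel in Scala, where g is curried on the parameter λ (the class and field names are illustrative):

```scala
// A parameterized kernel K_lambda(x, y) = g_lambda(sum_i h(x_i, y_i)),
// where the single parameter lambda selects a member of the exponential family
case class ParamKernel(
    g: Double => Double => Double,   // exponential map, curried on lambda
    h: (Double, Double) => Double,   // similarity function
    lambda: Double) {

  def apply(x: Array[Double], y: Array[Double]): Double =
    g(lambda)(x.zip(y).map { case (xi, yi) => h(xi, yi) }.sum)
}
```

For the polynomial kernel, λ = d: `ParamKernel(l => s => math.pow(1.0 + s, l), (a, b) => a * b, 2.0)`.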
15. Kernel composition
The composition of the similarity (or metric) function h and the exponential map g can be accomplished through a monadic composition.
Monads are high-level abstractions derived from category theory. A monad defines a computation as a sequence of transformations of two types:
• map: transforms a ‘container’ type by modifying each of its elements with a given function
• flatMap: transforms a ‘container’ by promoting each of its elements to a container, then reducing the resulting containers to a single type
16. Monadic composition
A monad is defined as a Scala trait for a container type M. The monadic implementation of a kernel function of type KF consists of overriding the map and flatMap transformations.
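The trait described on the slide could be sketched as follows; the shape of the KF container is an assumption, since the original code figure is not reproduced here:

```scala
// Minimal monad abstraction over a container type M[_]
trait Monad[M[_]] {
  def unit[A](a: A): M[A]
  def map[A, B](m: M[A])(f: A => B): M[B]
  def flatMap[A, B](m: M[A])(f: A => M[B]): M[B]
}

// KF wraps the running value of a kernel computation
case class KF[A](value: A)

// Monad instance for KF: map applies f to the wrapped value,
// flatMap chains a KF-producing step
object KFMonad extends Monad[KF] {
  def unit[A](a: A): KF[A] = KF(a)
  def map[A, B](m: KF[A])(f: A => B): KF[B] = KF(f(m.value))
  def flatMap[A, B](m: KF[A])(f: A => KF[B]): KF[B] = f(m.value)
}
```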
17. Monadic composition with parameterization
The computation of the parameterized kernel function K_λ(x, y) is broken down into three sequential steps:
1. Execution of the similarity function h (map)
2. Application of the parameter λ (flatMap)
3. Execution of the exponential map g (flatMap)
The sequence is generated by applying a flatMap method to the output of a map or another flatMap monadic operation.
18. Monadic composition of parameterized kernels
K_λ(x, y) = g_λ(Σᵢ h(xᵢ, yᵢ))
Implementation of the monadic composition using the for-comprehension notation.
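The three steps of slide 17 can be chained in a for-comprehension, which Scala desugars into flatMap/map calls. This sketch assumes the λ step is a function applied between h and g (the deck does not show how λ is threaded, so that detail is an assumption):

```scala
// KF carries the intermediate value of the kernel computation and exposes
// map/flatMap so the steps can be chained in a for-comprehension
case class KF[A](value: A) {
  def map[B](f: A => B): KF[B] = KF(f(value))
  def flatMap[B](f: A => KF[B]): KF[B] = f(value)
}

def kernel(h: (Double, Double) => Double,
           lambda: Double => Double,
           g: Double => Double)
          (x: Array[Double], y: Array[Double]): Double = {
  val k = for {
    s      <- KF(x.zip(y).map { case (xi, yi) => h(xi, yi) }.sum) // 1. similarity h
    scaled <- KF(lambda(s))                                        // 2. apply the parameter
    out    <- KF(g(scaled))                                        // 3. exponential map g
  } yield out
  k.value
}
```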
20. Brief introduction to genetic algorithms
Genetic algorithms use reproduction to evolve a population of solutions to a problem.
The components of a genetic algorithm are:
• Genetic encoding (and decoding): conversion between a solution and a binary format (bits, string) known as a chromosome
• Genetic operations: functions that extract the most genetically fit solutions
• Genetic fitness function: criteria to evaluate each solution
21. Reproduction cycle
The reproduction cycle controls the population of chromosomes using three genetic operators:
• Selection: ranks chromosomes according to a fitness function
• Crossover: pairs chromosomes to generate offspring chromosomes
• Mutation: introduces minor alterations in the genetic code
22. Genetic crossover
The purpose of the genetic crossover is to expand the current population of chromosomes in order to intensify the competition and improve the genetic quality of the population.
The offspring chromosomes are added to the population along with their parents to increase genetic diversity.
23. Genetic mutation
The mutation procedure inserts a small variation in a
chromosome to maintain some level of diversity between
generations
The mutated chromosome is added to the population along
with the original.
24. Kernel function: encoding
K_λ(x, y) = g_λ(Σᵢ h(xᵢ, yᵢ))
Here is an example of a flat encoding for a parameterized kernel function:
[n_h bits for h | n_g bits for g | n_λ bits for λ], e.g. 01 011 10110001010101011100010111010
Note: a tree representation of the kernel function (genetic programming) is a viable alternative.
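A sketch of this flat encoding in Scala; the field widths (2, 3 and 21 bits) and the names are assumptions read off the slide's figures, not the deck's actual code:

```scala
// Flat encoding of a kernel into a bit string: n_h bits select the similarity h,
// n_g bits select the exponential map g, n_lambda bits hold the quantized parameter
case class KernelCode(h: Int, g: Int, lambdaBits: Int)

object KernelEncoding {
  val (nH, nG, nLambda) = (2, 3, 21) // illustrative field widths

  def encode(c: KernelCode): String =
    toBits(c.h, nH) + toBits(c.g, nG) + toBits(c.lambdaBits, nLambda)

  def decode(bits: String): KernelCode = KernelCode(
    Integer.parseInt(bits.substring(0, nH), 2),
    Integer.parseInt(bits.substring(nH, nH + nG), 2),
    Integer.parseInt(bits.substring(nH + nG), 2)
  )

  // Left-pad a non-negative integer to a fixed-width binary string
  private def toBits(n: Int, width: Int): String = {
    val s = n.toBinaryString
    "0" * (width - s.length) + s
  }
}
```

Encoding and decoding are inverses, so a chromosome can always be turned back into a runnable kernel candidate.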
25. Kernel function: cross-over 1
Parents:
K_λ(x, y) = g_λ(Σᵢ h(xᵢ, yᵢ)), encoded as 01 011 111010110101100101010
K′_β(x, y) = g′_β(Σᵢ h′(xᵢ, yᵢ)), encoded as 11 001 111010011010100001011
Offspring:
K⁽¹⁾_β(x, y) = g′_β(Σᵢ h(xᵢ, yᵢ)), encoded as 01 001 111010011010100001011
K⁽²⁾_λ(x, y) = g_λ(Σᵢ h′(xᵢ, yᵢ)), encoded as 11 011 111010110101100101010
Genetic cross-over indexed on the exponential map: the bits beyond the h field (the g and λ fields) are exchanged between the two parents.
26. Kernel function: cross-over 2
Parents:
K_λ(x, y) = g_λ(Σᵢ h(xᵢ, yᵢ)), encoded as 01 011 111010110101100101010
K′_β(x, y) = g′_β(Σᵢ h′(xᵢ, yᵢ)), encoded as 11 001 111010011010100001011
Offspring:
K⁽¹⁾_λ′(x, y) = g_λ′(Σᵢ h(xᵢ, yᵢ)), encoded as 01 011 followed by the recombined parameter bits λ′
K⁽²⁾_β′(x, y) = g′_β′(Σᵢ h′(xᵢ, yᵢ)), encoded as 11 001 followed by the recombined parameter bits β′
Genetic cross-over indexed on the lambda parameter: the crossover point falls inside the λ field, so each offspring keeps its parent's h and g fields and receives a new parameter value.
27. Kernel function: mutation
Original: K_λ(x, y) = g_λ(Σᵢ h(xᵢ, yᵢ)), encoded as 01 011 111010110101100101010
Mutated: K_λ′(x, y) = g_λ′(Σᵢ h(xᵢ, yᵢ)), encoded as 01 011 111010111101100101010
One-bit XOR genetic mutation indexed on the lambda parameter.
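The one-bit mutation above amounts to flipping a single bit of the chromosome, here sketched with the flip index supplied by the caller (in practice it would be drawn at random within the λ field):

```scala
// One-bit XOR mutation: flip the bit at bitIndex in the chromosome string
def mutate(chromosome: String, bitIndex: Int): String = {
  val flipped = if (chromosome(bitIndex) == '0') '1' else '0'
  chromosome.updated(bitIndex, flipped)
}
```

Applying the same mutation twice restores the original chromosome, a property of XOR.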
28. Kernel function: fitness
ℒ_K(w) = Σᵢ (yᵢ − f(xᵢ, w))² + γ‖w‖²
Given a kernel function K_λ and a training set {xᵢ}, a classifier model w is generated by minimizing the loss ℒ_K.
The fitness of a kernel function is the F1 score over a new validation set.
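A minimal sketch of the fitness computation, assuming binary 0/1 labels and predictions (training the classifier f by minimizing ℒ_K is out of scope here):

```scala
// Fitness of a kernel candidate: F1 score of the trained model's
// predictions against the labels of a held-out validation set
def f1Score(predicted: Seq[Int], actual: Seq[Int]): Double = {
  val pairs = predicted.zip(actual)
  val tp = pairs.count { case (p, a) => p == 1 && a == 1 } // true positives
  val fp = pairs.count { case (p, a) => p == 1 && a == 0 } // false positives
  val fn = pairs.count { case (p, a) => p == 0 && a == 1 } // false negatives
  if (tp == 0) 0.0
  else {
    val precision = tp.toDouble / (tp + fp)
    val recall = tp.toDouble / (tp + fn)
    2.0 * precision * recall / (precision + recall)
  }
}
```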
29. Scala: encoder
Encoder for a kernel function. The method apply (resp. unapply) implements the encoding (resp. decoding) algorithm.
Generic encoder for a parameterized type.
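The trait's shape follows the slide's description (apply encodes, unapply decodes); the concrete instance below, an 8-bit integer encoder, is an illustrative assumption, not the deck's code:

```scala
// Generic encoder for a parameterized type T:
// apply encodes a value into a bit string, unapply decodes it back
trait Encoder[T] {
  def apply(t: T): String               // encoding
  def unapply(bits: String): Option[T]  // decoding
}

// Example instance: integer parameter quantized on 8 bits
object IntEncoder extends Encoder[Int] {
  private val width = 8
  def apply(t: Int): String = {
    val s = t.toBinaryString
    "0" * (width - s.length) + s
  }
  def unapply(bits: String): Option[Int] =
    if (bits.length == width) Some(Integer.parseInt(bits, 2)) else None
}
```

Because unapply is an extractor, a decoded value can also be recovered by pattern matching on the bit string.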
30. Scala: encoding
Encoding of a kernel function kf, given an implicit quantization scheme quant for the lambda parameter λ:
conversion of the similarity function h, the parameter λ and the exponential map g into a bit stream.
31. Scala: decoding
Conversion of the bit representations of the similarity function h, the lambda parameter λ and the exponential map g, then assembly of the bit stream to instantiate a new kernel function.
35. Conclusion
The monadic composition and genetic encoding of kernel functions allow an analytical engine to adapt classification and prediction models to online training.
The approach selects the kernel that is the most appropriate to new batches of labeled observations.
36. References
• K. Murphy, Machine Learning: A Probabilistic Perspective, §14.1 Kernels Introduction, MIT Press, 2012
• D. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989
• P. Nicolas, Scala for Machine Learning, §10 Genetic Algorithms, Packt Publishing, 2014
• E.D. Goodman, Introduction to Genetic Algorithms, §Scaling of Relative Fitness, Michigan State University, 2009 World Summit on Genetic and Evolutionary Computation
• A. Eiben, J.E. Smith, Introduction to Evolutionary Computing, §2 What is an Evolutionary Algorithm?, Springer, 2003
Editor's Notes
Context of the presentation:
The transition from Java and Python to Scala is not that easy: it goes beyond selecting Scala for its obvious benefits:
- support for functional concepts
- the ability to leverage open-source libraries and frameworks if needed
- fast and distributed enough to handle large data sets
Scala was the most logical choice.
Scientific programming may very well involve different roles in a project:
- Mathematicians for formulas
- Data scientists for data processing and modeling
- Software engineers for implementation
- Dev-ops and performance engineers for deployment in production
In order to ease the pain, we tend to learn/adopt Scala incrementally within a development team. The problem is that you end up with an inconsistent code base with different levels of quality, and the team develops a somewhat negative attitude toward the language. The solution is to select a list of problems or roadblocks (in our case machine learning) and compare the solution in Scala with Java, Python, etc. (you sell the outcome, not the process).
Presentation: a set of diverse Scala features or constructs for which the entire team agreed that Scala is a far better solution than Python or Java.