Scala for
Machine Learning
Patrick Nicolas
December 2014
patricknicolas.blogspot.com
www.slideshare.net/pnicolas
What challenges?
Building scientific and machine learning applications
requires…
1. Clearly defined abstractions
2. Flexible, dynamic models
3. Scalable execution
… and may involve mathematicians, data scientists,
software engineers and DevOps.
What makes Scala particularly suitable for solving
machine learning and optimization problems?
Scala toolbox
Which elements of the Scala toolbox are useful to meet
these challenges?
Actors
Composed futures
F-bound
Reactive
Abstraction
Non-linear learning models <= functorial tensors
Kernel monadic composition <= monads
Extending library types <= implicits
Flexibility
Scalability
Low-dimensional feature space (manifold) embedded into a Euclidean observation space
Abstraction: Non-linear learning models
f(x, y, z)
∇f = (∂f/∂x) i + (∂f/∂y) j + (∂f/∂z) k
Each type of tensor is a category, associated with a functor category:
• Field
• Vector field (contravariant)
• Inner product: ⟨v, w⟩ = f
• Covariant vector field (one-form/map): α_w(v) = ⟨v, w⟩
• Tensor product T^m_n ⊗ T^q_p, exterior product dx_i ∧ dx_j, …
Tensor fields are geometric entities defining linear relations between
vector fields, differential forms, scalars and other tensor fields.
Abstraction: Non-linear learning models
Machine learning consists of identifying a low-dimensional
feature space, a manifold within a Euclidean observation
space. The computation of smooth manifolds relies on tensors
and tensor metrics (Riemann, Laplace-Beltrami, …)
Problem: How to represent tensors and metrics?
Solution: Functorial representation of tensors,
tensor products and differential forms.
Abstraction: Non-linear learning models
One option is to define a vector field as a collection (e.g. a List) and
leverage the functor for lists.
Functor: f: U => V ⟹ F(f): F(U) => F(V)
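The slide's code is not reproduced here; as a minimal sketch under assumed names (`VField`, `map`), the list-based option looks like this:

```scala
// One option: a vector field as a List, whose functor action is List.map.
// The name VField is illustrative, not the author's code.
type VField[T] = List[T]

// Lift f: U => V to F(f): F(U) => F(V)
def map[U, V](vf: VField[U])(f: U => V): VField[V] = vf.map(f)

val vf: VField[Double] = List(1.0, 2.0, 3.0)
val scaled = map(vf)((x: Double) => 2.0 * x)
```

This is convenient precisely because List already satisfies the functor laws, but, as the next slide notes, it is not a faithful model of a tensor.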
Abstraction: Non-linear learning models
Convenient but incorrect…
Let’s define generic vector field and covector field types.
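A hedged sketch of such types; the names `VField` and `CoVField` are assumptions. A covector field (one-form) maps a vector field to a scalar:

```scala
// Illustrative type definitions, not the author's original code
type VField[U] = Vector[U]            // (contravariant) vector field
type CoVField[U] = VField[U] => Double // covariant vector field (one-form)

// Example one-form: the inner product <v, w> against a fixed w
val w: VField[Double] = Vector(1.0, 2.0)
val alpha: CoVField[Double] =
  v => v.zip(w).map { case (a, b) => a * b }.sum
```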
Abstraction: Non-linear learning models
Define a tensor as a higher-kinded type that is either a vector or a
covector, accessed through type projection.
The functor for the vector field relies on the projection (Hom
functor) of the two-argument type constructor Tensor onto its covariant and
contravariant type parameters.
Covariant functor: f: U => V ⟹ F(f): F(U) => F(V)
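A minimal sketch of the covariant side, with assumed names (`Tensor`, `Functor`, `VField`); the full type-projection machinery is omitted:

```scala
// Two-argument type constructor: contravariant (vector) and covariant
// (covector) slots. Names are illustrative, not the author's code.
trait Tensor[V, W]

type VField[U] = Vector[U]

// Covariant functor: lifts f: U => V to F(f): F(U) => F(V)
trait Functor[M[_]] { def map[U, V](m: M[U])(f: U => V): M[V] }

val vFieldFunctor: Functor[VField] = new Functor[VField] {
  def map[U, V](vu: VField[U])(f: U => V): VField[V] = vu.map(f)
}
```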
Abstraction: Non-linear learning models
Contravariant functors are used for morphisms or transformations on
covariant tensors (type CoVField).
Contravariant functor: f: U => V ⟹ F(f): F(V) => F(U)
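A hedged sketch of the contravariant functor for covector fields (names assumed): the lifted morphism runs in the opposite direction, F(V) => F(U).

```scala
// Illustrative definitions, not the author's original code
type VField[U] = Vector[U]
type CoVField[U] = VField[U] => Double  // one-form: vectors to scalars

// Contravariant functor: f: U => V lifts to F(f): F(V) => F(U)
trait CoFunctor[M[_]] { def map[U, V](m: M[V])(f: U => V): M[U] }

val coVFieldCoFunctor: CoFunctor[CoVField] = new CoFunctor[CoVField] {
  def map[U, V](m: CoVField[V])(f: U => V): CoVField[U] =
    (vu: VField[U]) => m(vu.map(f))   // pre-compose with f
}
```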
Abstraction: Non-linear learning models
Tensor metrics and products require other types of functors…
• Product functor
• BiFunctor
(*) Paul Phillips’ cats framework https://github.com/non/cats
Abstraction: Non-linear learning models
Abstraction: Kernel monadic composition
Clustering or classifying observations entails computing the
inner product of observations on the manifold.
Kernel functions are commonly used in training to separate
classes of observations with a linear decision boundary
(hyperplane).
Problem: Building a model entails creating,
composing and evaluating numerous kernels.
Solution: Define kernels as a first-class
programming concept with monadic operations.
Define a kernel function as the composition of two functions g ∘ h:
𝒦f(x, y) = g(Σᵢ h(xᵢ, yᵢ))
Abstraction: Kernel monadic composition
We create a monad to generate any kind of kernel function Kf by
composing their components: g1 ∘ g2 ∘ … ∘ gn ∘ h
A monad extends a functor with a binding method (flatMap).
The monadic implementation of the kernel function component h:
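The original implementation is not reproduced here; the following is a minimal, hypothetical sketch of such a kernel monad. The names `KF`, `KFMonad` and `metric` are assumptions:

```scala
// KF wraps the outer function g and the pairwise function h of K = g o h
case class KF[G](g: G, h: (Double, Double) => Double)

// Monadic operations rebind the outer function g
implicit class KFMonad[G](kf: KF[G]) {
  def map[H](f: G => H): KF[H]         = KF[H](f(kf.g), kf.h)
  def flatMap[H](f: G => KF[H]): KF[H] = f(kf.g)
}

// Evaluate K(x, y) = g( sum_i h(x_i, y_i) ) once g is a Double => Double
def metric(kf: KF[Double => Double])(x: Seq[Double], y: Seq[Double]): Double =
  kf.g(x.zip(y).map { case (u, v) => kf.h(u, v) }.sum)

// Radial basis function kernel (sigma = 1) composed with a polynomial
// outer function through a for-comprehension: g' = g2 o g1
val rbf = KF((s: Double) => math.exp(-0.5 * s * s),
             (u: Double, v: Double) => u - v)
val composed: KF[Double => Double] = for {
  g1 <- rbf
  g2 <- KF((s: Double) => (1.0 + s) * (1.0 + s), rbf.h)
} yield g1 andThen g2
```

For identical inputs the pairwise differences sum to zero, so the RBF factor is 1 and the composed kernel evaluates the polynomial outer function at 1.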
Abstraction: Kernel functions composition
Declaration of an explicit kernel function:
𝒦(x, y) = exp(−½ ((x − y)/σ)²)
h: (x, y) → x − y    g: x → exp(−x²/(2σ²))
Polynomial kernel:
𝒦(x, y) = (1 + x·y)^d
h: (x, y) → x·y    g: x → (1 + x)^d
Abstraction: Kernel functions composition
Radial basis function kernel
Our monad is ready to compose any kind of explicit kernel on
demand, using a for-comprehension.
Abstraction: Kernel functions composition
Notes
• Quite often monads define filtering capabilities (e.g.
Scala collections).
• Incidentally, the for-comprehension closure can also
be used to create dynamic workflows.
Abstraction: Extending library types
Scala library classes cannot always be
subclassed, and wrapping a library component in a
helper class clutters the design.
Implicit classes extend class functionality
without cluttering namespaces (an alternative to
type classes).
The purpose of reusability goes beyond refactoring code.
It includes leveraging existing, well-understood concepts
and semantics.
Data flow micro-router for successful and failed computations,
by transforming Try to Either with recovery and processing
functions:
scala.util.Try[T]
recover[U >: T](f: PartialFunction[Throwable, U]): Try[U]
getOrElse[U >: T](f: () => U): U
orElse[U >: T](f: () => Try[U]): Try[U]
toEither[U](rec: () => U)(f: T => T): Either[U, T]
Abstraction: Extending library types
… as applied to a normalization problem:
four lines of Scala code extend Try with the Either concept.
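A hedged four-line sketch of that extension; the implicit class name `Try2Either` is an assumption:

```scala
import scala.util.{Failure, Success, Try}

// Route a successful computation to Right (after processing f) and a
// failed one to Left (after recovery rec)
implicit class Try2Either[T](t: Try[T]) {
  def toEither[U](rec: () => U)(f: T => T): Either[U, T] = t match {
    case Success(v) => Right(f(v))
    case Failure(_) => Left(rec())
  }
}
```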
Abstraction: Extending library types
Notes
Abstraction: Extending library types
• Type conversions such as toDouble, toFloat can be
extended to deal with rounding errors or rendering
precision.
• Creating a type class is a more generic (more appropriate?)
methodology for extending the functionality of a closed
model or framework. Is there a reason why Try in the
Scala standard library does not support conversion to
Either?
Abstraction
non-linear learning models <= functorial tensors
Kernel monadic composition <= monads
Extending library types <= implicits
Flexibility
Modeling <= Stackable traits
Scalability
Flexibility: modeling
Building machine learning apps requires
configurable, dynamic workflows that preserve
the model formalism.
Leverage mixins, inheritance and abstract values
to create models and weave data transformations.
Factory design patterns have been used to model dynamic
systems (GoF). Are they adequate for modeling dynamic
workflows?
Flexibility: modeling
Traditional programming languages compare unfavorably to
scientific languages such as R because of their inability
to follow a strict mathematical formalism:
1. Variable declaration
2. Model definition
3. Instantiation
Scala’s stackable traits and abstract values preserve the core
formalism of mathematical expressions.
Declaration:   f ∈ ℝⁿ → ℝⁿ,  g ∈ ℝⁿ → ℝ
Model:         f(x) = eˣ,  g(x) = Σᵢ xᵢ
Instantiation: h = g ∘ f
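A hedged sketch of how abstract values and trait mix-ins mirror this declaration / model / instantiation sequence; the trait names are illustrative:

```scala
// Declaration: abstract values stand in for f and g
trait F { val f: Vector[Double] => Vector[Double] }
trait G { val g: Vector[Double] => Double }

// Model: h = g o f, expressed against the declarations only
trait H { self: F with G =>
  def apply(x: Vector[Double]): Double = g(f(x))
}

// Instantiation: mix in concrete f(x) = e^x and g(x) = sum x_i
val h = new H with F with G {
  val f = (v: Vector[Double]) => v.map(math.exp)
  val g = (v: Vector[Double]) => v.sum
}
```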
Flexibility: modeling
Multiple models and algorithms are typically evaluated by
weaving computation tasks.
A learning platform is a framework that
• Defines computational tasks
• Wires the tasks (data flow)
• Deploys the tasks (*)
It overcomes the limitations of monadic composition (3 levels of
dynamic binding…)
(*) Actor-based deployment
Flexibility: modeling
Even the simplest workflow (a model of data transformations) requires
flexibility…
Flexibility: modeling
Data scientists should be able to
1. Given the objective of the computation, select the best
sequence of modules/tasks (e.g. Modeling: Preprocessing +
Training + Validating)
2. Given the profile of the input data, select the best data
transformation for each module (e.g. Data preprocessing:
Kalman, DFT, moving average, …)
3. Given the computing platform, select the best
implementation for each data transformation (e.g. Kalman:
KalmanOnAkka, Spark, …)
Flexibility: modeling
Implementation of Preprocessing module
Flexibility: modeling
Implementation of the Preprocessing module using the discrete Fourier
transform… and the discrete Kalman filter
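The DFT and Kalman implementations are not reproduced here. As a hedged illustration of the plug-in pattern (trait name and the moving-average stand-in are assumptions):

```scala
// A time series of observations
type TS = Vector[Double]

// Interchangeable preprocessing implementations share one interface
trait Preprocessor { def apply(xt: TS): Option[TS] }

// A simple moving average as one concrete implementation; DFTFilter and
// Kalman would plug into the same trait
class MovingAverage(period: Int) extends Preprocessor {
  def apply(xt: TS): Option[TS] =
    if (xt.size < period) None
    else Some(xt.sliding(period).map(w => w.sum / period).toVector)
}
```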
Flexibility: modeling
[Diagram: workflow modules Loading, Preprocessing, Reducing, Training and
Validating, with implementations Preprocessor (DFTFilter, Kalman),
Reducer (EM, PCA) and Supervisor (SVM, MLP)]
Clustering workflow = preprocessing task -> reducing task
Modeling workflow = preprocessing task -> model training
task -> model validation task
Flexibility: modeling
A simple clustering workflow requires a preprocessor and a
reducer. The computation sequence exec transforms a
time series of elements of type U and returns a time series
of type W as an option.
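A hedged sketch of that exec sequence; the names `TS`, `Preprocessor`, `Reducer` and `Clustering` are assumptions:

```scala
type TS[T] = Vector[T]

trait Preprocessor[U, V] { def apply(u: TS[U]): Option[TS[V]] }
trait Reducer[V, W]      { def apply(v: TS[V]): Option[TS[W]] }

trait Clustering[U, V, W] {
  val preprocessor: Preprocessor[U, V]
  val reducer: Reducer[V, W]

  // exec: TS[U] => Option[TS[W]], chaining the two stages
  def exec(u: TS[U]): Option[TS[W]] =
    preprocessor(u).flatMap(reducer.apply)
}
```

The modeling workflow extends the same pattern with a training supervisor and a validator in place of the reducer.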
Flexibility: modeling
A model is created by processing the original time series of type TS[T]
through a preprocessor, a training supervisor and a validator
Flexibility: modeling
Putting it all together for conditional path execution…
Flexibility: modeling
Abstraction
Non-linear learning models <= functorial tensors
Kernel monadic composition <= monads
Extending library types <= implicits
Flexibility
Modeling <= Stackable traits
Scalability
Dynamic programming <= tail recursion
Online processing <= streams
Data flow control <= back-pressure strategy
Scalability: dynamic programming
Many machine learning algorithms (HMM, RL,
EM, MLP, …) rely on dynamic programming
techniques.
Tail recursion is a very efficient solution because it
avoids the creation of new stack frames.
Choosing between iterative and recursive implementations
of algorithms is a well-documented dilemma.
Viterbi algorithm for hidden Markov Models
The objective is to find the most likely sequence of states
{qt} given a set of observations {Ot} and a λ-model
Scalability: dynamic programming
The algorithm recurses along the observations with N
different states.
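A hedged sketch of that recursion (probabilities only, no back-pointers); the function name and the pi/A/B signature are illustrative:

```scala
import scala.annotation.tailrec

// pi: initial state probabilities, A: transition matrix, B: emission matrix,
// obs: observation symbol indices
def viterbi(obs: List[Int], pi: Array[Double],
            A: Array[Array[Double]], B: Array[Array[Double]]): Double = {

  @tailrec
  def recurse(os: List[Int], delta: Array[Double]): Double = os match {
    case Nil => delta.max                 // probability of the best path
    case o :: rest =>
      // delta_t(j) = max_i( delta_{t-1}(i) * a_ij ) * b_j(o_t)
      val next = delta.indices.map { j =>
        delta.indices.map(i => delta(i) * A(i)(j)).max * B(j)(o)
      }.toArray
      recurse(rest, next)
  }

  obs match {
    case o :: rest => recurse(rest, pi.indices.map(i => pi(i) * B(i)(o)).toArray)
    case Nil       => 0.0
  }
}
```

The @tailrec annotation makes the compiler verify that the recursion is compiled to a loop, so no new stack frame is created per observation.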
Scalability: dynamic programming
Relative performance of the recursion with and without tail-call
elimination for the Viterbi algorithm, given the number of observations
Scalability: dynamic programming
Scalability: online processing
Some problems lend themselves to processing very
large data sets of unknown size, for which the
execution may have to be aborted or re-applied.
Streams reduce memory consumption by
allocating and releasing chunks of data (slices of a
time series) while allowing reuse of intermediate
results.
An increasing number of algorithms, such as reinforcement
learning, rely on online (or on-demand) training.
[Diagram: data stream x0, x1, …, xn, …, xm; slices are allocated from the
heap with .take and released to the garbage collector with .drop]
Traversal loss function:
L(w) = 1/(2m) Σn (yn − f(w|xn))² + λ‖w‖²
Scalability: online processing
The large data set is converted into a stream, then broken
down into manageable slices. The slices are instantiated,
processed (e.g. by the loss function) and released back to the
garbage collector, one at a time.
Slices of NOBS observations are allocated (.take),
processed, then released (.drop), one at a time.
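A hedged sketch of the slice-at-a-time traversal (names are illustrative; Stream is LazyList in Scala 2.13+):

```scala
import scala.annotation.tailrec

// Fold f over the stream one slice of nObs observations at a time,
// taking a slice, processing it, then dropping it
@tailrec
def traverse(strm: Stream[Double], nObs: Int, acc: Double)
            (f: Seq[Double] => Double): Double =
  if (strm.isEmpty) acc
  else traverse(strm.drop(nObs), nObs, acc + f(strm.take(nObs)))
```

Combined with a weak reference to the stream head, as the next slide discusses, processed slices become eligible for garbage collection.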
Scalability: online processing
The reference streamRef has to be weak in order to have the slices
garbage collected; otherwise memory consumption increases
with each new batch of data.
(*) Alternatives: define streamRef as a def, or use StreamIterator.
Scalability: online processing
Comparing list, stream and stream with weak references.
Scalability: online processing
Notes:
Iterators:
• Computations cannot be memoized (“iterators are the
imperative version of streams”)
• One element at a time
• Non-recursive (tail-call elimination)
Views:
• No intermediate results preserved
• One element at a time
Stream iterators:
• Lazy tails
Scalability: online processing
The execution of a workflow may create stream
bottlenecks for slow tasks and overflow local
buffers.
A flow-control mechanism handles back pressure
on the bounded mailboxes of upstream actors.
Actors provide a very efficient and reliable way to deploy
workflows and tasks over a large number of cores and
hosts.
Scalability: flow control
Scalability: flow control
An actor-based workflow has to consider
• Cascading failures => supervision strategy
• Cascading bottlenecks => mailbox back-pressure strategy
[Diagram: a router/dispatcher distributing work to worker actors]
Akka has reliable mechanisms to handle failures. What about
temporary disruptions?
Scalability: flow control
Message-passing scheme to process various data streams
with transformations.
[Diagram: the Controller loads the Dataset and sends Load/Compute
messages to Worker actors through bounded mailboxes; the Watcher polls
the queues with GetStatus, returns Status, and Workers report Completed]
Worker actors process the data chunk msg.xt sent by the
Controller with the transformation msg.fct.
The message is sent by the collector to trigger the computation.
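The message definitions are not reproduced here; a hypothetical sketch of the protocol, with illustrative payload types:

```scala
// Messages exchanged between Controller, Workers and Watcher
sealed trait Msg
case class Load(input: Seq[Double]) extends Msg        // trigger loading
case class Compute(xt: Vector[Double],                 // data chunk
                   fct: Vector[Double] => Double)      // transformation
  extends Msg
case class Completed(result: Double) extends Msg       // worker reply
case object GetStatus extends Msg                      // no payload
case class Status(load: Double) extends Msg            // queue load report
```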
Scalability: flow control
The Watcher actor monitors the message queues and reports to the
collector with a Status message.
The GetStatus message sent by the collector has no payload.
Scalability: flow control
The Controller creates the workers, a bounded mailbox for each worker
actor (msgQueues) and the watcher actor.
Scalability: flow control
The Controller loads the data sets chunk by chunk upon receiving the
Load message from the main program. It processes the results
of the computation from the workers (Completed) and throttles
the input to the workers for each Status message.
Scalability: flow control
The Load message is implemented as a loop that creates data chunks
whose size is adjusted according to the load computed by the
watcher and forwarded to the controller via Status messages.
Scalability: flow control
A simple throttle increases/decreases the size of the batch of observations
given the current load and a specified watermark.
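A hedged sketch of such a watermark throttle; the names and the halving/doubling policy are assumptions:

```scala
// Operating zone bounded by a low and a high watermark on queue load
case class Watermark(low: Int, high: Int)

def throttle(batchSize: Int, load: Int, wm: Watermark): Int =
  if (load > wm.high) math.max(1, batchSize / 2)  // overloaded: shrink batches
  else if (load < wm.low) batchSize * 2           // underloaded: grow batches
  else batchSize                                  // within the operating zone
```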
Scalability: flow control
Selecting a faster/slower and less/more accurate version of an
algorithm can also be part of the regulation strategy.
A feedback control loop adjusts the size of the batches given the
load in the mailboxes and the complexity of the computation.
Scalability: flow control
Notes
• The feedback control loop should be smoothed (moving
average, Kalman, …)
• A larger variety of data flow control actions is possible, such as
adding more workers, increasing queue capacity, …
• The watchdog should handle dead letters, in case of a
failure of the feedback control or the workers.
• Reactive streams, introduced in Akka 2.2+, have a
sophisticated TCP-based propagation and back-pressure
flow control.
Scalability: flow control
… and there is more
There are many other Scala programming language constructs
I found particularly intriguing as far as machine learning is
concerned…
Reactive streams (TCP)
Effective fault-tolerance and flow-control mechanism
Domain Specific Languages
Emulate the ‘R’ language for scientists to use the application
Delimited continuations
Save, restore, reuse computation states
Donate to the Apache Software and Eclipse foundations
References
• “Monads are Elephants”, J. Iry –
james-iry.blogspot.com/2007/10/monads-are-elephans-part2.html
• “Extending the Cake pattern: Dependency injection in Scala”, A. Warski –
www.warski.org/blog/2010/12/di-in-scala-cake-pattern
• Programming in Scala, §12.5 “Traits as stackable modifications”,
M. Odersky, L. Spoon, B. Venners – Artima 2008
• “Introducing Akka”, J. Boner – Typesafe 2012,
www.slideshare.net/jboner/introducing-akka
• Scala for Machine Learning, Ch. 1 “Getting started”, P. Nicolas –
Packt Publishing 2014
• “Exploring Akka Stream’s TCP Back Pressure”, U. Peter – Xebia 2015,
blog.xebia.com/2015/01/14/exploring-akka-streams-tcp-back-pressure/
• Cats functional library, P. Phillips – github.com/non/cats
Taxonomy-based Contextual Ads TargetingPatrick Nicolas
 
Multi-tenancy in Private Clouds
Multi-tenancy in Private CloudsMulti-tenancy in Private Clouds
Multi-tenancy in Private CloudsPatrick Nicolas
 

More from Patrick Nicolas (6)

Autonomous medical coding with discriminative transformers
Autonomous medical coding with discriminative transformersAutonomous medical coding with discriminative transformers
Autonomous medical coding with discriminative transformers
 
Open Source Lambda Architecture for deep learning
Open Source Lambda Architecture for deep learningOpen Source Lambda Architecture for deep learning
Open Source Lambda Architecture for deep learning
 
AI for electronic health records
AI for electronic health recordsAI for electronic health records
AI for electronic health records
 
Hadoop Ecosystem
Hadoop EcosystemHadoop Ecosystem
Hadoop Ecosystem
 
Taxonomy-based Contextual Ads Targeting
Taxonomy-based Contextual Ads TargetingTaxonomy-based Contextual Ads Targeting
Taxonomy-based Contextual Ads Targeting
 
Multi-tenancy in Private Clouds
Multi-tenancy in Private CloudsMulti-tenancy in Private Clouds
Multi-tenancy in Private Clouds
 

Recently uploaded

Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...gajnagarg
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...Health
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
Harnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptxHarnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptxParas Gupta
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........EfruzAsilolu
 
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATIONCapstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATIONLakpaYanziSherpa
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...nirzagarg
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.pptibrahimabdi22
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格q6pzkpark
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangeThinkInnovation
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制vexqp
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...Elaine Werffeli
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxchadhar227
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubaikojalkojal131
 

Recently uploaded (20)

Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Harnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptxHarnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptx
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........
 
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATIONCapstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 

Scala for Machine Learning

  • 1. Scala for Machine Learning Patrick Nicolas December 2014 patricknicolas.blogspot.com www.slideshare.net/pnicolas
  • 2. What challenges? Building scientific and machine learning applications requires: 1. Clearly defined abstractions 2. Flexible, dynamic models 3. Scalable execution What makes Scala particularly suitable to solve machine learning and optimization problems? ... and may involve mathematicians, data scientists, software engineers and dev. ops.
  • 3. Scala tool box Which elements in the Scala tool box are useful to meet these challenges? Actors Composed futures F-bound Reactive
  • 4. Abstraction Non-linear learning models <= functorial tensors Kernel monadic composition <= monads Extending library types <= implicits Flexibility Scalability
  • 5. Low dimension features space (manifold) embedded into an observation space (Euclidean) Abstraction: Non-linear learning models
  • 6. f(x, y, z), ∇f = (∂f/∂x) i + (∂f/∂y) j + (∂f/∂z) k. Tensor fields are geometric entities defining linear relations between vector fields, differential forms, scalars and other tensor fields. Each type of tensor is a category, associated with a functor category: • Field • Vector field (contravariant) • Inner product ⟨v, w⟩ = f • Covariant vector field (one-form/map) α(w) = ⟨v, w⟩ • Tensor product T_m^n ⊗ T_p^q, exterior product dx^i ∧ dx^j, … Abstraction: Non-linear learning models
  • 7. Machine learning consists of identifying a low-dimension features space, a manifold embedded within a Euclidean observation space. Computation on smooth manifolds relies on tensors and tensor metrics (Riemann, Laplace-Beltrami, …). Problem: How to represent tensors and metrics? Solution: Functorial representation of tensors, tensor products and differential forms. Abstraction: Non-linear learning models
  • 8. One option is to define a vector field as a collection (i.e. List) and leverage the functor for the list. Functor: f: U => V F(f): F(U) => F(V) Abstraction: Non-linear learning models Convenient but incorrect…
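The list-based option can be sketched as follows; a hedged sketch, with the type alias VField[U] = List[U] taken from the speaker notes and the sample field values invented for illustration:

```scala
// Vector field as a plain collection: the functor is simply List's map
// (type alias from the speaker notes; sample values are illustrative).
type VField[U] = List[U]

// Functor action: lift f: U => V into F(f): VField[U] => VField[V]
def fmap[U, V](vu: VField[U])(f: U => V): VField[V] = vu.map(f)

val field: VField[Double] = List(1.0, 2.0, 3.0)
val scaled: VField[Double] = fmap(field)(_ * 2.0)
```

Convenient, because map comes for free; incorrect, as the slide says, because a List carries no covariant/contravariant distinction, so vector fields and one-forms collapse into the same type.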
  • 9. Let’s define a generic vector field and covector fields types Abstraction: Non-linear learning models Define a tensor as a Higher kind being either a vector or a co-vector accessed through type projection.
  • 10. The functor for the vector field relies on the projection (Hom functor) of the two-argument type functor Tensor over covariant and contravariant types. Covariant functor: f: U => V, F(f): F(U) => F(V). Abstraction: Non-linear learning models
  • 11. Contravariant functors are used for morphisms or transformations on covariant tensors (type CoVField). Contravariant functor: f: U => V, F(f): F(V) => F(U). Abstraction: Non-linear learning models
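A minimal sketch of such a contravariant functor, using the CoField type alias from the speaker notes (the norm and embed functions are hypothetical examples):

```scala
// Covariant vector field (one-form): maps a vector field into a scalar
// (type alias from the speaker notes: CoField[U, T] = U => T).
type CoField[U, T] = U => T

// Contravariant functor: a morphism f: U => V lifts backwards,
// F(f): CoField[V, T] => CoField[U, T], by pre-composition.
def contramap[U, V, T](cv: CoField[V, T])(f: U => V): CoField[U, T] = cv.compose(f)

// Hypothetical example: pull a norm on vectors back along an embedding of scalars
val norm: CoField[Vector[Double], Double] = v => math.sqrt(v.map(x => x * x).sum)
val embed: Double => Vector[Double] = x => Vector(x, 2.0 * x)
val pulledBack: CoField[Double, Double] = contramap(norm)(embed)
```

The inversion of the arrow in contramap is what the notes describe as raising/lowering the index of the tensor.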
  • 12. Tensor metrics and products require other types of functors: Product functor, BiFunctor (*), … (*) Paul Phillips’ cats framework https://github.com/non/cats Abstraction: Non-linear learning models
  • 13. Abstraction: Kernel monadic composition Clustering or classifying observations entails the computation of the inner product of observations on the manifold. Kernel functions are commonly used in training to separate classes of observations with a linear decision boundary (hyperplane). Problem: Building a model entails creating, composing and evaluating numerous kernels. Solution: Define kernels as a 1st class programming concept with monadic operations.
  • 14. Define a kernel function as the composition of two functions, g ∘ h: K_f(x, y) = g(Σ_i h(x_i, y_i)). We create a monad to generate any kind of kernel function K_f by composing their g components: g1 ∘ g2 ∘ … ∘ gn ∘ h. Abstraction: Kernel monadic composition
  • 15. A monad extends a functor with a binding method (flatMap). The monadic implementation of the kernel function component h. Abstraction: Kernel functions composition
  • 16. Declaration of explicit kernel functions. Radius basis function kernel: K(x, y) = exp(−(1/2)(‖x − y‖/σ)²), with h: (x, y) → x − y and g: x → exp(−x²/(2σ²)). Polynomial kernel: K(x, y) = (1 + x·y)^d, with h: (x, y) → x·y and g: x → (1 + x)^d. Abstraction: Kernel functions composition
  • 17. Our monad is ready for composing any kind of explicit kernels on demand, using for-comprehension Abstraction: Kernel functions composition
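A self-contained sketch of the idea, assuming a kernel is the pair (g, h) with K(x, y) = g(Σᵢ h(xᵢ, yᵢ)); the names KF and KFMonad follow the deck, but the exact signatures here are guesses:

```scala
// A kernel as the pair (g, h): K(x, y) = g( sum_i h(x_i, y_i) )
case class KF(g: Double => Double, h: (Double, Double) => Double) {
  def apply(x: Array[Double], y: Array[Double]): Double =
    g(x.zip(y).map { case (a, b) => h(a, b) }.sum)
}

// Monadic operations bind over the g component: flatMap composes the
// g-transforms (g1 o g2 o ... o h) while keeping the original h.
implicit class KFMonad(kf: KF) {
  def map(f: (Double => Double) => (Double => Double)): KF = KF(f(kf.g), kf.h)
  def flatMap(f: (Double => Double) => KF): KF = KF(f(kf.g).g.compose(kf.g), kf.h)
}

// Radius basis function and polynomial (d = 2) kernels from the previous slide
val rbf  = KF(x => math.exp(-0.5 * x * x), (a, b) => a - b)
val poly = KF(x => math.pow(1.0 + x, 2.0), (a, b) => a * b)

// Compose explicit kernels on demand with a for-comprehension
val composed: KF = for { g1 <- rbf; g2 <- poly } yield g2
```

Because KFMonad is an implicit class, any KF instance picks up map and flatMap, which is what makes the for-comprehension compile.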
  • 18. Notes • Quite often monads define filtering capabilities (i.e. Scala collections). • Incidentally, the for-comprehension closure can also be used to create a dynamic workflow. Abstraction: Kernel functions composition
  • 19. Abstraction: Extending library types Scala library classes cannot always be sub-classed, and wrapping a library component in a helper class clutters the design. Implicit classes extend a class's functionality without cluttering namespaces (an alternative to type classes). The purpose of reusability goes beyond refactoring code: it includes leveraging existing, well-understood concepts and semantics.
  • 20. Data flow micro-router for successful and failed computations, transforming Try to Either with recovery and processing functions. scala.util.Try[T] recover[U >: T](f: PartialFunction[Throwable, U]): Try[U] getOrElse[U >: T](f: () => U): U orElse[U >: T](f: () => Try[U]): Try[U] toEither[U](rec: () => U)(f: T => T): Either[U, T] Abstraction: Extending library types
  • 21. … as applied to a normalization problem: 4 lines of Scala code to extend Try with the Either concept. Abstraction: Extending library types
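A hedged reconstruction of those 4 lines and the normalization use case (the toEither signature is from the previous slide; the normalize helper is illustrative):

```scala
import scala.util.{Failure, Success, Try}

// Extend Try with a micro-router to Either: recovery function on failure,
// processing function on success (signature from the slide).
implicit class Try2Either[T](val t: Try[T]) {
  def toEither[U](rec: () => U)(f: T => T): Either[U, T] = t match {
    case Success(v) => Right(f(v))
    case Failure(_) => Left(rec())
  }
}

// Illustrative normalization: return the raw input on the failure branch
def normalize(xs: List[Double]): Either[List[Double], List[Double]] =
  Try {
    val s = xs.sum
    require(s > 0.0)          // a zero or negative sum routes to the Left branch
    xs.map(_ / s)
  }.toEither(() => xs)(identity)
```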
  • 22. Notes Abstraction: Extending library types • Type conversions such as toDouble and toFloat can be extended to deal with rounding errors or rendering precision. • Creating a type class is a more generic (appropriate?) methodology to extend the functionality of a closed model or framework. Is there a reason why Try in the Scala standard library does not support conversion to Either?
  • 23. Abstraction non-linear learning models <= functorial tensors Kernel monadic composition <= monads Extending library types <= implicits Flexibility Modeling <= Stackable traits Scalability
  • 24. Flexibility: modeling Building machine learning apps requires configurable, dynamic workflows that preserve the model formalism Leverage mixins, inheritance and abstract values to create models and weave data transformation. Factory design patterns have been used to model dynamic systems (GoF). Are they adequate to model dynamic workflow?
  • 25. Flexibility: modeling Traditional programming languages compare unfavorably to scientific languages such as R because of their inability to follow a strict mathematical formalism: 1. Variable declaration 2. Model definition 3. Instantiation Scala stacked traits and abstract values preserve the core formalism of mathematical expressions.
  • 26. Declaration: f ∈ ℝⁿ → ℝⁿ, g ∈ ℝⁿ → ℝ. Model: f(x) = eˣ, g(x) = Σ_i x_i, h = g ∘ f. Instantiation. Flexibility: modeling
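The declaration/model/instantiation formalism can be sketched with traits and abstract values (the trait names are hypothetical; h is lazy so it is evaluated only after the abstract members are initialized):

```scala
// Declaration: abstract values only, mirroring f: R^n -> R^n and g: R^n -> R
trait Func   { val f: Vector[Double] => Vector[Double] }
trait Reduce { val g: Vector[Double] => Double }

// Model: h = g o f, written against the abstract members
trait Model extends Func with Reduce {
  lazy val h: Vector[Double] => Double = f.andThen(g)
}

// Instantiation: mix in the concrete definitions f(x) = e^x, g(x) = sum x_i
val model = new Model {
  val f = (v: Vector[Double]) => v.map(math.exp)
  val g = (v: Vector[Double]) => v.sum
}
```

The three steps stay in the same order as the mathematical formalism: types first, the composed model second, concrete values last.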
  • 27. Multiple models and algorithms are typically evaluated by weaving computation tasks. A learning platform is a framework that • Defines computational tasks • Wires the tasks (data flow) • Deploys the tasks (*) It overcomes the limitations of monadic composition (3 levels of dynamic binding…). (*) Actor-based deployment Flexibility: modeling
  • 28. Even the simplest workflow (model of data transformation) requires flexibility ….. Flexibility: modeling
  • 29. Data scientists should be able to 1. Given the objective of the computation, select the best sequence of module/tasks (i.e. Modeling: Preprocessing + Training + Validating) 2. Given the profile of data input, select the best data transformation for each module (i.e. Data preprocessing: Kalman, DFT, Moving average….) 3. Given the computing platform, select the best implementation for each data transformation (i.e. Kalman: KalmanOnAkka, Spark…) Flexibility: modeling
  • 30. Implementation of Preprocessing module Flexibility: modeling
  • 31. Implementation of Preprocessing module using discrete Fourier … and discrete Kalman filter Flexibility: modeling
  • 32. Modules: Preprocessing (Preprocessor: DFTFilter, Kalman), Reducing (Reducer: EM, PCA), Training and Validating (Supervisor: SVM, MLP), Loading. Clustering workflow = preprocessing task -> reducing task. Modeling workflow = preprocessing task -> model training task -> model validation. Flexibility: modeling
  • 33. A simple clustering workflow requires a preprocessor & reducer. The computation sequence exec transforms a time series of elements of type U and returns a time series of type W as an option. Flexibility: modeling
  • 34. A model is created by processing the original time series of type TS[T] through a preprocessor, a training supervisor and a validator Flexibility: modeling
  • 35. Putting it all together for a conditional path execution … Flexibility: modeling
  • 36. Abstraction Non-linear learning models <= functorial tensors Kernel monadic composition <= monads Extending library types <= implicits Flexibility Modeling <= Stackable traits Scalability Dynamic programming <= tail recursion Online processing <= streams Data flow control <= back-pressure strategy
  • 37. Scalability: dynamic programming Many machine learning algorithms (HMM, RL, EM, MLP, …) rely on dynamic programming techniques. Tail recursion is a very efficient solution because it avoids the creation of new stack frames. Choosing between iterative and recursive implementations of algorithms is a well-documented dilemma.
  • 38. Viterbi algorithm for hidden Markov Models The objective is to find the most likely sequence of states {qt} given a set of observations {Ot} and a λ-model Scalability: dynamic programming
  • 39. The algorithm recurses along the observations with N different states. Scalability: dynamic programming
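A simplified, hedged sketch of such a tail-recursive pass (it returns only the likelihood of the best path, not the path itself, and the λ-model is passed as plain vectors pi, a, b):

```scala
import scala.annotation.tailrec

// delta(j): likelihood of the most probable state sequence ending in state j.
// pi: initial state probabilities, a(i)(j): transition i -> j,
// b(j)(o): probability of emitting observation o in state j.
def viterbi(pi: Vector[Double],
            a: Vector[Vector[Double]],
            b: Vector[Vector[Double]],
            obs: List[Int]): Double = {

  @tailrec
  def recurse(rest: List[Int], delta: Vector[Double]): Double = rest match {
    case Nil => delta.max
    case o :: tail =>
      // induction step: delta'(j) = max_i( delta(i) * a(i)(j) ) * b(j)(o)
      val next = a.indices.toVector.map { j =>
        a.indices.map(i => delta(i) * a(i)(j)).max * b(j)(o)
      }
      recurse(tail, next)
  }

  obs match {
    case Nil       => 0.0
    case o :: tail => recurse(tail, pi.indices.toVector.map(i => pi(i) * b(i)(o)))
  }
}
```

The @tailrec annotation makes the compiler verify that the recursion is rewritten as a loop, which is the point of the performance comparison on the next slide.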
  • 40. Relative performance of the recursion w/o tail elimination for the Viterbi algorithm given the number of observations Scalability: dynamic programming
  • 41. Scalability: online processing Some problems lend themselves to processing very large data sets of unknown size, for which the execution may have to be aborted or re-applied. Streams reduce memory consumption by allocating and releasing chunks of data (slices of a time series) while allowing reuse of intermediate results. An increasing number of algorithms, such as reinforcement learning, rely on online (or on-demand) training.
  • 42. Scalability: online processing The large data set (X0 X1 … Xn … Xm) is converted into a stream, then broken down into manageable slices. The slices are allocated on the heap (.take), processed through the traversal of the loss function 1/(2m) Σ_n (y_n − f(w|x_n))² + λ‖w‖², and released (.drop) back to the garbage collector, one at a time.
  • 43. Slices of NOBS observations are allocated (.take), processed, then released (.drop), one at a time. Scalability: online processing
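A minimal sketch of this traversal, assuming a LazyList (Scala 2.13+; the original deck used Stream) and an arbitrary slice size NOBS:

```scala
// Traverse a lazy stream one slice of NOBS observations at a time:
// take a slice, process it, then drop it so it can be reclaimed.
val NOBS = 1000

def sliceSum(strm: LazyList[Double]): Double = {
  @annotation.tailrec
  def loop(s: LazyList[Double], acc: Double): Double =
    if (s.isEmpty) acc
    else loop(s.drop(NOBS), acc + s.take(NOBS).sum)  // allocate, process, release
  loop(strm, 0.0)
}

val data: LazyList[Double] = LazyList.tabulate(5000)(_.toDouble)
```

As the next slide warns, the strong reference data retains the head of the stream here, so earlier slices are not actually reclaimed; a weak reference (or a def) is needed for that.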
  • 44. The reference streamRef has to be weak, in order to have the slices garbage collected. Otherwise the memory consumption increases with each new batch of data. (*) Alternatives: define strmRef as a def or use StreamIterator Scalability: online processing
  • 45. Comparing list, stream and stream with weak references. Scalability: online processing Operating zone
  • 46. Notes: Iterators: • computations cannot be memoized (“iterators are the imperative version of streams”) • one element at a time • non-recursive (tail elimination) Views: • no intermediate results preserved • one element at a time Stream iterators: • lazy tails Scalability: online processing
  • 47. The execution of a workflow may create a stream bottleneck for slow tasks and overflow local buffers. This calls for a flow-control mechanism handling back pressure on the bounded mailboxes of upstream actors. Actors provide a very efficient and reliable way to deploy workflows and tasks over a large number of cores and hosts. Scalability: flow control
  • 48. Scalability: flow control An actor-based workflow has to consider • Cascading failures => supervision strategy • Cascading bottlenecks => mailbox back-pressure strategy Workers Router, Dispatcher, … Akka has a reliable mechanism to handle failures. What about temporary disruptions?
  • 49. Scalability: flow control Message-passing scheme to process various data streams with transformations. Dataset Workers Controller Watcher Load-> Compute-> Bounded mailboxes <- GetStatus Status -> Completed->
  • 50. Worker actors process the data chunk msg.xt sent by the Controller with the transformation msg.fct; the message is sent by the collector to trigger the computation. Scalability: flow control
  • 51. The Watcher actor monitors message queues and reports to the collector with a Status message. The GetStatus message sent by the collector has no payload. Scalability: flow control
  • 52. The Controller creates the workers, a bounded mailbox for each worker actor (msgQueues) and the watcher actor. Scalability: flow control
  • 53. The Controller loads the data sets per chunk upon receiving the Load message from the main program. It processes the results of the computation from the workers (Completed) and throttles the input to the workers for each Status message. Scalability: flow control
  • 54. The Load message is implemented as a loop that creates data chunks whose size is adjusted according to the load computed by the watcher and forwarded to the controller (Status). Scalability: flow control
  • 55. A simple throttle increases/decreases the size of the batch of observations given the current load and a specified watermark. Scalability: flow control Selecting a faster/slower, less/more accurate version of an algorithm can also be used in the regulation strategy.
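One plausible sketch of such a throttle as a pure function (all names and watermark values here are hypothetical):

```scala
// Adjust the batch size given the current mailbox load against two watermarks.
case class Throttle(lowWatermark: Int, highWatermark: Int, minSize: Int, maxSize: Int) {
  def adjust(batchSize: Int, load: Int): Int =
    if (load > highWatermark) math.max(minSize, batchSize / 2)      // back off
    else if (load < lowWatermark) math.min(maxSize, batchSize * 2)  // speed up
    else batchSize                                                  // operating zone
}

val throttle = Throttle(lowWatermark = 10, highWatermark = 100, minSize = 16, maxSize = 4096)
```

Keeping the regulation logic pure makes it trivial to test outside the actor system and to swap in a smoothed (moving average, Kalman) variant later.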
  • 56. A feedback control loop adjusts the size of the batches given the load in the mailboxes and the complexity of the computation. Scalability: flow control
  • 57. Notes • The feedback control loop should be smoothed (moving average, Kalman, …). • A larger variety of data-flow control actions is possible, such as adding more workers or increasing queue capacity. • The watchdog should handle dead letters in case of a failure of the feedback control or the workers. • Reactive streams, introduced in Akka 2.2+, have a sophisticated TCP-based propagation and back-pressure control flow. Scalability: flow control
  • 58. … and there is more There are many other Scala programming language constructs I found particularly intriguing as far as machine learning is concerned … Reactive streams (TCP): effective fault-tolerance & flow-control mechanism. Domain Specific Language: emulate the ‘R’ language for scientists to use the application. Delimited continuations: save, restore and reuse computation states.
  • 59. References Donate to the Apache Software and Eclipse foundations. Monads are Elephants, J. Iry – james-iry.blogspot.com/2007/10/monads-are-elephans-part2.html Extending the Cake pattern: Dependency injection in Scala, A. Warski – www.warski.org/blog/2010/12/di-in-scala-cake-pattern Programming in Scala, §12.5 Traits as stackable modifications, M. Odersky, L. Spoon, B. Venners – Artima 2008 Introducing Akka, J. Boner – Typesafe 2012, www.slideshare.net/jboner/introducing-akka Scala for Machine Learning, §1 Getting started, P. Nicolas – Packt Publishing 2014 Exploring Akka Stream’s TCP Back Pressure, U. Peter – Xebia 2015, blog.xebia.com/2015/01/14/exploring-akka-streams-tcp-back-pressure/ Cats functional library, P. Phillips – https://github.com/non/cats

Editor's Notes

  1. Context of the presentation: The transition from Java and Python to Scala is not that easy: it goes beyond selecting Scala for its obvious benefits — support for functional concepts, leverage of open source libraries and frameworks if needed, fast and distributed enough to handle large data sets. Scala was the most logical choice. Scientific programming may very well involve different roles in a project: mathematicians for formulas, data scientists for data processing and modeling, software engineers for implementation, dev. ops and performance engineers for deployment in production. In order to ease the pain, we tend to learn/adopt Scala incrementally within a development team. The problem is that you end up with an inconsistent code base with different levels of quality, and the team develops a somewhat negative attitude toward the language. The solution is to select a list of problems or roadblocks (in our case machine learning) and compare the solution in Scala with Java, Python, … (you sell the outcome, not the process). Presentation: a set of diverse Scala features or constructs for which the entire team agreed that Scala is a far better solution than Python or Java.
  2. There are really 3 dimensions to building machine learning apps. Abstraction is critical to encode or translate mathematical concepts. The ability to create workflows is needed for chaining data transformation and reduction. Scalable execution comes into play when deploying applications in production. How can the main players (mathematicians, data scientists, developers or dev. ops.) leverage Scala in their daily tasks?
  3. Being an object oriented and functional language, Scala has a lot of features and powerful constructs to choose from…. Here is a list of some of the features of Scala that are particularly valuable for writing machine learning algorithms or building applications that use these algorithms.
  4. Let’s start with ability in Scala to represent or implement abstractions such as Mathematical concepts, equations Programming concepts already defined in the Scala standard library.
  5. Geometric entities on a differential (Riemann) manifold are defined as tensors. Tensors can be covariant, contravariant, or a bilinear form such as a tensor product, inner product, n-differential forms …
  6. A contravariant vector or vector field is associated with a covariant functor. A covariant vector (linear map V -> Field) is associated with a contravariant functor A Tensor product is a bilinear form associated with a BiFunctor (or Hom functor) An inner product is associated with a ProFunctor ……
  7. Mathematicians and data scientists are faced with this problem.
  8. Covariant functors are used to define the morphisms or transformations on vector fields (contravariant) with type VField[U] = List[U], where U is a field (u(x, y, …)). If U and V are groups of fields f(xi), then F[U => V] := F[U] => F[V]. Functors as defined in Scala are actually covariant functors as defined in algebraic topology.
  9. Contravariant functors are used to define the morphisms or transformations on covariant vectors (or linear maps) with type CoField[U, T] = U => T, where U is contravariant. As U is a type argument in a higher-kinded CoFunctor, this functor has to be contravariant: F[U => V] := F[V] => F[U]. Functors as defined in Scala are actually covariant functors as defined in algebraic topology. Note: The inversion of the arrow for the morphism f in the map method is equivalent to raising/lowering the index in the tensor (Einstein).
  10. Morphisms applied to each type of tensor require specific functors (ProductFunctor, BiFunctor, Hom functor, …). As an example, the definition of the functors for morphisms on the inner product: ProdFunctor (CoField dot VField => Set) and the bi-functor BiFunctor. Paul Phillips has a great open source library of various types of functors, applicatives and monads.
11. Machine learning is used to group or classify observations into clusters, classes, patterns…. Two observations are compared using a cosine similarity (an inner product). The inner product works well for linear models (the tangent plane on the manifold). For non-linear models, the inner product (known as the tensor or Riemann metric) has to be projected onto the manifold. The projection of the inner product (a linear plane) onto the manifold is mathematically resolved using the Laplace-Beltrami operator and the heat equation; the solution is an exponential map defined by a kernel function. I assume you are familiar with monads, which are used for collections and error handling in the Scala standard library. There is no need to know kernels to understand the benefit of using monads in this particular case. Kernel functions are used in many supervised learning algorithms, such as support vector machines and neural networks, as well as in dimension reduction (kernel principal components analysis, for example). The challenge is to create and reuse kernel functions across multiple training runs in an attempt to extract the fittest model from the data set. By first-class programming elements, I mean that kernel functions have to be defined as classes and can be categorized by sharing basic group operations (monadic operations).
12. The purpose here is to generate and experiment with any kind of explicit kernel by defining and composing two functions, g and h: a function h that operates on each feature (component) of a vector, and a function g that transforms the dot product of the two vectors. The “dot” product K is computed by traversing the two observations (vectors of features), applying h to all the elements, computing the sum, and finally applying the g transform. The type F1 is the type of the function g (F1 = Double => Double).
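A minimal sketch of the kernel described above; the names KF, g, h and F1 follow the slides, while the method body is an assumption consistent with the description.

```scala
// Explicit kernel defined by two functions: h applied to each feature,
// g applied to the resulting sum.
type F1 = Double => Double

case class KF(g: F1, h: F1) {
  // "dot" product: traverse the two observations, apply h pairwise,
  // sum the products, then apply the g transform.
  def dot(v: Array[Double], w: Array[Double]): Double =
    g(v.zip(w).map { case (x, y) => h(x) * h(y) }.sum)
}

// Linear kernel: g and h are both the identity, so dot is the plain inner product.
val linear = KF(x => x, x => x)
val k = linear.dot(Array(1.0, 2.0), Array(3.0, 4.0))  // 1*3 + 2*4 = 11.0
```

Swapping in a different g (e.g. an exponential) turns the same skeleton into a non-linear kernel.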
13. Once the functor is defined, the monad is created by adding the flatMap method. The monad KFMonad, which takes a kernel function as argument, is defined as an implicit class so that kernel functions “inherit” the monadic methods. The map and flatMap transformations apply to the g function, the transformation on the inner product. The flatMap method is implemented by creating a new kernel and applying the transformation to only one of the components of the kernel function: the function h (in red). This “partial” monadic operation is good enough for building kernel functions on the fly.
14. Kernel functions that project the inner product onto the manifold for non-linear models belong to the family of exponential functions. Polynomial functions and radial basis functions are two of the most commonly used kernel functions. Note: the source code is shown here to illustrate the fact that an implementation in most other languages would be a lot messier and would not fit on any of these slides.
15. Finally, we can chain flatMap with map to compose two kernel functions and compute the dot/inner product of the resulting kernel function. Note that the composed kernel function uses the h function of the last invocation in the for-comprehension. The method does not expose the functor or the KF class that wraps the components of the kernel function.
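Pulling the last few slides together, here is a self-contained sketch of the kernel monad and its composition through a for-comprehension. The KF and KFMonad names follow the slides; the exact flatMap semantics (handing g to the continuation, keeping the h of the last kernel) are an assumption consistent with the notes above.

```scala
type F1 = Double => Double

case class KF(g: F1, h: F1) {
  def dot(v: Array[Double], w: Array[Double]): Double =
    g(v.zip(w).map { case (x, y) => h(x) * h(y) }.sum)
}

implicit class KFMonad(kf: KF) {
  // map transforms only the g component of the kernel.
  def map(f: F1 => F1): KF = KF(f(kf.g), kf.h)
  // flatMap hands g to the continuation; the result keeps the h of the
  // last kernel bound in the for-comprehension.
  def flatMap(f: F1 => KF): KF = f(kf.g)
}

// Compose two kernels: the resulting g is g1 o g2, the h comes from kf2.
def compose(kf1: KF, kf2: KF): KF =
  for { g1 <- kf1; g2 <- kf2 } yield (x: Double) => g1(g2(x))
```

The for-comprehension desugars into flatMap/map, so client code never touches KFMonad directly.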
16. Implicit kernels that are defined iteratively, or as the solution of an equation or an ODE such as the Laplace-Beltrami operator on a manifold, require a different class and monadic implementation. The two filter methods in the Scala standard library are filter and withFilter. The for-comprehension is syntactic sugar that wraps a stack of flatMap and map invocations. It can be used to create any complex container or programming entity, including dynamic workflows. You will be introduced to a different approach to creating workflows by weaving transformations/reductions on data sets.
17. We looked at the abstraction of mathematical concepts. What about the abstraction of programming concepts? Let’s consider the problem of handling a computation error and rolling the workflow back to its previous known stable state (transactional computation). There are two outcomes: if the current step in the computation succeeds, the next computation stage is started; if the step fails, the workflow rolls back to the previous known stable state. Can we leverage existing Scala constructs without introducing a new pattern? Standard library classes are either final or implemented as sealed traits or abstract classes, and creating a helper or utility class (reuse by composition) adds a level of indirection and clutters the design.
18. Here is the dilemma: on one hand, the concept of a transactional computation (all or nothing) is better expressed with the Either class; on the other hand, exceptions are handled through the Try class (as in C++ and Java). We need to combine these two concepts by extending the Try class “transparently” with Either. This is accomplished by wrapping Try in an implicit class containing a toEither (Try -> Either) method.
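A sketch of that implicit class. Note that since Scala 2.12 the standard library ships Try.toEither with exactly this behavior; the method is named asEither below so the extension is actually exercised rather than shadowed by the built-in. The body is illustrative of the technique the slides describe.

```scala
import scala.util.{Failure, Success, Try}

// Graft an Either view onto Try: Right carries the successful value
// (next computation stage), Left carries the failure (trigger recovery).
implicit class Try2Either[T](t: Try[T]) {
  def asEither: Either[Throwable, T] = t match {
    case Success(v) => Right(v)
    case Failure(e) => Left(e)
  }
}

val ok  = Try(84 / 2).asEither   // Right(42)
val err = Try(1 / 0).asEither    // Left(ArithmeticException)
```

Downstream code can then fold the Either into either the next stage or the recovery path.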
19. Let’s apply the implicit class Try2Either to a simple data transformation: normalization. If the denominator “sum” were too small, downstream computations would incur significant rounding errors; in this case the normalization aborts and a recovery mechanism (an instance of the class Recover) is invoked. If the normalization proceeds without rounding errors, the next computation step (computing the log of the vector x) is triggered.
20. As mentioned previously, such a pattern can be used as a micro-router. I used Try.toEither to roll back to the previous known stable state of the computation, which is stored in a file. Note: delimited continuations would be an interesting alternative to investigate for transactional computation.
21. The second requirement for a programming language to be a good fit for machine learning is the ability to create dynamic, configurable data flows/workflows as sequences of transformations/reductions on data.
22. The first thing that comes to mind for creating a complex system from existing objects (or classes) is a factory design pattern. Design patterns were introduced by the “Gang of Four” in the eponymous “Design Patterns: Elements of Reusable Object-Oriented Software” some 20 years ago. The list of factory design patterns includes Builder, Prototype, Factory Method, Composite, Bridge and, obviously, Singleton. These patterns are not very convenient for weaving data transformations (each transformation being defined as a class or interface). This is where dependency injection, popularized by the Spring framework, comes into play. Beyond composition and inheritance, Scala enables us to implement and chain data transformations/reductions by stacking the traits that declare them.
23. We briefly mentioned that the for-comprehension can be used to chain/stack data transformations. Dependency injection provides a very flexible approach to creating workflows dynamically, sometimes referred to as the cake pattern. Note: as for the third point, the deployment of tasks, it usually involves an actor-based (non-blocking) distributed architecture such as Akka or Spark. We will touch on it briefly later in this presentation when introducing the mailbox back-pressure mechanism.
24. The implementation in Scala matches the universal mathematical formalism closely. Here is another example: declaration of the variables x ∈ ℝ, y ∈ ℝ; declaration of the model f(x, y) = x + y; instantiation of the variables x = 5, y = 7; f(5, 7) = 12.
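The correspondence is almost literal, as a tiny sketch shows:

```scala
// Declaration of the model f(x, y) = x + y ...
val f: (Double, Double) => Double = (x, y) => x + y

// ... and instantiation of the variables x = 5, y = 7.
val result = f(5.0, 7.0)  // 12.0
```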
25. Let’s look at the training module/task as an example. The task of training a model is executed by a Supervisor instance, which in this simplistic case can be either a support vector machine or a multi-layer perceptron. Each of these two “supervisors” can have several implementations (single host, distributed over a low-latency network,…). Once defined, the modules are weaved/chained, making sure that the output of one module/task matches the input of the subsequent task. Notes: the training module can be broken down further into generative and discriminative models. Real-world applications are significantly more complex and would include REST services, DAOs to access relational databases, caches…. The terms “module”, “task” and “computational task” are used interchangeably in this section.
26. Let’s consider the Preprocessing module (or task), implemented as a trait. Preprocessing of a data set is performed by a processor of type Preprocessor that is defined at run time; therefore the preprocessor has to be declared as an abstract value. The three preprocessors defined in the preprocessing module are a Kalman filter, a moving average (MovAv) and a discrete Fourier filter (DFTF). These inner classes act as adapters or stubs for the actual implementations of the algorithms.
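The shape of such a module can be sketched as follows; the names Preprocessor, Preprocessing and MovAv follow the slides, while the method signatures and the moving-average body are assumptions for illustration.

```scala
trait Preprocessor {
  def execute(xt: Vector[Double]): Vector[Double]
}

trait Preprocessing {
  // Abstract value: the concrete preprocessor is bound only when the
  // workflow is assembled at run time.
  val preprocessor: Preprocessor

  // Inner class acting as an adapter/stub to a moving-average implementation.
  class MovAv(period: Int) extends Preprocessor {
    def execute(xt: Vector[Double]): Vector[Double] =
      xt.sliding(period).map(_.sum / period).toVector
  }
}

// The preprocessor is selected when the module is instantiated.
val smoothing = new Preprocessing {
  val preprocessor: Preprocessor = new MovAv(2)
}
```

The Kalman and DFTF adapters would slot in the same way, each wrapping its own underlying implementation.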
27. Here is an implementation of the Kalman filter and the discrete Fourier transform band-pass filter. The two inner classes, Kalman and DFTF, act as adapters or stubs for the actual implementations of these algorithms, allowing each implementation to come in multiple versions. For instance, filtering.Kalman is a trait with several implementations of the algorithm (single host, distributed, using Spark…). Such a design allows us to select the type of preprocessing algorithm within the Preprocessing module or namespace, and to select the implementation of that particular algorithm/preprocessor in the filtering package.
28. The clustering workflow is created by weaving (or stacking) a preprocessor and a reducer. The modeling workflow is generated by stacking a preprocessor, a supervisor and a validator.
  29. From the data management perspective, Clustering implements two consecutive data transformations: preprocessing and dimension reduction.
30. The modeling workflow is created by chaining an implementation of the filter, the training and the validator, all selected at run time. Modeling is therefore implemented as a stack of three traits, each representing a transformation or reduction on data sets.
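The stacking can be sketched as follows; all trait names, function signatures and stand-in bodies here are illustrative, not the slides’ actual code.

```scala
type DblVec = Vector[Double]

// Each trait declares one transformation/reduction on data sets.
trait PreprocessingModule { val filter: DblVec => DblVec }
trait TrainingModule      { val supervisor: DblVec => Double }
trait ValidationModule    { val validator: Double => Boolean }

// Modeling = filter |> training |> validation, assembled by stacking traits.
class Modeling extends PreprocessingModule with TrainingModule with ValidationModule {
  val filter: DblVec => DblVec     = _.map(_ / 10.0)  // e.g. a scaling filter
  val supervisor: DblVec => Double = _.sum            // stand-in for a trained model
  val validator: Double => Boolean = _ < 1.0          // stand-in for cross-validation
  def run(xt: DblVec): Boolean = validator(supervisor(filter(xt)))
}
```

Swapping any stage means overriding the corresponding abstract value, without touching the other traits.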
31. Computational tasks related to machine learning can be complex and lengthy. The process should be able to select the appropriate data flow (sequence of data transformations or reductions) at run time, according to the state of the computation. In this simple case, a clustering task is triggered if anomaly detection is needed; training a model is launched otherwise. These conditional execution paths are important for complex analyses or lengthy computations that require unattended execution (i.e. overnight or over the weekend). Note: the overriding of the abstract values for the modeling workflow is omitted here for the sake of clarity. Summary: this factory pattern operates on three levels of componentization, with dynamic selection of 1- the workflow, or sequence of tasks, according to the objective of the computation (i.e. Clustering => Preprocessing); 2- the task-processing algorithm according to the data (i.e. Preprocessing => Kalman filter); 3- the implementation of the task processing according to the environment (i.e. Kalman filter => implementation on Apache Spark).
32. Finally, Scala has to be performant and scalable enough to execute lengthy computations.
33. Let’s start with the ubiquitous tail recursion. Although tail recursion is widely known and used, I have to include it in this presentation because it is essential to the implementation of dynamic programming techniques in Scala. Quite a few machine learning algorithms do not rely on a closed-form, explicit, global minimization of the loss function (or maximization of the log-likelihood). Dynamic programming techniques are therefore required for classifiers such as Q-learning, hidden Markov models or back-propagation neural networks. For better or worse, recursion has the reputation of being an elegant but inefficient way to implement dynamic programming algorithms. Scala provides us with a viable alternative: tail recursion.
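The classic illustration is a tail-recursive Fibonacci (a sketch, not from the slides): the recursive call is the last expression, so @tailrec guarantees the compiler rewrites it into a loop with constant stack usage.

```scala
import scala.annotation.tailrec

def fib(n: Int): BigInt = {
  @tailrec
  def loop(i: Int, prev: BigInt, curr: BigInt): BigInt =
    if (i == 0) prev
    else loop(i - 1, curr, prev + curr)  // tail position: no pending work
  loop(n, 0, 1)
}
```

Remove @tailrec and write the naive two-branch recursion instead, and the stack grows with n; the annotation makes the optimization a compile-time guarantee rather than a hope.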
34. The Viterbi algorithm is an excellent candidate for tail recursion because the number of recursions (equal to the length of the sequence of states) is finite and reasonable, and the number of parameters to be passed is small.
35. Here is the skeleton of an implementation of the Viterbi algorithm that traverses a sequence of observations. This implementation is representative of dynamic programming algorithms in Scala: the first invocation (on the observation of index 0) initializes the context of the tail recursion (the hidden states in our case); subsequent invocations update the state parameters and recurse to the next observation (index t+1); finally, the last invocation (on the observation of index obs.size - 1) updates the last state’s parameters, performs some clean-up if needed, and exits. As a side note, using a closure in which the recursive call updates global values or class attributes will degrade the performance of the recursion.
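The traversal pattern can be sketched generically; here the real Viterbi delta/psi update is replaced by a caller-supplied `update` function and a placeholder `State`, so the skeleton (not the HMM math) is what is shown.

```scala
import scala.annotation.tailrec

case class State(delta: Vector[Double])  // placeholder for the DP state

def traverse(obs: Vector[Int], init: State)(update: (State, Int) => State): State = {
  @tailrec
  def recurse(t: Int, state: State): State =
    if (t >= obs.size) state                    // past the last observation: exit
    else recurse(t + 1, update(state, obs(t)))  // update parameters, recurse to t+1
  recurse(0, init)                              // first invocation: initialize context
}
```

The DP state is threaded through as an argument, exactly the structure the note warns must not be replaced by mutation of a captured variable.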
36. A simple performance test using the Fibonacci sequence shows a three-fold improvement; a similar “back-of-the-envelope” test on the Viterbi algorithm compares the tail recursion against recursion without tail-call elimination, using 5 samples with an increasing number of observations. Although it is not a rigorous, scientific finding, the tail recursion becomes very effective for tests with more than 60 observations. It would be interesting to compare the performance of these two recursions with that of an iterative implementation.
37. Real-time streaming is becoming popular (e.g. the Apache Spark streaming library, Akka reactive streams…). Short of using one of these frameworks, you can create a simple streaming mechanism for large data sets. Streams vs. iterators: an iterator does not allow you to dynamically select a chunk of memory, or to preserve it if necessary for future computation. It is not uncommon to have to train a model as labeled data becomes available (online training); in this case the size of the data set can be very large and unknown, and processing it eagerly would result in high memory consumption and a heavy load on the garbage collector. Finally, traversing the entire data set (and thus allocating a large memory chunk) may not even be needed, as the computation may abort once some condition is met. Scala’s streams can help!
38. Most machine learning algorithms consist of minimizing a loss function or maximizing a likelihood. For each iteration (or recursion), the cumulative loss is computed and passed to the optimizer (e.g. gradient descent or a variant) to update the model parameters. In order to minimize the memory footprint, two actions have to take place: allocate a slice of the data set (a memory chunk) from the heap using the take method, and release the memory chunk back to the garbage collector through the drop method. Each slice of n observations is allocated, processed by the loss function, then released back to the garbage collector.
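The slice-by-slice traversal can be sketched as follows. LazyList is used here as the modern (Scala 2.13+) replacement for the Stream class the slides rely on; the loss formula and STEP value are illustrative.

```scala
import scala.annotation.tailrec

val STEP = 4  // size of each slice of observations

// Squared-error loss over pairs (label, prediction), computed slice by slice:
// take allocates the slice, drop lets the processed slice be reclaimed.
def cumulativeLoss(xs: LazyList[(Double, Double)]): Double = {
  @tailrec
  def recurse(s: LazyList[(Double, Double)], acc: Double): Double =
    if (s.isEmpty) acc
    else {
      val nextLoss = s.take(STEP).map { case (y, f) => (y - f) * (y - f) }.sum
      recurse(s.drop(STEP), acc + nextLoss)  // recurse on the remaining stream
    }
  recurse(xs, 0.0)
}
```

As the next slides explain, this only bounds memory if no strong reference pins the head of the stream.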
39. The Loss class has a single method, exec, that traverses the stream. Once again, the loss is computed using a tail recursion. An observation is defined as y = f(x), where x is the feature set (containing, for instance, the age and ethnicity of a patient and their body temperature) and y is the label value, such as the diagnosed disease. The tail recursion allocates the next slice of STEP observations through the take method, computes the loss, nextLoss, then drops the slice. The reference is recursively redefined as the reference to the remaining stream. The problem is that the garbage collector cannot reclaim the memory, because the first reference to the stream is created outside the recursion. The solution is to declare the reference to the stream as weak, so that the chunks of memory associated with the slices/batches of observations already processed can be reclaimed.
40. The reference to the stream is created as a weak Java reference to an instance created by the stream constructor, Stream.iterate. In this case, the weak reference has been used to show that Java concepts are still relevant.
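A sketch of the weak-reference wiring, using scala.ref.WeakReference and LazyList (the modern stand-in for Stream). A strong reference (`data`) is also kept here so the demo is deterministic; in the setup the slides describe, only the weak reference is kept, which is what allows processed slices to be reclaimed.

```scala
import scala.ref.WeakReference

val data = LazyList.iterate(0.0)(_ + 1.0)  // 0.0, 1.0, 2.0, ... (strong ref for the demo)
val weakRef: WeakReference[LazyList[Double]] = WeakReference(data)

// Dereference: None if the GC already reclaimed the stream head.
val firstFive: Option[Double] = weakRef.get.map(_.take(5).sum)  // Some(10.0)
```

Production code must handle the None case, since a weakly referenced stream may be collected at any time.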
41. Let’s compare the memory consumption of three strategies to compute the loss function on a very large data set: a list, a stream with a standard reference, and a stream with a weak reference. In the first scenario, as expected, the memory for the entire data set is allocated before processing. The memory requirement for the stream with a strong reference increases each time a new slice is instantiated, because the memory blocks are held by the reference to the original stream. Only the stream with a weak reference guarantees that only the memory for a slice of STEP observations is needed throughout the entire execution.
43. We mentioned earlier that a learning platform requires implementing, wiring and deploying tasks. Akka, and frameworks derived from Akka, are commonly used to deploy workflows for large data sets because of the immutability, non-blocking semantics and supervision characteristics of actors. Scala/Akka actors are resilient because they are defined within a hierarchical context in which one actor acts as supervisor to other actors. In this slide, a router is the supervising actor for the workers and, depending on the selected strategy, is responsible for restarting a worker in case of failure. But what about the case in which the load (the number of messages in the mailbox) of each actor increases indefinitely?
44. The objective is to avoid bottlenecks in the computation data flow, which would overflow the actors’ mailboxes/local buffers. A strategy to control the flow (back-pressure) is needed to regulate the data flow across all modules. This example uses a back-pressure handling mechanism that consists of monitoring bounded mailboxes. It is a simplistic approach to flow control, described for the sake of illustrating the concept; as we will see later, there is a far more effective mechanism for dealing with back-pressure.
45. In this example, an actor, Controller, loads chunks of data, partitions them and distributes them across multiple Worker actors, along with a data transformation. Upon receiving the Compute message, the workers process the data with the given transformation function and return the processed data through a Completed message. The purpose of the watchdog actor, Watcher, is to monitor the utilization of the mailboxes and report it to the Controller. This is a simple feedback control loop: 1- the Watcher monitors the utilization of the mailboxes (average length); 2- the Controller adjusts the size of each batch in its Load message handler (throttling); 3- the workers process the next batch.
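The message protocol of that feedback loop can be captured with plain case classes — no Akka dependency is needed to define or exercise the protocol itself. The message names follow the slides; the payload types are assumptions.

```scala
sealed trait ControllerProtocol
case class Load(batchSize: Int)                               extends ControllerProtocol
case class Compute(xt: Vector[Double], fct: Double => Double) extends ControllerProtocol
case class Completed(result: Vector[Double])                  extends ControllerProtocol
case class Status(load: Double)                               extends ControllerProtocol

// A worker's handling of Compute, reduced to a pure function for illustration.
def onCompute(msg: Compute): Completed = Completed(msg.xt.map(msg.fct))
```

In the actual actor system, onCompute would be the body of the worker’s receive handler, with the Completed message sent back to the Controller.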
46. Let’s start with the Worker actor. The load on a worker depends on three variables: 1- the amount of data to process; 2- the complexity of the data transformation; 3- the underlying system (cores, memory…). The Controller provides the slice of data to be processed by the worker, msg.xt, as well as the data transformation, msg.fct.
47. Let’s look at our watchdog actor, Watcher: it computes the load as the average mailbox utilization and sends it back to the Controller through a Status message.
48. As its name implies, the Controller configures and manages the dynamic workflow and controls the back-pressure from the worker actors. As far as the configuration is concerned, the Controller generates a list of workers, the bounded mailboxes for the workers (msgQueues), and ultimately the watcher actor. The workers and the watcher are created within the Controller’s context using the Akka actorOf constructor.
49. As far as the management of the data flow and the feedback control loop is concerned, the Controller loads, partitions and distributes batches of data points to the worker actors (message: Load), processes the results of the workers’ computations (message: Completed), and throttles the flow up or down upon receiving the mailbox-utilization status from the watcher (message: Status).
50. The composition of the messages processed by the Controller is self-explanatory. It adjusts the size of the next batch if required (the throttle method), extracts the next batch of data from the input stream, partitions and distributes the batch across the worker actors, and sends each partition along with the data transformation to the workers.
51. The implementation of the throttle method is rather simple. It takes as input the load computed by the watcher actor and the current batch size (the number of data points to be processed), and updates the batch size using a simple ratio relative to the watermark. For instance, if the load is below the watermark, the batch size is increased.
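One way to sketch such a throttle: scale the next batch size by the ratio of the watermark (target mailbox utilization) to the observed load, then clamp. The watermark value, clamping bounds and exact formula are assumptions, not the slides’ code.

```scala
val watermark = 0.5      // target mailbox utilization
val maxBatchSize = 1024  // upper clamp on the batch size

def throttle(load: Double, batchSize: Int): Int = {
  val ratio = watermark / math.max(load, 1e-3)  // > 1 when load is under the watermark
  math.min(maxBatchSize, math.max(1, (batchSize * ratio).toInt))
}
```

Load below the watermark yields larger batches; load above it yields smaller ones, which is the feedback behavior the note describes.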
52. The bottom graph describes the throttle action and the complexity of the data transformation. The complexity of the data transformation has an impact on the load on the workers; it varies from 0 (a simple map operation) to 2 (complex data processing involving recursion or iteration). The throttle intensity ranges between -6 (rapid decrease of the batch size) and +6 (rapid increase of the size of the batches of data). The top graph displays the actual utilization of the mailboxes, with a capacity of 512 messages, as regulated by the feedback control loop (executed by the Controller).
53. Deploying a reactive data flow in production would require significant improvements to our naïve model. The feedback control loop could be smoothed with a moving average or a Kalman filter to avoid erratic behavior. We would need a larger range of control actions besides adjusting the size of the data batches: increasing the number of workers, the mailbox capacity, the caching strategy, …; a finer-grained set of actions also reduces the risk of an unstable system. The watchdog should be able to handle dead letters in case of failure (mailbox overflow). Finally, reactive streams control the back-pressure at the TCP connection level, which is far more accurate and responsive than mailbox utilization.
54. The implementation of machine learning algorithms would benefit from a few other interesting features of the Scala language that I have not used yet. DSLs are used to introduce domain-specific semantics into a library; they are also used intensively in reactive programming. As stated earlier, Akka reactive streams are an efficient approach to managing data flows. It would be particularly valuable to combine the flexibility of dependency injection with the efficiency of Akka streams and reactors.
55. I hope you will find some of these references useful.