MACHINE LEARNING
WITH SCALA
SUSAN ERALY
SKYMIND
ANDY PETRELLA
DATA FELLAS
MACHINE LEARNING
Not Terminator! (despite our name)
Applications are everywhere.
• OCR
• Netflix recommendations
NEURAL NETS & DEEP LEARNING
A perceptron (neuron) can be loosely compared to a NAND gate
Nonlinear functions can be constructed by combining them
With more compute and with big data…
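The NAND comparison can be sketched in a few lines of Scala. This is a minimal illustration, not code from the talk: the object name, the weights (-2, -2) and the bias (3) are assumptions chosen so that a single step-activated perceptron reproduces the NAND truth table. Because NAND is universal, stacking such units yields functions a single perceptron cannot represent, such as XOR.

```scala
object PerceptronNand {
  // A perceptron: weighted sum of the inputs plus a bias, passed through a step function.
  def perceptron(weights: Seq[Double], bias: Double)(inputs: Seq[Double]): Int = {
    val activation = weights.zip(inputs).map { case (w, x) => w * x }.sum + bias
    if (activation > 0) 1 else 0
  }

  // Illustrative weights and bias (an assumption, not from the talk) that behave like NAND.
  def nand(a: Int, b: Int): Int =
    perceptron(Seq(-2.0, -2.0), 3.0)(Seq(a.toDouble, b.toDouble))

  // XOR is not linearly separable, but four NAND units wired together compute it.
  def xor(a: Int, b: Int): Int = {
    val n1 = nand(a, b)
    val n2 = nand(a, n1)
    val n3 = nand(b, n1)
    nand(n2, n3)
  }
}
```

For example, `PerceptronNand.nand(1, 1)` evaluates to 0 and `PerceptronNand.xor(1, 0)` evaluates to 1.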
DEEPLEARNING4J
dl4j is the first commercial-grade, open-source, distributed deep-learning library written for Java and Scala.
Skymind is its commercial support arm.
SCIENTIFIC COMPUTING & THE JVM
Problems when considering HPC
• Vectorization
• Array indexing, 32-bit address space
Fully native backend
https://github.com/deeplearning4j/libnd4j
JavaCPP, OpenMP, CUDA
MICRO-SERVICES + ML?
Kinda like micro-services
Reduce lock-in
Take math, data cleaning, model training, choosing algorithms ...
… and separate them
SGD: SERIAL VS. PARALLEL
[Diagram. Serial: a single model trained on all the training data. Parallel: the training data is divided into Split 1, Split 2, Split 3, …; Workers 1 to N each produce a partial model; the master holds the global model.]
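One common way to realize the parallel layout in the diagram is parameter averaging: each worker runs SGD on its own split starting from the current global model, and the master averages the partial models. The sketch below assumes that scheme (the slide does not say which one is used) and fits a one-parameter model y ≈ w * x. All names are hypothetical, and the workers run one after another here rather than on separate machines.

```scala
object ParameterAveraging {
  // One worker: plain SGD over its own split, minimizing squared error for y ≈ w * x.
  def trainPartial(split: Seq[(Double, Double)], startW: Double, lr: Double): Double =
    split.foldLeft(startW) { case (w, (x, y)) => w - lr * 2.0 * (w * x - y) * x }

  // Master: combine the partial models into the global model by averaging.
  def average(partials: Seq[Double]): Double = partials.sum / partials.size

  // Each round: send the global model to every worker, train on the splits, average.
  def train(data: Seq[(Double, Double)], workers: Int, rounds: Int, lr: Double): Double = {
    require(data.nonEmpty && workers > 0, "need data and at least one worker")
    val splitSize = math.ceil(data.size.toDouble / workers).toInt
    val splits = data.grouped(splitSize).toVector
    (1 to rounds).foldLeft(0.0) { (globalW, _) =>
      average(splits.map(split => trainPartial(split, globalW, lr)))
    }
  }
}
```

With data generated from y = 3x, for instance `(1 to 12).map(i => (i / 10.0, 3.0 * i / 10.0))`, calling `ParameterAveraging.train(data, 3, 200, 0.1)` converges toward w = 3.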
MAP REDUCE VS. PARALLEL ITERATIVE
[Diagram: MapReduce (input, processors, output) compared with parallel iterative processing, where processors run in successive supersteps (Superstep 1, Superstep 2, …).]
NLP AND DL
• Topic Modeling/Sentiment Analysis
• Machine Translation
• Question Answering
NLP is hard
“The best part of the movie is the end credits”
“It should have been a great movie…”
RECURRENT NN
• Loops
• Temporal behavior
• Used for time series
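The "loop" can be illustrated with a scalar recurrent unit whose state at each step depends on the current input and on the previous state. This is a sketch only: real recurrent layers use weight matrices and learned parameters, and the names and the tanh activation here are assumptions made for illustration.

```scala
object TinyRnn {
  // One recurrent step: the new hidden state mixes the current input with the previous state.
  def step(wx: Double, wh: Double, b: Double)(hPrev: Double, x: Double): Double =
    math.tanh(wx * x + wh * hPrev + b)

  // Unroll the loop over a sequence, returning the hidden state after each time step.
  def run(wx: Double, wh: Double, b: Double)(xs: Seq[Double]): Seq[Double] =
    xs.scanLeft(0.0)((h, x) => step(wx, wh, b)(h, x)).tail
}
```

Because the state is carried forward, the same inputs in a different order give a different final state: `TinyRnn.run(1.0, 0.5, 0.0)(Seq(1.0, 0.0)).last` differs from `TinyRnn.run(1.0, 0.5, 0.0)(Seq(0.0, 1.0)).last`.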
LSTMS
- A widely used remedy for vanishing and exploding gradients in recurrent nets
APPLICATIONS
- Sequence to Sequence
Credits: Andrej Karpathy
WORD2VEC
- Word embeddings that represent meaning/context
King - Man + Woman ≈ Queen
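The analogy arithmetic can be sketched with vector addition, subtraction and cosine similarity. The three-dimensional vectors below are hand-made for illustration (an assumption; real Word2Vec embeddings are learned from text and have far more dimensions), and the object and method names are hypothetical.

```scala
object WordAnalogy {
  type Vec = Vector[Double]

  def add(a: Vec, b: Vec): Vec = a.zip(b).map { case (x, y) => x + y }
  def sub(a: Vec, b: Vec): Vec = a.zip(b).map { case (x, y) => x - y }

  // Cosine similarity: 1.0 means the two vectors point in the same direction.
  def cosine(a: Vec, b: Vec): Double = {
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    def norm(v: Vec): Double = math.sqrt(v.map(x => x * x).sum)
    dot / (norm(a) * norm(b))
  }

  // Toy embeddings (made up, not learned): roughly (royalty, masculine, feminine).
  val embeddings: Map[String, Vec] = Map(
    "king"   -> Vector(0.9, 0.9, 0.1),
    "queen"  -> Vector(0.9, 0.1, 0.9),
    "man"    -> Vector(0.1, 0.9, 0.1),
    "woman"  -> Vector(0.1, 0.1, 0.9),
    "prince" -> Vector(0.7, 0.9, 0.1)
  )

  // Compute a - b + c and return the closest remaining word by cosine similarity.
  def analogy(a: String, b: String, c: String): String = {
    val target = add(sub(embeddings(a), embeddings(b)), embeddings(c))
    val candidates = embeddings.filter { case (word, _) => !Set(a, b, c).contains(word) }
    candidates.maxBy { case (_, vec) => cosine(target, vec) }._1
  }
}
```

With these toy vectors, `WordAnalogy.analogy("king", "man", "woman")` returns "queen".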
WORD2VEC
APPLICATIONS
- Sequence to Sequence
Second Lord:
They would be ruled after this chamber, and
my fair nues begun out of the fact, to be conveyed,
Whose noble souls I'll have the heart of the wars.

Clown:
Come, sir, I will make did behold your worship.

VIOLA:
I'll drink it.

Credits: Andrej Karpathy
SENTIMENT ANALYSIS
[Figure with labels: Review, Sentiment]
DATA FELLAS
Spark-Notebook is the only Scala-based notebook. It is scalable, enables interactive work with Spark, Akka, Cassandra and Kafka, and can draw interactive plots of any Scala type.
Data Fellas enables data-driven business, bringing productivity to data science in the enterprise.
