OpenAI recently published a fun paper showing that evolution strategies can train policy networks to perform on par with state-of-the-art deep reinforcement learning. In this talk we’ll reimplement the main ideas of that paper using Neanderthal (blazing-fast matrix and linear algebra computations) and Cortex (neural networks); make it massively distributed using Onyx; build a simulation environment using re-frame; and, of course, save our princess from no particular harm in our toy game example.
2. We will build an AI to play a silly little game by training a policy network defined using Cortex, with a hot new training algorithm from the paper that we will implement first using Neanderthal and then make massively parallel using Onyx.
3. The game
• Find the shortest path to the princess
• Moves: up, down, left, right
• Don’t fall off the edge of the world
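The rules above fit a tiny grid world. A minimal sketch of such an environment, assuming a square grid with the princess on a fixed cell, a penalty for stepping off the edge, and a per-step cost that rewards short paths (all names and reward values here are hypothetical — the talk builds its actual environment with re-frame):

```python
SIZE = 5
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def step(pos, move, princess=(4, 4)):
    """Apply a move; return (new-pos, reward, done)."""
    dx, dy = MOVES[move]
    x, y = pos[0] + dx, pos[1] + dy
    if not (0 <= x < SIZE and 0 <= y < SIZE):
        return (x, y), -10.0, True   # fell off the edge of the world
    if (x, y) == princess:
        return (x, y), 10.0, True    # found the princess
    return (x, y), -1.0, False       # step cost encourages short paths
```

The per-step cost of -1 is what makes the *shortest* path the best policy rather than any path.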
6. Reinforcement learning
• Interact with the environment [embodied cognition]
• Not a single solution but an action to take given the environment
[model of the world + model of self, consciousness?]
• Learns via positive/negative feedback
7. Reinforcement learning: how it’s usually done
Train a deep neural network using raw sensor
data, usually pixels (i.e. no feature engineering)
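A policy network in this setting maps raw observations to a probability distribution over actions. A minimal sketch — a single linear layer plus softmax rather than a real deep net, with all names hypothetical:

```python
import math

def softmax(zs):
    """Turn raw action scores into a probability distribution."""
    m = max(zs)                              # subtract max for numeric stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def policy(weights, state):
    """One linear layer: state features -> scores for the 4 moves."""
    scores = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(scores)
```

A real pixel-based agent would stack convolutional layers before the softmax, but the interface — observation in, action distribution out — is the same.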
11. Using ES to train a neural network
Benefits
• highly parallelizable
• more robust (fewer hyperparameters, more
stable, doesn’t care about the properties of
the reward function)
• can exploit structure
• less computationally expensive
Downsides
• takes longer to converge
• noise must lead to different outcomes
Instead of backpropagation, use ES directly on the weights
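The update in the OpenAI paper is simple: perturb the weight vector with Gaussian noise, score each perturbation with a black-box fitness function, and move the weights toward the reward-weighted average of the noise. A minimal sketch in Python (the talk implements this with Neanderthal; function names and hyperparameter values here are illustrative):

```python
import random

def es_step(theta, fitness, n=50, sigma=0.1, alpha=0.02):
    """One evolution-strategies update: no gradients, no backpropagation."""
    # sample n Gaussian perturbations of the weight vector
    eps = [[random.gauss(0, 1) for _ in theta] for _ in range(n)]
    # score each perturbed candidate with the black-box fitness function
    rewards = [fitness([t + sigma * e for t, e in zip(theta, ei)]) for ei in eps]
    # normalize rewards so the step size is invariant to reward scale
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5 or 1.0
    advantages = [(r - mean) / std for r in rewards]
    # estimated ascent direction: reward-weighted average of the noise
    grad = [sum(a, 0.0) for a in
            ([adv * ei[j] for adv, ei in zip(advantages, eps)] for j in range(len(theta)))]
    return [t + alpha * g / (n * sigma) for t, g in zip(theta, grad)]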
15. Neanderthal
• Blazing fast matrix and linear algebra library
• Based on ATLAS and LAPACK
• Runs on CPUs and GPUs
• A study in writing efficient code
• Somewhat terse API (fluokitten helps)
34. Resilience and handling
state
• Activity log
• Window and trigger states checkpointed
• Resume points (transfer state from job to job)
• Configurable flux policies (continue/kill/recover)
37. Cortex
• Neural networks, regression and feature learning
• Clean idiomatic Clojure API
• Computation encoded as data (and makes good use of it)
• Uses core.matrix for heavy lifting