1
Clojure for Data Science
Mike Anderson
26 January 2016
2
Contents
• Why Clojure for Data Science
• Array Programming Essentials
• core.matrix
• Library Ecosystem Overview
• Examples and discussion
3
Why Clojure for Data Science
Attribute Clojure Python R Julia Scala Haskell JavaScript
Strong general purpose language: ✓ ✓ ✓ ✓ ✓
Functional language: ✓ ✓ ✓
JVM Ecosystem (Hadoop, Spark etc.): ✓ ✓
Near-native runtime performance: ✓ ✓ ✓ ✓
Dynamic language: ✓ ✓ ✓ ✓ ✓
Client side execution: ✓ ✓
“Code is Data”: ✓
5
Plug-in paradigms
Paradigm                Exemplar language   Clojure implementation
Functional programming  Haskell             clojure.core
Meta-programming        Lisp                clojure.core
Logic programming       Prolog              core.logic
Process algebras / CSP  Go                  core.async
Array programming       APL                 core.matrix
6
APL
• Venerable history
• Has its own keyboard
• Interesting perspective on code readability
• Notation invented in 1957 by Ken Iverson
• Implemented at IBM around 1960-64
life←{↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}
7
Modern array programming
• R: standalone environment for statistical programming / graphics
• NumPy: Python library for array programming
• Julia: a new language (2012) based on array programming principles
• .... and many others
8
Design wisdom
"It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures."
- Alan Perlis
(slide annotation: abstraction)
9
What is an array?
Dimensions  Example                                  Terminology
1           [0 1 2]                                  Vector
2           [[0 1 2] [3 4 5] [6 7 8]]                Matrix
3           stack of [[0 1 2] [3 4 5] [6 7 8]] ...   3D Array (3rd order Tensor)
N           ...                                      ND Array
10
Multi-dimensional array properties
[[0 1 2]
 [3 4 5]
 [6 7 8]]
Dimension 0 runs over the rows (indexes 0 1 2), Dimension 1 over the columns (indexes 0 1 2)
• Dimensions (ordered and indexed)
• Each of the array elements is a regular value
• Dimension sizes together define the shape of the array (e.g. 3 x 3)
11
Arrays = data about relationships
Set X (rows) vs. Set Y (columns):
       :R  :S  :T  :U
  :A    0   1   2   3
  :B    4   5   6   7
  :C    8   9  10  11
(foo :A :T) => 2
Each element is a fact about a relationship between a value in Set X and a value in Set Y
ND array lookup is analogous to arity-N functions!
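A minimal sketch of this lookup in plain Clojure, assuming the 3 x 4 array is held as nested vectors and that `foo` translates the labels of Set X and Set Y into indexes. The names `row-index`, `col-index` and `facts` are illustrative only and are not part of core.matrix.

```clojure
;; Hypothetical label-to-index maps for Set X (rows) and Set Y (columns).
(def row-index {:A 0 :B 1 :C 2})
(def col-index {:R 0 :S 1 :T 2 :U 3})

;; The array of facts from the slide, stored as nested vectors.
(def facts [[0 1 2 3]
            [4 5 6 7]
            [8 9 10 11]])

(defn foo
  "Looks up the fact relating x (from Set X) to y (from Set Y)."
  [x y]
  (get-in facts [(row-index x) (col-index y)]))

(foo :A :T) ;; => 2
(foo :C :U) ;; => 11
```

The two-argument function and the 2D array carry the same information; the array simply stores every answer up front.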
12
Why arrays instead of functions?
[[0 1 2]
 [3 4 5]
 [6 7 8]]
vs.
(fn [i j]
  (+ j (* 3 i)))
1. Precomputed values with O(1) access
2. Efficient computation with optimised bulk operations
3. Data driven representation
13
Principle of array programming:
generalise operations on regular (scalar) values to multi-dimensional data
(+ 1 2) => 3
(+ <array> <array>) => <array> (the arrays were shown as images on the slide)
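As a rough illustration of the principle, here is a sketch in plain Clojure that lifts scalar addition to nested vectors. The function name `add` is an assumption for this example; it does no shape checking, which a real array library would do.

```clojure
(defn add
  "Adds two scalars, or two equally shaped nested vectors element by element.
   Sketch only: mismatched shapes are not detected."
  [a b]
  (if (and (vector? a) (vector? b))
    (mapv add a b)   ; recurse into the next dimension
    (+ a b)))        ; scalars: ordinary addition

(add 1 2)                             ;; => 3
(add [1 2 3] [4 5 6])                 ;; => [5 7 9]
(add [[1 2] [3 4]] [[10 20] [30 40]]) ;; => [[11 22] [33 44]]
```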
15
core.matrix
Array programming
as a language extension
for Clojure
(with a Data Science focus)
16
Expressivity
Java:
for (int i=0; i<n; i++) {
  for (int j=0; j<m; j++) {
    for (int k=0; k<p; k++) {
      result[i][j][k] = a[i][j][k] + b[i][j][k];
    }
  }
}
Clojure:
(mapv
  (fn [a b]
    (mapv
      (fn [a b]
        (mapv + a b))
      a b))
  a b)
Clojure + core.matrix:
(+ a b)
17
Elements of core.matrix
• Abstraction: Coding with N-dimensional arrays
• Implementation: How is everything implemented?
• API: What can you do with arrays?
18
API
19
Equivalence to Clojure vectors
Nested Clojure vectors of regular shape are arrays!
The 3 x 3 matrix with rows (0 1 2), (3 4 5), (6 7 8)  ↔
[[0 1 2]
 [3 4 5]
 [6 7 8]]
The vector (0 1 2)  ↔  [0 1 2]
20
Array creation
;; Build an array from a sequence
(array (range 5))
=> [0 1 2 3 4]
;; ... or from nested arrays/sequences
(array
  (for [i (range 3)]
    (for [j (range 3)]
      (str i j))))
=> [["00" "01" "02"]
    ["10" "11" "12"]
    ["20" "21" "22"]]
21
Shape
;; Shape of a 3 x 2 matrix
(shape [[1 2]
        [3 4]
        [5 6]])
=> [3 2]
;; Regular values have no shape
(shape 10.0)
=> nil
22
Dimensionality
;; Dimensionality = number of dimensions
;; = length of shape vector
;; = nesting level
(dimensionality [[1 2]
                 [3 4]
                 [5 6]])
=> 2
(dimensionality [1 2 3 4 5])
=> 1
;; Regular values have zero dimensionality
(dimensionality "Foo")
=> 0
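To make the link between nesting level, shape and dimensionality concrete, here is a sketch in plain Clojure. The starred names `shape*` and `dimensionality*` are hypothetical stand-ins, not the core.matrix functions, and they assume the input is a regular nested vector or a scalar.

```clojure
(defn shape*
  "Returns the shape of a regular nested vector, or nil for a scalar.
   Only the first element at each level is inspected."
  [a]
  (when (vector? a)
    (into [(count a)] (shape* (first a)))))

(defn dimensionality*
  "Number of dimensions = length of the shape vector."
  [a]
  (count (shape* a)))

(shape* [[1 2] [3 4] [5 6]])          ;; => [3 2]
(shape* 10.0)                         ;; => nil
(dimensionality* [[1 2] [3 4] [5 6]]) ;; => 2
(dimensionality* "Foo")               ;; => 0
```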
23
Scalars vs. arrays
(array? [[1 2] [3 4]])
=> true
(array? 12.3)
=> false
(scalar? [1 2 3])
=> false
(scalar? "foo")
=> true
Everything is either an array or a scalar
A scalar works like a 0-dimensional array
24
Indexed element access
(def M [[0 1 2]
        [3 4 5]
        [6 7 8]])
;; index 1 along Dimension 0, index 2 along Dimension 1
(mget M 1 2)
=> 5
25
Slicing access
(def M [[0 1 2]
        [3 4 5]
        [6 7 8]])
;; slice at index 1 along Dimension 0
(slice M 1)
=> [3 4 5]
A slice of an array is itself an array!
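Element access and slicing can be imitated on nested vectors with clojure.core alone. The sketch below uses the hypothetical names `mget*` and `slice*`; the three-argument form, which slices along a chosen dimension, is an assumption made for illustration.

```clojure
(def M [[0 1 2]
        [3 4 5]
        [6 7 8]])

(defn mget*
  "Element access: one index per dimension."
  [m & indexes]
  (get-in m (vec indexes)))

(defn slice*
  "Slice at index i along dimension 0 (two arguments),
   or along the given dimension (three arguments)."
  ([m i] (nth m i))
  ([m dim i]
   (if (zero? dim)
     (nth m i)
     (mapv #(slice* % (dec dim) i) m))))

(mget* M 1 2)  ;; => 5
(slice* M 1)   ;; => [3 4 5]  (second row)
(slice* M 1 1) ;; => [1 4 7]  (second column)
```

Each slice has one dimension fewer than the array it came from, which is why a slice of a matrix is a vector.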
26
Arrays as a composition of slices
(def M [[0 1 2]
        [3 4 5]
        [6 7 8]])
(slices M)
=> ([0 1 2] [3 4 5] [6 7 8])
(apply + (slices M))
=> [9 12 15]
27
Operators
(use 'clojure.core.matrix.operators)
(+ [1 2 3] [4 5 6])
=> [5 7 9]
(* [1 2 3] [0 2 -1])
=> [0 4 -3]
(- [1 2] [3 4 5 6])
=> RuntimeException Incompatible shapes
(/ [1 2 3] 10.0)
=> [0.1 0.2 0.3]
28
Broadcasting scalars
(+ [[0 1 2]
    [3 4 5]
    [6 7 8]]
   1) = ?
“Broadcasting” expands the scalar 1 to
[[1 1 1]
 [1 1 1]
 [1 1 1]]
so that
(+ [[0 1 2]
    [3 4 5]
    [6 7 8]]
   [[1 1 1]
    [1 1 1]
    [1 1 1]]) =
[[1 2 3]
 [4 5 6]
 [7 8 9]]
29
Broadcasting arrays
(+ [[0 1 2]
    [3 4 5]
    [6 7 8]]
   [2 1 0]) = ?
“Broadcasting” expands the vector [2 1 0] to
[[2 1 0]
 [2 1 0]
 [2 1 0]]
so that
(+ [[0 1 2]
    [3 4 5]
    [6 7 8]]
   [[2 1 0]
    [2 1 0]
    [2 1 0]]) =
[[2 2 2]
 [5 5 5]
 [8 8 8]]
30
Broadcasting Rules
1. Designed for elementwise operations
   - other uses must be explicit
2. Extends shape vector by adding new leading dimensions
   • original shape [4 5]
   • can broadcast to any shape [x y ... z 4 5]
   • scalars can broadcast to any shape
3. Fills the new array space by duplication of the original array over the new dimensions
4. Smart implementations can avoid making full copies by structural sharing or clever indexing tricks
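Rules 2 and 3 can be sketched in a few lines of plain Clojure. The names `shape*` and `broadcast*` are hypothetical, the arrays are assumed to be regular nested vectors, and this version makes full copies rather than using the sharing tricks mentioned in rule 4.

```clojure
(defn shape*
  "Shape of a regular nested vector, or nil for a scalar."
  [a]
  (when (vector? a)
    (into [(count a)] (shape* (first a)))))

(defn broadcast*
  "Broadcasts a (array or scalar) to target-shape by duplicating it
   over new leading dimensions. Throws if the shapes are incompatible."
  [a target-shape]
  (let [src   (or (shape* a) [])
        extra (- (count target-shape) (count src))]
    (when (or (neg? extra)
              (not= src (vec (drop extra target-shape))))
      (throw (ex-info "Incompatible shapes" {:from src :to target-shape})))
    ;; wrap the original in one new leading dimension at a time, innermost first
    (reduce (fn [acc n] (vec (repeat n acc)))
            a
            (reverse (take extra target-shape)))))

(broadcast* 1 [3 3])       ;; => [[1 1 1] [1 1 1] [1 1 1]]
(broadcast* [2 1 0] [3 3]) ;; => [[2 1 0] [2 1 0] [2 1 0]]
```

Because only leading dimensions are added, a length-2 vector cannot be broadcast to shape [3 3]; the sketch reports that case with an exception.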
31
Functional operations on sequences
map:
(map inc [1 2 3 4])
=> (2 3 4 5)
reduce:
(reduce * [1 2 3 4])
=> 24
seq:
(seq [1 2 3 4])
=> (1 2 3 4)
32
Functional operations on arrays
map ↔ emap (“element map”):
(emap inc [[1 2]
           [3 4]])
=> [[2 3]
    [4 5]]
reduce ↔ ereduce (“element reduce”):
(ereduce * [[1 2]
            [3 4]])
=> 24
seq ↔ eseq (“element seq”):
(eseq [[1 2]
       [3 4]])
=> (1 2 3 4)
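The element-wise counterparts can be approximated on nested vectors as follows. This is a sketch with hypothetical starred names, assuming regular nested vectors; it is not how core.matrix implements them.

```clojure
(defn eseq*
  "All elements of a nested vector as a flat sequence, in row-major order."
  [a]
  (if (vector? a)
    (mapcat eseq* a)
    (list a)))

(defn emap*
  "Applies f to every element, keeping the shape."
  [f a]
  (if (vector? a)
    (mapv #(emap* f %) a)
    (f a)))

(defn ereduce*
  "Reduces over all elements, ignoring the shape."
  [f a]
  (reduce f (eseq* a)))

(emap* inc [[1 2] [3 4]])  ;; => [[2 3] [4 5]]
(ereduce* * [[1 2] [3 4]]) ;; => 24
(eseq* [[1 2] [3 4]])      ;; => (1 2 3 4)
```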
33
Specialised matrix constructors
(zero-matrix 4 3)
  0 0 0
  0 0 0
  0 0 0
  0 0 0
(identity-matrix 4)
  1 0 0 0
  0 1 0 0
  0 0 1 0
  0 0 0 1
(permutation-matrix [3 1 0 2])
  0 0 0 1
  0 1 0 0
  1 0 0 0
  0 0 1 0
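For intuition about what these constructors build, here is a plain-Clojure sketch that produces the same patterns as nested vectors of integers. The starred names are hypothetical, and the element type returned by a real core.matrix implementation may differ (for example doubles).

```clojure
(defn zero-matrix*
  "rows x cols matrix of zeros."
  [rows cols]
  (vec (repeat rows (vec (repeat cols 0)))))

(defn permutation-matrix*
  "Row i has a 1 in column (nth perm i) and 0 elsewhere."
  [perm]
  (let [n (count perm)]
    (mapv (fn [col] (mapv #(if (= % col) 1 0) (range n)))
          perm)))

(defn identity-matrix*
  "The identity is the permutation matrix of the identity permutation."
  [n]
  (permutation-matrix* (range n)))

(zero-matrix* 4 3)
;; => [[0 0 0] [0 0 0] [0 0 0] [0 0 0]]
(permutation-matrix* [3 1 0 2])
;; => [[0 0 0 1] [0 1 0 0] [1 0 0 0] [0 0 1 0]]
```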
34
Array transformations
(transpose [[0 1 2]
            [3 4 5]])
=> [[0 3]
    [1 4]
    [2 5]]
Transpose reverses the order of all dimensions and indexes
35
Matrix multiplication
[[9 2 7] [6 4 8]] . [[2 8] [3 4] [5 9]] = [[a b] [c d]]
a = (9 * 2) + (2 * 3) + (7 * 5)
a = 59
(mmul [[9 2 7] [6 4 8]]
      [[2 8] [3 4] [5 9]])
=> [[59 143] [64 136]]
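The row-times-column rule above can be written directly in plain Clojure. This is an illustrative sketch for 2D nested vectors only; `transpose*`, `dot` and `mmul*` are hypothetical names, not the core.matrix functions.

```clojure
(defn transpose*
  "Transposes a 2D nested vector (rows become columns)."
  [m]
  (apply mapv vector m))

(defn dot
  "Sum of pairwise products of two equal-length vectors."
  [u v]
  (reduce + (map * u v)))

(defn mmul*
  "Matrix product: element [i j] is the dot product of row i of a
   with column j of b."
  [a b]
  (let [cols (transpose* b)]
    (mapv (fn [row] (mapv #(dot row %) cols)) a)))

(transpose* [[0 1 2] [3 4 5]])
;; => [[0 3] [1 4] [2 5]]
(mmul* [[9 2 7] [6 4 8]]
       [[2 8] [3 4] [5 9]])
;; => [[59 143] [64 136]]
```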
36
Geometry
(def π 3.141592653589793)
(def τ (* 2.0 π))
(defn rot [turns]
  (let [a (* τ turns)]
    [[(cos a)     (sin a)]
     [(- (sin a)) (cos a)]]))
;; 45° = 1/8 turn
(mmul (rot 1/8) [3 4])
=> [4.9497 0.7071]
NB: See Tau Manifesto (http://tauday.com/) regarding the use of Tau (τ)
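The same rotation can be tried without core.matrix. The sketch below assumes `Math/cos` and `Math/sin` in place of the array-aware `cos` and `sin`, and uses a hypothetical helper `mv-mul` for the matrix-vector product.

```clojure
(def tau (* 2.0 Math/PI))

(defn rot
  "2D rotation matrix for the given fraction of a full turn."
  [turns]
  (let [a (* tau turns)
        c (Math/cos a)
        s (Math/sin a)]
    [[c     s]
     [(- s) c]]))

(defn mv-mul
  "Multiplies matrix m (nested vectors) by vector v."
  [m v]
  (mapv (fn [row] (reduce + (map * row v))) m))

(mv-mul (rot 1/8) [3 4])
;; approximately [4.9497 0.7071]
```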
37
Mutability?
38
Mutability – the tradeoffs
Avoid mutability. But it’s an option if you really need it.
Pros:
• Faster
• Reduces GC pressure
• Standard in many existing matrix libraries
Cons:
✘ Mutability is evil
✘ Harder to maintain / debug
✘ Hard to write concurrent code
✘ Not idiomatic in Clojure
✘ Not supported by all core.matrix implementations
✘ “Place Oriented Programming”
39
Mutability – performance benefit
Time for addition of vectors* (ns)
Mutable add!     28
Immutable add   120
4x performance benefit
* Length 10 double vectors, using :vectorz implementation
40
Mutability – syntax
A core.matrix function name ending with “!” performs mutation (usually on the first argument only)
(add [1 2] 1)
=> [2 3]
(add! [1 2] 1)
=> RuntimeException ...... not mutable!
(def a (mutable [1 2])) ;; coerce to a mutable format
=> #<Vector2 [1.0,2.0]>
(add! a 1)
=> #<Vector2 [2.0,3.0]>
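To show what the “!” convention means in practice, here is a sketch that contrasts an in-place update of a Java double[] with an immutable update of a Clojure vector. Both functions are hypothetical stand-ins written with clojure.core only; they are not the core.matrix `add` and `add!`.

```clojure
(defn add!
  "Adds scalar x to every element of the double[] arr in place; returns arr."
  [^doubles arr x]
  (let [x (double x)]
    (dotimes [i (alength arr)]
      (aset arr i (+ (aget arr i) x)))
    arr))

(defn add*
  "Immutable counterpart: returns a new vector and leaves v untouched."
  [v x]
  (mapv #(+ % x) v))

(def a (double-array [1 2]))
(add! a 1)
(vec a)        ;; => [2.0 3.0]  (a itself has changed)

(def v [1 2])
(add* v 1)     ;; => [2 3]
v              ;; => [1 2]      (v is unchanged)
```

The mutable version allocates nothing per call, which is consistent with the lower GC pressure listed among the pros.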
41
Implementation
42
Many Matrix libraries…
UJMP
ojAlgo
MTJ
javax.vecmath
43
44
Lots of trade-offs
Native Libraries vs. Pure JVM
Mutability vs. Immutability
Specialized elements (e.g. doubles) vs. Generalised elements (Object, Complex)
Multi-dimensional vs. 2D matrices only
Memory efficiency vs. Runtime efficiency
Concrete types vs. Abstraction (interfaces / wrappers)
Specified storage format vs. Multiple / arbitrary storage formats
License A vs. License B
Lightweight (zero-copy) views vs. Heavyweight copying / cloning
45
What’s the best data structure?
Length 50 “range” vector: 0 1 2 3 .. 49
1. Clojure Vector
   [0 1 2 …. 49]
2. Java double[] array
   new double[] {0, 1, 2, …. 49};
3. Custom deftype
   (deftype RangeVector
     [^long start
      ^long end])
4. Native vector format
   (org.jblas.DoubleMatrix. params)
46
There is no spoon.
47
Secret weapon time!
48
Clojure Protocols
;; from namespace clojure.core.matrix.protocols
(defprotocol PSummable
  "Protocol to support the summing of all elements in
   an array. The array must hold numeric values only,
   or an exception will be thrown."
  (element-sum [m]))
1. Abstract Interface
2. Open Extension
3. Fast Dispatch
49
Protocols are fast and open
Function call costs (ns)
Call type                 Cost (ns)   Open extension
Multimethod*              89          ✓
Protocol call             13.8        ✓
Boxed function call       7.9         ✘
Primitive function call   1.9         ✘
Static / inlined code     1.2         ✘
* Using class of first argument as dispatch function
50
Typical core.matrix call path
User Code:
(esum [1 2 3 4])
core.matrix API (matrix.clj):
(defn esum
  "Calculates the sum of all the elements in a
   numerical array."
  [m]
  (mp/element-sum m))
Impl. code:
(extend-protocol mp/PSummable
  SomeImplementationClass
  (element-sum [a]
    ………))
51
Most protocols are optional
MANDATORY
Required for a working core.matrix implementation
PImplementation
PDimensionInfo
PIndexedAccess
PIndexedSetting
PMatrixEquality
PSummable
PRowOperations
PVectorCross
PCoercion
PTranspose
PVectorDistance
PMatrixMultiply
PAddProductMutable
PReshaping
PMathsFunctionsMutable
PMatrixRank
PArrayMetrics
PAddProduct
PVectorOps
PMatrixScaling
PMatrixOps
PMatrixPredicates
PSparseArray
…..
OPTIONAL
• Everything in the API will work without these
• core.matrix provides a “default implementation”
• Implement for improved performance
52
Default implementations
;; namespace: clojure.core.matrix.impl.default
;; Protocol name - from namespace clojure.core.matrix.protocols
(extend-protocol mp/PSummable
  ;; Implementation for any Number
  Number
  (element-sum [a] a)
  ;; Implementation for an arbitrary Object (assumed to be an array)
  Object
  (element-sum [a]
    (mp/element-reduce a +)))
53
Extending a protocol
(extend-protocol mp/PSummable
  ;; Class to implement protocol for, in this case a Java array : double[]
  (Class/forName "[D")
  (element-sum [m]
    ;; Add type hint to avoid reflection
    (let [^doubles m m]
      ;; Optimised code to add up all the elements of a double[] array
      (areduce m i res 0.0 (+ res (aget m i))))))
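The call path from the last few slides can be reproduced in a standalone form. The sketch below defines a local protocol named `PSummable` as a stand-in for the one in clojure.core.matrix.protocols, and the Object fallback uses a plain `reduce` over the elements, which is an assumption replacing `mp/element-reduce`.

```clojure
;; Local stand-in for the core.matrix protocol of the same name.
(defprotocol PSummable
  (element-sum [m]))

;; Default implementations: numbers sum to themselves, any other
;; object is assumed to be a (possibly nested) sequence of summables.
(extend-protocol PSummable
  Number
  (element-sum [a] a)
  Object
  (element-sum [a] (reduce + (map element-sum a))))

;; Specialised implementation for Java double[] arrays.
(extend-protocol PSummable
  (Class/forName "[D")
  (element-sum [m]
    (let [^doubles m m]                       ; type hint avoids reflection
      (areduce m i res 0.0 (+ res (aget m i))))))

;; API function: a thin wrapper over the protocol call.
(defn esum [m] (element-sum m))

(esum 5)                      ;; => 5
(esum [[1 2] [3 4]])          ;; => 10
(esum (double-array [1 2 3])) ;; => 6.0
```

User code only ever calls `esum`; which implementation runs is decided by the class of the argument.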
54
Speedup vs. default implementation
Timing for element sum of length 100 double array (ns)
(esum v) "Specialised"    201
(reduce + v)             2859
(esum v) "Default"       3690
15-20x benefit
55
Internal Implementations
Implementation and key features:
:persistent-vector
  • Support for Clojure vectors
  • Immutable
  • Not so fast, but great for quick testing
:double-array
  • Treats Java double[] objects as 1D arrays
  • Mutable – useful for accumulating results etc.
:sequence
  • Treats Clojure sequences as arrays
  • Mostly useful for interop / data loading
:ndarray, :ndarray-double, :ndarray-long, .....
  • Google Summer of Code project by Dmitry Groshev
  • Pure Clojure
  • N-Dimensional arrays similar to NumPy
  • Support arbitrary dimensions and data types
:scalar-wrapper, :slice-wrapper, :nd-wrapper
  • Internal wrapper formats
  • Used to provide efficient default implementations for various protocols
56
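NDArray
(deftype NDArrayDouble
  [^doubles data
   ^int ndims
   ^ints shape
   ^ints strides
   ^int offset])
Example: the array
[[0 1 2]
 [3 4 5]]
has ndims = 2 and shape = [2 3], and is stored in
data (Java array): ? ? ? 0 1 2 ? ? 3 4 5 ?
offset is the position of the first element in data; strides[0] is the step between consecutive rows and strides[1] the step between consecutive columns.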
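The strided layout can be checked with a small index calculation. In the sketch below the values offset = 3 and strides = [5 1] are read off the data layout shown on the slide (an inference, not stated there explicitly), `nil` marks the unused "?" slots, and the helper names are hypothetical.

```clojure
(defn strided-index
  "Position in the flat data array of the element at the given indexes:
   offset + sum of (index * stride) over all dimensions."
  [offset strides indexes]
  (reduce + offset (map * strides indexes)))

;; Flat storage from the slide; nil stands for the unused "?" slots.
(def data [nil nil nil 0 1 2 nil nil 3 4 5 nil])
(def offset 3)
(def strides [5 1])

(defn nd-get
  "Element [i j] of the 2 x 3 array stored in data."
  [i j]
  (nth data (strided-index offset strides [i j])))

(nd-get 0 0) ;; => 0
(nd-get 1 0) ;; => 3
(nd-get 1 2) ;; => 5

;; Swapping the strides reads the same data as the 3 x 2 transpose,
;; without copying anything.
(nth data (strided-index offset [1 5] [2 1])) ;; => 5
```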
57
External Implementations
Implementation and key features:
vectorz-clj
  • Pure JVM (wraps Java Library Vectorz)
  • Very fast, especially for vectors and small-medium matrices
  • Most mature core.matrix implementation at present
Clatrix
  • Uses native BLAS libraries by wrapping the jblas library
  • Very fast, especially for large 2D matrices
  • Used by Incanter
parallel-colt-matrix
  • Wraps Parallel Colt library from Java
  • Support for multithreaded matrix computations
arrayspace
  • Experimental
  • Ideas around distributed matrix computation
  • Builds on ideas from Blaze, Chapel, ZPL
image-matrix
  • Treats a Java BufferedImage as a core.matrix array
  • Because you can?
58
Switching implementations
(array (range 5))
=> [0 1 2 3 4]
;; switch implementations
(set-current-implementation :vectorz)
;; create array with current implementation
(array (range 5))
=> #<Vector [0.0,1.0,2.0,3.0,4.0]>
;; explicit implementation usage
(array :persistent-vector (range 5))
=> [0 1 2 3 4]
59
Mixing implementations
(def A (array :persistent-vector (range 5)))
=> [0 1 2 3 4]
(def B (array :vectorz (range 5)))
=> #<Vector [0.0,1.0,2.0,3.0,4.0]>
(* A B)
=> [0.0 1.0 4.0 9.0 16.0]
(* B A)
=> #<Vector [0.0,1.0,4.0,9.0,16.0]>
core.matrix implementations can be mixed
(but: behaviour depends on the first argument)
61
Data Science Libraries for Clojure
• Still not as mature as R or Python, but developing rapidly
• Clojure philosophy of small libraries rather than all-encompassing frameworks
• Key areas:
  • Interactive environments
  • Visualisation
  • Databases / data access
  • Realtime data processing
  • Machine Learning
62
Interactive environments
Library       Description
Incanter      Fully featured analytical environment (“R-like platform”)
gorilla-repl  Notebook-style web-based environment
63
Visualisation
Library  Description
quil     Clojure interface to the Processing library/environment for dynamic visualisations
gyptis   Clojure + ClojureScript library for producing Vega.js graphs
imagez   Library for generating and manipulating bitmap images
64
Databases / data access
Library    Description
Datomic    Awesome database supporting immutable “time travel” over database history. Great scalability for reads / analytics
java.jdbc  Clojure library for access to SQL databases. Mature workhorse
Yesql      Arguably better way to do SQL in Clojure
Sparkling  Clojure library for Apache Spark
flambo     Clojure library for Apache Spark
Cascalog   Clojure library for querying and data processing with Apache Hadoop
many, many, more.....
65
Realtime Data Processing
Library     Description
Storm       Mature stream processing library for highly scalable realtime computation over large distributed clusters of compute nodes
Onyx        More modern / better designed alternative to Storm with growing traction
core.async  “Roll your own” concurrent data processing pipelines
66
Machine Learning
Library              Description
clj-ml               Wrapper for the popular and venerable “Weka” machine learning library for Java
enclog               Wrapper for the “Encog” machine learning library
Clortex / Comportex  Libraries implementing Numenta’s Hierarchical Temporal Memory model
synaptic             Basic neural networks in Clojure
(not named in text)  State of the art “Deep Learning” library
68
Thank you
For more information about Datacraft, visit: www.datacraft.sg
69
Demo

Tiktokethiodaily
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 

Recently uploaded (20)

一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 

Clojure for Data Science

  • 1. 1 Clojure for Data Science Mike Anderson 26 January 2016
  • 2. 2 Contents: Why Clojure for Data Science; Array Programming Essentials; core.matrix; Library Ecosystem Overview; Examples and discussion
  • 3. 3 Why Clojure for Data Science Attribute Clojure Python R Julia Scala Haskell JavaScript Strong general purpose language ✓ ✓ ✓ ✓ ✓ Functional language ✓ ✓ ✓ JVM Ecosystem (Hadoop, Spark etc.) ✓ ✓ Near-native runtime performance ✓ ✓ ✓ ✓ Dynamic language ✓ ✓ ✓ ✓ ✓ Client side execution ✓ ✓ “Code is Data” ✓
  • 4. 4 Contents: Why Clojure for Data Science; Array Programming Essentials; core.matrix; Library Ecosystem Overview; Examples and discussion
  • 5. 5 Plug-in paradigms (Paradigm: Exemplar language, Clojure implementation). Functional programming: Haskell, clojure.core. Meta-programming: Lisp. Logic programming: Prolog, core.logic. Process algebras / CSP: Go, core.async. Array programming: APL, core.matrix
  • 6. 6 APL. Venerable history: notation invented in 1957 by Ken Iverson; implemented at IBM around 1960-64. Has its own keyboard. Interesting perspective on code readability: life←{↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}
  • 7. 7 Modern array programming Standalone environment for statistical programming / graphics Python library for array programming A new language (2012) based on array programming principles .... and many others
  • 8. 8 "It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures." —Alan Perlis abstraction Design wisdom
  • 9. 9 What is an array? (Dimensions: Example, Terminology). 1: [0 1 2], Vector. 2: [[0 1 2] [3 4 5] [6 7 8]], Matrix. 3: a stack of 3 x 3 matrices, 3D Array (3rd order Tensor). N: ..., ND Array
  • 10. 10 Multi-dimensional array properties 0 1 2 3 4 5 6 7 8 0 1 2 0 1 2 Dimension 0 Dimension 1 Dimensions (ordered and indexed) Each of the array elements is a regular value Dimension sizes together define the shape of the array (e.g. 3 x 3)
  • 11. 11 Arrays = data about relationships (foo :A :T) => 2 0 1 2 3 4 5 6 7 8 9 10 11 :A :B :C :R :S :T Set X Set Y Each element is a fact about a relationship between a value in Set X and a value in Set Y ND array lookup is analogous to arity-N functions! :U
  • 12. 12 Why arrays instead of functions? 0 1 2 3 4 5 6 7 8 0 1 2 0 1 2 vs. (fn [i j] (+ j (* 3 i))) 1. Precomputed values with O(1) access 2. Efficient computation with optimised bulk operations 3. Data driven representation
  • 13. 13 Principle of array programming: generalise operations on regular (scalar) values to multi-dimensional data (+ 1 2) => 3 (+ ) => 2
  • 14. 14 Contents: Why Clojure for Data Science; Array Programming Essentials; core.matrix; Library Ecosystem Overview; Examples and discussion
  • 15. 15 core.matrix Array programming as a language extension for Clojure (with a Data Science focus)
  • 16. 16 Expressivity for (int i=0; i<n; i++) { for (int j=0; j<m; j++) { for (int k=0; k<p; k++) { result[i][j][k] = a[i][j][k] + b[i][j][k]; } } } Java (mapv (fn [a b] (mapv (fn [a b] (mapv + a b)) a b)) a b) (+ a b) + core.matrix
  • 17. 17 Elements of core.matrix Abstraction Coding with N-dimensional arrays Implementation How is everything implemented? API What can you do with arrays?
  • 19. 19 Equivalence to Clojure vectors Nested Clojure vectors of regular shape are arrays! 0 1 2 3 4 5 6 7 8 ↔ [[0 1 2] [3 4 5] [6 7 8]] 0 1 2 [0 1 2] ↔
  • 20. 20 Array creation ;; Build an array from a sequence (array (range 5)) => [0 1 2 3 4] ;; ... or from nested arrays/sequences (array (for [i (range 3)] (for [j (range 3)] (str i j)))) => [["00" "01" "02"] ["10" "11" "12"] ["20" "21" "22"]]
  • 21. 21 Shape ;; Shape of a 3 x 2 matrix (shape [[1 2] [3 4] [5 6]]) => [3 2] ;; Regular values have no shape (shape 10.0) => nil
  • 22. 22 Dimensionality ;; Dimensionality = number of dimensions ;; = length of shape vector ;; = nesting level (dimensionality [[1 2] [3 4] [5 6]]) => 2 (dimensionality [1 2 3 4 5]) => 1 ;; Regular values have zero dimensionality (dimensionality "Foo") => 0
  • 23. 23 Scalars vs. arrays (array? [[1 2] [3 4]]) => true (array? 12.3) => false (scalar? [1 2 3]) => false (scalar? "foo") => true Everything is either an array or a scalar. A scalar works like a 0-dimensional array
  • 24. 24 Indexed element access 0 1 2 3 4 5 6 7 8 0 1 2 0 1 2 Dimension 0 Dimension 1 (def M [[0 1 2] [3 4 5] [6 7 8]]) (mget M 1 2) => 5
  • 25. 25 Slicing access 0 1 2 3 4 5 6 7 8 0 1 2 0 1 2 Dimension 0 Dimension 1 (def M [[0 1 2] [3 4 5] [6 7 8]]) (slice M 1) => [3 4 5] A slice of an array is itself an array!
  • 26. 26 Arrays as a composition of slices (def M [[0 1 2] [3 4 5] [6 7 8]]) (slices M) => ([0 1 2] [3 4 5] [6 7 8]) (apply + (slices M)) => [9 12 15] 0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 slices
  • 27. 27 Operators (use 'clojure.core.matrix.operators) (+ [1 2 3] [4 5 6]) => [5 7 9] (* [1 2 3] [0 2 -1]) => [0 4 -3] (- [1 2] [3 4 5 6]) => RuntimeException Incompatible shapes (/ [1 2 3] 10.0) => [0.1 0.2 0.3]
  • 28. 28 Broadcasting scalars (+ [[0 1 2] [3 4 5] [6 7 8]] 1) = ? “Broadcasting” expands the scalar 1 to [[1 1 1] [1 1 1] [1 1 1]], so (+ [[0 1 2] [3 4 5] [6 7 8]] [[1 1 1] [1 1 1] [1 1 1]]) = [[1 2 3] [4 5 6] [7 8 9]]
  • 29. 29 Broadcasting arrays (+ [[0 1 2] [3 4 5] [6 7 8]] [2 1 0]) = ? “Broadcasting” expands [2 1 0] to [[2 1 0] [2 1 0] [2 1 0]], so (+ [[0 1 2] [3 4 5] [6 7 8]] [[2 1 0] [2 1 0] [2 1 0]]) = [[2 2 2] [5 5 5] [8 8 8]]
  • 30. 30 Broadcasting Rules 1. Designed for elementwise operations - other uses must be explicit 2. Extends shape vector by adding new leading dimensions • original shape [4 5] • can broadcast to any shape [x y ... z 4 5] • scalars can broadcast to any shape 3. Fills the new array space by duplication of the original array over the new dimensions 4. Smart implementations can avoid making full copies by structural sharing or clever indexing tricks
  • 31. 31 Functional operations on sequences. map: (map inc [1 2 3 4]) => (2 3 4 5). reduce: (reduce * [1 2 3 4]) => 24. seq: (seq [1 2 3 4]) => (1 2 3 4)
  • 32. 32 Functional operations on arrays (emap inc [[1 2] [3 4]]) => [[2 3] [4 5]] map ↔ emap “element map” (ereduce * [[1 2] [3 4]]) => 24 reduce ↔ ereduce “element reduce” (eseq [[1 2] [3 4]]) => (1 2 3 4) seq ↔ eseq “element seq”
  • 33. 33 Specialised matrix constructors (zero-matrix 4 3) 0 0 0 0 0 0 0 0 0 0 0 0 (identity-matrix 4) 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 (permutation-matrix [3 1 0 2]) 0 0 0 1 0 1 0 0 1 0 0 0 0 0 1 0
  • 34. 34 Array transformations (transpose [[0 1 2] [3 4 5]]) => [[0 3] [1 4] [2 5]] Transpose reverses the order of all dimensions and indexes
  • 35. 35 Matrix multiplication [[9 2 7] [6 4 8]] . [[2 8] [3 4] [5 9]] = [[a b] [c d]], where a = (9 * 2) + (2 * 3) + (7 * 5) = 59 (mmul [[9 2 7] [6 4 8]] [[2 8] [3 4] [5 9]]) => [[59 143] [64 136]]
  • 36. 36 Geometry (def π 3.141592653589793) (def τ (* 2.0 π)) (defn rot [turns] (let [a (* τ turns)] [[ (cos a) (sin a)] [(-(sin a)) (cos a)]])) (mmul (rot 1/8) [3 4]) => [4.9497 0.7071] NB: See Tau Manifesto (http://tauday.com/) regarding the use of Tau (τ) 45° = 1/8 turn
  • 38. 38 Mutability – the tradeoffs. Avoid mutability. But it’s an option if you really need it. Pros: faster; reduces GC pressure; standard in many existing matrix libraries. Cons: mutability is evil; harder to maintain / debug; hard to write concurrent code; not idiomatic in Clojure; not supported by all core.matrix implementations; “Place Oriented Programming”
  • 39. 39 Mutability – performance benefit. Chart: time for addition of vectors* (ns): mutable add! 28, immutable add 120 (4x performance benefit). * Length 10 double vectors, using :vectorz implementation
  • 40. 40 Mutability – syntax A core.matrix function name ending with “!” performs mutation (usually on the first argument only) (add [1 2] 1) => [2 3] (add! [1 2] 1) => RuntimeException ...... not mutable! (def a (mutable [1 2])) ;; coerce to a mutable format => #<Vector2 [1.0,2.0]> (add! a 1) => #<Vector2 [2.0,3.0]>
  • 44. 44 Lots of trade-offs Native Libraries vs. Pure JVM Mutability vs. Immutability Specialized elements (e.g. doubles) vs. Generalised elements (Object, Complex) Multi-dimensional vs. 2D matrices only Memory efficiency vs. Runtime efficiency Concrete types vs. Abstraction (interfaces / wrappers) Specified storage format vs. Multiple / arbitrary storage formats License A vs. License B Lightweight (zero-copy) views vs. Heavyweight copying / cloning
  • 45. 45 What’s the best data structure? Length 50 “range” vector (0 1 2 3 .. 49): 1. Clojure Vector [0 1 2 …. 49] 2. Java double[] array new double[] {0, 1, 2, …. 49}; 3. Custom deftype (deftype RangeVector [^long start ^long end]) 4. Native vector format (org.jblas.DoubleMatrix. params)
  • 46. 46 There is no spoon.
  • 48. 48 Clojure Protocols (defprotocol PSummable "Protocol to support the summing of all elements in an array. The array must hold numeric values only, or an exception will be thrown." (element-sum [m])) clojure.core.matrix.protocols 1. Abstract Interface 2. Open Extension 3. Fast Dispatch
  • 49. 49 Protocols are fast and open. Chart: function call costs (ns) and support for open extension: Multimethod* 89 (✓); Protocol call 13.8 (✓); Boxed function call 7.9 (✘); Primitive function call 1.9 (✘); Static / inlined code 1.2 (✘). * Using class of first argument as dispatch function
  • 50. 50 Typical core.matrix call path core.matrix API (matrix.clj) (defn esum "Calculates the sum of all the elements in a numerical array." [m] (mp/element-sum m)) User Code (esum [1 2 3 4]) Impl. code (extend-protocol mp/PSummable SomeImplementationClass (element-sum [a] ………))
  • 51. 51 Most protocols are optional. MANDATORY: required for a working core.matrix implementation. OPTIONAL: everything in the API will work without these; core.matrix provides a “default implementation”; implement for improved performance. Protocols shown: PImplementation PDimensionInfo PIndexedAccess PIndexedSetting PMatrixEquality PSummable PRowOperations PVectorCross PCoercion PTranspose PVectorDistance PMatrixMultiply PAddProductMutable PReshaping PMathsFunctionsMutable PMatrixRank PArrayMetrics PAddProduct PVectorOps PMatrixScaling PMatrixOps PMatrixPredicates PSparseArray …..
  • 52. 52 Default implementations (extend-protocol mp/PSummable Number (element-sum [a] a) Object (element-sum [a] (mp/element-reduce a +))) clojure.core.matrix.impl.default Protocol name - from namespace clojure.core.matrix.protocols Implementation for any Number Implementation for an arbitrary Object (assumed to be an array)
  • 53. 53 Extending a protocol (extend-protocol mp/PSummable (Class/forName "[D") (element-sum [m] (let [^doubles m m] (areduce m i res 0.0 (+ res (aget m i)))))) Class to implement protocol for, in this case a Java array : double[] Optimised code to add up all the elements of a double[] array Add type hint to avoid reflection
  • 54. 54 Speedup vs. default implementation: 15-20x benefit. Chart: timing for element sum of length 100 double array (ns): (esum v) "Specialised" 201; (reduce + v) 2859; (esum v) "Default" 3690
  • 55. 55 Internal Implementations (Implementation - Key Features). :persistent-vector - support for Clojure vectors; immutable; not so fast, but great for quick testing. :double-array - treats Java double[] objects as 1D arrays; mutable, useful for accumulating results etc. :sequence - treats Clojure sequences as arrays; mostly useful for interop / data loading. :ndarray, :ndarray-double, :ndarray-long, ..... - Google Summer of Code project by Dmitry Groshev; pure Clojure; N-dimensional arrays similar to NumPy; support arbitrary dimensions and data types. :scalar-wrapper, :slice-wrapper, :nd-wrapper - internal wrapper formats; used to provide efficient default implementations for various protocols
  • 56. 56 NDArray (deftype NDArrayDouble [^doubles data ^int ndims ^ints shape ^ints strides ^int offset]) 0 1 2 3 4 5 ? ? ? 0 1 2 ? ? 3 4 5 ? offset 0 strides[1] strides[0] data (Java array) ndims = 2 shape = [2 3]
  • 57. 57 External Implementations (Implementation - Key Features). vectorz-clj - pure JVM (wraps the Java library Vectorz); very fast, especially for vectors and small-medium matrices; most mature core.matrix implementation at present. Clatrix - uses native BLAS libraries by wrapping the jblas library; very fast, especially for large 2D matrices; used by Incanter. parallel-colt-matrix - wraps the Parallel Colt library from Java; support for multithreaded matrix computations. arrayspace - experimental; ideas around distributed matrix computation; builds on ideas from Blaze, Chapel, ZPL. image-matrix - treats a Java BufferedImage as a core.matrix array; because you can?
  • 58. 58 Switching implementations (array (range 5)) => [0 1 2 3 4] ;; switch implementations (set-current-implementation :vectorz) ;; create array with current implementation (array (range 5)) => #<Vector [0.0,1.0,2.0,3.0,4.0]> ;; explicit implementation usage (array :persistent-vector (range 5)) => [0 1 2 3 4]
  • 59. 59 Mixing implementations (def A (array :persistent-vector (range 5))) => [0 1 2 3 4] (def B (array :vectorz (range 5))) => #<Vector [0.0,1.0,2.0,3.0,4.0]> (* A B) => [0.0 1.0 4.0 9.0 16.0] (* B A) => #<Vector [0.0,1.0,4.0,9.0,16.0]> core.matrix implementations can be mixed (but: behaviour depends on the first argument)
  • 60. 60 Contents: Why Clojure for Data Science; Array Programming Essentials; core.matrix; Library Ecosystem Overview; Examples and discussion
  • 61. 61 Data Science Libraries for Clojure • Still not as mature as R or Python, but developing rapidly • Clojure philosophy of small libraries rather than all-encompassing frameworks • Key areas: • Interactive environments • Visualisation • Databases / data access • Realtime data processing • Machine Learning
  • 62. 62 Interactive environments (Library - Description). Incanter - fully featured analytical environment (“R-like platform”). gorilla-repl - notebook-style web-based environment
  • 63. 63 Visualisation (Library - Description). quil - Clojure interface to the Processing library/environment for dynamic visualisations. gyptis - Clojure + ClojureScript library for producing Vega.js graphs. imagez - library for generating and manipulating bitmap images
  • 64. 64 Databases / data access (Library - Description). Datomic - awesome database supporting immutable “time travel” over database history; great scalability for reads / analytics. java.jdbc - Clojure library for access to SQL databases; mature workhorse. Yesql - arguably better way to do SQL in Clojure. Sparkling - Clojure library for Apache Spark. flambo - Clojure library for Apache Spark. Cascalog - Clojure library for querying and data processing with Apache Hadoop. Many, many more.....
  • 65. 65 Realtime Data Processing (Library - Description). Storm - mature stream processing library for highly scalable realtime computation over large distributed clusters of compute nodes. Onyx - more modern / better designed alternative to Storm with growing traction. core.async - “roll your own” concurrent data processing pipelines
  • 66. 66 Machine Learning (Library - Description). clj-ml - wrapper for the popular and venerable “Weka” machine learning library for Java. enclog - wrapper for the “Encog” machine learning library. Clortex / Comportex - libraries implementing Numenta’s Hierarchical Temporal Memory model. synaptic - basic neural networks in Clojure. One further entry, unnamed in the transcript - state of the art “Deep Learning” library
  • 67. 67 Contents: Why Clojure for Data Science; Array Programming Essentials; core.matrix; Library Ecosystem Overview; Examples and discussion
  • 68. 68 Thank you. For more information about Datacraft, visit: www.datacraft.sg
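
As a rough illustration of the broadcasting rules on slides 28 to 30, the sketch below implements shape lookup and leading-dimension broadcasting over nested Clojure vectors using only clojure.core. It is not how core.matrix is implemented, and the names shape-of, broadcast-to, emap2 and eadd are hypothetical, chosen only for this example.

```clojure
;; Illustrative only: plain clojure.core, not the core.matrix implementation.
;; shape-of, broadcast-to, emap2 and eadd are hypothetical names.

(defn shape-of
  "Shape of a regular nested vector, or nil for a scalar."
  [a]
  (when (vector? a)
    (into [(count a)] (shape-of (first a)))))

(defn broadcast-to
  "Broadcasts a to target-shape by adding new leading dimensions only."
  [a target-shape]
  (let [s     (vec (shape-of a))
        extra (- (count target-shape) (count s))]
    ;; The original shape must match the trailing part of the target shape.
    (when (or (neg? extra)
              (not= s (vec (drop extra target-shape))))
      (throw (ex-info "Incompatible shapes" {:from s :to target-shape})))
    ;; Duplicate the array over each new leading dimension, innermost first.
    (reduce (fn [acc n] (vec (repeat n acc)))
            a
            (reverse (take extra target-shape)))))

(defn emap2
  "Applies f elementwise to two arrays of identical shape."
  [f a b]
  (if (vector? a)
    (mapv (partial emap2 f) a b)
    (f a b)))

(defn eadd
  "Elementwise addition; the lower-dimensional argument is broadcast."
  [a b]
  (let [sa     (vec (shape-of a))
        sb     (vec (shape-of b))
        target (if (>= (count sa) (count sb)) sa sb)]
    (emap2 + (broadcast-to a target) (broadcast-to b target))))

(def M [[0 1 2] [3 4 5] [6 7 8]])

(eadd M 1)        ;; => [[1 2 3] [4 5 6] [7 8 9]]
(eadd M [2 1 0])  ;; => [[2 2 2] [5 5 5] [8 8 8]]
```

This version makes full copies when it broadcasts. Slide 30 notes that smarter implementations can avoid that through structural sharing or indexing tricks.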

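The protocol-based call path on slides 48 to 53 can be tried out without any dependencies. The following is a minimal sketch, assuming a locally defined PSummable protocol (the real one lives in clojure.core.matrix.protocols) and a simplified default implementation that treats any non-number as a sequence of summable elements. The esum wrapper mirrors the API function shown on slide 50.

```clojure
;; Sketch only: PSummable is redefined locally so the example is self-contained.
(defprotocol PSummable
  "Protocol to support summing all elements in an array."
  (element-sum [m]))

;; Default implementations (a simplifying assumption for this sketch):
;; a Number is its own sum; any other Object is treated as a sequence
;; whose items are themselves summable.
(extend-protocol PSummable
  Number
  (element-sum [a] a)
  Object
  (element-sum [a] (reduce + (map element-sum a))))

;; Specialised implementation for Java double[] arrays, following slide 53.
;; The type hint avoids reflection inside the loop.
(extend-protocol PSummable
  (Class/forName "[D")
  (element-sum [m]
    (let [^doubles m m]
      (areduce m i res 0.0 (+ res (aget m i))))))

;; API function: user code calls this, and the protocol picks the implementation.
(defn esum
  "Calculates the sum of all the elements in a numerical array."
  [m]
  (element-sum m))

(esum [[1 2] [3 4]])             ;; => 10   (default Object implementation)
(esum (double-array [1 2 3 4]))  ;; => 10.0 (specialised double[] implementation)
(esum 5)                         ;; => 5    (Number implementation)
```

The caller never changes: adding the double[] extension only changes which code runs behind esum, which is the incremental optimisation path the slides describe.
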
Editor's Notes

  1. When I say language extension, it is of course in the sense that Clojure seems to have this ability to absorb new paradigms just by plugging in new libraries. Clojure already stole many good pure functional programming techniques from languages like Haskell And of course we have the macro meta-programming capabilities from Lisp More recently we’ve got core.logic bringing in Logic programming, inspired by Prolog and miniKanren And core.async bringing in the Communicating Sequential Processes with some syntax similar to Go And core.matrix is designed very much in the same way, to provide array programming capabilities. And if we want to trace the roots of array programming, we can go all the way back to this language called APL
  3. About the same age as Lisp? First specified in 1958. Love the fact that it has its own keyboard, with all these symbols inspired by mathematical notation. And you get some crazy code. Might seem like a bit of a dinosaur now
  4. Array programming has had quite a renaissance in recent years. This is because of the increasing importance of data science and numerical computing in many fields, so we’ve seen languages like R that provide an environment for statistical computing. Highlight value of paradigm: clearly a demand for these kinds of numerical computing capabilities
  5. Start off with one of my favourite quotes, because it contains a pretty important insight: “It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures”. There is of course one error here... (click) We should of course be talking about an abstraction here, not a concrete data structure. A great example of this is the sequence abstraction in Clojure: there are literally hundreds of functions that operate on Clojure sequences. Because so many functions produce and consume sequences, it gives you many different ways to compose them together. And it’s more than just the clojure.core API: other code can build on the same abstraction, which means that the composability extends to any code you write that uses the same abstraction. It makes entire libraries composable. In some ways I think the key to building systems using simple, composable components is about having shared abstractions. We’ve taken this principle very much to heart in core.matrix. Our abstraction of course is the array, more specifically the multi-dimensional array. And the rest of core.matrix is really all about giving you a powerful set of composable operations you can do with arrays
  6. Overloaded terminology! - Vector = 1D array (maths / array programming sense) – Also a Clojure vector - Matrix: conventionally used to indicate a 2 dimensional numerical array, - Array: in the sense of the N-dimensional array, but also the specific concrete example of a Java array Dimensions: also overloaded! Here using in the sense of the number of dimensions in an array, but it’s also used to refer to the number of dimensions in a vector space, e.g. 3 dimensional Euclidean space. If we’re lucky it should be clear from the context what we’re talking about.
  7. Give you an idea about how general array programming can be – An array is a way of representing a function using data Instead of computing a value for each combination of inputs, we’re typically pre-computing all such values
  9. Today I’m going to be talking about core.matrix, and it’s quite appropriate that I’m talking about it here today at the Clojure Conj because this project actually came about as a direct result of conversations I had with many people at last year’s Conj The focus of those discussions was very much about how we could make numerical computing better in Clojure. And the solution I’ve been working on over the past year along with a number of collaborators is core.matrix, which offers array programming as a language extension to Clojure
  10. Example of adding a 3D array. Java it’s just a big nested loop… Clojure you can do it with nested maps, which is a bit more of a functional style, but still you’ve got this three-level nesting With core.matrix it’s really simple. We just generalise + to arbitrary multi-dimensional arrays and it all just works Does conciseness matter? Well if you’re writing a lot of code manipulating arrays it’s going to save you quite a bit of time, but more importantly it makes it much easier to avoid errors. Very easy to get off-by-one errors in this kind of code. core.matrix gives you a nice DSL that does all the index juggling for you Also it helps you to be mentally much closer to the problem that you are modelling. You ideally want an API that reflects the way that you think about the problem you are solving.
  11. So today I’m going to talk about core.matrix with three different lenses First I want to talk about the abstraction – what are these arrays? Then I’m going to talk about the core.matrix API Implementation: how does this all work, some of the engineering choices we’ve made
  12. So let’s talk about the core.matrix API. This isn’t going to be an exhaustive tour, but I’m going to highlight a few of the key features to give you a taste of what is possible
  13. One of the important API design objectives was to exploit the “natural equivalence of arrays to nested Clojure vectors”. 1D array is a Clojure vector, 2D array is like a vector of vectors Most things in the core.matrix API work with nested Clojure vectors. This is nice – gives a natural syntax, and great for dynamic, exploratory work at the REPL.
  14. The most fundamental attribute of an array is probably the shape
  16. Arrays are compositions of arrays! This is one of the best signs that you have a good abstraction: if the abstraction can be recursively defined as a composition of the same abstraction.
  17. So of course we have quite a few different functions that let you work with slices of arrays. Most useful is probably the slices function, which cuts an array into a sequence of its slices Pretty common to want to do this – imagine if each slice is a row in your data set
  18. We define array versions of the common mathematical operators. These use the same names as clojure.core You have to use the clojure.core.matrix.operators namespace if you want to use these names instead of the standard clojure.core operators
  19. Question: what should happen if we add a scalar number to an array? We have a feature called broadcasting, which allows a lower dimensional array to be treated as a higher dimensional array
  20. The idea of broadcasting also generalises to arrays! Here the semantics is the same, we just duplicate the smaller array to fill out the shape of the larger array
  21. So we have some rules for broadcasting. Note that it only really makes sense for elementwise operations. You can broadcast arrays explicitly if you want to, but it only happens automatically for elementwise operations at present. Can only add leading dimensions.
  22. So let’s talk about some higher order functions. Two of my favourite Clojure functions, map and reduce, are extremely useful higher order functions
  23. So one of the interesting observations about array programming is that you can also see it as a generalisation of sequences in multiple dimensions, so it probably isn’t too surprising that many of the sequence functions in Clojure actually have a nice array programming equivalent. emap is the equivalent of map: it maps a function over all elements of an array. The key difference is that it preserves the structure of the array, so here we’re mapping over a 2x2 matrix, and therefore we get a 2x2 result. ereduce is the equivalent of reduce over all elements. eseq is a handy bridge between core.matrix arrays and regular Clojure sequences: it just returns all the elements of an array in order. Note the row-major ordering of eseq and ereduce
  24. Basically mutability is horrible. You should be avoiding it as much as you can But it turns out that it is needed in some cases – performance matters for numerical work Mutability OK for library implementers, e.g. accumulation of a result in a temporary array Once a value is constructed, shouldn’t be mutated any more
  25. Usually 4x performance benefit isn’t a big deal – unless it happens to be your bottleneck There are cases where it might be important: e.g. if you are crunching through a lot of data and need to add to some sort of accumulator…
  26. Mutability OK for library implementers, e.g. accumulation of a result in a temporary array Once a value is constructed, shouldn’t be mutated any more
  27. Clearly this is insane – why so many matrix libraries?
  28. This explains the problem. But it doesn’t really help us…
  29. The point is that there isn’t ever going to be a perfect right answer when choosing a concrete data type to implement an abstraction. There are always going to be inherent advantages to different approaches.
  30. Luckily we have a secret weapon, and I think this is what really distinguishes core.matrix from all other array programming systems.
  31. Of course, the secret weapon is Clojure protocols. Here’s an example: the PSummable protocol is a very simple protocol that lets you compute the sum of all values in an array. Three things are important to know about protocols. First, they define an abstract interface, which is exactly what we need to define operations that work on our array abstraction. Second, they feature open extension, which means we can solve the expression problem and use protocols with arbitrary types; importantly, this includes types that weren’t written with the protocol in mind, e.g. arbitrary Java classes. Third, they have really fast dispatch, which is important if we want core.matrix to be useful in high-performance situations.
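A standalone sketch of what such a protocol can look like, in plain Clojure. It is modelled on the PSummable idea in the notes but is not copied from the core.matrix source; the method name `element-sum` and the choice of types are assumptions for illustration.

```clojure
;; Abstract interface: one operation, no concrete type mentioned.
(defprotocol PSummable
  (element-sum [m] "Returns the sum of all elements of m."))

;; Extend the protocol to existing Clojure types.
(extend-protocol PSummable
  Number
  (element-sum [m] m)
  clojure.lang.IPersistentVector
  (element-sum [m] (reduce + (map element-sum m))))

;; Open extension: Java double arrays were never written with this
;; protocol in mind, but we can still add them after the fact.
(extend (Class/forName "[D")
  PSummable
  {:element-sum (fn [^doubles arr]
                  (areduce arr i acc 0.0 (+ acc (aget arr i))))})

(element-sum [[1 2] [3 4]])
(element-sum (double-array [1.5 2.5]))
```

The same `element-sum` call works on nested vectors and on a primitive array, and callers never need to know which concrete type they have.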
  32. Protocols are really the “sweet spot” of being both fast and open. We benchmarked a pretty wide variety of different function calls.
  33. It’s easy to make a working core.matrix implementation! It’s more work if you want to make it perform well across the whole API, but that’s OK because it can be done incrementally. So hopefully this provides a smooth development path for core.matrix implementations to integrate.
  34. The secret is having default implementations for all protocols, which get used if you haven’t extended the protocol for your particular type. Note that the default implementation delegates to another protocol call. This is generally the case: ultimately all these protocol calls have to be implemented in terms of the lower-level mandatory protocols if we want them to work on any array.
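The default-implementation pattern can be sketched in plain Clojure like this. The protocol names `PElements` and `PTotal` are hypothetical stand-ins for a mandatory low-level protocol and an optional higher-level one; they are not core.matrix's actual protocols.

```clojure
;; Hypothetical low-level protocol that every implementation must provide.
(defprotocol PElements
  (elements [m] "Returns all elements as a flat sequence."))

;; Hypothetical optional protocol that gets a default implementation.
(defprotocol PTotal
  (total [m] "Returns the sum of all elements."))

(extend-protocol PElements
  clojure.lang.IPersistentVector
  (elements [m] (flatten m)))

(extend (Class/forName "[D")
  PElements
  {:elements (fn [arr] (seq arr))})

;; Default: used by any type that has not extended PTotal itself.
;; It delegates to the mandatory protocol, so it works on any array.
(extend-protocol PTotal
  Object
  (total [m] (reduce + (elements m))))

(total [[1 2] [3 4]])
(total (double-array [1.5 2.5]))
```

Neither vectors nor double arrays implement `PTotal` directly here, yet both get a working `total`. An implementation that cares about speed could later extend `PTotal` for its own type and bypass the default.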
  35. Value of a specialised implementation
  36. This makes some operations very efficient. For example, if you want to transpose an NDArray, you just need to reverse the shape and reverse the strides.
  37. vectorz-clj is probably the best choice if you want general-purpose double numerics. clatrix is probably the best choice if you want linear algebra with big matrices.
  38. Not only can you switch implementations: you can also mix them! This is actually quite a unique capability. How do we do this? We provide generic coercion functionality, so implementations typically use this to coerce the second argument to the type of the first.
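The stride trick can be sketched in plain Clojure with a map standing in for a strided array. The keys `:data`, `:shape`, `:strides` and `:offset` and the function names are assumptions for illustration, not the actual NDArray internals.

```clojure
(defn strided-get
  "Looks up the element at index vector idx using the strides."
  [{:keys [data strides offset]} idx]
  (nth data (+ offset (reduce + (map * strides idx)))))

(defn strided-transpose
  "Transposes without touching the data: only shape and strides change."
  [arr]
  (-> arr
      (update :shape (comp vec reverse))
      (update :strides (comp vec reverse))))

;; A 2x3 row-major array: [[0 1 2] [3 4 5]]
(def a {:data [0 1 2 3 4 5] :shape [2 3] :strides [3 1] :offset 0})
(def t (strided-transpose a))

(strided-get a [1 2])   ; row 1, column 2 of the original
(strided-get t [2 1])   ; the same element seen through the transpose
```

Both lookups read the same slot of `:data`, so the transpose costs the same however large the array is.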