A recent presentation on Deeplearning4j's new features, along with some underused parts of the framework such as Arbiter, DataVec's TransformProcess, and libnd4j.
2. What Is Skymind?
● Skymind is Red Hat for AI.
● The Skymind Intelligence Layer (SKIL) is RHEL: an enterprise distribution backed by commercial support.
● SKIL bundles libraries that Skymind built:
○ Deeplearning4j: Neural net configuration
○ ND4J: Scientific computing engine
○ DataVec: ETL tool for machine learning
○ Deep-learning model server with a REST API
● SKIL helps you train neural nets quickly and get the maximum value from your hardware
3. Company Overview
● Founded: 2014
● Funding: $6.3M
● Sales: $1M+ in Q1 2017; $4-6M projected in 2017
● Clients:
○ 12 enterprise clients
○ 10,000+ open-source developers
○ 160,000+ DL4J downloads/mo.
● Staff: 25
5. Skymind’s Open-Source Tools
● DL4J: Build, train, and deploy neural networks on the JVM
● RL4J: Reinforcement learning algorithms on the JVM
● ND4J: High-performance tensor library for scientific computing
● Arbiter: Hyperparameter optimization for neural networks
● DataVec: Data ingestion, normalization, and vectorization (ETL for ML)
● Model Import: Import and deploy neural networks trained with Keras, TensorFlow, and Theano
6. ParallelWrapper
● Single-node parameter averaging
● Use a DataSetIterator but run multi-GPU, or just use all cores
● Also useful for testing parameter averaging on a single node before going distributed
● https://github.com/deeplearning4j/dl4j-examples/blob/master/dl4j-cuda-specific-examples/src/main/java/org/deeplearning4j/examples/multigpu/MultiGpuLenetMnistExample.java#L40
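To make the idea of single-node parameter averaging concrete, here is a minimal plain-Java sketch. It is not the ParallelWrapper API: the class name, the 1-D model y = w * x, and the shard layout are all assumptions chosen for illustration. Each worker trains its own copy of the parameter on its own shard, and the copies are then averaged.

```java
import java.util.List;

// Illustrative only: plain-Java parameter averaging, not the ParallelWrapper API.
class ParameterAveragingSketch {

    // One SGD pass over a shard for the 1-D model y = w * x, starting from w.
    // Each example is {x, y}.
    static double trainOnShard(double w, double[][] shard, double learningRate) {
        for (double[] example : shard) {
            double x = example[0];
            double y = example[1];
            double gradient = 2.0 * (w * x - y) * x; // d/dw of the squared error
            w -= learningRate * gradient;
        }
        return w;
    }

    // Every shard is trained in parallel from the same starting w,
    // then the resulting parameters are averaged into one value.
    static double averagingRound(double w, List<double[][]> shards, double learningRate) {
        return shards.parallelStream()
                .mapToDouble(shard -> trainOnShard(w, shard, learningRate))
                .average()
                .orElse(w); // no shards: keep the parameter unchanged
    }
}
```

Calling `averagingRound` repeatedly, feeding each result back in as the next starting value, plays the role of successive averaging rounds.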
7. DataVec TransformProcess
● Persistable data pipelines
● Mainly useful for CSV and log data right now
● Also comes with a transform process server for exposing data transforms as a service
● https://github.com/deeplearning4j/dl4j-examples/blob/master/datavec-examples/src/main/java/org/datavec/transform/join/JoinExample.java
8. ND4J indexing
● Use the static methods in NDArrayIndex
● Allows slicing an array any way you could in NumPy
● Boolean indexing also supports masking
● https://github.com/deeplearning4j/dl4j-examples/blob/master/nd4j-examples/src/main/java/org/nd4j/examples/Nd4jEx6_BooleanIndexing.java
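The masking idea behind boolean indexing can be sketched on plain Java arrays. This is not ND4J code: the class and both helper names are hypothetical, and a real INDArray would be handled through the ND4J classes shown in the linked example.

```java
import java.util.function.DoublePredicate;

// Illustrative only: masking on plain double[] arrays, not the ND4J API.
class BooleanMaskSketch {

    // Build a mask that is true wherever the condition holds.
    static boolean[] maskWhere(double[] values, DoublePredicate condition) {
        boolean[] mask = new boolean[values.length];
        for (int i = 0; i < values.length; i++) {
            mask[i] = condition.test(values[i]);
        }
        return mask;
    }

    // Return a copy in which every masked position holds the replacement value.
    static double[] replaceWhere(double[] values, boolean[] mask, double replacement) {
        double[] result = values.clone();
        for (int i = 0; i < result.length; i++) {
            if (mask[i]) {
                result[i] = replacement;
            }
        }
        return result;
    }
}
```

For example, a mask built with `v -> v < 0` followed by a replacement of `0.0` clamps negative entries to zero while leaving the input array untouched.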
9. Arbiter
● Define a search space
● Supports grid and random search
● Comes with a GUI for looking at an overall search space
● https://github.com/deeplearning4j/dl4j-examples/tree/master/arbiter-examples
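The difference between grid and random search over a search space can be sketched in a few lines of plain Java. This does not use Arbiter's classes: the two hyperparameters (learning rate and layer size), the class name, and the scoring function passed to `best` are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Illustrative only: grid vs. random candidate generation, not the Arbiter API.
// A candidate is {learningRate, layerSize}.
class SearchSpaceSketch {

    // Grid search: every combination of the listed values.
    static List<double[]> gridCandidates(double[] learningRates, double[] layerSizes) {
        List<double[]> candidates = new ArrayList<>();
        for (double lr : learningRates) {
            for (double size : layerSizes) {
                candidates.add(new double[] {lr, size});
            }
        }
        return candidates;
    }

    // Random search: sample each dimension uniformly from its {min, max} range.
    static List<double[]> randomCandidates(double[] lrRange, double[] sizeRange,
                                           int count, long seed) {
        Random rng = new Random(seed);
        List<double[]> candidates = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            double lr = lrRange[0] + rng.nextDouble() * (lrRange[1] - lrRange[0]);
            double size = sizeRange[0] + rng.nextDouble() * (sizeRange[1] - sizeRange[0]);
            candidates.add(new double[] {lr, size});
        }
        return candidates;
    }

    // Return the candidate with the lowest score (lower is better).
    static double[] best(List<double[]> candidates, ToDoubleFunction<double[]> score) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidates to score");
        }
        double[] bestCandidate = candidates.get(0);
        double bestScore = score.applyAsDouble(bestCandidate);
        for (double[] candidate : candidates) {
            double s = score.applyAsDouble(candidate);
            if (s < bestScore) {
                bestScore = s;
                bestCandidate = candidate;
            }
        }
        return bestCandidate;
    }
}
```

In a real search the score would come from training and evaluating a network with each candidate; here it is just a function supplied by the caller.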
10. Nearest neighbors!
● We have VPTrees, QuadTrees, KDTrees, and SPTrees
● Used in BarnesHutTsne and our nearest neighbors server
● Combined with an autoencoder variant, it’s also a great way of leveraging unsupervised algorithms (run nearest neighbors on the representations from the neural net)
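As a rough illustration of how a vantage-point tree answers nearest-neighbor queries, here is a small self-contained sketch. It is not DL4J's VPTree class: the class name, the choice of the first point as vantage point, and the Euclidean metric are assumptions. The input vectors stand in for representations produced by a network such as an autoencoder.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a minimal vantage-point tree, not DL4J's VPTree.
class VpTreeSketch {
    final double[] vantage;      // point stored at this node
    final double radius;         // median distance from the vantage point
    final VpTreeSketch inside;   // points with distance <= radius
    final VpTreeSketch outside;  // points with distance >= radius

    private VpTreeSketch(double[] vantage, double radius,
                         VpTreeSketch inside, VpTreeSketch outside) {
        this.vantage = vantage;
        this.radius = radius;
        this.inside = inside;
        this.outside = outside;
    }

    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Build a tree over the points; returns null for an empty list.
    static VpTreeSketch build(List<double[]> points) {
        if (points.isEmpty()) {
            return null;
        }
        double[] vp = points.get(0);
        List<double[]> rest = new ArrayList<>(points.subList(1, points.size()));
        if (rest.isEmpty()) {
            return new VpTreeSketch(vp, 0.0, null, null);
        }
        rest.sort(Comparator.comparingDouble((double[] p) -> distance(vp, p)));
        int mid = rest.size() / 2;
        double radius = distance(vp, rest.get(mid));
        return new VpTreeSketch(vp, radius,
                build(rest.subList(0, mid + 1)),
                build(rest.subList(mid + 1, rest.size())));
    }

    // Returns the stored point closest to the query.
    double[] nearest(double[] query) {
        return search(query, null);
    }

    private double[] search(double[] query, double[] best) {
        double d = distance(query, vantage);
        if (best == null || d < distance(query, best)) {
            best = vantage;
        }
        VpTreeSketch first = d <= radius ? inside : outside;
        VpTreeSketch second = d <= radius ? outside : inside;
        if (first != null) {
            best = first.search(query, best);
        }
        // By the triangle inequality, the other side can only hold a closer
        // point if the current best distance reaches the radius boundary.
        if (second != null && Math.abs(d - radius) <= distance(query, best)) {
            best = second.search(query, best);
        }
        return best;
    }
}
```

The pruning step is what makes the tree faster than a linear scan on well-clustered data: whole subtrees are skipped when they cannot contain a closer point.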
11. Libnd4j
● Self-contained C++ library
● Contains the CUDA kernels and OpenMP loops for various types of algorithms (reductions, scans, ...)
● Also contains bindings for BLAS and LAPACK
● https://github.com/deeplearning4j/libnd4j
12. cuDNN
● DL4J actually comes with cuDNN
● All you have to do is include it as a dependency
● We support LSTMs and CNNs
● It’s actually possible to use this on Flink (just tricky to set up)
13. ParallelInference
● Thread-pool-oriented model serving
● Supports Keras and DL4J models
● Useful for saturating your server; also gets around models not being thread-safe
● Works similarly to ParallelWrapper
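One common way to serve a model that is not thread-safe from a thread pool is to give each worker thread its own replica. The sketch below shows that pattern in plain Java. It is an assumption about the general approach, not a description of how ParallelInference is implemented, and the `Model` class is a hypothetical stand-in.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative only: per-thread model replicas behind a thread pool,
// not the ParallelInference API.
class InferencePoolSketch {

    // Hypothetical stand-in for a model that is not thread-safe:
    // it reuses an internal scratch buffer between calls.
    static class Model {
        private final double[] scratch = new double[1];

        double predict(double x) {
            scratch[0] = x * 2.0;
            return scratch[0] + 1.0;
        }
    }

    private final ExecutorService pool;

    // Each worker thread lazily creates and then keeps its own replica,
    // so no two threads ever share a Model instance.
    private final ThreadLocal<Model> replica = ThreadLocal.withInitial(Model::new);

    InferencePoolSketch(int workers) {
        this.pool = Executors.newFixedThreadPool(workers);
    }

    // Queue a request; the result is available from the returned future.
    CompletableFuture<Double> predict(double x) {
        return CompletableFuture.supplyAsync(() -> replica.get().predict(x), pool);
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

Callers submit requests from any thread and block only when they read the future, which lets the pool keep every core busy.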
14. ND4J Workspaces
● Our new cyclic memory management engine
● Allows you to turn off garbage collection
● We have seen a 3 to 5x speed improvement using it
● http://deeplearning4j.org/workspaces
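The "cyclic" idea can be pictured as a buffer that is allocated once and reused on every iteration instead of being left to the garbage collector. The sketch below illustrates that general pattern only; the class and method names are hypothetical, and it makes no claim about how ND4J workspaces are implemented internally.

```java
// Illustrative only: a reusable bump-allocated buffer, not the ND4J workspace API.
class WorkspaceSketch {
    private final double[] buffer; // allocated once, reused every cycle
    private int offset = 0;        // next free slot in the buffer

    WorkspaceSketch(int capacity) {
        this.buffer = new double[capacity];
    }

    // Reserve the next `length` slots and return the start offset of the region.
    int allocate(int length) {
        if (offset + length > buffer.length) {
            throw new IllegalStateException("workspace capacity exceeded");
        }
        int start = offset;
        offset += length;
        return start;
    }

    // Begin a new cycle: earlier regions are reused rather than freed.
    void reset() {
        offset = 0;
    }

    double[] buffer() {
        return buffer;
    }
}
```

A training loop would call `reset()` at the top of each iteration, so the same memory serves every pass and no per-iteration garbage is produced.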
15. Upcoming features
● Jumpy (our Python interface): https://github.com/deeplearning4j/jumpy
● Autodiff
● Integration of our Aeron-based parameter server
● Proper Keras backend
16. DL4J streaming
● Integrates with Kafka to set up streaming jobs
● Needs some work and promotion but is a great way of leveraging Kafka for production
● Also supports Spark Streaming
● Flink contributions welcome
17. Transfer Learning
● Take pretrained models from Keras, load them into DL4J, and write a “fine tune configuration”
● Use this to build new image models quickly
● Use our model zoo and refine existing models
19. Community showcase
● Apache OpenNLP is integrating Arbiter and our LSTMs
● Apache Tika recently merged our
● Apache Flink wants to merge us for GPU support
● Tons of traction with the Spring Boot community