Build Deep Learning Models from Raw Data
1. How to Build Deep Learning Models
Josh Patterson
Smart Data Conference 2015
2. Presenter: Josh Patterson
Past
Research in Swarm Algorithms
Real-time optimization techniques in mesh sensor networks
TVA / NERC
Smartgrid, Sensor Collection, and Big Data
Cloudera
Today
Patterson Consulting
Skymind (Advisor)
josh@pattersonconsultingtn.com / @jpatanooga
Co-Founder of DL4J
3. Topics
• What is Deep Learning?
• What is DL4J?
• Enterprise Grade Deep Learning Workflows
5. We Want to Be Able to Recognize Handwriting
This is a hard problem.
6. Automated Feature Engineering
• Deep Learning can be thought of as workflows for
automated feature construction
– Where previously we’d consider each stage in the
workflow as a unique technique
• Many of the techniques have been around for
years
– But now are being chained together in a way that
automates exotic feature engineering
• As LeCun says:
– “machines that learn to represent the world”
7.–8. [Image slides: the filters learned by each hidden neuron, visualized as image patches]
9. These are the features learned at each neuron in a Restricted Boltzmann Machine (RBM).
These features are passed to higher levels of RBMs to learn more complicated things.
[Image caption: part of the “7” digit]
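In DL4J terms, a stack like this is a deep belief network. Below is a minimal, illustrative sketch using the builder API that appears later in this deck; the layer sizes and options are assumptions for illustration, not code from the talk:
// Illustrative sketch: stacking RBMs so features learned at one level
// feed the next (layer sizes are assumptions, not from the slides)
MultiLayerConfiguration dbnConf = new NeuralNetConfiguration.Builder()
    .iterations(5)
    .list(3)
    // first RBM learns low-level strokes from raw 28x28 pixel input
    .layer(0, new RBM.Builder().nIn(784).nOut(500).build())
    // second RBM combines strokes into digit parts (e.g. part of a "7")
    .layer(1, new RBM.Builder().nIn(500).nOut(250).build())
    // softmax output layer classifies the ten digits
    .layer(2, new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
        .activation("softmax").nIn(250).nOut(10).build())
    .pretrain(true).backprop(true)
    .build();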
11. Deep Learning Architectures
• Deep Belief Networks
– Most common architecture
• Convolutional Neural Networks
– State of the art in image classification
• Recurrent Networks
– Models sequences of input and output
• Recursive Networks
– Text / image
– Can break down scenes in images
13. DL4J
• “The Hadoop of Deep Learning”
– Command line driven
– Java, Scala, and Python APIs
– ASF 2.0 Licensed
• Java implementation
– Parallelization (Yarn, Spark)
– GPU support
• Also supports multiple GPUs per host
• Runtime Neutral
– Local
– Hadoop / YARN
– Spark
– AWS
• https://github.com/deeplearning4j/deeplearning4j
– Chat with us on Gitter:
• https://gitter.im/deeplearning4j/deeplearning4j
14. Issues in Machine Learning
• Data Gravity
– We need to process the data in workflows where the data lives
• If you move data you don’t have big data
– Even if the data is not “big” we still want simpler workflows
• Integration Issues
– Ingest, ETL, Vectorization, Modeling, Evaluation, and
Deployment issues
– Most ML tools are built with previous generation architectures
in mind
• Legacy Architectures
– Parallel iterative algorithm architectures are not common
15. DL4J Suite of Tools
• DL4J
– Main library for deep learning
• Canova
– Vectorization library
• ND4J
– Linear algebra framework (see the sketch after this list)
– Swappable backends (JBLAS, GPUs):
• http://www.slideshare.net/agibsonccc/future-of-ai-on-the-jvm
• Arbiter
– Model evaluation and testing platform
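To give a flavor of the ND4J piece, here is a minimal sketch of its linear-algebra API; the same code runs unchanged on the JBLAS or GPU backend, whichever is on the classpath:
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class Nd4jSketch {
    public static void main(String[] args) {
        INDArray ones = Nd4j.ones(2, 2);   // 2x2 matrix of ones
        INDArray eye  = Nd4j.eye(2);       // 2x2 identity matrix
        INDArray sum  = ones.add(eye);     // element-wise addition
        INDArray prod = ones.mmul(eye);    // matrix multiplication
        System.out.println(sum);
        System.out.println(prod);
    }
}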
17. DL4J Core API
// set up the network: three ReLU dense layers feeding a softmax output
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .l2(2e-4).l1(1e-1)                     // penalty coefficients (defined but
    .regularization(false)                 //   regularization is disabled here)
    .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
    .iterations(5)
    .list(4)                               // 4 layers total
    .backprop(true).pretrain(false)
    .layer(0, new DenseLayer.Builder().nIn(nIn).nOut(600).activation("relu")
        .weightInit(WeightInit.XAVIER).build())
    .layer(1, new DenseLayer.Builder().nIn(600).nOut(500).activation("relu")
        .weightInit(WeightInit.XAVIER).build())
    .layer(2, new DenseLayer.Builder().nIn(500).nOut(400).activation("relu")
        .weightInit(WeightInit.XAVIER).build())
    .layer(3, new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT) // multiclass cross-entropy
        .activation("softmax")
        .nIn(400).nOut(5749)               // 5749 output classes
        .weightInit(WeightInit.XAVIER).build())
    .build();
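Once the configuration is built, the usual next step (not shown on the slide) is to construct and train the network; a minimal sketch, where trainData is a placeholder for your own DataSetIterator:
MultiLayerNetwork network = new MultiLayerNetwork(conf);
network.init();                        // allocate and initialize the weights
network.fit(trainData);                // train on your own DataSetIterator
System.out.println(network.params()); // learned parameters as one flat vector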
18. DL4J and Parallelization
[Diagram: traditional serial training vs. a modern parallel engine (Hadoop / Spark). The training data is broken into splits; each worker (1…N) trains a partial model on its split, and the master averages the partial models into a global model.]
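The scheme in the diagram is parameter averaging. A hypothetical sketch of the master's averaging step follows; partials, numParams, and network are illustrative names, not DL4J internals:
// Hypothetical sketch of parameter averaging: each worker returns its
// parameter vector, and the master averages them into the global model
INDArray global = Nd4j.zeros(1, numParams);
for (INDArray partial : partials) {   // one parameter vector per worker
    global.addi(partial);             // accumulate in place
}
global.divi(partials.size());         // divide by the number of workers
network.setParams(global);            // becomes the new global model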
19. DL4J Spark / GPUs via API
public class SparkGpuExample {

    public static void main(String[] args) throws Exception {
        // ND4J print/slice settings
        Nd4j.MAX_ELEMENTS_PER_SLICE = Integer.MAX_VALUE;
        Nd4j.MAX_SLICES_TO_PRINT = Integer.MAX_VALUE;

        // Set up a local Spark context; average parameters once at the end
        // rather than after every iteration
        SparkConf sparkConf = new SparkConf()
            .setMaster("local[*]")
            .set(SparkDl4jMultiLayer.AVERAGE_EACH_ITERATION, "false")
            .set("spark.akka.frameSize", "100")
            .setAppName("mnist");
        System.out.println("Setting up Spark Context...");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);

        // Network configuration: 784 MNIST inputs, RBM hidden layers, 10 outputs
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .momentum(0.9).iterations(10)
            .weightInit(WeightInit.DISTRIBUTION).batchSize(10000)
            .dist(new NormalDistribution(0, 1))
            .lossFunction(LossFunctions.LossFunction.RMSE_XENT)
            .nIn(784).nOut(10)
            .layer(new RBM())
            .list(4).hiddenLayerSizes(600, 500, 400)
            .override(3, new ClassifierOverride())  // last layer is a classifier
            .build();
        System.out.println("Initializing network");
        SparkDl4jMultiLayer master = new SparkDl4jMultiLayer(sc, conf);

        // Load all 60,000 MNIST examples and parallelize them as an RDD
        DataSet d = new MnistDataSetIterator(60000, 60000).next();
        List<DataSet> next = d.asList();
        JavaRDD<DataSet> data = sc.parallelize(next);

        // Distributed training, then evaluation on the same data
        MultiLayerNetwork network2 = master.fitDataSet(data);
        Evaluation evaluation = new Evaluation();
        evaluation.eval(d.getLabels(), network2.output(d.getFeatureMatrix()));
        System.out.println("Averaged once " + evaluation.stats());

        // Persist the parameters and the network configuration
        INDArray params = network2.params();
        Nd4j.writeTxt(params, "params.txt", ",");
        FileUtils.writeStringToFile(new File("conf.json"),
            network2.getLayerWiseConfigurations().toJson());
    }
}
20. Turn on GPUs and Spark
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>dl4j-spark</artifactId>
    <version>${dl4j.version}</version>
</dependency>
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-jcublas-7.0</artifactId>
    <version>${nd4j.version}</version>
</dependency>
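To run the same code on CPUs instead, swap the ND4J backend artifact; for example, the JBLAS backend (artifact name as of the ND4J releases of this era, check the docs for your version):
<dependency>
    <groupId>org.nd4j</groupId>
    <!-- CPU backend; swap back to nd4j-jcublas-7.0 for GPUs -->
    <artifactId>nd4j-jblas</artifactId>
    <version>${nd4j.version}</version>
</dependency>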
21. From Raw Data to Models
• We need to get data from a raw format into a
baseline raw vector
– Model the data
– Evaluate the Model
• Traditionally these are all tied together in one
tool
– But this is a monolithic pattern
– We’d like to apply the unix principles here
• The DL4J Suite of Tools lets us do this
22. Building Workflows From CLI
• We need to vectorize the data
– Possibly with some per column transformations
– Let’s use Canova
• We then need to build a deep learning model
over the data
– We’ll use the DL4J lib to do this
• Finally we’ll evaluate what happened
– This is where Arbiter comes in
23. Canova for Command Line Vectorization
• Library of tools to take
– Audio
– Video
– Image
– Text
– CSV data
• And convert the input data into vectors in a
standardized format
– Adaptable with custom input/output formats
• Open Source, ASF 2.0 Licensed
– https://github.com/deeplearning4j/Canova
– Part of DL4J suite
24. Vectorization with Canova
• Setup the configuration file
– Input Formats
– Output Formats
– Setup data types to vectorize
• Setup the schema transforms for the input
CSV data
• Generate the SVMLight vector data as the
output
– with the command line interface
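For reference, SVMLight is a plain-text sparse format: one record per line, a label followed by 1-based index:value pairs. The first UCI Iris row (5.1,3.5,1.4,0.2,Iris-setosa) would come out roughly as the line below, assuming Iris-setosa is mapped to label 0:
0 1:5.1 2:3.5 3:1.4 4:0.2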
27. Model UCI Iris From CLI
./bin/canova vectorize -conf /tmp/iris_conf.txt
File path already exists, deleting the old file before proceeding...
Output vectors written to: /tmp/iris_svmlight.txt
./bin/dl4j train -conf /tmp/iris_conf.txt
[ …log output… ]
./bin/arbiter evaluate -conf /tmp/iris_conf.txt
[ …log output… ]
28. Questions?
Thank you for your time and attention
“Deep Learning: A Practitioner’s Approach”
(O’Reilly, October 2015)
Editor's Notes
We plot the learned filter for each hidden neuron, one per column of W. Each filter has the same dimensions as the input data, and it is most useful to visualize the filters the same way the input data is visualized. In the case of image patches, we show each filter as an image patch.
POLR: Parallel Online Logistic Regression
Talking points:
We wanted to start with a tool known to the Hadoop community, with expected characteristics
Mahout’s SGD is well known, so we used that as a starting point
The API is the early-adopter entry point
This is how we enable Spark execution and GPU integration for the back end
The code is slightly different for the Spark job
The linear algebra code is the same, with no changes, for the math / ND4J implementation
Next release: we’re expanding the user base with a CLI front end to the whole thing
The domain expert rarely knows how to code
We want to make it easier for someone who knows “bash” to get involved (they will still need some engineering help)