Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Machine Learning in Rust with Leaf and Collenchyma

1,586 views

Published on

Machine Learning in Rust. A presentation about the Machine Intelligence Framework Leaf and the portable computation Framework Collenchyma by Autumn

Published in: Technology

Machine Learning in Rust with Leaf and Collenchyma

  1. 1. MACHINE LEARNING IN RUST WITH LEAF AND COLLENCHYMA RUST TALK BERLIN | Feb.2016 > @autumn_eng
  2. 2. “In machine learning, we seek methods by which the computer will come up with its own program based on examples that we provide.” MACHINE LEARNING ~ PROGRAMMING BY EXAMPLE [1]: http://www.cs.princeton.edu/courses/archive/spr08/cos511/scribe_notes/0204.pdf
  3. 3. AREAS OF ML [1]: http://cs.jhu.edu/~jason/tutorials/ml-simplex DOMAIN KNOWLEDGE LOTS OF DATA PROOF. TECHNIQUES BAYESIAN DEEP CLASSICAL INSIGHTFUL MODEL FITTING MODEL ANALYZABLE MODEL
  4. 4. > DEEP LEARNING. TON OF DATA + SMART ALGOS + PARALLEL COMP. [1]: http://www.andreykurenkov.com/writing/a-brief-history-of-neural-nets-and-deep-learning/
  5. 5. LARGE, SPARSE, HIGH-DIMENSIONAL DATASETS, LIKE IMAGES, AUDIO, TEXT, SENSORY & TIMESERIES DATA KEY CONCEPTS | DATA
  6. 6. > DEEP NEURAL NETWORK UNIVERSAL FUNCTION APPROXIMATOR, REPRESENTING HIRARCHICAL STRUCTURES IN LEARNED DATA KEY CONCEPTS | ALGORITHMS [1]: http://cs231n.github.io/neural-networks-1/
  7. 7. > DEEP NEURAL NETWORK KEY CONCEPTS | ALGORITHMS [image]: http://www.kdnuggets.com/wp-content/uploads/deep-learning.png
  8. 8. > BACKPROPAGATION COMPUTING GRADIENTS OF THE NETWORK THROUGH THE CHAIN RULE KEY CONCEPTS | ALGORITHMS [1]: http://cs231n.github.io/optimization-2/
  9. 9. KEY CONCEPTS | PARALLEL COMPUTATION > MULTI CORE DEVICES (GPUs) HIGH-DIMENSIONAL MATHEMATICAL OPERATIONS CAN BE EXECUTED MORE EFFICIENTLY ON SPECIAL-PURPOSE CHIPS LIKE GPUS OR FPGAS.
  10. 10. > COLLENCHYMA PORTABLE, PARALLEL, HPC IN RUST [1]: https://github.com/autumnai/collenchyma
  11. 11. SIMILAR PROJECTS: ARRAYFIRE (C++) COLLENCHYMA | OVERVIEW [1]: https://github.com/autumnai/collenchyma
  12. 12. COLLENCHYMA FUNDAMENTAL CONCEPTS 1. PORTABLE COMPUTATION | FRAMEWORKS/BACKEND 2. PLUGINS | OPERATIONS 3. MEMORY MANAGEMENT | SHAREDTENSOR
  13. 13. A COLLENCHYMA-FRAMEWORK DESCRIBES A COMPUTATIONAL LANGUAGE LIKE RUST, OPENCL, CUDA. A COLLENCHYMA-BACKEND DESCRIBES A SINGLE COMPUTATIONAL-CAPABLE HARDWARE (CPU, GPU, FPGA) WHICH IS ADDRESSABLE BY A FRAMEWORK. COLLENCHYMA | PORTABLE COMPUTATION
  14. 14. /// Defines a Framework. pub trait IFramework { /// Initializes a new Framework. /// /// Loads all the available hardwares fn new() -> Self where Self: Sized; /// Initializes a new Device from the provided hardwares. fn new_device(&self, &[Self::H]) -> Result<DeviceType, Error>; } /// Defines the main and highest struct of Collenchyma. pub struct Backend<F: IFramework> { framework: Box<F>, device: DeviceType, } COLLENCHYMA | PORTABLE COMPUTATION
  15. 15. // Initialize a CUDA Backend. let backend = Backend::<Cuda>::default().unwrap(); // Initialize a CUDA Backend - the explicit way let framework = Cuda::new(); let hardwares = framework.hardwares(); let backend_config = BackendConfig::new(framework, hardwares[0]); let backend = Backend::new(backend_config).unwrap(); COLLENCHYMA | PORTABLE COMPUTATION
  16. 16. COLLENCHYMA-PLUGINS ARE CRATES, WHICH EXTEND THE COLLENCHYMA BACKEND WITH FRAMEWORK AGNOSTIC, MATHEMATICAL OPERATIONS E.G. BLAS OPERATIONS COLLENCHYMA | OPERATIONS [1]: https://github.com/autumnai/collenchyma-blas [2]: https://github.com/autumnai/collenchyma-nn
  17. 17. /// Provides the functionality for a backend to support Neural Network related operations. pub trait NN<F: Float> { /// Initializes the Plugin. fn init_nn(); /// Returns the device on which the Plugin operations will run. fn device(&self) -> &DeviceType; } /// Provides the functionality for a Backend to support Sigmoid operations. pub trait Sigmoid<F: Float> : NN<F> { fn sigmoid(&self, x: &mut SharedTensor<F>, result: &mut SharedTensor<F>) -> Result<(), ::co::error::Error>; fn sigmoid_plain(&self, x: &SharedTensor<F>, result: &mut SharedTensor<F>) -> Result<(), ::co::error::Error>; fn sigmoid_grad(&self, x: &mut SharedTensor<F>, x_diff: &mut SharedTensor<F>) -> Result<(), ::co::error::Error>; fn sigmoid_grad_plain(&self, x: &SharedTensor<F>, x_diff: &SharedTensor<F>) -> Result<(), ::co::error::Error>; } COLLENCHYMA | OPERATIONS [1]: https://github.com/autumnai/collenchyma-nn/blob/master/src/plugin.rs
  18. 18. impl NN<f32> for Backend<Cuda> { fn init_nn() { let _ = CUDNN.id_c(); } fn device(&self) -> &DeviceType { self.device() } } impl_desc_for_sharedtensor!(f32, ::cudnn::utils::DataType::Float); impl_ops_convolution_for!(f32, Backend<Cuda>); impl_ops_sigmoid_for!(f32, Backend<Cuda>); impl_ops_relu_for!(f32, Backend<Cuda>); impl_ops_softmax_for!(f32, Backend<Cuda>); impl_ops_lrn_for!(f32, Backend<Cuda>); impl_ops_pooling_for!(f32, Backend<Cuda>); COLLENCHYMA | OPERATIONS [1]: https://github.com/autumnai/collenchyma-nn/blob/master/src/frameworks/cuda/mod.rs
  19. 19. COLLENCHYMA’S SHARED TENSOR IS A DEVICE- AND FRAMEWORK-AGNOSTIC MEMORY-AWARE, N- DIMENSIONAL STORAGE. COLLENCHYMA | MEMORY MANAGEMENT
  20. 20. OPTIMIZED FOR NON-SYNC USE CASE pub struct SharedTensor<T> { desc: TensorDesc, latest_location: DeviceType, latest_copy: MemoryType, copies: LinearMap<DeviceType, MemoryType>, phantom: PhantomData<T>, } COLLENCHYMA | MEMORY MANAGEMENT [1]: https://github.com/autumnai/collenchyma/blob/master/src/tensor.rs
  21. 21. fn sigmoid( &self, x: &mut SharedTensor<f32>, result: &mut SharedTensor<f32> ) -> Result<(), ::co::error::Error> { match x.add_device(self.device()) { _ => try!(x.sync(self.device())) } match result.add_device(self.device()) { _ => () } self.sigmoid_plain(x, result) } COLLENCHYMA | MEMORY MANAGEMENT [1]: https://github.com/autumnai/collenchyma-nn/blob/master/src/frameworks/cuda/helper.rs
  22. 22. fn main() { // Initialize a CUDA Backend. let backend = Backend::<Cuda>::default().unwrap(); // Initialize two SharedTensors. let mut x = SharedTensor::<f32>::new(backend.device(), &(1, 1, 3)).unwrap(); let mut result = SharedTensor::<f32>::new(backend.device(), &(1, 1, 3)).unwrap(); // Fill `x` with some data. let payload: &[f32] = &::std::iter::repeat(1f32).take(x.capacity()).collect::<Vec<f32>>(); let native = Backend::<Native>::default().unwrap(); x.add_device(native.device()).unwrap(); write_to_memory(x.get_mut(native.device()).unwrap(), payload); // Write to native host memory. x.sync(backend.device()).unwrap(); // Sync the data to the CUDA device. // Run the sigmoid operation, provided by the NN Plugin, on your CUDA enabled GPU. backend.sigmoid(&mut x, &mut result).unwrap(); // See the result. result.add_device(native.device()).unwrap(); // Add native host memory result.sync(native.device()).unwrap(); // Sync the result to host memory. println!("{:?}", result.get(native.device()).unwrap().as_native().unwrap().as_slice::<f32>()); } COLLENCHYMA | BRINGING IT TOGETHER [1]: https://github.com/autumnai/collenchyma#examples
  23. 23. > LEAF MACHINE INTELLIGENCE FRAMEWORK [1]: https://github.com/autumnai/leaf
  24. 24. SIMILAR PROJECTS: Torch (Lua), Theano (Python), Tensorflow (C++/Python), Caffe (C++) LEAF | OVERVIEW
  25. 25. Fundamental Parts Layers (based on Collenchyma plugins) Solvers
  26. 26. CONNECTED LAYERS FORM A NEURAL NETWORK BACKPROPAGATION VIA TRAITS => GRADIENT CALCULATION EASILY SWAPABLE LEAF | Layers
  27. 27. T = DATATYPE OF SHAREDTENSOR (e.g. f32). B = BACKEND /// A Layer that can compute the gradient with respect to its input. pub trait ComputeInputGradient<T, B: IBackend> { /// Compute gradients with respect to the inputs and write them into `input_gradients`. fn compute_input_gradient(&self, backend: &B, weights_data: &[&SharedTensor<T>], output_data: &[&SharedTensor<T>], output_gradients: &[&SharedTensor<T>], input_data: &[&SharedTensor<T>], input_gradients: &mut [&mut SharedTensor<T>]); } LEAF | Layers
  28. 28. POOLING LAYER EXECUTE ONE OPERATION OVER A REGION OF THE INPUT LEAF | Layers [image]: http://cs231n.github.io/convolutional-networks/
  29. 29. impl<B: IBackend + conn::Pooling<f32>> ComputeInputGradient<f32, B> for Pooling<B> { fn compute_input_gradient(&self, backend: &B, _weights_data: &[&SharedTensor<f32>], output_data: &[&SharedTensor<f32>], output_gradients: &[&SharedTensor<f32>], input_data: &[&SharedTensor<f32>], input_gradients: &mut [&mut SharedTensor<f32>]) { let config = &self.pooling_configs[0]; match self.mode { PoolingMode::Max => backend.pooling_max_grad_plain( output_data[0], output_gradients[0], input_data[0], input_gradients[0], config).unwrap() } }} LEAF | Layers
  30. 30. STOCHASTIC GRADIENT DESCENT REQUIRES BACKPROPAGATION MIGHT NOT FIND THE GLOBAL MINIMUM BUT WORKS FOR A HUGH NUBMER OF WEIGHTS LEAF | Solvers [image]: https://commons.wikimedia.org/wiki/File:Extrema_example_original.svg by Wikipedia user KSmrq
  31. 31. SDG WITH MOMENTUM: LEARNING RATE MOMENTUM OF HISTORY LEAF | PART 2
  32. 32. LEAF BRINGING ALL THE PARTS TOGETHER
  33. 33. > LIVE EXAMPLE - MNIST Dataset of 50_000 (training) + 10_000 (test) handwritten digits 28 x 28 px greyscale (8bit) images [image]: http://neuralnetworksanddeeplearning.com/chap1.html
  34. 34. We will use a single-layer perceptron LEAF | LIVE EXAMPLE [image]: http://neuralnetworksanddeeplearning.com/chap1.html
  35. 35. RUST MACHINE LEARNING IRC: #rust-machine-learning on irc.mozilla.org TWITTER: @autumn_eng http://autumnai.com

×