DeepLearning4J and Spark: Successes and Challenges - François Garillot

At the recent sold-out Spark & Machine Learning Meetup in Brussels, François Garillot of Skymind delivered a lightning talk called DeepLearning4J and Spark: Successes and Challenges.

Specifically, François offered a tour of the DeepLearning4J architecture intermingled with applications. He went over the main building blocks of this deep learning solution for the JVM, which includes GPU acceleration, a custom n-dimensional array library, a parallelized Swiss Army knife for data loading, and deep learning and reinforcement learning libraries, all with an easy-access interface.

Along the way, he pointed out the strategic points for parallelizing computation across machines and gave insight into where Spark helps, and where it doesn't.

  1. Deeplearning4J • François Garillot, @huitseeker
  2. Neural Networks & Deep Learning • graphical models w/ inputs and outputs • represents composition of differentiable functions • deep learning: expressivity exponential w.r.t. depth
  3. Interesting results • cat paper by Andrew Ng & Google • AlexNet by Toronto • last week CNTK at speech recognition parity with humans
  4. Industrial results • Autonomous Driving: Drive.ai, Comma.ai + the usual suspects • Drug discovery: Deep Genomics (Frey) & Bayer • Predictive Maintenance: Thales, Bosch • optimistic pessimism (Moghimi, Manulife Financial Corp.)
  5. Deep Learning in two steps: training, applying • training tends to require lots of data, (R) • but applying does not (embedded, etc.), so applying pre-trained models (TensorFrames) is not the technical/business challenge. Enterprise: you have lots of data yourself, what to apply?
  6. Benchmarks aren't distributed
  7. Training, but how? New Amazon GPU instances?
  8. Deep Learning Training • Facebook, Amazon, Google, Baidu, Microsoft have this distributed • But what if you’re not one of them?
  9. Training, but how?
  10. Distributing training • basically distributing SGD (R) • challenge is AllReduce communication • Sparse updates, async communications
  11. Deeplearning4J • the first commercial-grade, open-source, distributed deep-learning library written for Java and Scala • Skymind: its commercial support arm
  12. Scientific computing on the JVM • libnd4j: Vectorization, 32-bit addressing, linalg (BLAS!) • JavaCPP: generates JNI bindings to your C++ libs • ND4J: numpy for the JVM, native superfast arrays • DataVec: one-stop interface to an NDArray • DeepLearning4J: orchestration, backprop, layer definition • ScalNet: gateway drug, inspired by (and closely following) Keras
  13. Reinforcement learning
  14. Killing the bottlenecks: generic • swappable net backend: Netty -> Aeron (Hi Lightbend!) • better support for binary data: big indexed tables, binary, columnar, off-heap • and more (Tamiya Onodera's group @ IBM Japan): http://www.slideshare.net/ishizaki/exploiting-gpus-in-spark
  15. And if you don't care about Deep Learning? • SPARK-6442: better linear algebra than Breeze, please. (sparse, performant, Java-compatible, and an OK license) • SystemML got a best paper at VLDB'16, how about helping out on ND4J? • ND4J only lacks sparse, but not for long ...
  16. Questions?
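
Slide 10 describes distributed training as distributing SGD, with the AllReduce communication as the hard part. The following is a minimal single-process sketch of that idea in plain Java, assuming synchronous updates and equally sized data shards. The class `DataParallelSgd` and its methods are hypothetical names chosen for illustration; they are not part of Deeplearning4J, ND4J, or Spark, and the network AllReduce is replaced here by an in-memory average.

```java
import java.util.Arrays;

/** Toy, single-process simulation of synchronous data-parallel SGD (illustrative only). */
public class DataParallelSgd {

    /** Gradient of the mean squared error for y ~ w[0] * x + w[1] on one worker's shard. */
    public static double[] localGradient(double[] w, double[] xs, double[] ys) {
        double g0 = 0.0;
        double g1 = 0.0;
        for (int i = 0; i < xs.length; i++) {
            double err = w[0] * xs[i] + w[1] - ys[i];
            g0 += 2.0 * err * xs[i];
            g1 += 2.0 * err;
        }
        return new double[] { g0 / xs.length, g1 / xs.length };
    }

    /** Stand-in for AllReduce: element-wise mean of the per-worker gradients. */
    public static double[] allReduceMean(double[][] grads) {
        double[] out = new double[grads[0].length];
        for (double[] g : grads) {
            for (int j = 0; j < out.length; j++) {
                out[j] += g[j];
            }
        }
        for (int j = 0; j < out.length; j++) {
            out[j] /= grads.length;
        }
        return out;
    }

    /** Every step: each worker computes a local gradient, gradients are averaged, weights are updated. */
    public static double[] train(double[][] shardX, double[][] shardY, double lr, int steps) {
        double[] w = new double[2];
        int workers = shardX.length;
        for (int s = 0; s < steps; s++) {
            double[][] grads = new double[workers][];
            for (int k = 0; k < workers; k++) {
                // In a real cluster this loop body runs in parallel, one shard per machine.
                grads[k] = localGradient(w, shardX[k], shardY[k]);
            }
            double[] g = allReduceMean(grads);
            for (int j = 0; j < w.length; j++) {
                w[j] -= lr * g[j];
            }
        }
        return w;
    }

    public static void main(String[] args) {
        // Points on the line y = 2x + 1, split across two simulated workers.
        double[][] xs = { { 0.0, 1.0 }, { 2.0, 3.0 } };
        double[][] ys = { { 1.0, 3.0 }, { 5.0, 7.0 } };
        System.out.println(Arrays.toString(train(xs, ys, 0.05, 5000)));
    }
}
```

In this sketch the averaging step is the only point where workers exchange data, which is why the slide singles out AllReduce communication as the challenge: its cost grows with model size and worker count. The sparse updates and asynchronous communication mentioned on the same slide are ways to reduce that cost and are not modeled here.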