Future of AI on the JVM: Microservices and Deep Learning

This document discusses the future of artificial intelligence on the Java Virtual Machine (JVM). It argues that current machine learning frameworks are monolithic and make assumptions about data. It proposes a micro-services approach to machine learning that separates concerns such as data pipelines, scoring, model training, and evaluation, which would reduce lock-in and allow greater flexibility. It also discusses why new hardware such as GPUs is better suited to deep learning, and the role frameworks like Spark and Akka could play in distributed, real-time machine learning applications on the JVM.

Future of AI on the JVM
Scala Days Amsterdam 2015
Adam Gibson
Creator of Deeplearning4j (and 4s :)

What is AI?
● Not Terminator (despite our name)
● Many subfields
● Our focus: Machine learning

Problem Space
● Spam Classification
● Summarization
● Face Detection
● Eye Tracking
● Targeted Ads
● Recommendation Engines

Current State of ML
● Simpler models
● Most of industry barely uses logistic regression
● Many problems are binary
o e.g. fraud, spam
● Some unsupervised (clustering, recommendations)
● Lots of ML frameworks on JVM

ML Frameworks on JVM...
● Apache Mahout
● Spark’s MLlib
● Weka (is that R?)

Problems
● Monolithic
● Makes assumptions about data
● Hard to use
● No separation of concerns

Ring a Bell?
● We call that “Monolithic”
● Separate ML concerns:
Data Pipelines/Vectorization
Scoring
Model Training
Evaluation

Micro-Services + ML?
● Kinda like micro-services
● Reduce lock-in
● Take math, data cleaning, model training, choosing algorithms ...
● … and separate them
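
A minimal sketch of what this separation could look like in plain Java. The interface names (Vectorizer, Model, Scorer), the SeparatedPipeline class, and the toy spam example are all hypothetical and chosen for illustration; none of them come from Deeplearning4j or any other library.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical: turns one raw record into a feature vector. */
interface Vectorizer<T> {
    double[] vectorize(T raw);
}

/** Hypothetical: maps a feature vector to a predicted label (0 or 1). */
interface Model {
    int predict(double[] features);
}

/** Hypothetical: compares predictions against known labels. */
interface Scorer {
    double score(int[] predicted, int[] actual);
}

final class SeparatedPipeline {
    /** Wires the three concerns together; each piece can be swapped independently. */
    static <T> double evaluate(Vectorizer<T> v, Model m, Scorer s, List<T> raw, int[] labels) {
        int[] predicted = new int[raw.size()];
        for (int i = 0; i < raw.size(); i++) {
            predicted[i] = m.predict(v.vectorize(raw.get(i)));
        }
        return s.score(predicted, labels);
    }

    /** Toy scorer: fraction of predictions that match the labels. */
    static double accuracy(int[] predicted, int[] actual) {
        int correct = 0;
        for (int i = 0; i < predicted.length; i++) {
            if (predicted[i] == actual[i]) correct++;
        }
        return (double) correct / predicted.length;
    }

    public static void main(String[] args) {
        // Toy spam example: one feature (message length) and a threshold model.
        Vectorizer<String> byLength = text -> new double[] { text.length() };
        Model longIsSpam = features -> features[0] > 10 ? 1 : 0;
        double acc = evaluate(byLength, longIsSpam, SeparatedPipeline::accuracy,
                Arrays.asList("hi", "buy cheap pills now"), new int[] { 0, 1 });
        System.out.println(acc);
    }
}
```

Because evaluate only sees the three interfaces, replacing the vectorizer or the scorer does not require touching the model code.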

Math
● Parametric Models (Matrices!)
● Non Parametric (Random forest)
● Focusing on Matrices (the hard part of ML systems)

Matrices
● NDArrays ( > 2d)
● Tensors (think of pages of matrices)
● Example: a 2 x 2 x 2 tensor is two 2 x 2 matrices
● Applies to graphs w/ sparse representations
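
To make the "pages of matrices" picture concrete, here is a small sketch using plain Java nested arrays. This is an illustration only: the TensorPages class is hypothetical, and it does not use the ND4J NDArray API.

```java
import java.util.Arrays;

final class TensorPages {
    /** A 2 x 2 x 2 tensor; the index order is [page][row][column]. */
    static double[][][] example() {
        return new double[][][] {
            { { 1, 2 }, { 3, 4 } },   // page 0: first 2 x 2 matrix
            { { 5, 6 }, { 7, 8 } }    // page 1: second 2 x 2 matrix
        };
    }

    /** Returns the shape as {pages, rows, columns}. */
    static int[] shape(double[][][] t) {
        return new int[] { t.length, t[0].length, t[0][0].length };
    }

    public static void main(String[] args) {
        double[][][] t = example();
        System.out.println(Arrays.toString(shape(t)));
        System.out.println(t[1][0][1]); // page 1, row 0, column 1
    }
}
```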

Chips/Hardware/Matrices
● CPUs - We work with these
● GPUs - CUDA ditto
● FPGAs
o Intel agreed to buy Altera, an FPGA maker, for about $17 billion this month (June 2015)
o The edge, the cloud

Why New Chips?
● See the numbers yourself:
● http://www.slideshare.net/airbots/cuda-29330283
● http://devblogs.nvidia.com/parallelforall/bidmach-machine-learning-limit-gpus/
● http://jcuda.org

Mixed clusters
● GPUs aren’t good for all workloads
● Because latency
● Need to upload data: not good for small problems
● Mixed CPU/GPU clusters are best bet

Data Pipelines
● More data will be binary
● Frameworks today can’t process binary well
● Binary data has different semantics
● Moving windows for audio
● 3d for images ...
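
As an illustration of the "moving windows for audio" idea, the sketch below slices a one-dimensional signal into fixed-size, possibly overlapping windows. The MovingWindow class and its size and stride parameters are hypothetical names for this example; a real pipeline would also handle decoding and sample rates.

```java
import java.util.Arrays;

final class MovingWindow {
    /**
     * Splits a signal into windows of `size` samples, advancing by `stride` samples.
     * Trailing samples that do not fill a whole window are dropped.
     */
    static double[][] windows(double[] signal, int size, int stride) {
        if (size <= 0 || stride <= 0) {
            throw new IllegalArgumentException("size and stride must be positive");
        }
        if (signal.length < size) {
            return new double[0][];
        }
        int count = (signal.length - size) / stride + 1;
        double[][] out = new double[count][];
        for (int i = 0; i < count; i++) {
            out[i] = Arrays.copyOfRange(signal, i * stride, i * stride + size);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] signal = { 1, 2, 3, 4, 5 };
        // Window of 3 samples, moving one sample at a time.
        System.out.println(Arrays.deepToString(windows(signal, 3, 1)));
    }
}
```

Each window becomes one row of a matrix, which is the shape most models expect as input.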

People Roll Their Own b/c
● Current frameworks assume clean data :(
● Pipelines are brittle, hard to maintain
● Moving towards being composable (reuse)

Dedicated Libraries
● Let’s focus on vectorization -- now!
● Because IoT
● Because more access to raw media
● Should fit into current big data frameworks

Scoring
● AUC
● F1
● Different Loss Functions
● Hyperparameter optimization
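
For reference, F1 and AUC can both be computed from labels and predictions alone, without any knowledge of the model that produced them. The sketch below is one straightforward way to do so; the Scores class is a hypothetical name, labels are assumed to be 0 or 1, and the AUC routine is the simple quadratic pairwise version rather than an optimized one.

```java
final class Scores {
    /** F1 for binary labels (1 = positive): 2TP / (2TP + FP + FN). */
    static double f1(int[] predicted, int[] actual) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < actual.length; i++) {
            if (predicted[i] == 1 && actual[i] == 1) tp++;
            else if (predicted[i] == 1 && actual[i] == 0) fp++;
            else if (predicted[i] == 0 && actual[i] == 1) fn++;
        }
        int denom = 2 * tp + fp + fn;
        return denom == 0 ? 0.0 : 2.0 * tp / denom;
    }

    /**
     * AUC as the fraction of (positive, negative) pairs in which the positive
     * example receives the higher score; ties count as half a win.
     */
    static double auc(double[] scores, int[] actual) {
        double wins = 0;
        long pairs = 0;
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] != 1) continue;
            for (int j = 0; j < actual.length; j++) {
                if (actual[j] != 0) continue;
                pairs++;
                if (scores[i] > scores[j]) wins += 1.0;
                else if (scores[i] == scores[j]) wins += 0.5;
            }
        }
        // Undefined when one of the classes is missing.
        return pairs == 0 ? Double.NaN : wins / pairs;
    }

    public static void main(String[] args) {
        int[] actual = { 1, 0, 1, 0 };
        System.out.println(f1(new int[] { 1, 1, 0, 0 }, actual));
        System.out.println(auc(new double[] { 0.9, 0.8, 0.4, 0.1 }, actual));
    }
}
```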

All independent
● These things work for different models
● Shouldn’t be tied to a particular system
● Should be embeddable

Training
● Split Train/Test
● Sample data (no, not all the data ;) to validate model
● Increasingly compute intensive
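
A train/test split can be sketched as a seeded shuffle of row indices followed by a cut. The TrainTestSplit class, the trainFraction parameter, and the choice to return indices rather than data are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class TrainTestSplit {
    /**
     * Shuffles the row indices 0..rows-1 with a fixed seed and cuts them in two.
     * Returns a list of two lists: train indices first, test indices second.
     */
    static List<List<Integer>> split(int rows, double trainFraction, long seed) {
        if (trainFraction < 0.0 || trainFraction > 1.0) {
            throw new IllegalArgumentException("trainFraction must be in [0, 1]");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            indices.add(i);
        }
        // A fixed seed makes the split reproducible across runs.
        Collections.shuffle(indices, new Random(seed));
        int cut = (int) Math.round(rows * trainFraction);
        List<List<Integer>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(indices.subList(0, cut)));
        parts.add(new ArrayList<>(indices.subList(cut, rows)));
        return parts;
    }

    public static void main(String[] args) {
        List<List<Integer>> parts = split(10, 0.8, 42L);
        System.out.println("train rows: " + parts.get(0));
        System.out.println("test rows:  " + parts.get(1));
    }
}
```

Returning indices keeps the split independent of how the rows are stored, so the same routine works for any vectorized dataset.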

Deep Learning
● Most done in Python...
● Normal training time is measured in hours/days -- weeks!?
● Work being done in HPC (Model parallelism)
● DistBelief (Data parallelism)
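
The idea behind data parallelism can be sketched with a one-parameter linear model: each worker computes a gradient on its own shard of the data, and the gradients are combined into a single update. This is a simplified synchronous averaging scheme run in one thread, assumed here for illustration; it is not a description of how DistBelief itself schedules updates, and the DataParallelStep class is a hypothetical name.

```java
final class DataParallelStep {
    /** Mean gradient of 0.5 * (w * x - y)^2 with respect to w over one shard. */
    static double shardGradient(double w, double[] xs, double[] ys) {
        double sum = 0;
        for (int i = 0; i < xs.length; i++) {
            sum += (w * xs[i] - ys[i]) * xs[i];
        }
        return sum / xs.length;
    }

    /** One synchronous step: average the per-shard gradients, then update w once. */
    static double step(double w, double[][] shardXs, double[][] shardYs, double learningRate) {
        double total = 0;
        for (int s = 0; s < shardXs.length; s++) {
            // In a real cluster each call would run on a separate worker.
            total += shardGradient(w, shardXs[s], shardYs[s]);
        }
        return w - learningRate * total / shardXs.length;
    }

    public static void main(String[] args) {
        // Data generated from y = 2x, split across two shards.
        double[][] xs = { { 1, 2 }, { 3, 4 } };
        double[][] ys = { { 2, 4 }, { 6, 8 } };
        double w = 0;
        for (int i = 0; i < 50; i++) {
            w = step(w, xs, ys, 0.1);
        }
        System.out.println(w); // approaches 2 as the steps converge
    }
}
```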

Automatic Learning
● Good at unstructured data
● Images, Text, Audio and Sensors
● Quick, baseline feature engineering
● Not good at feature introspection

Where Does Scala Fit In?
● Akka - Real-time streaming analytics/micro-services
● Spark - Dataframes/number crunching
● JVM Key/Value Stores
● Pistachio (powers Yahoo’s ad network)
o http://yahooeng.tumblr.com/post/118860853846/distributed-word2vec-on-top-of-pistachio

The Way We Learn Now
● Monolithic ML frameworks
● No per-chip optimizations
● No Tensors (come on guys, it’s 2015...)
● Need isolation and less lock-in
● JVM is the platform to make it happen

Other Links
● http://deeplearning4j.org/
● http://nd4j.org/
● https://github.com/deeplearning4j/Canova

Questions?
● adam@skymind.io
● @agibsonccc
● github.com/agibsonccc

3. Big Data?

7. ML GUIs ● Prediction.io ● Encog

25. Or are they?

26. TSNE
