Overview
● Brief Skymind Intro
● Deep Learning outside research
● Core trends for ROI in deep learning
● Anomaly Detection with deep learning
● Simbox fraud detection for telco
● Network Intrusion
● Fintech securities churn prediction
● Real-time corporate campus security: detecting
dangerous objects
Distributed Deep
RL on Spark
We built
Deeplearning4j
SKYMIND INTELLIGENCE LAYER (SKIL)
REFERENCE ARCHITECTURE
Deep Learning
outside research
● Too much hype
● Most companies rarely do machine learning, let
alone deep learning
● Beginners try to jump to deep learning after
Andrew Ng’s Coursera class without first
principles
This is not deep learning.
This is deep
learning.
Deep Learning
outside research
● Mostly Python and R on Kaggle
● Many learning from Udacity
● Most deep learning is research stage/enthusiast
● Salaried engineers doing DL are mostly publishing
papers
● Large fight for talent (see Google fellowship)
Deep Learning
outside research
● Deep learning hasn’t penetrated the Fortune
2000
● Fortune 2000 wants ROI, not cat pictures
● Many organizations are just NOW starting to take
software seriously, let alone data science
● Use cases for deep learning still not widely
understood
● Large fight for talent (see Google fellowship)
Core trends for
ROI in DL
● Mostly funded by adtech companies
● Companies doing DL have lots of media data
(audio, image, video)
● Many companies using DL for ad targeting
● Best use cases target understanding large-scale
hidden patterns in data (often cross-domain)
● Time series has largely been ignored
Core trends for
ROI in DL
● First attempts at deep learning follow
papers (no other examples)
● Many companies end up sticking to simpler
techniques after trying DL
● Expectations for DL tend to match hype, not
reality
● Some rare cases exist outside this trend (mainly
in Asia)
Core trends for
ROI in DL
For more trends
see:
https://www.oreilly.com/ideas/the-current-state-of-machine-intelligence-3-0
Anomaly
Detection
● “Find the needle in the haystack”
● “Find the bad guy”
● “The machine’s about to break!”
● “Find the next market rally”
● “Take action on said anomaly”
Anomaly
Detection with
deep learning
● Both unsupervised and supervised techniques
● LSTMs (time series neural net)
● Autoencoders (unsupervised)
● Expectations for DL tend to match hype, not
reality
● Some rare cases exist outside this trend (mainly
in Asia)
LSTM
AutoEncoder
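
To make the autoencoder bullet concrete, here is a minimal sketch of anomaly scoring by reconstruction error. It is not Deeplearning4j code and not the speaker's model: it assumes a tiny tied-weight linear autoencoder written in plain Python, synthetic 2-D "normal" data, and a mean-plus-three-standard-deviations threshold. All names (`train_linear_autoencoder`, `is_anomaly`) are hypothetical.

```python
import random

def train_linear_autoencoder(data, lr=0.01, epochs=200):
    """Fit a tied-weight linear autoencoder: code = w.x, reconstruction = w * code."""
    w = [0.5] * len(data[0])  # arbitrary non-zero starting weights (assumption)
    for _ in range(epochs):
        for x in data:
            code = sum(wi * xi for wi, xi in zip(w, x))
            resid = [xi - wi * code for wi, xi in zip(w, x)]
            resid_dot_w = sum(ri * wi for ri, wi in zip(resid, w))
            # Gradient descent step on the squared reconstruction error.
            w = [wi + 2 * lr * (ri * code + resid_dot_w * xi)
                 for wi, ri, xi in zip(w, resid, x)]
    return w

def reconstruction_error(w, x):
    """Squared distance between x and its reconstruction."""
    code = sum(wi * xi for wi, xi in zip(w, x))
    return sum((xi - wi * code) ** 2 for wi, xi in zip(w, x))

# Synthetic "normal" behaviour: points close to the line y = 2x.
rng = random.Random(0)
normal = []
for _ in range(50):
    t = rng.uniform(-1.0, 1.0)
    normal.append([t + rng.gauss(0, 0.05), 2 * t + rng.gauss(0, 0.05)])

w = train_linear_autoencoder(normal)

# Threshold from the training errors (mean + 3 standard deviations, an assumption).
errors = [reconstruction_error(w, x) for x in normal]
mean = sum(errors) / len(errors)
std = (sum((e - mean) ** 2 for e in errors) / len(errors)) ** 0.5
threshold = mean + 3 * std

def is_anomaly(x):
    """Flag points the autoencoder cannot reconstruct well."""
    return reconstruction_error(w, x) > threshold

print(is_anomaly([0.5, 1.0]), is_anomaly([1.0, -1.0]))
```

The idea carries over to deeper autoencoders: the model is trained only on mostly normal data, so inputs it reconstructs poorly are candidates for review.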
Simbox fraud for
telco
● Costs telcos over 3 billion yearly
● Route calls for free over a carrier network
● Need to mine raw call detail records (CDRs) to find
● Find and cluster fraudulent CDRs with
autoencoders (unsupervised)
● Beats current rule-based and supervised
approaches
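
Before CDRs can be fed to an autoencoder they have to become numeric vectors. The sketch below shows one possible way to aggregate raw records into per-SIM features. The record layout `(caller_sim, callee, duration_seconds)` and the chosen features are assumptions for illustration, not the schema or features used in the talk.

```python
from collections import defaultdict

def sim_features(cdrs):
    """Aggregate CDR rows into one feature vector per calling SIM.

    Each vector is [outgoing calls, incoming calls,
                    distinct callees, mean outgoing duration].
    """
    out_calls = defaultdict(int)
    in_calls = defaultdict(int)
    callees = defaultdict(set)
    total_duration = defaultdict(int)
    for caller, callee, duration in cdrs:
        out_calls[caller] += 1
        in_calls[callee] += 1
        callees[caller].add(callee)
        total_duration[caller] += duration
    features = {}
    for sim, n_out in out_calls.items():
        features[sim] = [
            n_out,
            in_calls[sim],
            len(callees[sim]),
            total_duration[sim] / n_out,
        ]
    return features

# Hypothetical records: (caller_sim, callee, duration_seconds).
cdrs = [
    ("box1", "a", 30), ("box1", "b", 40), ("box1", "c", 50),
    ("user1", "user2", 120), ("user2", "user1", 60), ("user1", "user2", 30),
]
features = sim_features(cdrs)
print(features["box1"])
```

Vectors like these would then be scaled and passed to the autoencoder, with the learned codes or reconstruction errors used for clustering.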
Network
Intrusion
● Raw web log traffic
● Detect attacks at points of origin
● Typically supervised learning
● Goal: Classify raw time series to find attacks
● Optional: Detect *kind* of attack
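
Classifying a raw time series usually starts by cutting it into fixed-length windows that a supervised model can consume. The following is a sketch under assumptions: the input is a hypothetical per-second request count derived from web logs, and a window is labelled as an attack if any second inside it is flagged. The function name and the labelling rule are illustrative only.

```python
def sliding_windows(series, labels, size, step=1):
    """Cut a series into overlapping windows with one label per window."""
    windows = []
    for start in range(0, len(series) - size + 1, step):
        window = series[start:start + size]
        # Assumption: a window counts as an attack if any point in it is flagged.
        is_attack = int(any(labels[start:start + size]))
        windows.append((window, is_attack))
    return windows

# Hypothetical requests per second and per-second attack flags.
requests_per_second = [3, 4, 2, 90, 95, 3]
attack_flags = [0, 0, 0, 1, 1, 0]

examples = sliding_windows(requests_per_second, attack_flags, size=3)
for window, label in examples:
    print(window, label)
```

To detect the kind of attack, the binary flag could be replaced by a class index per window.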
Fintech
securities churn
prediction
● Predict when user is going to leave
service
● Using recurrent nets find likelihood of leaving
● Using lift curves identify budget for sending
discounts to percentage of users “worth” saving
● Optional: use autoencoders with k-means to
identify groups of users wanting to leave
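
The lift-curve step can be sketched as follows. The code assumes a model has already produced a churn score per user and that past churn outcomes are known. The scores, outcomes, and the function name `cumulative_lift` are made up for illustration.

```python
def cumulative_lift(scores, churned, fractions=(0.1, 0.2, 0.5, 1.0)):
    """Lift of the top-scored fraction of users over the overall churn rate."""
    ranked = sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)
    base_rate = sum(churned) / len(churned)
    lift = {}
    for fraction in fractions:
        k = max(1, round(fraction * len(ranked)))
        top_rate = sum(outcome for _, outcome in ranked[:k]) / k
        lift[fraction] = top_rate / base_rate
    return lift

# Hypothetical model scores and observed outcomes (1 = user left).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
churned = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift = cumulative_lift(scores, churned)
for fraction, value in lift.items():
    print(f"top {fraction:.0%}: lift {value:.2f}")
```

A lift well above 1 in the top fraction means discounts sent to that group reach far more likely leavers per message than a random mailing, which is what makes it usable for sizing the retention budget.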
Corporate
campus security
● At 30 FPS or more find dangerous objects in a
crowd
● Identify a target object and send immediate
report
● Uses variants of convolutional nets
● Imagine hooking this up to a real camera
Conclusion
● Deep Learning still young
● Many use cases not being tried
● Research is moving faster every year
● Talent still hard to find
● Will become more common with time
Wrangleconf Big Data Malaysia 2016
