Wrangleconf Big Data Malaysia 2016

Overview
● Brief Skymind Intro
● Deep Learning outside research
● Core trends for ROI in deep learning
● Anomaly Detection with deep learning
● Simbox fraud detection for telco
● Network Intrusion
● Fintech securities churn prediction
● Real time corporate campus security: Detecting
dangerous objects

Distributed Deep RL on Spark
We built Deeplearning4j

SKYMIND INTELLIGENCE LAYER (SKIL)
REFERENCE ARCHITECTURE

Deep Learning
outside research
● Too much hype
● Most companies rarely do machine learning, let
alone deep learning
● Beginners try to jump to deep learning after
Andrew Ng’s Coursera class without learning
first principles
This is not deep learning.
This is deep learning.

Deep Learning
outside research
● Mostly Python and R on Kaggle
● Many learning from Udacity
● Most deep learning is at the research or
enthusiast stage
● Salaried engineers doing DL are mostly
publishing papers
● Large fight for talent (see Google fellowship)

Deep Learning
outside research
● Deep learning hasn’t penetrated the Fortune
2000
● Fortune 2000 wants ROI, not cat pictures
● Many organizations are just NOW starting to
take software seriously, let alone data science
● Use cases for deep learning are still not widely
understood
● Large fight for talent (see Google fellowship)

Core trends for
ROI in DL
● Mostly funded by adtech companies
● Companies doing DL have lots of media data
(audio, image, video)
● Many companies use DL for ad targeting
● Best use cases target understanding large-scale
hidden patterns in data (often cross-domain)
● Time series has largely been ignored

Core trends for
ROI in DL
● First attempts at deep learning follow
papers (no other examples)
● Many companies end up sticking to simpler
techniques after trying DL
● Expectations for DL tend to match hype, not
reality
● Some rare cases exist outside this trend (mainly
in Asia)

Core trends for
ROI in DL
For more trends see:
https://www.oreilly.com/ideas/the-current-state-of-machine-intelligence-3-0

Anomaly
Detection
● “Find the needle in the haystack”
● “Find the bad guy”
● “The machine’s about to break!”
● “Find the next market rally”
● “Take action on said anomaly”

Anomaly
Detection with
deep learning
● Both unsupervised and supervised techniques
● LSTMs (time series neural net)
● Autoencoders (unsupervised)
● Expectations for DL tend to match hype, not
reality
● Some rare cases exist outside this trend (mainly
in Asia)
LSTM
AutoEncoder

Simbox fraud for
telco
● Costs telcos over 3 billion yearly
● Fraudsters route calls for free over a carrier
network
● Need to mine raw call detail records (CDRs) to
find fraud
● Find and cluster fraudulent CDRs with
autoencoders (unsupervised)
● Beats current rule-based and supervised
approaches
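
The autoencoder idea above can be sketched in a few lines. This is a minimal illustration, not the production model from the talk: it assumes each CDR has already been summarized as two numeric features (the `normal` and `suspect` values are hypothetical), and it uses a deliberately tiny linear autoencoder with tied weights where a real system would use a deeper network. Records that reconstruct poorly are flagged as anomalies.

```python
import math

def train_linear_autoencoder(rows, epochs=300, lr=0.01):
    """Fit a 2-input, 1-hidden-unit linear autoencoder (tied weights) by SGD."""
    w = [0.5, 0.5]  # arbitrary starting weights
    for _ in range(epochs):
        for x in rows:
            h = w[0] * x[0] + w[1] * x[1]            # encode to a single number
            r = [x[0] - w[0] * h, x[1] - w[1] * h]   # reconstruction residual
            rw = r[0] * w[0] + r[1] * w[1]
            # gradient step on the squared reconstruction error
            w = [w[j] + lr * 2.0 * (h * r[j] + rw * x[j]) for j in range(2)]
    return w

def reconstruction_error(w, x):
    """Squared distance between a record and its reconstruction."""
    h = w[0] * x[0] + w[1] * x[1]
    return (x[0] - w[0] * h) ** 2 + (x[1] - w[1] * h) ** 2

# Hypothetical feature pairs: normal records follow one pattern, the suspect does not.
normal = [(t / 10.0, 2.0 * t / 10.0 + 0.05 * math.sin(7 * t)) for t in range(-10, 11)]
suspect = (1.0, -2.0)

w = train_linear_autoencoder(normal)
threshold = max(reconstruction_error(w, x) for x in normal)  # assumed cutoff rule
suspect_error = reconstruction_error(w, suspect)
is_anomaly = suspect_error > threshold
```

Using the largest training error as the cutoff is only one possible choice; a percentile of the training errors is a common alternative when the training data itself may contain some fraud.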

Network
Intrusion
● Raw web log traffic
● Detect attacks at points of origin
● Typically supervised learning
● Goal: Classify raw time series to find attacks
● Optional: Detect *kind* of attack
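
Before a supervised model can classify a raw time series, the series is usually cut into fixed-width windows with one label each. The sketch below shows that preparation step only; the function name `make_windows`, the requests-per-second feature, and the "any flagged step marks the window" labeling rule are assumptions for illustration, not details from the talk.

```python
def make_windows(series, attack_flags, width, step=1):
    """Slice a series into fixed-width windows.

    A window is labeled 1 if any time step inside it is flagged as an attack.
    """
    windows, labels = [], []
    for start in range(0, len(series) - width + 1, step):
        end = start + width
        windows.append(series[start:end])
        labels.append(1 if any(attack_flags[start:end]) else 0)
    return windows, labels

# Hypothetical traffic counts derived from web logs, with known attack steps.
requests_per_second = [3, 4, 3, 5, 90, 95, 4, 3]
attack_flags = [0, 0, 0, 0, 1, 1, 0, 0]

windows, labels = make_windows(requests_per_second, attack_flags, width=3)
```

Each `(window, label)` pair can then be fed to a sequence classifier such as a recurrent net; replacing the 0/1 label with an attack-type label covers the optional "kind of attack" case.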

Fintech
securities churn
prediction
● Predict when user is going to leave
service
● Use recurrent nets to find the likelihood of
leaving
● Use lift curves to identify the budget for sending
discounts to the percentage of users “worth” saving
● Optional: use autoencoders with k-means to
identify groups of users wanting to leave
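
A lift curve compares the churn rate among the top-scored users with the overall churn rate, which is what links model scores to a discount budget. The following is a minimal sketch under stated assumptions: the scores stand in for the output of a churn model, and the function name `cumulative_lift` and the sample data are hypothetical.

```python
def cumulative_lift(scores, churned, fractions=(0.1, 0.2, 0.5, 1.0)):
    """Lift at each fraction: churn rate in the top-scored slice / overall churn rate."""
    ranked = sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)
    base_rate = sum(churned) / len(churned)
    lift = {}
    for fraction in fractions:
        k = max(1, round(fraction * len(ranked)))   # users contacted at this budget
        hit_rate = sum(label for _, label in ranked[:k]) / k
        lift[fraction] = hit_rate / base_rate
    return lift

# Hypothetical model scores and observed outcomes (1 = user left the service).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
churned = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift = cumulative_lift(scores, churned)
```

A lift well above 1 for a small fraction suggests that discounts sent to only that slice reach most of the likely leavers; a lift near 1 means targeting is no better than contacting users at random.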

Corporate
campus security
● Find dangerous objects in a crowd at 30 FPS
or more
● Identify a target object and send an immediate
report
● Uses variants of convolutional nets
● Imagine hooking this up to a real camera
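
The 30 FPS target translates into a per-frame time budget of roughly 33 ms for the whole detection step. A small sketch of that arithmetic follows; the function names and latency figures are hypothetical, and a real pipeline would measure latency on the actual camera feed.

```python
def frame_budget_ms(fps):
    """Time available to process one frame, in milliseconds."""
    return 1000.0 / fps

def sustains_fps(per_frame_ms, fps):
    """True if the average per-frame processing time fits within the budget."""
    return sum(per_frame_ms) / len(per_frame_ms) <= frame_budget_ms(fps)

# Hypothetical measured detection latencies, in milliseconds per frame.
fast_model = [20.0, 25.0, 30.0]
slow_model = [40.0, 45.0]

fast_ok = sustains_fps(fast_model, fps=30)
slow_ok = sustains_fps(slow_model, fps=30)
```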

Conclusion
● Deep learning is still young
● Many use cases not being tried
● Research is moving faster every year
● Talent still hard to find
● Will become more common with time
