Leonard Austin (Ravelin) - DevOps in a Machine Learning World

As machine learning moves from niche to mainstream tech stacks, how do DevOps engineers prepare for a very different set of problems? A brief look at the new issues that arise from machine learning, an overview of cutting-edge "old school" solutions, and how to drag data science (kicking and screaming) into a world of automation.

Video: https://www.youtube.com/watch?v=KHxZCRajRiA

Join DevOps Exchange London here: http://meetup.com/DevOps-Exchange-London/
Follow DOXLON on twitter http://www.twitter.com/doxlon

  1. DevOps in a Machine Learning World (@leonardaustin)
  2. As machine learning moves from niche to mainstream tech stacks, how do DevOps engineers prepare for a very different set of problems? A brief look at the new issues that arise from machine learning, an overview of cutting-edge "old school" solutions, and how to drag data science (kicking and screaming) into a world of automation.
  3. Leonard Austin: Cofounder at Ravelin; CTO, Software Engineer, DevOps, Recruiter...; @leonardaustin
  4. Ravelin: Fraud Detection. Ravelin examines your visitor and payment data in real time, telling your systems which customers are fraudsters. We use Machine Learning, Rule Engines, Graph Networks and Industry Expertise to respond with scores in milliseconds. Perfect for an on-demand world. Raised $2m last year. Fintech. Hiring.
  5. Fraud?
  6. $14B lost to fraud. Growing rapidly as fraudsters move online.
  7. Detection is Hard
  8. One fraudster leads to lots of cost
  9. 3D Secure
  10. 3D Secure
  11. Kills Conversion
  12. Stack: Go + Python; AWS; Microservices; Storage: Cassandra, Postgres, Elasticsearch, Redis, Graph Database X, ZooKeeper; Queue: NSQ, Kinesis; Instrumentation: InfluxDB, Grafana; Docker - but only for local dev
  13. Doing Things The Right Way: Terraform; 100% Automation; Horizontally Scalable; Continuous Integration; No need for SSH access; 100% Visibility - Metrics & Logs
  14. Servers & Microservices
  15. Servers & Microservices: “Livestock, not pets. It gets sick, terminate it” - DevOps guy on the internet
  16. Machine Learning Challenges: > Data Warehousing; Resource on Demand; Deploy; Hardware Requirement; Life Cycle (Explore, Train, Deploy)
  17. Data Warehousing: What? Why we need it for Ravelin. How much data?
  18. $10m: IBM, Oracle, Microsoft (v1)
  19. $1m: Massively Parallel Processing (MPP): IBM, Oracle, Microsoft, Teradata, Vertica, Greenplum (v1.5)
  20. $200k: Hadoop MapReduce, Spark, Hive, Impala (v2)
  21. $500: BigQuery (v3)
  22. $5.00: BigQuery per terabyte
  23. We ♡ BigQuery: Costs - $5 per terabyte, 5c per range query per terabyte; Managed - but no reserve compute resources needed!; Distributed columns easily append; Dataflow; Restriction: Can’t Update; No Indexes
  24. Probably need to mention AWS Redshift
  25. Stack: Go + Python; AWS & Google Cloud Platform; Microservices; DB: Cassandra, Postgres, Elasticsearch, Redis, Graph Databases, ZooKeeper; Queue: NSQ, Kinesis, Google Pub/Sub; Warehouse: BigQuery, Dataflow
  26. Machine Learning Challenges: Data Warehousing; > Resource on Demand; Deploy; Hardware Requirement; Life Cycle (Explore, Train, Deploy)
  27. Work on the Cloud! “Stephen’s laptop was measurably heavier because of the amount of data he had on it. We asked him nicely to move everything to the cloud and now the internet is a little heavier” - Science 2016
  28. Data: “Single point of success” - Jose, CTO Hailo, 2014; AWS: 32 Cores, 244GB RAM; Google Cloud Platform: 32 Cores, 208GB RAM; Azure: 16 Cores, 112GB RAM
  29. Machine Learning Challenges: Data Warehousing; Resource on Demand; > Deploy; Hardware Requirement; Life Cycle (Explore, Train, Deploy)
  30. Deploying Models: Train - sample; Pickle; S3; Deploy; Simple
  31. Hardware - GPUs: Specific for Deep Learning; AWS have a GPU machine but $$$; No virtualization; Buy and build your own server; Q. How Deep is your problem? Speech, Video, Images
  32. Summary: Data Warehousing: BigQuery, Dataflow; On Demand Resource: 1 Machine (because clustering is expensive), Big Machines on the Cloud, Persistent Volumes on Google Cloud Compute
  33. Hiring Smart People: DevOps - Mid Level & Senior; Data Scientist - Junior & Mid Level; Software Engineer - Junior, Mid Level & Senior; Product Owner
  34. Thanks: @leonardaustin; @ravelinhq; ravelin.com; leonard.austin@ravelin.com; Remember we are hiring
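
Slide 30 outlines the deploy flow as train on a sample, pickle, store in S3, deploy. The slides do not show any code, so the following is only a minimal sketch of that flow under stated assumptions: the toy `train` and `predict` functions, the `publish` and `load` helpers, the object key, and the bucket argument are all hypothetical names introduced for illustration, and the "model" is a plain dict of learned parameters rather than a real classifier. The S3 helpers use the `boto3` client calls `put_object` and `get_object`, imported lazily so the rest of the sketch runs without `boto3` installed.

```python
import pickle
import statistics
from typing import Callable, Dict, List

Model = Dict[str, float]  # hypothetical: a model is just its learned parameters


def train(sample: List[float]) -> Model:
    """Toy training step: learn a cutoff from a sample of order amounts."""
    cutoff = statistics.mean(sample) + 2 * statistics.pstdev(sample)
    return {"threshold": cutoff}


def predict(model: Model, amount: float) -> bool:
    """Flag an amount as suspicious if it exceeds the learned cutoff."""
    return amount > model["threshold"]


def publish(model: Model, put: Callable[[str, bytes], None], key: str) -> None:
    """Pickle the trained model and hand the bytes to a storage backend."""
    put(key, pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL))


def load(get: Callable[[str], bytes], key: str) -> Model:
    """Fetch and unpickle a model. Only unpickle data you trust:
    pickle can execute arbitrary code while loading."""
    return pickle.loads(get(key))


def s3_put(bucket: str) -> Callable[[str, bytes], None]:
    """Storage backend writing to S3 (assumes boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily on purpose

    client = boto3.client("s3")

    def put(key: str, blob: bytes) -> None:
        client.put_object(Bucket=bucket, Key=key, Body=blob)

    return put


def s3_get(bucket: str) -> Callable[[str], bytes]:
    """Storage backend reading from S3 (assumes boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily on purpose

    client = boto3.client("s3")

    def get(key: str) -> bytes:
        return client.get_object(Bucket=bucket, Key=key)["Body"].read()

    return get


if __name__ == "__main__":
    # Local dry run with an in-memory dict standing in for the bucket.
    store: Dict[str, bytes] = {}
    publish(train([10.0, 12.0, 14.0]), store.__setitem__, "models/v1.pkl")
    served = load(store.__getitem__, "models/v1.pkl")
    print(predict(served, 1000.0))
```

Passing the storage backend in as a function keeps the training job and the scoring service decoupled from S3, so the same code can be exercised locally with a dict and pointed at a real bucket with `s3_put("my-bucket")` and `s3_get("my-bucket")`, where the bucket name is a placeholder.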
