Deploying Real-Time Decision Services Using Redis with Tague Griffith

Tague Griffith, Redis Labs
#MLSAIS12

Machine learning: teaching a computer, by example, an algorithm that is too complex to program

Machine Learning Problems
Classification: Pick One of a Set
• Examples: Spam Detection, Manufacturing defect detection, Handwriting analysis
• Algorithms: Decision Trees, Naïve Bayes, Logistic Regression
Regression: Score or Rank
• Examples: Recommendations, Likelihood of Purchase
• Algorithms: Linear Regression, SVM
Clustering: Group Similar
• Examples: Find Similar Items, Customer segmentation, Cohort detection
• Algorithms: K-Means, K-Nearest Neighbors, Hierarchical Clustering

Supervised Learning – Training a Spam Classifier

Deploying a Spam Classifier

How Do We Build These Boxes?
¯\_(ツ)_/¯

• Building high-performance, reliable services is hard. Isn't there something we can just deploy?

Typical Spark Application Structure
• Spark Training: data is loaded into Spark
• File System: model is saved in files
• Custom Server (serving): model is loaded into your custom app
• Client App (client): sends requests to the custom server
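The file-based handoff on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the "model" is a hypothetical set of linear coefficients, JSON stands in for whatever format the training job writes, and the function names (train, save_model, load_model, predict) are made up for this example.

```python
import json
import os
import tempfile


def train():
    # Hypothetical training result: a tiny linear model.
    return {"intercept": 0.5, "weights": [2.0, -1.0]}


def save_model(model, path):
    # Training side: write the model to the file system.
    with open(path, "w") as f:
        json.dump(model, f)


def load_model(path):
    # Custom server side: load the model file at startup.
    with open(path) as f:
        return json.load(f)


def predict(model, features):
    # Serving: intercept plus the weighted sum of the features.
    return model["intercept"] + sum(
        w * x for w, x in zip(model["weights"], features)
    )


with tempfile.TemporaryDirectory() as workdir:
    model_path = os.path.join(workdir, "model.json")
    save_model(train(), model_path)
    served_model = load_model(model_path)
    score = predict(served_model, [1.0, 3.0])  # 0.5 + 2.0 - 3.0 = -0.5
```

Every piece after save_model (loading, request handling, scaling, availability) is code the team has to build and operate itself in this structure.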

Redis-ML: Predictive Model Serving Engine
• Predictive models as native Redis types
• Perform evaluation directly in Redis
• Store training output as “hot model”
Data flow:
• Spark Training (or any training platform): data is loaded into Spark
• Redis-ML: model is saved in Redis-ML
• Serving: multiple client apps query Redis-ML directly

REmote DIctionary Server
Data types: Strings, Hashes, Lists, Sets, Sorted Sets, Bitmaps, Bitfields, HyperLogLogs, Geospatial

A Quick Recap of Redis
Each key maps to a value of one of these types:
• Strings / Bitmaps / BitFields: "I'm a Plain Text String!"
• Hash Tables (objects!): { A: “foo”, B: “bar”, C: “baz” }
• Linked Lists: [ A → B → C → D → E ]
• Sets: { A , B , C , D , E }
• Sorted Sets: { A: 0.1, B: 0.3, C: 100, D: 1337 }
• Geo Sets: { A: (51.5, 0.12), B: (32.1, 34.7) }
• HyperLogLog: 00110101 11001110 10101010

Redis Modules
• Any C/C++ program can now run on Redis
• Use existing data structures or add new ones
• Enjoy simplicity, infinite scalability and high availability while
keeping the native speed of Redis
• Can be created by anyone
New Capabilities
New Commands
New Data Types

Redis ML Module
• Tree Ensembles
• Linear Regression
• Logistic Regression
• Matrix + Vector Operations
• More to come...

Random Forest Model
• A collection of decision trees
• Supports classification & regression
• A splitter node can be:
◦ Categorical (e.g. day == “Sunday”)
◦ Numerical (e.g. age < 43)
• The decision is made by majority vote of the decision trees
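A minimal sketch of how such a forest could be evaluated, in plain Python. This illustrates the idea only and is not Redis-ML's implementation: the Leaf and Split classes, the "yes"/"no" labels, and the three toy trees are hypothetical, built from the two splitter examples on this slide.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    prediction: str


@dataclass
class Split:
    attr: str        # feature name to test
    kind: str        # "categorical" or "numerical"
    value: object    # category to match, or numeric threshold
    if_true: "Node"
    if_false: "Node"


Node = Union[Leaf, Split]


def evaluate_tree(node, features):
    """Walk one tree from the root to a leaf and return its prediction."""
    while isinstance(node, Split):
        if node.kind == "categorical":
            passed = features[node.attr] == node.value
        else:
            passed = features[node.attr] < node.value
        node = node.if_true if passed else node.if_false
    return node.prediction


def classify(forest, features):
    """Return the class predicted by the majority of the trees."""
    votes = Counter(evaluate_tree(tree, features) for tree in forest)
    return votes.most_common(1)[0][0]


# Three toy trees (hypothetical), using day == "Sunday" and age < 43.
forest = [
    Split("day", "categorical", "Sunday", Leaf("yes"), Leaf("no")),
    Split("age", "numerical", 43, Leaf("yes"), Leaf("no")),
    Split("age", "numerical", 43,
          Split("day", "categorical", "Sunday", Leaf("yes"), Leaf("no")),
          Leaf("no")),
]

sunday_50 = classify(forest, {"day": "Sunday", "age": 50})
monday_30 = classify(forest, {"day": "Monday", "age": 30})
```

With an odd number of trees and two classes there is always a strict majority; a real implementation also needs a rule for ties.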

Classic Tree Problem: Titanic Survival
[Figure: decision tree with splitter nodes "Sex = Male?", "Age < 9.5?" and "Sibps > 2.5?", YES/NO branches, and "Survived" / "Died" leaves]
• Passenger data encoded as a feature vector
• ML algorithm learns the tree rules
◦ ID3, CART (RPART), etc.
• Tree rules used to infer results

Titanic Survival: Random Forest
[Figure: three decision trees, each with YES/NO branches and "Survived" / "Died" leaves]
• Tree #1: Sex = Male? / Age < 9.5? / *Sibps > 2.5?
• Tree #2: Country = US? / State = CA? / Height > 1.60m?
• Tree #3: Weight < 80kg? / I.Q < 100? / Eye color = blue?

Who Would Survive the Titanic?
• John:
◦ Male, 34
◦ Married w/ 2 kids (Sibps=3)
◦ New York, USA
◦ 1.78m, 78kg
◦ 110 IQ
◦ Blue eyes
• Mathew:
◦ Male, 6
◦ 3 sisters (Sibps=3)
◦ New York, USA
◦ 1.06m, 22.7kg
◦ 100 IQ
◦ Brown eyes
Let's use our forest to find out

Redis: Forest Data Type
Add nodes to a tree in a forest:
ML.FOREST.ADD <forestId> <treeId> <path>
    [ [NUMERIC|CATEGORIC] <splitterAttr> <splitterVal> ] |
    [LEAF] <predVal>
Perform classification/regression of a feature vector:
ML.FOREST.RUN <forestId> <features>
    [CLASSIFICATION|REGRESSION]
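A hedged sketch of how a client might assemble these commands for Tree #1. It only builds argument lists and sends nothing. Several details are assumptions not stated on the slide and should be checked against the Redis-ML documentation: the path notation ("." for the root, "l" / "r" appended per level), which branch a matching split takes, the numeric leaf values (1 = survived, 0 = died), the leaf assignments themselves, and the "name:value" comma-separated feature string. The helper names are hypothetical.

```python
def forest_add_commands(forest_id, tree_id, nodes):
    """Build one ML.FOREST.ADD argument list per node.

    nodes is a list of (path, spec) pairs, where spec is either
    ("NUMERIC" | "CATEGORIC", attr, value) or ("LEAF", prediction).
    """
    return [
        ["ML.FOREST.ADD", forest_id, str(tree_id), path, *map(str, spec)]
        for path, spec in nodes
    ]


def forest_run_args(forest_id, features, mode="CLASSIFICATION"):
    """Build the ML.FOREST.RUN argument list for one feature vector."""
    # Assumption: features are encoded as "name:value" pairs joined by commas.
    feature_str = ",".join(f"{name}:{value}" for name, value in features.items())
    return ["ML.FOREST.RUN", forest_id, feature_str, mode]


# Tree #1 from the slides. Paths and leaf values are assumptions (see above).
TREE_1 = [
    (".", ("CATEGORIC", "sex", "male")),
    (".l", ("NUMERIC", "age", 9.5)),
    (".r", ("LEAF", 1)),
    (".ll", ("NUMERIC", "sibps", 2.5)),
    (".lr", ("LEAF", 0)),
    (".lll", ("LEAF", 1)),
    (".llr", ("LEAF", 0)),
]

add_commands = forest_add_commands("titanic", 0, TREE_1)
run_john = forest_run_args("titanic", {"sex": "male", "age": 34, "sibps": 3})

# To send these, a redis-py client connected to a server with the module
# loaded could call: client.execute_command(*command)
```

Keeping command construction separate from the network call makes the encoding easy to inspect and test before anything is written to Redis.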

Real World Challenge
• Ad serving company
• Need to serve 20,000 ads/sec @ 50 ms data-center latency
• Runs 1K campaigns → 1K random forests
• Each forest has 15K trees
• On average each tree has 7 levels (depth)
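A back-of-the-envelope estimate of the work these numbers imply. The figures come from the slide; the assumptions that each ad request evaluates exactly one forest, that a tree walk visits one node per level, and that trees are full binary trees are mine.

```python
ads_per_second = 20_000
forests = 1_000
trees_per_forest = 15_000
levels_per_tree = 7

# Assumption: one forest is evaluated per ad request.
tree_walks_per_second = ads_per_second * trees_per_forest        # 300,000,000

# Assumption: each walk visits one node per level.
node_visits_per_second = tree_walks_per_second * levels_per_tree  # 2,100,000,000

# Assumption: full binary trees, so 2**7 - 1 = 127 nodes per tree.
nodes_per_tree = 2 ** levels_per_tree - 1
total_nodes_stored = forests * trees_per_forest * nodes_per_tree  # 1,905,000,000
```

Under these assumptions the service performs roughly two billion node visits per second over roughly two billion stored nodes.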

Ad Serving Costs: Homegrown vs. Redis
• Homegrown: 1,247 x c4.8xlarge
• Redis: 35 x c4.8xlarge
Cut computing infrastructure by 97%
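The 97% figure follows from the two instance counts, assuming cost scales with the number of instances (both sides use the same instance type):

```python
homegrown_instances = 1_247
redis_instances = 35

# Fraction of instances no longer needed.
reduction = 1 - redis_instances / homegrown_instances

# How many homegrown instances each Redis instance replaces.
consolidation_ratio = homegrown_instances / redis_instances

print(f"{reduction:.1%} fewer instances")  # 97.2% fewer instances
```

That is roughly a 35-to-1 consolidation.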

Summary
• Train with Spark, Serve with Redis
• 97% lower resource cost for serving
• Simplify ML lifecycle
• Redise (Cloud or Pack):
‒Scaling, HA, Performance
‒PAYG – cost optimized
‒Ease of use
‒Supported by the teams who created Spark and Redis
