SlideShare a Scribd company logo
1 of 32
Download to read offline
Boosting Machine Learning with Redis
Modules and Spark
Dvir Volk, Redis Labs, November 2016
2
Hello World
Open source. The leading in-memory database
The open source home and commercial provider of
Redis - cloud and on-premise
Senior System Architect at Redis Labs. Redis user
and contributor for ~6 years
@dvirsky
dvirvolk
3
A Brief Overview of Redis
● Started in 2009 by Salvatore Sanfilippo
● Mostly a one man show
● Most popular KV store
● Notable Users:
○ Twitter, Netflix, Uber, Groupon, Twitch
○ Many, many more...
4
A Brief Overview of Redis
▪ Key => Data Structure server
▪ In memory disk backed
▪ Optional cluster mode
▪ Embedded Lua scripting
▪ Single Threaded!
▪ Key features: Fast, Flexible, Simple
5
A Lego For Your Database
Key
"I'm a Plain Text String!"
{ A: “foo”, B: “bar”, C: “baz” }
Strings/Blobs/Bitmaps
Hash Tables (objects!)
Linked Lists
Sets
Sorted Sets
Geo Sets
HyperLogLog
{ A , B , C , D , E }
[ A → B → C → D → E ]
{ A: 0.1, B: 0.3, C: 100, D: 1337 }
{ A: (51.5, 0.12), B: (32.1, 34.7) }
00110101 11001110 10101010
6
Redis In Practice
▪ “Front End Database”
▪ Real Time Counters
▪ Ad Serving
▪ Message Queues
▪ Geo Database
▪ Time Series
▪ Cache
▪ Session State
▪ Etc
7
But Can Redis Do X?
Secondary Index?
Time Series?
Full Text Search?
Graph?
Machine Learning?
AutoComplete?
SQL?
8
So You Want a New Feature?
▪ Try a Lua script
▪ Convince @antirez
▪ Fork Redis
▪ Build Your Own Database!
9
Enter Redis Modules
▪ In development since March 2016
▪ Redis 4.0 RC out soon
▪ Several modules already exist
▪ Key paradigm shift for Redis
10
New Capabilities
What Modules Actually Are
▪ Dynamic libraries loaded to redis
▪ Written in C/C++
▪ Use a C ABI/API isolating redis internals
▪ Near Zero latency access to data
New Commands
New Data Types
11
Obligatory Module Example
12
LEFTPAD Example
127.0.0.1:6379> MODULE LOAD "./example.so"
OK
127.0.0.1:6379> COMMAND INFO EXAMPLE.LEFTPAD
1) 1) "example.leftpad"
...
127.0.0.1:6379> EXAMPLE.LEFTPAD "foo" 8
foo
127.0.0.1:6379> EXAMPLE.LEFTPAD "foo" 8 "_"
_____foo
13
Real Module: RediSearch
▪ From-Scratch search index over redis
▪ Uses Strings for holding compressed index data
▪ Includes stemming, exact phrase match, etc.
▪ Fast Fuzzy Auto-complete
▪ Up to X5 faster than Elastic / Solr
> FT.SEARCH “lcd tv” FILTER price 100 +inf
> FT.SUGGET “lcd” FUZZY
14
More Modules Out There
▪ Native JSON Support
▪ Time Series
▪ Secondary Indexing
▪ Encryption
▪ Bloom Filters
▪ Online Neural Network
▪ Many Many more...
15
Spark ML + Redis modules
16
Redis + Spark So Far
▪ Current connector:
- RDD abstraction
- SparkSQL
- Streaming Source
▪ ML is not addressed specifically
▪ Used for pre-computed results
▪ We felt that we can take it further
17
Addressing The ML Pain
▪ The missing piece of ML: Serving your model
- Not standardized
- Vendor-lock with cloud platforms
- Reliable services are hard to do
- If only we had a “database” for this!
- Well, maybe we do?
18
Why Modules for ML?
With modules we can:
▪ Define data structures for models
▪ Store training output as “hot model”
▪ Perform evaluation directly in Redis
▪ Easily integrate existing C/C++ libs
19
Spark + Modules = AWESOME
▪ Train ML model on Spark
▪ Save model to Redis and get:
- High availability
- Clustering
- Persistence
- Performance
- Client libraries
20
Spark-ML End-to-End Flow
Spark Training
Custom Server
Model saved to
Parquet file
Data Loaded
to Spark
Pre-computed
results
Batch Evaluation
?
ClientApp
21
Adding Redis Into The Mix
Redis-ML “Active Model”
Any Training Platform
ClientApp
Spark Training
Data Loaded
to Spark
22
Redis Module
Tree Ensembles
Linear Regression
Logistic Regression
Matrix + Vector Operations
More to come...
The Redis-ML Module
23
Example: Random Forest
24
Forest Data Type
▪ A collection of decision trees
▪ Supports classification & regression
▪ Splitter Node can be
- Categorical (e.g. day == “Sunday”)
- Numerical (e.g. age < 43)
25
Decision Tree Example
The famous Titanic survival predictor
sex=male?yes no
Survived
Died
Age > 9.5?
sibsp > 2.5?
Died Survived *sibsp = siblings + spouses
26
Forest Data Type Example
> MODULE LOAD "./redis-ml.so"
OK
> ML.FOREST.ADD myforest 0 . CATEGORIC sex “male” .L
LEAF 1 .R LEAF 0
OK
> ML.FOREST.RUN myforest sex:male
"1"
> ML.FOREST.RUN myforest sex:yes_please
"0"
27
Using Redis-ML With Spark
scala> import com.redislabs.client.redisml.MLClient
scala> import com.redislabs.provider.redis.ml.Forest
scala> val rfModel =
pipelineModel.stages.last.asInstanceOf[RandomForestClassificationModel]
scala> val f = new Forest(rfModel.trees)
scala> f.loadToRedis("forest-test", "localhost")
scala> val jedis = new Jedis("localhost")
scala> jedis.getClient.sendCommand(MLClient.ModuleCommand.FOREST_RUN,
"forest-test", makeInputString (0))
scala> jedis.getClient.getStatusCodeReply
res53: String = 1
28
Benchmarking Redis-ML
- Spark + Parquet Spark + Redis ML
Model Preparation + Save 3785ms 292ms
Model Load 2769ms 0ms (model is on memory)
Classification (AVG) 13ms 1ms
● Forest size: 15000 trees
● Data: $(SPARK_HOME)/data/mllib/sample_libsvm_data.txt
29
Going Forward - More Features
▪ Implement more Spark-ML model types
- SVM
- Naive Bayes Classifier
- Neural Networks
▪ Integration with Redis’ native types
▪ Data Processing (e.g. Word2Vec, TF-IDF)
▪ PMML Support
30
PS: Neural Redis
▪ Developed by Salvatore
▪ Training is done inside redis
▪ Online continuous training process
▪ Builds Fully Connected NNs
31
More Resources
Redis-ML:
https://github.com/RedisLabsModules/redis-ml
Spark-Redis-ML:
https://github.com/RedisLabs/spark-redis-ml
Neural-Redis:
https://github.com/antirez/neural-redis
32

More Related Content

What's hot

Redis modules 101
Redis modules 101Redis modules 101
Redis modules 101Dvir Volk
 
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage TieringHadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage TieringErik Krogen
 
Redis overview for Software Architecture Forum
Redis overview for Software Architecture ForumRedis overview for Software Architecture Forum
Redis overview for Software Architecture ForumChristopher Spring
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergendistributed matters
 
Key-Value-Stores -- The Key to Scaling?
Key-Value-Stores -- The Key to Scaling?Key-Value-Stores -- The Key to Scaling?
Key-Value-Stores -- The Key to Scaling?Tim Lossen
 
Introduction to redis
Introduction to redisIntroduction to redis
Introduction to redisTanu Siwag
 
Background Tasks in Node - Evan Tahler, TaskRabbit
Background Tasks in Node - Evan Tahler, TaskRabbitBackground Tasks in Node - Evan Tahler, TaskRabbit
Background Tasks in Node - Evan Tahler, TaskRabbitRedis Labs
 
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...Lucidworks
 
Caching solutions with Redis
Caching solutions   with RedisCaching solutions   with Redis
Caching solutions with RedisGeorge Platon
 
A simple introduction to redis
A simple introduction to redisA simple introduction to redis
A simple introduction to redisZhichao Liang
 
Redis memcached pdf
Redis memcached pdfRedis memcached pdf
Redis memcached pdfErin O'Neill
 
Redis Functions, Data Structures for Web Scale Apps
Redis Functions, Data Structures for Web Scale AppsRedis Functions, Data Structures for Web Scale Apps
Redis Functions, Data Structures for Web Scale AppsDave Nielsen
 
Scaling php applications with redis
Scaling php applications with redisScaling php applications with redis
Scaling php applications with redisjimbojsb
 
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby NodeHadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby NodeErik Krogen
 
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSCassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSDataStax Academy
 
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...DataStax
 

What's hot (20)

Redis modules 101
Redis modules 101Redis modules 101
Redis modules 101
 
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage TieringHadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
 
Redis overview for Software Architecture Forum
Redis overview for Software Architecture ForumRedis overview for Software Architecture Forum
Redis overview for Software Architecture Forum
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
 
Key-Value-Stores -- The Key to Scaling?
Key-Value-Stores -- The Key to Scaling?Key-Value-Stores -- The Key to Scaling?
Key-Value-Stores -- The Key to Scaling?
 
Redis and it's data types
Redis and it's data typesRedis and it's data types
Redis and it's data types
 
Introduction to redis
Introduction to redisIntroduction to redis
Introduction to redis
 
Background Tasks in Node - Evan Tahler, TaskRabbit
Background Tasks in Node - Evan Tahler, TaskRabbitBackground Tasks in Node - Evan Tahler, TaskRabbit
Background Tasks in Node - Evan Tahler, TaskRabbit
 
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
 
Caching solutions with Redis
Caching solutions   with RedisCaching solutions   with Redis
Caching solutions with Redis
 
A simple introduction to redis
A simple introduction to redisA simple introduction to redis
A simple introduction to redis
 
Redis tutoring
Redis tutoringRedis tutoring
Redis tutoring
 
Redis memcached pdf
Redis memcached pdfRedis memcached pdf
Redis memcached pdf
 
Redis Functions, Data Structures for Web Scale Apps
Redis Functions, Data Structures for Web Scale AppsRedis Functions, Data Structures for Web Scale Apps
Redis Functions, Data Structures for Web Scale Apps
 
Scaling php applications with redis
Scaling php applications with redisScaling php applications with redis
Scaling php applications with redis
 
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby NodeHadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
 
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSCassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
 
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
 
Introduction to redis
Introduction to redisIntroduction to redis
Introduction to redis
 

Similar to Boosting Machine Learning with Redis Modules and Spark

Spark Summit EU talk by Shay Nativ and Dvir Volk
Spark Summit EU talk by Shay Nativ and Dvir VolkSpark Summit EU talk by Shay Nativ and Dvir Volk
Spark Summit EU talk by Shay Nativ and Dvir VolkSpark Summit
 
Cost Savings at High Performance with Redis Labs and AWS
Cost Savings at High Performance with Redis Labs and AWSCost Savings at High Performance with Redis Labs and AWS
Cost Savings at High Performance with Redis Labs and AWSAmazon Web Services
 
What's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis LabsWhat's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis LabsRedis Labs
 
Redis for Security Data : SecurityScorecard JVM Redis Usage
Redis for Security Data : SecurityScorecard JVM Redis UsageRedis for Security Data : SecurityScorecard JVM Redis Usage
Redis for Security Data : SecurityScorecard JVM Redis UsageTimothy Spann
 
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...Yahoo Developer Network
 
Hadoop Spark - Reuniao SouJava 12/04/2014
Hadoop Spark - Reuniao SouJava 12/04/2014Hadoop Spark - Reuniao SouJava 12/04/2014
Hadoop Spark - Reuniao SouJava 12/04/2014soujavajug
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisItamar Haber
 
Jump Start on Apache Spark 2.2 with Databricks
Jump Start on Apache Spark 2.2 with DatabricksJump Start on Apache Spark 2.2 with Databricks
Jump Start on Apache Spark 2.2 with DatabricksAnyscale
 
Real time Object Detection and Analytics using RedisEdge and Docker
Real time Object Detection and Analytics using RedisEdge and DockerReal time Object Detection and Analytics using RedisEdge and Docker
Real time Object Detection and Analytics using RedisEdge and DockerAjeet Singh Raina
 
Redis Modules - Redis India Tour - 2017
Redis Modules - Redis India Tour - 2017Redis Modules - Redis India Tour - 2017
Redis Modules - Redis India Tour - 2017HashedIn Technologies
 
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020Redis Labs
 
MySQL Document Store - A Document Store with all the benefts of a Transactona...
MySQL Document Store - A Document Store with all the benefts of a Transactona...MySQL Document Store - A Document Store with all the benefts of a Transactona...
MySQL Document Store - A Document Store with all the benefts of a Transactona...Olivier DASINI
 
Real-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache SparkReal-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache SparkGuido Schmutz
 
Redispresentation apac2012
Redispresentation apac2012Redispresentation apac2012
Redispresentation apac2012Ankur Gupta
 
Developing a Redis Module - Hackathon Kickoff
 Developing a Redis Module - Hackathon Kickoff Developing a Redis Module - Hackathon Kickoff
Developing a Redis Module - Hackathon KickoffItamar Haber
 
Hadoop & no sql new generation database systems
Hadoop & no sql   new generation database systemsHadoop & no sql   new generation database systems
Hadoop & no sql new generation database systemsramazan fırın
 
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014cdmaxime
 

Similar to Boosting Machine Learning with Redis Modules and Spark (20)

Spark Summit EU talk by Shay Nativ and Dvir Volk
Spark Summit EU talk by Shay Nativ and Dvir VolkSpark Summit EU talk by Shay Nativ and Dvir Volk
Spark Summit EU talk by Shay Nativ and Dvir Volk
 
Redis by-hari
Redis by-hariRedis by-hari
Redis by-hari
 
Cost Savings at High Performance with Redis Labs and AWS
Cost Savings at High Performance with Redis Labs and AWSCost Savings at High Performance with Redis Labs and AWS
Cost Savings at High Performance with Redis Labs and AWS
 
What's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis LabsWhat's new with enterprise Redis - Leena Joshi, Redis Labs
What's new with enterprise Redis - Leena Joshi, Redis Labs
 
Redis for Security Data : SecurityScorecard JVM Redis Usage
Redis for Security Data : SecurityScorecard JVM Redis UsageRedis for Security Data : SecurityScorecard JVM Redis Usage
Redis for Security Data : SecurityScorecard JVM Redis Usage
 
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
 
Hadoop Spark - Reuniao SouJava 12/04/2014
Hadoop Spark - Reuniao SouJava 12/04/2014Hadoop Spark - Reuniao SouJava 12/04/2014
Hadoop Spark - Reuniao SouJava 12/04/2014
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Jump Start on Apache Spark 2.2 with Databricks
Jump Start on Apache Spark 2.2 with DatabricksJump Start on Apache Spark 2.2 with Databricks
Jump Start on Apache Spark 2.2 with Databricks
 
Real time Object Detection and Analytics using RedisEdge and Docker
Real time Object Detection and Analytics using RedisEdge and DockerReal time Object Detection and Analytics using RedisEdge and Docker
Real time Object Detection and Analytics using RedisEdge and Docker
 
NoSQL
NoSQLNoSQL
NoSQL
 
Redis Modules - Redis India Tour - 2017
Redis Modules - Redis India Tour - 2017Redis Modules - Redis India Tour - 2017
Redis Modules - Redis India Tour - 2017
 
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
 
MySQL Document Store - A Document Store with all the benefts of a Transactona...
MySQL Document Store - A Document Store with all the benefts of a Transactona...MySQL Document Store - A Document Store with all the benefts of a Transactona...
MySQL Document Store - A Document Store with all the benefts of a Transactona...
 
Real-Time Analytics with Apache Cassandra and Apache Spark,
Real-Time Analytics with Apache Cassandra and Apache Spark,Real-Time Analytics with Apache Cassandra and Apache Spark,
Real-Time Analytics with Apache Cassandra and Apache Spark,
 
Real-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache SparkReal-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache Spark
 
Redispresentation apac2012
Redispresentation apac2012Redispresentation apac2012
Redispresentation apac2012
 
Developing a Redis Module - Hackathon Kickoff
 Developing a Redis Module - Hackathon Kickoff Developing a Redis Module - Hackathon Kickoff
Developing a Redis Module - Hackathon Kickoff
 
Hadoop & no sql new generation database systems
Hadoop & no sql   new generation database systemsHadoop & no sql   new generation database systems
Hadoop & no sql new generation database systems
 
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
 

More from Dvir Volk

Searching Billions of Documents with Redis
Searching Billions of Documents with RedisSearching Billions of Documents with Redis
Searching Billions of Documents with RedisDvir Volk
 
Tales Of The Black Knight - Keeping EverythingMe running
Tales Of The Black Knight - Keeping EverythingMe runningTales Of The Black Knight - Keeping EverythingMe running
Tales Of The Black Knight - Keeping EverythingMe runningDvir Volk
 
10 reasons to be excited about go
10 reasons to be excited about go10 reasons to be excited about go
10 reasons to be excited about goDvir Volk
 
Kicking ass with redis
Kicking ass with redisKicking ass with redis
Kicking ass with redisDvir Volk
 
Introduction to redis - version 2
Introduction to redis - version 2Introduction to redis - version 2
Introduction to redis - version 2Dvir Volk
 
Introduction to Thrift
Introduction to ThriftIntroduction to Thrift
Introduction to ThriftDvir Volk
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 

More from Dvir Volk (8)

RediSearch
RediSearchRediSearch
RediSearch
 
Searching Billions of Documents with Redis
Searching Billions of Documents with RedisSearching Billions of Documents with Redis
Searching Billions of Documents with Redis
 
Tales Of The Black Knight - Keeping EverythingMe running
Tales Of The Black Knight - Keeping EverythingMe runningTales Of The Black Knight - Keeping EverythingMe running
Tales Of The Black Knight - Keeping EverythingMe running
 
10 reasons to be excited about go
10 reasons to be excited about go10 reasons to be excited about go
10 reasons to be excited about go
 
Kicking ass with redis
Kicking ass with redisKicking ass with redis
Kicking ass with redis
 
Introduction to redis - version 2
Introduction to redis - version 2Introduction to redis - version 2
Introduction to redis - version 2
 
Introduction to Thrift
Introduction to ThriftIntroduction to Thrift
Introduction to Thrift
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 

Recently uploaded

TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....ShaimaaMohamedGalal
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 

Recently uploaded (20)

TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdf
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 

Boosting Machine Learning with Redis Modules and Spark

  • 1. Boosting Machine Learning with Redis Modules and Spark Dvir Volk, Redis Labs, November 2016
  • 2. 2 Hello World Open source. The leading in-memory database The open source home and commercial provider of Redis - cloud and on-premise Senior System Architect at Redis Labs. Redis user and contributor for ~6 years @dvirsky dvirvolk
  • 3. 3 A Brief Overview of Redis ● Started in 2009 by Salvatore Sanfilippo ● Mostly a one man show ● Most popular KV store ● Notable Users: ○ Twitter, Netflix, Uber, Groupon, Twitch ○ Many, many more...
  • 4. 4 A Brief Overview of Redis ▪ Key => Data Structure server ▪ In memory disk backed ▪ Optional cluster mode ▪ Embedded Lua scripting ▪ Single Threaded! ▪ Key features: Fast, Flexible, Simple
  • 5. 5 A Lego For Your Database Key "I'm a Plain Text String!" { A: “foo”, B: “bar”, C: “baz” } Strings/Blobs/Bitmaps Hash Tables (objects!) Linked Lists Sets Sorted Sets Geo Sets HyperLogLog { A , B , C , D , E } [ A → B → C → D → E ] { A: 0.1, B: 0.3, C: 100, D: 1337 } { A: (51.5, 0.12), B: (32.1, 34.7) } 00110101 11001110 10101010
  • 6. 6 Redis In Practice ▪ “Front End Database” ▪ Real Time Counters ▪ Ad Serving ▪ Message Queues ▪ Geo Database ▪ Time Series ▪ Cache ▪ Session State ▪ Etc
  • 7. 7 But Can Redis Do X? Secondary Index? Time Series? Full Text Search? Graph? Machine Learning? AutoComplete? SQL?
  • 8. 8 So You Want a New Feature? ▪ Try a Lua script ▪ Convince @antirez ▪ Fork Redis ▪ Build Your Own Database!
  • 9. 9 Enter Redis Modules ▪ In development since March 2016 ▪ Redis 4.0 RC out soon ▪ Several modules already exist ▪ Key paradigm shift for Redis
  • 10. 10 New Capabilities What Modules Actually Are ▪ Dynamic libraries loaded to redis ▪ Written in C/C++ ▪ Use a C ABI/API isolating redis internals ▪ Near Zero latency access to data New Commands New Data Types
  • 12. 12 LEFTPAD Example 127.0.0.1:6379> MODULE LOAD "./example.so" OK 127.0.0.1:6379> COMMAND INFO EXAMPLE.LEFTPAD 1) 1) "example.leftpad" ... 127.0.0.1:6379> EXAMPLE.LEFTPAD "foo" 8 foo 127.0.0.1:6379> EXAMPLE.LEFTPAD "foo" 8 "_" _____foo
  • 13. 13 Real Module: RediSearch ▪ From-Scratch search index over redis ▪ Uses Strings for holding compressed index data ▪ Includes stemming, exact phrase match, etc. ▪ Fast Fuzzy Auto-complete ▪ Up to X5 faster than Elastic / Solr > FT.SEARCH “lcd tv” FILTER price 100 +inf > FT.SUGGET “lcd” FUZZY
  • 14. 14 More Modules Out There ▪ Native JSON Support ▪ Time Series ▪ Secondary Indexing ▪ Encryption ▪ Bloom Filters ▪ Online Neural Network ▪ Many Many more...
  • 15. 15 Spark ML + Redis modules
  • 16. 16 Redis + Spark So Far ▪ Current connector: - RDD abstraction - SparkSQL - Streaming Source ▪ ML is not addressed specifically ▪ Used for pre-computed results ▪ We felt that we can take it further
  • 17. 17 Addressing The ML Pain ▪ The missing piece of ML: Serving your model - Not standardized - Vendor-lock with cloud platforms - Reliable services are hard to do - If only we had a “database” for this! - Well, maybe we do?
  • 18. 18 Why Modules for ML? With modules we can: ▪ Define data structures for models ▪ Store training output as “hot model” ▪ Perform evaluation directly in Redis ▪ Easily integrate existing C/C++ libs
  • 19. 19 Spark + Modules = AWESOME ▪ Train ML model on Spark ▪ Save model to Redis and get: - High availability - Clustering - Persistence - Performance - Client libraries
  • 20. 20 Spark-ML End-to-End Flow Spark Training Custom Server Model saved to Parquet file Data Loaded to Spark Pre-computed results Batch Evaluation ? ClientApp
  • 21. 21 Adding Redis Into The Mix Redis-ML “Active Model” Any Training Platform ClientApp Spark Training Data Loaded to Spark
  • 22. 22 Redis Module Tree Ensembles Linear Regression Logistic Regression Matrix + Vector Operations More to come... The Redis-ML Module
  • 24. 24 Forest Data Type ▪ A collection of decision trees ▪ Supports classification & regression ▪ Splitter Node can be - Categorical (e.g. day == “Sunday”) - Numerical (e.g. age < 43)
  • 25. 25 Decision Tree Example The famous Titanic survival predictor sex=male?yes no Survived Died Age > 9.5? sibsp > 2.5? Died Survived *sibsp = siblings + spouses
  • 26. 26 Forest Data Type Example > MODULE LOAD "./redis-ml.so" OK > ML.FOREST.ADD myforest 0 . CATEGORIC sex “male” .L LEAF 1 .R LEAF 0 OK > ML.FOREST.RUN myforest sex:male "1" > ML.FOREST.RUN myforest sex:yes_please "0"
  • 27. 27 Using Redis-ML With Spark scala> import com.redislabs.client.redisml.MLClient scala> import com.redislabs.provider.redis.ml.Forest scala> val rfModel = pipelineModel.stages.last.asInstanceOf[RandomForestClassificationModel] scala> val f = new Forest(rfModel.trees) scala> f.loadToRedis("forest-test", "localhost") scala> val jedis = new Jedis("localhost") scala> jedis.getClient.sendCommand(MLClient.ModuleCommand.FOREST_RUN, "forest-test", makeInputString (0)) scala> jedis.getClient.getStatusCodeReply res53: String = 1
  • 28. 28 Benchmarking Redis-ML - Spark + Parquet Spark + Redis ML Model Preparation + Save 3785ms 292ms Model Load 2769ms 0ms (model is on memory) Classification (AVG) 13ms 1ms ● Forest size: 15000 trees ● Data: $(SPARK_HOME)/data/mllib/sample_libsvm_data.txt
  • 29. 29 Going Forward - More Features ▪ Implement more Spark-ML model types - SVM - Naive Bayes Classifier - Neural Networks ▪ Integration with Redis’ native types ▪ Data Processing (e.g. Word2Vec, TF-IDF) ▪ PMML Support
  • 30. 30 PS: Neural Redis ▪ Developed by Salvatore ▪ Training is done inside redis ▪ Online continuous training process ▪ Builds Fully Connected NNs
  • 32. 32