SlideShare a Scribd company logo
1 of 17
How to model anything in Redis
by Josiah Carlson
@dr_josiah
dr-josiah.com
Agenda
● Who am I?
● What is Redis (in 2 minutes)?
● Queries, a precursor to modeling
● How do I model my data?
● Questions
Who am I?
● Redis user/enthusiast/contributor
● Author of Redis in Action
○ Read free at: bit.ly/ria-free
○ Buy paperback/ebook: manning.com/carlson/
● Author of a couple Python + Redis libraries
○ RPQueue (deferred/scheduled tasks)
○ rom (rich Redis object mapper for Python)
○ lua_call (library for Lua to Lua calls in Redis)
● And other stuff (check my blog, github, etc.)
What is Redis (in 2 minutes)?
● In-memory key/value
● Values can be one of 5 native types
○ String, List, Hash, Set, Sorted Set
○ HyperLogLog (string)
● Optionally persisted to disk
● Optional replication, clustering, high-availability
● Server-side Lua scripting (stored procedures)
Redis is also pretty fast…
(on an Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz)
PING_INLINE: 2777777.75 requests per second
PING_BULK: 3502627.00 requests per second
SET: 1191895.12 requests per second
GET: 2290950.75 requests per second
INCR: 1264222.50 requests per second
LPUSH: 1091703.00 requests per second
LPOP: 1285347.00 requests per second
SADD: 1986097.38 requests per second
SPOP: 2642007.75 requests per second
Queries, a precursor to modeling
● Find data that matches these filters
○ E.g. SELECT … FROM … WHERE …
● “Good” query languages fit existing ideas
about what, from where, filtering by
● The existence of a query language makes
the DB “easy” to use
○ *SQL, CouchDB, MongoDB, Cassandra,
Elasticsearch/SOLR, …
Queries, a precursor to modeling (cont)
● Better query languages offer chaining and
nesting
○ SELECT … FROM (SELECT … FROM …)
○ (A or B) and (C or D and not E)
● When all else fails, some DBs have
○ Stored Procedures
○ Triggers
○ Views
What does Redis have instead?
● Redis has a data structure manipulation and
access API
○ GET, SET, ADD, REMOVE, INCREMENT,
LENGTH, RANGE GET, RANGE SET, etc.
○ Loosely fits “what”, “from where”, “filtering by”
● Also has Lua scripting
○ Some command sequences restricted for the sake of
replication/disk persistence consistency
What does Redis have instead? (cont)
● Why don’t we just end this conversation with
“just use Lua scripting in Redis?”
○ Doesn’t solve all our problems
○ Still need to structure data properly so we’re not
doing a table scan equivalent
○ Single-threaded nature of command execution
means slow scripts block execution
How do I model my data?
● In other databases
○ Tables + rows + indexes
○ Documents + indexes
○ Key -> value
● With Redis
○ What queries do I need to perform?
■ What data do I want to filter by and fetch?
○ What Redis structures can I use/combine to actually
execute the filtering required?
Example 0: user last active
● Who has been active in the last 3 weeks?
● In a relational DB:
○ last_active column on user table, indexed
● In Redis:
○ We want user id, filtered by time
■ ZSET last_active -> {<user_id>: <timestamp>, …}
■ SET active:<day> -> {<user_id>, …}
■ (HyperLogLog active:<day> -> {...})
Example 1: text search index
● Find document/content with these words
● In a relational DB:
○ LIKE :’(
○ GIN/GiST in Postgres, FULLTEXT in MySQL
● In Redis:
○ Inverted index: <context>:<word> -> {<docid>, …}
○ Sorted: <context>:<word> -> {<docid>: <score>, …}
○ TF/IDF: like sorted, different meaning for scores
Example 2: restaurant menus
● What items are available at restaurant X at
time/date Y?
● In a relational DB, modeled with 12 tables
○ Even relational queries are of limited use with an 8+
table join (we never went there)
○ Could be 8 sequential queries, 3 with joins:
■ 1) schedule + menus, 2) m2m + items + item
categories, 3) m2m + add-ons + add-on
categories
Example 2: restaurant menus (cont)
● In Redis, pre-cache the “tree” structure
○ Cache up to item-level (JSON blobs)
○ Invalidate from leaf to root on change
○ Re-cache “immediately”
● Execution becomes
○ Find menus that are available at time X
○ Get the item ids for the menus (SMEMBERS)
○ Fetch the items (MGET)
How to model your data in Redis:
1. Know what you are searching for and how
2. Understand the basics of Redis structures
(refer to docs as necessary)
3. Translate your data into Redis structures
4. Optional: use Lua to build higher-level
operations for your “composite” structures
5. Optional: contact an expert
Questions
Thank you!
by Josiah Carlson
@dr_josiah
dr-josiah.com

More Related Content

What's hot

Crawling the Web for Structured Documents
Crawling the Web for Structured DocumentsCrawling the Web for Structured Documents
Crawling the Web for Structured DocumentsJulián Urbano
 
Application-level Denial of Service
Application-level Denial of ServiceApplication-level Denial of Service
Application-level Denial of ServiceVlad Garbuz
 
50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...
50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...
50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...Data Science Milan
 
MongoDB NYC Python
MongoDB NYC PythonMongoDB NYC Python
MongoDB NYC PythonMike Dirolf
 
Mango Database - Web Development
Mango Database - Web DevelopmentMango Database - Web Development
Mango Database - Web Developmentmssaman
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDBCésar Trigo
 
The solution for technical test claire kim
The solution for technical test   claire kimThe solution for technical test   claire kim
The solution for technical test claire kimClaire Kim
 
Discovering Memes in Social Media
Discovering Memes in Social MediaDiscovering Memes in Social Media
Discovering Memes in Social MediaMatthew Lease
 

What's hot (11)

Crawling the Web for Structured Documents
Crawling the Web for Structured DocumentsCrawling the Web for Structured Documents
Crawling the Web for Structured Documents
 
Mongo db
Mongo dbMongo db
Mongo db
 
Getting triples from records: the role of ISBD
Getting triples from records: the role of ISBDGetting triples from records: the role of ISBD
Getting triples from records: the role of ISBD
 
Application-level Denial of Service
Application-level Denial of ServiceApplication-level Denial of Service
Application-level Denial of Service
 
Mongo DB 102
Mongo DB 102Mongo DB 102
Mongo DB 102
 
50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...
50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...
50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...
 
MongoDB NYC Python
MongoDB NYC PythonMongoDB NYC Python
MongoDB NYC Python
 
Mango Database - Web Development
Mango Database - Web DevelopmentMango Database - Web Development
Mango Database - Web Development
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDB
 
The solution for technical test claire kim
The solution for technical test   claire kimThe solution for technical test   claire kim
The solution for technical test claire kim
 
Discovering Memes in Social Media
Discovering Memes in Social MediaDiscovering Memes in Social Media
Discovering Memes in Social Media
 

Similar to Big Data Day LA 2015 - How to model anything in Redis by Josiah Carlson of Zeconomy

Ledingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartLedingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartMukesh Singh
 
PostgreSQL - Object Relational Database
PostgreSQL - Object Relational DatabasePostgreSQL - Object Relational Database
PostgreSQL - Object Relational DatabaseMubashar Iqbal
 
Brett Ragozzine - Graph Databases and Neo4j
Brett Ragozzine - Graph Databases and Neo4jBrett Ragozzine - Graph Databases and Neo4j
Brett Ragozzine - Graph Databases and Neo4jBrett Ragozzine
 
NoSQL Solutions - a comparative study
NoSQL Solutions - a comparative studyNoSQL Solutions - a comparative study
NoSQL Solutions - a comparative studyGuillaume Lefranc
 
A super fast introduction to Spark and glance at BEAM
A super fast introduction to Spark and glance at BEAMA super fast introduction to Spark and glance at BEAM
A super fast introduction to Spark and glance at BEAMHolden Karau
 
Data analysis with Pandas and Spark
Data analysis with Pandas and SparkData analysis with Pandas and Spark
Data analysis with Pandas and SparkFelix Crisan
 
SQL or NoSQL - how to choose
SQL or NoSQL - how to chooseSQL or NoSQL - how to choose
SQL or NoSQL - how to chooseLars Thorup
 
How to get started in Big Data for master's students
How to get started in Big Data for master's studentsHow to get started in Big Data for master's students
How to get started in Big Data for master's studentsMohamed Nadjib MAMI
 
Improving PySpark performance: Spark Performance Beyond the JVM
Improving PySpark performance: Spark Performance Beyond the JVMImproving PySpark performance: Spark Performance Beyond the JVM
Improving PySpark performance: Spark Performance Beyond the JVMHolden Karau
 
Handling the growth of data
Handling the growth of dataHandling the growth of data
Handling the growth of dataPiyush Katariya
 
How Graph Databases efficiently store, manage and query connected data at s...
How Graph Databases efficiently  store, manage and query  connected data at s...How Graph Databases efficiently  store, manage and query  connected data at s...
How Graph Databases efficiently store, manage and query connected data at s...jexp
 
Codemotion 2017 - "Dime cómo manejas tus datos y te diré qué clase de base de...
Codemotion 2017 - "Dime cómo manejas tus datos y te diré qué clase de base de...Codemotion 2017 - "Dime cómo manejas tus datos y te diré qué clase de base de...
Codemotion 2017 - "Dime cómo manejas tus datos y te diré qué clase de base de...Jose Mº Muñoz
 
Introduction to Apache Tajo: Data Warehouse for Big Data
Introduction to Apache Tajo: Data Warehouse for Big DataIntroduction to Apache Tajo: Data Warehouse for Big Data
Introduction to Apache Tajo: Data Warehouse for Big DataJihoon Son
 
HPEC 2021 sparse binary format
HPEC 2021 sparse binary formatHPEC 2021 sparse binary format
HPEC 2021 sparse binary formatErikWelch2
 
Overview of no sql
Overview of no sqlOverview of no sql
Overview of no sqlSean Murphy
 
Active Data Stores at 30,000ft
Active Data Stores at 30,000ftActive Data Stores at 30,000ft
Active Data Stores at 30,000ftJeffrey Sica
 
Dart the better Javascript 2015
Dart the better Javascript 2015Dart the better Javascript 2015
Dart the better Javascript 2015Jorg Janke
 

Similar to Big Data Day LA 2015 - How to model anything in Redis by Josiah Carlson of Zeconomy (20)

Ledingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartLedingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @Lendingkart
 
PostgreSQL - Object Relational Database
PostgreSQL - Object Relational DatabasePostgreSQL - Object Relational Database
PostgreSQL - Object Relational Database
 
Brett Ragozzine - Graph Databases and Neo4j
Brett Ragozzine - Graph Databases and Neo4jBrett Ragozzine - Graph Databases and Neo4j
Brett Ragozzine - Graph Databases and Neo4j
 
NoSQL Solutions - a comparative study
NoSQL Solutions - a comparative studyNoSQL Solutions - a comparative study
NoSQL Solutions - a comparative study
 
A super fast introduction to Spark and glance at BEAM
A super fast introduction to Spark and glance at BEAMA super fast introduction to Spark and glance at BEAM
A super fast introduction to Spark and glance at BEAM
 
Data analysis with Pandas and Spark
Data analysis with Pandas and SparkData analysis with Pandas and Spark
Data analysis with Pandas and Spark
 
SQL or NoSQL - how to choose
SQL or NoSQL - how to chooseSQL or NoSQL - how to choose
SQL or NoSQL - how to choose
 
How to get started in Big Data for master's students
How to get started in Big Data for master's studentsHow to get started in Big Data for master's students
How to get started in Big Data for master's students
 
NoSQL for Artificial Intelligence
NoSQL for Artificial IntelligenceNoSQL for Artificial Intelligence
NoSQL for Artificial Intelligence
 
Improving PySpark performance: Spark Performance Beyond the JVM
Improving PySpark performance: Spark Performance Beyond the JVMImproving PySpark performance: Spark Performance Beyond the JVM
Improving PySpark performance: Spark Performance Beyond the JVM
 
Handling the growth of data
Handling the growth of dataHandling the growth of data
Handling the growth of data
 
Python redis talk
Python redis talkPython redis talk
Python redis talk
 
How Graph Databases efficiently store, manage and query connected data at s...
How Graph Databases efficiently  store, manage and query  connected data at s...How Graph Databases efficiently  store, manage and query  connected data at s...
How Graph Databases efficiently store, manage and query connected data at s...
 
Codemotion 2017 - "Dime cómo manejas tus datos y te diré qué clase de base de...
Codemotion 2017 - "Dime cómo manejas tus datos y te diré qué clase de base de...Codemotion 2017 - "Dime cómo manejas tus datos y te diré qué clase de base de...
Codemotion 2017 - "Dime cómo manejas tus datos y te diré qué clase de base de...
 
Druid
DruidDruid
Druid
 
Introduction to Apache Tajo: Data Warehouse for Big Data
Introduction to Apache Tajo: Data Warehouse for Big DataIntroduction to Apache Tajo: Data Warehouse for Big Data
Introduction to Apache Tajo: Data Warehouse for Big Data
 
HPEC 2021 sparse binary format
HPEC 2021 sparse binary formatHPEC 2021 sparse binary format
HPEC 2021 sparse binary format
 
Overview of no sql
Overview of no sqlOverview of no sql
Overview of no sql
 
Active Data Stores at 30,000ft
Active Data Stores at 30,000ftActive Data Stores at 30,000ft
Active Data Stores at 30,000ft
 
Dart the better Javascript 2015
Dart the better Javascript 2015Dart the better Javascript 2015
Dart the better Javascript 2015
 

More from Data Con LA

Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA
 
Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA
 
Data Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA
 
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA
 
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA
 
Data Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA
 
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA
 
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA
 
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA
 
Data Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA
 
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA
 
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA
 
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA
 

More from Data Con LA (20)

Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup Showcase
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendations
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI Ethics
 
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learning
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
 
Data Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentation
 
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
 
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWS
 
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
 
Data Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data Science
 
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
 
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
 
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with Kafka
 

Recently uploaded

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 

Recently uploaded (20)

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

Big Data Day LA 2015 - How to model anything in Redis by Josiah Carlson of Zeconomy

  • 1. How to model anything in Redis by Josiah Carlson @dr_josiah dr-josiah.com
  • 2. Agenda ● Who am I? ● What is Redis (in 2 minutes)? ● Queries, a precursor to modeling ● How do I model my data? ● Questions
  • 3. Who am I? ● Redis user/enthusiast/contributor ● Author of Redis in Action ○ Read free at: bit.ly/ria-free ○ Buy paperback/ebook: manning.com/carlson/ ● Author of a couple Python + Redis libraries ○ RPQueue (deferred/scheduled tasks) ○ rom (rich Redis object mapper for Python) ○ lua_call (library for Lua to Lua calls in Redis) ● And other stuff (check my blog, github, etc.)
  • 4. What is Redis (in 2 minutes)? ● In-memory key/value ● Values can be one of 5 native types ○ String, List, Hash, Set, Sorted Set ○ HyperLogLog (string) ● Optionally persisted to disk ● Optional replication, clustering, high-availability ● Server-side Lua scripting (stored procedures)
  • 5. Redis is also pretty fast… (on an Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz) PING_INLINE: 2777777.75 requests per second PING_BULK: 3502627.00 requests per second SET: 1191895.12 requests per second GET: 2290950.75 requests per second INCR: 1264222.50 requests per second LPUSH: 1091703.00 requests per second LPOP: 1285347.00 requests per second SADD: 1986097.38 requests per second SPOP: 2642007.75 requests per second
  • 6. Queries, a precursor to modeling ● Find data that matches these filters ○ E.g. SELECT … FROM … WHERE … ● “Good” query languages fit existing ideas about what, from where, filtering by ● The existence of a query language makes the DB “easy” to use ○ *SQL, CouchDB, MongoDB, Cassandra, Elasticsearch/SOLR, …
  • 7. Queries, a precursor to modeling (cont) ● Better query languages offer chaining and nesting ○ SELECT … FROM (SELECT … FROM …) ○ (A or B) and (C or D and not E) ● When all else fails, some DBs have ○ Stored Procedures ○ Triggers ○ Views
  • 8. What does Redis have instead? ● Redis has a data structure manipulation and access API ○ GET, SET, ADD, REMOVE, INCREMENT, LENGTH, RANGE GET, RANGE SET, etc. ○ Loosely fits “what”, “from where”, “filtering by” ● Also has Lua scripting ○ Some command sequences restricted for the sake of replication/disk persistence consistency
  • 9. What does Redis have instead? (cont) ● Why don’t we just end this conversation with “just use Lua scripting in Redis?” ○ Doesn’t solve all our problems ○ Still need to structure data properly so we’re not doing a table scan equivalent ○ Single-threaded nature of command execution means slow scripts block execution
  • 10. How do I model my data? ● In other databases ○ Tables + rows + indexes ○ Documents + indexes ○ Key -> value ● With Redis ○ What queries do I need to perform? ■ What data do I want to filter by and fetch? ○ What Redis structures can I use/combine to actually execute the filtering required?
  • 11. Example 0: user last active ● Who has been active in the last 3 weeks? ● In a relational DB: ○ last_active column on user table, indexed ● In Redis: ○ We want user id, filtered by time ■ ZSET last_active -> {<user_id>: <timestamp>, …} ■ SET active:<day> -> {<user_id>, …} ■ (HyperLogLog active:<day> -> {...})
  • 12. Example 1: text search index ● Find document/content with these words ● In a relational DB: ○ LIKE :’( ○ GIN/GiST in Postgres, FULLTEXT in MySQL ● In Redis: ○ Inverted index: <context>:<word> -> {<docid>, …} ○ Sorted: <context>:<word> -> {<docid>: <score>, …} ○ TF/IDF: like sorted, different meaning for scores
  • 13. Example 2: restaurant menus ● What items are available at restaurant X at time/date Y? ● In a relational DB, modeled with 12 tables ○ Even relational queries are of limited use with an 8+ table join (we never went there) ○ Could be 8 sequential queries, 3 with joins: ■ 1) schedule + menus, 2) m2m + items + item categories, 3) m2m + add-ons + add-on categories
  • 14. Example 2: restaurant menus (cont) ● In Redis, pre-cache the “tree” structure ○ Cache up to item-level (JSON blobs) ○ Invalidate from leaf to root on change ○ Re-cache “immediately” ● Execution becomes ○ Find menus that are available at time X ○ Get the item ids for the menus (SMEMBERS) ○ Fetch the items (MGET)
  • 15. How to model your data in Redis: 1. Know what you are searching for and how 2. Understand the basics of Redis structures (refer to docs as necessary) 3. Translate your data into Redis structures 4. Optional: use Lua to build higher-level operations for your “composite” structures 5. Optional: contact an expert
  • 17. Thank you! by Josiah Carlson @dr_josiah dr-josiah.com