SlideShare a Scribd company logo
RediSearch Aggregations
Super fast data processing pipeline!
RediSearch 101
● High Performance Search Index
● Redis Module
● Written From Scratch in C
● Supports multiple index types
● Open Source / Commercial
○ Commercial is scalable and distributed
○ Several commercial customers
○ Largest cluster: 200 shards, ~2B documents, 8TB RAM
What Can RediSearch Do?
● Full-Text, Numeric, Geo and Tag indexes
● Real-time indexing and updates
● Distributed search on Billions of documents
● Complex query language
● Auto-Complete
● Highlighting
So, Aggregations...
“Data aggregation is the compiling of information from databases with intent
to prepare combined datasets for data processing.”
(Whatever you say, Wikipedia)
● Take a Search Query
● Process the results to produce some insights from them by:
○ Grouping
○ Reducing groups
○ Applying Transformations
○ Sorting
● Transform plain search results into statistical insights
Aggregate Request vs Search Request
● Search request just fetches the top N most relevant results
● Processing is minimal if any
● Focus is on filtering and ranking
● Aggregate requests use the same query language
● But add a pipeline of processing steps
● Emphasis on processing and transforming
The Aggregation Pipeline
● Transform the results by combining multiple steps
● All steps are repeatable and work on the upstream data
Filter Group
Reduce
Apply Sort Apply
Reduce
What The API Looks Like
FT.AGGREGATE {index} {query}
[LOAD {len} {property} ...]
[GROUPBY {len} {property} ...
[REDUCE {func} {len} {arg} … [AS {alias}] ]
[SORTBY {len}{property} [ASC|DESC] …[MAX {n}]
[APPLY {expr} AS {alias}]
[LIMIT {offset} {num} ]
Aggregate Filters
● Determine what results we process
● Same query syntax as normal RediSearch queries
● Can be used to create complex selections
@country:{Israel | Italy}
@age:[25 52]
@profession:”1337 h4x0r”
-@languages:{perl | php}
GROUPBY and Reduce
● Grouping by any number of fields / upstream properties
● A GROUPBY is associated with many reducers
● Reducers process each matching result per group and “flatten” them
GROUPBY @x @y
Reduce ⇒ count
Group{foo,bar} Group{bar,baz}
Reduce ⇒ count
Available Reducers
● COUNT
● COUNT_DISTINCT
● COUNT_DISTINCTISH
● MIN / MAX
● AVG
● SUM
● STDDEV
● QUANTILE
● FIRST_VALUE
● TOLIST
APPLY Transformations
● APPLY applies 1:1 transformations on results
● Evaluate a complex expression per result
● Can be numeric / text functions
● Numeric operators
● Runtime errors result in NULL
APPLY “sqrt((@a + @b)/2)” AS avg
APPLY “time(floor(@timestamp/86400)*86400)” AS date
APPLY “format(‘Hello, %s’, @userName)” AS msg
Numeric Functions and Expressions
● Numeric operators:
+ - / * % ^
● log(x), log2(x)
● floor(x), ciel(x)
● abs(x)
● sqrt(x)
● exp(x)
String Functions and Expressions
● lower(s)
● upper(s)
● substr(s, offset, len)
● format(fmt, …)
● time(unix_timestamp, [fmt])
● parsetime(str, [fmt])
SORTBY
● Sort by one or more properties
● Each can be ASC/DESC
● MAX limits to top-N results
● Using MinMaxHeap
LIMIT vs. SORTBY MAX
● SORTBY … MAX {n}
● LIMIT {offset} {num}
● Limit can be applied even without sorting
● Max just speeds up sorts, but doesn’t page through results
● Getting results 10-20 will work best as:
SORTBY … MAX 20 LIMIT 10 10
Aggregations In Practice
The Problem
● Github events (push, issue, pr, etc) stream
● For each event we have:
○ Repo
○ Actor (user)
○ Type
○ Time
○ Text
● Let’s extract the top committers on Github!
Step 1: Filter
We scan only pushes:
FT.AGGREGATE gh “@type:{PushEvent}”
Step 2: Group/Reduce
Let’s group by actor, and count the number of commits and unique repos.
FT.AGGREGATE gh “@type:{PushEvent}”
GROUPBY 1 @actor
REDUCE COUNT 0 AS commits
REDUCE COUNT_DISTINCT 1 @repo AS repos
Step 3: Apply transformations if needed
Let’s group by actor, and count the number of commits and unique repos.
FT.AGGREGATE gh “@type:{PushEvent}”
GROUPBY 1 @actor
REDUCE COUNT 0 AS commits
REDUCE COUNT_DISTINCT 1 @repo AS repos
APPLY “@commits/@repos” AS commits_per_repo
Step 4: Sort
Sort by commits per repo:
FT.AGGREGATE gh “@type:{PushEvent}”
GROUPBY 1 @actor
REDUCE COUNT 0 AS commits
REDUCE COUNT_DISTINCT 1 @repo AS repos
APPLY “@commits/@repos” AS commits_per_repo
SORTBY 2 @commits_per_repo DESC MAX 20
Another Example: Total/Unique Visitors Per Hour
FT.AGGREGATE visits
APPLY “@time - (@time%3600)” AS hour
GROUPBY 1 @hour
REDUCE COUNT_DISTINCT 1 @userId AS unique
REDUCE COUNT 0 AS total
SORTBY 2 @hour ASC
APPLY “time(@hour)” AS hour
(Near) Future Challenges
How Do You Scale Aggregations?
● All this works very well on a single node
● But what if your data is split across nodes?
● Simple searches run the query on each
node and merge
How Do You Scale Aggregations?
● That cannot always be done for aggregations
○ How would you merge AVG?
○ How would you merge COUNT_DISTINCT?
Simple Approach - Filter Push-Down
● We push the filtering stage and the properties we want to all nodes
● We merge all results into one stream
● Perform aggregations on it as if it was a single node’s results
● Pros: simple
● Cons: ALL matching records must be transferred over the network
Nicer Approach - Complex Push-Down
● We tale the original request
● For each step we can either:
○ Push it down to the nodes as is
○ Push down and create some special merge logic
○ Decide we cannot push down anything and quit
Nicer Approach - Complex Push-Down
● Merge logic:
○ COUNT ⇒ Pushed down as COUNT, merged as SUM
○ AVG ⇒ Pushed down as SUM + COUNT, merged as SUM/COUNT
○ COUNT_DISTINCT ⇒ Pushed as TOLIST, merged as COUNT_DISTINCT
○ Etc
Thank You!

More Related Content

What's hot

WHODIS_kearns_presentation.v0a
WHODIS_kearns_presentation.v0aWHODIS_kearns_presentation.v0a
WHODIS_kearns_presentation.v0aEdward Kearns
 
Stream Processing Live Traffic Data with Kafka Streams
Stream Processing Live Traffic Data with Kafka StreamsStream Processing Live Traffic Data with Kafka Streams
Stream Processing Live Traffic Data with Kafka Streams
Tim Ysewyn
 
Time Series Data in a Time Series World
Time Series Data in a Time Series WorldTime Series Data in a Time Series World
Time Series Data in a Time Series World
MapR Technologies
 
The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015
The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015
The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015
Johan
 
tado° Makes Your Home Environment Smart with InfluxDB
tado° Makes Your Home Environment Smart with InfluxDBtado° Makes Your Home Environment Smart with InfluxDB
tado° Makes Your Home Environment Smart with InfluxDB
InfluxData
 
Terark Product and Technology
Terark Product and TechnologyTerark Product and Technology
Terark Product and Technology
Xinyuan Fu
 
Effective monitoring with statsd - Alexis lê-quôc
Effective monitoring with statsd - Alexis lê-quôcEffective monitoring with statsd - Alexis lê-quôc
Effective monitoring with statsd - Alexis lê-quôcDevopsdays
 
Consistent hashing algorithmic tradeoffs
Consistent hashing  algorithmic tradeoffsConsistent hashing  algorithmic tradeoffs
Consistent hashing algorithmic tradeoffs
Evan Lin
 
Scaling Graphite At Yelp
Scaling Graphite At YelpScaling Graphite At Yelp
Scaling Graphite At Yelp
Paul O'Connor
 
InfluxDB & Grafana
InfluxDB & GrafanaInfluxDB & Grafana
InfluxDB & Grafana
Pedro Salgado
 
Scaling metrics
Scaling metricsScaling metrics
Scaling metrics
Vladimir Varfolomeev
 
TypoScript and EEL outside of Neos [InspiringFlow2013]
TypoScript and EEL outside of Neos [InspiringFlow2013]TypoScript and EEL outside of Neos [InspiringFlow2013]
TypoScript and EEL outside of Neos [InspiringFlow2013]
Christian Müller
 
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Nick Galbreath
 
InfiniFlux performance
InfiniFlux performanceInfiniFlux performance
InfiniFlux performance
InfiniFlux
 
Virtual training Intro to the Tick stack and InfluxEnterprise
Virtual training  Intro to the Tick stack and InfluxEnterpriseVirtual training  Intro to the Tick stack and InfluxEnterprise
Virtual training Intro to the Tick stack and InfluxEnterprise
InfluxData
 
Bitcoin Price Detection with Pyspark presentation
Bitcoin Price Detection with Pyspark presentationBitcoin Price Detection with Pyspark presentation
Bitcoin Price Detection with Pyspark presentation
Yakup Görür
 
MongoDB - Warehouse and Aggregator of Events
MongoDB - Warehouse and Aggregator of EventsMongoDB - Warehouse and Aggregator of Events
MongoDB - Warehouse and Aggregator of Events
Maxim Ligus
 
Dato vs GraphX
Dato vs GraphXDato vs GraphX
Dato vs GraphX
Keira Zhou
 
Using GraphQL for the next generation of the Holdsport APIUsing GraphQL for t...
Using GraphQL for the next generation of the Holdsport APIUsing GraphQL for t...Using GraphQL for the next generation of the Holdsport APIUsing GraphQL for t...
Using GraphQL for the next generation of the Holdsport APIUsing GraphQL for t...
Jørgen Orehøj Erichsen
 

What's hot (20)

WHODIS_kearns_presentation.v0a
WHODIS_kearns_presentation.v0aWHODIS_kearns_presentation.v0a
WHODIS_kearns_presentation.v0a
 
Stream Processing Live Traffic Data with Kafka Streams
Stream Processing Live Traffic Data with Kafka StreamsStream Processing Live Traffic Data with Kafka Streams
Stream Processing Live Traffic Data with Kafka Streams
 
Time Series Data in a Time Series World
Time Series Data in a Time Series WorldTime Series Data in a Time Series World
Time Series Data in a Time Series World
 
The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015
The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015
The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015
 
tado° Makes Your Home Environment Smart with InfluxDB
tado° Makes Your Home Environment Smart with InfluxDBtado° Makes Your Home Environment Smart with InfluxDB
tado° Makes Your Home Environment Smart with InfluxDB
 
Terark Product and Technology
Terark Product and TechnologyTerark Product and Technology
Terark Product and Technology
 
Effective monitoring with statsd - Alexis lê-quôc
Effective monitoring with statsd - Alexis lê-quôcEffective monitoring with statsd - Alexis lê-quôc
Effective monitoring with statsd - Alexis lê-quôc
 
Consistent hashing algorithmic tradeoffs
Consistent hashing  algorithmic tradeoffsConsistent hashing  algorithmic tradeoffs
Consistent hashing algorithmic tradeoffs
 
Scaling Graphite At Yelp
Scaling Graphite At YelpScaling Graphite At Yelp
Scaling Graphite At Yelp
 
InfluxDB & Grafana
InfluxDB & GrafanaInfluxDB & Grafana
InfluxDB & Grafana
 
Scaling metrics
Scaling metricsScaling metrics
Scaling metrics
 
TypoScript and EEL outside of Neos [InspiringFlow2013]
TypoScript and EEL outside of Neos [InspiringFlow2013]TypoScript and EEL outside of Neos [InspiringFlow2013]
TypoScript and EEL outside of Neos [InspiringFlow2013]
 
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
 
InfiniFlux performance
InfiniFlux performanceInfiniFlux performance
InfiniFlux performance
 
Virtual training Intro to the Tick stack and InfluxEnterprise
Virtual training  Intro to the Tick stack and InfluxEnterpriseVirtual training  Intro to the Tick stack and InfluxEnterprise
Virtual training Intro to the Tick stack and InfluxEnterprise
 
Bitcoin Price Detection with Pyspark presentation
Bitcoin Price Detection with Pyspark presentationBitcoin Price Detection with Pyspark presentation
Bitcoin Price Detection with Pyspark presentation
 
MongoDB - Warehouse and Aggregator of Events
MongoDB - Warehouse and Aggregator of EventsMongoDB - Warehouse and Aggregator of Events
MongoDB - Warehouse and Aggregator of Events
 
Log Aggregation
Log AggregationLog Aggregation
Log Aggregation
 
Dato vs GraphX
Dato vs GraphXDato vs GraphX
Dato vs GraphX
 
Using GraphQL for the next generation of the Holdsport APIUsing GraphQL for t...
Using GraphQL for the next generation of the Holdsport APIUsing GraphQL for t...Using GraphQL for the next generation of the Holdsport APIUsing GraphQL for t...
Using GraphQL for the next generation of the Holdsport APIUsing GraphQL for t...
 

Similar to Redis Day TLV 2018 - RediSearch Aggregations

RedisConf18 - Introducing RediSearch Aggregations
RedisConf18 - Introducing RediSearch AggregationsRedisConf18 - Introducing RediSearch Aggregations
RedisConf18 - Introducing RediSearch Aggregations
Redis Labs
 
Ledingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartLedingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @Lendingkart
Mukesh Singh
 
Precog & MongoDB User Group: Skyrocket Your Analytics
Precog & MongoDB User Group: Skyrocket Your Analytics Precog & MongoDB User Group: Skyrocket Your Analytics
Precog & MongoDB User Group: Skyrocket Your Analytics
MongoDB
 
Hadoop and HBase experiences in perf log project
Hadoop and HBase experiences in perf log projectHadoop and HBase experiences in perf log project
Hadoop and HBase experiences in perf log project
Mao Geng
 
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
Gianfranco Palumbo
 
Aggregation Framework in MongoDB Overview Part-1
Aggregation Framework in MongoDB Overview Part-1Aggregation Framework in MongoDB Overview Part-1
Aggregation Framework in MongoDB Overview Part-1
Anuj Jain
 
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
NETWAYS
 
Mongodb Performance
Mongodb PerformanceMongodb Performance
Mongodb Performance
Jack
 
Big Data processing with Apache Spark
Big Data processing with Apache SparkBig Data processing with Apache Spark
Big Data processing with Apache Spark
Lucian Neghina
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB
 
Elasticsearch an overview
Elasticsearch   an overviewElasticsearch   an overview
Elasticsearch an overview
Amit Juneja
 
Document Similarity with Cloud Computing
Document Similarity with Cloud ComputingDocument Similarity with Cloud Computing
Document Similarity with Cloud Computing
Bryan Bende
 
Druid
DruidDruid
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
Datavail
 
Scaling PostgreSQL With GridSQL
Scaling PostgreSQL With GridSQLScaling PostgreSQL With GridSQL
Scaling PostgreSQL With GridSQL
Jim Mlodgenski
 
Benchmarking Apache Druid
Benchmarking Apache DruidBenchmarking Apache Druid
Benchmarking Apache Druid
Imply
 
Benchmarking Apache Druid
Benchmarking Apache Druid Benchmarking Apache Druid
Benchmarking Apache Druid
Matt Sarrel
 
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013
Amazon Web Services
 
Run your queries 14X faster without any investment!
Run your queries 14X faster without any investment!Run your queries 14X faster without any investment!
Run your queries 14X faster without any investment!
Knoldus Inc.
 
An Introduction to MapReduce
An Introduction to MapReduce An Introduction to MapReduce
An Introduction to MapReduce
Sina Ebrahimi
 

Similar to Redis Day TLV 2018 - RediSearch Aggregations (20)

RedisConf18 - Introducing RediSearch Aggregations
RedisConf18 - Introducing RediSearch AggregationsRedisConf18 - Introducing RediSearch Aggregations
RedisConf18 - Introducing RediSearch Aggregations
 
Ledingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartLedingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @Lendingkart
 
Precog & MongoDB User Group: Skyrocket Your Analytics
Precog & MongoDB User Group: Skyrocket Your Analytics Precog & MongoDB User Group: Skyrocket Your Analytics
Precog & MongoDB User Group: Skyrocket Your Analytics
 
Hadoop and HBase experiences in perf log project
Hadoop and HBase experiences in perf log projectHadoop and HBase experiences in perf log project
Hadoop and HBase experiences in perf log project
 
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
 
Aggregation Framework in MongoDB Overview Part-1
Aggregation Framework in MongoDB Overview Part-1Aggregation Framework in MongoDB Overview Part-1
Aggregation Framework in MongoDB Overview Part-1
 
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
 
Mongodb Performance
Mongodb PerformanceMongodb Performance
Mongodb Performance
 
Big Data processing with Apache Spark
Big Data processing with Apache SparkBig Data processing with Apache Spark
Big Data processing with Apache Spark
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
 
Elasticsearch an overview
Elasticsearch   an overviewElasticsearch   an overview
Elasticsearch an overview
 
Document Similarity with Cloud Computing
Document Similarity with Cloud ComputingDocument Similarity with Cloud Computing
Document Similarity with Cloud Computing
 
Druid
DruidDruid
Druid
 
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
 
Scaling PostgreSQL With GridSQL
Scaling PostgreSQL With GridSQLScaling PostgreSQL With GridSQL
Scaling PostgreSQL With GridSQL
 
Benchmarking Apache Druid
Benchmarking Apache DruidBenchmarking Apache Druid
Benchmarking Apache Druid
 
Benchmarking Apache Druid
Benchmarking Apache Druid Benchmarking Apache Druid
Benchmarking Apache Druid
 
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013
 
Run your queries 14X faster without any investment!
Run your queries 14X faster without any investment!Run your queries 14X faster without any investment!
Run your queries 14X faster without any investment!
 
An Introduction to MapReduce
An Introduction to MapReduce An Introduction to MapReduce
An Introduction to MapReduce
 

More from Redis Labs

Redis Day Bangalore 2020 - Session state caching with redis
Redis Day Bangalore 2020 - Session state caching with redisRedis Day Bangalore 2020 - Session state caching with redis
Redis Day Bangalore 2020 - Session state caching with redis
Redis Labs
 
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Redis Labs
 
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
Redis Labs
 
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
Redis Labs
 
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
Redis Labs
 
Redis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
Redis for Data Science and Engineering by Dmitry Polyakovsky of OracleRedis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
Redis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
Redis Labs
 
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
Redis Labs
 
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Redis Labs
 
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
Redis Labs
 
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
Redis Labs
 
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
Redis Labs
 
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Redis Labs
 
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
Redis Labs
 
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
Redis Labs
 
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
Redis Labs
 
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
Redis Labs
 
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
Redis Labs
 
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
Redis Labs
 
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
Redis Labs
 
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
Redis Labs
 

More from Redis Labs (20)

Redis Day Bangalore 2020 - Session state caching with redis
Redis Day Bangalore 2020 - Session state caching with redisRedis Day Bangalore 2020 - Session state caching with redis
Redis Day Bangalore 2020 - Session state caching with redis
 
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
 
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
 
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
 
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
 
Redis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
Redis for Data Science and Engineering by Dmitry Polyakovsky of OracleRedis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
Redis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
 
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
 
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
 
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
 
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
 
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
 
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
 
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
 
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
 
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
 
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
 
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
 
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
 
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
 
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
 

Recently uploaded

Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
UiPathCommunity
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 

Recently uploaded (20)

Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 

Redis Day TLV 2018 - RediSearch Aggregations

  • 1. RediSearch Aggregations Super fast data processing pipeline!
  • 2. RediSearch 101 ● High Performance Search Index ● Redis Module ● Written From Scratch in C ● Supports multiple index types ● Open Source / Commercial ○ Commercial is scalable and distributed ○ Several commercial customers ○ Largest cluster: 200 shards, ~2B documents, 8TB RAM
  • 3. What Can RediSearch Do? ● Full-Text, Numeric, Geo and Tag indexes ● Real-time indexing and updates ● Distributed search on Billions of documents ● Complex query language ● Auto-Complete ● Highlighting
  • 4. So, Aggregations... “Data aggregation is the compiling of information from databases with intent to prepare combined datasets for data processing.” (Whatever you say, Wikipedia) ● Take a Search Query ● Process the results to produce some insights from them by: ○ Grouping ○ Reducing groups ○ Applying Transformations ○ Sorting ● Transform plain search results into statistical insights
  • 5. Aggregate Request vs Search Request ● Search request just fetches the top N most relevant results ● Processing is minimal if any ● Focus is on filtering and ranking ● Aggregate requests use the same query language ● But add a pipeline of processing steps ● Emphasis on processing and transforming
  • 6. The Aggregation Pipeline ● Transform the results by combining multiple steps ● All steps are repeatable and work on the upstream data Filter Group Reduce Apply Sort Apply Reduce
  • 7. What The API Looks Like FT.AGGREGATE {index} {query} [LOAD {len} {property} ...] [GROUPBY {len} {property} ... [REDUCE {func} {len} {arg} … [AS {alias}] ] [SORTBY {len}{property} [ASC|DESC] …[MAX {n}] [APPLY {expr} AS {alias}] [LIMIT {offset} {num} ]
  • 8. Aggregate Filters ● Determine what results we process ● Same query syntax as normal RediSearch queries ● Can be used to create complex selections @country:{Israel | Italy} @age:[25 52] @profession:”1337 h4x0r” -@languages:{perl | php}
  • 9. GROUPBY and Reduce ● Grouping by any number of fields / upstream properties ● A GROUPBY is associated with many reducers ● Reducers process each matching result per group and “flatten” them GROUPBY @x @y Reduce ⇒ count Group{foo,bar} Group{bar,baz} Reduce ⇒ count
  • 10. Available Reducers ● COUNT ● COUNT_DISTINCT ● COUNT_DISTINCTISH ● MIN / MAX ● AVG ● SUM ● STDDEV ● QUANTILE ● FIRST_VALUE ● TOLIST
  • 11. APPLY Transformations ● APPLY applies 1:1 transformations on results ● Evaluate a complex expression per result ● Can be numeric / text functions ● Numeric operators ● Runtime errors result in NULL APPLY “sqrt((@a + @b)/2)” AS avg APPLY “time(floor(@timestamp/86400)*86400)” AS date APPLY “format(‘Hello, %s’, @userName)” AS msg
  • 12. Numeric Functions and Expressions ● Numeric operators: + - / * % ^ ● log(x), log2(x) ● floor(x), ciel(x) ● abs(x) ● sqrt(x) ● exp(x)
  • 13. String Functions and Expressions ● lower(s) ● upper(s) ● substr(s, offset, len) ● format(fmt, …) ● time(unix_timestamp, [fmt]) ● parsetime(str, [fmt])
  • 14. SORTBY ● Sort by one or more properties ● Each can be ASC/DESC ● MAX limits to top-N results ● Using MinMaxHeap
  • 15. LIMIT vs. SORTBY MAX ● SORTBY … MAX {n} ● LIMIT {offset} {num} ● Limit can be applied even without sorting ● Max just speeds up sorts, but doesn’t page through results ● Getting results 10-20 will work best as: SORTBY … MAX 20 LIMIT 10 10
  • 17. The Problem ● Github events (push, issue, pr, etc) stream ● For each event we have: ○ Repo ○ Actor (user) ○ Type ○ Time ○ Text ● Let’s extract the top committers on Github!
  • 18. Step 1: Filter We scan only pushes: FT.AGGREGATE gh “@type:{PushEvent}”
  • 19. Step 2: Group/Reduce Let’s group by actor, and count the number of commits and unique repos. FT.AGGREGATE gh “@type:{PushEvent}” GROUPBY 1 @actor REDUCE COUNT 0 AS commits REDUCE COUNT_DISTINCT 1 @repo AS repos
  • 20. Step 3: Apply transformations if needed Let’s group by actor, and count the number of commits and unique repos. FT.AGGREGATE gh “@type:{PushEvent}” GROUPBY 1 @actor REDUCE COUNT 0 AS commits REDUCE COUNT_DISTINCT 1 @repo AS repos APPLY “@commits/@repos” AS commits_per_repo
  • 21. Step 4: Sort Sort by commits per repo: FT.AGGREGATE gh “@type:{PushEvent}” GROUPBY 1 @actor REDUCE COUNT 0 AS commits REDUCE COUNT_DISTINCT 1 @repo AS repos APPLY “@commits/@repos” AS commits_per_repo SORTBY 2 @commits_per_repo DESC MAX 20
  • 22. Another Example: Total/Unique Visitors Per Hour FT.AGGREGATE visits APPLY “@time - (@time%3600)” AS hour GROUPBY 1 @hour REDUCE COUNT_DISTINCT 1 @userId AS unique REDUCE COUNT 0 AS total SORTBY 2 @hour ASC APPLY “time(@hour)” AS hour
  • 24. How Do You Scale Aggregations? ● All this works very well on a single node ● But what if your data is split across nodes? ● Simple searches run the query on each node and merge
  • 25. How Do You Scale Aggregations? ● That cannot always be done for aggregations ○ How would you merge AVG? ○ How would you merge COUNT_DISTINCT?
  • 26. Simple Approach - Filter Push-Down ● We push the filtering stage and the properties we want to all nodes ● We merge all results into one stream ● Perform aggregations on it as if it was a single node’s results ● Pros: simple ● Cons: ALL matching records must be transferred over the network
  • 27. Nicer Approach - Complex Push-Down ● We tale the original request ● For each step we can either: ○ Push it down to the nodes as is ○ Push down and create some special merge logic ○ Decide we cannot push down anything and quit
  • 28. Nicer Approach - Complex Push-Down ● Merge logic: ○ COUNT ⇒ Pushed down as COUNT, merged as SUM ○ AVG ⇒ Pushed down as SUM + COUNT, merged as SUM/COUNT ○ COUNT_DISTINCT ⇒ Pushed as TOLIST, merged as COUNT_DISTINCT ○ Etc