Making Reddit Search
Relevant and Scalable
Anupama Joshi
Senior Engineering Manager, Search
Jerry Bao
Senior Software Engineer, Search
Agenda
• What is Reddit?
• Search Architecture
• Improving our Relevance
• The History of Search @ Reddit
• Scaling our Infrastructure
• Q&A
What is Reddit?
Reddit is a network of communities where
individuals can find experiences built
around their interests, hobbies and
passions
It’s where people converse about the
things that are most important to them
Bring community and
belonging to everyone
Our mission
Reddit by the numbers
● Alexa Rank (US/World): 5th/18th
● MAU: 400M+
● Communities: 1M+
● Posts per day: 440K+
● Comments per day: 3.5M+
● Votes per day: 82M+
● Searches per day: 68M+
So, what are
we doing with
all that
power?
Dog getting love
51.2k points (95% upvoted)
Cat Fist Bumping
137.1k points (90% upvoted)
817.2k views
Wait, it’s not just
cat/dog pictures!
Community > Content > Individual
● Authenticity
● Creative freedom
● Empathy @ scale
● Belonging
● Being heard
r/assistance
Empathy and support at scale
News Source
Reddit’s Community of Support
None of that matters if you can’t
FIND the content! So let’s talk about
Search...
User Retention
● New users who
searched are 300%
more likely to come
back between D1 and
D14
● > 50% of all mobile
users search
Search @ Reddit
Search Today: Architecture
Show and Tell: A better subreddit search
Challenge: Redditors are very creative in their subreddit naming (e.g. r/superbowl
is about superb owl pictures), which, while fun, poses a challenge for discovery.
Answer: faceted search on posts!
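One way to picture faceting on posts: match the query against posts, then count the matching posts per subreddit and rank subreddits by that count. The sketch below does this in memory. The sample posts and the function name are hypothetical, and a real deployment would presumably rely on the search engine's own faceting rather than a loop like this.

```python
from collections import Counter

# Hypothetical sample posts; only the shape (title + subreddit) matters here.
POSTS = [
    {"title": "Superb owl spotted in my backyard", "subreddit": "superbowl"},
    {"title": "Barn owl in flight", "subreddit": "superbowl"},
    {"title": "Owl pictures thread", "subreddit": "birdpics"},
    {"title": "Game day snacks", "subreddit": "nfl"},
]


def subreddit_facets(query, posts):
    """Count posts whose title contains every query term, grouped by subreddit."""
    terms = query.lower().split()
    counts = Counter()
    for post in posts:
        title_words = post["title"].lower().split()
        if all(term in title_words for term in terms):
            counts[post["subreddit"]] += 1
    # most_common() orders subreddits by how many matching posts they hold.
    return counts.most_common()


facets = subreddit_facets("owl", POSTS)
print(facets)
```

A query for "owl" surfaces r/superbowl even though the subreddit name never contains the word, because the match happens on post content.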
Result: A better subreddit search
Show and Tell: Better Post Search
● Post search with phrase matching of selftext
The challenge: What about images and link posts?
Answer - Comments
● Comments are important but which comments are most relevant to the post?
● How do we separate the signal from the noise?
Answer - HVT
● HVTs are the highest scoring tf-idf terms from comment sections.
● Index and match on these HVTs along with post selftexts and titles.
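A minimal sketch of picking HVTs, assuming each post's comment section is treated as one document and scored with a plain tf-idf formula. The tokenization (whitespace split), the scoring variant, and all names below are assumptions for illustration; the deck does not describe the exact computation used.

```python
import math
from collections import Counter


def high_value_tokens(comments_by_post, top_k=3):
    """Return the top_k highest tf-idf terms from each post's comment section."""
    # One "document" per post: all of its comments tokenized together.
    docs = {
        post_id: " ".join(comments).lower().split()
        for post_id, comments in comments_by_post.items()
    }
    n_docs = len(docs)
    # Document frequency: in how many comment sections does a term appear?
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    hvts = {}
    for post_id, tokens in docs.items():
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by score (descending), then alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        hvts[post_id] = ranked[:top_k]
    return hvts


# Hypothetical comment sections keyed by post id.
comments = {
    "post_a": ["shabooya roll call", "shabooya shabooya love this song"],
    "post_b": ["love this dog", "what a good dog"],
    "post_c": ["this cat is great", "love the cat"],
}
hvts = high_value_tokens(comments)
print(hvts["post_a"])
```

Terms that appear in every comment section (such as "love" or "this" in the sample) score zero, which is how the common noise drops out while a distinctive term rises to the top.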
Result: Better Post Search
Qualitatively, we saw some users notice almost immediately when we first introduced HVTs.
For some queries, the difference is
quite stark. The following are
search results for the query
‘shabooya’. Note how ‘shabooya’
doesn’t appear anywhere in the title
or the body of the first three post
results, but you can see the phrase
show up in the comments.
Result: Better Post Search
● Post click through rate (CTR) (+3.15%)
● Relevancy ranking for navigational searches (MRR) (+4.01%)
● Search experience improvements for navigational searches due to increased
recall on posts with poor title or body text
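For readers unfamiliar with the two metrics, here is a small sketch using their standard textbook definitions. How Reddit computes them internally is not stated in the deck, so the input format (one entry per search, holding the 1-based rank of the first clicked result or None) is an assumption.

```python
def mean_reciprocal_rank(clicked_ranks):
    """MRR over searches; each entry is the 1-based rank of the first click, or None."""
    if not clicked_ranks:
        return 0.0
    total = sum(1.0 / rank for rank in clicked_ranks if rank is not None)
    return total / len(clicked_ranks)


def click_through_rate(clicked_ranks):
    """Share of searches that produced at least one click."""
    if not clicked_ranks:
        return 0.0
    return sum(rank is not None for rank in clicked_ranks) / len(clicked_ranks)


# Four hypothetical searches: clicks at ranks 1, 2 and 4, and one with no click.
ranks = [1, 2, None, 4]
mrr = mean_reciprocal_rank(ranks)
ctr = click_through_rate(ranks)
print(mrr, ctr)
```

MRR rewards putting the clicked result near the top, which is why it suits navigational searches where the user wants one specific post.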
Take It to the Next Level: Improve Search Relevance
● Learn from users' click statistics to automatically generate a relevancy
model
● Rerank search results using aggregated click-signal weights, so that users
click higher on search results for a given query
○ Stream user events into the Solr/Fusion cluster
○ Spark Jobs to aggregate click data
○ Use output from the aggregated signal to boost the search results
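The three steps above can be sketched in plain Python: collect click events, aggregate them per (query, document) pair, and add a boost at ranking time. This is not Fusion's signal pipeline; the additive, log-damped boost and every name here are assumptions chosen to keep the example small.

```python
import math
from collections import defaultdict


def aggregate_clicks(events):
    """Sum clicks per (query, doc_id) pair from raw click events."""
    counts = defaultdict(int)
    for query, doc_id in events:
        counts[(query.strip().lower(), doc_id)] += 1
    return counts


def rerank(query, results, click_counts, weight=1.0):
    """Reorder (doc_id, base_score) results by base score plus a click boost."""
    q = query.strip().lower()

    def boosted(item):
        doc_id, base_score = item
        # log1p damps the boost so heavily clicked docs do not dominate.
        return base_score + weight * math.log1p(click_counts.get((q, doc_id), 0))

    return sorted(results, key=boosted, reverse=True)


# Hypothetical click stream and base ranking for the query "cats".
events = [("cats", "t3_b"), ("cats", "t3_b"), ("Cats", "t3_b"), ("cats", "t3_a")]
results = [("t3_a", 2.0), ("t3_b", 1.5), ("t3_c", 1.0)]
reranked = rerank("cats", results, aggregate_clicks(events))
print([doc_id for doc_id, _ in reranked])
```

In the sample, the post that users click most for "cats" overtakes a post with a higher base score.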
Result: Post search relevance using signals
7.5% increase in CTR
12.5% increase in MRR
Result: Subreddit search relevance using signals
Head-Tail Analysis
● Spelling corrections.
● Tail Query Rewriting.
● Specific Dictionary based Rewriting
Head-Tail Analysis
A tail query like “lot of credit card debit” would be rewritten to produce more relevant results.
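A rough sketch of two of the rewriting ideas, dictionary-based rewriting and spelling correction toward terms seen in head queries. The vocabulary, the rewrite dictionary, and the similarity cutoff are all hypothetical; the deck does not say how the real rewriter decides, and this example uses Python's standard `difflib` as a stand-in.

```python
import difflib

# Hypothetical vocabulary drawn from frequent ("head") queries.
HEAD_TERMS = ["learn", "python", "javascript", "tutorial", "programming"]
# Hypothetical dictionary of explicit rewrites.
DICTIONARY_REWRITES = {"js": "javascript", "py": "python"}


def rewrite_tail_query(query, head_terms=HEAD_TERMS,
                       rewrites=DICTIONARY_REWRITES, cutoff=0.75):
    """Rewrite each token via the dictionary, else snap it to a close head term."""
    rewritten = []
    for token in query.lower().split():
        if token in rewrites:
            rewritten.append(rewrites[token])
        elif token in head_terms:
            rewritten.append(token)
        else:
            # get_close_matches returns the best candidates above the cutoff.
            close = difflib.get_close_matches(token, head_terms, n=1, cutoff=cutoff)
            rewritten.append(close[0] if close else token)
    return " ".join(rewritten)


print(rewrite_tail_query("lern js"))
print(rewrite_tail_query("pyhton tutorail"))
```

Tokens with no close head term are left untouched, so the rewrite never discards what the user typed.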
Trending Searches
● Reddit can attribute week-over-week DAU
growth to external events, like game
releases, movie releases, and cultural
events (reference).
● We see similar upticks in searches based on
these events (reference).
● We believe that we can increase search
engagement and time on site by leveraging
these signals to highlight trending queries
to users when they search on Reddit.
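One simple way to spot such upticks is to compare each query's volume this week against last week. The thresholds, the add-one smoothing, and the sample counts below are assumptions for illustration only.

```python
def trending_queries(last_week, this_week, min_count=100, min_growth=2.0):
    """Return (query, growth) pairs whose weekly volume grew by min_growth or more."""
    trending = []
    for query, count in this_week.items():
        previous = last_week.get(query, 0)
        # Add-one smoothing so brand-new queries do not divide by zero.
        growth = (count + 1) / (previous + 1)
        if count >= min_count and growth >= min_growth:
            trending.append((query, round(growth, 2)))
    return sorted(trending, key=lambda item: item[1], reverse=True)


# Hypothetical weekly search counts.
last_week = {"infinity war": 200, "cats": 5000, "keto": 900}
this_week = {"infinity war": 4000, "cats": 5100, "keto": 950, "isle of dogs": 50}
trending = trending_queries(last_week, this_week)
print(trending)
```

The minimum-count filter keeps rare queries from trending on a handful of searches, while steady high-volume queries such as "cats" are ignored because their growth is flat.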
NSFW Categorization
● Develop NSFW classification criteria
● Query Time classification based content filtering.
● Results boosting/reordering based on classification (boost or filter results
based on whether the query has NSFW intent)
● Look at the NSFW results in recall
● Look at the NSFW results people clicked
● Try open source TensorFlow libraries to auto-detect NSFW content that is not
marked NSFW
Related Searches
● Train a collaborative filtering matrix decomposition recommender using
SparkML's Alternating Least Squares (ALS) to batch compute query-query
similarities
● Related Searches backend based on Collaborative Filtering & Co Occurrence
Counting Algorithm via Temporal Proximity
● Collaborative filtering based recommender systems are a popular technique
applied for movie recommendations at Netflix, or product recommendations in
e-commerce sites like Amazon
Related Searches
● Dynamic temporal buckets as source of data.
● All pairs irrespective of number of distinct queries in Session
● Length & temporal distance metrics to help with boosting recommendation.
● Intuitive & easily explainable.
● Scales extremely well for building pluggable logic & adding more dimensions.
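A minimal sketch of co-occurrence counting via temporal proximity, assuming a fixed time window in place of the dynamic buckets mentioned above and a linear distance weight. The session format, window size, and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_scores(sessions, max_gap_seconds=300):
    """Score query pairs that occur close together in time within a session.

    Each session is a list of (timestamp_seconds, query) tuples.
    """
    scores = defaultdict(float)
    for session in sessions:
        # All pairs in the session, regardless of how many queries it holds.
        for (t1, q1), (t2, q2) in combinations(sorted(session), 2):
            gap = t2 - t1
            if q1 == q2 or gap > max_gap_seconds:
                continue
            # Closer pairs get more weight; it falls linearly to 0 at max_gap.
            weight = 1.0 - gap / max_gap_seconds
            scores[(q1, q2)] += weight
            scores[(q2, q1)] += weight
    return scores


def top_related(scores, query, k=3):
    """Return up to k queries most strongly related to `query`."""
    candidates = [(other, s) for (q, other), s in scores.items() if q == query]
    ranked = sorted(candidates, key=lambda c: (-c[1], c[0]))
    return [other for other, _ in ranked[:k]]


# Two hypothetical sessions.
sessions = [
    [(0, "learn"), (30, "learn python"), (60, "learn javascript"), (1000, "cats")],
    [(0, "learn"), (20, "learn python")],
]
scores = cooccurrence_scores(sessions)
print(top_related(scores, "learn"))
```

Because each score is just a sum of pair weights, it is easy to explain why one query was recommended for another, and further dimensions (for example query length) could be folded into the weight.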
Related Searches
*Query* —> *Related Searches*
*learn* —> `learn programming`, `learn python`, `learn javascript`, `learn French`, `learn java`, `piano learn`
*cats* —> `cat`, `aww`, `dogs`, `r/cats`, `r/comics`, `kittens`, `funny`, `pets`
*dogs* —> `dog`, `dogs`, `aww`, `isle of dogs`, `isle of dogs discussion`, `cute dogs`, `pets`
*infinity war* —> `avengers`, `piracy`, `infinity war stream`, `infinity war hd`, `avengers infinity war`, `avengers infinity
war stream`, `deadpool2`, `infinity war torrent`
*coming out* —> `gay`, `lgbt`
*makeup* —> `beauty`, `make up`, `makeupaddiction`, `skincare`, `foundation`, `eyeshadow`, `wedding`
*keto* —> `snacks`, `r/keto`, `r/progresspics`, `xxketo`, `keto recipes`, `keto diet`, `fasting`
*programming* —> `r/politics`, `r/programming`, `programming`, `python`, `coding`, `learnprogramming`, `r/golang`,
`r/programming`
*Cohen* —> `sacha baron cohen`, `sasha baron cohen`, `who is america`, `trump`, `jason spencer`, `sacha cohen`,
`sasha cohen`
*photography* —> `photo`, `r/Nikon`, `r/photography`, `camera`, `photos`, `art`, `r/bestof`, `instagram`
*blep* —> `mlem`
Future Relevance Work
What’s next
● Contextual Query Understanding
○ how context informs query understanding
● Understanding User Intent
○ classifying the query by its interpretation. The interpretation of the query can then be used to
define intent
● Query rewriting and scoping
○ query rewriting technique that improves precision by matching each query segment to the right
attribute
○ query tagging (special case of named-entity recognition (NER))
Infrastructure and Scaling
Reddit Search has an
interesting history...
History of Reddit Search
● 2005 - Steve Huffman, cofounder and now CEO, implements postgres tsearch.
● 2006 - Chris Slowe, founding engineer and now CTO, implements pylucene.
○ “we fixed a bug in the search results ordering” - Steve Huffman ‘06
○ “I made a quick fix to search that I hope helps until we get a chance to really fix it.” - Steve ‘07
● 2008 - David King, first employee and former search engineer, implements Solr.
○ “[David]’s been fixing search and hacking mystery projects in Erlang.” - Alexis Ohanian ‘08
○ “I’ve totally replaced the reddit search function.” - David King ‘08
● 2010 - David King replaces Solr with IndexTank.
○ “We launched a new search engine yesterday. Calm down. It’s okay. I know. You’ve been hurt
before.” - David King ‘10
● 2012 - u/kemitche implements CloudSearch after LinkedIn shut down IndexTank
“Q: Where do you see reddit in 10 years? A: Reddit search might work by then.” - Steve AMA ‘16
Redditors told us how
much they loved
Search...
“Reddit Search is great!” - said no redditor ever
“This image should honestly replace the 503 error (all servers busy) page.” - u/seven0feleven
“Ever since they moved away from scotch tape, I've been able to get irrelevant results in record time.” - u/El_Bandito_Blanquito
In 2017, we set out to
rebuild search from the
ground up!
Rebuilding Search
Our First Cluster
● Create an AMI with Solr and Fusion packages installed
● Spin up servers with custom AMI
● SSH into each server
○ Install Fusion and Solr
○ Edit configuration files
○ Increase file descriptor limit
● Configured in AWS US West
Our new cluster was up
and running well! We
immediately started work
on ingesting data and
relevance tuning.
But we ran into a
couple of key issues
when trying to scale
up...
Challenge #1
Issues with Scaling our Solr Cluster
● Adding capacity to our cluster or changing instance types took a lot of
effort
● Adding capacity to our cluster meant that we needed to rebalance it
so that our replicas were equally distributed across machines
○ Solr 7+ introduced some basic autoscaling features but lacked
policies to ensure a cluster was properly balanced
○ Rebalancing process was 100% manual
● Cross-region requests cost unnecessary latency
● As a result, our team was very cautious in scaling our cluster until it
was absolutely needed, to reduce the number of times we scaled up
Terraform and Puppet
everything!
Automate all the things!
Terraform + Puppet
● Together they allow us to programmatically make changes to
infrastructure and server configuration quickly
● We can describe how we want servers to be set up
○ Install Java and Solr
○ Mount drives and add user groups/permissions
○ Set up Solr configuration files
● Modifications to servers and infra are reviewable, and revertible
● Roll out changes across our fleet with ease
● “Can you add more servers Jerry??”
○ No problem! One line code change.
Terraforming Solr
Distributing Replicas in
Solr
Equally Distribute by Availability Zone
subreddits
● shard 1: replica 1 on solr-01 (us-east-1a); replica 2 on solr-02 (us-east-1b); replica 3 on solr-03 (us-east-1c)
● shard 2: replica 1 on solr-01 (us-east-1a); replica 2 on solr-02 (us-east-1b); replica 3 on solr-03 (us-east-1c)
● shard 3: replica 1 on solr-01 (us-east-1a); replica 2 on solr-02 (us-east-1b); replica 3 on solr-03 (us-east-1c)
No More Than 1 Replica From Same Shard
subreddits
● shard 1: replica 1 on solr-01 (us-east-1a); replica 2 on solr-02 (us-east-1b); replica 3 on solr-03 (us-east-1c)
● shard 2: replica 1 on solr-01 (us-east-1a); replica 2 on solr-02 (us-east-1b); replica 3 on solr-03 (us-east-1c)
● shard 3: replica 1 on solr-01 (us-east-1a); replica 2 on solr-02 (us-east-1b); replica 3 on solr-03 (us-east-1c)
Equally Distribute Collection’s Replicas
cluster
● solr-01 (us-east-1a): subreddits shard 1 replica 1; posts shard 1 replica 1; posts shard 2 replica 1
● solr-04 (us-east-1a): posts shard 3 replica 1
● solr-02 (us-east-1b): subreddits shard 1 replica 2; posts shard 1 replica 2; posts shard 2 replica 2
● solr-03 (us-east-1c): subreddits shard 1 replica 3; posts shard 1 replica 3; posts shard 2 replica 3
● solr-05 (us-east-1b): posts shard 3 replica 2
● solr-06 (us-east-1c): posts shard 3 replica 3
subreddits - 1 shard; posts - 3 shards; each shard has 3 replicas
Equally Distribute Cluster’s Replicas
cluster
● solr-01 (us-east-1a): subreddits shard 1 replica 1; posts shard 1 replica 1
● solr-04 (us-east-1a): posts shard 3 replica 1; posts shard 2 replica 1
● solr-02 (us-east-1b): subreddits shard 1 replica 2; posts shard 1 replica 2
● solr-03 (us-east-1c): subreddits shard 1 replica 3; posts shard 1 replica 3
● solr-05 (us-east-1b): posts shard 3 replica 2; posts shard 2 replica 2
● solr-06 (us-east-1c): posts shard 3 replica 3; posts shard 2 replica 3
subreddits - 1 shard; posts - 3 shards; each shard has 3 replicas
Solr Rebalancing Tool
● Applied balancing rules in order
○ Check each shard’s availability zone distribution and replica
distribution
○ Move replicas so that each collection’s replicas are spread across as
many machines as possible
○ Move replicas so that each machine holds as few replicas as
possible
● Outputs list of operations to be performed and confirms with user each
replica to move
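The following is a simplified sketch of how such a planner could work, not the actual tool. It only plans moves between nodes in the same availability zone (so the AZ spread is preserved) and never puts two replicas of one shard on the same node. The data layout, function name, and greedy strategy are assumptions; the sample placement mirrors a six-node cluster with one subreddits shard and three posts shards.

```python
def plan_rebalance(node_az, placement):
    """Plan replica moves until no same-AZ pair of nodes differs by 2+ replicas.

    node_az:   dict of node name -> availability zone
    placement: dict of node name -> list of (collection, shard) replicas
    Returns (moves, new_placement); each move is (replica, source, target).
    """
    current = {node: list(replicas) for node, replicas in placement.items()}
    moves = []

    def next_move():
        # Try the most loaded nodes as sources and the least loaded as targets.
        for src in sorted(current, key=lambda n: -len(current[n])):
            for dst in sorted(current, key=lambda n: len(current[n])):
                if node_az[src] != node_az[dst]:
                    continue  # stay inside the AZ so the AZ spread is preserved
                if len(current[src]) - len(current[dst]) < 2:
                    continue  # moving would not improve the balance
                for replica in reversed(current[src]):
                    # Never place two replicas of the same shard on one node.
                    if replica not in current[dst]:
                        return replica, src, dst
        return None

    while (move := next_move()) is not None:
        replica, src, dst = move
        current[src].remove(replica)
        current[dst].append(replica)
        moves.append(move)
    return moves, current


NODE_AZ = {
    "solr-01": "us-east-1a", "solr-02": "us-east-1b", "solr-03": "us-east-1c",
    "solr-04": "us-east-1a", "solr-05": "us-east-1b", "solr-06": "us-east-1c",
}
PLACEMENT = {
    "solr-01": [("subreddits", 1), ("posts", 1), ("posts", 2)],
    "solr-02": [("subreddits", 1), ("posts", 1), ("posts", 2)],
    "solr-03": [("subreddits", 1), ("posts", 1), ("posts", 2)],
    "solr-04": [("posts", 3)],
    "solr-05": [("posts", 3)],
    "solr-06": [("posts", 3)],
}
moves, balanced = plan_rebalance(NODE_AZ, PLACEMENT)
for replica, src, dst in moves:
    print(f"move {replica} from {src} to {dst}")
```

The sketch only prints the planned operations. An operator-facing tool would additionally ask for confirmation before carrying out each move.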
Search Architecture Today
Cross-Region Latency Improvement
4x faster
queries!
Our cluster was now
scaling easily, but
reindexing all of our
data took many
weeks...
Challenge #2
Indexing Data for Search
● Backfills
○ Pulls data from our datasource
○ Transforms it into the schema we need for indexing
○ Used to add/remove/change field indexing
● Streaming
○ Captures real-time updates so up-to-date information can be
reflected in our indices
○ Transforms data the same way as backfills
Why are fast backfills important?
● Quickly iterate on document schemas
● Test new ways to analyze document fields
● Create multiple clusters of the same data for testing
● Fix data issues rapidly
Thing Data Model
Hive
● Pulled data from postgres with sqoop into Hive
● A series of transformations to
○ Join thing and data tables
○ Rotate the keys into columns
○ Store the final result as Parquet in S3
● Fusion/Spark fetched S3 files and indexed data into Solr
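To make "rotate the keys into columns" concrete, here is a small in-memory sketch, assuming a thing table with one row per item and a data table holding one key/value row per attribute. The column names are hypothetical, and the original pipeline did this in Hive rather than Python.

```python
def rotate_keys_into_columns(things, data_rows):
    """Join thing rows with their key/value data rows, one output dict per thing."""
    docs = {thing["thing_id"]: dict(thing) for thing in things}
    for row in data_rows:
        doc = docs.get(row["thing_id"])
        if doc is not None:  # data rows for unknown things are skipped
            # Each key becomes a column on the joined document.
            doc[row["key"]] = row["value"]
    return list(docs.values())


# Hypothetical rows.
things = [{"thing_id": 1, "ups": 10}]
data_rows = [
    {"thing_id": 1, "key": "title", "value": "Hello"},
    {"thing_id": 1, "key": "subreddit", "value": "aww"},
    {"thing_id": 2, "key": "title", "value": "orphan"},
]
docs = rotate_keys_into_columns(things, data_rows)
print(docs)
```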
Issues with v1
● Several weeks to transform data
○ Afraid of changing the schema
● Many stages of transformation made it hard to debug and to figure out
how far upstream a data transformation issue originated
○ Hard to ensure the end result was correct
Thing Service
● Search Service as the transformer and indexer of data
○ Fetches the latest data from the Thing Service
● Special logic in Thing Service made it easier to handle postgres data
○ Score of links, comments
○ Converting to actual data types (booleans, fullnames)
● Cut backfill time from multiple weeks to a single week with
parallelization
Issues with v2
● Reliant upon a shared production service for what should be an offline
job
○ We’ve pushed the thing service too hard with our backfills,
affecting other services that rely upon it
● Other initiatives highlighted how slow our ingestion could get
○ HVTs (augmenting links with high value tokens from comments)
○ Attempts to index comment data
Spark
● Running our own postgres replicas from wal-e backups in S3
● Spark pulls data directly from postgres and transforms the data
● Can horizontally scale ingestion to be faster
○ Postgres to speed up ingestion of data into Spark
○ Spark to speed up transformation and joining of data
● We can adjust ingestion parallelism by repartitioning at the end
● Cut backfill time significantly from multiple weeks to days
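Horizontal scaling of a backfill usually comes down to splitting the source table so that several workers can each pull one slice. The sketch below shows one way to split an id range into contiguous slices; it is a generic illustration, not the Spark job itself, and the function name and parameters are assumptions.

```python
def id_range_partitions(min_id, max_id, num_partitions):
    """Split [min_id, max_id] into contiguous, non-overlapping inclusive ranges."""
    total = max_id - min_id + 1
    base, extra = divmod(total, num_partitions)
    ranges = []
    start = min_id
    for i in range(num_partitions):
        # The first `extra` partitions take one additional id each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break
        ranges.append((start, start + size - 1))
        start += size
    return ranges


partitions = id_range_partitions(1, 10, 3)
print(partitions)
```

Each range could then back one query of the form "id between start and end" against a replica, so adding replicas or workers raises ingestion throughput.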
Random 100% CPU
spikes prevented us
from shipping new
search features...
Challenge #3
Redditors Issue Expensive Queries
● High Recall Queries
○ the, would, you, ifs, news, games
● Crazy Queries
○ (AFD+OR+CDU+OR+CSU+OR+FDP+OR+Grünen+OR+SPD+OR+"
Die+Linke"+OR+Energiepolitik+OR+Gesetze~+OR+Kabinetts~+O
R+Regierungs~+OR+Referentenentwurf)+(Energiehandel~+OR+E
nergiemanagement~+OR+Energiepreis~+OR+Energiesteuer~)
● These queries would take multiple seconds to complete, blocking a
significant number of CPU cores in the cluster
Cutting Queries Off
● Utilize timeAllowed in solrconfig.xml to prevent expensive queries
taking up all of your cluster’s resources
○ NOTE: timeAllowed is not a hard cutoff. From the Solr docs:
○ As this check is periodically performed, the actual time for which a
request can be processed before it is aborted would be marginally
greater than or equal to the value of timeAllowed. If the request
consumes more time in other stages, e.g., custom components,
etc., this parameter is not expected to abort the request.
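Besides setting a default in solrconfig.xml, `timeAllowed` can also be passed per request, in milliseconds. The sketch below builds such a request URL and checks the `partialResults` flag that Solr sets in the response header when a query was cut off. The host, collection name, and helper names are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, time_allowed_ms=500):
    """Build a Solr /select URL that caps search time via timeAllowed (ms)."""
    params = {"q": query, "timeAllowed": time_allowed_ms, "wt": "json"}
    return f"{base_url}/{collection}/select?{urlencode(params)}"


def is_partial(response):
    """True if Solr flagged the response as cut short by timeAllowed."""
    return bool(response.get("responseHeader", {}).get("partialResults", False))


# Hypothetical local Solr base URL and collection name.
url = build_select_url("http://localhost:8983/solr", "posts", "the")
print(url)

# Hypothetical parsed JSON response from a query that hit the limit.
timed_out = {"responseHeader": {"status": 0, "partialResults": True}}
print(is_partial(timed_out))
```

Checking the flag lets the caller decide whether to show partial results or fall back to a cheaper query.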
Future Scalability Work
Multi-Cluster Solr Environment
● One cluster per collection
● Hardware Isolation: one collection’s issues won’t affect other
collections
● Scale each collection independently
● Balancing becomes really simple
○ Each machine has an equal number of replicas
○ Ensure AZ and shard awareness
Solr 7.5 Autoscaling
● Solr 7.5 includes new policies that allow us to equally distribute
replicas by
○ Arbitrary properties
○ Collection
○ Cluster
● Turn Solr Scaling into a one step process
Questions?
Thank you!
Anupama Joshi
anupama@reddit.com
linkedin.com/in/anupamajoshi
Jerry Bao
jerry.bao@reddit.com
linkedin.com/in/thejerrybao
PS: We’re Hiring!
reddit.com/jobs

 
How to unlock the secrets of effortless keyword research with ChatGPT.pptx
How to unlock the secrets of effortless keyword research with ChatGPT.pptxHow to unlock the secrets of effortless keyword research with ChatGPT.pptx
How to unlock the secrets of effortless keyword research with ChatGPT.pptx
 
The right path to making search relevant - Taxonomy Bootcamp London 2019
The right path to making search relevant  - Taxonomy Bootcamp London 2019The right path to making search relevant  - Taxonomy Bootcamp London 2019
The right path to making search relevant - Taxonomy Bootcamp London 2019
 
STQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
STQA-Vol9-Issue2-March-2012-Software-Testing-MagazineSTQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
STQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
 
How Google works
How Google worksHow Google works
How Google works
 

Making Reddit Search Relevant and Scalable - Anupama Joshi & Jerry Bao, Reddit

  • 1. Making Reddit Search Relevant and Scalable Anupama Joshi Senior Engineering Manager, Search Jerry Bao Senior Software Engineer, Search
  • 2. Agenda • What is Reddit? • Search Architecture • Improving our Relevance • The History of Search @ Reddit • Scaling our Infrastructure • Q&A
  • 3. What is Reddit? Reddit is a network of communities where individuals can find experiences built around their interests, hobbies and passions It’s where people converse about the things that are most important to them
  • 4. Bring community and belonging to everyone Our mission
  • 5. Reddit by the numbers ● Alexa Rank (US/World): 5th/18th ● MAU: 400M+ ● Communities: 1M+ ● Posts per day: 440K+ ● Comments per day: 3.5M+ ● Votes per day: 82M+ ● Searches per day: 68M+
  • 6. So, what are we doing with all that power?
  • 7. Dog getting love 51.2k points (95% upvoted) Cat Fist Bumping 137.1k points (90% upvoted) 817.2k views
  • 8. Wait, it’s not just cat/dog pictures!
  • 9. Community > Content > Individual ● Authenticity ● Creative freedom ● Empathy @ scale ● Belonging ● Being heard
  • 13. None of that matters if you can’t FIND the content! So let’s talk about Search...
  • 14. User Retention ● New users who searched are 300% more likely to come back between D1 and D14 ● > 50% of all mobile users search
  • 18. Show and Tell: A better subreddit search Challenge: Redditors are very creative in their subreddit naming (e.g. r/superbowl is about superb owl pictures), which, whilst fun, poses a challenge for discovery. Answer: faceted search on posts!
  • 19. Result: A better subreddit search
  • 20. Result: A better subreddit search
  • 21. Show and Tell: Better Post Search ● Post search with phrase matching of selftext The challenge: What about images and link posts? Answer - Comments ● Comments are important but which comments are most relevant to the post? ● How do we separate the signal from the noise? Answer - HVT ● HVTs are the highest scoring tf-idf terms from comment sections. ● Index and match on these HVTs along with post selftexts and titles.
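The HVT idea on the slide above can be sketched in a few lines of Python. This is a minimal illustration, not Reddit's pipeline: the function names, the tokenizer, the smoothed idf formula, and the choice to treat each post's whole comment section as one document are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into simple word tokens (hypothetical tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def high_value_terms(comments_by_post, top_k=3):
    """Return the top_k highest-scoring tf-idf terms per post.

    comments_by_post maps a post id to a list of comment strings.
    Each post's comment section is treated as one document (an assumption).
    """
    term_counts = {
        post_id: Counter(tok for comment in comments for tok in tokenize(comment))
        for post_id, comments in comments_by_post.items()
    }
    n_docs = len(term_counts)

    # Document frequency: in how many comment sections does each term appear?
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    hvts = {}
    for post_id, counts in term_counts.items():
        total = sum(counts.values()) or 1
        scores = {
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        # Sort by descending score, then alphabetically for a stable order.
        ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
        hvts[post_id] = [term for term, _ in ranked[:top_k]]
    return hvts


comments = {
    "p1": ["shabooya roll call", "shabooya shabooya"],
    "p2": ["nice roll call", "great picture"],
    "p3": ["great call"],
}
print(high_value_terms(comments, top_k=2))
```

The returned terms would then be indexed as an extra field next to the post title and selftext, so that an image or link post can match a query that appears only in its comments.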
  • 22. Result: Better Post Search Qualitatively, we saw some users notice almost immediately when we first introduced HVTs. For some queries, the difference is quite stark. The following are search results for the query ‘shabooya’. Note how ‘shabooya’ doesn’t appear anywhere in the title or the body of the first three post results, but you can see the phrase show up in the comments.
  • 23. Result: Better Post Search ● Post click-through rate (CTR) (+3.15%) ● Relevancy ranking for navigational searches (MRR) (+4.01%) ● Search experience improvements for navigational searches due to increased recall on posts with poor title or body text
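For readers unfamiliar with the two metrics reported above, here is a minimal sketch of how CTR and MRR are commonly computed from search logs. The slides do not say how Reddit defines them, so the record layout (`clicked_ranks`, a hypothetical field) and the conventions used (a search counts as a click-through if at least one result was clicked; the reciprocal rank uses the highest-placed clicked result) are assumptions.

```python
def click_through_rate(searches):
    """Fraction of searches in which at least one result was clicked."""
    if not searches:
        return 0.0
    clicked = sum(1 for search in searches if search["clicked_ranks"])
    return clicked / len(searches)


def mean_reciprocal_rank(searches):
    """Mean of 1/rank of the highest-placed clicked result (0 when nothing was clicked)."""
    if not searches:
        return 0.0
    total = 0.0
    for search in searches:
        ranks = search["clicked_ranks"]
        if ranks:
            total += 1.0 / min(ranks)
    return total / len(searches)


# Each record lists the 1-based result positions the user clicked.
log = [
    {"clicked_ranks": [1]},
    {"clicked_ranks": [2, 5]},
    {"clicked_ranks": []},
    {"clicked_ranks": [4]},
]
print(click_through_rate(log), mean_reciprocal_rank(log))
```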
  • 24. Take It to the Next Level: Improve Search Relevance ● Learn from users' click statistics to automatically generate a relevancy model ● Rerank search results using aggregated click-signal weights, so that results users click more often for a given query rank higher ○ Stream user events into the Solr/Fusion cluster ○ Spark jobs aggregate the click data ○ Use the aggregated signal output to boost the search results
  • 25. Result: Post search relevance using signals ● 7.5% increase in CTR ● 12.5% increase in MRR
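The aggregate-then-boost flow described above can be sketched in plain Python. This is only an illustration of the idea under stated assumptions: the talk uses Spark jobs and Solr/Fusion signals, whereas here the event format, the function names, and the multiplicative boost formula are all hypothetical.

```python
from collections import defaultdict


def aggregate_clicks(events):
    """Count clicks per (query, doc_id) pair from raw click events.

    events is an iterable of (query, doc_id) tuples; queries are normalized
    by trimming and lowercasing (an assumption).
    """
    weights = defaultdict(lambda: defaultdict(int))
    for query, doc_id in events:
        weights[query.strip().lower()][doc_id] += 1
    return weights


def rerank(query, results, weights, boost=0.1):
    """Reorder (doc_id, base_score) results using aggregated click counts.

    Each click multiplies the base score by an extra `boost` fraction;
    this formula is a placeholder, not the one used in production.
    """
    clicks = weights.get(query.strip().lower(), {})
    scored = [
        (base_score * (1.0 + boost * clicks.get(doc_id, 0)), doc_id)
        for doc_id, base_score in results
    ]
    return [doc_id for _, doc_id in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


events = [("cats", "t3_b"), ("cats", "t3_b"), ("Cats", "t3_b"), ("cats", "t3_a")]
results = [("t3_a", 1.0), ("t3_b", 0.9), ("t3_c", 0.8)]
signal_weights = aggregate_clicks(events)
print(rerank("cats", results, signal_weights))
```

A query with no recorded clicks falls back to the original ranking, which keeps the boost safe to apply to every request.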
  • 26. Result: Subreddit search relevance using signals
  • 27. Head-Tail Analysis ● Spelling corrections. ● Tail Query Rewriting. ● Specific Dictionary based Rewriting
  • 28. Head-Tail Analysis A tail query like “lot of credit card debit” would be rewritten to produce more relevant results.
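One simple way to picture tail-query rewriting is to map a rare query onto its closest frequent (head) query. The sketch below uses fuzzy string matching from Python's standard `difflib` module. The head-query list, the similarity cutoff, and the function name are assumptions for illustration; the slides do not describe the actual rewriting rules.

```python
import difflib


def rewrite_tail_query(query, head_queries, cutoff=0.6):
    """Return the closest head query, or the original query if none is close enough."""
    normalized = query.strip().lower()
    matches = difflib.get_close_matches(normalized, head_queries, n=1, cutoff=cutoff)
    return matches[0] if matches else normalized


# Hypothetical list of frequent queries mined from search logs.
head_queries = ["credit card debt", "student loans"]
print(rewrite_tail_query("lot of credit card debit", head_queries))
```

Matching whole queries keeps the example short. A production system would more likely combine spelling correction, dictionary lookups, and per-token rewriting, as the three bullets on the previous slide suggest.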
  • 29. Trending Searches ● Reddit can attribute week-over-week DAU growth to external events, like game releases, movie releases, and cultural events (reference). ● We see similar upticks in searches based on these events (reference). ● We believe that we can increase search engagement and time on site by leveraging these signals to highlight trending queries to users when they search on Reddit.
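A week-over-week uptick in searches can be detected with a simple growth ratio per query. The following is a minimal sketch assuming two flat lists of query strings (last week and this week). The add-one smoothing, the minimum-count threshold, and the function name are assumptions, not details from the talk.

```python
from collections import Counter


def trending_queries(last_week, this_week, min_count=3, top_k=3):
    """Rank this week's queries by growth over last week's volume."""
    previous = Counter(q.strip().lower() for q in last_week)
    current = Counter(q.strip().lower() for q in this_week)
    growth = {
        query: count / (previous[query] + 1)  # +1 avoids division by zero for new queries
        for query, count in current.items()
        if count >= min_count  # ignore queries too rare to be meaningful
    }
    ranked = sorted(growth.items(), key=lambda pair: (-pair[1], pair[0]))
    return [query for query, _ in ranked[:top_k]]


last_week = ["cats"] * 10 + ["infinity war"] * 1
this_week = ["cats"] * 11 + ["infinity war"] * 8 + ["blep"] * 2
print(trending_queries(last_week, this_week))
```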
  • 30. NSFW Categorization ● Develop NSFW classification criteria ● Query-time classification-based content filtering ● Results boosting/reordering based on classification (boost or filter results based on knowing whether the query has NSFW intent) ● Look at the NSFW results in recall ● Look at the NSFW results people clicked ● Try open source TensorFlow libraries for automatic detection of NSFW content that is not marked NSFW
  • 31. Related Searches ● Train a collaborative filtering matrix decomposition recommender using SparkML's Alternating Least Squares (ALS) to batch compute query-query similarities ● Related Searches backend based on collaborative filtering and a co-occurrence counting algorithm via temporal proximity ● Recommender systems based on collaborative filtering are a popular technique, applied for movie recommendations at Netflix or product recommendations on e-commerce sites like Amazon
  • 32. Related Searches ● Dynamic temporal buckets as source of data. ● All pairs irrespective of number of distinct queries in Session ● Length & temporal distance metrics to help with boosting recommendation. ● Intuitive & easily explainable. ● Scales extremely well for building pluggable logic & adding more dimensions.
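The co-occurrence approach described on the two slides above can be sketched as follows: every pair of distinct queries within a session is counted, and pairs issued closer together in time receive more weight. The session format, the five-minute window, and the weighting formula are assumptions for illustration. The query-length metric mentioned on the slide is not modeled here, and the ALS recommender is a separate technique not shown.

```python
from collections import defaultdict
from itertools import combinations


def related_searches(sessions, max_gap_seconds=300, top_k=3):
    """Score query pairs by time-weighted co-occurrence within sessions.

    sessions is a list of sessions; each session is a list of
    (timestamp_seconds, query) tuples.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for session in sessions:
        ordered = sorted(session)
        # All pairs in the session, regardless of how many queries it holds.
        for (t1, q1), (t2, q2) in combinations(ordered, 2):
            gap = t2 - t1
            if q1 == q2 or gap > max_gap_seconds:
                continue
            weight = 1.0 / (1.0 + gap / 60.0)  # closer in time counts more
            scores[q1][q2] += weight
            scores[q2][q1] += weight
    return {
        query: [
            other
            for other, _ in sorted(related.items(), key=lambda pair: (-pair[1], pair[0]))[:top_k]
        ]
        for query, related in scores.items()
    }


sessions = [
    [(0, "cats"), (30, "kittens"), (200, "aww")],
    [(0, "cats"), (10, "kittens")],
    [(0, "cats"), (1000, "keto")],
]
print(related_searches(sessions))
```

Because the scores are plain counts weighted by time, each recommendation can be traced back to the sessions that produced it, which fits the "intuitive and easily explainable" point on the slide.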
  • 33. Related Searches (*Query* -> *Related Searches*)
      *learn* -> `learn programming`, `learn python`, `learn javascript`, `learn French`, `learn java`, `piano learn`
      *cats* -> `cat`, `aww`, `dogs`, `r/cats`, `r/comics`, `kittens`, `funny`, `pets`
      *dogs* -> `dog`, `dogs`, `aww`, `isle of dogs`, `isle of dogs discussion`, `cute dogs`, `pets`
      *infinity war* -> `avengers`, `piracy`, `infinity war stream`, `infinity war hd`, `avengers infinity war`, `avengers infinity war stream`, `deadpool2`, `infinity war torrent`
      *coming out* -> `gay`, `lgbt`
      *makeup* -> `beauty`, `make up`, `makeupaddiction`, `skincare`, `foundation`, `eyeshadow`, `wedding`
      *keto* -> `snacks`, `r/keto`, `r/progresspics`, `xxketo`, `keto recipes`, `keto diet`, `fasting`
      *programming* -> `r/politics`, `r/programming`, `programming`, `python`, `coding`, `learnprogramming`, `r/golang`, `r/programming`
      *Cohen* -> `sacha baron cohen`, `sasha baron cohen`, `who is america`, `trump`, `jason spencer`, `sacha cohen`, `sasha cohen`
      *photography* -> `photo`, `r/Nikon`, `r/photography`, `camera`, `photos`, `art`, `r/bestof`, `instagram`
      *blep* -> `mlem`
  • 35. What’s next ● Contextual Query Understanding ○ how context informs query understanding ● Understanding User Intent ○ classifying the query by its interpretation. The interpretation of the query can then be used to define intent ● Query rewriting and scoping ○ query rewriting technique that improves precision by matching each query segment to the right attribute ○ query tagging (special case of named-entity recognition (NER))
  • 37. Reddit Search has an interesting history... History of Reddit Search
  • 38. History of Reddit Search ● 2005 - Steve Huffman, cofounder and now CEO, implements Postgres tsearch. ● 2006 - Chris Slowe, founding engineer and now CTO, implements PyLucene. ○ “we fixed a bug in the search results ordering” - Steve Huffman ‘06 ○ “I made a quick fix to search that I hope helps until we get a chance to really fix it.” - Steve ‘07 ● 2008 - David King, first employee and former search engineer, implements Solr. ○ “[David]’s been fixing search and hacking mystery projects in Erlang.” - Alexis Ohanian ‘08 ○ “I’ve totally replaced the reddit search function.” - David King ‘08 ● 2010 - David King replaces Solr with IndexTank. ○ “We launched a new search engine yesterday. Calm down. It’s okay. I know. You’ve been hurt before.” - David King ‘10 ● 2012 - u/kemitche implements CloudSearch after LinkedIn shut down IndexTank. ● “Q: Where do you see reddit in 10 years? A: Reddit search might work by then.” - Steve AMA ‘16
  • 39. Redditors told us how much they loved Search... “Reddit Search is great!” - said no redditor ever
  • 40. “This image should honestly replace the 503 error (all servers busy) page.” - u/seven0feleven
  • 41. “Ever since they moved away from scotch tape, I've been able to get irrelevant results in record time.” - u/El_Bandito_Blanquito
  • 42. In 2017, we set out to rebuild search from the ground up! Rebuilding Search
  • 43. Our First Cluster ● Create an AMI with Solr and Fusion packages installed ● Spin up servers with custom AMI ● SSH into each server ○ Install Fusion and Solr ○ Edit configuration files ○ Increase file descriptor limit ● Configured in AWS US West
  • 44. Our First Cluster Our new cluster was up and running well! We immediately started work on ingesting data and relevance tuning.
  • 45. But we ran into a couple of key issues when trying to scale up... Challenge #1
  • 46. Issues with Scaling our Solr Cluster ● Adding capacity to our cluster or changing instance types took a lot of effort ● Adding capacity to our cluster meant that we needed to rebalance it so that our replicas were equally distributed across machines ○ Solr 7+ introduced some basic autoscaling features but lacked policies to ensure a cluster was properly balanced ○ Rebalancing process was 100% manual ● Cross-region requests added unnecessary latency ● As a result, our team was very cautious about scaling our cluster until it was absolutely needed, to reduce the number of times we scaled up
  • 48. Terraform + Puppet ● Together they allow us to programmatically make changes to infrastructure and server configuration quickly ● We can describe how we want servers to be set up ○ Install Java and Solr ○ Mount drives and add user groups/permissions ○ Set up Solr configuration files ● Modifications to servers and infra are reviewable and revertible ● Roll out changes across our fleet with ease ● “Can you add more servers Jerry??” ○ No problem! One line code change.
  • 55. Equally Distribute by Availability Zone subreddits shard 1 replica 1 solr-01 us-east-1a replica 2 solr-02 us-east-1b replica 3 solr-03 us-east-1c shard 2 replica 1 solr-01 us-east-1a replica 2 solr-02 us-east-1b replica 3 solr-03 us-east-1c shard 3 replica 1 solr-01 us-east-1a replica 2 solr-02 us-east-1b replica 3 solr-03 us-east-1c
  • 56. No More Than 1 Replica From Same Shard subreddits shard 1 replica 1 solr-01 us-east-1a replica 2 solr-02 us-east-1b replica 3 solr-03 us-east-1c shard 2 replica 1 solr-01 us-east-1a replica 2 solr-02 us-east-1b replica 3 solr-03 us-east-1c shard 3 replica 1 solr-01 us-east-1a replica 2 solr-02 us-east-1b replica 3 solr-03 us-east-1c
  • 57. Equally Distribute Collection’s Replicas cluster solr-01 (us-east-1a) subreddits shard 1 replica 1 posts shard 1 replica 1 posts shard 2 replica 1 solr-04 (us-east-1a) posts shard 3 replica 1 solr-02 (us-east-1b) subreddits shard 1 replica 2 posts shard 1 replica 2 posts shard 2 replica 2 solr-03 (us-east-1c) subreddits shard 1 replica 3 posts shard 1 replica 3 posts shard 2 replica 3 solr-05 (us-east-1b) posts shard 3 replica 2 solr-06 (us-east-1c) posts shard 3 replica 3 subreddits - 1 shard; posts - 3 shards; each shard has 3 replicas
  • 58. Equally Distribute Cluster’s Replicas cluster solr-01 (us-east-1a) subreddits shard 1 replica 1 posts shard 1 replica 1 solr-04 (us-east-1a) posts shard 3 replica 1 posts shard 2 replica 1 solr-02 (us-east-1b) subreddits shard 1 replica 2 posts shard 1 replica 2 solr-03 (us-east-1c) subreddits shard 1 replica 3 posts shard 1 replica 3 solr-05 (us-east-1b) posts shard 3 replica 2 posts shard 2 replica 2 solr-06 (us-east-1c) posts shard 3 replica 3 posts shard 2 replica 3 subreddits - 1 shard; posts - 3 shards; each shard has 3 replicas
  • 59. Solr Rebalancing Tool ● Applies balancing rules in order ○ Check each shard’s availability zone distribution and replica distribution ○ Move replicas so that each collection’s replicas are spread across as many machines as possible ○ Move replicas so that each machine holds as few replicas as possible ● Outputs the list of operations to be performed and confirms each replica move with the user
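The balancing rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool from the talk: the `Replica` tuple, `shard_violations`, and `plan_moves` are hypothetical names, and the planner is a simple greedy loop that only moves replicas between machines in the same availability zone, so a shard's zone spread is left unchanged. It returns a list of proposed moves rather than calling Solr, mirroring the "output operations, then confirm" step.

```python
from collections import defaultdict
from typing import NamedTuple


class Replica(NamedTuple):
    collection: str
    shard: int
    replica: int
    node: str


def shard_violations(replicas, node_az):
    """Return (collection, shard) keys whose replicas share a node or a zone."""
    by_shard = defaultdict(list)
    for r in replicas:
        by_shard[(r.collection, r.shard)].append(r)
    bad = []
    for key, members in sorted(by_shard.items()):
        nodes = [m.node for m in members]
        zones = [node_az[n] for n in nodes]
        if len(set(nodes)) < len(nodes) or len(set(zones)) < len(zones):
            bad.append(key)
    return bad


def plan_moves(replicas, node_az):
    """Greedily even out replica counts between machines of the same zone.

    Returns a list of (replica, target_node) proposals; nothing is executed.
    """
    placement = list(replicas)
    moves = []
    for zone in sorted(set(node_az.values())):
        nodes = sorted(n for n, z in node_az.items() if z == zone)
        while True:
            load = {n: 0 for n in nodes}
            for r in placement:
                if r.node in load:
                    load[r.node] += 1
            fullest = max(nodes, key=lambda n: load[n])
            emptiest = min(nodes, key=lambda n: load[n])
            if load[fullest] - load[emptiest] <= 1:
                break  # this zone is already as even as it can be
            on_target = {(r.collection, r.shard)
                         for r in placement if r.node == emptiest}
            candidate = next(
                (r for r in placement
                 if r.node == fullest
                 and (r.collection, r.shard) not in on_target),
                None,
            )
            if candidate is None:
                break  # no move keeps one replica per shard per machine
            placement.remove(candidate)
            placement.append(candidate._replace(node=emptiest))
            moves.append((candidate, emptiest))
    return moves
```

Run against the unbalanced layout from the "Equally Distribute Collection’s Replicas" slide, a planner like this proposes one move per availability zone, from the machine holding three replicas to the machine holding one.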
  • 63. Our cluster was now scaling easily, but reindexing all of our data took many weeks... Challenge #2
  • 64. Indexing Data for Search ● Backfills ○ Pulls data from our datasource ○ Transforms it into the schema we need for indexing ○ Used to add/remove/change field indexing ● Streaming ○ Captures real-time updates so up-to-date information can be reflected in our indices ○ Transforms data the same way as backfills
  • 65. Why are fast backfills important? ● Quickly iterate on document schemas ● Test new ways to analyze document fields ● Create multiple clusters of the same data for testing ● Fix data issues rapidly
  • 67. Hive ● Pulled data from postgres with sqoop into Hive ● A series of transformations to ○ Join thing and data tables ○ Rotate the keys into columns ○ Store the final result as Parquet in S3 ● Fusion/Spark fetched S3 files and indexed data into Solr
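The "join thing and data tables, rotate the keys into columns" step is a pivot from key/value rows into one wide record per item. The talk did this in Hive; the plain-Python sketch below only illustrates the shape of the transformation. The column names (`thing_id`, `key`, `value`, `ups`, `downs`) and the function name `pivot_things` are assumptions made for the example.

```python
from collections import defaultdict


def pivot_things(thing_rows, data_rows):
    """Join thing rows with their key/value data rows, one dict per thing."""
    # Group the key/value pairs by the thing they belong to.
    attrs = defaultdict(dict)
    for row in data_rows:
        attrs[row["thing_id"]][row["key"]] = row["value"]

    # Merge each thing's fixed columns with its pivoted attributes.
    documents = []
    for thing in thing_rows:
        doc = dict(thing)
        doc.update(attrs.get(thing["thing_id"], {}))
        documents.append(doc)
    return documents
```

Each resulting dict corresponds to one flat row of the kind that could then be written out as Parquet and indexed.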
  • 69. Issues with v1 ● Several weeks to transform data ○ Afraid of changing the schema ● Many stages of transformation, making it hard to debug and figure out how far upstream data transformation issues were ○ Hard to ensure the end result was correct
  • 70. Thing Service ● Search Service as the transformer and indexer of data ○ Fetches the latest data from the Thing Service ● Special logic in Thing Service made it easier to handle postgres data ○ Score of links, comments ○ Converting to actual data types (booleans, fullnames) ● Cut backfill time from multiple weeks to a single week with parallelization
  • 72. Issues with v2 ● Reliant upon a shared production service for what should be an offline job ○ We’ve pushed the thing service too hard with our backfills, affecting other services that rely upon it ● Other initiatives highlighted how slow our ingestion could get ○ HVTs (augmenting links with high value tokens from comments) ○ Attempts to index comment data
  • 73. Spark ● Running our own postgres replicas from wal-e backups in S3 ● Spark pulls data directly from postgres and transforms the data ● Can horizontally scale ingestion to be faster ○ Scale Postgres to speed up ingestion of data into Spark ○ Scale Spark to speed up transformation and joining of data ● We can adjust ingestion parallelism by repartitioning at the end ● Cut backfill time significantly from multiple weeks to days
  • 74. Random 100% CPU spikes prevented us from shipping new search features... Challenge #3
  • 76. Redditors Issue Expensive Queries ● High Recall Queries ○ the, would, you, ifs, news, games ● Crazy Queries ○ (AFD+OR+CDU+OR+CSU+OR+FDP+OR+Grünen+OR+SPD+OR+"Die+Linke"+OR+Energiepolitik+OR+Gesetze~+OR+Kabinetts~+OR+Regierungs~+OR+Referentenentwurf)+(Energiehandel~+OR+Energiemanagement~+OR+Energiepreis~+OR+Energiesteuer~) ● These queries would take multiple seconds to complete, blocking a significant number of CPU cores in the cluster
  • 77. Cutting Queries Off ● Utilize timeAllowed in solrconfig.xml to prevent expensive queries from taking up all of your cluster’s resources ○ NOTE: timeAllowed is not a hard cutoff. From the Solr docs: ○ “As this check is periodically performed, the actual time for which a request can be processed before it is aborted would be marginally greater than or equal to the value of timeAllowed. If the request consumes more time in other stages, e.g., custom components, etc., this parameter is not expected to abort the request.”
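Besides setting a default in solrconfig.xml, `timeAllowed` can also be passed as a per-request query parameter, in milliseconds. The sketch below shows how a client might send it and detect a cut-off query. It relies on Solr marking such responses with `partialResults` in the response header; the host, the `posts` collection name, the 500 ms budget, and both helper function names are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, time_allowed_ms):
    """Build a /select URL that asks Solr to stop searching after a time budget."""
    params = {"q": query, "timeAllowed": time_allowed_ms, "wt": "json"}
    return f"{base_url}/{collection}/select?{urlencode(params)}"


def is_partial(response):
    """Return True when the parsed JSON response was cut off by timeAllowed."""
    return bool(response.get("responseHeader", {}).get("partialResults", False))


# Hypothetical host, collection, and budget, shown only as an example call.
url = build_select_url("http://localhost:8983/solr", "posts", "the", 500)
```

A caller that sees `is_partial(...)` return true can decide whether to show the incomplete results or fall back to an error message, instead of letting a high-recall query hold CPU cores for seconds.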
  • 80. Multi-Cluster Solr Environment ● One cluster per collection ● Hardware Isolation: one collection’s issues won’t affect other collections ● Scale each collection independently ● Balancing becomes really simple ○ Each machine has an equally distributed number of replicas ○ Ensure AZ and shard awareness
  • 81. Solr 7.5 Autoscaling ● Solr 7.5 includes new policies that allow us to equally distribute replicas by ○ Arbitrary properties ○ Collection ○ Cluster ● Turns Solr scaling into a one-step process
  • 83. Thank you! Anupama Joshi anupama@reddit.com linkedin.com/in/anupamajoshi Jerry Bao jerry.bao@reddit.com linkedin.com/in/thejerrybao PS: We’re Hiring! reddit.com/jobs