1
Implementing Keyword Sort with Elasticsearch
Andrei Nicusan
Tech Lead – Intelligent Search
• Property Search facts and figures
• Prioritising search results by keywords
• A peek under the hood
• Lessons learnt
• What’s next
2
Agenda
3
Property Search facts and figures
• ~1.3M live properties
• ~29M searches per day
• 46 milliseconds median Elasticsearch response time (incl.
network round trip)
• 197 milliseconds 99th-percentile Elasticsearch response time
• 0.001% error rate (99.999% success rate)
• 3 hot clusters in 3 geographically-separated data centres
4
Property Search
5
Prioritising search results by keywords
• There’s a lot of text in a property description, so why not?
• To improve flexibility in the way people search
• Get some insights into what people search for, as a basis for
adding more “smarts” to our search offering
• Allow agents to target property features to users (search
engine optimisation for Rightmove)
11
Keyword sort – why?
12
A peek under the hood
13
A sample document
Premium listing
Summary
Key features
Description
15
Analysis - mapping
16
Analysis chain
“No pets allowed” gets indexed as “No ~~pets allowed”
Kudos to Zachary Tong - https://www.elastic.co/blog/quick-tips-negative-connotation-filter
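The effect described on this slide can be imitated outside Elasticsearch with a few lines of Python. This is only a sketch of the idea, not the actual analysis chain: the list of negation words and the helper name mark_negated are assumptions, and the real filter lives in the index settings. Presumably the point is that a search for "pets" no longer matches the marked token "~~pets".

```python
import re

# Hypothetical negation words; the list used in the real analysis chain
# is not shown on the slide.
NEGATIONS = ("no", "not")

_PATTERN = re.compile(r"\b(" + "|".join(NEGATIONS) + r")\s+(\w+)", re.IGNORECASE)


def mark_negated(text: str) -> str:
    """Prefix the word that follows a negation with '~~'."""
    return _PATTERN.sub(r"\1 ~~\2", text)


print(mark_negated("No pets allowed"))
```

Text without a negation word passes through unchanged, so only negated features are kept apart from positive mentions.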
17
Analysis chain
18
Analysis chain
19
Query analysis
20
Query analysis
21
Query
Single words
22
Query
Single words
23
Query
Single words
24
Query - fuzziness
• Fuzziness is expressed in terms of Damerau-Levenshtein distance, i.e.
the number of changes needed to transform string A into string B,
where a change could be:
• Adding a character: detached -> detatched
• Removing a character: detached -> detaced
• Replacing a character: detached -> detaches
• Swapping two neighbouring characters: detached -> deatched
• Fuzzy matching raises the biggest concerns around false positives
• With a fuzziness setting of 2, attached would match detached (two
replaced characters)
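The edit distance behind these examples can be checked with a small stand-alone function. The sketch below implements the restricted variant (optimal string alignment), chosen for brevity; it illustrates the metric and makes no claim about how Elasticsearch computes it internally. The function name osa_distance is an assumption.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: insertions, deletions,
    substitutions and swaps of two neighbouring characters each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # remove a character
                d[i][j - 1] + 1,         # add a character
                d[i - 1][j - 1] + cost,  # replace a character (or keep it)
            )
            # Swap of two neighbouring characters
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


for left, right in [("detached", "detatched"), ("detached", "detaced"),
                    ("detached", "deatched"), ("attached", "detached")]:
    print(left, right, osa_distance(left, right))
```

The first three pairs are one change apart, while attached and detached are two changes apart, which is why a fuzziness of 2 lets one match the other.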
25
Query
Single words
26
Query – constant scoring
• Default Elasticsearch relevance algorithm (BM25):
• Term frequency: “This house has a garden. The garden is big” scores
higher than “This house has a garden”
• Inverse document frequency: words that are rare across all documents
weigh more than common ones
• Field length norm: “This house has a garden” scores higher than “This
house has a garden and a driveway”
• Why it doesn’t work for us:
• We want the results sorted by number of keywords found, regardless of
how many times each one occurs
• Field length norm clashes with our preference for more detailed, more
complete descriptions
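One way to get "score = number of keywords found" is to wrap each keyword in a constant_score clause inside a bool query, so every matching keyword contributes exactly 1 no matter how often it occurs or how long the field is. The sketch below builds such a request body as a plain Python dict. It is an illustration under assumptions, not the query shown in the talk: the field name description and the helper name keyword_sort_query are hypothetical.

```python
from typing import Any


def keyword_sort_query(keywords: list[str], field: str = "description") -> dict[str, Any]:
    """Build a search body whose score is the number of keywords matched.

    Each keyword becomes a constant_score clause with boost 1.0, and the
    bool query adds up the scores of the should clauses that match.
    """
    should = [
        {"constant_score": {"filter": {"match": {field: keyword}}, "boost": 1.0}}
        for keyword in keywords
    ]
    return {"query": {"bool": {"should": should}}}


body = keyword_sort_query(["garden", "garage"])
print(body)
```

With this body, a property mentioning both garden and garage scores 2.0 and one mentioning only garden scores 1.0, regardless of term frequency or description length.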
27
Query
Single words
28
Query
Phrases Putting it all together
29
Query
Phrases Putting it all together
30
Query
Phrases Putting it all together
31
Lessons learnt
• Most queried words: garage, garden, pool, sea view, annexe (annex
is also 9th)
• ~105K keyword searches per day
• Users who apply keywords see 37 pages per session on average
(compared to 11 for those who don’t)
• 99ms median Elasticsearch response time (compared to 46ms
overall)
32
What we know
• False positives, e.g. acre/care
• Fuzzy matching
• It’s hard to track when fuzzy matching was “useful”
• Most enquiries/negative feedback are related to fuzzy matching
• No one sends you a thank-you note when they get the right
results despite misspelling a word
• Highlights on property cards using default highlighters
33
What went wrong
• We have a few alternatives to improve precision on fuzzy matching
• Go back to default fuzziness settings which skip the first few
characters of a word
• Demote results that only matched upon applying fuzziness
• Only run a fuzzy search when the query word is not an actual English
word. We’d need a third-party library to tell us what’s a legit English
word.
• Improve feedback when users click through to a property’s page
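Two of the alternatives above, demoting fuzzy-only matches and skipping fuzziness for real English words, can be sketched as a per-keyword clause builder. Everything named here is an assumption made for illustration: the field name description, the KNOWN_WORDS set standing in for a third-party dictionary, the 0.5 boost and the prefix_length of 1. The dis_max query takes the best-scoring sub-clause, so an exact match keeps a score of 1.0 while a match found only through fuzziness gets the lower boost.

```python
from typing import Any

# Hypothetical stand-in for a third-party English dictionary.
KNOWN_WORDS = {"garden", "garage", "pool", "acre", "care"}


def keyword_clause(word: str, field: str = "description",
                   fuzzy_boost: float = 0.5, prefix_length: int = 1) -> dict[str, Any]:
    """Exact-only clause for known words; otherwise exact plus demoted fuzzy."""
    exact = {"constant_score": {"filter": {"match": {field: word}}, "boost": 1.0}}
    if word.lower() in KNOWN_WORDS:
        return exact  # a real word: no fuzzy search at all
    fuzzy = {
        "constant_score": {
            "filter": {
                "match": {
                    field: {
                        "query": word,
                        "fuzziness": "AUTO",
                        # leading characters that must match exactly
                        "prefix_length": prefix_length,
                    }
                }
            },
            "boost": fuzzy_boost,
        }
    }
    return {"dis_max": {"queries": [exact, fuzzy]}}


print(keyword_clause("acre"))
print(keyword_clause("gardn"))
```

With these assumptions, acre is searched exactly and so can no longer match care, while the misspelt gardn still finds garden, ranked below properties that match a keyword exactly.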
34
What happens next
35
Q&A time
