SlideShare a Scribd company logo
1 of 54
Download to read offline
Lisa Jung
Developer Advocate @Elastic
Beginner’s Crash Course to Elastic Stack Series
Part 1.2: Understanding the Relevance of your search using
Elasticsearch and Kibana
Beginner’s crash course to the Elastic Stack Series
● Part 1.1: Intro to Elasticsearch and Kibana
○ use case of Elasticsearch and Kibana
○ the basic architecture of Elasticsearch
○ perform CRUD(Create, Read, Update, and Delete) operations with
Elasticsearch and Kibana
Missed the first workshop? No worries!
● Part 1.1: Intro to Elasticsearch and Kibana
○ Repo: https://ela.st/workshop-1-repo
The Elastic Stack
Reliably and securely take data from
any source, in any format, then search,
analyze, and visualize it in real time.
Elasticsearch
Store | Search | Analyze
Elastic is a search company.
We focus on value to users by producing fast results
that operate at scale and are relevant. This is our
DNA. We believe search is an experience. It is what
defines us, and makes us unique.
Scale,
Relevance
Relevance
How do we measure relevance?
● Precision
● Recall
Store | Search | Analyze
I store data as
documents!
Documents with similar traits
are grouped into an index!
Index
Document Document Document
Document Document
Document
When search query is sent, Elasticsearch retrieves relevant
documents and presents the documents as search results.
Index
Document Document Document
Document Document
Document
Search Results
These two diagrams depict the same thing!
Index
Index
Document Document Document
Document Document
Document
Index
True positives are relevant documents that are
returned to the user.
T
T
T
T
T True positives
Index
False positives are irrelevant documents that are
returned to the user.
T
T
T
T
T True positives
F
F False positives
Index
True negatives are irrelevant documents that are
not returned to the user.
T
T
T
T
T
T
T
F
F
True negatives
T
T
T
T
Index
False negatives are relevant documents that were
not returned to the user.
T
T
T
T
T
T
T
F
F
True negatives
False negatives
T
T
T
T
What is precision?
Precision =
True positives
True positives + False positives
What portion of the retrieved data is actually relevant to
the search query?
What is recall?
Recall =
True positives
True positives + False negatives
What portion of relevant data is being returned as search
results?
T
T
T
Precision and Recall are inversely related
Precision
I want all the
retrieved results to
be a perfect match to
the query, even if it
means returning less
documents.
I want to retrieve
more results even
if documents may
not be a perfect
match to the
query.
Precision Recall
Recall
Precision Recall
Precision and recall determine which documents are
included in the search results.
Precision and recall do not determine which of the returned
documents are more relevant compared to the other!
Ranking refers to ordering of the results (from most relevant
results at the top, to least relevant at the bottom).
Most Relevant
…
…
…
Less Relevant
…
…
…
Least Relevant
How to form good habits
(Highest Score)
(Lowest Score)
What is score?
● The score is a value that represents how relevant a document is to that
specific query
● A score is computed for each document that is a hit
What is score?
● Term Frequency(TF)
● Inverse Document Frequency(IDF)
What is score?
How to form good habits } Search Query
Hits
Most Relevant
…
…
…
Less Relevant
…
…
…
Least Relevant
(Highest Score)
(Lowest Score)
Term Frequency(TF) determines how many times each search
term appears in a document.
How to form good habits
TF= 4
TF= 1
If search terms are found in
high frequency in a
document, the document is
considered more relevant to
the search query.
Search Terms
} Search Query
What is Inverse Document Frequency(IDF)?
How to form good habits
How to form a meetup group
Hits
Good chicken recipe
How to form a band
Good times rolling
Good habits 101
Good habits are easy to master!
We may contain some
of the search terms
but we have nothing
to do with forming
good habits!
IDF diminishes the weight of
terms that occur very
frequently in the document
set and increases the weight
of terms that occur rarely!
Fine tuning precision or recall using
Elasticsearch and Kibana
Click on the link to the workshop repo.
https://ela.st/workshop-2-repo
Scroll down to the Resources section & open Free
Elastic Cloud Trial link in a new tab.
Scroll down to the Resources section & open Free
Elastic Cloud Trial link in a new tab.
Go to the email account you signed up with and
verify your email.
Open email from Elastic. Click on verify and accept
button.
Enter your password.
Click on start your free trial.
Select the Elastic Stack.
Configure your settings.
Name your deployment then create deployment.
Beginner’s Crash Course
Save your deployment credentials.
Open Kibana.
Beginner’s Crash Course
Beginner’s Crash Course
Click on Explore on my own option.
Click on Upload a file option
Download and Unzip News Category Dataset from Kaggle
Drag and drop a file you want to upload.
Kibana will give you an analysis of the first 1000 lines of
your data and give you a summary of your dataset.
Field section displays fields identified, high level statistics,
and top occuring values
Click on import button
Name your index and click on import.
Then Elasticsearch will do the rest!
news_headlines
Click on menu icon, and open Dev Tools.
Click on Explore on my own option.
Fine tuning precision or recall using
Elasticsearch and Kibana
Questions?
Join the Elastic Austin User Group for
updates on future workshops!
Lisa Jung
https://discuss.elastic.co/

More Related Content

Similar to Part 1.2 Understanding the relevance of your search with Elasticsearch and Kibana - Beginner's Crash Course to the Elastic Stack Series - .pdf

LGL Certification Training Guide
LGL Certification Training GuideLGL Certification Training Guide
LGL Certification Training Guide
Erin Shumaker
 

Similar to Part 1.2 Understanding the relevance of your search with Elasticsearch and Kibana - Beginner's Crash Course to the Elastic Stack Series - .pdf (20)

Intro to Elasticsearch and Kibana.pdf
Intro to Elasticsearch and Kibana.pdfIntro to Elasticsearch and Kibana.pdf
Intro to Elasticsearch and Kibana.pdf
 
Real-time Recommendations for Retail: Architecture, Algorithms, and Design
Real-time Recommendations for Retail: Architecture, Algorithms, and DesignReal-time Recommendations for Retail: Architecture, Algorithms, and Design
Real-time Recommendations for Retail: Architecture, Algorithms, and Design
 
Everything You Wish You Knew About Search
Everything You Wish You Knew About SearchEverything You Wish You Knew About Search
Everything You Wish You Knew About Search
 
LGL Certification Training Guide
LGL Certification Training GuideLGL Certification Training Guide
LGL Certification Training Guide
 
Writing Quality Code
Writing Quality CodeWriting Quality Code
Writing Quality Code
 
Machine Learning Product Managers Meetup Event
Machine Learning Product Managers Meetup EventMachine Learning Product Managers Meetup Event
Machine Learning Product Managers Meetup Event
 
Agile Software Architecture
Agile Software ArchitectureAgile Software Architecture
Agile Software Architecture
 
Navigating the Mess of a Shared drive Migration to SharePoint
Navigating the Mess of a Shared drive Migration to SharePointNavigating the Mess of a Shared drive Migration to SharePoint
Navigating the Mess of a Shared drive Migration to SharePoint
 
Natural language processing and search
Natural language processing and searchNatural language processing and search
Natural language processing and search
 
Elegant and Efficient Database Design
Elegant and Efficient Database DesignElegant and Efficient Database Design
Elegant and Efficient Database Design
 
Metric Abuse: Frequently Misused Metrics in Oracle
Metric Abuse: Frequently Misused Metrics in OracleMetric Abuse: Frequently Misused Metrics in Oracle
Metric Abuse: Frequently Misused Metrics in Oracle
 
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and VocabulariesHaystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
 
DevOps Paradox: Going Faster Brings Higher Quality, Lower Costs, & Better Out...
DevOps Paradox: Going Faster Brings Higher Quality, Lower Costs, & Better Out...DevOps Paradox: Going Faster Brings Higher Quality, Lower Costs, & Better Out...
DevOps Paradox: Going Faster Brings Higher Quality, Lower Costs, & Better Out...
 
Developing in R - the contextual Multi-Armed Bandit edition
Developing in R - the contextual Multi-Armed Bandit editionDeveloping in R - the contextual Multi-Armed Bandit edition
Developing in R - the contextual Multi-Armed Bandit edition
 
Scrum and kanban in the enterprise webinar
Scrum and kanban in the enterprise   webinarScrum and kanban in the enterprise   webinar
Scrum and kanban in the enterprise webinar
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
Business analyst
Business analystBusiness analyst
Business analyst
 
Barga Data Science lecture 9
Barga Data Science lecture 9Barga Data Science lecture 9
Barga Data Science lecture 9
 
IOT313_AWS IoT and Machine Learning for Building Predictive Applications with...
IOT313_AWS IoT and Machine Learning for Building Predictive Applications with...IOT313_AWS IoT and Machine Learning for Building Predictive Applications with...
IOT313_AWS IoT and Machine Learning for Building Predictive Applications with...
 
Get full visibility and find hidden security issues
Get full visibility and find hidden security issuesGet full visibility and find hidden security issues
Get full visibility and find hidden security issues
 

Recently uploaded

unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabiunwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
Abortion pills in Kuwait Cytotec pills in Kuwait
 
Mckinsey foundation level Handbook for Viewing
Mckinsey foundation level Handbook for ViewingMckinsey foundation level Handbook for Viewing
Mckinsey foundation level Handbook for Viewing
Nauman Safdar
 
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan CytotecJual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
ZurliaSoop
 
!~+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUD...
!~+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUD...!~+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUD...
!~+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUD...
DUBAI (+971)581248768 BUY ABORTION PILLS IN ABU dhabi...Qatar
 
The Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai Kuwait
The Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai KuwaitThe Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai Kuwait
The Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai Kuwait
daisycvs
 
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al MizharAl Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
allensay1
 

Recently uploaded (20)

Marel Q1 2024 Investor Presentation from May 8, 2024
Marel Q1 2024 Investor Presentation from May 8, 2024Marel Q1 2024 Investor Presentation from May 8, 2024
Marel Q1 2024 Investor Presentation from May 8, 2024
 
Paradip CALL GIRL❤7091819311❤CALL GIRLS IN ESCORT SERVICE WE ARE PROVIDING
Paradip CALL GIRL❤7091819311❤CALL GIRLS IN ESCORT SERVICE WE ARE PROVIDINGParadip CALL GIRL❤7091819311❤CALL GIRLS IN ESCORT SERVICE WE ARE PROVIDING
Paradip CALL GIRL❤7091819311❤CALL GIRLS IN ESCORT SERVICE WE ARE PROVIDING
 
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabiunwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
 
TVB_The Vietnam Believer Newsletter_May 6th, 2024_ENVol. 006.pdf
TVB_The Vietnam Believer Newsletter_May 6th, 2024_ENVol. 006.pdfTVB_The Vietnam Believer Newsletter_May 6th, 2024_ENVol. 006.pdf
TVB_The Vietnam Believer Newsletter_May 6th, 2024_ENVol. 006.pdf
 
Putting the SPARK into Virtual Training.pptx
Putting the SPARK into Virtual Training.pptxPutting the SPARK into Virtual Training.pptx
Putting the SPARK into Virtual Training.pptx
 
Arti Languages Pre Seed Teaser Deck 2024.pdf
Arti Languages Pre Seed Teaser Deck 2024.pdfArti Languages Pre Seed Teaser Deck 2024.pdf
Arti Languages Pre Seed Teaser Deck 2024.pdf
 
Falcon Invoice Discounting: The best investment platform in india for investors
Falcon Invoice Discounting: The best investment platform in india for investorsFalcon Invoice Discounting: The best investment platform in india for investors
Falcon Invoice Discounting: The best investment platform in india for investors
 
Cracking the 'Career Pathing' Slideshare
Cracking the 'Career Pathing' SlideshareCracking the 'Career Pathing' Slideshare
Cracking the 'Career Pathing' Slideshare
 
Over the Top (OTT) Market Size & Growth Outlook 2024-2030
Over the Top (OTT) Market Size & Growth Outlook 2024-2030Over the Top (OTT) Market Size & Growth Outlook 2024-2030
Over the Top (OTT) Market Size & Growth Outlook 2024-2030
 
Organizational Transformation Lead with Culture
Organizational Transformation Lead with CultureOrganizational Transformation Lead with Culture
Organizational Transformation Lead with Culture
 
Mckinsey foundation level Handbook for Viewing
Mckinsey foundation level Handbook for ViewingMckinsey foundation level Handbook for Viewing
Mckinsey foundation level Handbook for Viewing
 
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan CytotecJual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
 
joint cost.pptx COST ACCOUNTING Sixteenth Edition ...
joint cost.pptx  COST ACCOUNTING  Sixteenth Edition                          ...joint cost.pptx  COST ACCOUNTING  Sixteenth Edition                          ...
joint cost.pptx COST ACCOUNTING Sixteenth Edition ...
 
!~+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUD...
!~+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUD...!~+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUD...
!~+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUD...
 
The Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai Kuwait
The Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai KuwaitThe Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai Kuwait
The Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai Kuwait
 
Horngren’s Cost Accounting A Managerial Emphasis, Canadian 9th edition soluti...
Horngren’s Cost Accounting A Managerial Emphasis, Canadian 9th edition soluti...Horngren’s Cost Accounting A Managerial Emphasis, Canadian 9th edition soluti...
Horngren’s Cost Accounting A Managerial Emphasis, Canadian 9th edition soluti...
 
Falcon Invoice Discounting: Tailored Financial Wings
Falcon Invoice Discounting: Tailored Financial WingsFalcon Invoice Discounting: Tailored Financial Wings
Falcon Invoice Discounting: Tailored Financial Wings
 
CROSS CULTURAL NEGOTIATION BY PANMISEM NS
CROSS CULTURAL NEGOTIATION BY PANMISEM NSCROSS CULTURAL NEGOTIATION BY PANMISEM NS
CROSS CULTURAL NEGOTIATION BY PANMISEM NS
 
Falcon's Invoice Discounting: Your Path to Prosperity
Falcon's Invoice Discounting: Your Path to ProsperityFalcon's Invoice Discounting: Your Path to Prosperity
Falcon's Invoice Discounting: Your Path to Prosperity
 
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al MizharAl Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
 

Part 1.2 Understanding the relevance of your search with Elasticsearch and Kibana - Beginner's Crash Course to the Elastic Stack Series - .pdf

  • 1. Lisa Jung Developer Advocate @Elastic Beginner’s Crash Course to Elastic Stack Series Part 1.2: Understanding the Relevance of your search using Elasticsearch and Kibana
  • 2. Beginner’s crash course to the Elastic Stack Series ● Part 1.1: Intro to Elasticsearch and Kibana ○ use case of Elasticsearch and Kibana ○ the basic architecture of Elasticsearch ○ perform CRUD(Create, Read, Update, and Delete) operations with Elasticsearch and Kibana
  • 3. Missed the first workshop? No worries! ● Part 1.1: Intro to Elasticsearch and Kibana ○ Repo: https://ela.st/workshop-1-repo
  • 4. The Elastic Stack Reliably and securely take data from any source, in any format, then search, analyze, and visualize it in real time.
  • 6.
  • 7. Elastic is a search company. We focus on value to users by producing fast results that operate at scale and are relevant. This is our DNA. We believe search is an experience. It is what defines us, and makes us unique. Scale, Relevance
  • 9. How do we measure relevance? ● Precision ● Recall
  • 10. Store | Search | Analyze I store data as documents! Documents with similar traits are grouped into an index! Index Document Document Document Document Document Document
  • 11. When search query is sent, Elasticsearch retrieves relevant documents and presents the documents as search results. Index Document Document Document Document Document Document Search Results
  • 12. These two diagrams depict the same thing! Index Index Document Document Document Document Document Document
  • 13. Index True positives are relevant documents that are returned to the user. T T T T T True positives
  • 14. Index False positives are irrelevant documents that are returned to the user. T T T T T True positives F F False positives
  • 15. Index True negatives are irrelevant documents that are not returned to the user. T T T T T T T F F True negatives T T T T
  • 16. Index False negatives are relevant documents that were not returned to the user. T T T T T T T F F True negatives False negatives T T T T
  • 17. What is precision? Precision = True positives True positives + False positives What portion of the retrieved data is actually relevant to the search query?
  • 18. What is recall? Recall = True positives True positives + False negatives What portion of relevant data is being returned as search results? T T T
  • 19. Precision and Recall are inversely related Precision I want all the retrieved results to be a perfect match to the query, even if it means returning less documents. I want to retrieve more results even if documents may not be a perfect match to the query. Precision Recall Recall Precision Recall
  • 20. Precision and recall determine which documents are included in the search results. Precision and recall do not determine which of the returned documents are more relevant compared to the other!
  • 21. Ranking refers to ordering of the results (from most relevant results at the top, to least relevant at the bottom). Most Relevant … … … Less Relevant … … … Least Relevant How to form good habits (Highest Score) (Lowest Score)
  • 22. What is score? ● The score is a value that represents how relevant a document is to that specific query ● A score is computed for each document that is a hit
  • 23. What is score? ● Term Frequency(TF) ● Inverse Document Frequency(IDF)
  • 24. What is score? How to form good habits } Search Query Hits Most Relevant … … … Less Relevant … … … Least Relevant (Highest Score) (Lowest Score)
  • 25. Term Frequency(TF) determines how many times each search term appears in a document. How to form good habits TF= 4 TF= 1 If search terms are found in high frequency in a document, the document is considered more relevant to the search query. Search Terms } Search Query
  • 26. What is Inverse Document Frequency(IDF)? How to form good habits How to form a meetup group Hits Good chicken recipe How to form a band Good times rolling Good habits 101 Good habits are easy to master! We may contain some of the search terms but we have nothing to do with forming good habits! IDF diminishes the weight of terms that occur very frequently in the document set and increases the weight of terms that occur rarely!
  • 27. Fine tuning precision or recall using Elasticsearch and Kibana
  • 28. Click on the link to the workshop repo. https://ela.st/workshop-2-repo
  • 29. Scroll down to the Resources section & open Free Elastic Cloud Trial link in a new tab.
  • 30. Scroll down to the Resources section & open Free Elastic Cloud Trial link in a new tab.
  • 31. Go to the email account you signed up with and verify your email.
  • 32. Open email from Elastic. Click on verify and accept button.
  • 34. Click on start your free trial.
  • 37. Name your deployment then create deployment. Beginner’s Crash Course
  • 38. Save your deployment credentials.
  • 39. Open Kibana. Beginner’s Crash Course Beginner’s Crash Course
  • 40. Click on Explore on my own option.
  • 41. Click on Upload a file option
  • 42. Download and Unzip News Category Dataset from Kaggle
  • 43. Drag and drop a file you want to upload.
  • 44. Kibana will give you an analysis of the first 1000 lines of your data and give you a summary of your dataset.
  • 45. Field section displays fields identified, high level statistics, and top occuring values
  • 46. Click on import button
  • 47. Name your index and click on import.
  • 48. Then Elasticsearch will do the rest! news_headlines
  • 49. Click on menu icon, and open Dev Tools.
  • 50. Click on Explore on my own option.
  • 51. Fine tuning precision or recall using Elasticsearch and Kibana
  • 53. Join the Elastic Austin User Group for updates on future workshops!