Search engine.
Elasticsearch
Andriy S.
What is a Search Engine?
Search engine - a set of applications designed to search for information. It is
usually part of a larger search system.
The main quality criteria for a search engine are relevance (the degree to which
the results found match the request), completeness of the index, and support for
the morphology of the language.
Popular search engines: Sphinx, Solr, Elasticsearch, etc.
Elasticsearch
Elasticsearch - a search engine with a JSON REST API, built on Lucene and written in Java.
Apache Lucene - a free, open-source library for full-text search. It is implemented in
Java, supported by the Apache Software Foundation, and released under the
Apache License.
Client libraries: Java, C#, Python, JavaScript, PHP, Perl, Ruby
Requirements
When developing high-load websites or corporate systems, it is often hard to build
a fast and convenient search. In my opinion, the most important requirements for
such a service are:
◆ Speed
◆ Easy installation and configuration
◆ Price (preferably free and open source)
◆ JSON as the data exchange format (over HTTP)
◆ Real-time indexing
◆ Multi-tenancy (flexible settings for individual users)
Index
In familiar relational terms, an index is like a database and a type is like a table in it.
A document is a JSON document stored in Elasticsearch. It is like a row in a table in
a relational database. Each document is stored in an index and has a type and an ID. A
document is a JSON object (also known in other languages as a hash / HashMap / associative
array) that contains zero or more fields, or key-value pairs. The original JSON document
that is indexed is stored in the _source field, which is returned by default when getting or
searching for a document.
Analysis
Analysis is the process of converting text, like the body of any email, into tokens or
terms which are added to the inverted index for searching. Analysis is performed
by an analyzer which can be either a built-in analyzer or a custom analyzer
defined per index.
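To make the idea concrete, here is a minimal sketch of analysis feeding an inverted index, written in plain Python. It is a toy model and not Lucene's actual analyzer: the tokenizer (a regular expression), the stop-word list, and the sample documents are all assumptions made for this illustration.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "is"}


def analyze(text):
    """Split text into lowercase word tokens and drop stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical sample documents.
docs = {
    1: "The quick brown fox",
    2: "Quick answers to the search question",
}
inverted = build_inverted_index(docs)
print(analyze("The Quick brown FOX"))  # ['quick', 'brown', 'fox']
print(inverted["quick"])               # [1, 2]
```

A search for a term then becomes a dictionary lookup instead of a scan over every document, which is the reason the analyzed terms are stored in an inverted index.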
Elasticsearch Mapping
Mapping is the process of defining how a document, and the fields it contains, are
stored and indexed. For instance, use mappings to define:
◆ which string fields should be treated as full text fields.
◆ which fields contain numbers, dates, or geolocations.
◆ whether the values of all fields in the document should be indexed into the catch-all _all field.
◆ the format of date values.
◆ custom rules to control the mapping for dynamically added fields.
Elasticsearch Mapping
Each field has a data type which can be:
◆ a simple type like text, keyword, date, long, double, boolean or ip.
◆ a type which supports the hierarchical nature of JSON such as object or nested.
◆ or a specialised type like geo_point, geo_shape, or completion.
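As an illustration of the field types listed above, the sketch below builds an index-creation body as a plain Python dictionary. The field names (title, status, views, published, location) are hypothetical. The layout assumes the typeless mapping format of recent Elasticsearch versions; older versions nest "properties" under a type name, so adjust it to the version in use.

```python
import json


def build_mapping():
    """Return an index-creation body with explicit field mappings."""
    return {
        "mappings": {
            "properties": {
                "title": {"type": "text"},      # analyzed full-text field
                "status": {"type": "keyword"},  # exact-value field, not analyzed
                "views": {"type": "long"},
                "published": {"type": "date", "format": "yyyy-MM-dd"},
                "location": {"type": "geo_point"},
            }
        }
    }


body = build_mapping()
print(json.dumps(body, indent=2))
```

The dictionary would be sent as the JSON body when creating the index; nothing is sent here.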
Documents CRUD
Often, we use the terms object and document interchangeably. However, there is a distinction. An object
is just a JSON object—similar to what is known as a hash, hashmap, dictionary, or associative array.
Objects may contain other objects. In Elasticsearch, the term document has a specific meaning. It refers
to the top-level, or root object that is serialized into JSON and stored in Elasticsearch under a unique ID.
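The four CRUD operations on a document can be sketched as HTTP requests. The example below only builds the method, path, and body for each request and sends nothing. The index name and the document are hypothetical, and the paths assume the typeless endpoints of recent Elasticsearch versions; older versions include the type in the path instead of _doc.

```python
import json

INDEX = "articles"  # hypothetical index name


def crud_requests(doc_id, doc, changes):
    """Return (method, path, body) tuples for create, read, update, delete."""
    return [
        ("PUT", f"/{INDEX}/_doc/{doc_id}", doc),                   # create or replace
        ("GET", f"/{INDEX}/_doc/{doc_id}", None),                  # read
        ("POST", f"/{INDEX}/_update/{doc_id}", {"doc": changes}),  # partial update
        ("DELETE", f"/{INDEX}/_doc/{doc_id}", None),               # delete
    ]


requests_to_send = crud_requests(1, {"title": "Hello", "views": 0}, {"views": 1})
for method, path, body in requests_to_send:
    print(method, path, json.dumps(body) if body is not None else "")
```

The root object passed as the body of the PUT request is the document; the ID in the path is the unique ID it is stored under.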
Query and filter context
The behaviour of a query clause depends on whether it is used in query context or in filter context:
1. Query context
A query clause used in query context answers the question “How well does this document match this query clause?”
Besides deciding whether or not the document matches, the query clause also calculates a _score representing how well
the document matches, relative to other documents.
2. Filter context
In filter context, a query clause answers the question “Does this document match this query clause?” The answer is a
simple Yes or No — no scores are calculated. Filter context is mostly used for filtering structured data, e.g.
Search examples with DSL Builder
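A minimal sketch of a search body that uses both contexts in one bool query. It is written as a plain Python dictionary rather than with the DSL builder named in the slide title, whose API is not shown here. The field names and values are hypothetical.

```python
import json


def build_search(text, status, published_from):
    """Combine a scored full-text clause with non-scoring filters."""
    return {
        "query": {
            "bool": {
                # Query context: this clause contributes to _score.
                "must": [{"match": {"title": text}}],
                # Filter context: yes/no matching, no score is calculated.
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"published": {"gte": published_from}}},
                ],
            }
        }
    }


search_body = build_search("search engine", "published", "2015-01-01")
print(json.dumps(search_body, indent=2))
```

Putting the exact-value and range conditions under "filter" keeps them out of the relevance calculation, so only the match clause on title decides the ordering of the results.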
Search Response
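The sketch below shows one way to read a search response. The response itself is hand-written for illustration and is not real output; only the general shape (hits.total, hits.hits, _score, _source) follows what Elasticsearch returns. Depending on the version, hits.total is either an integer or an object with a "value" key, so the helper accepts both.

```python
# Hypothetical response, hand-written for illustration only.
sample_response = {
    "took": 3,
    "timed_out": False,
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": 1.2,
        "hits": [
            {"_index": "articles", "_id": "1", "_score": 1.2,
             "_source": {"title": "Search engine basics"}},
            {"_index": "articles", "_id": "7", "_score": 0.4,
             "_source": {"title": "Engine tuning"}},
        ],
    },
}


def total_hits(response):
    """Return the hit count for both the integer and the object form of hits.total."""
    total = response["hits"]["total"]
    return total["value"] if isinstance(total, dict) else total


def summarize(response):
    """Return (id, score, title) for each hit, in the order returned."""
    return [
        (hit["_id"], hit["_score"], hit["_source"]["title"])
        for hit in response["hits"]["hits"]
    ]


print(total_hits(sample_response))
for row in summarize(sample_response):
    print(row)
```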
Geolocation Filter
Elasticsearch offers two ways of representing geolocations: latitude-longitude points using the geo_point field type, and
complex shapes defined in GeoJSON, using the geo_shape field type.
Geo-points allow you to find points within a certain distance of another point, to calculate distances between two points for
sorting or relevance scoring, or to aggregate into a grid to display on a map. Geo-shapes, on the other hand, are used
purely for filtering. They can be used to decide whether two shapes overlap, or whether one shape completely contains
other shapes.
Four geo-point filters can be used to include or exclude documents by geolocation:
● geo_bounding_box
Find geo-points that fall within the specified rectangle.
● geo_distance
Find geo-points within the specified distance of a central point.
● geo_distance_range
Find geo-points within a specified minimum and maximum distance from a central point.
● geo_polygon
Find geo-points that fall within the specified polygon. This filter is very expensive. If you find yourself wanting to use
it, you should be looking at geo-shapes instead.
Geo Distance Filter Example
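A minimal sketch of a geo_distance filter, built as a plain Python dictionary. The field name "location" (assumed to be mapped as geo_point), the coordinates, and the distance are hypothetical values. The filter is placed in the filter clause of a bool query, which assumes a version of Elasticsearch that supports that layout.

```python
import json


def geo_distance_search(field, lat, lon, distance):
    """Build a search body that keeps documents within `distance` of a point."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,             # for example "1km"
                        field: {"lat": lat, "lon": lon},  # central point
                    }
                }
            }
        }
    }


geo_body = geo_distance_search("location", 40.715, -73.988, "1km")
print(json.dumps(geo_body, indent=2))
```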
Thank You!
Questions? Comments?
