#LSNA17
SEO                        Relevance
Pages                      Liferay assets
Whole text is indexed      Key/value docs are indexed
Opaque ranking criteria    Scored queries, filters, field types
Reverse engineer           Fine tune
Third-party algorithms     Search engine that you control
GET /_search?explain
{
  "query" : {
    "term" : { "tag" : "LSNA17" }
  }
}

GET /index/type/0/_explain?q=user_id:2

"value" : 2.7051764,
"description" : "score(doc=0,freq=1.0), product of:",
"details" : [ {
  "value" : 0.66422296,
  "description" : "queryWeight, product of:",
  "details" : [ {
    "value" : 4.0726933,
    "description" : "idf(docFreq=4, maxDocs=108)"
  }, {
    "value" : 0.16309182,
    "description" : "queryNorm"
  } ]
}, {
  "value" : 4.0726933,
  "description" : "fieldWeight in 0, product of:",
  "details" : [ {
    "value" : 1.0,
    "description" : "tf(freq=1.0), with freq of:",
    "details" : [ {
      "value" : 1.0,
      "description" : "termFreq=1.0"
    } ]
  }, {
    "value" : 4.0726933,
    "description" : "idf(docFreq=4, maxDocs=108)"
  }, {
    "value" : 1.0,
    "description" : "fieldNorm(doc=0)"
  } ]
} ]

"failure to match filter: cache(user_id:[2 TO 2])"
query = apple eclipse zzz yyy xxx qqq kkk ttt rrr
2.345 doc1: apple banana
2.345 doc2: eclipse moon sun
16.415 doc3: zzz yyy xxx qqq kkk ttt rrr 111
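
The ranking above is consistent with a score that adds one contribution per matching query term, which is why a document full of junk terms can outrank the documents a person would expect. A minimal sketch, assuming (hypothetically) that every matching term contributes the same 2.345 and ignoring the normalization factors a real engine applies; the class and method names are made up for illustration:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class AdditiveScoreSketch {

    // Hypothetical per-term contribution, taken from the single-term matches above.
    static final double PER_TERM_SCORE = 2.345;

    // Adds one contribution for each distinct query term that also occurs in the document.
    static double score(String query, String doc) {
        Set<String> docTerms = new HashSet<>(Arrays.asList(doc.split("\\s+")));
        double score = 0.0;
        for (String term : new HashSet<>(Arrays.asList(query.split("\\s+")))) {
            if (docTerms.contains(term)) {
                score += PER_TERM_SCORE;
            }
        }
        return score;
    }

    public static void main(String[] args) {
        String query = "apple eclipse zzz yyy xxx qqq kkk ttt rrr";
        String[] docs = {"apple banana", "eclipse moon sun", "zzz yyy xxx qqq kkk ttt rrr 111"};
        for (String doc : docs) {
            System.out.println(String.format(Locale.ROOT, "%.3f %s", score(query, doc), doc));
        }
    }
}
```

The third document matches seven query terms, so it collects seven contributions even though none of its terms is meaningful.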
TF/IDF (Term Frequency/Inverse Document Frequency)

Term frequency
  In question form: How often does the term appear in the field?
  Score increases: when the term appears many times in the text.
Inverse document frequency
  In question form: How rare is the term in the whole index?
  Score increases: when the term is found in this document and not many others.
Field-length norm
  In question form: How short is the field where the term is?
  Score increases: when there isn't much else in the same field (like a title).
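
As a rough sketch of how these three factors combine, the following uses the formulas of Lucene's classic TF/IDF similarity: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), and field norm = 1 / sqrt(number of terms in the field). The class and method names are made up for illustration, and a real engine stores the field norm with reduced precision, so treat the numbers as approximate. With docFreq=4 and maxDocs=108 the idf comes out close to the 4.0726933 shown in the explain output earlier in the deck.

```java
final class ClassicTfIdfSketch {

    // Term frequency: grows with the square root of the number of occurrences.
    static double tf(double freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: terms found in fewer documents get a higher value.
    static double idf(long docFreq, long maxDocs) {
        return 1.0 + Math.log((double) maxDocs / (docFreq + 1));
    }

    // Field-length norm: fields with fewer terms get a higher value.
    static double fieldNorm(int termsInField) {
        return 1.0 / Math.sqrt(termsInField);
    }

    // Weight of one term in one field: the product of the three factors.
    static double fieldWeight(double freq, long docFreq, long maxDocs, int termsInField) {
        return tf(freq) * idf(docFreq, maxDocs) * fieldNorm(termsInField);
    }

    public static void main(String[] args) {
        // One occurrence, in a one-term field, of a term found in 4 of 108 documents.
        System.out.println(fieldWeight(1.0, 4, 108, 1));
    }
}
```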
{ "must" : { "bool" : { "should" : [ { "match" : { "content_en_US" : { "query" : "pigeon", "type" : "boolean" } } }, { "match" : {
"content_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } } ] } }, "should" : { "match" : { "content_en_US" : { "query" : "pigeon",
"type" : "phrase", "boost" : 2.0 } } } } }, { "bool" : { "must" : { "bool" : { "should" : [ { "match" : { "description_en_US" : { "query" : "pigeon",
"type" : "boolean" } } }, { "match" : { "description_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } } ] } }, "should" : { "match" : {
"description_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } }, { "bool" : { "must" : { "bool" : { "should" : [ { "match" : {
"entryClassPK" : { "query" : "pigeon", "type" : "boolean" } } }, { "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "phrase_prefix"
} } } ] } }, "should" : { "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } }, { "bool" : { "must" : { "bool" : {
"should" : [ { "match" : { "title_en_US" : { "query" : "pigeon", "type" : "boolean" } } }, { "match" : { "title_en_US" : { "query" : "pigeon",
"type" : "phrase_prefix" } } } ] } }, "should" : { "match" : { "title_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } }
● → FacetedSearcher →
● Indexer
● fields
● score
{ "match" : { "content_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } }
{ "match" : { "description_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } }
{ "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "boolean" } } }
Natural language?
string: text
● TF/IDF
● case insensitive
Score!

IDs and serials?
string: keyword
● not_analyzed
● case sensitive
● match | no match
No score!

Non-string data?
integer, date, geo_point...
● match | no match
No score!

(... "no score" is really a constant score of 1)
// IndexSettingsContributor
typeMappingsHelper.addTypeMappings(indexName, myCustomFieldMappings);

liferay-type-mappings.json

"content": {
    "index": "analyzed",
    "type": "string"
},
"organizationId": {
    "index": "not_analyzed",
    "type": "string"
},
"publishDate": {
    "format": "yyyyMMddHHmmss",
    "type": "date"
}
• Analyzed human searches
• query types
• combinations
• best relevance
Favor text fields over keyword fields.
"*ubstrin*"
• lowercase
• * → "full scan" ↓↓↓
• don't score
1. full text search
2. Prefix
3. n-grams
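
To illustrate option 3, here is a minimal sketch of character n-gram generation: if every n-gram of a value is indexed as its own term, a substring such as "ubstrin" becomes an ordinary term lookup instead of a wildcard scan. The class name and the fixed gram size are assumptions for illustration; in practice the search engine's n-gram tokenizer would do this work at index time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class NGramSketch {

    // Returns every lowercase character n-gram of the given size, in order of appearance.
    static List<String> ngrams(String value, int size) {
        String lower = value.toLowerCase(Locale.ROOT);
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + size <= lower.length(); i++) {
            grams.add(lower.substring(i, i + size));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Index time: "Substring" is stored as its 7-character grams.
        List<String> indexed = ngrams("Substring", 7);
        // Query time: the lowercased user input is looked up as a plain term.
        System.out.println(indexed + " contains ubstrin: " + indexed.contains("ubstrin"));
    }
}
```

The trade-off is a larger index in exchange for cheaper queries.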
• Match →
• Prefix →
• Phrase →
Know your field, use the right queries.
Write a field-specific query builder

@Component(service = FieldQueryBuilder.class, immediate = true)
public class MyFieldQueryBuilder implements FieldQueryBuilder {

    public Query build(String field, String keywords) {

Fine-tune the right queries for your field

        myBooleanQuery.add(q1, MUST);
        myBooleanQuery.add(q2, SHOULD);
        ...
Multilingual search (多言語検索)
• Map
• suffix →
• "b" "a" "d"
• Stemming, stopwords
(https://www.elastic.co/guide/en/elasticsearch/guide/current/using-language-analyzers.html)
Pick the right language analyzer.
document.addText("myField_ja_JP", japanese);
document.addText("myField_en_US", english);

Locale defaultLocale = portal.getSiteDefaultLocale(groupId);
document.addText(getLocalizedName("myField", defaultLocale), translation);

addSearchLocalizedTerm(searchQuery, searchContext, "myField");
searchContext.setLocale(themeDisplay.getLocale());

liferay-type-mappings.json

"template_ja": {
    "mapping": {
        "analyzer": "kuromoji"
    },
    "match": "\\w+_ja_[A-Z]{2}\\b"
}
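
The `match` value in the template is a regular expression over field names. A small sketch of which field names such a pattern would select, using plain java.util.regex rather than the search engine itself and assuming the pattern is `\w+_ja_[A-Z]{2}\b` applied to the whole field name; the class name and the sample field names are illustrative assumptions:

```java
import java.util.regex.Pattern;

final class LocalizedFieldPatternSketch {

    // Assumed pattern: word characters, then "_ja_", then a two-letter uppercase country code.
    static final Pattern JAPANESE_FIELD = Pattern.compile("\\w+_ja_[A-Z]{2}\\b");

    // True when the whole field name matches the Japanese template pattern.
    static boolean usesJapaneseAnalyzer(String fieldName) {
        return JAPANESE_FIELD.matcher(fieldName).matches();
    }

    public static void main(String[] args) {
        for (String field : new String[] {"myField_ja_JP", "myField_en_US", "myField"}) {
            System.out.println(field + " -> " + usesJapaneseAnalyzer(field));
        }
    }
}
```

Under these assumptions only fields carrying the `_ja_XX` suffix would be routed to the kuromoji analyzer, which is why the localized field name matters at index time.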
• description, content
• title, title_en_US
• content
2x matching query clauses = inflated relevance.
Match once and only once.
If already indexing once...
document.addText(getLocalizedName("myField", languageId), translation);
… no need to index twice...
// DON'T: document.addText("myField", content);
… match once and only once.
addSearchLocalizedTerm(searchQuery, searchContext, "myField");
// DON'T: addSearchTerm(searchQuery, searchContext, "myField");
• docs
• value
• display
• highlight
Index for rendering, render from doc.
analyzed
✔
✗
[30] Liferay
[15] DXP
[15] Symposium
not_analyzed
✔
✗
[15] Liferay DXP
[15] Liferay Symposium
• Aggregate not_analyzed
– [15] Liferay DXP
– [15] Liferay Symposium
• Match analyzed
2 fields, 1 analyzed, 1 not_analyzed.
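
The facet counts shown above can be reproduced with a small sketch: bucketing on an analyzed field counts individual tokens, while bucketing on a not_analyzed field counts whole values. It assumes 15 documents titled "Liferay DXP" and 15 titled "Liferay Symposium", and it splits on whitespace only (a real analyzer would typically also lowercase the tokens); the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FacetCountSketch {

    // Buckets as an analyzed field would: one bucket per whitespace-separated token.
    static Map<String, Integer> analyzedBuckets(List<String> values) {
        Map<String, Integer> buckets = new LinkedHashMap<>();
        for (String value : values) {
            for (String token : value.split("\\s+")) {
                buckets.merge(token, 1, Integer::sum);
            }
        }
        return buckets;
    }

    // Buckets as a not_analyzed field would: one bucket per whole value.
    static Map<String, Integer> notAnalyzedBuckets(List<String> values) {
        Map<String, Integer> buckets = new LinkedHashMap<>();
        for (String value : values) {
            buckets.merge(value, 1, Integer::sum);
        }
        return buckets;
    }

    // Builds the assumed sample data: 15 documents for each title.
    static List<String> sampleTitles() {
        List<String> titles = new ArrayList<>();
        for (int i = 0; i < 15; i++) {
            titles.add("Liferay DXP");
            titles.add("Liferay Symposium");
        }
        return titles;
    }

    public static void main(String[] args) {
        System.out.println(analyzedBuckets(sampleTitles()));
        System.out.println(notAnalyzedBuckets(sampleTitles()));
    }
}
```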
Search on the text field
new MatchQuery("myfield", keywords);
Aggregate on the keyword field
myFacet.setFieldName("myfield.raw");
• multifields
(https://www.elastic.co/guide/en/elasticsearch/guide/current/aggregations-and-analysis.html)
• Copy Fields
(https://wiki.apache.org/solr/SchemaXml#Copy_Fields)
• analyzed
• not_analyzed
• elasticsearch-head
• Solr Admin
• query string
• explain
Tweak clauses, re-run query, repeat.
Thank you!
And lots of relevant content at #LSNA17
Liferay Search: Best Practices to Dramatically Improve Relevance - Liferay Symposium North America 2017, Austin, USA

  • 3. #LSNA17 SEO Relevance Pages Liferay assets Whole text is indexed Key/value docs are indexed Opaque ranking criteria Scored queries, filters, field types Reverse engineer Fine tune Third party algorithms Search engine that you control
  • 4. #LSNA17
      GET /_search?explain
      {
        "query" : {
          "term" : { "tag" : "LSNA17" }
        }
      }

      GET /index/type/0/_explain?q=user_id:2

      "value" : 2.7051764,
      "description" : "score(doc=0,freq=1.0), product of:",
      "details" : [ {
        "value" : 0.66422296,
        "description" : "queryWeight, product of:",
        "details" : [ {
          "value" : 4.0726933,
          "description" : "idf(docFreq=4, maxDocs=108)"
        }, {
          "value" : 0.16309182,
          "description" : "queryNorm"
        } ]
      }, {
        "value" : 4.0726933,
        "description" : "fieldWeight in 0, product of:",
        "details" : [ {
          "value" : 1.0,
          "description" : "tf(freq=1.0), with freq of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "termFreq=1.0"
          } ]
        }, {
          "value" : 4.0726933,
          "description" : "idf(docFreq=4, maxDocs=108)"
        }, {
          "value" : 1.0,
          "description" : "fieldNorm(doc=0)"

      "failure to match filter: cache(user_id:[2 TO 2])"
  • 5. #LSNA17
      query = apple eclipse zzz yyy xxx qqq kkk ttt rrr
       2.345  doc1: apple banana
       2.345  doc2: eclipse moon sun
      16.415  doc3: zzz yyy xxx qqq kkk ttt rrr 111
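The figures on slide 5 are consistent with every matching term adding the same amount to the score (7 × 2.345 = 16.415), which is how a document full of junk terms can outrank the ones a human would want. A minimal sketch of that arithmetic follows; the class name `TermSumSketch` and the flat per-term contribution are assumptions for illustration, not the search engine's actual scoring code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class TermSumSketch {

    // Assumed flat contribution of one matching term, taken from the slide.
    static final double PER_TERM_SCORE = 2.345;

    // Counts how many distinct query terms also occur in the document text.
    static int matchedTerms(String query, String doc) {
        Set<String> queryTerms = new HashSet<>(
            Arrays.asList(query.toLowerCase(Locale.ROOT).split("\\s+")));
        Set<String> docTerms = new HashSet<>(
            Arrays.asList(doc.toLowerCase(Locale.ROOT).split("\\s+")));
        queryTerms.retainAll(docTerms);
        return queryTerms.size();
    }

    // Simplified score: every matching term adds the same amount.
    static double score(String query, String doc) {
        return matchedTerms(query, doc) * PER_TERM_SCORE;
    }

    public static void main(String[] args) {
        String query = "apple eclipse zzz yyy xxx qqq kkk ttt rrr";
        List<String> docs = Arrays.asList(
            "apple banana",
            "eclipse moon sun",
            "zzz yyy xxx qqq kkk ttt rrr 111");
        for (String doc : docs) {
            System.out.printf("%.3f  %s%n", score(query, doc), doc);
        }
    }
}
```

In this sketch doc3 matches seven of the nine query terms while doc1 and doc2 match one each, so doc3 scores seven times higher even though it is the least useful result.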
  • 6. #LSNA17 TF/IDF (Term Frequency/Inverse Document Frequency)
      Term frequency
        In question form: How often does the term appear in the field?
        Score increases: + when the term appears many times in the text
      Inverse document frequency
        In question form: How rare is the term in the whole index?
        Score increases: + when the term is found in this document and not many others
      Field-length norm
        In question form: How short is the field that contains the term?
        Score increases: + when there isn't much else in the same field (like a title)
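The three factors on slide 6 can be sketched with the formulas of Lucene's classic TF/IDF similarity: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), and a field-length norm of 1 / sqrt(number of terms). The class below is an illustrative stand-in, not Lucene's code; it works in doubles and ignores the lossy one-byte encoding Lucene applies to stored norms, so results agree with the explain output on slide 4 only to a few decimal places. The `queryNorm` value is copied from that slide rather than computed.

```java
final class ClassicTfIdfSketch {

    // Term frequency: grows with the number of occurrences, but sub-linearly.
    static double tf(double freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: rarer terms get a larger weight.
    static double idf(long docFreq, long maxDocs) {
        return 1.0 + Math.log((double) maxDocs / (docFreq + 1));
    }

    // Field-length norm: shorter fields get a larger weight.
    static double fieldNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // fieldWeight as shown in the explain output: tf * idf * fieldNorm.
    static double fieldWeight(double freq, long docFreq, long maxDocs, int numTerms) {
        return tf(freq) * idf(docFreq, maxDocs) * fieldNorm(numTerms);
    }

    public static void main(String[] args) {
        double queryNorm = 0.16309182; // copied from slide 4, not derived here
        double idf = idf(4, 108);
        double queryWeight = idf * queryNorm;
        double fieldWeight = fieldWeight(1.0, 4, 108, 1);
        System.out.println("idf         = " + idf);
        System.out.println("queryWeight = " + queryWeight);
        System.out.println("fieldWeight = " + fieldWeight);
        System.out.println("score       = " + queryWeight * fieldWeight);
    }
}
```

With docFreq=4 and maxDocs=108 the idf comes out at roughly 4.0727, and multiplying queryWeight by fieldWeight gives roughly 2.7052, in line with the values in the explain output.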
  • 7. #LSNA17
      { "must" : { "bool" : { "should" : [
          { "match" : { "content_en_US" : { "query" : "pigeon", "type" : "boolean" } } },
          { "match" : { "content_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } } ] } },
        "should" : { "match" : { "content_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } },
      { "bool" : {
        "must" : { "bool" : { "should" : [
          { "match" : { "description_en_US" : { "query" : "pigeon", "type" : "boolean" } } },
          { "match" : { "description_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } } ] } },
        "should" : { "match" : { "description_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } },
      { "bool" : {
        "must" : { "bool" : { "should" : [
          { "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "boolean" } } },
          { "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "phrase_prefix" } } } ] } },
        "should" : { "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } },
      { "bool" : {
        "must" : { "bool" : { "should" : [
          { "match" : { "title_en_US" : { "query" : "pigeon", "type" : "boolean" } } },
          { "match" : { "title_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } } ] } },
        "should" : { "match" : { "title_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } }
  • 8. #LSNA17
      ● → FacetedSearcher →
      ● Indexer
      ● fields
      ● score
      { "match" : { "content_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } }
      { "match" : { "description_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } }
      { "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "boolean" } } }
  • 9. #LSNA17
      Natural language? string: text
        ● TF/IDF
        ● case insensitive
        Score!
      IDs and Serials? string: keyword
        ● not_analyzed
        ● case sensitive
        ● match | no match
        No score!
      Non string data? integer, date, geo_point...
        ● match | no match
        No score!
      (... "no score" really a const = 1)
  • 10. #LSNA17
      // IndexSettingsContributor
      typeMappingsHelper.addTypeMappings(indexName, myCustomFieldMappings);

      liferay-type-mappings.json
      "content": { "index": "analyzed", "type": "string" },
      "organizationId": { "index": "not_analyzed", "type": "string" },
      "publishDate": { "format": "yyyyMMddHHmmss", "type": "date" }
  • 11. #LSNA17 • Analyzed human searches • query types • combinations • best relevance Favor text fields over keyword fields.
  • 12. #LSNA17 "*ubstrin*" • lowercase • * → "full scan" ↓↓↓ • don't score
  • 13. #LSNA17 1. full text search 2. Prefix 3. n-grams
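Slide 13 lists n-grams as one alternative to the "*ubstrin*" wildcard of slide 12. The idea is to index every short substring of a term up front, so a substring search becomes an ordinary term lookup that does not have to scan the index. The sketch below is a plain-Java illustration under that assumption; `NGramSketch` and its parameters are hypothetical names, and a real index would rely on the search engine's own n-gram tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class NGramSketch {

    // Returns all substrings of the lowercased text with length minGram..maxGram.
    static List<String> ngrams(String text, int minGram, int maxGram) {
        String s = text.toLowerCase(Locale.ROOT);
        List<String> grams = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            for (int i = 0; i + n <= s.length(); i++) {
                grams.add(s.substring(i, i + n));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        List<String> grams = ngrams("Substring", 3, 7);
        // The wildcard search *ubstrin* becomes a lookup of the term "ubstrin".
        System.out.println(grams.contains("ubstrin"));
        System.out.println(grams);
    }
}
```

The trade-off is index size: each indexed term produces many grams, so the gram length range is usually kept narrow.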
  • 14. #LSNA17 • Match → • Prefix → • Phrase → Know your field, use the right queries.
  • 15. #LSNA17 Write a field specific query builder
      @Component(service = FieldQueryBuilder.class, immediate = true)
      public class MyFieldQueryBuilder implements FieldQueryBuilder {
          public Query build(String field, String keywords) {
      Fine tune the right queries for your field
              myBooleanQuery.add(q1, MUST);
              myBooleanQuery.add(q2, SHOULD);
              ...
  • 16. #LSNA17 多言語検索 (multilingual search) • Map • suffix → • "b" "a" "d" • Stemming, stopwords (https://www.elastic.co/guide/en/elasticsearch/guide/current/using-language-analyzers.html) Pick the right language analyzer.
  • 17. #LSNA17
      document.addText("myField_ja_JP", japanese);
      document.addText("myField_en_US", english);

      Locale defaultLocale = portal.getSiteDefaultLocale(groupId);
      document.addText(getLocalizedName("myField", defaultLocale), translation);

      addSearchLocalizedTerm(searchQuery, searchContext, "myField");
      searchContext.setLocale(themeDisplay.getLocale());

      liferay-type-mappings.json
      "template_ja": { "mapping": { "analyzer": "kuromoji" }, "match": "\\w+_ja_[A-Z]{2}\\b" }
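Slide 17 relies on a naming convention: the locale is appended to the field name, and a dynamic template picks the analyzer from that suffix. The sketch below shows the convention with plain Java only. `localizedName` is a hypothetical stand-in for `getLocalizedName`, the pattern is written with `\w` and `\b` escapes on the assumption that the slide's regex lost its backslashes, and whole-name matching is also an assumption about how the template is applied.

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class LocalizedFieldSketch {

    // Assumed form of the template_ja "match" regex (backslashes restored).
    static final Pattern JA_TEMPLATE = Pattern.compile("\\w+_ja_[A-Z]{2}\\b");

    // Hypothetical helper: builds names such as myField_ja_JP or myField_en_US.
    static String localizedName(String field, Locale locale) {
        return field + "_" + locale.getLanguage() + "_" + locale.getCountry();
    }

    // True when the whole field name matches the Japanese template.
    static boolean usesJapaneseAnalyzer(String fieldName) {
        return JA_TEMPLATE.matcher(fieldName).matches();
    }

    public static void main(String[] args) {
        String ja = localizedName("myField", Locale.JAPAN);
        String en = localizedName("myField", Locale.US);
        System.out.println(ja + " -> " + usesJapaneseAnalyzer(ja));
        System.out.println(en + " -> " + usesJapaneseAnalyzer(en));
    }
}
```

Only the `_ja_JP` name matches, so only that field would be mapped to the `kuromoji` analyzer; other locales need their own templates.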
  • 18. #LSNA17 • description, content • title, title_en_US • content 2x matching query clauses = inflated relevance. Match once and only once.
  • 19. #LSNA17
      If already indexing once...
      document.addText(getLocalizedName("myField", languageId), translation);
      … no need to index twice...
      // DON'T //
      // document.addText("myField", content);
      … match once and only once.
      addSearchLocalizedTerm(searchQuery, searchContext, "myField");
      // DON'T //
      // addSearchTerm(searchQuery, searchContext, "myField");
  • 20. #LSNA17 • docs • value • display • highlight Index for rendering, render from doc.
  • 23. #LSNA17
      • Aggregate not_analyzed
          – [15] Liferay DXP
          – [15] Liferay Symposium
      • Match analyzed
          – 2 fields, 1 analyzed, 1 not_analyzed.
  • 24. #LSNA17
      Search on the text field
      new MatchQuery("myfield", keywords);
      Aggregate on the keyword field
      myFacet.setFieldName("myfield.raw");
  • 25. #LSNA17 • multifields (https://www.elastic.co/guide/en/elasticsearch/guide/current/aggregations-and-analysis.html) • Copy Fields (https://wiki.apache.org/solr/SchemaXml#Copy_Fields) • analyzed • not_analyzed
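Slides 23 to 25 recommend keeping two versions of the same value: an analyzed one for matching and a not_analyzed one for aggregating. The sketch below imitates both with plain collections to show why facets built on the analyzed version split values into single words. `MultiFieldSketch` and the whitespace/lowercase "analyzer" are simplifying assumptions, not the engine's behavior in detail.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class MultiFieldSketch {

    // Buckets on the analyzed field: one bucket per lowercased token,
    // counting each document at most once per token.
    static Map<String, Integer> aggregateAnalyzed(List<String> values) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : values) {
            Set<String> tokens = new HashSet<>(
                Arrays.asList(value.toLowerCase(Locale.ROOT).split("\\s+")));
            for (String token : tokens) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Buckets on the raw (not_analyzed) field: one bucket per exact value.
    static Map<String, Integer> aggregateRaw(List<String> values) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : values) {
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
            "Liferay DXP", "Liferay DXP", "Liferay Symposium");
        System.out.println(aggregateAnalyzed(titles)); // {dxp=2, liferay=3, symposium=1}
        System.out.println(aggregateRaw(titles));      // {Liferay DXP=2, Liferay Symposium=1}
    }
}
```

The raw buckets are the ones a facet should display, while the analyzed tokens are what lets a query for "liferay" match all three documents.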
  • 26. #LSNA17 • elasticsearch-head • Solr Admin • query string • explain Tweak clauses, re-run query, repeat.
  • 30. Thank you! And lots of relevant content at #LSNA17