Harnessing The Power of Search
André Ricardo Barreto de Oliveira ("Arbo")
Software Engineer - Team Lead - Search
Darmstadt, Germany
7 October, 2015
What's Search and why is it so cool?
The dawn of Search
Searching higher
Search and the Digital Experience
Understanding Search
Inside the Search Engine
The Index → Documents → Fields
Not that different from ye olde database?...
Indexing documents
PUT /megacorp/employee/1
{
  "first_name" : "John",
  "last_name" : "Smith",
  "age" : 25,
  "about" : "I love to go rock climbing",
  "interests": [ "sports", "music" ]
}
PUT /megacorp/employee/2
{
  "first_name" : "Jane",
  "last_name" : "Smith",
  "age" : 32,
  "about" : "I like to collect rock albums",
  "interests": [ "music" ]
}
PUT /megacorp/employee/3
{
  "first_name" : "Douglas",
  "last_name" : "Fir",
  "age" : 35,
  "about": "I like to build cabinets",
  "interests": [ "forestry" ]
}
Queries and Filters
GET /megacorp/employee/_search?q=last_name:Smith
"hits": [
  {
    "_source": {
      "first_name": "John",
      "last_name": "Smith",
      "age": 25,
      "about": "I love to go rock climbing",
      "interests": [ "sports", "music" ]
    }
  },
  {
    "_source": {
      "first_name": "Jane",
      "last_name": "Smith",
      "age": 32,
      "about": "I like to collect rock albums",
      "interests": [ "music" ]
    }
  }
]
GET /megacorp/employee/_search
{
  "query" : {
    "filtered" : {
      "filter" : {
        "range" : {
          "age" : { "gt" : 21 }
        }
      },
      "query" : {
        "match" : {
          "last_name" : "smith"
        }
      }
    }
  }
}
Full-Text Search
GET /megacorp/employee/_search
{
  "query" : {
    "match" : {
      "about" : "rock climbing"
    }
  }
}
"hits": [
  {
    "_score": 0.16273327,
    "_source": {
      "first_name": "John",
      "last_name": "Smith",
      "age": 25,
      "about": "I love to go rock climbing",
      "interests": [ "sports", "music" ]
    }
  },
  {
    "_score": 0.016878016,
    "_source": {
      "first_name": "Jane",
      "last_name": "Smith",
      "age": 32,
      "about": "I like to collect rock albums",
      "interests": [ "music" ]
    }
  }
]
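Why does John score higher than Jane for "rock climbing"? A rough sketch, assuming a toy score that simply counts how many distinct query terms appear in the "about" field. This is not Lucene's scoring formula (the real scores above come from the engine); the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy relevance ranking: score = number of distinct query terms found in the field.
// NOT the engine's real scoring; it only illustrates the ordering of the hits.
class MatchQuerySketch {

    // Lowercase the text and split on anything that is not a letter or digit.
    static Set<String> terms(String text) {
        Set<String> result = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    // Returns the names of matching documents, best match first.
    static List<String> search(Map<String, String> aboutByName, String query) {
        Set<String> queryTerms = terms(query);
        Map<String, Integer> scores = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : aboutByName.entrySet()) {
            Set<String> overlap = terms(entry.getValue());
            overlap.retainAll(queryTerms);
            if (!overlap.isEmpty()) {
                scores.put(entry.getKey(), overlap.size());
            }
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> scores.get(b) - scores.get(a));
        return ranked;
    }

    // The three employees indexed earlier in the deck.
    static Map<String, String> sampleEmployees() {
        Map<String, String> about = new LinkedHashMap<>();
        about.put("John", "I love to go rock climbing");
        about.put("Jane", "I like to collect rock albums");
        about.put("Douglas", "I like to build cabinets");
        return about;
    }

    public static void main(String[] args) {
        System.out.println(search(sampleEmployees(), "rock climbing"));
    }
}
```

John matches both "rock" and "climbing", Jane only "rock", and Douglas matches nothing, so he is not a hit at all.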
Analysis and Analyzers
Set the shape to semi-transparent by calling Set_Trans(5)
Standard analyzer
set, the, shape, to, semi, transparent, by, calling, set_trans, 5
Simple analyzer
set, the, shape, to, semi, transparent, by, calling, set, trans
Whitespace analyzer
Set, the, shape, to, semi-transparent, by, calling, Set_Trans(5)
English language analyzer
set, shape, semi, transpar, call, set_tran, 5
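The two simplest analyzers above can be approximated in a few lines of plain Java. This is a sketch under assumptions: "whitespace" splits on whitespace only, "simple" splits on any non-letter and lowercases. It does not use the Lucene or Elasticsearch classes, and the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Approximations of two analyzers, for illustration only.
class AnalyzerSketch {

    // Whitespace analyzer: split on whitespace, keep case and punctuation.
    static List<String> whitespace(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Simple analyzer: split on anything that is not a letter, then lowercase.
    static List<String> simple(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Set the shape to semi-transparent by calling Set_Trans(5)";
        System.out.println(whitespace(text));
        System.out.println(simple(text));
    }
}
```

The standard and English analyzers need real tokenization rules, stop words and stemming, so they are left to the engine.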
Field mappings
{
  "number_of_clicks": {
    "type": "integer"
  }
}
{
  "tag": {
    "type": "string",
    "index": "not_analyzed"
  }
}
{
  "tweet": {
    "type": "string",
    "analyzer": "english"
  }
}
Analytics and Aggregations
GET /megacorp/employee/_search
{
  "query": {
    "match": {
      "last_name": "smith"
    }
  },
  "aggs" : {
    "all_interests" : {
      "terms" : { "field" : "interests" },
      "aggs" : {
        "avg_age" : {
          "avg" : { "field" : "age" }
        }
      }
    }
  }
}
"buckets": [
  {
    "key": "music",
    "doc_count": 2,
    "avg_age": {
      "value": 28.5
    }
  },
  {
    "key": "sports",
    "doc_count": 1,
    "avg_age": {
      "value": 25
    }
  }
]
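What the terms + avg aggregation does can be sketched in memory: keep the documents matching the query, put each one into a bucket per interest, and average the ages per bucket. The classes below are hypothetical stand-ins, not an Elasticsearch API, and the buckets come out alphabetically rather than by document count.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of "terms" buckets with a nested "avg" metric.
class TermsAggregationSketch {

    static final class Employee {
        final String lastName;
        final int age;
        final List<String> interests;

        Employee(String lastName, int age, String... interests) {
            this.lastName = lastName;
            this.age = age;
            this.interests = Arrays.asList(interests);
        }
    }

    static final class Bucket {
        int docCount;
        long ageSum;

        double avgAge() {
            return (double) ageSum / docCount;
        }
    }

    // Query: last name matches (case-insensitive). Aggregation: one bucket per interest.
    static Map<String, Bucket> aggregate(List<Employee> employees, String lastName) {
        Map<String, Bucket> buckets = new TreeMap<>();
        for (Employee employee : employees) {
            if (!employee.lastName.equalsIgnoreCase(lastName)) {
                continue;
            }
            for (String interest : employee.interests) {
                Bucket bucket = buckets.computeIfAbsent(interest, key -> new Bucket());
                bucket.docCount++;
                bucket.ageSum += employee.age;
            }
        }
        return buckets;
    }

    // The three employees indexed earlier in the deck.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
            new Employee("Smith", 25, "sports", "music"),
            new Employee("Smith", 32, "music"),
            new Employee("Fir", 35, "forestry"));
    }

    public static void main(String[] args) {
        aggregate(sampleEmployees(), "smith").forEach((key, bucket) ->
            System.out.println(key + " doc_count=" + bucket.docCount + " avg_age=" + bucket.avgAge()));
    }
}
```

Douglas Fir is excluded by the query, so "forestry" never becomes a bucket.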
The Liferay Search Infrastructure
The Liferay Search architecture
Liferay Portal (assets: web content, message boards, wiki pages...) → Search infrastructure (magic happens here) → Search engine(s) (indices, documents, analysis...)
The Liferay Search Engine plugins
public interface SearchEngine {
    public IndexSearcher getIndexSearcher();
    public IndexWriter getIndexWriter();
}
public class ElasticsearchSearchEngine extends BaseSearchEngine
public class ElasticsearchIndexSearcher extends BaseIndexSearcher
public class ElasticsearchIndexWriter extends BaseIndexWriter
public class SolrSearchEngine extends BaseSearchEngine
public class SolrIndexSearcher extends BaseIndexSearcher
public class SolrIndexWriter extends BaseIndexWriter
The Liferay Document Mappings
Solr: schema.xml
<fields>
  <field indexed="true" name="articleId" stored="true" type="string_keyword_lowercase" />
  <field indexed="true" name="companyId" stored="true" type="long" />
  <field indexed="true" name="emailAddress" stored="true" type="string" />
</fields>
Elasticsearch: liferay-type-mappings.json
"LiferayDocumentType": {
  "properties": {
    "articleId": {
      "analyzer": "keyword_lowercase",
      "store": "yes",
      "type": "string"
    },
    "companyId": {
      "index": "not_analyzed",
      "store": "yes",
      "type": "string"
    },
    "emailAddress": {
      "index": "not_analyzed",
      "store": "yes",
      "type": "string"
    }
  }
}
From Portal assets to Index documents…
public interface Indexer<T> {
    public Document getDocument(T object);
}
public class JournalArticleIndexer extends BaseIndexer<JournalArticle> {
    protected Document doGetDocument(JournalArticle journalArticle) {
        // Base document for this model; asset-specific fields are added below
        Document document = getBaseModelDocument(CLASS_NAME, journalArticle);
        // content and languageId are computed in code elided from the slide
        document.addText(
            LocalizationUtil.getLocalizedName(Field.CONTENT, languageId),
            content);
        document.addKeyword(
            Field.VERSION, journalArticle.getVersion());
        document.addDate(
            "displayDate", journalArticle.getDisplayDate());
        return document;
    }
}
public class MBMessageIndexer extends BaseIndexer<MBMessage> {
    protected Document doGetDocument(MBMessage mbMessage) {
        Document document = getBaseModelDocument(CLASS_NAME, mbMessage);
        document.addText(
            Field.CONTENT, processContent(mbMessage));
        // discussion is looked up in code elided from the slide
        document.addKeyword(
            "discussion", discussion != null);
        // Do not expose the author of anonymous messages
        if (mbMessage.isAnonymous()) {
            document.remove(Field.USER_NAME);
        }
        return document;
    }
}
public interface Document {
    public void addKeyword(String name, String value);
    public void addNumber(String name, long value);
}
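The asset-to-document idea can be tried outside the portal with a few stand-in types. Everything below is hypothetical and greatly simplified (SketchDocument, SketchIndexer and WikiPage are not the Liferay interfaces); it only shows the shape of the pattern: one indexer per asset type, each producing a flat bag of named fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not the Liferay interfaces.
class IndexerSketch {

    // A document is a flat bag of named fields.
    static final class SketchDocument {
        final Map<String, Object> fields = new LinkedHashMap<>();

        // Keyword: meant to be matched exactly, not analyzed.
        void addKeyword(String name, String value) {
            fields.put(name, value);
        }

        // Text: meant to go through analysis for full-text search.
        void addText(String name, String value) {
            fields.put(name, value);
        }

        void addNumber(String name, long value) {
            fields.put(name, value);
        }
    }

    interface SketchIndexer<T> {
        SketchDocument getDocument(T object);
    }

    static final class WikiPage {
        final long pageId;
        final String title;
        final String content;

        WikiPage(long pageId, String title, String content) {
            this.pageId = pageId;
            this.title = title;
            this.content = content;
        }
    }

    // One indexer per asset type decides which fields end up in the index.
    static final class WikiPageIndexer implements SketchIndexer<WikiPage> {
        @Override
        public SketchDocument getDocument(WikiPage page) {
            SketchDocument document = new SketchDocument();
            document.addNumber("pageId", page.pageId);
            document.addText("title", page.title);
            document.addText("content", page.content);
            return document;
        }
    }

    public static void main(String[] args) {
        WikiPage page = new WikiPage(42L, "Search", "Harnessing the power of search");
        System.out.println(new WikiPageIndexer().getDocument(page).fields);
    }
}
```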
… from Search Box to queries and filters
public class JournalArticleIndexer extends BaseIndexer<JournalArticle> {
    public void postProcessSearchQuery(
        BooleanQuery searchQuery,
        BooleanFilter fullQueryBooleanFilter,
        SearchContext searchContext) {
        addSearchTerm(searchQuery, searchContext,
            Field.ARTICLE_ID, false);
        addSearchLocalizedTerm(searchQuery, searchContext,
            Field.CONTENT, false);
        addSearchLocalizedTerm(searchQuery, searchContext,
            Field.TITLE, false);
        addSearchTerm(searchQuery, searchContext,
            Field.USER_NAME, false);
    }
}
public class MBThreadIndexer extends BaseIndexer<MBThread> {
    public void postProcessContextBooleanFilter(
        BooleanFilter contextBooleanFilter,
        SearchContext searchContext) {
        contextBooleanFilter.addRequiredTerm(
            "discussion", discussion);
        if ((endDate > 0) && (startDate > 0)) {
            contextBooleanFilter.addRangeTerm(
                "lastPostDate", startDate, endDate);
        }
    }
}
Classic query types (and filters)
TermQuery / TermFilter
"term" : { "locale" : "de_DE" }
TermRangeQuery / RangeTermFilter
"range" : { "age" : { "gte" : 8, "lte" : 42 } }
WildcardQuery
"wildcard" : { "company" : "L*ray" }
StringQuery
"query_string": { "query": "(content:this OR name:this) AND (content:that OR name:that)" }
BooleanQuery / BooleanFilter
"bool" : {
  "must" : {
    "term" : { "locale" : "de_DE" }
  },
  "must_not" : {
    "range" : {
      "age" : { "from" : 8, "to" : 42 }
    }
  },
  "should" : [
    {
      "wildcard" : { "company" : "L*ray" }
    },
    {
      "term" : { "product" : "Portal" }
    }
  ]
}
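The bool combination reads naturally as predicates: a document is a hit when every "must" clause matches and no "must_not" clause does, while "should" clauses only make a hit rank better. A minimal in-memory sketch, assuming a toy score of one point per matching "should" clause; the types are hypothetical (not the Elasticsearch or Liferay API) and the wildcard clause is left out for brevity.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical in-memory model of "bool" semantics, for illustration only.
class BoolQuerySketch {

    // A clause is just a yes/no test against a document's fields.
    interface Clause extends Predicate<Map<String, Object>> {}

    // Exact match on a single field.
    static Clause term(String field, Object value) {
        return doc -> value.equals(doc.get(field));
    }

    // Inclusive numeric range, like "gte"/"lte".
    static Clause range(String field, long gte, long lte) {
        return doc -> {
            Object v = doc.get(field);
            return v instanceof Number
                && ((Number) v).longValue() >= gte
                && ((Number) v).longValue() <= lte;
        };
    }

    // A hit satisfies every "must" clause and no "must_not" clause.
    static boolean isHit(Map<String, Object> doc, List<Clause> must, List<Clause> mustNot) {
        return must.stream().allMatch(c -> c.test(doc))
            && mustNot.stream().noneMatch(c -> c.test(doc));
    }

    // Toy score: each matching "should" clause adds one point.
    static long shouldScore(Map<String, Object> doc, List<Clause> should) {
        return should.stream().filter(c -> c.test(doc)).count();
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of("locale", "de_DE", "age", 50L, "product", "Portal");
        List<Clause> must = List.of(term("locale", "de_DE"));
        List<Clause> mustNot = List.of(range("age", 8, 42));
        List<Clause> should = List.of(term("product", "Portal"), term("company", "Liferay"));
        System.out.println(isHit(doc, must, mustNot) + " " + shouldScore(doc, should));
    }
}
```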
Speaking to the Search Engine
public interface Query {
    public BooleanFilter getPreBooleanFilter();
    public Filter getPostFilter();
}
public interface Filter {
    public Boolean isCached();
}
public class StringQueryTranslatorImpl
    implements StringQueryTranslator {
    public QueryBuilder translate(StringQuery stringQuery) {
        // Elasticsearch Client Java API
        return QueryBuilders.queryStringQuery(stringQuery.getQuery());
    }
}
public class ElasticsearchIndexSearcher extends BaseIndexSearcher {
    protected SearchResponse doSearch(
        SearchContext searchContext, Query query) {
        // Elasticsearch Client Java API
        Client client = _elasticsearchConnectionManager.getClient();
        SearchRequestBuilder searchRequestBuilder = client.prepareSearch(
            getSelectedIndexNames(queryConfig, searchContext));
        QueryBuilder queryBuilder = _queryTranslator.translate(
            query, searchContext);
        searchRequestBuilder.setQuery(queryBuilder);
        SearchResponse searchResponse = searchRequestBuilder.get();
        return searchResponse;
    }
}
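The translation step can be pictured without any client library: an engine-neutral query object goes in, an engine-specific representation comes out. The sketch below emits query DSL fragments as strings. It assumes a hypothetical, much simplified query model (SketchQuery and friends are made up), whereas the real plugin builds QueryBuilder objects as shown above.

```java
// Hypothetical, simplified query model; not the Liferay or Elasticsearch API.
class QueryTranslatorSketch {

    interface SketchQuery {}

    static final class TermQuery implements SketchQuery {
        final String field;
        final String value;

        TermQuery(String field, String value) {
            this.field = field;
            this.value = value;
        }
    }

    static final class WildcardQuery implements SketchQuery {
        final String field;
        final String pattern;

        WildcardQuery(String field, String pattern) {
            this.field = field;
            this.pattern = pattern;
        }
    }

    // Translate an engine-neutral query into a query DSL fragment.
    static String toElasticsearchJson(SketchQuery query) {
        if (query instanceof TermQuery) {
            TermQuery term = (TermQuery) query;
            return "{\"term\":{" + quote(term.field) + ":" + quote(term.value) + "}}";
        }
        if (query instanceof WildcardQuery) {
            WildcardQuery wildcard = (WildcardQuery) query;
            return "{\"wildcard\":{" + quote(wildcard.field) + ":" + quote(wildcard.pattern) + "}}";
        }
        throw new IllegalArgumentException("Unsupported query type: " + query.getClass().getName());
    }

    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(toElasticsearchJson(new TermQuery("locale", "de_DE")));
        System.out.println(toElasticsearchJson(new WildcardQuery("company", "L*ray")));
    }
}
```

A second translator with the same input types and a different output syntax is all it would take to target another engine, which is the point of keeping the portal-side query model engine-neutral.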
Search in Liferay 7
What's new in Liferay 7
Liferay 6
● Embedded Lucene by default
● Remote: Solr only
● Solr 4
● Portal-centric Lucene clustering
Liferay 7
● Embedded Elasticsearch by default
● Remote: Elasticsearch and Solr
● Solr 5.x and SolrCloud
● Native, transparent Elasticsearch clustering
● Queries + Filters + Boosting + Geolocation
● Extensibility and modularization
● Enterprise extras
○ Shield for security
○ Marvel for cluster monitoring
○ Kibana for visualization
New Queries
MatchQuery
"match" : {
  "subject" : {
    "query" : "Liferay Portal",
    "type" : "phrase"
  }
}
MoreLikeThisQuery
"more_like_this" : {
  "fields" : ["title", "content"],
  "like_text" : "Search In Liferay 7",
  "min_term_freq" : 1, "max_query_terms" : 12
}
DisMaxQuery
"dis_max" : {
  "tie_breaker" : 0.7,
  "queries" : [
    { "term" : { "age" : 34 } },
    { "term" : { "age" : 35 } }
  ]
}
FuzzyQuery
"fuzzy" : {
  "user" : {
    "value" : "ed",
    "fuzziness" : 2,
    "max_expansions": 100
  }
}
MatchAllQuery / MatchAllFilter
"match_all" : {
  "boost" : 1.2
}
MultiMatchQuery
"multi_match" : {
  "query": "Enterprise. Open Source. For Life",
  "type": "most_fields",
  "fields": [ "title", "title.original", "title.shingles" ]
}
New Filters
ExistsFilter
"exists" : { "field" : "emailAddress" }
MissingFilter
"missing" : { "field" : "emailAddress" }
PrefixFilter
"prefix" : { "product" : "life" }
TermsFilter
"terms" : { "locale" : ["de_DE", "pt_BR", "en_CA"] }
QueryFilter
"fquery" : {
  "query" : {
    "bool" : {
      "must" : [
        {
          "wildcard" : { "company" : "L*ray" }
        },
        {
          "term" : { "product" : "Portal" }
        }
      ]
    }
  },
  "_cache" : true
}
Geolocation filters
GeoDistanceFilter
"geo_distance" : {
  "distance" : "12km",
  "pin.location" : {
    "lat" : 40,
    "lon" : -70
  }
}
GeoBoundingBoxFilter
"geo_bounding_box" : {
  "pin.location" : {
    "top_left" : {
      "lat" : 40.73,
      "lon" : -74.1
    },
    "bottom_right" : {
      "lat" : 40.01,
      "lon" : -71.12
    }
  }
}
GeoDistanceRangeFilter
"geo_distance_range" : {
  "from" : "200km",
  "to" : "400km",
  "pin.location" : {
    "lat" : 40,
    "lon" : -70
  }
}
GeoPolygonFilter
"geo_polygon" : {
  "person.location" : {
    "points" : [
      [-70, 40],
      [-80, 30],
      [-90, 20]
    ]
  }
}
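The distance-based filters boil down to a great-circle distance check. A minimal sketch using the haversine formula, assuming a spherical Earth with a mean radius of 6371 km; the search engine may use a different distance computation internally, and the class and method names here are made up.

```java
// Great-circle distance checks, for illustration only.
class GeoDistanceSketch {

    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius, an approximation

    // Haversine distance in kilometres between two latitude/longitude points given in degrees.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
            + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
            * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Like "geo_distance": is the point within maxKm of the center?
    static boolean withinDistance(
        double centerLat, double centerLon, double lat, double lon, double maxKm) {
        return haversineKm(centerLat, centerLon, lat, lon) <= maxKm;
    }

    // Like "geo_distance_range": is the point between fromKm and toKm away from the center?
    static boolean withinRange(
        double centerLat, double centerLon, double lat, double lon, double fromKm, double toKm) {
        double distance = haversineKm(centerLat, centerLon, lat, lon);
        return distance >= fromKm && distance <= toKm;
    }

    public static void main(String[] args) {
        // Center taken from the filters above: lat 40, lon -70.
        System.out.println(withinDistance(40, -70, 40.05, -70, 12));
        System.out.println(withinRange(40, -70, 43, -70, 200, 400));
    }
}
```

One degree of latitude is roughly 111 km, so a point 0.05 degrees north of the center falls inside the 12 km circle, and a point 3 degrees north falls inside the 200 to 400 km ring.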
Query-time boosting
"should": [
  {
    "match": {
      "title": {
        "query": "Liferay Portal",
        "boost": 2
      }
    }
  },
  {
    "match": {
      "content": {
        "query": "Liferay Portal"
      }
    }
  }
]
New Aggregations: Top Hits
"terms": {
  "field": "conference",
  "size": 2
},
"aggs": {
  "talks": {
    "top_hits": {
      "size" : 1,
      "sort": [
        {
          "attendees": {
            "order": "desc"
          }
        }
      ]
    }
  }
}
{
  "key": "Liferay DEVCON",
  "talks": {
    "hits": [
      {
        "_source": {
          "title": "The Power of Search"
        }
      }
    ]
  }
},
{
  "key": "Liferay North America Symposium",
  "talks": {
    "hits": [
      {
        "_source": {
          "title": "The ELK Stack"
        }
      }
    ]
  }
}
New Aggregations: Extended Stats
"extended_stats" : {
  "field" : "attendees"
}
"attendees_per_talk_stats": {
  "count": 9,
  "min": 72,
  "max": 99,
  "avg": 86,
  "sum": 774,
  "sum_of_squares": 67028,
  "std_deviation": 7.180219742846005
}
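The numbers in this response are consistent with each other: the average and the standard deviation follow from count, sum and sum_of_squares alone. A small check, assuming the population variance formula E[x²] − (E[x])²; the class name is made up and nothing here calls the search engine.

```java
// Recomputes avg and std_deviation from the running totals in the response.
class ExtendedStatsSketch {

    static double average(long count, double sum) {
        return sum / count;
    }

    // Population variance from running totals: mean of squares minus square of the mean.
    static double variance(long count, double sum, double sumOfSquares) {
        double mean = average(count, sum);
        return sumOfSquares / count - mean * mean;
    }

    static double stdDeviation(long count, double sum, double sumOfSquares) {
        return Math.sqrt(variance(count, sum, sumOfSquares));
    }

    public static void main(String[] args) {
        // count, sum and sum_of_squares as reported above.
        System.out.println(average(9, 774));
        System.out.println(stdDeviation(9, 774, 67028));
    }
}
```

With count 9, sum 774 and sum of squares 67028, the mean is 86 and the variance is 67028 / 9 − 86² ≈ 51.56, whose square root is about 7.18.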
Modularity and Search
● OSGi
● Liferay's default Search Engine: now a plugin in itself
● Extension points in Search
○ Node Settings contributors → fine tune your cluster
○ Index Settings contributors → fine tune your shards and logs
○ Analyzers and Mappings contributors → fine tune your fields and queries
Liferay 7: Enter Elasticsearch
Why Elasticsearch?
Best of breed
Built for modern web applications
Distributed and clusterable by design
Lucene based
Multi-tenancy
Great vendor support
Great monitoring tools: Marvel, Logstash
Great for Developers
Open Source
Amazing documentation
High "just works" factor, e.g. zero-config indexing and clustering
REST for queries, health, admin - everything
Update live settings programmatically
Great Java Client API
Pretty JSON for talks ;-)
Clustering with Liferay and Elasticsearch
Production mode
Dev mode
Scaling and tuning made easy
Enterprise-level Search in Liferay 7 EE
Security: Shield
Protect your Liferay index with a username and password
SSL/TLS encryption for traffic within the Liferay Elasticsearch cluster
Elasticsearch plugin - no need for an external security solution
Restrict access to Liferay Portal instances with IP filtering
Monitoring: Marvel
Visualization: Kibana
Thanks and happy searching!
http://j.mp/SearchLiferayDevcon2015
andre.oliveira@liferay.com
github.com/arboliveira
@arbocombr

 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
 
Multiple Your Crypto Portfolio with the Innovative Features of Advanced Crypt...
Multiple Your Crypto Portfolio with the Innovative Features of Advanced Crypt...Multiple Your Crypto Portfolio with the Innovative Features of Advanced Crypt...
Multiple Your Crypto Portfolio with the Innovative Features of Advanced Crypt...
 
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Software Testing Exam imp Ques Notes.pdf
Software Testing Exam imp Ques Notes.pdfSoftware Testing Exam imp Ques Notes.pdf
Software Testing Exam imp Ques Notes.pdf
 
Accelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessAccelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with Platformless
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
Strategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptxStrategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptx
 

Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany

  • 1. Harnessing The Power of Search André Ricardo Barreto de Oliveira ("Arbo") Software Engineer - Team Lead - Search Darmstadt, Germany 7 October, 2015
  • 4. What's Search and why is it so cool?
  • 5. The dawn of Search
  • 9. Inside the Search Engine The Index
  • 10. Inside the Search Engine The Index Documents
  • 11. Inside the Search Engine The Index Documents Fields
  • 12. Inside the Search Engine The Index Documents Fields Not that different from ye olde database?...
  • 13. Indexing documents
      PUT /megacorp/employee/1
      {
        "first_name" : "John",
        "last_name" : "Smith",
        "age" : 25,
        "about" : "I love to go rock climbing",
        "interests": [ "sports", "music" ]
      }
      PUT /megacorp/employee/2
      {
        "first_name" : "Jane",
        "last_name" : "Smith",
        "age" : 32,
        "about" : "I like to collect rock albums",
        "interests": [ "music" ]
      }
      PUT /megacorp/employee/3
      {
        "first_name" : "Douglas",
        "last_name" : "Fir",
        "age" : 35,
        "about": "I like to build cabinets",
        "interests": [ "forestry" ]
      }
  • 14. Queries and Filters
      GET /megacorp/employee/_search?q=last_name:Smith
      "hits": [
        {
          "_source": {
            "first_name": "John",
            "last_name": "Smith",
            "age": 25,
            "about": "I love to go rock climbing",
            "interests": [ "sports", "music" ]
          }
        },
        {
          "_source": {
            "first_name": "Jane",
            "last_name": "Smith",
            "age": 32,
            "about": "I like to collect rock albums",
            "interests": [ "music" ]
          }
        }
      ]
      GET /megacorp/employee/_search
      {
        "query" : {
          "filtered" : {
            "filter" : {
              "range" : {
                "age" : { "gt" : 21 }
              }
            },
            "query" : {
              "match" : {
                "last_name" : "smith"
              }
            }
          }
        }
      }
  • 15. Full-Text Search
      GET /megacorp/employee/_search
      {
        "query" : {
          "match" : {
            "about" : "rock climbing"
          }
        }
      }
      "hits": [
        {
          "_score": 0.16273327,
          "_source": {
            "first_name": "John",
            "last_name": "Smith",
            "age": 25,
            "about": "I love to go rock climbing",
            "interests": [ "sports", "music" ]
          }
        },
        {
          "_score": 0.016878016,
          "_source": {
            "first_name": "Jane",
            "last_name": "Smith",
            "age": 32,
            "about": "I like to collect rock albums",
            "interests": [ "music" ]
          }
        }
      ]
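The ranking on slide 15 (John above Jane for "rock climbing") can be illustrated with a toy inverted index. This is only a sketch under loud assumptions: the class `FullTextSketch` and its methods are hypothetical names, the tokenizer is a crude lowercase split, and the "score" is just the number of distinct query terms a document contains. Elasticsearch delegates real relevance scoring to Lucene, which is considerably more involved.

```java
import java.util.*;

// Hypothetical illustration only; not Lucene's or Elasticsearch's scoring.
class FullTextSketch {

    // Lowercase the text and split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Inverted index: term -> ids of the documents that contain it.
    static Map<String, Set<Integer>> buildIndex(Map<Integer, String> docs) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (Map.Entry<Integer, String> e : docs.entrySet()) {
            for (String term : tokenize(e.getValue())) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(e.getKey());
            }
        }
        return index;
    }

    // Rank documents by how many distinct query terms they match
    // (descending); ties keep ascending id order because the sort is stable.
    static List<Integer> search(Map<String, Set<Integer>> index, String query) {
        Map<Integer, Integer> matches = new TreeMap<>();
        for (String term : new LinkedHashSet<>(tokenize(query))) {
            for (int id : index.getOrDefault(term, Collections.emptySet())) {
                matches.merge(id, 1, Integer::sum);
            }
        }
        List<Integer> ranked = new ArrayList<>(matches.keySet());
        ranked.sort((a, b) -> Integer.compare(matches.get(b), matches.get(a)));
        return ranked;
    }

    public static void main(String[] args) {
        Map<Integer, String> about = new LinkedHashMap<>();
        about.put(1, "I love to go rock climbing");
        about.put(2, "I like to collect rock albums");
        about.put(3, "I like to build cabinets");
        // Document 1 matches both terms, document 2 only "rock".
        System.out.println(search(buildIndex(about), "rock climbing"));
    }
}
```

Document 3 never appears in the result because it shares no term with the query, which mirrors the two hits shown on the slide.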
  • 16. Analysis and Analyzers
      Set the shape to semi-transparent by calling Set_Trans(5)
      Standard analyzer
        set, the, shape, to, semi, transparent, by, calling, set_trans, 5
      Simple analyzer
        set, the, shape, to, semi, transparent, by, calling, set, trans
      Whitespace analyzer
        Set, the, shape, to, semi-transparent, by, calling, Set_Trans(5)
      English language analyzer
        set, shape, semi, transpar, call, set_tran, 5
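Two of the analyzers on slide 16 are simple enough to approximate in a few lines of plain Java. The sketch below assumes the simple analyzer splits on any non-letter and lowercases, and the whitespace analyzer splits on whitespace and changes nothing else; `AnalyzerSketch` is a hypothetical name, not a Lucene or Elasticsearch class. The standard and English analyzers (Unicode word segmentation, stop words, stemming) are deliberately left out.

```java
import java.util.*;

// Hypothetical approximation of two analyzers; not the Lucene implementations.
class AnalyzerSketch {

    // "Simple" behaviour: break on every non-letter, then lowercase.
    static List<String> simple(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // "Whitespace" behaviour: break on whitespace only, keep case and punctuation.
    static List<String> whitespace(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Set the shape to semi-transparent by calling Set_Trans(5)";
        System.out.println(simple(text));
        System.out.println(whitespace(text));
    }
}
```

The same sentence yields ten lowercase tokens from `simple` and eight untouched tokens from `whitespace`, matching the two middle rows of the slide. The choice of analyzer therefore decides whether a query for `set_trans` or `semi-transparent` can match at all.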
  • 17. Field mappings
      {
        "number_of_clicks": {
          "type": "integer"
        }
      }
      {
        "tag": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
      {
        "tweet": {
          "type": "string",
          "analyzer": "english"
        }
      }
  • 18. Analytics and Aggregations
      GET /megacorp/employee/_search
      {
        "query": {
          "match": {
            "last_name": "smith"
          }
        },
        "aggs" : {
          "all_interests" : {
            "terms" : { "field" : "interests" },
            "aggs" : {
              "avg_age" : {
                "avg" : { "field" : "age" }
              }
            }
          }
        }
      }
      "buckets": [
        {
          "key": "music",
          "doc_count": 2,
          "avg_age": { "value": 28.5 }
        },
        {
          "key": "sports",
          "doc_count": 1,
          "avg_age": { "value": 25 }
        }
      ]
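What the `terms` aggregation with a nested `avg` on slide 18 computes can be mimicked in memory. The following is a minimal sketch, assuming the three employees indexed on slide 13; `AggregationSketch`, `Employee` and `Bucket` are hypothetical names, and the last-name filter is a plain case-insensitive comparison rather than an analyzed `match` query.

```java
import java.util.*;

// Hypothetical in-memory stand-in for a terms aggregation with an avg sub-aggregation.
class AggregationSketch {

    static class Employee {
        final String lastName;
        final int age;
        final List<String> interests;

        Employee(String lastName, int age, String... interests) {
            this.lastName = lastName;
            this.age = age;
            this.interests = Arrays.asList(interests);
        }
    }

    static class Bucket {
        long docCount;
        long ageSum;

        double avgAge() {
            return (double) ageSum / docCount;
        }
    }

    // One bucket per distinct interest among employees with the given last name.
    static Map<String, Bucket> interestBuckets(List<Employee> employees, String lastName) {
        Map<String, Bucket> buckets = new TreeMap<>();
        for (Employee e : employees) {
            if (!e.lastName.equalsIgnoreCase(lastName)) {
                continue; // plays the role of the "query" part of the request
            }
            for (String interest : e.interests) {
                Bucket b = buckets.computeIfAbsent(interest, k -> new Bucket());
                b.docCount++;
                b.ageSum += e.age;
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
            new Employee("Smith", 25, "sports", "music"),
            new Employee("Smith", 32, "music"),
            new Employee("Fir", 35, "forestry"));
        for (Map.Entry<String, Bucket> e : interestBuckets(employees, "smith").entrySet()) {
            System.out.println(e.getKey() + " doc_count=" + e.getValue().docCount
                + " avg_age=" + e.getValue().avgAge());
        }
    }
}
```

The "music" bucket holds both Smiths (average age 28.5) and "sports" holds only John (25), the same figures as the response on the slide; "forestry" is absent because Douglas Fir is excluded by the query.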
  • 20. The Liferay Search architecture
      Liferay Portal
        Assets: web content, message boards, wiki pages...
      Search infrastructure
        (Magic happens here)
      Search engine(s)
        Indices, documents, analysis...
  • 21. The Liferay Search Engine plugins
      public interface SearchEngine {
        public IndexSearcher getIndexSearcher();
        public IndexWriter getIndexWriter();
      }
      public class ElasticsearchSearchEngine extends BaseSearchEngine
      public class ElasticsearchIndexSearcher extends BaseIndexSearcher
      public class ElasticsearchIndexWriter extends BaseIndexWriter
      public class SolrSearchEngine extends BaseSearchEngine
      public class SolrIndexSearcher extends BaseIndexSearcher
      public class SolrIndexWriter extends BaseIndexWriter
  • 22. The Liferay Document Mappings
      Solr: schema.xml
        <fields>
          <field indexed="true" name="articleId" stored="true" type="string_keyword_lowercase" />
          <field indexed="true" name="companyId" stored="true" type="long" />
          <field indexed="true" name="emailAddress" stored="true" type="string" />
        </fields>
      Elasticsearch: liferay-type-mappings.json
        "LiferayDocumentType": {
          "properties": {
            "articleId": {
              "analyzer": "keyword_lowercase",
              "store": "yes",
              "type": "string"
            },
            "companyId": {
              "index": "not_analyzed",
              "store": "yes",
              "type": "string"
            },
            "emailAddress": {
              "index": "not_analyzed",
              "store": "yes",
              "type": "string"
            }
          }
        }
  • 23. From Portal assets to Index documents…
      public interface Indexer<T> {
        public Document getDocument(T object);
      }
      public class JournalArticleIndexer extends BaseIndexer<JournalArticle> {
        protected Document doGetDocument(JournalArticle journalArticle) {
          Document document = getBaseModelDocument(CLASS_NAME, journalArticle);
          document.addText(
            LocalizationUtil.getLocalizedName(Field.CONTENT, languageId), content);
          document.addKeyword(
            Field.VERSION, journalArticle.getVersion());
          document.addDate(
            "displayDate", journalArticle.getDisplayDate());
        }
      }
      public class MBMessageIndexer extends BaseIndexer<MBMessage> {
        protected Document doGetDocument(MBMessage mbMessage) {
          Document document = getBaseModelDocument(CLASS_NAME, mbMessage);
          document.addText(
            Field.CONTENT, processContent(mbMessage));
          document.addKeyword(
            "discussion", discussion == null ? false : true);
          if (mbMessage.isAnonymous()) {
            document.remove(Field.USER_NAME);
          }
        }
      }
      public interface Document {
        public void addKeyword(String name, String value);
        public void addNumber(String name, long value);
      }
  • 24. … from Search Box to queries and filters
      public class JournalArticleIndexer extends BaseIndexer<JournalArticle> {
        public void postProcessSearchQuery(
            BooleanQuery searchQuery, BooleanFilter fullQueryBooleanFilter,
            SearchContext searchContext) {
          addSearchTerm(searchQuery, searchContext, Field.ARTICLE_ID, false);
          addSearchLocalizedTerm(searchQuery, searchContext, Field.CONTENT, false);
          addSearchLocalizedTerm(searchQuery, searchContext, Field.TITLE, false);
          addSearchTerm(searchQuery, searchContext, Field.USER_NAME, false);
        }
      }
      public class MBThreadIndexer extends BaseIndexer<MBThread> {
        public void postProcessContextBooleanFilter(
            BooleanFilter contextBooleanFilter, SearchContext searchContext) {
          contextBooleanFilter.addRequiredTerm(
            "discussion", discussion);
          if ((endDate > 0) && (startDate > 0)) {
            contextBooleanFilter.addRangeTerm(
              "lastPostDate", startDate, endDate);
          }
        }
      }
  • 25. Classic query types (and filters)
      TermQuery / TermFilter
        "term" : { "locale" : "de_DE" }
      TermRangeQuery / RangeTermFilter
        "range" : { "age" : { "gte" : 8, "lte" : 42 } }
      WildcardQuery
        "wildcard" : { "company" : "L*ray" }
      StringQuery
        "query_string": {
          "query": "(content:this OR name:this) AND (content:that OR name:that)"
        }
      BooleanQuery / BooleanFilter
        "bool" : {
          "must" : {
            "term" : { "locale" : "de_DE" }
          },
          "must_not" : {
            "range" : { "age" : { "from" : 8, "to" : 42 } }
          },
          "should" : [
            { "wildcard" : { "company" : "L*ray" } },
            { "term" : { "product" : "Portal" } }
          ]
        }
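How the `must`, `must_not` and `should` clauses of the `bool` example on slide 25 combine can be sketched with plain Java predicates. Everything here is an assumption made for illustration: `BoolQuerySketch` and its helpers are hypothetical, documents are flat string maps, values are compared raw (no analysis), and the score is simply the number of matching `should` clauses instead of a Lucene relevance score.

```java
import java.util.*;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical model of bool-query semantics over flat String maps.
class BoolQuerySketch {

    final List<Predicate<Map<String, String>>> must = new ArrayList<>();
    final List<Predicate<Map<String, String>>> mustNot = new ArrayList<>();
    final List<Predicate<Map<String, String>>> should = new ArrayList<>();

    // Exact match on a single field value.
    static Predicate<Map<String, String>> term(String field, String value) {
        return doc -> value.equals(doc.get(field));
    }

    // Inclusive numeric range; documents without the field do not match.
    static Predicate<Map<String, String>> range(String field, int from, int to) {
        return doc -> {
            String v = doc.get(field);
            if (v == null) {
                return false;
            }
            int n = Integer.parseInt(v);
            return n >= from && n <= to;
        };
    }

    // '*' matches any run of characters; every other character is literal.
    static Predicate<Map<String, String>> wildcard(String field, String pattern) {
        String[] parts = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        Pattern compiled = Pattern.compile(regex.toString());
        return doc -> doc.get(field) != null && compiled.matcher(doc.get(field)).matches();
    }

    // A document is a hit when every must clause holds and no must_not clause does.
    boolean matches(Map<String, String> doc) {
        return must.stream().allMatch(p -> p.test(doc))
            && mustNot.stream().noneMatch(p -> p.test(doc));
    }

    // should clauses are optional here; each one that matches raises the score.
    long score(Map<String, String> doc) {
        return should.stream().filter(p -> p.test(doc)).count();
    }
}
```

With the slide's clauses, a `de_DE` document with age 50 is a hit, and it scores higher when its company matches `L*ray` or its product is `Portal`; a document with age 20 is rejected by `must_not` no matter how many `should` clauses it satisfies.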
  • 26. Speaking to the Search Engine
      public interface Query {
        public BooleanFilter getPreBooleanFilter();
        public Filter getPostFilter();
      }
      public interface Filter {
        public Boolean isCached();
      }
      public class StringQueryTranslatorImpl implements StringQueryTranslator {
        public QueryBuilder translate(StringQuery stringQuery) {
          // Elasticsearch Client Java API
          return QueryBuilders.queryStringQuery(stringQuery.getQuery());
        }
      }
      public class ElasticsearchIndexSearcher extends BaseIndexSearcher {
        protected SearchResponse doSearch(
            SearchContext searchContext, Query query) {
          // Elasticsearch Client Java API
          Client client = _elasticsearchConnectionManager.getClient();
          SearchRequestBuilder searchRequestBuilder = client.prepareSearch(
            getSelectedIndexNames(queryConfig, searchContext));
          QueryBuilder queryBuilder = _queryTranslator.translate(
            query, searchContext);
          searchRequestBuilder.setQuery(queryBuilder);
          SearchResponse searchResponse = searchRequestBuilder.get();
          return searchResponse;
        }
      }
  • 28. What's new in Liferay 7
      Liferay 6
      ● Embedded Lucene by default
      ● Remote: Solr only
      ● Solr 4
      ● Portal-centric Lucene clustering
      Liferay 7
      ● Embedded Elasticsearch by default
      ● Remote: Elasticsearch and Solr
      ● Solr 5.x and SolrCloud
      ● Native, transparent Elasticsearch clustering
      ● Queries + Filters + Boosting + Geolocation
      ● Extensibility and modularization
      ● Enterprise extras
        ○ Shield for security
        ○ Marvel for cluster monitoring
        ○ Kibana for visualization
  • 29. New Queries
      MatchQuery
        "match" : {
          "subject" : {
            "query" : "Liferay Portal",
            "type" : "phrase"
          }
        }
      MoreLikeThisQuery
        "more_like_this" : {
          "fields" : ["title", "content"],
          "like_text" : "Search In Liferay 7",
          "min_term_freq" : 1,
          "max_query_terms" : 12
        }
      DisMaxQuery
        "dis_max" : {
          "tie_breaker" : 0.7,
          "queries" : [
            { "term" : { "age" : 34 } },
            { "term" : { "age" : 35 } }
          ]
        }
      FuzzyQuery
        "fuzzy" : {
          "user" : {
            "value" : "ed",
            "fuzziness" : 2,
            "max_expansions": 100
          }
        }
      MatchAllQuery / MatchAllFilter
        "match_all" : { "boost" : 1.2 }
      MultiMatchQuery
        "multi_match" : {
          "query": "Enterprise. Open Source. For Life",
          "type": "most_fields",
          "fields": [ "title", "title.original", "title.shingles" ]
        }
  • 30. New Filters
      ExistsFilter
        "exists" : { "field" : "emailAddress" }
      MissingFilter
        "missing" : { "field" : "emailAddress" }
      PrefixFilter
        "prefix" : { "product" : "life" }
      TermsFilter
        "terms" : { "locale" : ["de_DE", "pt_BR", "en_CA"] }
      QueryFilter
        "fquery" : {
          "query" : {
            "bool" : {
              "must" : [
                { "wildcard" : { "company" : "L*ray" } },
                { "term" : { "product" : "Portal" } }
              ]
            }
          },
          "_cache" : true
        }
  • 31. Geolocation filters
      GeoDistanceFilter
        "geo_distance" : {
          "distance" : "12km",
          "pin.location" : { "lat" : 40, "lon" : -70 }
        }
      GeoBoundingBoxFilter
        "geo_bounding_box" : {
          "pin.location" : {
            "top_left" : { "lat" : 40.73, "lon" : -74.1 },
            "bottom_right" : { "lat" : 40.01, "lon" : -71.12 }
          }
        }
      GeoDistanceRangeFilter
        "geo_distance_range" : {
          "from" : "200km",
          "to" : "400km",
          "pin.location" : { "lat" : 40, "lon" : -70 }
        }
      GeoPolygonFilter
        "geo_polygon" : {
          "person.location" : {
            "points" : [ [-70, 40], [-80, 30], [-90, 20] ]
          }
        }
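The idea behind the `geo_distance` and `geo_distance_range` filters on slide 31 is a great-circle distance test against a reference point. The sketch below uses the haversine formula with a mean Earth radius of 6371 km; both are assumptions chosen for illustration and are not necessarily what Elasticsearch computes internally. `GeoFilterSketch` and its method names are hypothetical.

```java
// Hypothetical distance filters based on the haversine formula.
class GeoFilterSketch {

    static final double EARTH_RADIUS_KM = 6371.0; // mean radius, an approximation

    // Great-circle distance in kilometres between two lat/lon points given in degrees.
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
            + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
            * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    // Counterpart of "geo_distance": keep points no farther than maxKm from the pin.
    static boolean withinDistance(
            double lat, double lon, double pinLat, double pinLon, double maxKm) {
        return distanceKm(lat, lon, pinLat, pinLon) <= maxKm;
    }

    // Counterpart of "geo_distance_range": keep points in the ring [fromKm, toKm].
    static boolean withinRange(
            double lat, double lon, double pinLat, double pinLon,
            double fromKm, double toKm) {
        double d = distanceKm(lat, lon, pinLat, pinLon);
        return d >= fromKm && d <= toKm;
    }

    public static void main(String[] args) {
        // Pin from the slide: lat 40, lon -70.
        System.out.println(withinDistance(40.05, -70, 40, -70, 12));
        System.out.println(withinRange(43, -70, 40, -70, 200, 400));
    }
}
```

A point 0.05 degrees of latitude north of the pin is roughly 5.6 km away and passes the 12 km filter, while a point 3 degrees north is roughly 334 km away and falls inside the 200 km to 400 km ring.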
  • 32. Query-time boosting
      "should": [
        {
          "match": {
            "title": {
              "query": "Liferay Portal",
              "boost": 2
            }
          }
        },
        {
          "match": {
            "content": {
              "query": "Liferay Portal"
            }
          }
        }
      ]
  • 33. New Aggregations: Top Hits
      "terms": {
        "field": "conference",
        "size": 2
      },
      "aggs": {
        "talks": {
          "top_hits": {
            "size" : 1,
            "sort": [ { "attendees": { "order": "desc" } } ]
          }
        }
      }
      {
        "key": "Liferay DEVCON",
        "talks": {
          "hits": [ { "_source": { "title": "The Power of Search" } } ]
        }
      },
      {
        "key": "Liferay North America Symposium",
        "talks": {
          "hits": [ { "_source": { "title": "The ELK Stack" } } ]
        }
      }
  • 34. New Aggregations: Extended Stats
      "extended_stats" : { "field" : "attendees" }
      "attendees_per_talk_stats": {
        "count": 9,
        "min": 72,
        "max": 99,
        "avg": 86,
        "sum": 774,
        "sum_of_squares": 67028,
        "std_deviation": 7.180219742846005
      }
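The figures on slide 34 are internally consistent: the standard deviation follows from `count`, `sum` and `sum_of_squares` alone. A minimal sketch, assuming the population formula sqrt(sum_of_squares / count - avg²); `ExtendedStatsSketch` is a hypothetical helper, not part of any Elasticsearch client.

```java
// Hypothetical helper that re-derives std_deviation from the other reported stats.
class ExtendedStatsSketch {

    // Population standard deviation from running totals.
    static double stdDeviation(long count, double sum, double sumOfSquares) {
        double mean = sum / count;
        double variance = sumOfSquares / count - mean * mean;
        // Guard against a tiny negative variance caused by floating-point rounding.
        return Math.sqrt(Math.max(variance, 0.0));
    }

    public static void main(String[] args) {
        // Values taken from the slide: count 9, sum 774, sum_of_squares 67028.
        System.out.println(stdDeviation(9, 774, 67028));
    }
}
```

With the slide's totals the mean is 774 / 9 = 86 and the variance is 67028 / 9 - 86² = 464 / 9, whose square root is approximately 7.18, in line with the reported `std_deviation`.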
  • 35. Modularity and Search
      ● OSGi
      ● Liferay's default Search Engine: now a plugin in itself
      ● Extension points in Search
        ○ Node Settings contributors → fine tune your cluster
        ○ Index Settings contributors → fine tune your shards and logs
        ○ Analyzers and Mappings contributors → fine tune your fields and queries
  • 37. Why Elasticsearch?
      ● Best of breed
      ● Built for modern web applications
      ● Distributed and clusterable by design
      ● Lucene based
      ● Multi-tenancy
      ● Great vendor support
      ● Great monitoring tools: Marvel, Logstash
  • 38. Great for Developers
      ● Open Source
      ● Amazing documentation
      ● High "just works" factor, e.g. zero-config indexing and clustering
      ● REST for queries, health, admin - everything
      ● Update live settings programmatically
      ● Great Java Client API
      ● Pretty JSON for talks ;-)
  • 39. Clustering with Liferay and Elasticsearch Production mode Dev mode
  • 40. Scaling and tuning made easy
  • 42. Security: Shield
      ● Protect your Liferay index with a username and password
      ● SSL/TLS encryption for traffic within the Liferay Elasticsearch cluster
      ● Elasticsearch plugin - no need for an external security solution
      ● Restrict access to Liferay Portal instances with IP filtering
  • 45. Thanks and happy searching! http://j.mp/SearchLiferayDevcon2015 andre.oliveira@liferay.com github.com/arboliveira @arbocombr