SlideShare a Scribd company logo
1 of 96
“ Terms of Endearment” The ElasticSearch query language explained Clinton Gormley, YAPC::EU 2011 DRTECH @clintongormley
search for : “ DELETE QUERY ”  We can
search for : “ DELETE QUERY ”  and find : “ deleteByQuery ” We can
but you can only find  what is stored in the database
Normalise values  “ deleteByQuery” 'delete' 'by' 'query' 'deletebyquery'
Normalise values  and search terms “ deleteByQuery” “ DELETE QUERY” ' delete ' 'by' ' query ' 'deletebyquery'
Normalise  values  and search terms “ deleteByQuery” “ DELETE QUERY” ' delete ' 'by' ' query ' 'deletebyquery'
Analyse  values  and search terms “ deleteByQuery” “ DELETE QUERY” ' delete ' 'by' ' query ' 'deletebyquery'
What is stored in ElasticSearch?
{ tweet  => "Perl is GREAT!", posted => "2011-08-15", user  => { name  => "Clinton Gormley", email => "drtech@cpan.org", }, tags  => [" perl" ,"opinion"],  posts  => 2, } Document:
{ tweet   => "Perl is GREAT!", posted  => "2011-08-15", user   => { name   => "Clinton Gormley", email  => "drtech@cpan.org", }, tags   => [" perl" ,"opinion"],  posts   => 2, } Fields:
{ tweet  =>  "Perl is GREAT!", posted =>  "2011-08-15", user  =>  { name  =>  "Clinton Gormley", email =>  "drtech@cpan.org", }, tags   =>  [" perl" ,"opinion"],  posts  =>  2, } Values:
{ tweet  => "Perl is GREAT!", posted => "2011-08-15", user  => { name  => "Clinton Gormley", email => "drtech@cpan.org" }, tags  => [" perl" ,"opinion"],  posts  => 2, } Field types: # object # string # date # nested object # string # string # array of enums # integer
{ tweet  => "Perl is GREAT!", posted => "2011-08-15", user   => { name  => "Clinton Gormley", email  => "drtech@cpan.org", }, tags  => [" perl" ,"opinion"],  posts  => 2, } Nested objects flattened:
{ tweet  => "Perl is GREAT!", posted  => "2011-08-15", user.name  => "Clinton Gormley", user.email  => "drtech@cpan.org", tags  => [" perl" ,"opinion"],  posts  => 2, } Nested objects flattened
{ tweet  =>  "Perl is GREAT!", posted  =>  "2011-08-15", user.name  =>  "Clinton Gormley", user.email =>  "drtech@cpan.org", tags  =>  [" perl" ,"opinion"],  posts  =>  2, } Values analyzed into terms
{ tweet  =>  ['perl','great'], posted  =>  [Date(2011-08-15)], user.name  =>  ['clinton','gormley'], user.email =>  ['drtech','cpan.org'], tags  =>  [' perl' ,'opinion'],  posts  =>  [2], } Values analyzed into terms
database table row ⇒  many tables ⇒  many rows ⇒  one schema ⇒  many columns In MySQL
index type document ⇒  many types ⇒  many documents ⇒  one mapping ⇒  many fields In ElasticSearch
Create index with mappings $es-> create_index ( index  => 'twitter', mappings   => { tweet   => { properties  => { title  => { type => 'string' }, created => { type => 'date'  } }  } } );
Add a mapping $es-> put_mapping (  index => 'twitter', type  => ' user ', mapping   => { properties  => { name  => { type => 'string' }, created => { type => 'date'  }, }  } );
Can add to existing mapping
Can add to existing mapping Cannot change mapping for field
Core field types { type  => 'string', }
Core field types { type  => 'string', # byte|short|integer|long|double|float # date, ip addr, geolocation # boolean # binary (as base 64) }
Core field types { type  => 'string', index  => ' analyzed ', # 'Foo Bar'  ⇒  [ 'foo', 'bar' ] }
Core field types { type  => 'string', index  => ' not_analyzed ', # 'Foo Bar'  ⇒  [ 'Foo Bar' ] }
Core field types { type  => 'string', index  => ' no ', # 'Foo Bar'  ⇒  [ ] }
Core field types { type  => 'string', index  => 'analyzed', analyzer  => 'default', }
Core field types { type  => 'string', index  => 'analyzed', index_ analyzer  => 'default', search_ analyzer => 'default', }
Core field types { type  => 'string', index  => 'analyzed', analyzer  => 'default', boost  => 2, }
Core field types { type  => 'string', index  => 'analyzed', analyzer  => 'default', boost  => 2, include_in_all  => 1 |0 }
[object Object]
Simple
Whitespace
Stop
Keyword Built in analyzers ,[object Object]
Language
Snowball
Custom
The Brown-Cow's Part_No.  #A.BC123-456 joe@bloggs.com keyword: The Brown-Cow's Part_No. #A.BC123-456 joe@bloggs.com whitespace: The, Brown-Cow's, Part_No., #A.BC123-456, joe@bloggs.com simple: the, brown, cow, s, part, no, a, bc, joe, bloggs, com standard: brown, cow's, part_no, a.bc123, 456, joe, bloggs.com snowball (English): brown, cow, part_no, a.bc123, 456, joe, bloggs.com
Token filters ,[object Object]
ASCII Folding
Length
Lowercase
NGram
Edge NGram
Porter Stem
Shingle
Stop
Word Delimiter ,[object Object]
KStem
Snowball
Phonetic
Synonym
Compound Word
Reverse
Elision
Truncate
Unique
Custom Analyzer $c->create_index( index  => 'twitter', settings  => { analysis => { analyzer => { ascii_html => { type  => 'custom', tokenizer  => 'standard', filter  => [ qw( standard lowercase asciifolding stop ) ], char_filter => ['html_strip'] } } }} );
Searching $result = $es->search( index  => 'twitter', type  => 'tweet',  );
Searching $result = $es->search( index  =>  ['twitter','facebook'] , type  =>  ['tweet','post'] ,  );
Searching $result = $es->search( #  all indices #  all types );
Searching $result = $es->search( index  => 'twitter', type  => 'tweet',  query  => { text => { _all => 'foo' }}, );
Searching $result = $es->search( index  => 'twitter', type  => 'tweet', query b   =>  'foo' , #  b == ElasticSearch::SearchBuilder );
Searching $result = $es->search( index  => 'twitter', type  => 'tweet', query  => { text => { _all => 'foo' }}, sort  => [{ '_score': 'desc' }] );
Searching $result = $es->search( index  => 'twitter', type  => 'tweet', query  => { text => { _all => 'foo' }}, sort  => [{ '_score': 'desc' }] from  => 0, size  => 10, );
Query DSL
Queries   vs  Filters
Queries   vs  Filters  ,[object Object],[object Object]
Queries   vs  Filters  ,[object Object]
relevance scoring ,[object Object]
no scoring
Queries   vs  Filters  ,[object Object]
relevance scoring
slower ,[object Object]
no scoring
faster
Queries   vs  Filters  ,[object Object]
relevance scoring
slower
no caching ,[object Object]
no scoring
faster
cacheable
Queries   vs  Filters  ,[object Object]
relevance scoring
slower
no caching ,[object Object]
no scoring
faster
cacheable  Use filters for anything that doesn't affect the relevance score!
Query only Query DSL: $es->search(  query => {  text => { title => 'perl' }  } ); SearchBuilder: $es->search(  query b  => {  title => 'perl'  } );
Filter only Query DSL: $es->search( query => { constant_score => { filter => { term => { tag => 'perl } } } }); SearchBuilder: $es->search( query b  => { -filter => {  tag => 'perl'  } });
Query and filter Query DSL: $es->search( query => { filtered  => { query => {  text => { title => 'perl' } }, filter =>{  term => { tag => 'perl'  } } } }); SearchBuilder: $es->search( query b  => { title  => 'perl', -filter => {  tag => 'perl'  }  });

More Related Content

Similar to Terms of endearment - the ElasticSearch Query DSL explained

Php Basic Security
Php Basic SecurityPhp Basic Security
Php Basic Securitymussawir20
 
High-level Web Testing
High-level Web TestingHigh-level Web Testing
High-level Web Testingpetersergeant
 
Exploiting Php With Php
Exploiting Php With PhpExploiting Php With Php
Exploiting Php With PhpJeremy Coates
 
PHP 102: Out with the Bad, In with the Good
PHP 102: Out with the Bad, In with the GoodPHP 102: Out with the Bad, In with the Good
PHP 102: Out with the Bad, In with the GoodJeremy Kendall
 
Intro python
Intro pythonIntro python
Intro pythonkamzilla
 
Drupal Lightning FAPI Jumpstart
Drupal Lightning FAPI JumpstartDrupal Lightning FAPI Jumpstart
Drupal Lightning FAPI Jumpstartguestfd47e4c7
 
03 Php Array String Functions
03 Php Array String Functions03 Php Array String Functions
03 Php Array String FunctionsGeshan Manandhar
 
Schema design with MongoDB (Dwight Merriman)
Schema design with MongoDB (Dwight Merriman)Schema design with MongoDB (Dwight Merriman)
Schema design with MongoDB (Dwight Merriman)MongoSF
 
Forum Presentation
Forum PresentationForum Presentation
Forum PresentationAngus Pratt
 
Intro to #memtech PHP 2011-12-05
Intro to #memtech PHP   2011-12-05Intro to #memtech PHP   2011-12-05
Intro to #memtech PHP 2011-12-05Jeremy Kendall
 
Cool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchCool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchclintongormley
 
Haml & Sass presentation
Haml & Sass presentationHaml & Sass presentation
Haml & Sass presentationbryanbibat
 
Ods Markup And Tagsets: A Tutorial
Ods Markup And Tagsets: A TutorialOds Markup And Tagsets: A Tutorial
Ods Markup And Tagsets: A Tutorialsimienc
 
Introduction into Struts2 jQuery Grid Tags
Introduction into Struts2 jQuery Grid TagsIntroduction into Struts2 jQuery Grid Tags
Introduction into Struts2 jQuery Grid TagsJohannes Geppert
 

Similar to Terms of endearment - the ElasticSearch Query DSL explained (20)

Php Basic Security
Php Basic SecurityPhp Basic Security
Php Basic Security
 
High-level Web Testing
High-level Web TestingHigh-level Web Testing
High-level Web Testing
 
Exploiting Php With Php
Exploiting Php With PhpExploiting Php With Php
Exploiting Php With Php
 
PHP 102: Out with the Bad, In with the Good
PHP 102: Out with the Bad, In with the GoodPHP 102: Out with the Bad, In with the Good
PHP 102: Out with the Bad, In with the Good
 
HTML5 Web Forms
HTML5 Web FormsHTML5 Web Forms
HTML5 Web Forms
 
Intro python
Intro pythonIntro python
Intro python
 
JSP Custom Tags
JSP Custom TagsJSP Custom Tags
JSP Custom Tags
 
Sencha Touch Intro
Sencha Touch IntroSencha Touch Intro
Sencha Touch Intro
 
JQuery 101
JQuery 101JQuery 101
JQuery 101
 
Drupal Lightning FAPI Jumpstart
Drupal Lightning FAPI JumpstartDrupal Lightning FAPI Jumpstart
Drupal Lightning FAPI Jumpstart
 
JQuery Basics
JQuery BasicsJQuery Basics
JQuery Basics
 
03 Php Array String Functions
03 Php Array String Functions03 Php Array String Functions
03 Php Array String Functions
 
Schema design with MongoDB (Dwight Merriman)
Schema design with MongoDB (Dwight Merriman)Schema design with MongoDB (Dwight Merriman)
Schema design with MongoDB (Dwight Merriman)
 
Forum Presentation
Forum PresentationForum Presentation
Forum Presentation
 
Intro to #memtech PHP 2011-12-05
Intro to #memtech PHP   2011-12-05Intro to #memtech PHP   2011-12-05
Intro to #memtech PHP 2011-12-05
 
Cool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchCool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearch
 
Haml & Sass presentation
Haml & Sass presentationHaml & Sass presentation
Haml & Sass presentation
 
Ods Markup And Tagsets: A Tutorial
Ods Markup And Tagsets: A TutorialOds Markup And Tagsets: A Tutorial
Ods Markup And Tagsets: A Tutorial
 
Introduction into Struts2 jQuery Grid Tags
Introduction into Struts2 jQuery Grid TagsIntroduction into Struts2 jQuery Grid Tags
Introduction into Struts2 jQuery Grid Tags
 
Mojolicious on Steroids
Mojolicious on SteroidsMojolicious on Steroids
Mojolicious on Steroids
 

Recently uploaded

Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 

Recently uploaded (20)

Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 

Terms of endearment - the ElasticSearch Query DSL explained

  • 1. “ Terms of Endearment” The ElasticSearch query language explained Clinton Gormley, YAPC::EU 2011 DRTECH @clintongormley
  • 2. search for : “ DELETE QUERY ” We can
  • 3. search for : “ DELETE QUERY ” and find : “ deleteByQuery ” We can
  • 4. but you can only find what is stored in the database
  • 5. Normalise values “ deleteByQuery” 'delete' 'by' 'query' 'deletebyquery'
  • 6. Normalise values and search terms “ deleteByQuery” “ DELETE QUERY” ' delete ' 'by' ' query ' 'deletebyquery'
  • 7. Normalise values and search terms “ deleteByQuery” “ DELETE QUERY” ' delete ' 'by' ' query ' 'deletebyquery'
  • 8. Analyse values and search terms “ deleteByQuery” “ DELETE QUERY” ' delete ' 'by' ' query ' 'deletebyquery'
  • 9. What is stored in ElasticSearch?
  • 10. { tweet => "Perl is GREAT!", posted => "2011-08-15", user => { name => "Clinton Gormley", email => "drtech@cpan.org", }, tags => [" perl" ,"opinion"], posts => 2, } Document:
  • 11. { tweet => "Perl is GREAT!", posted => "2011-08-15", user => { name => "Clinton Gormley", email => "drtech@cpan.org", }, tags => [" perl" ,"opinion"], posts => 2, } Fields:
  • 12. { tweet => "Perl is GREAT!", posted => "2011-08-15", user => { name => "Clinton Gormley", email => "drtech@cpan.org", }, tags => [" perl" ,"opinion"], posts => 2, } Values:
  • 13. { tweet => "Perl is GREAT!", posted => "2011-08-15", user => { name => "Clinton Gormley", email => "drtech@cpan.org" }, tags => [" perl" ,"opinion"], posts => 2, } Field types: # object # string # date # nested object # string # string # array of enums # integer
  • 14. { tweet => "Perl is GREAT!", posted => "2011-08-15", user => { name => "Clinton Gormley", email => "drtech@cpan.org", }, tags => [" perl" ,"opinion"], posts => 2, } Nested objects flattened:
  • 15. { tweet => "Perl is GREAT!", posted => "2011-08-15", user.name => "Clinton Gormley", user.email => "drtech@cpan.org", tags => [" perl" ,"opinion"], posts => 2, } Nested objects flattened
  • 16. { tweet => "Perl is GREAT!", posted => "2011-08-15", user.name => "Clinton Gormley", user.email => "drtech@cpan.org", tags => [" perl" ,"opinion"], posts => 2, } Values analyzed into terms
  • 17. { tweet => ['perl','great'], posted => [Date(2011-08-15)], user.name => ['clinton','gormley'], user.email => ['drtech','cpan.org'], tags => [' perl' ,'opinion'], posts => [2], } Values analyzed into terms
  • 18. database table row ⇒ many tables ⇒ many rows ⇒ one schema ⇒ many columns In MySQL
  • 19. index type document ⇒ many types ⇒ many documents ⇒ one mapping ⇒ many fields In ElasticSearch
  • 20. Create index with mappings $es-> create_index ( index => 'twitter', mappings => { tweet => { properties => { title => { type => 'string' }, created => { type => 'date' } } } } );
  • 21. Add a mapping $es-> put_mapping ( index => 'twitter', type => ' user ', mapping => { properties => { name => { type => 'string' }, created => { type => 'date' }, } } );
  • 22. Can add to existing mapping
  • 23. Can add to existing mapping Cannot change mapping for field
  • 24. Core field types { type => 'string', }
  • 25. Core field types { type => 'string', # byte|short|integer|long|double|float # date, ip addr, geolocation # boolean # binary (as base 64) }
  • 26. Core field types { type => 'string', index => ' analyzed ', # 'Foo Bar' ⇒ [ 'foo', 'bar' ] }
  • 27. Core field types { type => 'string', index => ' not_analyzed ', # 'Foo Bar' ⇒ [ 'Foo Bar' ] }
  • 28. Core field types { type => 'string', index => ' no ', # 'Foo Bar' ⇒ [ ] }
  • 29. Core field types { type => 'string', index => 'analyzed', analyzer => 'default', }
  • 30. Core field types { type => 'string', index => 'analyzed', index_ analyzer => 'default', search_ analyzer => 'default', }
  • 31. Core field types { type => 'string', index => 'analyzed', analyzer => 'default', boost => 2, }
  • 32. Core field types { type => 'string', index => 'analyzed', analyzer => 'default', boost => 2, include_in_all => 1 |0 }
  • 33.
  • 36. Stop
  • 37.
  • 41. The Brown-Cow's Part_No. #A.BC123-456 joe@bloggs.com keyword: The Brown-Cow's Part_No. #A.BC123-456 joe@bloggs.com whitespace: The, Brown-Cow's, Part_No., #A.BC123-456, joe@bloggs.com simple: the, brown, cow, s, part, no, a, bc, joe, bloggs, com standard: brown, cow's, part_no, a.bc123, 456, joe, bloggs.com snowball (English): brown, cow, part_no, a.bc123, 456, joe, bloggs.com
  • 42.
  • 46. NGram
  • 50. Stop
  • 51.
  • 52. KStem
  • 61. Custom Analyzer $c->create_index( index => 'twitter', settings => { analysis => { analyzer => { ascii_html => { type => 'custom', tokenizer => 'standard', filter => [ qw( standard lowercase asciifolding stop ) ], char_filter => ['html_strip'] } } }} );
  • 62. Searching $result = $es->search( index => 'twitter', type => 'tweet', );
  • 63. Searching $result = $es->search( index => ['twitter','facebook'] , type => ['tweet','post'] , );
  • 64. Searching $result = $es->search( # all indices # all types );
  • 65. Searching $result = $es->search( index => 'twitter', type => 'tweet', query => { text => { _all => 'foo' }}, );
  • 66. Searching $result = $es->search( index => 'twitter', type => 'tweet', query b => 'foo' , # b == ElasticSearch::SearchBuilder );
  • 67. Searching $result = $es->search( index => 'twitter', type => 'tweet', query => { text => { _all => 'foo' }}, sort => [{ '_score': 'desc' }] );
  • 68. Searching $result = $es->search( index => 'twitter', type => 'tweet', query => { text => { _all => 'foo' }}, sort => [{ '_score': 'desc' }] from => 0, size => 10, );
  • 70. Queries vs Filters
  • 71.
  • 72.
  • 73.
  • 75.
  • 77.
  • 80.
  • 83.
  • 87.
  • 90.
  • 93. cacheable Use filters for anything that doesn't affect the relevance score!
  • 94. Query only Query DSL: $es->search( query => { text => { title => 'perl' } } ); SearchBuilder: $es->search( query b => { title => 'perl' } );
  • 95. Filter only Query DSL: $es->search( query => { constant_score => { filter => { term => { tag => 'perl } } } }); SearchBuilder: $es->search( query b => { -filter => { tag => 'perl' } });
  • 96. Query and filter Query DSL: $es->search( query => { filtered => { query => { text => { title => 'perl' } }, filter =>{ term => { tag => 'perl' } } } }); SearchBuilder: $es->search( query b => { title => 'perl', -filter => { tag => 'perl' } });
  • 98. Filters : equality Query DSL: { term => { tags => 'perl' }} { terms => { tags => ['perl','ruby'] }} SearchBuilder: { tags => 'perl' } { tags => ['perl','ruby'] }
  • 99. Filters : range Query DSL: { range => { date => { gte => '2010-11-01', lt => '2010-12-01' }} SearchBuilder: { date => { gte => '2010-11-01', lt => '2011-12-01' }}
  • 100. Filters : range (many values) Query DSL: { numeric_range => { date => { gte => '2010-11-01', lt => '2010-12-01 }} SearchBuilder: { date => { ' >= ' => '2010-11-01', ' < ' => '2011-12-01' }}
  • 101. Filters : and | or | not Query DSL: { and => [ {term=>{X=>1}}, {term=>{Y=>2}} ]} { or => [ {term=>{X=>1}}, {term=>{Y=>2}} ]} { not => { or => [ {term=>{X=>1}}, {term=>{Y=>2}} ] }} SearchBuilder: { X => 1, Y => 2 } [ X => 1, Y => 2 ] { -not => { X => 1, Y => 2 } } # and { -not => [ X => 1, Y => 2 ] } # or
  • 102. Filters : exists | missing Query DSL: { exists => { field => 'title' }} { missing => { field => 'title' }} SearchBuilder: { -exists => 'title' } { -missing => 'title' }
  • 103. Filter example SearchBuilder: { -filter => [ featured => 1, { created_at => { gt => '2011-08-01' }, status => { '!=' => 'pending' }, }, ] }
  • 104. Filter example Query DSL: { constant_score => { filter => { or => [ { term => { featured => 1 }}, { and => [ { not => { term => { status => 'pending' }}, { range => { created_at => { gt => '2011-08-01' }}}, ] } ] } } }
  • 105.
  • 106. nested
  • 108. query
  • 110. prefix
  • 111.
  • 112. type
  • 117.
  • 120.
  • 121. range
  • 122. prefix
  • 123. fuzzy
  • 125. ids
  • 126.
  • 128.
  • 129.
  • 131.
  • 134.
  • 137.
  • 138. range
  • 139. prefix
  • 140. fuzzy
  • 142. ids
  • 143.
  • 145.
  • 146.
  • 148.
  • 153. Text/Analyzed Queries analyzed ⇒ text query using search_analyzer
  • 154. Text-Query Family Query DSL: { text => { title => 'great perl' }} Search Builder: { title => 'great perl' }
  • 155. Text-Query Family Query DSL: { text => { title => { query => 'great perl' }}} Search Builder: { title => { '=' => { query => 'great perl' }}}
  • 156. Text-Query Family Query DSL: { text => { title => { query => 'great perl' , operator => 'and' }}} Search Builder: { title => { '=' => { query => 'great perl', operator => 'and' }}}
  • 157. Text-Query Family Query DSL: { text => { title => { query => 'great perl' , fuzziness => 0.5 }}} Search Builder: { title => { '=' => { query => 'great perl', fuzziness => 0.5 }}}
  • 158. Text-Query Family Query DSL: { text => { title => { query => 'great perl', type => 'phrase' }}} Search Builder: { title => { '==' => { query => 'great perl', }}}
  • 159. Text-Query Family Query DSL: { text => { title => { query => ' great perl ', type => 'phrase' }}} Search Builder: { title => { '==' => { query => ' great perl ', }}}
  • 160. Text-Query Family Query DSL: { text => { title => { query => ' perl is great ', type => 'phrase' }}} Search Builder: { title => { '==' => { query => ' perl is great ', }}}
  • 161. Text-Query Family Query DSL: { text => { title => { query => ' perl great ', type => 'phrase', slop => 3 }}} Search Builder: { title => { '==' => { query => ' perl great ', slop => 3 }}}
  • 162. Text-Query Family Query DSL: { text => { title => { query => ' perl is gr ', type => ' phrase_prefix ', }}} Search Builder: { title => { '^' => { query => ' perl is gr ', }}}
  • 163. Query string / Field Lucene Query Syntax aware “ perl is great”~5 AND author:clint* -deleted
  • 164. Query string / Field Syntax errors: AND perl is great ” author : clint* -
  • 165. Query string / Field Syntax errors: AND perl is great ” author : clint* - ElasticSearch::QueryParser
  • 166. Combining: Bool Query DSL: { bool => { must => [ { term => { foo => 1}}, ... ], must_not => [ { term => { bar => 1}}, ... ], should => [ { term => { X => 2}}, { term => { Y => 2}},... ], minimum_number_should_match => 1, }}
  • 167. Combining: Bool SearchBuilder: { foo => 1, bar => { '!=' => 1}, -or => [ X => 2, Y => 2], } { -bool => { must => { foo => 1 }, must_not => { bar => 1 }, should => [{ X => 2}, { Y => 2 }], minimum_number_should_match => 1, }}
  • 168. Combining: DisMax Query DSL: { dis_max => { queries => [ { term => { foo => 1}}, { term => { bar => 1}}, ] }} SearchBuilder: { -dis_max => [ { term => { foo => 1}}, { term => { bar => 1}}, ], }
  • 169. Bool: combines scores DisMax: uses highest score from all matching clauses
  • 172. Boosting: at index time { properties => { content => { type => “string” }, title => { type => “string” }, }
  • 173. Boosting: at index time { properties => { content => { type => “string” }, title => { type => “string”, boost => 2, }, }, }
  • 174. Boosting: at index time { properties => { content => { type => “string” }, title => { type => “string”, boost => 2, }, rank => { type => “integer” }, }, _boost => { name => 'rank', null_value => 1.0 }, }
  • 175. Boosting: at search time Query DSL: { bool => { should => [ { text => { content => 'perl' }}, { text => { title => 'perl' }}, ] }} SearchBuilder: { content => 'perl', title => 'perl' }
  • 176. Boosting: at search time Query DSL: { bool => { should => [ { text => { content => 'perl' }}, { text => { title => { query => 'perl', }}, ] }} SearchBuilder: { content => 'perl', title => { '=' => { query => 'perl' }} }
  • 177. Boosting: at search time Query DSL: { bool => { should => [ { text => { content => 'perl' }}, { text => { title => { query => 'perl', boost => 2 }}, ] }} SearchBuilder: { content => 'perl', title => { '=' => { query => 'perl', boost=> 2 }} }
  • 178. Boosting: custom_score Query DSL: { custom_score => { query => { text => { title => 'perl' }}, script => “_score * foo /doc['rank'].value”, }} SearchBuilder: { -custom_score => { query => { title => 'perl' }, script => “_score * foo /doc['rank'].value”, }}
  • 179. Query example SearchBuilder: { -or => [ title => { '=' => { query => 'custom score', boost => 2 }}, content => 'custom score', ], -filter => { repo => 'elasticsearch/elasticsearch', created_at => { '>=' => '2011-07-01', '<' => '2011-08-01'}, -or => [ creator_id => 123, assignee_id => 123, ], labels => ['bug','breaking'] } }
  • 180. Query example Query DSL: { query => { filtered => { query => { bool => { should => [ { text => { content => &quot;custom score&quot; } }, { text => { title => { boost => 2, query => &quot;custom score&quot; } } }, ], }, }, filter => { and => [ { or => [ { term => { creator_id => 123 } }, { term => { assignee_id => 123 } }, ]}, { terms => { labels => [&quot;bug&quot;, &quot;breaking&quot;] } }, { term => { repo => &quot;elasticsearch/elasticsearch&quot; } }, { numeric_range => { created_at => { gte => &quot;2011-07-01&quot;, lt => &quot;2011-08-01&quot; }}}, ]}, }}
  • 181.