07. ElasticSearch : Full-Body Search

Published in: Data & Analytics

  1. ElasticSearch Full-Body Search
     http://elastic.openthinklabs.com/
  2. Empty Search
         GET /_search
         {}

         GET /index_2014*/type1,type2/_search
         {}

         GET /_search
         {
           "from": 30,
           "size": 10
         }
     The search API also accepts POST requests, so each GET request above can be sent as a POST instead.
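The from and size values above can be derived from a page number, and the search path from lists of index and type names. The following is a minimal Python sketch; the helper names `page_body` and `search_path` and the 1-based page numbering are assumptions for illustration, not part of the slides or of any Elasticsearch client. It only builds the path and body and sends nothing.

```python
import json


def page_body(page, size=10):
    """Build a search body for a 1-based page number (hypothetical helper)."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    # "from" is the number of hits to skip before the first returned hit.
    return {"from": (page - 1) * size, "size": size}


def search_path(indices=(), types=()):
    """Join index and type names into a search path like the ones above."""
    if types and not indices:
        raise ValueError("types need at least one index name in front of them")
    parts = [",".join(indices), ",".join(types)]
    # Drop empty segments so that no indices and no types gives "/_search".
    return "/" + "/".join([p for p in parts if p] + ["_search"])


if __name__ == "__main__":
    # Page 4 with 10 hits per page skips the first 30 hits.
    print(search_path(["index_2014*"], ["type1", "type2"]))
    print(json.dumps(page_body(4)))
```

With these assumptions, page 4 at 10 hits per page corresponds to the `"from": 30, "size": 10` request shown on the slide.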
  3. Query DSL
         GET /_search
         {
           "query": YOUR_QUERY_HERE
         }

         GET /_search
         {
           "query": {
             "match_all": {}
           }
         }
  4. Query DSL: Structure of a Query Clause
     A query clause typically has this structure:
         {
           QUERY_NAME: {
             ARGUMENT: VALUE,
             ARGUMENT: VALUE,...
           }
         }
     If it references one particular field, it has this structure:
         {
           QUERY_NAME: {
             FIELD_NAME: {
               ARGUMENT: VALUE,
               ARGUMENT: VALUE,...
             }
           }
         }
     For instance, you can use a match query clause to find tweets that mention elasticsearch in the tweet field:
         { "match": { "tweet": "elasticsearch" } }
     The full search request would look like this:
         GET /_search
         {
           "query": {
             "match": { "tweet": "elasticsearch" }
           }
         }
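The two clause shapes above map directly onto nested dictionaries. This is a small Python sketch under that assumption; `clause` and `match` are hypothetical helper names chosen for illustration, and the code only produces JSON-ready dictionaries.

```python
import json


def clause(query_name, field=None, **arguments):
    """Build a query clause in one of the two shapes described above.

    Hypothetical helper: without a field, the arguments sit directly under
    the query name; with a field, they are nested under the field name.
    """
    if field is None:
        return {query_name: arguments}
    return {query_name: {field: arguments}}


def match(field, text):
    """Short form of a match clause: the field maps straight to the text."""
    return {"match": {field: text}}


if __name__ == "__main__":
    # Wrap the clause in a "query" key to get a full search body.
    body = {"query": match("tweet", "elasticsearch")}
    print(json.dumps(body))
    print(json.dumps(clause("match_all")))
    print(json.dumps(clause("range", "age", gte=20, lt=30)))
```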
  5. Query DSL: Combining Multiple Clauses
     Query clauses are simple building blocks that can be combined with each other to create complex queries. Clauses can be as follows:
     ● Leaf clauses (like the match clause), which compare a field (or fields) to a query string
     ● Compound clauses, which combine other query clauses
     For instance, a bool clause combines other clauses that must match, must_not match, or should match if possible. It can also include non-scoring filters for structured search:
         {
           "bool": {
             "must":     { "match": { "tweet": "elasticsearch" }},
             "must_not": { "match": { "name":  "mary" }},
             "should":   { "match": { "tweet": "full text" }},
             "filter":   { "range": { "age": { "gt": 30 }}}
           }
         }
     A compound clause can contain any other query clauses, including other compound clauses. This query looks for emails that contain business opportunity and that either are starred, or are in the inbox and not marked as spam:
         {
           "bool": {
             "must": { "match": { "email": "business opportunity" }},
             "should": [
               { "match": { "starred": true }},
               { "bool": {
                   "must":     { "match": { "folder": "inbox" }},
                   "must_not": { "match": { "spam": true }}
               }}
             ],
             "minimum_should_match": 1
           }
         }
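Because compound clauses nest, they are convenient to assemble programmatically. The sketch below rebuilds the email query from small pieces; `match` and `bool_query` are hypothetical helpers (the trailing underscore in `filter_` only avoids shadowing Python's built-in `filter`), and the output is plain JSON for the request body.

```python
import json


def match(field, value):
    """Leaf clause: compare one field to a value."""
    return {"match": {field: value}}


def bool_query(must=None, must_not=None, should=None, filter_=None,
               minimum_should_match=None):
    """Compound clause: combine other clauses, omitting sections not given."""
    sections = {
        "must": must,
        "must_not": must_not,
        "should": should,
        "filter": filter_,
        "minimum_should_match": minimum_should_match,
    }
    return {"bool": {name: value for name, value in sections.items()
                     if value is not None}}


if __name__ == "__main__":
    # Inner compound clause: in the inbox and not marked as spam.
    inbox_not_spam = bool_query(must=match("folder", "inbox"),
                                must_not=match("spam", True))
    # Outer clause: must mention the phrase, and satisfy at least one "should".
    email_query = bool_query(must=match("email", "business opportunity"),
                             should=[match("starred", True), inbox_not_spam],
                             minimum_should_match=1)
    print(json.dumps(email_query, indent=2))
```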
  6. Queries and Filters
     ● Filtering context (non-scoring query): does the document match, yes or no?
       ● Is the created date in the range 2013 – 2014?
       ● Does the status field contain the term published?
       ● Is the lat_lon field within 10km of a specified point?
     ● Query context (scoring query): how well does the document match?
       ● Best matching the words full text search
       ● Containing the word run, but maybe also matching runs, running, jog, or sprint
       ● Containing the words quick, brown, and fox: the closer together they are, the more relevant the document
       ● Tagged with lucene, search, or java: the more tags, the more relevant the document
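The difference between the two contexts can be illustrated with a toy in-memory example: a filter answers yes or no, while a scoring query returns a number used for ranking. This is only an illustration of the idea. The functions, field names, and the "count the matching tags" score are assumptions and are not how Elasticsearch computes relevance.

```python
def passes_filter(doc, status):
    """Filtering context: a yes/no answer, no score involved."""
    return doc["status"] == status


def tag_score(doc, wanted_tags):
    """Query context (toy): more matching tags gives a higher score."""
    return len(set(doc["tags"]) & set(wanted_tags))


def search(docs, status, wanted_tags):
    """Filter first, then rank the remaining documents by score."""
    hits = [d for d in docs if passes_filter(d, status)]
    hits = [d for d in hits if tag_score(d, wanted_tags) > 0]
    return sorted(hits, key=lambda d: tag_score(d, wanted_tags), reverse=True)


if __name__ == "__main__":
    docs = [
        {"id": 1, "status": "published", "tags": ["lucene"]},
        {"id": 2, "status": "draft", "tags": ["lucene", "search", "java"]},
        {"id": 3, "status": "published", "tags": ["lucene", "search"]},
        {"id": 4, "status": "published", "tags": ["python"]},
    ]
    for doc in search(docs, "published", ["lucene", "search", "java"]):
        print(doc["id"], tag_score(doc, ["lucene", "search", "java"]))
```

In this toy, the draft document is excluded by the filter regardless of how many tags it matches, and the remaining documents are ordered by how many of the wanted tags they carry.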
  7. Most Important Queries
     match_all Query
         { "match_all": {}}
     match Query
         { "match": { "tweet": "About Search" }}
         { "match": { "age":    26           }}
         { "match": { "date":   "2014-09-01" }}
         { "match": { "public": true         }}
         { "match": { "tag":    "full_text"  }}
     multi_match Query
         {
           "multi_match": {
             "query":  "full text search",
             "fields": [ "title", "body" ]
           }
         }
     range Query
         {
           "range": {
             "age": {
               "gte": 20,
               "lt":  30
             }
           }
         }
     term Query
         { "term": { "age":    26           }}
         { "term": { "date":   "2014-09-01" }}
         { "term": { "public": true         }}
         { "term": { "tag":    "full_text"  }}
     terms Query
         { "terms": { "tag": [ "search", "full_text", "nosql" ] }}
     exists and missing Queries
         {
           "exists": {
             "field": "title"
           }
         }
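The bounds of the range query (gt, gte, lt, lte) and the "any of these values" behaviour of terms can be mimicked on plain Python values. This is a toy illustration of what the clauses mean, not of how Elasticsearch evaluates them against an index; `range_matches` and `terms_matches` are hypothetical names.

```python
import operator

# Map each range keyword to the comparison it stands for.
_BOUNDS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def range_matches(value, bounds):
    """True if value satisfies every bound, e.g. {"gte": 20, "lt": 30}."""
    return all(_BOUNDS[name](value, limit) for name, limit in bounds.items())


def terms_matches(field_values, wanted):
    """True if the field holds at least one of the wanted exact values."""
    return any(value in wanted for value in field_values)


if __name__ == "__main__":
    age_bounds = {"gte": 20, "lt": 30}
    for age in (19, 20, 29, 30):
        print(age, range_matches(age, age_bounds))
    print(terms_matches(["nosql", "db"], ["search", "full_text", "nosql"]))
```

Note that gte includes the lower bound (20 matches) while lt excludes the upper bound (30 does not).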
  8. Combining Queries Together
         {
           "bool": {
             "must":     { "match": { "title": "how to make millions" }},
             "must_not": { "match": { "tag":   "spam" }},
             "should": [
               { "match": { "tag": "starred" }},
               { "range": { "date": { "gte": "2014-01-01" }}}
             ]
           }
         }

         {
           "bool": {
             "must":     { "match": { "title": "how to make millions" }},
             "must_not": { "match": { "tag":   "spam" }},
             "should": [
               { "match": { "tag": "starred" }}
             ],
             "filter": {
               "range": { "date": { "gte": "2014-01-01" }}
             }
           }
         }

         {
           "bool": {
             "must":     { "match": { "title": "how to make millions" }},
             "must_not": { "match": { "tag":   "spam" }},
             "should": [
               { "match": { "tag": "starred" }}
             ],
             "filter": {
               "bool": {
                 "must": [
                   { "range": { "date":  { "gte": "2014-01-01" }}},
                   { "range": { "price": { "lte": 29.99 }}}
                 ],
                 "must_not": [
                   { "term": { "category": "ebooks" }}
                 ]
               }
             }
           }
         }

         {
           "constant_score": {
             "filter": {
               "term": { "category": "ebooks" }
             }
           }
         }
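The step from the first query above to the second, moving the date range out of should and into a non-scoring filter, can be sketched as a small transformation on the query dictionary. The helper names `add_filter` and `constant_score` are assumptions for illustration; the code only manipulates dictionaries and does not talk to a cluster.

```python
import copy
import json


def constant_score(filter_clause):
    """Wrap a filter so it can be used on its own as a non-scoring query."""
    return {"constant_score": {"filter": filter_clause}}


def add_filter(bool_clause, filter_clause):
    """Return a copy of a bool clause with one more entry in its filter section."""
    result = copy.deepcopy(bool_clause)  # leave the caller's query untouched
    body = result["bool"]
    existing = body.get("filter")
    if existing is None:
        body["filter"] = filter_clause
    elif isinstance(existing, list):
        existing.append(filter_clause)
    else:
        body["filter"] = [existing, filter_clause]
    return result


if __name__ == "__main__":
    scored = {
        "bool": {
            "must": {"match": {"title": "how to make millions"}},
            "must_not": {"match": {"tag": "spam"}},
            "should": [{"match": {"tag": "starred"}}],
        }
    }
    recent = {"range": {"date": {"gte": "2014-01-01"}}}
    print(json.dumps(add_filter(scored, recent), indent=2))
    print(json.dumps(constant_score({"term": {"category": "ebooks"}})))
```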
  9. Validating Queries
         GET /gb/tweet/_validate/query
         {
           "query": {
             "tweet": { "match": "really powerful" }
           }
         }

         {
           "valid": false,
           "_shards": {
             "total": 1,
             "successful": 1,
             "failed": 0
           }
         }

         GET /gb/tweet/_validate/query?explain
         {
           "query": {
             "tweet": { "match": "really powerful" }
           }
         }

         {
           "valid": false,
           "_shards": { ... },
           "explanations": [ {
             "index": "gb",
             "valid": false,
             "error": "org.elasticsearch.index.query.QueryParsingException: [gb] No query registered for [tweet]"
           } ]
         }

         GET /_validate/query?explain
         {
           "query": {
             "match": { "tweet": "really powerful" }
           }
         }

         {
           "valid": true,
           "_shards": { ... },
           "explanations": [ {
             "index": "us",
             "valid": true,
             "explanation": "tweet:really tweet:powerful"
           }, {
             "index": "gb",
             "valid": true,
             "explanation": "tweet:realli tweet:power"
           } ]
         }
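The invalid request above fails because the field name and the query name are swapped, so tweet is read as a query type. A rough local pre-check for that kind of mistake might look like the sketch below. It is an assumption-laden illustration, not the validate API: the `KNOWN_QUERIES` set only lists the query names that appear in these slides, `check_query` is a hypothetical name, and only the top-level clause is inspected.

```python
# Query names that appear in the slides; a real cluster knows many more.
KNOWN_QUERIES = {
    "match_all", "match", "multi_match", "range", "term", "terms",
    "exists", "bool", "constant_score",
}


def check_query(body):
    """Return (valid, error) for the top-level query clause only."""
    query = body.get("query")
    if not isinstance(query, dict) or len(query) != 1:
        return False, "expected exactly one query clause"
    (name,) = query  # the single key is the query name
    if name not in KNOWN_QUERIES:
        return False, "No query registered for [%s]" % name
    return True, None


if __name__ == "__main__":
    swapped = {"query": {"tweet": {"match": "really powerful"}}}
    fixed = {"query": {"match": {"tweet": "really powerful"}}}
    print(check_query(swapped))
    print(check_query(fixed))
```

A check like this cannot replace the validate API, which also reports how each index's analyzer rewrites the query (for example realli and power for the gb index above).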
  10. References
     ● Elasticsearch: The Definitive Guide. A Distributed Real-Time Search and Analytics Engine, Clinton Gormley & Zachary Tong, O’Reilly
