• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
Simple search with elastic search
 

Simple search with elastic search

on

  • 4,677 views

Talk given at Cakefest 2012 about elasticsearch and CakePHP.

Talk given at Cakefest 2012 about elasticsearch and CakePHP.

Statistics

Views

Total Views
4,677
Views on SlideShare
3,960
Embed Views
717

Actions

Likes
5
Downloads
37
Comments
0

7 Embeds 717

http://www.scoop.it 676
https://twitter.com 35
http://darya-ld1.linkedin.biz 2
https://si0.twimg.com 1
https://twimg0-a.akamaihd.net 1
http://www.php-talks.com 1
http://webcache.googleusercontent.com 1
More...

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel

Simple search with elastic search Simple search with elastic search Presentation Transcript

  • SIMPLE SEARCH WITH ELASTIC SEARCH MARK STORY @MARK_STORY
  • WAT?Java basedLucene poweredJSON drivenDocument orientated databaseAll out super search solutionEasy to setup, and use
  • INDEXES AND TYPES INDEXES Similar concept to databases. Contain multiple types.
  • TYPESSimilar concept to tables.Defines datatypes and indexing rules.
  • DOCUMENT BASED REST API Simple to use and easy to understand.
  • CREATE A DOCUMENTcr -PS lclot90/otcspol - ul XOT oahs:20cnat/epe d { "ae:"akSoy, nm" Mr tr" "mi" "akmr-tr.o" eal: mr@aksoycm, "wte" "mr_tr" titr: @aksoy, "onr" "aaa, cuty: Cnd" "as:[ckpp,"aeet,"aaa] tg" "aeh" ckfs" cnd"}#Rsos: epne{o"tu, "k:re"idx:cnat" _ne""otcs,"tp""epe, _ye:pol""i""iMaiBDAWs5g, _d:9zCSQq1J8i7""vrin:} _eso"1
  • READ IT BACKcr -GTlclot90/otcspol/i?rtytu ul XE oahs:20cnat/epe$dpet=re#Rsos: epne{_ne""otcs, "idx:cnat""tp""epe, _ye:pol""i""iMaiBDAWs5g, _d:9zCSQq1J8i7""vrin:, _eso"1"xss:re eit"tu,"suc":{ _ore "ae:"akSoy, nm" Mr tr" "mi" "akmr-tr.o" eal: mr@aksoycm, "wte" "mr_tr" titr: @aksoy, "onr" "aaa, cuty: Cnd" "as:[ckpp,"aeet,"aaa] tg" "aeh" ckfs" cnd"}}
  • DELETE IT!cr -DLT lclot90/otcspol/i ul XEEE oahs:20cnat/epe$d#Rsos: epne{o"tu, "k:re"on"tu, fud:re"idx:cnat" _ne""otcs,"tp""epe, _ye:pol""i""iMaiBDAWs5g, _d:9zCSQq1J8i7""vrin:} _eso"2
  • SIMPLE SEARCH! More on search to comecr -GTlclot90/otcspol/sac?=akpet=re ul XE oahs:20cnat/epe_erhqMr&rtytu
  • #Rsos: epne{to"1, "ok:4"ie_u"fle tmdot:as,"sad"{ _hrs: "oa"5 ttl:, "ucsfl:, scesu"5 "ald: fie"0},"is: ht"{ "oa"1 ttl:, "a_cr"01746, mxsoe:.1424 "is: ht"[ {_ne""otcs, "idx:cnat" "tp""epe, _ye:pol" "i""llMTWqMZUlw, _d:sJyBSaAfU-D" "soe:.1424 _cr"01746, "suc":{ _ore "ae:"akSoy, nm" Mr tr" "mi" "akmr-tr.o" eal: mr@aksoycm, "wte" "mr_tr" titr: @aksoy, "onr" "aaa, cuty: Cnd" "as:[ckpp,"aeet,"aaa] tg" "aeh" ckfs" cnd" } } ]}}
  • THIS ALL SOUNDS TOO BADASS TO BE TRUE
  • DOCUMENT "DATABASE" IS A BIT LIMITED Partial updates are doable but painful No joins No map reduce Cannot replace all other datasources
  • BUT SEARCH IS AMAZZZING
  • SEARCH BETWEEN TYPES & INDEXES Search multiple typescr -GTlclot90/otcspol,opne/sac?=aeMr ul XE oahs:20cnat/epecmais_erhqnm:ak Search multiple indexes in your clustercr -GTlclot90/alpol/sac?=aeMr ul XE oahs:20_l/epe_erhqnm:ak
  • FANCY SEARCH OPTIONS
  • SEARCH WITH TEXT EXPRESSIONScr -GTlclot90/otcspol/sac?rtytu - ul XE oahs:20cnat/epe_erhpet=re d { "ur" { qey: "ur_tig:{ qeysrn" "ur" "akO wlo" qey: mr R edn } }}
  • HIGHLIGHT SEARCH KEYWORDSWrap search terms in highlighting text/markup/html. Great for larger documents, as you can extract fragments.
  • cr -GTlclot90/otcspol/sac?rtytu - ul XE oahs:20cnat/epe_erhpet=re d { "ur" { qey: "et:{ tx" "mi" "ak eal: mr" } }, "ihih" { hglgt: "ils:{ fed" "mi" {, eal: } "ae:{ nm" } } }}
  • FACETSFacets provide aggregated data about a query. You can use this data to create drill down search, or histogram data. Term counts. Custom script values. Ranges - like price ranges. Geo distance facets - aggregate results by distance.
  • cr -GTlclot90/otcspol/sac?rtytu - ul XE oahs:20cnat/epe_erhpet=re d { "ur" { qey: "ur_tig:{ qeysrn" "ur" ".o" qey: *cm } }, "aes:{ fct" "agd:{trs:{fed:"as}} tge" "em" "il" tg" }}
  • KNOBS & BUTTONS
  • MAPPINGSAllows fine-grained searching later on, and lets you configurecustom mappings.Control the data types, and indexing used for JSON documenttypes.Disable indexing on specific fields.Configure custom analyzers. For example, non-english stemming.
  • AVAILABLE MAPPING TYPESstring, integer, float, boolean, nullobject - Standard type for nested objects. AllowsArrays are automatically handled as the above.properties to be defined.
  • multi_field - Allows a field to be handled multiple ways with differentaliases.nested - Indexes sub objects, and works with nested filter/queries.ip - For ipv4 data.geo_point - For lat/lon values. Enables piles of search options.attachment - Store a blob. Can index many text based documentslike PDF.
  • CREATE A MAPPINGcr -PTlclot90/otcspol/mpig- ul XU oahs:20cnat/epe_apn d { "epe:{ pol" "rpris:{ poete" "ae:{tp" "tig} nm" "ye: srn", "mi" {tp" "tig} eal: "ye: srn", "wte" {tp" "tig} titr: "ye: srn", "onr" {tp" "tig} cuty: "ye: srn", "as:{tp" "tig} tg" "ye: srn" } }}
  • DEFINE THE ANALYZER USEDWhen defining a field you can use a a y e i d x a a y e , nlzr ne_nlzrand s a c _ n l z r customize the way data is stored, and or e r h a a y e tosearched.You can also disable analyzing for specific fields.
  • DISABLE INDEXING{ "ae:{ nm" "ye:"tig, tp" srn" "ne" "o_nlzd, idx: ntaaye" }, "oe:{ nn" "ye:"nee" tp" itgr, "ne" "o idx: n" }}
  • SHARDS & REPLICAS SHARDS Define how many nodes you want to split your data across. If a node goes down, you still have some of your data. You can use routing to control how data is sharded.More shards improves indexing performance, as work is distributed.
  • SIMPLE SHARDING
  • SHARD OVER MULTIPLE NODES
  • REPLICAS Define how many copies of your data you want. If several nodes go down, you might still have all your data.More replicas improves search performance and cluster availability.
  • REPLICAS
  • MULTI-TENANCYMulti-tenancy is a reasonably common requirement, and there are a few ways to do it.
  • ONE INDEX PER TENANTGreat for small number of tenants.Painful for larger number of tenants. As sharding and replicas canbe harder to manage.
  • cr -GTlclot90/akcnat/sac?rtytu - ul XE oahs:20mr/otcs_erhpet=re d { "ur" { qey: "ur_tig:{ qeysrn" "ur" "ednO js" qey: wlo R oe } }}
  • SPECIAL FILTER CONDITIONSMore error prone as you have to include a filter condition.Easy to shard and setup replicas.Easily scales to many tenants. As shards/replicas are shared.Make sure tenant id is a non-analyzed value.
  • cr -GTlclot90/conigivie/sac?rtytu - ul XE oahs:20acutn/nocs_erhpet=re d { "ur" { qey: "itrd:{ flee" "itr:{ fle" "em:{acutd:1 tr" "coni" } }, "ur" { qey: "ur_tig:{ qeysrn" "ur" "upewf" qey: prl ii } } } }}
  • OTHER BATTERIES INCLUDED Routing Define how documents are sharded. Rivers Pipe data in realtime from sources like RabbitMQ. Thrift Talk thirft to ElasticSearch.
  • INTEGRATION WITH CAKEPHP
  • HTTPSOCKET + JSON_ENCODE() Basic, can be hard to use. No magic.
  • ELASTICSEARCH DATASOURCE (David Kullman) Behavior to auto index on aftersave Datasource for searching elasticsearch Console app to index models
  • ELASTICSEARCH PLUGIN (Kevin von Zonneveld)Similar features to the previous pluginOffers more control on how data is indexed
  • QUESTIONS?