Debugging and Testing ES Systems
Published in: Technology, News & Politics
Transcript

  • 1. Debugging and Testing ES Systems Chris Birchall 2013/8/29 Elasticsearch Study Group (勉強会), 1st meeting #elasticsearchjp
  • 2. Elasticsearch and me ● At Infoscience, helped build a log management product based on ES + Hadoop ● At M3, ES evangelist (??) ○ Maintain ES cluster ○ Help dev teams integrate ES into their apps Twitter: @cbirchall Github: https://github.com/cb372
  • 3. Search at M3 ● Using ES for all new services ○ Search, recommendation (MoreLikeThis) ● Slowly migrating other services from Solr ● A few legacy services use Lucene directly ● Running all indices on one ES cluster ● Kuromoji for Japanese content
  • 4. Debugging Mostly debugging of queries ● “Why doesn’t doc X match query Y?” ● “Why does this search return no results?” Operational issues are very rare ● ES’s clustering magic is surprisingly stable! ● No performance issues so far
  • 5. Debugging - Step 1 Check for typos! ES will silently ignore many typos in settings/mapping definitions
  • 6. Typo - Example Let’s create a new index... $ curl -X PUT localhost:9200/myapp -d '{ "settings": { "number_of_shards": 3 }, "mapping" : { "article" : { "_source": { "enabled": false }, "properties": { "title": { "type": "string", "store": "true" }, "body": { "type": "string", "store": "true" }, ... } }, ... }'
  • 7. Typo - Example (cont’d) Response from ES: {"ok":true,"acknowledged":true} OK, seems fine...
  • 8. Typo - Example (cont’d) Now check the mappings... $ curl 'localhost:9200/myapp/_mappings?pretty' Response from ES: { "myapp" : { } } Eh? Where are my lovingly-crafted mappings?!
  • 9. Typo - Example (cont’d) Oops! The top-level key must be "mappings", not "mapping". The corrected request: $ curl -X PUT localhost:9200/myapp -d '{ "settings": { "number_of_shards": 3 }, "mappings" : { "article" : { "_source": { "enabled": false }, "properties": { "title": { "type": "string", "store": "true" }, "body": { "type": "string", "store": "true" }, ... } }, ... }'
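Because ES accepts the misspelled key without complaint, one option is to check the top-level keys of the request body on the client side before sending it. The following is a minimal Python sketch, not something from the talk. The helper name `find_suspect_keys` and the set `KNOWN_TOP_LEVEL_KEYS` are assumptions for illustration; extend the set to whatever your ES version actually accepts.

```python
import difflib

# Assumption: the top-level keys we expect in a create-index request body.
# Extend this set to match the ES version you are running.
KNOWN_TOP_LEVEL_KEYS = {"settings", "mappings"}


def find_suspect_keys(body):
    """Return (key, suggestion) pairs for unknown top-level keys.

    suggestion is the closest known key, or None if nothing looks similar.
    """
    suspects = []
    for key in body:
        if key in KNOWN_TOP_LEVEL_KEYS:
            continue
        close = difflib.get_close_matches(key, sorted(KNOWN_TOP_LEVEL_KEYS), n=1)
        suspects.append((key, close[0] if close else None))
    return suspects


# The request body from the slides, abbreviated, with the misspelled key.
body = {
    "settings": {"number_of_shards": 3},
    "mapping": {"article": {"_source": {"enabled": False}}},
}

for key, suggestion in find_suspect_keys(body):
    print("Unknown key %r, did you mean %r?" % (key, suggestion))
```

The same idea could be applied one level down (for example to keys inside a type mapping), at the cost of maintaining a longer list of known names.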
  • 10. Debugging - Step 2 Set up a local environment ● Makes it easy to wipe & rebuild index
  • 11. Setting up a local env (OSX) # Install $ brew install elasticsearch # Kuromoji plugin (optional) $ /usr/local/opt/elasticsearch/bin/plugin -install elasticsearch/elasticsearch-analysis-kuromoji/1.5.0 # Start $ elasticsearch # Create index $ curl -X PUT localhost:9200/my_app -d '{ ... }' # Insert some documents $ curl -X PUT localhost:9200/my_app/my_type/1 -d '{ ... }' $ curl -X PUT localhost:9200/my_app/my_type/2 -d '{ ... }' # Done!
  • 12. Useful commands - Analyze How is my document/query being tokenized? $ curl 'localhost:9200/myindex/_analyze?pretty' -d '東京特許許可局許可局長' { "tokens" : [ { "token" : "東京", "start_offset" : 0, "end_offset" : 2, "type" : "word", "position" : 1 }, { "token" : "特許", "start_offset" : 2, "end_offset" : 4, "type" : "word", ...
  • 13. Useful commands - Explain Why does this document (not) match this query? Specify the document ID in the URL (123 here). $ curl 'localhost:9200/kuro/docs/123/_explain?pretty' -d '{ "query": { "term": { "body": "東京" } } }' { ... "matched" : true, "explanation" : { "value" : 0.375, "description" : "weight(body:東京 in 0) [PerFieldSimilarity], result of:", "details" : [ { "value" : 0.375, "description" : "fieldWeight in 0, product of:", "details" : [ { "value" : 1.0, "description" : "tf(freq=1.0), with freq of:", "details" : [ { "value" : 1.0, "description" : "termFreq=1.0" ...
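The nested "explanation" tree gets hard to read as it grows, so it can help to flatten it into one indented line per node. Here is a minimal Python sketch, assuming the response has already been parsed into a dict. The function name `summarize_explanation` is hypothetical, and the sample data contains only the nodes visible on the slide; the real response continues beyond them.

```python
def summarize_explanation(node, depth=0):
    """Flatten an _explain "explanation" tree into indented one-line strings."""
    lines = ["%s%s  %s" % ("  " * depth, node["value"], node["description"])]
    for child in node.get("details", []):
        lines.extend(summarize_explanation(child, depth + 1))
    return lines


# Only the nodes shown on the slide; the rest of the response is elided there.
explanation = {
    "value": 0.375,
    "description": "weight(body:東京 in 0) [PerFieldSimilarity], result of:",
    "details": [{
        "value": 0.375,
        "description": "fieldWeight in 0, product of:",
        "details": [{
            "value": 1.0,
            "description": "tf(freq=1.0), with freq of:",
            "details": [{"value": 1.0, "description": "termFreq=1.0"}],
        }],
    }],
}

for line in summarize_explanation(explanation):
    print(line)
```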
  • 14. Tuning queries Parameters to tweak ● default_operator (AND/OR) ● auto_generate_phrase_queries ● minimum_should_match ● Stop words/tags ● Kuromoji ○ Segmentation mode ○ Reading form filter ○ Disable Kuromoji! (for some fields)
  • 15. Why disable Kuromoji? Problem: occasionally weird tokenization ● AND query will fail, because not all terms match ● OR query will match any document with 病院 → low precision
      Phrase | Terms
      特定医療法人財団 日本会 東日本病院 (document field) | 特定、医療、法人、財団、日本、会、東日本、病院
      東日本 (query) | 東日、東日本、本
      東日本病院 (query) | 東、東日本、日本、病院
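The AND/OR behaviour on slide 15 can be reproduced with plain set membership on the token lists shown there. This Python sketch is a deliberately simplified model, not how ES evaluates queries: it ignores scoring, positions and phrase handling, and the function name `matches` is hypothetical.

```python
def matches(query_terms, doc_terms, operator):
    """Simplified term matching.

    "AND": every query term must appear in the document.
    "OR":  at least one query term must appear in the document.
    """
    doc = set(doc_terms)
    if operator == "AND":
        return all(term in doc for term in query_terms)
    if operator == "OR":
        return any(term in doc for term in query_terms)
    raise ValueError("operator must be 'AND' or 'OR'")


# Token lists copied from the slide.
doc_terms = ["特定", "医療", "法人", "財団", "日本", "会", "東日本", "病院"]
queries = {
    "東日本": ["東日", "東日本", "本"],
    "東日本病院": ["東", "東日本", "日本", "病院"],
}

# A second, unrelated document that merely contains the word 病院.
other_doc_terms = ["病院"]

for phrase, terms in queries.items():
    print(phrase,
          "AND:", matches(terms, doc_terms, "AND"),
          "OR:", matches(terms, doc_terms, "OR"),
          "OR on unrelated doc:", matches(terms, other_doc_terms, "OR"))
```

Both queries fail under AND because the tokenizer emits terms (東日, 本, 東) that are not in the document's token list, while under OR the query 東日本病院 also matches the unrelated document through 病院 alone.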
  • 16. Useful plugin - Head $ bin/plugin -install mobz/elasticsearch-head http://mobz.github.io/elasticsearch-head/
  • 17. Testing Main goal: Ensure that queries return the results that we expect ● Test coverage of representative queries ○ Freedom to tune for a given query without breaking other queries Ideally, tests should: ● Run fast ● Run standalone (i.e. no need to have an ES server running)
  • 18. Testing - Java elasticsearch-test is awesome ● DSL to set up/tear down ES ● Annotations + JUnit runner ● ES runs in-process ○ No need to start an external ES server ● Index is stored in-memory ○ Runs quickly https://github.com/tlrx/elasticsearch-test
  • 19. https://github.com/cb372/elasticsearch-test-example Testing - Java Simple elasticsearch-test example
  • 20. Testing - Ruby Simple Rails + Tire + RSpec example https://github.com/cb372/elasticsearch-rspec-example
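Independently of the Java and Ruby examples linked above, the idea of "test coverage of representative queries" can be sketched as a table of queries and the document IDs each is expected to return. The Python below is an illustration under stated assumptions, not code from the talk: `check_expectations` is a hypothetical helper, and `fake_search` is a toy in-memory stand-in for a real ES call, so the example runs without a server.

```python
def check_expectations(search, expectations):
    """Run every representative query and collect mismatches.

    search:       callable taking a query string, returning a list of doc IDs
    expectations: dict mapping query string -> expected list of doc IDs
    Returns a list of (query, expected, actual) tuples, empty if all pass.
    """
    failures = []
    for query, expected in sorted(expectations.items()):
        actual = search(query)
        if actual != expected:
            failures.append((query, expected, actual))
    return failures


# Toy corpus and search function standing in for an in-process ES index.
DOCS = {1: "tokyo hospital", 2: "osaka hospital", 3: "tokyo tower"}


def fake_search(query):
    """Return sorted IDs of documents containing all query words (AND)."""
    terms = query.split()
    return sorted(doc_id for doc_id, text in DOCS.items()
                  if all(term in text.split() for term in terms))


expectations = {
    "tokyo": [1, 3],
    "hospital": [1, 2],
    "tokyo hospital": [1],
}

for query, expected, actual in check_expectations(fake_search, expectations):
    print("FAIL %r: expected %s, got %s" % (query, expected, actual))
```

Keeping the expectations in one table is what gives the freedom to tune one query: after any change to analyzers or query parameters, rerunning the table shows whether another query's results moved.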
  • 21. We’re hiring! TODO We are hiring slide http://bit.ly/m3jobs
