CouchDB Day NYC 2017: Full Text Search


Published in: Software


  1. CouchDB Developer Day Full-Text Search Lab
  2. Create a Cloudant account • Go to • Sign up!
  3. Setup
     curl $ -X PUT
     curl $ -X PUT -d '{"indexes":{"baz":{"index":"function(doc){index(\"color\", doc.color); index(\"size\", doc.size);}"}}}'
     curl $ 1 -X PUT -d '{"size": "small", "color": "green"}'
     curl $ -X PUT -d '{"size": "large", "color": "green"}'
     curl $ -X PUT -d '{"size": "small", "color": "red"}'
  4. Searching
     curl $
     curl $
     curl $
     curl $
  5. Pagination
     Every search response includes a "bookmark" attribute. Pass it back to Cloudant in the next request to get the next "page" of results.
     curl https://$*:*&limit=1
     curl https://$*:*&limit=1&bookmark=g2wAAAABaANkAB9kYmNvcmVAZGI1LmplbmV2ZXIuY2xvdWRhbnQubmV0bAAAAAJhAGI_____amgCRj_wAAAAAAAAYQBq
  6. Sorting
     The "sort" parameter lets you sort results on any indexed field or combination of indexed fields.
     curl https://$*:*&sort="size<string>"
     curl https://$*:*&sort="color<string>"
  7. Tokenization
     • Tokenizers break down textual input into tokens for efficient and flexible searching
     • Using an appropriate tokenizer is often critical
     • Generic analyzers: standard, email, keyword, whitespace
     • Language-specific analyzers: english, french, german, spanish, chinese, dutch...
     • You can configure different analyzers for different fields
     • Some tokenizers omit common words
     • Some tokenizers omit common prefixes or suffixes
  8. Tokenization Examples
     > curl https://$ -Hcontent-type:application/json -d '{"analyzer":"standard", "text": ""}'
     {"tokens":["rnewson",""]}
     > curl https://$ -Hcontent-type:application/json -d '{"analyzer":"email", "text": ""}'
     {"tokens":[""]}
     > curl https://$ -Hcontent-type:application/json -d '{"analyzer":"english", "text": "running"}'
     {"tokens":["run"]}
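
The pagination slide describes a loop: request a page, read the "bookmark" from the response, and send it back with the next request. Below is a minimal sketch of that loop in POSIX shell. The slides do not show the full search endpoint, so `SEARCH_URL` is a hypothetical placeholder, and the helper names `page_url`, `extract_bookmark` and `fetch_all` are invented for illustration. The sketch assumes the response is a single line of JSON, that bookmarks contain no quotes or escapes, and that an empty "rows" array signals the last page.

```shell
#!/bin/sh
# Sketch only: SEARCH_URL is a hypothetical placeholder for the search
# endpoint, which the slides do not show in full.

# Build the URL for one page of results.
# $1 = search endpoint, $2 = page size, $3 = bookmark (may be empty)
page_url() {
  url="$1?q=*:*&limit=$2"
  if [ -n "$3" ]; then
    url="$url&bookmark=$3"
  fi
  printf '%s\n' "$url"
}

# Read a JSON response on stdin and print its "bookmark" value.
# Plain sed, not a JSON parser: assumes the bookmark has no quotes or escapes.
extract_bookmark() {
  sed -n 's/.*"bookmark"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Fetch pages until one comes back with an empty "rows" array
# (assumed here to mean there are no more results).
# $1 = search endpoint, $2 = page size
fetch_all() {
  bookmark=""
  while :; do
    body=$(curl -s "$(page_url "$1" "$2" "$bookmark")") || return 1
    printf '%s\n' "$body"
    if printf '%s\n' "$body" | grep -q '"rows"[[:space:]]*:[[:space:]]*\[\]'; then
      break
    fi
    bookmark=$(printf '%s\n' "$body" | extract_bookmark)
    [ -n "$bookmark" ] || break
  done
}

# Only touch the network when the caller has supplied an endpoint.
if [ -n "${SEARCH_URL:-}" ]; then
  fetch_all "$SEARCH_URL" 1
fi
```

Run it as `SEARCH_URL=... sh paginate.sh` (the file name is arbitrary); with `SEARCH_URL` unset the script only defines the helpers.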
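
The sorting slide mentions sorting on a combination of indexed fields but only shows single-field examples. As a hedged sketch: Cloudant's search syntax accepts a JSON array of `"field<type>"` strings for multi-field sorts, with a leading `-` for descending order. Brackets, quotes and angle brackets are awkward in a raw URL, so this version lets curl encode them with `-G --data-urlencode`. The endpoint argument and the helper names `sort_param` and `search_sorted` are hypothetical.

```shell
#!/bin/sh
# Sketch only: the endpoint passed to search_sorted is a placeholder,
# since the slides do not show the full search URL.

# Turn arguments like 'size<string>' '-color<string>' into a JSON array.
sort_param() {
  out=""
  for spec in "$@"; do
    out="$out${out:+,}\"$spec\""
  done
  printf '[%s]\n' "$out"
}

# Query everything, sorted by the given field specs.
# $1 = search endpoint, remaining arguments = sort specs
search_sorted() {
  endpoint=$1
  shift
  # -G sends the url-encoded data as a query string on a GET request.
  curl -s -G "$endpoint" \
    --data-urlencode 'q=*:*' \
    --data-urlencode "sort=$(sort_param "$@")"
}
```

For example, `search_sorted "$SEARCH_URL" 'size<string>' '-color<string>'` would sort by size ascending, then color descending.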
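
The tokenization slide notes that different fields can use different analyzers. One way this might look, assuming Cloudant's "perfield" analyzer configuration, is to extend the "baz" index from the setup slide so that `color` and `size` are indexed as exact keywords while other fields fall back to the standard analyzer. The analyzer choices, the `write_ddoc` and `upload_ddoc` helper names, and the design-document URL argument are illustrative assumptions, not taken from the slides. Writing the JSON to a file with a quoted heredoc also sidesteps the nested-quote escaping needed on the command line.

```shell
#!/bin/sh
# Sketch only: analyzer choices and helper names are illustrative.

# Write a design document with per-field analyzers to the file named by $1.
write_ddoc() {
  cat > "$1" <<'EOF'
{
  "indexes": {
    "baz": {
      "analyzer": {
        "name": "perfield",
        "default": "standard",
        "fields": {"color": "keyword", "size": "keyword"}
      },
      "index": "function(doc){index(\"color\", doc.color); index(\"size\", doc.size);}"
    }
  }
}
EOF
}

# Upload the file to a design-document URL (placeholder, supplied by the caller).
# $1 = design document URL, $2 = file written by write_ddoc
upload_ddoc() {
  curl -s -X PUT -H 'Content-Type: application/json' --data-binary @"$2" "$1"
}
```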