Introduction to Elasticsearch

Elasticsearch is a powerful, distributed, open source search technology. By integrating Elasticsearch into your application, you give users a way to search a lot of data very quickly. Elasticsearch has a RESTful API, it scales, it's super fast, you can customize it with plugins, and much more. In this talk I go over the basics of setting up Elasticsearch, creating a search index, importing your data, and doing some basic searching. I also touch on a few advanced topics that show the flexibility of this awesome service.

Published in: Technology

  1. 1. Introduction to Elasticsearch Jason Austin - @jason_austin
  2. 2. The Problem • You are building a website to find beers • You have a huge database of beers and breweries to sift through • You want simple keyword-based searching • You also want structured searching, like finding all beers > 7% ABV • You want to run some analytics on what beers are in your dataset
  3. 3. Enter Elasticsearch • Lucene based • Distributed • Fast • RESTful interface • Document-Based with JSON
  4. 4. Install Elasticsearch • Download from http://elasticsearch.org • Requires Java to run
  5. 5. Run Elasticsearch • From the install directory: ./bin/elasticsearch -d • http://localhost:9200/
  6. 6. Communicating • Elasticsearch listens to RESTful HTTP requests • GET, POST, PUT, DELETE • CURL works just fine
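As a rough sketch of what such a request looks like outside of curl, the snippet below builds (but does not send) an HTTP request using only Python's standard library. The helper name `build_request` is an illustrative assumption and not part of the talk; the base URL is the default address from the previous slide.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # default address used throughout the slides


def build_request(method, path, body=None):
    """Build an HTTP request for Elasticsearch without sending it."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# Hypothetical example: a GET against the root endpoint.
req = build_request("GET", "/")
print(req.get_method(), req.full_url)
# Sending it requires a running node: urllib.request.urlopen(req)
```

The same helper covers POST, PUT and DELETE by changing the `method` argument and, where needed, passing a JSON-serializable `body`.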
  7. 7. ES Structure Relational DB Databases Tables Rows Columns Elasticsearch Indices Types Documents Fields
  8. 8. ES Structure Elasticsearch Indices Types Documents Fields Elasticsearch phpbeer beer Pliny the Elder ABV, Name, Desc
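The index / type / document hierarchy above maps directly onto the URL paths used in the rest of the slides. A minimal sketch, assuming a hypothetical helper named `doc_path` (newer Elasticsearch releases no longer use types, so this reflects the layout shown in this talk):

```python
from urllib.parse import quote


def doc_path(index, doc_type, doc_id=None):
    """Return the REST path for a type or a single document."""
    parts = [index, doc_type]
    if doc_id is not None:
        parts.append(str(doc_id))
    # Percent-encode each segment so unusual IDs stay a single path component.
    return "/" + "/".join(quote(part, safe="") for part in parts)


print(doc_path("phpbeer", "beer", 1))
```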
  9. 9. Create an Index curl -XPOST 'http://localhost:9200/phpbeer'
  10. 10. What to Search? • Define the types of things to search • Beer • Brewery - Maybe later
  11. 11. Define a Beer • Name • Style • ABV • Brewery ‣ Name ‣ City
  12. 12. Beer JSON
      {
        "name": "Pliny the Elder",
        "style": "Imperial India Pale Ale",
        "abv": 7.0,
        "brewery": {
          "name": "Russian River Brewing Co.",
          "city": "Santa Rosa",
          "state": "California"
        }
      }

  13. 13. Saving The Beer
      curl -XPOST 'http://localhost:9200/phpbeer/beer/1' -d '{
        "name": "Pliny the Elder",
        "style": "Imperial India Pale Ale",
        "abv": 7.0,
        "brewery": {
          "name": "Russian River Brewing Co.",
          "city": "Santa Rosa",
          "state": "California"
        }
      }'

  14. 14. Getting a beer curl -XGET 'http://localhost:9200/phpbeer/beer/1?pretty'
  15. 15. Updating a Beer
      curl -XPOST 'http://localhost:9200/phpbeer/beer/1' -d '{
        "name": "Pliny the Elder",
        "style": "Imperial India Pale Ale",
        "abv": 8.0,
        "brewery": {
          "name": "Russian River Brewing Co.",
          "city": "Santa Rosa",
          "state": "California"
        }
      }'

  16. 16. POST vs PUT • POST • No ID - Creates new doc, assigns ID • With ID - Updates or creates new doc • PUT • No ID - Error • With ID - Updates doc
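The rules on this slide can be mimicked with a small in-memory stand-in. This is only a toy model of the behavior listed above, not how Elasticsearch is implemented; the class name `ToyIndex` and the use of random hex IDs are assumptions made for illustration.

```python
import uuid


class ToyIndex:
    """In-memory stand-in that follows the POST/PUT rules from the slide."""

    def __init__(self):
        self.docs = {}

    def post(self, doc, doc_id=None):
        # POST without an ID: a new document is created and an ID is assigned.
        if doc_id is None:
            doc_id = uuid.uuid4().hex
        # POST with an ID: the document is created, or replaced if it exists.
        self.docs[doc_id] = doc
        return doc_id

    def put(self, doc, doc_id=None):
        # PUT without an ID is an error.
        if doc_id is None:
            raise ValueError("PUT requires a document ID")
        # PUT with an ID: the document stored under that ID is written.
        self.docs[doc_id] = doc
        return doc_id


index = ToyIndex()
generated_id = index.post({"name": "Pliny the Elder", "abv": 7.0})
index.put({"name": "Pliny the Elder", "abv": 8.0}, generated_id)
print(index.docs[generated_id]["abv"])
```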
  17. 17. Delete a Beer curl -XDELETE 'http://localhost:9200/phpbeer/beer/1'
  18. 18. Finally! Searching! curl -XGET 'http://localhost:9200/_search?pretty&q=pliny'
  19. 19. Specific Field Searching
      curl -XGET 'http://localhost:9200/_search?pretty&q=style:pliny'
      curl -XGET 'http://localhost:9200/_search?pretty&q=style:imperial'
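When the search term contains spaces or other special characters, the `q` parameter needs URL encoding. A small sketch of building such a search URL, assuming a hypothetical helper named `uri_search_url`:

```python
from urllib.parse import urlencode


def uri_search_url(query, field=None, base="http://localhost:9200"):
    """Build a URI search URL, optionally restricted to one field."""
    q = f"{field}:{query}" if field else query
    # urlencode percent-encodes characters such as ':' and turns spaces into '+'.
    return f"{base}/_search?" + urlencode({"pretty": "true", "q": q})


print(uri_search_url("pliny"))
print(uri_search_url("imperial", field="style"))
```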
  20. 20. Alternate Approach • Search using DSL (Domain Specific Language) • JSON in request body
  21. 21. DSL Searching
      curl -XGET 'http://localhost:9200/_search?pretty' -d '{
        "query" : {
          "match" : {
            "style" : "imperial"
          }
        }
      }'
  22. 22. DSL = Query + Filter • Query - “How well does the document match” • Filter - Yes or No question on the field
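The distinction can be illustrated with a toy model: the query side returns a graded score, the filter side returns a plain yes or no. The scoring below is a deliberately naive stand-in and not the relevance scoring Elasticsearch uses; only "Pliny the Elder" comes from the slides, the other beers are made up.

```python
# Only "Pliny the Elder" comes from the slides; the other beers are made up.
BEERS = [
    {"name": "Pliny the Elder", "style": "Imperial India Pale Ale", "abv": 8.0},
    {"name": "Example Pale", "style": "Pale Ale", "abv": 5.5},
    {"name": "Example Stout", "style": "Stout", "abv": 6.0},
]


def toy_score(doc, text):
    """Query side: a graded answer (share of style words that match)."""
    words = doc["style"].lower().split()
    wanted = set(text.lower().split())
    return sum(word in wanted for word in words) / len(words)


def abv_below(doc, limit):
    """Filter side: a plain yes/no answer."""
    return doc["abv"] < limit


def search(text, max_abv):
    """Keep docs that match at all and pass the filter, best score first."""
    hits = [d for d in BEERS if toy_score(d, text) > 0 and abv_below(d, max_abv)]
    return sorted(hits, key=lambda d: toy_score(d, text), reverse=True)


for beer in search("pale ale", max_abv=9):
    print(beer["name"], toy_score(beer, "pale ale"))
```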
  23. 23. Query DSL • match • Used to query across all fields for a string • match_phrase • Used to query an exact phrase • match_all • Matches all documents • multi_match • Runs the same match query on multiple fields
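For reference, here is a sketch of what a request body for each of these query types might look like, written as Python dictionaries that serialize to JSON. The field names follow the beer document from earlier slides; the search strings and the `search_body` helper are illustrative assumptions.

```python
import json

# One example body per query type listed on the slide.
QUERIES = {
    "match": {"match": {"style": "pale ale"}},
    "match_phrase": {"match_phrase": {"style": "pale ale"}},
    "match_all": {"match_all": {}},
    "multi_match": {
        "multi_match": {"query": "pliny", "fields": ["name", "brewery.name"]}
    },
}


def search_body(query):
    """Wrap a query clause in the top-level search request body."""
    return {"query": query}


for name, query in QUERIES.items():
    print(name, json.dumps(search_body(query)))
```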
  24. 24. Filter DSL • term • Exact match on a field • range • Match numbers over a specified range • exists / missing • Match based on the existence of a value for a field
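To make the yes/no nature of these filters concrete, the sketch below evaluates filter clauses against a plain dictionary. It is a toy evaluator for flat documents that ignores analysis and other server-side details; the function name `matches` and the sample document are assumptions. The `missing` filter is included because the slide lists it, though it may not exist in newer Elasticsearch versions.

```python
import operator

RANGE_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def matches(flt, doc):
    """Evaluate one filter clause against a flat dict (toy semantics)."""
    ((kind, spec),) = flt.items()
    if kind == "term":
        ((field, value),) = spec.items()
        return doc.get(field) == value
    if kind == "range":
        ((field, bounds),) = spec.items()
        value = doc.get(field)
        if value is None:
            return False
        return all(RANGE_OPS[op](value, bound) for op, bound in bounds.items())
    if kind == "exists":
        return doc.get(spec["field"]) is not None
    if kind == "missing":
        return doc.get(spec["field"]) is None
    raise ValueError(f"unsupported filter: {kind}")


beer = {"style": "Pale Ale", "abv": 5.5}  # made-up sample document
print(matches({"range": {"abv": {"lt": 7}}}, beer))
```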
  25. 25. More Complex Search • Find beer whose styles include “Pale Ale” that are less than 7% ABV
  26. 26. Match + Range
      curl -XGET 'http://localhost:9200/_search?pretty' -d '{
        "query" : {
          "match" : {
            "style" : "pale ale"
          }
        },
        "filter" : {
          "range" : {
            "abv" : { "lt" : 7 }
          }
        }
      }'
  27. 27. Embedded Field Search
      curl -XGET 'http://localhost:9200/_search?pretty' -d '{
        "query" : {
          "match" : {
            "brewery.state" : "California"
          }
        }
      }'
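The dotted name `brewery.state` addresses a field nested inside the `brewery` object. A minimal sketch of that lookup on the beer document from the earlier slides, assuming a hypothetical helper named `get_path`:

```python
def get_path(doc, dotted):
    """Follow a dotted field name such as 'brewery.state' into nested dicts."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


beer = {
    "name": "Pliny the Elder",
    "style": "Imperial India Pale Ale",
    "abv": 8.0,
    "brewery": {
        "name": "Russian River Brewing Co.",
        "city": "Santa Rosa",
        "state": "California",
    },
}

print(get_path(beer, "brewery.state"))
```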
  28. 28. Highlighting Search Results
  29. 29. Highlighting Search Results
      curl -XGET 'http://localhost:9200/_search?pretty' -d '{
        "query" : {
          "match" : {
            "style" : "pale ale"
          }
        },
        "highlight": {
          "fields" : {
            "style" : {}
          }
        }
      }'
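Conceptually, highlighting returns the matching field text with the matched terms wrapped in marker tags (`<em>` by default). The function below is a toy imitation of that idea for a single string, not the actual highlighter; the name `highlight` and its parameters are assumptions.

```python
import re


def highlight(text, query, pre="<em>", post="</em>"):
    """Wrap every word of `text` that appears in `query` (case-insensitive)."""
    terms = {term.lower() for term in query.split()}

    def wrap(match):
        word = match.group(0)
        return f"{pre}{word}{post}" if word.lower() in terms else word

    return re.sub(r"\w+", wrap, text)


print(highlight("Imperial India Pale Ale", "pale ale"))
```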
  30. 30. Aggregations • Collect analytics on your documents • 2 main types • Bucketing • Produce a set of buckets with documents in them • Metric • Compute metrics over a set of documents
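The two kinds of aggregation can be sketched locally in a few lines: bucketing groups documents, and a metric reduces a group to a number. This is an in-memory illustration over made-up sample data, not what runs inside Elasticsearch; the helper names are assumptions.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample data; only the field names follow the slides.
BEERS = [
    {"style": "Pale Ale", "abv": 5.0},
    {"style": "Pale Ale", "abv": 6.0},
    {"style": "Stout", "abv": 7.0},
]


def bucket_by(docs, field):
    """Bucketing: group documents by the value of one field."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[field]].append(doc)
    return dict(buckets)


def avg_metric(docs, field):
    """Metric: compute a single number over a set of documents."""
    return mean(doc[field] for doc in docs)


buckets = bucket_by(BEERS, "style")
for style, docs in buckets.items():
    print(style, len(docs), avg_metric(docs, "abv"))
```

Nesting a metric inside each bucket, as in the last loop, is the same shape as asking for the average ABV per style.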
  31. 31. Bucketing Aggregations
  32. 32. Metric Aggregations • How many beers exist of each style? • What is the average ABV of beers for each style? • How many beers exist that are brewed in California?
  33. 33. What is the average ABV of beers for each style?
      curl -XGET 'http://localhost:9200/_search?pretty' -d '{
        "aggs" : {
          "all_beers" : {
            "terms" : { "field" : "style" },
            "aggs" : {
              "avg_abv" : {
                "avg" : { "field" : "abv" }
              }
            }
          }
        }
      }'
  34. 34. Mappings • Define how ES searches • Completely optional • Must re-index after defining mapping
  35. 35. Create Index with Mapping
      curl -XDELETE localhost:9200/phpbeer
      curl -XPOST localhost:9200/phpbeer -d '{
        "mappings" : {
          "beer" : {
            "_source" : { "enabled" : true },
            "properties" : {
              "style" : {
                "type" : "string",
                "index" : "not_analyzed"
              }
            }
          }
        }
      }'
  36. 36. What is the average ABV of beers for each style?
      curl -XGET 'http://localhost:9200/_search?pretty' -d '{
        "aggs" : {
          "all_beers" : {
            "terms" : { "field" : "style" },
            "aggs" : {
              "avg_abv" : {
                "avg" : { "field" : "abv" }
              }
            }
          }
        }
      }'
  37. 37. Non-Analyzed Fields
      curl -XGET 'http://localhost:9200/_search?pretty&q=style:imperial'
      curl -XGET 'http://localhost:9200/_search?pretty&q=style:hefeweizen'
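A rough way to picture the difference: an analyzed string is broken into lowercase words before indexing, while a `not_analyzed` string is indexed as one exact value. The functions below are a simplified imitation of that behavior (real analyzers do more), and their names are assumptions.

```python
import re


def analyzed_terms(value):
    """Simplified analyzer: lowercase the text and split it into words."""
    return re.findall(r"\w+", value.lower())


def not_analyzed_terms(value):
    """A not_analyzed field keeps the whole value as a single term."""
    return [value]


style = "Imperial India Pale Ale"
print(analyzed_terms(style))
print(not_analyzed_terms(style))
```

With the analyzed form, a per-style aggregation would bucket on individual words such as "pale"; with the single-term form it buckets on the full style name, but a search for just "imperial" no longer matches that term.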
  38. 38. Flexibility • Mixing aggregations, filters and queries all together • What beers have the word “night” in the name that are between 4 and 6 % ABV, broken down by style.
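The transcript does not include the request for this example, so the following is only one possible sketch of it. It assumes the Elasticsearch 1.x era `filtered` query (later versions express the same idea with a `bool` query and its `filter` clause) and a hypothetical function name. The range is placed inside the query rather than as a top-level `filter` because in 1.x a top-level filter is applied after aggregations are computed, so it would not narrow the per-style counts.

```python
import json


def night_beers_by_style(low=4, high=6):
    """Build a body combining a match query, a range filter and a terms aggregation."""
    return {
        "query": {
            "filtered": {  # assumption: 1.x-style filtered query
                "query": {"match": {"name": "night"}},
                "filter": {"range": {"abv": {"gte": low, "lte": high}}},
            }
        },
        # Break the matching beers down by style.
        "aggs": {"by_style": {"terms": {"field": "style"}}},
    }


print(json.dumps(night_beers_by_style(), indent=2))
```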
  39. 39. Elasticsearch and PHP • Elasticsearch PHP Lib
 https://github.com/elasticsearch/elasticsearch-php • Elastica
 http://elastica.io/
  40. 40. Other Awesome ES Features • Search analyzers • Geo-based searching • Elasticsearch Plugins • kopf - http://localhost:9200/_plugin/kopf
  41. 41. Questions? • @jason_austin • http://www.pintlabs.com • https://joind.in/10821
