Introduction to Elasticsearch

Elasticsearch is a powerful, distributed, open source search technology. By integrating Elasticsearch into your application, you gain a way to search a lot of data very quickly. Elasticsearch has a RESTful API, it scales, it's super fast, you can customize it with plugins, and much more. In this talk I go over the basics of setting up Elasticsearch, creating a search index, importing your data, and doing some basic searching. I also touch on a few advanced topics that show the flexibility of this awesome service.

Transcript

  • 1. Introduction to Elasticsearch Jason Austin - @jason_austin
  • 2. The Problem • You are building a website to find beers • You have a huge database of beers and breweries to sift through • You want simple keyword-based searching • You also want structured searching, like finding all beers > 7% ABV • You want to run some analytics on what beers are in your dataset
  • 3. Enter Elasticsearch • Lucene based • Distributed • Fast • RESTful interface • Document-Based with JSON
  • 4. Install Elasticsearch • Download from http://elasticsearch.org • Requires Java to run
  • 5. Run Elasticsearch • From the install directory: ./bin/elasticsearch -d • http://localhost:9200/
  • 6. Communicating • Elasticsearch listens for RESTful HTTP requests • GET, POST, PUT, DELETE • curl works just fine
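Since any HTTP client can talk to Elasticsearch, the same requests can be prepared from code. The following is a minimal Python sketch using only the standard library. It builds a request without sending it; `build_request` and `ES_URL` are names made up for this example, and the address is the default one shown on the slides.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # default address used on the slides


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for Elasticsearch."""
    data = None
    headers = {}
    if body is not None:
        # Request bodies are JSON documents.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        ES_URL + path, data=data, headers=headers, method=method
    )


req = build_request("GET", "/")
print(req.get_method(), req.full_url)

# To send it against a running node (assumption: one is listening locally):
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```

Keeping construction separate from sending makes it possible to inspect a request before it reaches the cluster.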
  • 7. ES Structure • Relational DB → Elasticsearch • Databases → Indices • Tables → Types • Rows → Documents • Columns → Fields
  • 8. ES Structure • Index → phpbeer • Type → beer • Document → Pliny the Elder • Fields → ABV, Name, Desc
  • 9. Create an Index curl -XPOST 'http://localhost:9200/phpbeer'
  • 10. What to Search? • Define the types of things to search • Beer • Brewery - Maybe later
  • 11. Define a Beer • Name • Style • ABV • Brewery ‣ Name ‣ City
  • 12. Beer JSON
    {
      "name": "Pliny the Elder",
      "style": "Imperial India Pale Ale",
      "abv": 7.0,
      "brewery": {
        "name": "Russian River Brewing Co.",
        "city": "Santa Rosa",
        "state": "California"
      }
    }

  • 13. Saving The Beer
    curl -XPOST 'http://localhost:9200/phpbeer/beer/1' -d '{
      "name": "Pliny the Elder",
      "style": "Imperial India Pale Ale",
      "abv": 7.0,
      "brewery": {
        "name": "Russian River Brewing Co.",
        "city": "Santa Rosa",
        "state": "California"
      }
    }'
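The same save can be prepared from Python. This is a sketch under the assumption that the index, type, and ID are the ones from the slide (`phpbeer`, `beer`, `1`); the request is only constructed, not sent, so it runs without a cluster.

```python
import json
import urllib.request

# The beer document from the slides, as a plain Python dict.
beer = {
    "name": "Pliny the Elder",
    "style": "Imperial India Pale Ale",
    "abv": 7.0,
    "brewery": {
        "name": "Russian River Brewing Co.",
        "city": "Santa Rosa",
        "state": "California",
    },
}

# POST the JSON-encoded document to /<index>/<type>/<id>.
save_req = urllib.request.Request(
    "http://localhost:9200/phpbeer/beer/1",
    data=json.dumps(beer).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(save_req.get_method(), save_req.full_url)
print(save_req.data.decode("utf-8"))
```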

  • 14. Getting a beer curl -XGET 'http://localhost:9200/phpbeer/beer/1?pretty'
  • 15. Updating a Beer
    curl -XPOST 'http://localhost:9200/phpbeer/beer/1' -d '{
      "name": "Pliny the Elder",
      "style": "Imperial India Pale Ale",
      "abv": 8.0,
      "brewery": {
        "name": "Russian River Brewing Co.",
        "city": "Santa Rosa",
        "state": "California"
      }
    }'

  • 16. POST vs PUT • POST ‣ No ID - Creates new doc, assigns ID ‣ With ID - Updates or creates new doc • PUT ‣ No ID - Error ‣ With ID - Updates or creates doc
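The POST/PUT rules can be written down as a small decision function. This is only a model of the behavior for illustration, not a call into Elasticsearch; `index_outcome` and its return strings are hypothetical names. Note that in Elasticsearch a PUT with an ID creates the document if it is missing, so both verbs behave as an upsert when an ID is given.

```python
def index_outcome(method, has_id):
    """Describe what an index request does, given the verb and whether an ID is in the URL."""
    method = method.upper()
    if method == "POST":
        if has_id:
            return "create or update document with the given ID"
        return "create document with a generated ID"
    if method == "PUT":
        if has_id:
            return "create or update document with the given ID"
        return "error: PUT requires an ID"
    raise ValueError("unsupported method for indexing: " + method)


for verb in ("POST", "PUT"):
    for has_id in (False, True):
        print(verb, "with ID" if has_id else "no ID", "->", index_outcome(verb, has_id))
```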
  • 17. Delete a Beer curl -XDELETE 'http://localhost:9200/phpbeer/beer/1'
  • 18. Finally! Searching! curl -XGET 'http://localhost:9200/_search?pretty&q=pliny'
  • 19. Specific Field Searching
    curl -XGET 'http://localhost:9200/_search?pretty&q=style:pliny'
    curl -XGET 'http://localhost:9200/_search?pretty&q=style:imperial'
  • 20. Alternate Approach • Search using DSL (Domain Specific Language) • JSON in request body
  • 21. DSL Searching
    curl -XGET 'http://localhost:9200/_search?pretty' -d '{
      "query" : {
        "match" : {
          "style" : "imperial"
        }
      }
    }'
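A DSL body is plain JSON, so it can be assembled as a dictionary and serialized. The sketch below builds the `match` body for the `style` field; `match_query` is a hypothetical helper name, and nothing is sent to a server.

```python
import json


def match_query(field, text):
    """Return a search body that runs a match query for `text` on `field`."""
    return {"query": {"match": {field: text}}}


body = match_query("style", "imperial")

# This string is what would be passed to curl with -d.
print(json.dumps(body, indent=2))
```

Building the body in code rather than by hand avoids quoting mistakes once queries grow beyond a few lines.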
  • 22. DSL = Query + Filter • Query - “How well does the document match” • Filter - Yes or No question on the field
  • 23. Query DSL • match ‣ Full-text query for a string against a field • match_phrase ‣ Used to query an exact phrase • match_all ‣ Matches all documents • multi_match ‣ Runs the same match query on multiple fields
  • 24. Filter DSL • term ‣ Exact match on a field • range ‣ Match numbers over a specified range • exists / missing ‣ Match based on the existence of a value for a field
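To make the yes-or-no nature of filters concrete, here is a local Python emulation of `term`, `range`, `exists`, and `missing` over in-memory dictionaries. It is an illustration of the semantics only, not how Elasticsearch evaluates filters internally. The function names are made up for this sketch, and the two "Hypothetical" beers are invented sample data.

```python
def term(field, value):
    """Exact match on a field."""
    return lambda doc: doc.get(field) == value


def in_range(field, gt=None, gte=None, lt=None, lte=None):
    """Match numeric values inside the given bounds."""
    def check(doc):
        value = doc.get(field)
        if value is None:
            return False
        return all([
            gt is None or value > gt,
            gte is None or value >= gte,
            lt is None or value < lt,
            lte is None or value <= lte,
        ])
    return check


def exists(field):
    """Match documents that have a value for the field."""
    return lambda doc: doc.get(field) is not None


def missing(field):
    """Match documents that have no value for the field."""
    return lambda doc: doc.get(field) is None


beers = [
    {"name": "Pliny the Elder", "style": "Imperial India Pale Ale", "abv": 8.0},
    {"name": "Hypothetical Pale", "style": "Pale Ale", "abv": 5.5},  # invented
    {"name": "Hypothetical Mystery", "style": "Stout"},  # invented, no abv
]

strong = [b["name"] for b in beers if in_range("abv", gt=7)(b)]
print(strong)
```

Each filter returns only True or False for a document; there is no relevance score involved.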
  • 25. More Complex Search • Find beers whose style includes “Pale Ale” and that are less than 7% ABV
  • 26. Match + Range
    curl -XGET 'http://localhost:9200/_search?pretty' -d '{
      "query" : {
        "match" : {
          "style" : "pale ale"
        }
      },
      "filter" : {
        "range" : {
          "abv" : { "lt" : 7 }
        }
      }
    }'
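The request above uses a top-level `filter`, which matches the Elasticsearch release current at the time of the talk. In more recent releases the same intent is commonly written as a `bool` query with a `must` clause and a `filter` clause. The sketch below builds that alternative body; it assumes a cluster version that supports the `bool` filter clause, and `pale_ale_under` is a hypothetical helper name.

```python
import json


def pale_ale_under(abv_limit):
    """Search body: style matches "pale ale" and abv is below `abv_limit`."""
    return {
        "query": {
            "bool": {
                # Scored full-text part of the search.
                "must": {"match": {"style": "pale ale"}},
                # Unscored yes/no restriction.
                "filter": {"range": {"abv": {"lt": abv_limit}}},
            }
        }
    }


print(json.dumps(pale_ale_under(7), indent=2))
```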
  • 27. Embedded Field Search
    curl -XGET 'http://localhost:9200/_search?pretty' -d '{
      "query" : {
        "match" : {
          "brewery.state" : "California"
        }
      }
    }'
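The dotted name `brewery.state` addresses a field inside the nested `brewery` object. As a local illustration of that addressing scheme (not of Elasticsearch internals), this Python sketch resolves a dotted path against the beer document from the slides; `get_path` is a hypothetical helper.

```python
def get_path(doc, dotted):
    """Follow a dotted field name such as "brewery.state" through nested dicts."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # path does not exist in this document
        current = current[key]
    return current


beer = {
    "name": "Pliny the Elder",
    "brewery": {
        "name": "Russian River Brewing Co.",
        "city": "Santa Rosa",
        "state": "California",
    },
}

print(get_path(beer, "brewery.state"))
```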
  • 28. Highlighting Search Results
  • 29. Highlighting Search Results
    curl -XGET 'http://localhost:9200/_search?pretty' -d '{
      "query" : {
        "match" : {
          "style" : "pale ale"
        }
      },
      "highlight": {
        "fields" : {
          "style" : {}
        }
      }
    }'
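With highlighting enabled, each hit carries fragments of the requested field with the matched terms wrapped in tags (`<em>` by default). The following Python sketch imitates that effect locally on a single string. It is a simplified stand-in, not the highlighter Elasticsearch uses, and `highlight` is a hypothetical function.

```python
import re


def highlight(text, query, pre="<em>", post="</em>"):
    """Wrap every word of `text` that appears in `query` (case-insensitive) in tags."""
    terms = {t.lower() for t in query.split()}

    def wrap(match):
        word = match.group(0)
        return pre + word + post if word.lower() in terms else word

    return re.sub(r"\w+", wrap, text)


print(highlight("Imperial India Pale Ale", "pale ale"))
# Imperial India <em>Pale</em> <em>Ale</em>
```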
  • 30. Aggregations • Collect analytics on your documents • 2 main types ‣ Bucketing - Produce a set of buckets with documents in them ‣ Metric - Compute metrics over a set of documents
  • 31. Bucketing Aggregations
  • 32. Metric Aggregations • How many beers exist of each style? • What is the average ABV of beers for each style? • How many beers exist that are brewed in California?
  • 33. What is the average ABV of beers for each style?
    curl -XGET 'http://localhost:9200/_search?pretty' -d '{
      "aggs" : {
        "all_beers" : {
          "terms" : { "field" : "style" },
          "aggs" : {
            "avg_abv" : {
              "avg" : { "field" : "abv" }
            }
          }
        }
      }
    }'
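Conceptually, this request nests a metric aggregation (`avg`) inside a bucketing aggregation (`terms`): documents are grouped by style, then ABV is averaged per group. The Python sketch below performs the same computation on a small in-memory list. The sample beers other than Pliny the Elder are invented, `avg_abv_by_style` is a hypothetical name, and grouping here is by the whole style string, which corresponds to a field that is not split into words at index time.

```python
from collections import defaultdict


def avg_abv_by_style(beers):
    """Bucket beers by style, then compute the average ABV in each bucket."""
    buckets = defaultdict(list)
    for beer in beers:
        buckets[beer["style"]].append(beer["abv"])
    return {style: sum(values) / len(values) for style, values in buckets.items()}


beers = [
    {"name": "Pliny the Elder", "style": "Imperial India Pale Ale", "abv": 8.0},
    {"name": "Hypothetical IPA", "style": "Imperial India Pale Ale", "abv": 7.0},  # invented
    {"name": "Hypothetical Stout", "style": "Stout", "abv": 6.0},  # invented
]

print(avg_abv_by_style(beers))
```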
  • 34. Mappings • Define how ES searches • Completely optional • Must re-index after defining mapping
  • 35. Create Index with Mapping
    curl -XDELETE localhost:9200/phpbeer
    curl -XPOST localhost:9200/phpbeer -d '{
      "mappings" : {
        "beer" : {
          "_source" : { "enabled" : true },
          "properties" : {
            "style" : {
              "type" : "string",
              "index" : "not_analyzed"
            }
          }
        }
      }
    }'
  • 36. What is the average ABV of beers for each style?
    curl -XGET 'http://localhost:9200/_search?pretty' -d '{
      "aggs" : {
        "all_beers" : {
          "terms" : { "field" : "style" },
          "aggs" : {
            "avg_abv" : {
              "avg" : { "field" : "abv" }
            }
          }
        }
      }
    }'
  • 37. Non-Analyzed Fields
    curl -XGET 'http://localhost:9200/_search?pretty&q=style:imperial'
    curl -XGET 'http://localhost:9200/_search?pretty&q=style:hefeweizen'
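The difference between an analyzed and a `not_analyzed` field comes down to which terms end up in the index. An analyzed string is broken into individual terms, while a non-analyzed string is stored as one single term. The Python sketch below approximates this with a simple lowercase-and-split rule. That rule is an assumption standing in for the default analyzer, and both function names are hypothetical.

```python
import re


def indexed_terms(value, analyzed=True):
    """Return the terms that would be searchable for a field value."""
    if analyzed:
        # Rough stand-in for analysis: split into words and lowercase them.
        return [word.lower() for word in re.findall(r"\w+", value)]
    # Not analyzed: the whole value is one term, kept exactly as given.
    return [value]


def term_matches(term, value, analyzed):
    """True if `term` is one of the indexed terms for `value`."""
    return term in indexed_terms(value, analyzed)


style = "Imperial India Pale Ale"
print(indexed_terms(style, analyzed=True))
print(indexed_terms(style, analyzed=False))
print(term_matches("imperial", style, analyzed=True))
print(term_matches("imperial", style, analyzed=False))
```

Under this model a single word no longer finds a multi-word style once the field is not analyzed, while aggregations on that field group by the full style name.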
  • 38. Flexibility • Mixing aggregations, filters and queries all together • Which beers have the word “night” in the name and are between 4% and 6% ABV, broken down by style?
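One possible way to express that combined question as a single request body is sketched below in Python. It is not taken from the talk: it assumes a cluster version that supports the `bool` query's `filter` clause, it treats "between 4 and 6" as inclusive, and the aggregation name `by_style` and the helper names are made up for illustration.

```python
import json


def name_contains(word):
    """Scored full-text clause on the beer name."""
    return {"match": {"name": word}}


def abv_between(low, high):
    """Yes/no clause restricting ABV to an inclusive range."""
    return {"range": {"abv": {"gte": low, "lte": high}}}


def night_beers_by_style():
    """Query + filter + aggregation combined in one search body."""
    return {
        "query": {
            "bool": {
                "must": name_contains("night"),
                "filter": abv_between(4, 6),
            }
        },
        # Bucket the matching beers by style.
        "aggs": {"by_style": {"terms": {"field": "style"}}},
    }


print(json.dumps(night_beers_by_style(), indent=2))
```

Placing the range inside the query means the style buckets are computed only over beers that satisfy both conditions.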
  • 39. Elasticsearch and PHP • Elasticsearch PHP Lib - https://github.com/elasticsearch/elasticsearch-php • Elastica - http://elastica.io/
  • 40. Other Awesome ES Features • Search analyzers • Geo-based searching • Elasticsearch Plugins • kopf - http://localhost:9200/_plugin/kopf
  • 41. Questions? • @jason_austin • http://www.pintlabs.com • https://joind.in/10821