quick intro to elastic search
Presentation Transcript

  • 1. ElasticSearch
    Introduction and quick startup
    medcl 9-29
  • 2. Introduction
    ElasticSearch, a distributed search solution
    domain driven
    schema free
    everything pluggable
    open source, distributed, RESTful
    Author: Shay Banon (expert in search and analytics)
    Compass
    GigaSpaces
    Current Version 0.11.0
  • 3. Features
    Reliable, Asynchronous Write Behind for long term persistency.
    (Near) Real Time Search.
    Built on top of Lucene.
    Each shard is a fully functional Lucene index.
    All the power of Lucene easily exposed through simple configuration / plugins.
    Per operation consistency
    Single document level operations are atomic, consistent, isolated and durable.
    Open Source under Apache 2 License.
  • 4. Distributed and Highly Available
    Each index is fully sharded with a configurable number of shards.
    Each shard can have zero or more replicas.
    Read / search operations can be performed on any replica of a shard.
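As a rough illustration of how a sharded index can decide where a document lives, the sketch below hashes a document id and takes the result modulo the shard count. The function name `shard_for` and the choice of MD5 are assumptions made for this example; the slides do not say which hash ElasticSearch actually uses.

```python
import hashlib


def shard_for(doc_id: str, number_of_shards: int) -> int:
    """Map a document id to a shard number in [0, number_of_shards)."""
    if number_of_shards < 1:
        raise ValueError("an index needs at least one shard")
    # MD5 is used only because it is stable across runs (unlike Python's hash()).
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % number_of_shards
```

In this sketch the same id always lands on the same shard, and the result depends on the shard count, so changing that count would move documents between shards.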
  • 5. Multi Tenant with Multi Types.
    Support for more than one index.
    Support for more than one type per index.
    Index level configuration (number of shards, index storage, ...).
  • 6. Document oriented
    No need for upfront schema definition.
    Schema can be defined per type for customization of the indexing process.
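To make "no upfront schema" concrete, here is a minimal sketch of deriving a mapping from a document's own values. Only the `"string"` type name appears in the slides (slide 11); the other type names (`long`, `double`, `boolean`) and both helper functions are assumptions for illustration, and only flat documents are handled.

```python
def infer_type(value):
    """Guess a mapping type name from a JSON value (type names are assumptions)."""
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported value in this sketch: {value!r}")


def infer_mapping(doc):
    """Build a mapping body shaped like the one on the 'Schema mapping' slide."""
    return {"properties": {field: {"type": infer_type(v)} for field, v in doc.items()}}
```

For the user document from the indexing slide, `infer_mapping({"name": "Shay Banon"})` produces the same `properties` structure that slide 11 sets by hand.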
  • 7. A varied set of APIs.
    HTTP RESTful API.
    Native Java API.
    Third-party clients:
    Perl, Python, PHP, Ruby, Groovy, Erlang, .NET
    All APIs perform automatic node operation rerouting.
  • 8. Up and running
  • 9. Install
    Zero configuration
  • 10. Index
    $ curl -XPUT http://localhost:9200/twitter/user/kimchy -d '{ "name" : "Shay Banon" }'
    $ curl -XPUT http://localhost:9200/twitter/tweet/1 -d '{
        "user": "kimchy",
        "post_date": "2009-11-15T13:12:00",
        "message": "Trying out Elastic Search, so far so good?"
      }'
    $ curl -XPUT http://localhost:9200/twitter/tweet/2 -d '{
        "user": "kimchy",
        "post_date": "2009-11-15T14:12:12",
        "message": "You know, for Search"
      }'
  • 11. Schema mapping
    $ curl -XPUT http://localhost:9200/twitter
    $ curl -XPUT http://localhost:9200/twitter/user/_mapping -d '{
        "properties" : {
            "name" : { "type" : "string" }
        }
      }'
  • 12. GET
    $ curl -XPUT http://localhost:9200/twitter/tweet/2 -d '{ "user": "kimchy", "post_date": "2009-11-15T14:12:12", "message": "You know, for Search" }'
    $ curl -XGET http://localhost:9200/twitter/tweet/2
  • 13. Search
    $ curl -XPUT http://localhost:9200/twitter/tweet/2 -d '{ "user": "kimchy", "post_date": "2009-11-15T14:12:12", "message": "You know, for Search" }'
    $ curl -XGET 'http://localhost:9200/twitter/tweet/_search?q=user:kimchy'
    $ curl -XGET http://localhost:9200/twitter/tweet/_search -d '{ "query" : { "term" : { "user": "kimchy" } } }'
    $ curl -XGET 'http://localhost:9200/twitter/_search?pretty=true' -d '{
        "query" : {
            "range" : {
                "post_date" : {
                    "from" : "2009-11-15T13:00:00",
                    "to" : "2009-11-15T14:30:00"
                }
            }
        }
      }'
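The range query above can be reasoned about with a small client-side emulation. This is only a sketch of the matching rule, not how ElasticSearch evaluates it: the helper `range_matches` is hypothetical, the bounds are assumed to be inclusive, and the two documents mirror the tweets indexed on slide 10.

```python
from datetime import datetime

# The two tweets from the indexing slide, reduced to the fields used here.
TWEETS = {
    1: {"user": "kimchy", "post_date": "2009-11-15T13:12:00"},
    2: {"user": "kimchy", "post_date": "2009-11-15T14:12:12"},
}


def range_matches(docs, field, from_, to):
    """Return ids of docs whose `field` lies in [from_, to] (inclusive by assumption)."""
    lower = datetime.fromisoformat(from_)
    upper = datetime.fromisoformat(to)
    return sorted(
        doc_id
        for doc_id, doc in docs.items()
        if field in doc and lower <= datetime.fromisoformat(doc[field]) <= upper
    )
```

A document that stores the date under a different field name (for example `postDate`) would not match a range on `post_date`, so the field name must be the same at index time and at query time.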
  • 14. Multi-tenancy
    $ curl -XPUT http://localhost:9200/kimchy
    $ curl -XPUT http://localhost:9200/elasticsearch
    $ curl -XPUT http://localhost:9200/elasticsearch/tweet/1 -d '{ "post_date": "2009-11-15T14:12:12", "message": "Zug Zug", "tag": "warcraft" }'
    $ curl -XPUT http://localhost:9200/kimchy/tweet/1 -d '{ "post_date": "2009-11-15T14:12:12", "message": "Whatyouwant?", "tag": "warcraft" }'
    $ curl -XGET 'http://localhost:9200/kimchy,elasticsearch/tweet/_search?q=tag:warcraft'
    $ curl -XGET 'http://localhost:9200/_all/tweet/_search?q=tag:warcraft'
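The last two searches address several indices at once, either as a comma-separated list or through `_all`. A minimal sketch of how such an index expression could be expanded is shown below; `resolve_indices` is a hypothetical helper, and raising on unknown names is an assumption of this example rather than documented server behaviour.

```python
def resolve_indices(expression, existing):
    """Expand 'a,b' or '_all' into a list of index names taken from `existing`."""
    if expression == "_all":
        return list(existing)
    names = [name for name in expression.split(",") if name]
    missing = [name for name in names if name not in existing]
    if missing:
        # Assumption: unknown index names are treated as an error in this sketch.
        raise KeyError(f"unknown indices: {missing}")
    return names
```

With indices `kimchy`, `elasticsearch` and `twitter` present, `"kimchy,elasticsearch"` resolves to the two named indices and `"_all"` resolves to all three.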
  • 15. Settings
    $ curl -XPUT http://localhost:9200/kimchy/ -d '
    index :
        store:
            type: memory
    '
    $ curl -XPUT http://localhost:9200/elasticsearch/ -d '{
        "index" : {
            "number_of_shards" : 2,
            "number_of_replicas" : 3
        }
      }'
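It can help to work out what the second settings body implies. Since each shard has one primary copy plus its replicas, the arithmetic can be sketched as follows; the function name `shard_copies` is made up for this example.

```python
def shard_copies(number_of_shards, number_of_replicas):
    """Total shard copies in an index: every shard has 1 primary plus its replicas."""
    if number_of_shards < 1 or number_of_replicas < 0:
        raise ValueError("need at least one shard and a non-negative replica count")
    return number_of_shards * (1 + number_of_replicas)
```

For the settings above, `shard_copies(2, 3)` gives 8: two primaries and six replicas.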
  • 16. Behind ElasticSearch
  • 17. Modules
  • 18. Zen Discovery
    Zen is used for both discovery and master election. A master in elasticsearch is responsible for handling nodes coming and going and for allocating shards. The master is not a single point of failure: if it fails, another node is elected as master.
    Note that nodes do not need to communicate with the master on each request, so it is not a bottleneck either.
    Node readiness is handled by the shard allocation algorithm. A shard allocated to a node is considered “ready” to receive requests only once it has fully initialized.
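The failover behaviour described above can be sketched with a deliberately simple rule. The rule used here (the live node with the smallest id wins) is an assumption chosen for illustration; the slides do not describe how Zen actually picks a master, and `elect_master` is a hypothetical helper.

```python
def elect_master(live_nodes):
    """Pick a master among live nodes; 'smallest id wins' is an illustrative rule."""
    if not live_nodes:
        raise ValueError("cannot elect a master in an empty cluster")
    return min(live_nodes)


def fail_node(live_nodes, node):
    """Return the set of live nodes after `node` has left the cluster."""
    return set(live_nodes) - {node}
```

Starting from nodes `node-a`, `node-b` and `node-c`, the sketch elects `node-a`; once `node-a` fails, running the election again over the remaining nodes yields `node-b`, so the cluster is never left without a master while any node is alive.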
  • 19. Scalability
    There are two kinds of nodes: nodes that can hold data, and nodes that do not.
    There is no need for a load balancer in elasticsearch. Each node can receive a request, and if it can’t handle it, it automatically delegates it to the appropriate node(s).
    To scale out search, simply add more replicas per shard.
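The delegation idea can be sketched as a tiny router: a node that holds a copy of the target shard serves the request itself, and otherwise the request is forwarded to one of the nodes that does. The round-robin choice among copies and the names `make_router` and `route` are assumptions for this example, not a description of the real routing code.

```python
from itertools import cycle


def make_router(shard_copies):
    """shard_copies maps shard number -> list of nodes holding a copy of it."""
    rotations = {shard: cycle(nodes) for shard, nodes in shard_copies.items()}

    def route(receiving_node, shard):
        """Return the node that ends up serving a request for `shard`."""
        if receiving_node in shard_copies[shard]:
            return receiving_node  # the receiving node can handle it locally
        return next(rotations[shard])  # otherwise delegate, rotating over copies

    return route
```

Adding a replica in this model just means adding another node to a shard's list, which gives the rotation one more place to send search requests.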
  • 20. Automatic shard allocation
    From: http://www.slideshare.net/elasticsearch/elasticsearch-at-berlinbuzzwords-2010#
  • 21. BASE support
    Each document you index is there once the index operation is done.
    No commit or similar step is needed to get everything persisted.
    A shard can have 1 or more replicas for HA.
    Gateway persistence happens asynchronously in the background.
  • 22. The River
    A river is a pluggable service running within an elasticsearch cluster. It pulls data (or has data pushed to it), and that data is then indexed into the cluster.
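The pull variant of a river can be pictured as a loop that drains a source and hands documents to an indexing function in small batches. Everything in this sketch is hypothetical (`run_river`, the `index_doc` callback, the batch size); it shows the shape of the idea, not the river plugin interface.

```python
def run_river(source, index_doc, batch_size=2):
    """Pull documents from `source` and pass them to `index_doc` in batches.

    Returns the number of documents handed to `index_doc`.
    """
    batch, indexed = [], 0
    for doc in source:
        batch.append(doc)
        if len(batch) == batch_size:
            for item in batch:
                index_doc(item)
            indexed += len(batch)
            batch = []
    # Flush whatever is left when the source runs dry.
    for item in batch:
        index_doc(item)
    return indexed + len(batch)
```

In a test, `index_doc` can simply store each document in a dictionary keyed by id; in a real deployment it would be whatever call indexes a document into the cluster.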
  • 23. Geo Location and Search
    1. Make your data geo-enabled
    {
        "pin" : {
            "location" : {
                "lat" : 40.12,
                "lon" : -71.34
            },
            "tag" : ["food", "family"],
            "text" : "my favorite family restaurant"
        }
    }
    Find by location
    Sorting
    Faceting …
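To show what "find by location" and distance sorting amount to, the sketch below computes great-circle distances with the haversine formula and filters documents shaped like the `pin` example. This is a client-side illustration under stated assumptions: the helper names and the 6371 km Earth radius are choices made here, and nothing is implied about how ElasticSearch computes distances internally.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to(doc, lat, lon):
    """Distance from a query point to a document's pin.location."""
    loc = doc["pin"]["location"]
    return haversine_km(lat, lon, loc["lat"], loc["lon"])


def find_by_location(docs, lat, lon, max_km):
    """Return docs within max_km of (lat, lon), nearest first."""
    hits = [doc for doc in docs if distance_to(doc, lat, lon) <= max_km]
    return sorted(hits, key=lambda doc: distance_to(doc, lat, lon))
```

Filtering and sorting are kept as separate steps here so that each corresponds to one of the items listed on the slide.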
  • 24. More details at http://www.elasticsearch.com/docs/
  • 25. Comparison
  • 26. Compared with Solr
    It supports a dynamic schema, but clumsily, through suffix patterns:
    *_i, name_i, age_i, ...
    Distribution means many replicas in a master-slave setup, plus an ugly query that lists the shards by hand:
    http://localhost:9080/solr/select/?q=xxx:xxx&shards=localhost:8080/solr,localhost:9080/solr WTF!
    Is it really RESTful? Anyway, it doesn’t matter.
  • 27. Compared with Katta
    Features
    Makes serving large or high load indices easy
    Serves very large Lucene or Hadoop MapFile indices as index shards on many servers
    Replicates shards on different servers for performance and fault tolerance
    Supports pluggable network topologies
    Master fail-over
    Fast, lightweight, easy to integrate
    Plays well with Hadoop clusters
    May be too heavy for us (or maybe not)
    The master-node architecture is complex. Will operations kill us? Can’t it be a little easier?
    Lack of clients and documentation
    Inactive community
    Lacks some search features
  • 28. Resources
  • 29. Links:
    http://www.elasticsearch.com
    http://www.elasticsearch.com/blog
    http://www.elasticsearch.com/docs/
    http://www.elasticsearch.com/community/mailinglist/user/
    http://github.com/elasticsearch
    References:
    http://highscalability.com/blog/2010/2/10/elasticsearch-open-source-distributed-restful-search-engine.html
    http://blog.sematext.com/2010/05/03/elastic-search-distributed-lucene/
    http://mail-archives.apache.org/mod_mbox/hbase-user/201006.mbox/%3C149150.78881.qm@web50304.mail.re2.yahoo.com%3E
    http://www.slideshare.net/elasticsearch/elasticsearch-at-berlinbuzzwords-2010#
  • 30. Thanks!