Craig Brown speaks on ElasticSearch

Published in: Technology, News & Politics

Transcript

  • 1. 3 DBAs walk into a NoSQL bar. A little while later they walk out... because they couldn't find a table.
  • 2. 3 NoSQL guys walk into a SQL bar. A little while later they leave because they couldn't find a relationship.
  • 3. Objective – Understanding of ElasticSearch as a search engine and NoSQL datastore.
  • 4. About Me: Search Architect; Big Data/Hadoop Engineer; NoSQL Advocate; blog.nosqltips.com; @nosqltips on Twitter
  • 5. ElasticSearch: "You know, for search"; Shay Banon (Compass); Elasticsearch.org; Built on Lucene core (4.3 as of 0.90.3); JSON over HTTP (REST)
  • 6. ElasticSearch: Very easy scalability – multicast, unicast, AWS; Open source – Apache 2 license (https://github.com/elasticsearch); Transports – HTTP, memcached, Thrift; Scripting – MVEL, JavaScript, Java, Python, Groovy (custom scoring, document updates)
  • 7. Schema Free & Document Oriented
      $ curl -XPUT http://localhost:9200/twitter/user/kimchy -d '{ "name" : "Shay Banon" }'
      $ curl -XPUT http://localhost:9200/twitter/tweet/1 -d '{ "user": "kimchy", "post_date": "2009-11-15T13:12:00", "message": "Trying out elasticsearch, so far so good?" }'
      $ curl -XPUT http://localhost:9200/twitter/tweet/2 -d '{ "user": "kimchy", "post_date": "2009-11-15T14:12:12", "message": "You know, for Search" }'
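
The three PUT calls on slide 7 share one URL pattern, /index/type/id, with the document sent as a JSON body. A minimal Python sketch of that pattern follows. It only builds the requests and does not send them; `index_request` is a hypothetical helper name, not part of any ElasticSearch client, and the host is the slide's localhost:9200.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # host and port taken from the slide


def index_request(index, doc_type, doc_id, document):
    """Build (but do not send) a PUT request that would index one document.

    Hypothetical helper for illustration only.
    """
    url = f"{BASE_URL}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(url, data=body, method="PUT")


# Same documents as on the slide: no schema is declared beforehand.
user = index_request("twitter", "user", "kimchy", {"name": "Shay Banon"})
tweet = index_request("twitter", "tweet", 1, {
    "user": "kimchy",
    "post_date": "2009-11-15T13:12:00",
    "message": "Trying out elasticsearch, so far so good?",
})

print(user.get_method(), user.full_url)
print(tweet.get_method(), tweet.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would send it to a running node.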
  • 8. Search
      $ curl -XGET 'http://localhost:9200/twitter/tweet/_search?q=user:kimchy'
      $ curl -XGET http://localhost:9200/twitter/tweet/_search -d '{ "query" : { "term" : { "user": "kimchy" } } }'
      $ curl -XGET 'http://localhost:9200/twitter/_search?pretty=true' -d '{ "query" : { "range" : { "post_date" : { "from" : "2009-11-15T13:00:00", "to" : "2009-11-15T14:30:00" } } } }'
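
Slide 8 shows two ways to search: a query string in the URL (q=user:kimchy) and a JSON query body (term and range). As a rough sketch, the same three requests can be assembled in Python. The function names are hypothetical, and the bodies simply mirror the JSON on the slide.

```python
import json
from urllib.parse import urlencode


def uri_search_url(index, doc_type, query_string, base="http://localhost:9200"):
    """URL for a query-string search such as q=user:kimchy (hypothetical helper)."""
    return f"{base}/{index}/{doc_type}/_search?" + urlencode({"q": query_string})


def term_query(field, value):
    """Request body matching documents whose field contains the given term."""
    return {"query": {"term": {field: value}}}


def range_query(field, start, end):
    """Request body matching documents whose field lies between start and end."""
    return {"query": {"range": {field: {"from": start, "to": end}}}}


print(uri_search_url("twitter", "tweet", "user:kimchy"))
print(json.dumps(term_query("user", "kimchy")))
print(json.dumps(range_query("post_date", "2009-11-15T13:00:00", "2009-11-15T14:30:00")))
```

`urlencode` percent-encodes the colon in the query string, which also avoids the shell quoting issues a raw `?q=` URL can run into.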
  • 9. GETting Some Data
      $ curl -XPUT http://localhost:9200/twitter/tweet/2 -d '{ "user": "kimchy", "post_date": "2009-11-15T14:12:12", "message": "You know, for Search" }'
      $ curl -XGET http://localhost:9200/twitter/tweet/2
  • 10. Schema Mapping
      $ curl -XPUT http://localhost:9200/twitter
      $ curl -XPUT http://localhost:9200/twitter/user/_mapping -d '{ "properties" : { "name" : { "type" : "string" } } }'
  • 11. Multi Tenancy
      $ curl -XPUT http://localhost:9200/kimchy
      $ curl -XPUT http://localhost:9200/elasticsearch
      $ curl -XPUT http://localhost:9200/elasticsearch/tweet/1 -d '{ "post_date": "2009-11-15T14:12:12", "message": "Zug Zug", "tag": "warcraft" }'
      $ curl -XPUT http://localhost:9200/kimchy/tweet/1 -d '{ "post_date": "2009-11-15T14:12:12", "message": "Whatyouwant?", "tag": "warcraft" }'
      $ curl -XGET 'http://localhost:9200/kimchy,elasticsearch/tweet/_search?q=tag:warcraft'
      $ curl -XGET 'http://localhost:9200/_all/tweet/_search?q=tag:warcraft'
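
The two searches at the end of slide 11 differ only in the index part of the path: a comma-separated list of index names, or _all for every index. A small sketch of that path rule, with `search_path` as a hypothetical helper name:

```python
def search_path(indices=None, doc_type=None):
    """Build the path for a search across one, several, or all indices.

    Hypothetical helper: an empty or missing index list falls back to _all,
    mirroring the two searches on the slide.
    """
    index_part = ",".join(indices) if indices else "_all"
    parts = [index_part]
    if doc_type:
        parts.append(doc_type)
    parts.append("_search")
    return "/" + "/".join(parts)


print(search_path(["kimchy", "elasticsearch"], "tweet"))
print(search_path(doc_type="tweet"))
```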
  • 12. Settings
      $ curl -XPUT http://localhost:9200/elasticsearch/ -d '{ "settings" : { "number_of_shards" : 2, "number_of_replicas" : 3 } }'
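
To make the settings on slide 12 concrete: number_of_replicas counts replicas per primary shard, so 2 shards with 3 replicas means 8 shard copies in total. A short sketch of that arithmetic follows; the function name is made up for illustration. The node count rests on the fact that copies of the same shard are not placed on the same node.

```python
def shard_layout(number_of_shards, number_of_replicas):
    """Count the shard copies implied by an index's settings (illustrative helper)."""
    primaries = number_of_shards
    replicas = number_of_shards * number_of_replicas  # replicas are per primary
    return {
        "primaries": primaries,
        "replicas": replicas,
        "total": primaries + replicas,
        # A primary and its replicas each need a different node.
        "nodes_for_full_allocation": number_of_replicas + 1,
    }


layout = shard_layout(number_of_shards=2, number_of_replicas=3)
print(layout)
```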
  • 13. Distributed: Shards – write scale; Replicas – read scale, durability; Segments; Routing – index and search; Discovery – multicast, unicast, AWS; http://www.youtube.com/watch?v=l4ReamjCxHo
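
Routing on slide 13 decides which shard a document is indexed into and which shards a search has to visit. The general scheme is a hash of the routing value (the document id unless a custom value is given) taken modulo the number of primary shards. The sketch below illustrates the idea only: `zlib.crc32` is a stand-in assumption, not the hash function ElasticSearch itself uses, so the shard numbers will not match a real cluster.

```python
import zlib


def pick_shard(routing_value, number_of_shards):
    """Map a routing value to a shard number.

    Assumption: crc32 stands in for ElasticSearch's own hash function.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_shards


# Routing tweets by user keeps all of one user's tweets on a single shard,
# so a search limited to that user only needs to visit that shard.
tweets = [("1", "kimchy"), ("2", "kimchy"), ("3", "nosqltips")]
placement = {tweet_id: pick_shard(user, 2) for tweet_id, user in tweets}
print(placement)
```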
  • 14. Consistency: Reads always consistent with real-time (RT) GET (view always consistent after write); Document searchable after a short delay (1s); Write consistency tunable – one, quorum, all
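
The write levels on slide 14 (one, quorum, all) set how many copies of a shard must be available before a write goes ahead. A hedged sketch of that rule using a plain majority for quorum: the slide does not give the exact formula, and a given ElasticSearch version may treat small replica counts differently, so read this as the general idea only.

```python
def required_copies(consistency, number_of_replicas):
    """Shard copies that must be available for a write, by consistency level.

    Assumption: quorum is a simple majority of (primary + replicas).
    """
    copies = 1 + number_of_replicas  # the primary plus its replicas
    if consistency == "one":
        return 1
    if consistency == "quorum":
        return copies // 2 + 1
    if consistency == "all":
        return copies
    raise ValueError(f"unknown consistency level: {consistency!r}")


# With the 3 replicas from the settings slide there are 4 copies per shard.
for level in ("one", "quorum", "all"):
    print(level, required_copies(level, number_of_replicas=3))
```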
  • 15. Gateway: Local/NFS; Amazon S3; Hadoop
  • 16. River: Twitter; CouchDB; MongoDB; RabbitMQ; Wikipedia; Logstash
  • 17. Solr over ElasticSearch: Release synchronized with Lucene; Larger community; Larger tool set; Feature set a bit better; XML configuration
  • 18. ElasticSearch over Solr: Natively distributed; JSON based; Dynamic, template, and defined schema; Returns source document by default; Avoids overhead of index commit after write; Mock Solr interface; Rivers
  • 19. Who uses ElasticSearch: StumbleUpon; Mozilla Foundation; Sony Computer Entertainment; Infochimps; Foursquare; GitHub; Ataxo Social Insider; Sonian Inc.
  • 20. Demo
  • 21. Hadoop Integration: Hadoop as gateway storage; LoadFunc/StoreFunc – Pig/MapReduce; Hadoop streaming interface; Manual export and import of data
  • 22. ES in Big Data: Endpoint for processed data; Aggregator for BI or dashboard (facets); Used to query reduced data sets for machine learning algorithms; Data storage engine in its own right plus full search capabilities
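
The "aggregator for BI or dashboard (facets)" point on slide 22 refers to asking ElasticSearch for counts per field value alongside, or instead of, the matching documents. A minimal sketch of a terms facet request body in the facets syntax of the 0.90-era releases these slides target; the helper name is hypothetical and the `tag` field is borrowed from the multi-tenancy slide.

```python
import json


def terms_facet_request(facet_name, field, size=10):
    """Search body asking for the most frequent values of a field.

    Hypothetical helper; "size": 0 at the top level asks for counts only,
    without returning the matching documents themselves.
    """
    return {
        "query": {"match_all": {}},
        "size": 0,
        "facets": {
            facet_name: {"terms": {"field": field, "size": size}},
        },
    }


body = terms_facet_request("tags", "tag")
print(json.dumps(body, indent=2))
```

A dashboard would POST this body to an index's _search endpoint and read the per-value counts from the facet section of the response.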
  • 23. ElasticStore: Make ES look and function more like a document store while exposing advanced ES features; Influenced by the Mongo API; Expose a simpler, more programmer-centric API; Expose a QueryBuilder-style API (HQL); Expose annotations for easier schema definition, properties, analyzers, etc.; Allow both strong and weak object mapping; https://github.com/nosqltips/elasticstore
  • 24. Resources: www.elasticsearch.org; www.elasticsearch.com; https://github.com/elasticsearch; http://lucene.apache.org/core