
How ElasticSearch lives in my DevOps life

Published in: Technology

  1. 1. ElasticSearch for DevOps
  2. 2. What’s ElasticSearch? • “flexible and powerful open source, distributed real-time search and analytics engine for the cloud” • http://www.elasticsearch.org/
  3. 3. What’s ElasticSearch? • “flexible and powerful open source, distributed real-time search and analytics engine for the cloud” • JSON-oriented; • RESTful API; • Schema free.
      MySQL             | ElasticSearch
      database          | index
      table             | type
      column            | field
      defined data type | auto-detected
  4. 4. What’s ElasticSearch? • “flexible and powerful open source, distributed real-time search and analytics engine for the cloud” • Master nodes & data nodes; • Auto-organize for replicas and shards; • Asynchronous transport between nodes.
  5. 5. What’s ElasticSearch? • “flexible and powerful open source, distributed real-time search and analytics engine for the cloud” • Refreshes the index every 1 second by default, so new documents become searchable in near real time.
  6. 6. What’s ElasticSearch? • “flexible and powerful open source, distributed real-time search and analytics engine for the cloud” • Built on Apache Lucene. • Also has facets, just like Solr.
  7. 7. What’s ElasticSearch? • “flexible and powerful open source, distributed real-time search and analytics engine for the cloud” • Set a cluster name; nodes discover each other by unicast/multicast ping or EC2 key. • No ZooKeeper needed.
  8. 8. Howto Curl • Index
      $ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
          "user" : "kimchy",
          "post_date" : "2009-11-15T14:12:12",
          "message" : "trying out Elastic Search"
      }'
      {"ok":true,"_index":"twitter","_type":"tweet","_id":"1","_version":1}
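The curl call above maps directly onto any HTTP client. As a minimal sketch (not part of the original deck), the Python snippet below builds the same PUT request with the standard library and prints it without sending anything. The helper name `build_index_request` and the `Content-Type` header are assumptions added for illustration; the URL layout and document come from the slide.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) the PUT request that indexes one document."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        # Assumption: the slide's curl call sends no explicit content type.
        headers={"Content-Type": "application/json"},
    )


tweet = {
    "user": "kimchy",
    "post_date": "2009-11-15T14:12:12",
    "message": "trying out Elastic Search",
}
request = build_index_request("http://localhost:9200", "twitter", "tweet", 1, tweet)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Sending it needs a running node: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the URL scheme (index/type/id) easy to inspect without a cluster.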
  9. 9. Howto Curl • Get
      $ curl -XGET 'http://localhost:9200/twitter/tweet/1'
      {
          "_index" : "twitter",
          "_type" : "tweet",
          "_id" : "1",
          "_source" : {
              "user" : "kimchy",
              "post_date" : "2009-11-15T14:12:12",
              "message" : "trying out Elastic Search"
          }
      }
  10. 10. Howto Curl • Query
      $ curl -XPOST 'http://localhost:9200/twitter/tweet/_search?pretty=1&size=1' -d '{
          "query" : { "term" : { "user" : "kimchy" } },
          "fields" : ["message"]
      }'
  11. 11. Howto Curl • Query • Term => { match the exact term (not analyzed) } • Match => { analyze the text, then match the resulting terms } • Prefix => { match field prefix (not analyzed) } • Range => { from, to } • Regexp => { .* } • Query_string => { this AND that OR thus } • Must/must_not => {query} • Should => [{query},{}] • Bool => {must,must_not,should,…}
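The difference between analyzed and non-analyzed matching is easier to see with a toy model. The sketch below is only an illustration, not Elasticsearch code: `analyze` is a hypothetical stand-in for an analyzer (lowercase, split on non-alphanumerics), and the two helper functions mimic how a term lookup compares the raw value while a match lookup analyzes the input first.

```python
import re


def analyze(text):
    """Toy analyzer (assumption): lowercase and split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def term_matches(indexed_tokens, value):
    """term-style lookup: the value is compared to indexed tokens as-is."""
    return value in indexed_tokens


def match_matches(indexed_tokens, text):
    """match-style lookup: analyze the text, then accept any resulting token."""
    return any(token in indexed_tokens for token in analyze(text))


indexed = analyze("trying out Elastic Search")
print(indexed)
print(term_matches(indexed, "Elastic"))   # raw value, case preserved
print(match_matches(indexed, "Elastic"))  # analyzed before comparison
```

In this model the capitalised input only matches once it has gone through the same analysis as the indexed text, which is why exact-value fields are usually left not analyzed.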
  12. 12. Howto Curl • Filter
      $ curl -XPOST 'http://localhost:9200/twitter/tweet/_search?pretty=1&size=1' -d '{
          "query" : { "match_all" : {} },
          "filter" : { "term" : { "user" : "kimchy" } }
      }'
      Much faster, because filters are cacheable and do not calculate _score.
  13. 13. Howto Curl • Filter • And => [{filter},{filter}] (only two) • Not => {filter} • Or => [{filter},{filter}] (only two) • Script => {"script":"doc['field'].value > 10"} • Others are like the query DSL
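Filters nest the same way queries do, so request bodies are convenient to assemble from small helpers. The following is a rough sketch that only builds the JSON body in the shapes listed on the slide; the helper names and the `retweets` field are hypothetical, and nothing is sent to a server.

```python
import json


def term_filter(field, value):
    return {"term": {field: value}}


def and_filter(left, right):
    # Shape taken from the slide: And => [{filter},{filter}]
    return {"and": [left, right]}


def not_filter(inner):
    # Shape taken from the slide: Not => {filter}
    return {"not": inner}


def script_filter(expression):
    return {"script": {"script": expression}}


body = {
    "query": {"match_all": {}},
    "filter": and_filter(
        term_filter("user", "kimchy"),
        # "retweets" is a made-up numeric field for illustration only.
        not_filter(script_filter("doc['retweets'].value > 10")),
    ),
}
print(json.dumps(body, indent=2))
```

The printed body can be passed to curl with `-d` exactly like the filter example on the previous slide.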
  14. 14. Howto Curl • Facets
      $ curl -XPOST 'http://localhost:9200/twitter/tweet/_search?pretty=1&size=0' -d '{
          "query" : { "match_all" : {} },
          "filter" : { "prefix" : { "user" : "k" } },
          "facets" : {
              "usergroup" : {
                  "terms" : { "field" : "user" }
              }
          }
      }'
  15. 15. Howto Curl • Facets • terms => [{"term":"kimchy","count":20},{}] • Range <= [{"from":10,"to":20},] • Histogram <= {"field":"user","interval":10} • Statistical <= {"field":"reqtime"} => [{"min":,"max":,"avg":,"count":}]
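To make clear what these facet types compute, here is a small local simulation over an in-memory list of documents. It is a sketch under stated assumptions, not the server's implementation: the sample documents are invented, and the result keys simply mirror the ones shown on the slide.

```python
from collections import Counter

# Invented sample documents for illustration.
docs = [
    {"user": "kimchy", "reqtime": 12.0},
    {"user": "kimchy", "reqtime": 30.0},
    {"user": "karmi", "reqtime": 18.0},
]


def terms_facet(documents, field):
    """Count documents per distinct value, most frequent first."""
    counts = Counter(d[field] for d in documents if field in d)
    return [{"term": term, "count": count} for term, count in counts.most_common()]


def histogram_facet(documents, field, interval):
    """Bucket numeric values by flooring them to a multiple of interval."""
    buckets = Counter((d[field] // interval) * interval for d in documents if field in d)
    return [{"key": key, "count": count} for key, count in sorted(buckets.items())]


def statistical_facet(documents, field):
    """Summarise a numeric field with the keys used on the slide."""
    values = [d[field] for d in documents if field in d]
    return {
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
        "count": len(values),
    }


print(terms_facet(docs, "user"))
print(histogram_facet(docs, "reqtime", 10))
print(statistical_facet(docs, "reqtime"))
```

The point of the real facets is that the same aggregation runs on every shard next to the data, so only the small summaries travel over the network.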
  16. 16. Howto Perl – ElasticSearch.pm
      use ElasticSearch;
      my $es = ElasticSearch->new(
          servers      => 'search.foo.com:9200',   # default '127.0.0.1:9200'
          transport    => 'http'                   # default 'http'
                        | 'httplite'               # 30% faster, future default
                        | 'httptiny'               # 1% faster still
                        | 'curl'
                        | 'aehttp'
                        | 'aecurl'
                        | 'thrift',                # generated code too slow
          max_requests => 10_000,                  # default 10000
          trace_calls  => 'log_file',
          no_refresh   => 0 | 1,
      );
  17. 17. Howto Perl – ElasticSearch.pm
      use ElasticSearch;
      my $es = ElasticSearch->new(
          servers      => 'search.foo.com:9200',
          transport    => 'httptiny',
          max_requests => 10_000,
          trace_calls  => 'log_file',
          no_refresh   => 0 | 1,
      );
      • Gets the node list from the given $servers via the /_cluster API;
      • Randomly switches requests to another node after $max_requests.
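The node-switching behaviour described on this slide can be sketched in a few lines. The class below is a hypothetical illustration of the idea only (stay on one node, then hop to a random other node once `max_requests` requests have been issued); it is not how ElasticSearch.pm is implemented, and the node names are made up.

```python
import random


class RotatingClient:
    """Pick a node per request, hopping to a random other node after max_requests."""

    def __init__(self, nodes, max_requests=10_000, rng=None):
        if not nodes:
            raise ValueError("at least one node is required")
        self.nodes = list(nodes)
        self.max_requests = max_requests
        self.rng = rng or random.Random()
        self.current = self.nodes[0]
        self.sent = 0  # requests issued against the current node

    def next_node(self):
        """Return the node that should serve the next request."""
        if self.sent >= self.max_requests and len(self.nodes) > 1:
            others = [node for node in self.nodes if node != self.current]
            self.current = self.rng.choice(others)
            self.sent = 0
        self.sent += 1
        return self.current


# Made-up node names; a tiny max_requests keeps the demo short.
client = RotatingClient(["es1:9200", "es2:9200", "es3:9200"], max_requests=3)
used = [client.next_node() for _ in range(7)]
print(used)
```

Spreading requests this way avoids pinning a long-running client to a single node without needing a separate load balancer.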
  18. 18. Howto Perl – ElasticSearch.pm
      $es->index(
          index => 'twitter',
          type  => 'tweet',
          id    => 1,
          data  => {
              user      => 'kimchy',
              post_date => '2009-11-15T14:12:12',
              message   => 'trying out Elastic Search'
          }
      );
  19. 19. Howto Perl – ElasticSearch.pm
      $es->search(
          facets => {
              wow_facet => {
                  query        => { text => { content => 'wow' } },
                  facet_filter => { term => { status => 'active' } },
              }
          }
      )
  20. 20. Howto Perl – ElasticSearch.pm
      $es->search(
          facets => {
              wow_facet => {
                  queryb        => { content => 'wow' },
                  facet_filterb => { status => 'active' },
              }
          }
      )
      ElasticSearch::SearchBuilder: more Perlish, SQL::Abstract-like. But I don’t like ==!
  21. 21. Howto Perl – Elastic::Model • Tie a Moose object to elasticsearch
      package MyApp;
      use Elastic::Model;
      has_namespace 'myapp' => {
          user => 'MyApp::User'
      };
      no Elastic::Model;
      1;
  22. 22. Howto Perl – Elastic::Model
      package MyApp::User;
      use Elastic::Doc;
      use DateTime;
      has 'name'    => ( is => 'rw', isa => 'Str', );
      has 'email'   => ( is => 'rw', isa => 'Str', );
      has 'created' => ( is => 'ro', isa => 'DateTime', default => sub { DateTime->now } );
      no Elastic::Doc;
      1;
  23. 23. Howto Perl – Elastic::Model
      package MyApp::User;
      use Moose;
      use DateTime;
      has 'name'    => ( is => 'rw', isa => 'Str', );
      has 'email'   => ( is => 'rw', isa => 'Str', );
      has 'created' => ( is => 'ro', isa => 'DateTime', default => sub { DateTime->now } );
      no Moose;
      1;
  24. 24. Howto Perl – Elastic::Model
      • Connect to db
          my $es = ElasticSearch->new( servers => 'localhost:9200' );
          my $model = MyApp->new( es => $es );
      • Create database and table
          $model->namespace('myapp')->index->create();
      • CRUD
          my $domain = $model->domain('myapp');
          $domain->newdoc()|get();
      • Search
          my $search = $domain->view->type('user')->query(…)->filterb(…);
          $results = $search->search;
          say "Total results found: ".$results->total;
          while (my $doc = $results->next_doc) {
              say $doc->name;
          }
  25. 25. ES for Dev -- Github • 20 TB of data; • 1,300,000,000 files; • 130,000,000,000 lines of code. • Uses 26 Elasticsearch storage nodes (each with 2 TB of SSD) managed by Puppet. • 1 replica + 20 shards. • https://github.com/blog/1381-a-whole-new-code-search • https://github.com/blog/1397-recent-code-search-outages
  26. 26. ES for Dev – Git::Search • Thank you, Mateu Hunter! • https://github.com/mateu/Git-Search
      cpanm --installdeps .
      cp git-search.conf git-search-local.conf
      edit git-search-local.conf
      perl -Ilib bin/insert_docs.pl
      plackup -Ilib
      curl http://localhost:5000/text_you_want
  27. 27. ES for Perler -- Metacpan • search.cpan.org => metacpan.org • uses ElasticSearch as the API backend; • uses Catalyst to build the website frontend. • Learn API: https://github.com/CPAN-API/cpan-api/wiki/API-docs • Have a try: http://explorer.metacpan.org/
  28. 28. ES for Perler – index-weekly • A Perl script (55 lines) to index devopsweekly into elasticsearch. • https://github.com/alcy/index-weekly • We can do the same thing for perlweekly, right?
  29. 29. ES for logging - Logstash • “logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use.” • http://logstash.net/
  30. 30. ES for logging - Logstash • “logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use.” • A log is a stream, not a file! • An event is not always a single line!
  31. 31. ES for logging - Logstash • “logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use.” • file/*mq/stdin/tcp/udp/websocket…(34 input plugins now)
  32. 32. ES for logging - Logstash • “logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use.” • date/geoip/grok/multiline/mutate…(29 filter plugins now)
  33. 33. ES for logging - Logstash • “logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use.” • transfer:stdout/*mq/tcp/udp/file/websocket… • alert:ganglia/nagios/opentsdb/graphite/irc/xmpp /email… • store:elasticsearch/mongodb/riak • (47 output plugins now)
  34. 34. ES for logging - Logstash
  35. 35. ES for logging - Logstash
      input {
        redis {
          host => "127.0.0.1"
          type => "redis-input"
          data_type => "list"
          key => "logstash"
        }
      }
      filter {
        grok {
          type => "redis-input"
          pattern => "%{COMBINEDAPACHELOG}"
        }
      }
      output {
        elasticsearch {
          host => "127.0.0.1"
        }
      }
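The input, filter and output sections form a streaming pipeline: each event passes through every stage in turn. The generator-based sketch below illustrates only that shape. It is not logstash code; the in-memory `queue` stands in for the Redis list, `stored` stands in for Elasticsearch, the second log line is invented, and `first_token_as_clientip` is a deliberately trivial placeholder for the grok pattern.

```python
import json


def input_stage(lines, event_type):
    """Wrap each raw line in an event, as an input plugin would."""
    for line in lines:
        yield {"@type": event_type, "@message": line.rstrip("\n"), "@fields": {}}


def filter_stage(events, event_type, parse):
    """Apply parse() only to events of the given type; pass the rest through."""
    for event in events:
        if event["@type"] == event_type:
            event["@fields"].update(parse(event["@message"]))
        yield event


def output_stage(events, sink):
    """Serialize each event to JSON and hand it to the sink (a list here)."""
    for event in events:
        sink.append(json.dumps(event))


def first_token_as_clientip(message):
    # Hypothetical stand-in for %{COMBINEDAPACHELOG}: grabs only the client IP.
    return {"clientip": message.split(" ", 1)[0]}


queue = [
    '10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] "GET /mediawiki/load.php HTTP/1.1" 304 -\n',
    '10.2.21.131 - - [08/Apr/2013:11:13:41 +0800] "GET / HTTP/1.1" 200 512\n',  # invented
]
stored = []
events = input_stage(queue, "redis-input")
events = filter_stage(events, "redis-input", first_token_as_clientip)
output_stage(events, stored)
print(len(stored), "events stored")
```

Because every stage is a generator, events flow through one at a time, which matches the "log is a stream" idea better than reading whole files.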
  36. 36. ES for logging - Logstash • Grok(Regexp capture): %{IP:client:string} %{NUMBER:bytes:int} More default patterns at source: https://github.com/logstash/logstash/tree/master/patterns
  37. 37. ES for logging - Logstash For example: 10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] "GET /mediawiki/load.php HTTP/1.1" 304 - "http://som.d.xiaonei.com/mediawiki/index.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10"
  38. 38. ES for logging - Logstash
      {
        "@source":"file://chenryn-Lenovo/home/chenryn/test.txt",
        "@tags":[],
        "@fields":{
          "clientip":["10.2.21.130"],
          "ident":["-"],
          "auth":["-"],
          "timestamp":["08/Apr/2013:11:13:40 +0800"],
          "verb":["GET"],
          "request":["/mediawiki/load.php"],
          "httpversion":["1.1"],
          "response":["304"],
          "referrer":["\"http://som.d.xiaonei.com/mediawiki/index.php\""],
          "agent":["\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10\""]
        },
        "@timestamp":"2013-04-08T03:34:37.959Z",
        "@source_host":"chenryn-Lenovo",
        "@source_path":"/home/chenryn/test.txt",
        "@message":"10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] \"GET /mediawiki/load.php HTTP/1.1\" 304 - \"http://som.d.xiaonei.com/mediawiki/index.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10\"",
        "@type":"apache"
      }
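Grok is essentially named regular expressions: each `%{NAME:field:type}` reference is replaced by a regex from a pattern library and wrapped in a named capture group. The sketch below shows that expansion on the start of the sample access-log line. The four patterns in `PATTERNS` are simplified assumptions rather than logstash's shipped definitions, and the real filter emits values differently (for example as string arrays, as in the event above).

```python
import re

# Simplified, assumed pattern library; the real one is far larger.
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "NUMBER": r"-?\d+(?:\.\d+)?",
    "WORD": r"\w+",
    "URIPATH": r"/\S*",
}
CASTS = {"int": int, "float": float, "string": str}
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?(?::(\w+))?\}")


def compile_grok(expression):
    """Expand %{NAME:field:type} references into named regex groups."""
    casts = {}

    def expand(ref):
        name, field, cast = ref.group(1), ref.group(2), ref.group(3)
        regex = PATTERNS[name]
        if field is None:
            return f"(?:{regex})"
        casts[field] = CASTS[cast or "string"]
        return f"(?P<{field}>{regex})"

    return re.compile(GROK_REF.sub(expand, expression)), casts


def grok(expression, line):
    """Return the captured fields with casts applied, or None if no match."""
    regex, casts = compile_grok(expression)
    found = regex.search(line)
    if found is None:
        return None
    return {field: casts[field](value) for field, value in found.groupdict().items()}


line = '10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] "GET /mediawiki/load.php HTTP/1.1" 304 -'
expression = (
    r'%{IP:clientip} \S+ \S+ \[[^\]]+\] '
    r'"%{WORD:verb} %{URIPATH:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int}'
)
print(grok(expression, line))
```

Naming the sub-patterns is what keeps long log formats maintainable: the access-log expression stays readable even though the expanded regex is not.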
  39. 39. ES for logging - Logstash
      "properties" : {
        "@fields" : {
          "dynamic" : "true",
          "properties" : {
            "client" : { "type" : "string", "index" : "not_analyzed" },
            "size" : { "type" : "long", "index" : "not_analyzed" },
            "status" : { "type" : "string", "index" : "not_analyzed" },
            "upstreamtime" : { "type" : "double" }
          }
        },
  40. 40. ES for logging - Kibana
  41. 41. ES for logging – Message::Passing • Logstash port to Perl5 • 17 CPAN modules
  42. 42. ES for logging – Message::Passing
      use Message::Passing::DSL;
      run_message_server message_chain {
          output elasticsearch => (
              class => 'ElasticSearch',
              elasticsearch_servers => ['127.0.0.1:9200'],
          );
          filter regexp => (
              class => 'Regexp',
              format => ':nginxaccesslog',
              capture => [qw( ts status remotehost url oh responsetime upstreamtime bytes )],
              output_to => 'elasticsearch',
          );
          filter tologstash => (
              class => 'ToLogstash',
              output_to => 'regexp',
          );
          input file => (
              class => 'FileTail',
              output_to => 'tologstash',
          );
      };
  43. 43. Message::Passing vs Logstash
      100_000 lines of nginx access log:
      logstash::output::elasticsearch_http (default)                                            4m30.013s
      logstash::output::elasticsearch_http (flush_size => 1000)                                 3m41.657s
      message::passing::filter::regexp (v0.01, calls $self->_regex->regexp() on every line)     1m22.519s
      message::passing::filter::regexp (v0.04, stores $self->_regex->regexp() in $self->_re)    0m44.606s
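The timings are easier to compare as throughput and as speedup over the logstash default. The short calculation below uses only the four timings and the 100,000-line count from the slide; the shortened labels and the `to_seconds` helper are added for illustration.

```python
import re

LINES = 100_000
TIMINGS = {
    "logstash elasticsearch_http (default)": "4m30.013s",
    "logstash elasticsearch_http (flush_size => 1000)": "3m41.657s",
    "Message::Passing regexp filter v0.01": "1m22.519s",
    "Message::Passing regexp filter v0.04": "0m44.606s",
}


def to_seconds(stamp):
    """Convert a `time`-style stamp such as '4m30.013s' to seconds."""
    parts = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if parts is None:
        raise ValueError(f"unrecognised timing: {stamp!r}")
    return int(parts.group(1)) * 60 + float(parts.group(2))


baseline = to_seconds(TIMINGS["logstash elasticsearch_http (default)"])
for name, stamp in TIMINGS.items():
    seconds = to_seconds(stamp)
    print(f"{name}: {LINES / seconds:.0f} lines/s, {baseline / seconds:.2f}x vs default")
```

On these numbers, caching the compiled regexp (v0.01 to v0.04) roughly halves the run time on its own.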
  44. 44. D::P::Elasticsearch & D::P::Ajax
  45. 45. Build Website using PerlDancer
      get '/' => require_role SOM => sub {
          my $indices = elsearch->cluster_state->{routing_table}->{indices};
          template 'psa/map', {
              providers   => [ sort keys %$default_provider ],
              datasources => [ grep { /^$index_prefix/ && s/$index_prefix// } keys %$indices ],
              inputfrom   => strftime( "%FT%T", localtime( time() - 864000 ) ),   # 864000 s = 10 days ago
              inputto     => strftime( "%FT%T", localtime() ),
          };
      };
      ajax '/api/area' => sub {
          my $param = from_json( request->body );
          my $index = $index_prefix . $param->{'datasource'};
          my $limit = $param->{'limit'} || 50;
          my $from  = $param->{'from'}  || 'now-10d';
          my $to    = $param->{'to'}    || 'now';
          my $res   = pct_terms( $index, $limit, $from, $to );
          return to_json($res);
      };
  46. 46. use Dancer ':syntax'; (followed by the same route code as slide 45)
  47. 47. use Dancer::Plugin::Auth::Extensible; (followed by the same route code as slide 45)
  48. 48. use Dancer::Plugin::Ajax; (followed by the same route code as slide 45)
  49. 49. use Dancer::Plugin::ElasticSearch; (followed by the same route code as slide 45)
  50. 50. use Dancer::Plugin::ElasticSearch;
      sub area_terms {
          my ( $index, $level, $limit, $from, $to ) = @_;
          my $data = elsearch->search(
              index  => $index,
              type   => $type,
              facets => {
                  area => {
                      facet_filter => {
                          and => [
                              { range => { date => { from => $from, to => $to } } },
                              { numeric_range => { timeCost => { gte => $level } } },
                          ],
                      },
                      terms => {
                          field => "fromArea",
                          size  => $limit,
                      }
                  }
              }
          );
          return $data->{facets}->{area}->{terms};
      }
  51. 51. ES for monitor – oculus(Etsy Kale) • Kale detects anomalous metrics and shows whether any other metrics look similar. • http://codeascraft.com/2013/06/11/introducing-kale/
  52. 52. ES for monitor – oculus(Etsy Kale) • Kale detects anomalous metrics and shows whether any other metrics look similar. • https://github.com/etsy/skyline
  53. 53. ES for monitor – oculus(Etsy Kale) • Kale detects anomalous metrics and shows whether any other metrics look similar. • https://github.com/etsy/oculus
  54. 54. ES for monitor – oculus(Etsy Kale) • Imports monitoring data from redis/ganglia into elasticsearch • Uses native scripts to calculate distance:
      script.native:
        oculus_euclidian.type: com.etsy.oculus.tsscorers.EuclidianScriptFactory
        oculus_dtw.type: com.etsy.oculus.tsscorers.DTWScriptFactory
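The two native scripts score how far apart two time series are, using Euclidean distance and dynamic time warping (DTW). The sketch below shows textbook versions of both measures in plain Python so the difference is visible; it is not the Oculus Java code, and the two sample series are invented.

```python
import math


def euclidean(a, b):
    """Point-by-point distance; both series must have the same length."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with |x - y| as local cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


# Invented series: the same bump, shifted by one step.
base = [0, 1, 2, 1, 0]
shifted = [0, 0, 1, 2, 1]
print(euclidean(base, shifted))
print(dtw(base, shifted))
```

DTW tolerates the time shift that Euclidean distance penalises, which is why both scorers are useful when looking for metrics with a similar shape.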
  55. 55. ES for monitor – oculus(Etsy Kale) • https://speakerdeck.com/astanway/bring-the-noise-continuously-deploying-under-a-hailstorm-of-metrics
  56. 56. VBox example • apt-get install -y git cpanminus virtualbox • cpanm Rex • git clone https://github.com/chenryn/esdevops • cd esdevops • rex init --name esdevops
