Intro elasticsearch taswarbhatti

A gentle intro to the Elastic Stack: Elasticsearch, Kibana, Logstash, and Beats.

Published in: Software

  1. 1. A Gentle Intro to Elasticsearch
       Taswar Bhatti, System/Solutions Architect (Ottawa), Gemalto
  2. 2. Who am I?
       • System/Solution Architect at Gemalto Ottawa (Microsoft MVP)
       • I am somewhat of a language geek; I speak a few languages
       • Kind of like Neo ("I know kung fu") for languages
       • Merhaba - नमस्ते - 你好 - ہیلو - Comment ça va? - ਸਤ ਸਰੀ ਅਕਾਲ
  3. 3. 9/14/2018 3 Reuters Top 100: Gemalto rated top Global Tech Leaders https://www.thomsonreuters.com/en/products-services/technology/top-100.html
  4. 4. Agenda
       • The problem we had and wanted to solve with the Elastic Stack
       • Intro to the Elastic Stack (ecosystem)
       • Logstash
       • Kibana
       • Beats
       • Elasticsearch flow designs that we have considered
       • Future plans for using Elasticsearch
  5. 5. How do you troubleshoot or find your bugs?
       • Typically, in a distributed environment, one has to go through the logs to find out where the issue is
       • There may be multiple systems to go through to find which machine/server generated the log, or multiple logs to monitor
       • You may even need to monitor firewall logs to find which data center the traffic is routed through
       • Chuck Norris never troubleshoots; the troubles kill themselves when they see him coming
  7. 7. Our Problem
       • We had distributed systems (microservices) that generated many different types of logs in different data centers
       • We also had authentication audit logs that had to be kept secure and stored for 1 year
       • We generate around 2 million audit log records a day, 4 TB with replication
       • We need to generate reports from our data for customers
       • We were still using a monolithic solution in some core parts of the application
       • Growing pains of a successful application
       • We want a centralized, scalable logging system for all our logs
  8. 8. Finding bugs through logs
  9. 9. A little history of Elasticsearch
       • Shay Banon created Compass in 2004
       • Released the first version of Elasticsearch in 2010
       • Elasticsearch the company was formed in 2012
       • Shay's wife is still waiting for her recipe app
  11. 11. Elastic Stack
  12. 12. Elasticsearch
       • Written in Java, backed by Lucene
       • Schema-free, REST- and JSON-based document store
       • Search engine
       • Distributed, horizontally scalable
       • No separate database; storage is Lucene
       • Apache 2.0 License
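To make "REST and JSON based document store" concrete, here is a minimal sketch of what indexing one document could look like as an HTTP request. The host `localhost:9200`, the index name `logs`, the `_doc` type segment, and the document fields are assumptions for illustration, not taken from the talk. The request is only built, never sent.

```python
import json
from urllib.request import Request

# Assumed values for illustration only: a local single-node cluster.
BASE_URL = "http://localhost:9200"


def build_index_request(index: str, doc_id: str, document: dict) -> Request:
    """Build (but do not send) a PUT request that would index one JSON document."""
    body = json.dumps(document).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/{index}/_doc/{doc_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical log document; the field names are made up for this sketch.
request = build_index_request("logs", "1", {"level": "error", "message": "login failed"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it to a running node; that step is left out so the sketch runs without a cluster.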
  13. 13. Companies using the Elastic Stack
  14. 14. Elasticsearch indices
       • Elasticsearch organizes documents in indices
       • Lucene writes and maintains the index files
       • Elasticsearch writes and maintains metadata on top of Lucene
       • Examples: field mappings, index settings, and other cluster metadata
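As a rough illustration of the metadata mentioned above, the sketch below builds the JSON body one might send when creating an index, with index settings and field mappings side by side. The field names and the shard and replica counts are hypothetical. The body uses the typeless mapping layout of Elasticsearch 7 and later; 6.x clusters nest `properties` under a type name instead.

```python
import json


def build_create_index_body(shards: int, replicas: int, properties: dict) -> dict:
    """Combine index settings and field mappings into one create-index body."""
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        # Typeless layout (Elasticsearch 7+); an assumption for this sketch.
        "mappings": {"properties": properties},
    }


# Hypothetical fields for an audit log index.
body = build_create_index_body(
    shards=5,
    replicas=1,
    properties={
        "timestamp": {"type": "date"},
        "level": {"type": "keyword"},
        "message": {"type": "text"},
    },
)
print(json.dumps(body, indent=2))
```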
  15. 15. Database vs Elasticsearch
  16. 16. Elastic Concepts
       • Cluster: a collection of one or more nodes (servers)
       • Node: a single server that is part of your cluster, stores your data, and participates in the cluster's indexing and search capabilities
       • Index: a collection of documents that have somewhat similar characteristics (e.g., Product, Customer)
       • Type: within an index, you can define one or more types; a type is a logical category/partition of your index
       • Document: a basic unit of information that can be indexed
       • Shard/Replica: an index is divided into multiple pieces called shards; replicas are copies of your shards
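The shard and replica idea can be sketched in a few lines. A document is assigned to one primary shard by hashing a routing value (by default the document id) and taking it modulo the number of primary shards. Elasticsearch uses its own hash function for this; `zlib.crc32` below is only a stand-in so the sketch runs with the standard library, and the function names are made up.

```python
import zlib


def pick_shard(routing_key: str, number_of_primary_shards: int) -> int:
    """Map a routing key to a primary shard number (stand-in hash, not the real one)."""
    digest = zlib.crc32(routing_key.encode("utf-8"))
    return digest % number_of_primary_shards


def total_shard_copies(number_of_primary_shards: int, number_of_replicas: int) -> int:
    """Each primary shard exists once, plus one copy per configured replica."""
    return number_of_primary_shards * (1 + number_of_replicas)


for doc_id in ("order-1", "order-2", "order-3"):
    print(doc_id, "-> shard", pick_shard(doc_id, 5))

print("shard copies in the cluster:", total_shard_copies(5, 1))
```

The modulo step is also why the number of primary shards cannot simply be changed on an existing index: a different divisor would send the same id to a different shard.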
  17. 17. Elastic nodes
       • Master node: controls the cluster
       • Data node: holds data and performs data-related operations such as CRUD, search, and aggregations
       • Ingest node: can apply an ingest pipeline to a document in order to transform and enrich the document before indexing
       • Coordinating node: only routes requests, handles the search reduce phase, and distributes bulk indexing
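One way to picture the node types: in the 6.x-era configuration, a node's roles came from three boolean settings (`node.master`, `node.data`, `node.ingest`), and a node with all three switched off acted as a coordinating-only node. The helper below is a hypothetical sketch of that mapping, not part of any Elastic tooling.

```python
from typing import List


def node_roles(master: bool, data: bool, ingest: bool) -> List[str]:
    """Translate the three boolean role settings into a list of role names."""
    roles = []
    if master:
        roles.append("master-eligible")
    if data:
        roles.append("data")
    if ingest:
        roles.append("ingest")
    # With every role disabled, the node only routes and merges requests.
    return roles or ["coordinating-only"]


print(node_roles(master=True, data=False, ingest=False))
print(node_roles(master=False, data=True, ingest=True))
print(node_roles(master=False, data=False, ingest=False))
```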
  19. 19. Elasticsearch cluster
  20. 20. Typical cluster: shards & replicas
  21. 21. Shard search and index
  22. 22. Demo of Elasticsearch
  23. 23. Logstash
       • Ruby application that runs under JRuby on the JVM
       • Collects, parses, and enriches data
       • Horizontally scalable
       • Apache 2.0 License
       • Large number of public plugins written by the community
       • https://github.com/logstash-plugins
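The collect, parse, enrich flow can be sketched as three small stages. This is plain Python, not Logstash's own configuration language, and the log line layout, the `datacenter` field, and all function names are assumptions for illustration.

```python
import json
import re
from typing import Iterable, Iterator, Optional

# Assumed line layout: "<timestamp> <LEVEL> <free-text message>".
LINE_PATTERN = re.compile(r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$")


def parse(line: str) -> Optional[dict]:
    """Filter stage: turn a raw line into named fields, or None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def enrich(event: dict, datacenter: str) -> dict:
    """Filter stage: add a field the raw line did not carry."""
    return {**event, "datacenter": datacenter}


def run_pipeline(lines: Iterable[str], datacenter: str) -> Iterator[str]:
    """Input -> filter -> output: yield one JSON string per parsable line."""
    for line in lines:
        event = parse(line)
        if event is None:
            continue  # Unparsable lines are dropped in this sketch.
        yield json.dumps(enrich(event, datacenter), sort_keys=True)


raw_lines = ["2018-09-14T10:00:00Z ERROR login failed", "garbage"]
for output in run_pipeline(raw_lines, datacenter="ottawa"):
    print(output)
```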
  24. 24. Typical usage of Logstash
  26. 26. Logstash input
  27. 27. Logstash filter
  28. 28. Logstash output
  29. 29. Demo: Logstash
  30. 30. Beats
  31. 31. Beats
       • Lightweight shippers written in Golang (non-JVM shops can use them)
       • They follow the Unix philosophy: do one specific thing, and do it well
       • Filebeat: log files (think of it as tail -f on steroids)
       • Metricbeat: CPU, memory (like top), Redis, MongoDB usage
       • Packetbeat: network packet monitoring (HTTP, etc.), using libpcap like Wireshark
       • Winlogbeat: Windows event logs to Elasticsearch
       • Dockbeat: Docker monitoring
       • Large community; many other Beats are offered as open source
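The "tail -f" comparison for Filebeat boils down to remembering how far a file has been read and picking up only what was appended since. The sketch below shows that idea with the standard library only. It is not how Filebeat is implemented, and the file name and function names are made up.

```python
import os
import tempfile
from typing import List, Tuple


def read_new_lines(path: str, offset: int) -> Tuple[List[str], int]:
    """Return complete lines appended after `offset`, plus the new offset."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    # Stop at the last newline so a half-written line is left for the next pass.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8").splitlines()
    return lines, offset + end


def demo() -> List[List[str]]:
    """Write to a temporary log file twice and read it incrementally."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "app.log")
        with open(path, "w", encoding="utf-8", newline="\n") as log:
            log.write("first\nsecond\n")
        batch_one, offset = read_new_lines(path, 0)
        with open(path, "a", encoding="utf-8", newline="\n") as log:
            log.write("third\nhalf-written")
        batch_two, offset = read_new_lines(path, offset)
    return [batch_one, batch_two]


print(demo())  # [['first', 'second'], ['third']]
```

Persisting the offset between runs is what lets a shipper resume after a restart without re-sending old lines.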
  33. 33. Filebeat
  34. 34. X-Pack
       • Elastic's commercial offering (this is one of the ways they make money)
       • X-Pack is an Elastic Stack extension that bundles:
           • Security (HTTPS to Elasticsearch, password to access Kibana)
           • Alerting
           • Monitoring
           • Reporting
           • Graph capabilities
           • Machine learning
  36. 36. Kibana
       • Visual application for Elasticsearch (JS, Angular, D3)
       • Powerful dashboard frontend for visualizing index information from Elasticsearch
       • Uses historical data to build charts, graphs, etc.
       • Real-time search of index information
  38. 38. Demo: Kibana
  39. 39. Designs we went through
       • We started with a simple design to measure throughput
       • One instance of Logstash and one instance of Elasticsearch, with Filebeat
  40. 40. .NET Core app
       • We used a .NET Core application to generate logs
       • Serilog wrote the logs in JSON format to a file
       • Filebeat was installed on the Linux machine to ship the logs to Logstash
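The test setup above relies on the application writing one JSON object per log line, which a shipper can forward without custom parsing. The talk used Serilog in .NET Core; the sketch below shows the same idea in Python, with field names that are assumptions and not Serilog's actual output format.

```python
import io
import json
from datetime import datetime, timezone


def format_event(timestamp: datetime, level: str, message: str, **properties) -> str:
    """Serialize one log event as a single-line JSON string."""
    event = {"timestamp": timestamp.isoformat(), "level": level, "message": message}
    event.update(properties)  # Extra structured fields ride along with the event.
    return json.dumps(event)


# Write to an in-memory stream here; a real application would append to a log file.
stream = io.StringIO()
when = datetime(2018, 9, 14, 10, 0, 0, tzinfo=timezone.utc)
stream.write(format_event(when, "Information", "User logged in", user_id=42) + "\n")
stream.write(format_event(when, "Error", "Login failed", user_id=7) + "\n")
print(stream.getvalue(), end="")
```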
  41. 41. Elastic performance
       • 250 log items per second for 30 minutes
  42. 42. Overview
  43. 43. Logstash
  44. 44. Elasticsearch run two
       • 1,000 logs per second, run for 30 minutes
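For scale, the two test runs can be put next to the production volume quoted earlier in the deck (around 2 million audit records a day). The arithmetic below is only a back-of-the-envelope check; it assumes the rate was sustained for the full 30 minutes and that production traffic is spread evenly over the day.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_events(rate_per_second: int, minutes: int) -> int:
    """Events produced by a run at a constant rate."""
    return rate_per_second * minutes * 60


def average_rate(events_per_day: int) -> float:
    """Average events per second if the daily volume were spread evenly."""
    return events_per_day / SECONDS_PER_DAY


print(total_events(250, 30))    # 450000
print(total_events(1000, 30))   # 1800000
print(round(average_rate(2_000_000), 1))  # 23.1
```

Under those assumptions, even the first run pushes roughly ten times the average production rate, which leaves headroom for peaks.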
  45. 45. Performance
  46. 46. Other designs
  47. 47. Other designs using Redis
  48. 48. Using Filebeat
  49. 49. Filebeat without relay
  50. 50. Log4j
  51. 51. Log4j direct
  52. 52. What we are going with for now, until...
  53. 53. Considerations of data
       • Indexing by day makes sense in some cases
       • In others you may want to index by size instead (Black Friday has more traffic than other days); Elasticsearch doesn't like it when shards are not balanced
       • Don't index everything; if you are not going to search on specific fields, mark them as text
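The two indexing strategies can be sketched as two small decisions: which index name a day's events go to, and when an index has grown enough to start a new one. The `logs` prefix, the date pattern, and the size threshold are assumptions for illustration; Elasticsearch also offers a rollover API for the size-based case, which this sketch does not call.

```python
from datetime import date


def daily_index_name(prefix: str, day: date) -> str:
    """Index-by-day: one index per calendar day, named after the date."""
    return f"{prefix}-{day:%Y.%m.%d}"


def should_roll_over(current_size_bytes: int, max_size_bytes: int) -> bool:
    """Index-by-size: start a new index once the current one reaches the limit."""
    return current_size_bytes >= max_size_bytes


GIGABYTE = 1024 ** 3

print(daily_index_name("logs", date(2018, 9, 14)))  # logs-2018.09.14
print(should_roll_over(12 * GIGABYTE, 30 * GIGABYTE))  # False
print(should_roll_over(45 * GIGABYTE, 30 * GIGABYTE))  # True
```

With daily names, a busy day simply produces a much larger index than a quiet one; a size threshold keeps indices, and therefore shards, closer to the same size.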
  54. 54. Future Considerations
       • Investigate Elasticsearch machine learning
       • Elasticsearch with Kafka for cross-data-center replication
       • Logstash centralized pipeline for SIEM integrations
  55. 55. Thank you & open to questions
       • Questions?
       • Contact: Taswar.bhatti@gemalto.com
       • LinkedIn (find me and add me)