
Using AWS Elasticsearch for fast feedback on business data



My presentation to the Wellington AWS User Group on giving the business situational awareness, anomaly detection, and process monitoring: a how-to guide on applying "traditional" IT tools to the generally more important problems of the business.

Published in: Internet


  1. Using the AWS Elasticsearch Service to provide fast feedback for your business. Wellington AWS User Group, December 7, 2017
  2. Steven Ensslen, +64 27 620 1237
  3. IT knows the value of fast feedback. The Second Way of DevOps is “shorten and amplify feedback loops”.
  4. Feedback tools in software development
     1. IDE feedback
     2. Unit tests
     3. Continuous Integration
     4. SecOps tests in build pipeline
     5. Application Performance Monitoring
  5. IT has sub-second feedback. Most business leaders have the same feedback cycle as they did 50 years ago.
  6. “War is the realm of uncertainty; three quarters of the factors on which action in war is based are wrapped in a fog of greater or lesser uncertainty. A sensitive and discriminating judgment is called for; a skilled intelligence to scent out the truth.” –Carl von Clausewitz, 19th Century Prussian General
  7. The fog of business: information is late, disconnected, and vague.
  8. Totals Hide Information. If your analysis uses fixed-length periods you will miss trends.

     Weekly hours  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday  Sunday  Total
     Constant           8        8          8         8       8         0       0     40
     Spiky              2        5          3        12      11         5       2     40

     (Chart: daily hours for the Constant and Spiky weeks.)
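The point of the table can be checked in a few lines of Python. This is a minimal sketch that uses only the two rows from the slide; the choice of "peak day" and "weekend hours" as the extra measures is an assumption made for illustration, not something the presentation prescribes.

```python
# Daily hours from the slide, Monday through Sunday.
constant = [8, 8, 8, 8, 8, 0, 0]
spiky = [2, 5, 3, 12, 11, 5, 2]


def summarise(hours):
    """Return the weekly total plus two measures that the total hides."""
    return {
        "total": sum(hours),          # what a weekly report shows
        "peak": max(hours),           # busiest single day
        "weekend": sum(hours[5:]),    # Saturday + Sunday
    }


constant_summary = summarise(constant)
spiky_summary = summarise(spiky)

print(constant_summary)  # {'total': 40, 'peak': 8, 'weekend': 0}
print(spiky_summary)     # {'total': 40, 'peak': 12, 'weekend': 7}
```

Both weeks total 40 hours, so a weekly report cannot tell them apart; only the per-day values reveal the 12-hour day and the weekend work.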
  9. Arbitrary Boundaries Hide Trends. (Chart: daily values from 2017-9-1 to 2017-10-25, y-axis 0 to 80.) September Total = 1865, October Total = 1880.
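A rolling window is one way to see what calendar boundaries hide. The sketch below uses a hypothetical daily series (not the data behind the slide's chart): a steady 60 per day with a two-week dip to 20 that straddles the September/October boundary. The series, the 14-day window, and the function names are all assumptions for illustration.

```python
from datetime import date, timedelta

# Hypothetical series: 1 Sep to 31 Oct 2017, 60 per day,
# with a dip to 20 from 24 Sep through 7 Oct.
start = date(2017, 9, 1)
days = [start + timedelta(days=i) for i in range(61)]
dip_start, dip_end = date(2017, 9, 24), date(2017, 10, 7)
daily = {d: (20 if dip_start <= d <= dip_end else 60) for d in days}


def month_total(series, month):
    """Sum the values that fall in one calendar month."""
    return sum(v for d, v in series.items() if d.month == month)


def rolling_totals(series, window):
    """Sum of the trailing `window` days, keyed by the last day of each window."""
    ordered = sorted(series)
    values = [series[d] for d in ordered]
    return {
        ordered[i]: sum(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    }


september = month_total(daily, 9)   # 1520
october = month_total(daily, 10)    # 1580
rolling = rolling_totals(daily, 14)
lowest_day = min(rolling, key=rolling.get)

print(september, october)                # monthly totals differ by about 4%
print(lowest_day, rolling[lowest_day])   # 2017-10-07 280
print(max(rolling.values()))             # 840
```

The monthly totals look almost identical, while the 14-day rolling total falls from 840 to 280 because the dip is split evenly across the month boundary.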
  10. “If it moves, graph it. If it doesn’t move, graph it anyway, just in case it does.” –Etsy
  11. Elasticsearch
     1. Makes fast feedback easy, both for IT and business people
     2. Makes awesome graphs
     3. Is super fast and massively scalable
  12. What is ELK?
     Elasticsearch is a RESTful API and clustering layer over Apache Lucene, a search library; together they act as a document database optimised for search.
     Logstash is a data ingestion tool. It transforms and ships data across networks.
     Beats are lighter, less capable agents that ship data to Elasticsearch.
     Kibana is a powerful ad hoc query tool that quickly creates beautiful graphs.
  13. AWS Elasticsearch Service
  14. Demonstration
  15. Business Intelligence Tips
     1. Work closely with a champion customer
     2. Start small, both in scope and audience
     3. Reuse the language and labels of your audience
     4. Reuse the time periods that are already part of your processes (e.g. financial quarters)
     5. Transform data and index the things that your audience think about, like sessions, products, and orders, especially if your raw data doesn’t quite map to them
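As an illustration of the last tip, raw click events can be rolled up into sessions before indexing, so that the index holds the thing the audience actually talks about. This is a minimal sketch: the `(user, timestamp)` event shape, the 30-minute inactivity cutoff, and the `sessionise` helper are assumptions, not part of the presentation.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed inactivity cutoff


def sessionise(events):
    """Group (user, timestamp) events into per-user sessions.

    A new session starts when a user has been idle longer than SESSION_GAP.
    """
    sessions = []
    open_sessions = {}  # user -> the session currently being extended
    for user, ts in sorted(events, key=lambda e: e[1]):
        current = open_sessions.get(user)
        if current is None or ts - current["end"] > SESSION_GAP:
            current = {"user": user, "start": ts, "end": ts, "events": 0}
            sessions.append(current)
            open_sessions[user] = current
        current["end"] = ts
        current["events"] += 1
    return sessions


# Hypothetical raw events.
raw_events = [
    ("alice", datetime(2017, 12, 7, 9, 0)),
    ("bob", datetime(2017, 12, 7, 9, 5)),
    ("alice", datetime(2017, 12, 7, 9, 10)),
    ("alice", datetime(2017, 12, 7, 10, 0)),  # 50 minutes idle: new session
]

sessions = sessionise(raw_events)
for s in sessions:
    print(s["user"], s["start"].time(), s["end"].time(), s["events"])
```

Each resulting dictionary is one document a business user could query as a "session", instead of having to reconstruct sessions from individual clicks at query time.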
  16. Test Driven Design
     1. Use Kinesis Firehose to save all of your production stream to S3, then apply lifecycle policies
     2. At the very beginning, play a static, fake data set, for example with the replay feature of the Logstash sleep plugin. Do not develop or test with a random generator!
     3. Whenever you encounter undesirable behaviour, add the recording segment to your test suite.
     4. Test Elasticsearch with xUnit in your code pipeline
     5. Monitor Kibana and Elasticsearch with your APM
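The idea of replaying a fixed recording in an xUnit-style test can be sketched with Python's `unittest`. Everything here is hypothetical: the recorded segment, the field names, and the `failed_orders` function stand in for whatever query or transformation is under test, and the sketch runs against an in-memory recording where a real pipeline would target a test Elasticsearch index.

```python
import json
import unittest

# Hypothetical recorded segment, stored as JSON lines. In practice this would
# be a slice of the production stream saved to S3, never random data.
RECORDED_SEGMENT = """\
{"@timestamp": "2017-12-07T09:00:00Z", "order_id": "A1", "status": "paid"}
{"@timestamp": "2017-12-07T09:00:05Z", "order_id": "A2", "status": "failed"}
{"@timestamp": "2017-12-07T09:00:09Z", "order_id": "A3", "status": "paid"}
{"@timestamp": "2017-12-07T09:00:12Z", "order_id": "A4", "status": "failed"}
"""


def failed_orders(json_lines):
    """Return the order ids whose status is 'failed' (the logic under test)."""
    events = [json.loads(line) for line in json_lines.splitlines() if line.strip()]
    return [e["order_id"] for e in events if e["status"] == "failed"]


class RecordedSegmentTest(unittest.TestCase):
    def test_failed_orders_are_detected(self):
        # The recording is static, so the expected answer is known exactly.
        self.assertEqual(failed_orders(RECORDED_SEGMENT), ["A2", "A4"])

    def test_replay_is_deterministic(self):
        # Replaying the same segment twice must give the same result.
        self.assertEqual(
            failed_orders(RECORDED_SEGMENT), failed_orders(RECORDED_SEGMENT)
        )


def run_suite():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RecordedSegmentTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_suite()
```

Because the input never changes, a failure always points at the code or the query, not at the data, which is the reason for avoiding a random generator.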
  17. Elasticsearch tips
     1. Predefine your index mapping
     2. Only use one type per index (ES 6.x removes support for many types per index)
     3. Partition your index by time, typically by day
     4. There are no joins; use Lambda to enrich data before loading it into Elasticsearch
     5. Ideally an ES cluster has 3 small masters and fewer than 10 workers; above 10 nodes, scale up before scaling out
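Tips 1 to 3 can be combined in an index template that is applied to every daily index. The sketch below only builds the index name and the template body as Python data; it makes no network calls. The field names, the `orders` prefix, the `-YYYY.MM.dd` suffix (borrowed from Logstash's default naming), and the single `_doc` type are assumptions, and the body follows the ES 6.x template layout with `index_patterns`.

```python
from datetime import datetime, timezone


def daily_index_name(prefix, when):
    """Name of the per-day index that holds a document timestamped `when`."""
    return f"{prefix}-{when:%Y.%m.%d}"


# Hypothetical explicit mapping with exactly one type per index (ES 6.x style).
ORDER_MAPPING = {
    "mappings": {
        "_doc": {
            "properties": {
                "@timestamp": {"type": "date"},
                "order_id": {"type": "keyword"},
                "product": {"type": "keyword"},
                "amount": {"type": "float"},
            }
        }
    }
}


def index_template(prefix):
    """Template body that applies the mapping to every daily index."""
    body = {"index_patterns": [f"{prefix}-*"]}
    body.update(ORDER_MAPPING)
    return body


talk_date = datetime(2017, 12, 7, tzinfo=timezone.utc)
index_name = daily_index_name("orders", talk_date)
template = index_template("orders")

print(index_name)                  # orders-2017.12.07
print(template["index_patterns"])  # ['orders-*']
```

With daily indices, old data can be dropped by deleting whole indices, and the template ensures each new day's index gets the predefined mapping instead of a guessed one.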
  18. AWS ES tips
     1. The Elasticsearch port is 80, not 9200
     2. Do NOT expose ES or Kibana to the public internet!
     3. Start bigger, then shrink (IMHO, seven M4.large is big)
     4. Do not use ES as a data store; use RDS, or DynamoDB, or Redshift, or S3 with Athena
  19. CloudWatch vs ES
     1. Only fixed thresholds for alerts
     2. Much easier to use
     3. Much less admin
     4. Scales elastically
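To make the "fixed thresholds" limitation concrete, the sketch below contrasts a fixed-limit alert with a check against a rolling baseline. The hourly order counts, the limit of 150, the 5-point window, and the 3-sigma rule are all hypothetical; the slides do not say how such a check would be built in Elasticsearch, so this is plain Python rather than an ES query.

```python
from statistics import mean, pstdev


def fixed_threshold_alerts(values, limit):
    """Indices where the value exceeds a fixed upper limit."""
    return [i for i, v in enumerate(values) if v > limit]


def baseline_alerts(values, window, sigmas):
    """Indices where the value strays more than `sigmas` standard deviations
    from the mean of the preceding `window` values."""
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window : i]
        mu, sd = mean(history), pstdev(history)
        if sd > 0 and abs(values[i] - mu) > sigmas * sd:
            alerts.append(i)
    return alerts


# Hypothetical orders per hour: steady around 100, then a sudden drop to 60.
orders_per_hour = [100, 102, 98, 101, 99, 100, 60, 101, 99]

fixed = fixed_threshold_alerts(orders_per_hour, limit=150)
relative = baseline_alerts(orders_per_hour, window=5, sigmas=3)

print(fixed)     # []
print(relative)  # [6]
```

The fixed "too many orders" limit never fires, while the baseline check flags the drop at index 6, which is the kind of anomaly a business usually cares about.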
  20. Kinesis Analytics vs ES
     1. Simpler for detection
     2. Elastic scaling
     3. No graphs
     4. MillisBehindLatest can be minutes!
  21. Athena & QuickSight vs ES
     1. Massive, admin-free scaling
     2. Need to add Lambda, and even then it runs periodically, not event-driven
     3. Worse latency
     4. Conceivably could be more expensive (1440 scheduled queries * ?)
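The 1440 figure is one scheduled query per minute for a day (24 × 60). The unknown factor is the cost of each query, so the sketch below keeps it as parameters. The per-terabyte price and the bytes scanned per query used in the example are placeholders, not quoted prices; substitute the values from the current price list and your own data volumes.

```python
MINUTES_PER_DAY = 24 * 60  # 1440


def queries_per_day(interval_minutes):
    """Number of scheduled runs per day at a fixed interval."""
    return MINUTES_PER_DAY // interval_minutes


def daily_query_cost(runs, bytes_scanned_per_query, price_per_tb_usd):
    """Daily cost when each query is billed by the data it scans.

    Uses decimal terabytes (10**12 bytes); both the billing model and the
    price are assumptions to be checked against the provider's price list.
    """
    terabytes = runs * bytes_scanned_per_query / 10**12
    return terabytes * price_per_tb_usd


runs = queries_per_day(1)

# Placeholder inputs: 1 GB scanned per query at 5.00 USD per TB.
example_cost = daily_query_cost(runs, 10**9, 5.0)

print(runs)                    # 1440
print(round(example_cost, 2))  # 7.2
```

Under these placeholder inputs the schedule scans 1.44 TB a day, so whether this is cheaper or dearer than a running ES cluster depends almost entirely on how much data each query touches.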
  22. Situational Awareness: 100% uptime on the GPS of this car isn’t going to help anything. (Photo: Micheal Filion)
  23. Clarity Cloudworks: illuminating issues before they become problems