
AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Running Elasticsearch often requires specialized expertise and significant resources to operate and manage infrastructure and Elasticsearch software.

Amazon Elasticsearch Service makes it easy to deploy, operate, and scale Elasticsearch in AWS.

In this webinar, we will walk through how to launch a fully functional Amazon Elasticsearch domain, load your data, and analyze it using the built-in Kibana integration. We will also cover the CloudWatch Logs integration, which enables you to have your log data, such as VPC logs, automatically loaded into your Amazon Elasticsearch domain for analysis and exploration.

Published in: Technology

  1. 1. Introduction to Amazon Elasticsearch Service. Pravin Pillai, Sr. Product Manager; Jon Handler, Principal Solutions Architect. October 2015. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. 2. Amazon Elasticsearch Service
  3. 3. What to Expect from the Session • Context: Managing your growing data • Introducing Amazon Elasticsearch Service (Amazon ES) • Configuring, securing, connecting, monitoring, and scaling your Amazon ES cluster
  4. 4. Your data is constantly growing Product usage
  5. 5. Your data is constantly growing System logs
  6. 6. Your data is constantly growing Customer conversations
  7. 7. That’s a lot of data!
  8. 8. “Big data is not about the data” - Gary King, Harvard University, making the point that while data is plentiful and easy to collect, the real value is in the analytics.
  9. 9. So what can you do with all this data? • Share information • Extract insight • Recognize patterns • Track performance Ultimately, make better business, technical, and operational decisions
  10. 10. Scenario 1: Full-text search Knowledge Sharing Systems •Your team is constantly generating content •You are tasked with making this knowledge base searchable and accessible •You need key search features including text matching, faceting, filtering, fuzzy search, auto complete, and highlighting
  11. 11. Scenario 2: Streaming data analytics Intrusion detection •You have to protect your system from attacks •You need easy to use, yet powerful analytics and data visualization tools to detect issues in near real-time •Easy and flexible data ingestion is important to capture information from a variety of key data sources
  12. 12. Scenario 3: Batch data analytics Usage Monitoring •You are a mobile app developer •You have to monitor/manage users across multiple app versions •You want to analyze and report on usage and migration between app versions
  13. 13. What options do you have?
  14. 14. How Elasticsearch can help A powerful, real-time, distributed, open-source search and analytics engine: •Built on top of Apache Lucene •Schema free •Developer friendly RESTful API
  15. 15. How Elasticsearch can help Combined with Logstash and Kibana, the ELK stack provides a tool for real-time analytics and data visualization
  16. 16. Operating Elasticsearch is time-consuming. “Elasticsearch allows us to easily and quickly build bleeding edge big data and analytics applications using the ELK stack. By offering direct access to the Elasticsearch API while offloading administrative tasks, Amazon Elasticsearch Service gives us the manageability, flexibility and control we need.” Sean Curtis, SVP Engineering at Major League Baseball Advanced Engineering
  17. 17. Introducing Amazon Elasticsearch Service Amazon Elasticsearch Service is a managed service from AWS that makes it easy to set up, operate, and scale Elasticsearch clusters in the cloud.
  18. 18. Key benefits Easy cluster creation and configuration management Support for ELK Security with AWS IAM Monitoring with Amazon CloudWatch Auditing with AWS CloudTrail Integration options with other AWS services (CloudWatch Logs, Amazon DynamoDB, Amazon S3, Amazon Kinesis)
  19. 19. Create the cluster
  20. 20. AWS CLI commands
     Subcommands of aws es: add-tags, create-elasticsearch-domain, delete-elasticsearch-domain, describe-elasticsearch-domain, describe-elasticsearch-domain-config, describe-elasticsearch-domains, list-domain-names, list-tags, remove-tags, update-elasticsearch-domain-config
     Example:
     aws es create-elasticsearch-domain --domain-name my-domain \
       --elasticsearch-cluster-config InstanceType=m3.xlarge.elasticsearch,InstanceCount=3 \
       --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=512
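The same domain request can be sketched in Python. This is a minimal sketch, assuming the third-party boto3 SDK and configured AWS credentials; the helper names `build_domain_request` and `create_domain` are hypothetical, and the parameter values are the ones from the CLI example on the slide.

```python
def build_domain_request(domain_name, instance_type, instance_count, volume_gb):
    """Return keyword arguments that mirror the CLI flags on the slide."""
    return {
        "DomainName": domain_name,
        "ElasticsearchClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
        "EBSOptions": {
            "EBSEnabled": True,
            "VolumeType": "gp2",
            "VolumeSize": volume_gb,  # size in GB, as in the CLI example
        },
    }


def create_domain(params, region="us-east-1"):
    """Send the request. Assumes boto3 is installed and credentials exist."""
    import boto3  # third-party; imported lazily so the builder works without it

    client = boto3.client("es", region_name=region)
    return client.create_elasticsearch_domain(**params)


params = build_domain_request("my-domain", "m3.xlarge.elasticsearch", 3, 512)
print(params)
```

Keeping the parameter builder separate from the network call makes the request easy to inspect before anything is created.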
  21. 21. Amazon ES domain overview Amazon Route 53 Elastic Load Balancing IAM CloudWatch Elasticsearch API CloudTrail
  22. 22. Amazon Route 53 Elastic Load Balancing IAM CloudWatch Elasticsearch API CloudTrail Amazon ES domain overview Nodes under management
  23. 23. IAM CloudWatch CloudTrail Elasticsearch API Amazon Route 53 Elastic Load Balancing Amazon ES domain overview Single endpoint, REST API
  24. 24. CloudWatch CloudTrail Elasticsearch API Amazon Route 53 Elastic Load Balancing IAM Amazon ES domain overview IAM integration
  25. 25. Elasticsearch API Amazon Route 53 Elastic Load Balancing IAM CloudWatch CloudTrail Amazon ES domain overview CloudWatch/CloudTrail for monitoring
  26. 26. Scale for your workload
  27. 27. Online scaling operations [diagram: Update]
  28. 28. Data partitioning for search [diagram: documents, each with an ID, are split across Shard 1 and Shard 2, which together make up an index] • Document: The unit of search • ID: Unique identifier, one per document • Field: Documents comprise a collection of fields • Shard: An instance of Lucene with a portion of an index • Index: A collection of data
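The relationship between document IDs and shards can be illustrated with a small sketch. It assumes a simple "hash of the ID modulo the number of shards" rule, with `zlib.crc32` as a stand-in hash; Elasticsearch's real routing function differs in detail, and the function names here are hypothetical.

```python
import zlib


def shard_for(doc_id: str, number_of_shards: int) -> int:
    """Map a document ID to a shard number (stand-in hash, not Elasticsearch's)."""
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


def partition(doc_ids, number_of_shards):
    """Group document IDs by the shard that would hold them."""
    shards = {n: [] for n in range(number_of_shards)}
    for doc_id in doc_ids:
        shards[shard_for(doc_id, number_of_shards)].append(doc_id)
    return shards


shards = partition(["1", "2", "3", "4", "5", "6"], 2)
for number, ids in shards.items():
    print(f"shard {number}: {ids}")
```

Because the shard is derived from the ID alone, every lookup of the same ID lands on the same shard, which is also why the shard count of an index is fixed at creation time.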
  29. 29. Deployment of indices to a cluster • Index 1: Shard 1, Shard 2, Shard 3 • Index 2: Shard 1, Shard 2, Shard 3 [diagram: the primary and replica copies of each shard are distributed across Instance 1, Instance 2, and Instance 3 of an Amazon ES cluster]
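The idea of spreading primary and replica shard copies over instances can be sketched as follows. This is a simplified round-robin placement, not Elasticsearch's actual allocator; the only rule it enforces is that no two copies of the same shard share an instance. The function name `allocate` is hypothetical.

```python
def allocate(number_of_shards, number_of_replicas, instances):
    """Place each shard's primary and replicas on distinct instances."""
    if number_of_replicas >= len(instances):
        raise ValueError("need more instances than replicas per shard")
    layout = {name: [] for name in instances}
    for shard in range(1, number_of_shards + 1):
        for copy in range(number_of_replicas + 1):
            # Offsetting each copy by one instance keeps copies apart.
            target = instances[(shard - 1 + copy) % len(instances)]
            role = "primary" if copy == 0 else "replica"
            layout[target].append((shard, role))
    return layout


layout = allocate(3, 1, ["Instance 1", "Instance 2", "Instance 3"])
for instance, copies in layout.items():
    print(instance, copies)
```

With three shards, one replica, and three instances, each instance ends up holding one primary and one replica of a different shard, so losing any single instance leaves a full copy of the index available.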
  30. 30. Performance: single shard, single node
     Instance type (EBS Volume)   Average Write (EBS), 1000 doc _bulks   Average Read (EBS)   vCPU   RAM (GB)
     T2.micro (35GB)              - (1.3)                                - (0.47)             1      1
     T2.small (35GB)              - (2.6)                                - (0.77)             1      2
     T2.medium (35GB)             - (4.2)                                - (1.3)              2      4
     M3.medium (100GB)            2.95 (2.86)                            1.31 (1.39)          1      3.75
     M3.large (100GB)             6.35 (6.29)                            2.81 (2.84)          2      7.5
     M3.xlarge (100GB)            11.6 (11.6)                            4.62 (5.57)          4      15
     M3.2xlarge (100GB)           18.45 (18)                             11.32 (12.05)        8      30
     R3.large (100GB)             5.72 (5.94)                            2.86 (2.88)          2      15.25
     R3.xlarge (100GB)            10.8 (10.5)                            5.76 (5.79)          4      30.5
     R3.2xlarge (100GB)           16.8 (16.5)                            11.31 (11.38)        8      61
     R3.4xlarge (100GB)           19.1 (19.2)                            24.05 (24.66)        16     122
     R3.8xlarge (100GB)           22.2 (21.8)                            44 (47.29)           32     244
     I2.xlarge (100GB)            10.8 (10.8)                            5.09 (5.88)          4      30.5
     I2.2xlarge (100GB)           17.8 (18.1)                            10.05 (10.93)        8      61
  31. 31. Instance type recommendations
     Instance   Workload
     T2         Entry point. Dev and test. OK for dedicated masters.
     M3         Equal read and write volumes. Up to 5 TB of storage with EBS.
     R3         Read-heavy or workloads with high query demands (e.g., aggregations).
     I2         Up to 16 TB of SSD instance storage.
  32. 32. Secure access to your domain
  33. 33. Secure access to your domain { "Version": "2012-10-17", "Statement": [ { "Sid": "", "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::123456789012:user/susan" }, "Action": [ "es:ESHttpGet", "es:ESHttpPut", "es:ESHttpPost", "es:CreateElasticsearchDomain", "es:ListDomainNames" ], "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*" } ] }
  34. 34. Secure access to your domain Control access by user with signed requests { "Version": "2012-10-17", "Statement": [ { "Sid": "", "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::123456789012:user/susan" }, "Action": [ "es:ESHttpGet", "es:ESHttpPut", "es:ESHttpPost", "es:CreateElasticsearchDomain", "es:ListDomainNames" ], "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*" } ] }
  35. 35. Secure access to your domain Allow/Deny HTTP methods and Config operations per policy { "Version": "2012-10-17", "Statement": [ { "Sid": "", "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::123456789012:user/susan" }, "Action": [ "es:ESHttpGet", "es:ESHttpPut", "es:ESHttpPost", "es:CreateElasticsearchDomain", "es:ListDomainNames" ], "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*" } ] }
  36. 36. Secure access to your domain Fine-grained control to the index level { "Version": "2012-10-17", "Statement": [ { "Sid": "", "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::123456789012:user/susan" }, "Action": [ "es:ESHttpGet", "es:ESHttpPut", "es:ESHttpPost", "es:CreateElasticsearchDomain", "es:ListDomainNames" ], "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*" } ] }
  37. 37. Secure access to your domain And/or use IP-based access control { "Version": "2012-10-17", "Statement": [ { "Sid": "", "Effect": "Allow", "Principal": { "AWS": "*" }, "Action": [ "es:ESHttpGet", "es:ESHttpPut", "es:ESHttpPost", "es:CreateElasticsearchDomain", "es:ListDomainNames" ], "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*", "Condition": { "IpAddress": { "aws:SourceIp": [ "xx.xx.xx.xx/yy" ] } } } ] }
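An IP-based policy like the one above can be generated and sanity-checked locally. The following is a rough sketch: `ip_policy` and `ip_allowed` are hypothetical helpers, the CIDR block is a documentation range chosen for illustration, and the local check only approximates the `IpAddress` condition, which AWS evaluates on the service side.

```python
import ipaddress
import json


def ip_policy(resource_arn, cidr_blocks, actions):
    """Build a resource policy that allows the given actions from the CIDR blocks."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": list(actions),
                "Resource": resource_arn,
                "Condition": {"IpAddress": {"aws:SourceIp": list(cidr_blocks)}},
            }
        ],
    }


def ip_allowed(policy, client_ip):
    """Local approximation: is client_ip inside any allowed CIDR block?"""
    address = ipaddress.ip_address(client_ip)
    for statement in policy["Statement"]:
        blocks = statement["Condition"]["IpAddress"]["aws:SourceIp"]
        if any(address in ipaddress.ip_network(block) for block in blocks):
            return True
    return False


policy = ip_policy(
    "arn:aws:es:us-east-1:123456789012:domain/logs-domain/*",
    ["192.0.2.0/24"],  # assumed example range
    ["es:ESHttpGet", "es:ESHttpPut", "es:ESHttpPost"],
)
print(json.dumps(policy, indent=2))
print(ip_allowed(policy, "192.0.2.17"))
```

Generating the document with `json.dumps` avoids hand-editing mistakes such as a missing brace around the `Condition` block.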
  38. 38. Load data
  39. 39. Direct access to the Elasticsearch API
     Create an index:
     $ curl -XPUT https://<endpoint>/blog -d '{ "settings" : { "number_of_shards" : 3, "number_of_replicas" : 1 } }'
     Index a single document:
     $ curl -XPOST https://<endpoint>/blog/post/1 -d '{ "author":"jon handler", "title":"Amazon ES Launch" }'
     Index documents in bulk (one JSON object per line, no commas between lines):
     $ curl -XPOST https://<endpoint>/blog/post/_bulk -d '{ "index" : { "_index" : "blog", "_type" : "post", "_id" : "2" } }
     { "title":"Amazon ES for search", "author": "pravin pillai" }
     { "index" : { "_index":"blog", "_type":"post", "_id":"3" } }
     { "title":"Analytics too", "author": "vivek sriram" }
     '
     Search:
     $ curl -XGET 'https://<endpoint>/_search?q=ES'
     {"took":16,"timed_out":false,"_shards":{"total":3,"successful":3,"failed":0},"hits":{"total":2,"max_score":0.13424811,"hits":[{"_index":"blog","_type":"post","_id":"1","_score":0.13424811,"_source":{"author":"jon handler", "title":"Amazon ES Launch"}},{"_index":"blog","_type":"post","_id":"2","_score":0.11506981,"_source":{"title":"Amazon ES for search", "author": "pravin pillai"}}]}}
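The body of a `_bulk` request is newline-delimited JSON: an action line followed by a source line for each document, with a trailing newline at the end. A minimal sketch of building such a body, assuming the index, type, and documents from the slide (the helper name `bulk_body` is hypothetical):

```python
import json


def bulk_body(index, doc_type, docs):
    """Serialize (id, source) pairs into a newline-delimited _bulk body."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))  # action line
        lines.append(json.dumps(source))  # document line
    return "\n".join(lines) + "\n"  # the body must end with a newline


body = bulk_body(
    "blog",
    "post",
    [
        ("2", {"title": "Amazon ES for search", "author": "pravin pillai"}),
        ("3", {"title": "Analytics too", "author": "vivek sriram"}),
    ],
)
print(body, end="")
```

Serializing each line separately guarantees that no stray commas or line breaks end up inside the body.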
  40. 40. Loading data using Logstash Application nodes/ Logstash forwarders Logstash indexer Amazon Elasticsearch Service
  41. 41. Logstash plugin for Amazon ES https://github.com/awslabs/logstash-output-amazon_es
     output {
       amazon_es {
         hosts => ["foo.us-east-1.es.amazonaws.com"]   # required
         region => "us-east-1"                          # required
         access_key => 'ACCESS_KEY'                     # optional
         secret_key => 'SECRET_KEY'                     # optional
         codec => "plain"
         workers => 1
         index => "logstash-%{+YYYY.MM.dd}"
       }
     }
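The `index => "logstash-%{+YYYY.MM.dd}"` setting produces one index per day, named after the event date. A small sketch of the resulting names, assuming the date is taken from the event timestamp in UTC (the function name `daily_index` is hypothetical):

```python
from datetime import datetime, timezone


def daily_index(timestamp: datetime, prefix: str = "logstash") -> str:
    """Return the daily index name for an event timestamp, using its UTC date."""
    return f"{prefix}-{timestamp.astimezone(timezone.utc):%Y.%m.%d}"


event_time = datetime(2015, 10, 28, 23, 30, tzinfo=timezone.utc)
print(daily_index(event_time))
```

Daily indices make it straightforward to expire old log data by deleting whole indices rather than individual documents.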
  42. 42. Loading data using Lambda [diagram: Amazon S3, DynamoDB, and Amazon Kinesis feed AWS Lambda, which loads data into Amazon Elasticsearch Service]
  43. 43. Lambda code snippet (node.js) for upload
     var AWS = require('aws-sdk');
     var creds = new AWS.EnvironmentCredentials('AWS');
     function postDocumentToES(doc, context) {
       var req = new AWS.HttpRequest(endpoint);
       var signer = new AWS.Signers.V4(req, 'es');
       signer.addAuthorization(creds, new Date());
       var send = new AWS.NodeHttpClient();
       send.handleRequest(req, null, function(httpResp)...
     https://github.com/awslabs/amazon-elasticsearch-lambda-samples
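The `AWS.Signers.V4` call in the snippet signs the request with AWS Signature Version 4. The core of that scheme, deriving a signing key and using it to sign a string, can be sketched with the Python standard library. This is only a partial illustration: building the canonical request and the string to sign is omitted, and the secret key, date, and string to sign below are placeholders.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def signing_key(secret_key: str, date_stamp: str, region: str, service: str = "es") -> bytes:
    """Derive the Signature Version 4 signing key for one day, region, and service."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(secret_key: str, date_stamp: str, region: str, string_to_sign: str) -> str:
    """Return the hex signature of an already-built string to sign."""
    key = signing_key(secret_key, date_stamp, region)
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder inputs for illustration only.
signature = sign("EXAMPLE_SECRET_KEY", "20151028", "us-east-1", "placeholder-string-to-sign")
print(signature)
```

In practice an SDK or a signing library handles these steps, as the Lambda snippet does; the sketch only shows why a valid secret key, the region, and the service name `es` all matter for a request to be accepted.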
  44. 44. Export logs to Amazon ES CloudWatch Amazon Elasticsearch Service
  45. 45. Export CloudWatch Logs Demo
  46. 46. Monitor and audit CloudWatch CloudTrail
  47. 47. Monitoring
  48. 48. What should I monitor? • FreeStorageSpace – monitor and alarm before the cluster runs out of space • CPUUtilization – alarm at 80% CPU to signal the need to scale up • ClusterStatus.yellow – check whether replication requires additional nodes • JVMMemoryPressure – check instance type and count for sufficient resources • MasterCPUUtilization – monitoring for master nodes is separated from data nodes
  49. 49. Snapshot and restore for data durability
  50. 50. Daily automated snapshots • No additional charges • Snapshots retained for 14 days
  51. 51. Taking manual snapshots Amazon S3 role Snapshot repository Trust relationship: { "Version": "2012-10-17", "Statement": [ { "Sid": "", "Effect": "Allow", "Principal": { "Service": "es.amazonaws.com" }, "Action": "sts:AssumeRole" } ] }
  52. 52. Taking manual snapshots. Role policy for the Amazon S3 snapshot repository: { "Version":"2012-10-17", "Statement":[ { "Action":[ "s3:ListBucket" ], "Effect":"Allow", "Resource": [ "arn:aws:s3:::bucket" ] }, { "Action":[ "s3:GetObject", "s3:PutObject", "s3:DeleteObject", "iam:PassRole" ], "Effect":"Allow", "Resource":[ "arn:aws:s3:::bucket/*" ] } ] }
  53. 53. Taking manual snapshots
     Register the bucket:
     curl -XPUT http://<endpoint>/_snapshot/<repo-name> -d '{"type":"s3", "settings": { "bucket":"<bucket>", "region":"<region>", "role_arn":"<arn>"}}'
     Take a snapshot:
     curl -XPUT http://<endpoint>/_snapshot/<repo-name>/snapshot1
     Snapshot time is proportional to size.
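The two snapshot calls can also be prepared in Python with the standard library. This sketch only builds the requests and does not send them; the endpoint, repository, bucket, and role ARN are placeholders, the function names are hypothetical, and the `role_arn` settings key follows the AWS documentation. A domain protected by an IAM user policy would additionally need these requests to be signed.

```python
import json
import urllib.request


def register_repo_request(endpoint, repo_name, bucket, region, role_arn):
    """Build the PUT request that registers an S3 snapshot repository."""
    body = {
        "type": "s3",
        "settings": {"bucket": bucket, "region": region, "role_arn": role_arn},
    }
    return urllib.request.Request(
        url=f"https://{endpoint}/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def take_snapshot_request(endpoint, repo_name, snapshot_name):
    """Build the PUT request that starts a manual snapshot."""
    return urllib.request.Request(
        url=f"https://{endpoint}/_snapshot/{repo_name}/{snapshot_name}",
        method="PUT",
    )


# Placeholder values for illustration only.
register = register_repo_request(
    "search-example.us-east-1.es.amazonaws.com",
    "my-repo",
    "my-bucket",
    "us-east-1",
    "arn:aws:iam::123456789012:role/snapshot-role",
)
snapshot = take_snapshot_request(
    "search-example.us-east-1.es.amazonaws.com", "my-repo", "snapshot1"
)
print(register.get_method(), register.full_url)
print(snapshot.get_method(), snapshot.full_url)
```

Passing either request to `urllib.request.urlopen` would send it.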
  54. 54. Built-in Kibana
  55. 55. Application overview Logstash indexer Amazon Elasticsearch Service Application nodes/ Logstash forwarders
  56. 56. Kibana UI
  57. 57. Securing Kibana [diagram: IAM, proxy (optional)]
  58. 58. IAM policy for Kibana { "Version": "2012-10-17", "Statement": [ { "Sid": "", "Effect": "Allow", "Principal": { "AWS": "*" }, "Action": [ "es:ESHttpGet", "es:ESHttpPut", "es:ESHttpPost", "es:ESHttpHead" ], "Resource": [ "arn:aws:es:us-east-1:####:domain/<domain>/*" ], "Condition": { "IpAddress": { "aws:SourceIp": [ "xx.xx.xx.xx" ] } } } ] }
  59. 59. Pay for what you use
  60. 60. Pay for compute and storage you use With Amazon Elasticsearch Service, you pay only for the compute and storage resources you use. AWS Free Tier for qualifying customers.
  61. 61. Amazon Elasticsearch Service is publicly available now! You can use Amazon Elasticsearch Service in these regions: • us-east-1 • us-west-1 • us-west-2 • eu-west-1 • eu-central-1 • ap-southeast-1 • ap-southeast-2 • ap-northeast-1 • sa-east-1
  62. 62. Wrap up 1. Elasticsearch is a tool for full-text search, analysis, and visualization of time series data that helps you get the most out of your growing data set 2. Amazon Elasticsearch Service makes it easy to deploy and manage an Elasticsearch cluster in the AWS cloud 3. Amazon Elasticsearch Service is a drop-in replacement for your existing Elasticsearch cluster
  63. 63. Thank you! aws.amazon.com/elasticsearch-service
