
Elasticsearch 5 in Amazon Elasticsearch Service



Elasticsearch is a popular tool for log analytics, full-text search, application monitoring, and other analytics use cases. Amazon Elasticsearch Service delivers Elasticsearch’s easy-to-use APIs and real-time capabilities along with the availability, scalability, and security required by production workloads. In this tech talk, we provide an overview of Amazon Elasticsearch Service and review the new features in Elasticsearch 5 and Kibana 5.

Learning Objectives:
• Get an overview of Amazon Elasticsearch Service and its latest features, including support for Elasticsearch 5
• Understand how to take advantage of the new Elasticsearch 5 features

Published in: Technology


  1. 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Elasticsearch 5 in Amazon Elasticsearch Service Darin Briskman Amazon Web Services Technical Evangelist or @briskmad 15 Feb 2017 Jon Handler AWS Principal Solutions Architect or @_searchgeek
  2. 2. Get started at Amazon Search Services Amazon CloudSearch Amazon Elasticsearch Service
  3. 3. Open Source Distributed Index Managed Service using Elasticsearch and Kibana Fully managed; Zero admin Highly Available and Reliable RESTful API for easy integration Amazon Elasticsearch Service
  4. 4. Amazon Elasticsearch Service Leading Use Cases. Log Analytics & Operational Monitoring: • Monitor the performance of applications, web servers, and hardware • Easy-to-use, powerful data visualization tools to detect issues quickly • Dig into logs in an intuitive, fine-grained way • Kibana provides fast, easy visualization. Search: • Application or website provides search capabilities over diverse documents • Tasked with making this knowledge base searchable and accessible • Text matching, faceting, filtering, fuzzy search, autocomplete, highlighting, and other search features • Query API to support application search
  5. 5. Leading enterprises trust Amazon Elasticsearch Service for their search and analytics applications: Media & Entertainment, Online Services, Technology, Other
  6. 6. Adobe Developer Platform (Adobe I/O). PROBLEM: • Cost-effective monitoring for a very large volume of log data • Over 200,000 API calls per second at peak (destinations, response times, bandwidth) • Integrate seamlessly with other components of the AWS ecosystem. SOLUTION: • Log data is routed with Amazon Kinesis to Amazon Elasticsearch Service, then displayed using AES Kibana • Adobe team can easily see traffic patterns and error rates, quickly identifying anomalies and potential challenges. BENEFITS: • Management and operational simplicity • Flexibility to try out different cluster configurations during dev and test. [Diagram labels: Data Sources, Amazon Kinesis Streams, Spark Streaming, Amazon Elasticsearch Service]
  7. 7. McGraw Hill Education. PROBLEM: • Supporting a wide catalog across multiple services in multiple jurisdictions • Over 100 million learning events each month • Tests, quizzes, learning modules begun / completed / abandoned. SOLUTION: • Search and analyze test results, student/teacher interaction, teacher effectiveness, student progress • Analytics of applications and infrastructure are now integrated to understand operations in real time. BENEFITS: • Confidence to scale throughout the school year: from 0 to 32 TB in 9 months • Focus on their business, not their infrastructure
  8. 8. Amazon Elasticsearch Service Benefits. Easy to Use: • Deploy a production-ready Elasticsearch cluster in minutes • Simplifies time-consuming management tasks such as software patching, failure recovery, backups, and monitoring. Open: • Get direct access to the Elasticsearch open-source API • Fully compatible with the open source Elasticsearch API, for all code and applications. Secure: • Secure Elasticsearch clusters with AWS Identity and Access Management (IAM) policies with fine-grained access control for users and endpoints • Automatically applies security patches without disruption, keeping Elasticsearch environments secure. Available: • Provides high availability using Zone Awareness, which replicates data between two Availability Zones • Monitors the health of clusters and automatically replaces failed nodes, without service disruption. AWS Integrated: • Integrates with Amazon Kinesis Firehose, AWS IoT, and Amazon CloudWatch Logs for seamless data ingestion • AWS CloudTrail for auditing, AWS Identity and Access Management (IAM) for security, and AWS CloudFormation for cloud orchestration. Scalable: • Scale clusters from a single node up to 20 nodes • Configure clusters to meet performance requirements by selecting from a range of instance types and storage options, including SSD-powered EBS volumes
  9. 9. Easy to use and scalable [Diagram labels: AWS SDK, AWS CLI, AWS CloudFormation, Elastic Load Balancing, AWS IAM, Amazon CloudWatch, AWS CloudTrail]
  10. 10. Open • Drop-in replacement • Zero-change, no-risk migration to or from open source Elasticsearch
  11. 11. Secure • Control access based on originating IP or Principal • Mix policies to provide application access and Kibana access • Use IAM roles to provide access for other services
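The "mix policies" point on the Secure slide can be sketched as a single resource-based access policy with two statements: one limited by originating IP (for example, for Kibana users in a browser) and one granted to an IAM role (for an application or another service). This is only an illustration. The domain ARN, IP range, and role ARN below are hypothetical placeholders, and the `es:ESHttp*` action pattern is one reasonable choice rather than the policy used in the talk.

```python
import json


def build_access_policy(domain_arn, allowed_ips, role_arn):
    """Build a resource-based policy that mixes IP-based and principal-based access.

    All three arguments are caller-supplied; nothing here talks to AWS.
    """
    resource = f"{domain_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Anonymous requests are allowed only from the listed source IPs.
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "es:ESHttp*",
                "Resource": resource,
                "Condition": {"IpAddress": {"aws:SourceIp": list(allowed_ips)}},
            },
            {
                # Signed requests from this IAM role are allowed from anywhere.
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "es:ESHttp*",
                "Resource": resource,
            },
        ],
    }


# Hypothetical values, for illustration only.
policy = build_access_policy(
    domain_arn="arn:aws:es:us-east-1:123456789012:domain/example-domain",
    allowed_ips=["192.0.2.0/24"],
    role_arn="arn:aws:iam::123456789012:role/example-ingest-role",
)
policy_json = json.dumps(policy, indent=2)
```

The resulting JSON is the kind of document you would paste into the domain's access policy editor or pass to the service API when configuring the domain.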
  12. 12. Available [Diagram: an Amazon Elasticsearch Service cluster with Instances 1 to 4 spread across Availability Zone 1 and Availability Zone 2, each instance holding numbered shard copies]
  13. 13. AWS integrated [Diagram labels: Logstash, REST, CWL Agent, EC2 Instances, Amazon Kinesis, Amazon RDS, Amazon DynamoDB, Amazon SQS Queue, Logstash Cluster, Amazon Elasticsearch Service, Amazon CloudWatch, AWS Lambda, AWS CloudTrail, Access Logs, Amazon VPC Flow Logs, Amazon S3 bucket, AWS IoT, Amazon Kinesis Firehose, Amazon ECS]
  14. 14. Dedicated master nodes improve stability [Diagram: an Amazon ES cluster in which Instances 1 to 3 are data nodes that hold numbered shard copies and serve queries and updates, alongside separate dedicated master nodes]
  15. 15. Firehose delivery architecture with transformations [Diagram labels: data source, source records, Firehose delivery stream, transformed records, Amazon Elasticsearch Service, intermediate Amazon S3 bucket, backup S3 bucket, transformation failure, delivery failure]
  16. 16. Repository Search • File metadata and possibly file contents for traditional search • Lambda to keep the repository current • Good for up to ~60TB of metadata/source data (current limits) See also: Indexing S3 Metadata blog post by Amit Sharma
  17. 17. Amazon Elasticsearch Service support for Elasticsearch 5
  18. 18. What to do with a terabyte of logs?
  19. 19. Visualize it with Kibana 5!
  20. 20. Scripting with Amazon Elasticsearch Service. Scripting is fully supported using the Painless language. With scripts you can: • Change the precedence of search results • Delete index fields by query • Modify search results to return specific fields • Alter elements in a field. Painless is explicitly designed for Elasticsearch and is both performant and secure.
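Two of the scripting uses listed on this slide can be sketched as request bodies. The first removes a field from every matching document through `_update_by_query`; the second reorders results with a `script_score` function. This is a rough sketch that assumes the Elasticsearch 5.x script syntax, where the script text goes under the `inline` key. The field names `debug_info` and `popularity` and the match query are hypothetical.

```python
import json


def remove_field_by_query(field, query):
    """Body for POST /<index>/_update_by_query that deletes `field` from matching docs."""
    return {
        "script": {
            "lang": "painless",
            # ctx._source is the document as a map; remove() drops the key.
            "inline": "ctx._source.remove(params.field)",
            "params": {"field": field},
        },
        "query": query,
    }


def boost_by_numeric_field(query, field):
    """Body for POST /<index>/_search that multiplies the score by a numeric field."""
    return {
        "query": {
            "function_score": {
                "query": query,
                "script_score": {
                    "script": {
                        "lang": "painless",
                        "inline": "_score * doc[params.field].value",
                        "params": {"field": field},
                    }
                },
            }
        }
    }


# Hypothetical usage: strip a field from all docs, then rank a search by popularity.
cleanup_body = remove_field_by_query("debug_info", {"match_all": {}})
search_body = boost_by_numeric_field({"match": {"title": "elasticsearch"}}, "popularity")
cleanup_json = json.dumps(cleanup_body)
```

Passing values through `params` rather than concatenating them into the script text lets Elasticsearch cache the compiled script across requests.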
  21. 21. Ingest Pipelines and Processors. When you index documents, you can specify a pipeline. The pipeline can have a series of processors that pre-process the data before indexing. Twenty processors are available; some are simple: { "append": { "field": "field1", "value": ["item2", "item3", "item4"] } } Others are more complex, like the Grok processor for regex with aliased expressions.
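As a minimal sketch of how the append processor from this slide fits into a full pipeline, the helpers below build the pipeline definition and the paths you would send it to. The pipeline ID `add-items`, the index `logs`, and the type `entry` are hypothetical; the processor itself is the one shown on the slide.

```python
import json


def append_pipeline(field, values, description="Append fixed values to a field"):
    """Body for PUT /_ingest/pipeline/<id> with a single append processor."""
    return {
        "description": description,
        "processors": [
            {"append": {"field": field, "value": list(values)}},
        ],
    }


def pipeline_path(pipeline_id):
    """Path used to create or update the pipeline."""
    return f"/_ingest/pipeline/{pipeline_id}"


def index_path(index, doc_type, pipeline_id):
    """Path for indexing a document through the pipeline (POST with the doc as body)."""
    return f"/{index}/{doc_type}?pipeline={pipeline_id}"


# Hypothetical names, for illustration only.
body = append_pipeline("field1", ["item2", "item3", "item4"])
create_at = pipeline_path("add-items")
index_at = index_path("logs", "entry", "add-items")
body_json = json.dumps(body)
```

A document posted to `index_at` would pass through the pipeline before it is indexed, so the appended values appear in the stored document.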
  22. 22. Lots of New Elasticsearch APIs: /_alias /_aliases /_all /_analyze /_bulk /_cache/clear (Index only) /_cat /_cluster/allocation/explain /_cluster/health /_cluster/pending_tasks /_cluster/settings (PUT only): indices.breaker.fielddata.limit indices.breaker.request.limit /_cluster/state /_cluster/stats /_count /_delete_by_query* /_explain /_field_stats /_flush /_forcemerge (Index only) /_mapping /_mget /_msearch /_mtermvectors /_nodes /_plugin/kibana /_recovery (Index only) /_refresh /_reindex* /_rollover /_search /_search profile /_segments (Index only) /_shard_stores /_shrink /_snapshot /_stats /_status /_tasks /_template /_termvectors /_update_by_query* /_validate
  23. 23. Shrink and Rollover. Shrink an index to a single shard: POST source_index/_shrink/target_index Very useful for time-series indexes once ingestion is done! Roll over an index based on number of documents: POST logs_index/_rollover { "conditions": {"max_docs": 100000 } }
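The two calls on this slide can be sketched as small request builders. In open source Elasticsearch 5, a shrink additionally expects the source index to be blocked for writes and to have a copy of every shard on one node, and rollover is addressed to an alias that points at the current write index; the sketch below covers the write block and the request bodies only. The index and alias names are the ones from the slide, and the `max_age` condition is an optional extra.

```python
def shrink_requests(source, target, shards=1):
    """Return (method, path, body) tuples that block writes and then shrink the index."""
    return [
        # Block writes so the source index cannot change during the shrink.
        ("PUT", f"/{source}/_settings", {"settings": {"index.blocks.write": True}}),
        # Create the target index with fewer primary shards.
        ("POST", f"/{source}/_shrink/{target}",
         {"settings": {"index.number_of_shards": shards}}),
    ]


def rollover_request(alias, max_docs=None, max_age=None):
    """Return a (method, path, body) tuple for a conditional rollover of `alias`."""
    conditions = {}
    if max_docs is not None:
        conditions["max_docs"] = max_docs
    if max_age is not None:
        conditions["max_age"] = max_age
    return ("POST", f"/{alias}/_rollover", {"conditions": conditions})


shrink_plan = shrink_requests("source_index", "target_index")
rollover = rollover_request("logs_index", max_docs=100000)
```

Each tuple can be handed to whatever HTTP client you use against the domain endpoint.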
  24. 24. Supported Elasticsearch 5 Plugins • Smart Chinese Analysis plugin • Stempel Polish Analysis plugin • Ingest Processor Attachment plugin • Ingest GeoIP Processor plugin • Ingest User Agent Processor plugin • Mapper Murmur3 plugin 中文 Polskie
  25. 25. Testing Ingest Performance • Load generator: m4.large, single process, single thread • Amazon Elasticsearch Service: 1 instance, 1 primary, no replicas, EBS gp2 storage • Data: 1.8 million Apache web log lines, comprising 196 MB; _bulk API calls with 10K lines per call • Monitoring data gathered from the load generator process and from the Amazon Elasticsearch Service domain
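The batching described in this test setup (`_bulk` calls with 10K lines each) might look roughly like the generator below. It is not the load generator used in the talk: the index name, type name, and the choice to wrap each raw log line in a `message` field are assumptions made for illustration.

```python
import json
from itertools import islice


def bulk_bodies(lines, index, doc_type, batch_size=10_000):
    """Yield newline-delimited _bulk request bodies, `batch_size` documents per body."""
    # The same action line precedes every document in the batch.
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})
    iterator = iter(lines)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        parts = []
        for line in batch:
            parts.append(action)
            parts.append(json.dumps({"message": line}))
        # The _bulk API requires a trailing newline after the last line.
        yield "\n".join(parts) + "\n"


# Tiny hypothetical sample: 25 log lines in batches of 10 gives 3 request bodies.
sample = [f'127.0.0.1 - - "GET /page{i} HTTP/1.1" 200' for i in range(25)]
bodies = list(bulk_bodies(sample, "weblogs", "log", batch_size=10))

# At 10K lines per call, 1.8 million lines need 180 calls.
calls_needed = 1_800_000 // 10_000
```

Each yielded string would be sent as the body of one `POST /_bulk` request.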
  26. 26. Ingest Performance Test Results
      Amazon Elasticsearch Service with v2.3 Engine
      Instance      Avg Index   Docs/sec
      m3.medium     3.93 ms     2811
      m3.2xlarge    11.83 ms    3966
      r3.large      8.87 ms     3932
      r3.8xlarge    10.58 ms    4404
      i2.2xlarge    11.2 ms     5305
      Amazon Elasticsearch Service with v5.1 Engine
      Instance      Avg Index   Docs/sec
      m3.medium     3.12 ms     3629
      m3.2xlarge    11.1 ms     5816
      r3.large      8.76 ms     7221
      r3.8xlarge    9.59 ms     7726
      i2.2xlarge    10.3 ms     9676
      Up to 82% more documents per second!
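To see how the per-instance gains compare, the docs/sec figures from the two tables can be run through a short calculation. The numbers are copied from the slide; the helper name is arbitrary.

```python
# Docs/sec per instance type, copied from the slide.
docs_per_sec_v23 = {
    "m3.medium": 2811,
    "m3.2xlarge": 3966,
    "r3.large": 3932,
    "r3.8xlarge": 4404,
    "i2.2xlarge": 5305,
}
docs_per_sec_v51 = {
    "m3.medium": 3629,
    "m3.2xlarge": 5816,
    "r3.large": 7221,
    "r3.8xlarge": 7726,
    "i2.2xlarge": 9676,
}


def percent_gain(before, after):
    """Percentage increase from `before` to `after`."""
    return (after - before) / before * 100.0


gains = {
    instance: percent_gain(docs_per_sec_v23[instance], docs_per_sec_v51[instance])
    for instance in docs_per_sec_v23
}

for instance, gain in gains.items():
    print(f"{instance:<12} {gain:6.1f}% more docs/sec")
```

On these figures the i2.2xlarge gain works out to roughly 82%, in line with the headline on the slide, while the smallest instance type gains the least.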
  27. 27. Migrating from v2.3 to v5.1. The easy way: (1) Create a new Amazon Elasticsearch Service v5.1 cluster; (2) Snapshot your v2.3 indexes; (3) Restore the indexes to the v5.1 cluster … but this won’t get most of the benefits of v5.1. There are many breaking changes in v5, documented at
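Steps 2 and 3 of the "easy way" can be sketched as a sequence of snapshot API calls. The sketch assumes a manual snapshot repository backed by an S3 bucket that both domains can reach through an IAM role, and that requests to the domain are signed appropriately. The repository name, snapshot name, bucket, region, and role ARN are hypothetical placeholders.

```python
def migration_requests(repo, snapshot, bucket, region, role_arn, indices):
    """Return labelled (target, method, path, body) steps for a snapshot-and-restore move."""
    repo_body = {
        "type": "s3",
        "settings": {"bucket": bucket, "region": region, "role_arn": role_arn},
    }
    index_list = ",".join(indices)
    return [
        # Register the same S3 repository on the old and the new domain.
        ("v2.3", "PUT", f"/_snapshot/{repo}", repo_body),
        ("v5.1", "PUT", f"/_snapshot/{repo}", repo_body),
        # Take the snapshot on the v2.3 domain.
        ("v2.3", "PUT", f"/_snapshot/{repo}/{snapshot}", {"indices": index_list}),
        # Restore the chosen indexes on the v5.1 domain.
        ("v5.1", "POST", f"/_snapshot/{repo}/{snapshot}/_restore", {"indices": index_list}),
    ]


# Hypothetical values, for illustration only.
plan = migration_requests(
    repo="migration-repo",
    snapshot="pre-upgrade",
    bucket="example-snapshot-bucket",
    region="us-east-1",
    role_arn="arn:aws:iam::123456789012:role/example-snapshot-role",
    indices=["logs-2017.01", "logs-2017.02"],
)
```

As the slide notes, indexes restored this way keep their v2.3 structure, so features that depend on reindexing under v5.1 are not picked up automatically.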
  28. 28. Three Things to Remember • Amazon Elasticsearch Service is a drop-in replacement for new and existing Elasticsearch workloads • Deploy, manage, and scale Elasticsearch more easily in the AWS cloud • Support for Elasticsearch 5.1 brings scripting, additional plugins and additional performance to Amazon Elasticsearch Service
  29. 29. Find out more: AWS Centralized Logging: Elasticsearch at the AWS Database Blog: Or ask your Solutions Architect! Amazon Elasticsearch Service