
ABD301-Analyzing Streaming Data in Real Time with Amazon Kinesis



Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. In this session, we present an end-to-end streaming data solution using Kinesis Streams for data ingestion, Kinesis Analytics for real-time processing, and Kinesis Firehose for persistence. We review in detail how to write SQL queries using streaming data and discuss best practices to optimize and monitor your Kinesis Analytics applications. Lastly, we discuss how to estimate the cost of the entire system.



ABD301-Analyzing Streaming Data in Real Time with Amazon Kinesis

  1. Analyzing Streaming Data in Real Time with Amazon Kinesis. AWS re:Invent. Ryan Nienhuis, Senior Product Manager, Amazon Kinesis. November 2017. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. It's All About the Pace
     • Batch processing: hourly server logs; weekly or monthly bills; daily web-site clickstream; daily fraud reports
     • Stream processing: real time metrics; real time spending alerts/caps; real time clickstream analysis; real time detection
  3. Why?
     • Data loses value over time
     • Ingest data as it is generated
     • Analyze data in real time to get insights immediately
     • Deliver data in seconds instead of hours
  4. Simple Pattern for Streaming Data
     • Data Producer (e.g. Mobile Client): continuously creates data; continuously writes data to a stream; can be almost anything
     • Streaming Service (e.g. Amazon Kinesis): durably stores data; provides temporary buffer that preps data; supports very high throughput
     • Data Consumer (e.g. Amazon Kinesis app): continuously processes data; cleans, prepares, & aggregates; transforms data to information
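The producer, stream, consumer pattern on this slide can be sketched in a few lines. This is a minimal in-memory illustration, not Kinesis client code: `InMemoryStream`, `produce`, `consume`, and the page-view records are hypothetical names and data invented for the example.

```python
class InMemoryStream:
    """Hypothetical stand-in for a streaming service: an ordered, retained buffer."""

    def __init__(self):
        self._records = []

    def put_record(self, record):
        # Producers append; records stay in the buffer after they are read.
        self._records.append(record)

    def get_records(self, position, limit):
        # Consumers track their own position, so several can read independently.
        batch = self._records[position:position + limit]
        return batch, position + len(batch)


def produce(stream, events):
    """Data producer: writes each event to the stream as it is created."""
    for event in events:
        stream.put_record(event)


def consume(stream, batch_size=2):
    """Data consumer: reads batches and aggregates them into page-view counts."""
    totals = {}
    position = 0
    while True:
        batch, position = stream.get_records(position, batch_size)
        if not batch:
            break
        for record in batch:
            totals[record["page"]] = totals.get(record["page"], 0) + 1
    return totals


stream = InMemoryStream()
produce(stream, [{"page": "/home"}, {"page": "/cart"}, {"page": "/home"}])
page_views = consume(stream)
print(page_views)
```

The consumer keeps its own read position instead of removing records. That loosely mirrors the "durably stores data" role of the streaming service on the slide.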
  5. Amazon Kinesis
     • Amazon Kinesis Data Streams: build custom applications that process and analyze streaming data
     • Amazon Kinesis Data Analytics: easily process and analyze streaming data with standard SQL
     • Amazon Kinesis Data Firehose: easily load streaming data into AWS
  6. Amazon Kinesis Data Streams
     • Easy administration and low cost
     • Build real time applications with framework of choice
     • Secure, durable storage
  7. Amazon Kinesis Data Analytics
     • Powerful real time applications
     • Easy to use, fully managed
     • Automatic elasticity
  8. Amazon Kinesis Data Firehose
     • Zero administration and seamless elasticity
     • Direct-to-data store integration
     • Serverless, continuous data transformations
     • Destinations shown: Amazon S3, Amazon Redshift
  9. Amazon Kinesis Data Analytics Applications
     • Connect to streaming source
     • Easily write SQL code to process streaming data
     • Continuously deliver SQL results
  10. Common Use Cases
  11. Three Common Scenarios
     • Streaming Ingest-Transform-Load: deliver data to analytics tools faster and cheaper
     • Continuous Metric Generation: compute analytics as the data is generated
     • Actionable Insights: react to analytics based off of insights
  12. Web Analytics and Leaderboards
     • Steps: ingest web app data; compute top 10 users; persist to feed live apps
     • Components: lightweight JS client code with Amazon Cognito OR web server on Amazon EC2 instance; Amazon Kinesis Data Streams; Amazon Kinesis Data Analytics; AWS Lambda function; Amazon DynamoDB table
  13. Monitoring IoT Devices
     • Steps: ingest sensor data; compute avg temp every 10 sec; persist time series analytic to database
     • Components: IoT sensors; AWS IoT; Amazon Kinesis Data Streams; Amazon Kinesis Data Analytics; AWS Lambda function; Amazon RDS MySQL DB instance
  14. Analyzing CloudTrail Event Logs
     • Steps: ingest and deliver raw log data; compute operational metrics; deliver to real time dashboards and archival
     • Components: AWS CloudTrail; Amazon CloudWatch events trigger; Amazon Kinesis Data Firehose; Amazon S3 bucket for raw data; Amazon Kinesis Data Analytics; Amazon Kinesis Data Firehose; Amazon S3 bucket for processed data; AWS Lambda function; Amazon DynamoDB table(s); Chart.JS dashboard
  15. Deep Dive into Analyzing CloudTrail Event Logs
  16. Ingest and deliver CloudTrail events
     • CloudTrail provides continuous account activity logging
     • Events are sent in real time (to near real time) to Kinesis Data Firehose or Streams
     • Each event includes a timestamp, IAM user, AWS service name, API call, response, and more
     • Components: AWS CloudTrail; Amazon CloudWatch events trigger; Kinesis Data Firehose; Amazon S3 bucket for raw data
  17. Stream Data to Amazon Kinesis
     • Categories: automatic ingestion; easy setup; write your own
     • Sources: Amazon VPC Flow Logs; Elastic Load Balancing; Amazon RDS; Amazon CloudWatch Logs; AWS CloudTrail Event Logs; Amazon Pinpoint; Amazon API Gateway; AWS IoT events; AWS SDKs; Amazon DynamoDB; Amazon Kinesis Agent; Amazon Kinesis Producer Library
     • Labels: as a proxy; for change data capture
     • Just a sample… many more ways to stream data to Amazon Kinesis
  18. Compute operational metrics in real time
     Compute metrics using SQL in real time like:
     • Total calls by IP, service, API call, IAM user
     • Amazon EC2 API failures (or any other service)
     • Anomalous behavior of Amazon EC2 API (or any other service)
     • Top 10 API calls across all services
     Flow shown: raw data; Amazon Kinesis Data Analytics; real time analytics
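To make the metrics on this slide concrete, here is a small batch-style Python sketch of three of them. The talk computes these continuously with SQL in Kinesis Data Analytics; this sketch only shows what each metric counts. The sample events are invented, and the helper names (`top_api_calls`, `calls_by_ip`, `failures_for_service`) are hypothetical. The field names `eventSource`, `eventName`, `sourceIPAddress`, and `errorCode` follow the CloudTrail record format.

```python
from collections import Counter

# Invented sample events using CloudTrail-style field names.
events = [
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances",
     "sourceIPAddress": "10.0.0.1", "errorCode": None},
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances",
     "sourceIPAddress": "10.0.0.1", "errorCode": None},
    {"eventSource": "ec2.amazonaws.com", "eventName": "RunInstances",
     "sourceIPAddress": "10.0.0.2", "errorCode": "Client.UnauthorizedOperation"},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "sourceIPAddress": "10.0.0.2", "errorCode": None},
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances",
     "sourceIPAddress": "10.0.0.2", "errorCode": None},
]


def top_api_calls(events, k=10):
    """Top k (service, API call) pairs by number of calls."""
    counts = Counter((e["eventSource"], e["eventName"]) for e in events)
    return counts.most_common(k)


def calls_by_ip(events):
    """Total calls per source IP address."""
    return Counter(e["sourceIPAddress"] for e in events)


def failures_for_service(events, event_source):
    """Number of calls to one service that returned an error code."""
    return sum(
        1 for e in events
        if e["eventSource"] == event_source and e.get("errorCode")
    )


print(top_api_calls(events, k=1))
print(dict(calls_by_ip(events)))
print(failures_for_service(events, "ec2.amazonaws.com"))
```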
  19. How do I write streaming SQL? Easy! Streams (in memory tables)
       CREATE STREAM calls_per_ip_stream(
         eventTimeStamp TIMESTAMP,
         computationType VARCHAR(256),
         category VARCHAR(1024),
         subCategory VARCHAR(1024),
         unit VARCHAR(256),
         unitValue BIGINT
       );
  20. How do I write streaming SQL? Easy! Pumps (continuous query)
       CREATE OR REPLACE PUMP calls_per_ip_pump AS
       INSERT INTO calls_per_ip_stream
       SELECT STREAM "eventTimestamp", COUNT(*), "sourceIPAddress"
       FROM source_sql_stream_001 ctrail
       GROUP BY "sourceIPAddress",
         STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
         STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
  21. How do we aggregate streaming data?
     • Aggregations (count, sum, min,…) take granular real time data and turn it into insights
     • Data is continuously processed so you need to tell the application when you want results
     • Windows!
  22. Window Types
     • Sliding, tumbling, and custom windows
     • Tumbling windows are fixed size and grouped keys do not overlap
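The difference between tumbling and sliding windows is easier to see on a handful of timestamps. The following is a plain Python sketch of the two ideas, not the Kinesis Data Analytics implementation. The function names and the sample timestamps (seconds since an arbitrary start) are made up for illustration, and the sliding window is assumed to cover the `size` seconds up to and including each event.

```python
from collections import defaultdict


def tumbling_counts(timestamps, size):
    """Count events per fixed, non-overlapping window of `size` seconds."""
    counts = defaultdict(int)
    for ts in timestamps:
        window_start = ts - (ts % size)  # every event falls in exactly one window
        counts[window_start] += 1
    return dict(sorted(counts.items()))


def sliding_counts(timestamps, size):
    """For each event, count the events in the `size` seconds ending at it."""
    ordered = sorted(timestamps)
    results = []
    start = 0
    for end, ts in enumerate(ordered):
        # Drop events that are `size` seconds or more older than this one.
        while ordered[start] <= ts - size:
            start += 1
        results.append((ts, end - start + 1))
    return results


event_seconds = [5, 20, 59, 61, 70, 125]
tumbling = tumbling_counts(event_seconds, 60)
sliding = sliding_counts(event_seconds, 60)
print(tumbling)
print(sliding)
```

In the tumbling result each event is counted once. In the sliding result the same event contributes to several overlapping windows, so the event at second 59 is still counted when the event at second 61 arrives.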
  23. Event, ingest, and processing time
     • Event time is the timestamp assigned when the event occurred, also called client-side time.
     • Processing time is when your application reads and analyzes the data (ROWTIME).
       …
       GROUP BY "sourceIPAddress",
         /* Trigger for results */
         STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
         /* A timestamp grouping key */
         STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
  24. Persist data for real time dashboards
     • Use Kinesis Data Firehose to archive processed data in S3
     • Use AWS Lambda to deliver data to DynamoDB (or another database)
     • Open source or other tools to visualize the data
     • Components: real time analytics; Amazon Kinesis Data Firehose; Amazon S3 bucket for processed data; AWS Lambda function; Amazon DynamoDB table(s); Chart.JS dashboard
  25. Late results
     • An event is late if it arrives after the computation to which it logically belongs has been completed
     • Your Kinesis Analytics application will produce an amendment
       …
       GROUP BY "sourceIPAddress",
         /* Trigger for results */
         STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
         /* A timestamp grouping key */
         STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
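One way to picture the amendment is to emulate the two grouping keys in plain Python. This sketch is only an illustration of the idea under simplifying assumptions. `step` imitates rounding a timestamp down to the interval, `rowtime` stands in for processing time, timestamps are seconds from an arbitrary start, and the records are invented.

```python
from collections import Counter


def step(ts, interval):
    """Round a timestamp (in seconds) down to the start of its interval."""
    return ts - (ts % interval)


def windowed_counts(records, interval=60):
    """Count records per (processing-time window, IP, event-time window)."""
    counts = Counter()
    for r in records:
        key = (
            step(r["rowtime"], interval),     # trigger for results
            r["ip"],
            step(r["event_time"], interval),  # timestamp grouping key
        )
        counts[key] += 1
    return sorted(counts.items())


def totals_by_event_window(rows):
    """Fold first results and amendments into one total per (IP, event window)."""
    totals = Counter()
    for (_rowtime_window, ip, event_window), count in rows:
        totals[(ip, event_window)] += count
    return dict(totals)


records = [
    {"ip": "10.0.0.1", "event_time": 10, "rowtime": 12},
    {"ip": "10.0.0.1", "event_time": 50, "rowtime": 55},
    # Late: happened in the first minute but was processed in the second.
    {"ip": "10.0.0.1", "event_time": 58, "rowtime": 65},
    {"ip": "10.0.0.1", "event_time": 70, "rowtime": 72},
]

rows = windowed_counts(records)
totals = totals_by_event_window(rows)
print(rows)
print(totals)
```

The second row of `rows` is the amendment. It is emitted in the later processing-time window but carries the earlier event-time key, so a downstream store can add it to the first result for that minute.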
  26. Updating a database
     • Perform inserts but on duplicate key update
     • For DynamoDB, here is the AWS Lambda code:
       …
       GROUP BY "sourceIPAddress",
         /* Trigger for results */
         STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
         /* A timestamp grouping key */
         STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
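The "insert but on duplicate key update" idea can be sketched with Python's built-in `sqlite3` module. This is not the Lambda code from the talk, which is not included in this transcript. An in-memory SQLite database is swapped in for DynamoDB or MySQL so the example is self-contained. The table name, column names, and `apply_results` helper are hypothetical, and it is assumed that an amendment row carries only the count of the late records, so counts are added rather than overwritten. The `ON CONFLICT ... DO UPDATE` syntax needs SQLite 3.24 or newer.

```python
import sqlite3


def apply_results(conn, rows):
    """Insert each (event_minute, ip, calls) row, or add to the existing count."""
    conn.executemany(
        """
        INSERT INTO calls_per_ip (event_minute, ip, calls)
        VALUES (?, ?, ?)
        ON CONFLICT (event_minute, ip)
        DO UPDATE SET calls = calls + excluded.calls
        """,
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE calls_per_ip (
        event_minute INTEGER,
        ip TEXT,
        calls INTEGER,
        PRIMARY KEY (event_minute, ip)
    )
    """
)

# First result for minute 0, then an amendment for minute 0 plus a new minute.
apply_results(conn, [(0, "10.0.0.1", 2)])
apply_results(conn, [(0, "10.0.0.1", 1), (60, "10.0.0.1", 1)])

stored = conn.execute(
    "SELECT event_minute, ip, calls FROM calls_per_ip ORDER BY event_minute"
).fetchall()
print(stored)
```

Keying the table on the event-time window and IP means that a late amendment updates the existing row instead of creating a second one, which keeps a dashboard query to one row per minute and IP.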
  27. What does all this cost?
     • All services used in the solution are pay as you go
     • All services used are serverless and have lower devops expense
     • This solution will cost the “average” customer less than $100 per month
  28. Where do we go next?
  29. Try it out yourself
     Go to aws.amazon.com/kinesis/
     Some good examples:
     • Get started in minutes with a clickthrough template for AWS CloudTrail Event Log Analytics - <link> (friendly URL)
     • Tinyurl.com/rt-dashboard
     • Great blog posts with example use cases
  30. Lots of customer examples
     • 1 billion events/wk from connected devices | IoT
     • 17 PB of game data per season | Entertainment
     • 80 billion ad impressions/day, 30 ms response time | Ad Tech
     • 100 GB/day click streams from 250+ sites | Enterprise
     • 50 billion ad impressions/day, sub-50 ms responses | Ad Tech
     • 10 million events/day | Retail
     • Amazon Kinesis as Databus - Migrate from Kafka to Kinesis | Enterprise
     • Funnel all production events through Amazon Kinesis
  31. Integrate with your current solutions
  32. Get help from partner systems integrators
  33. THANK YOU!
