ABD321: Don’t Wait Until Tomorrow: How to Use Streaming Data to Gain Real-time Insights into Your Business

In recent years, there has been an explosive growth in the number of connected devices and real-time data sources. Because of this, data is being produced continuously and its production rate is accelerating. Businesses can no longer wait for hours or days to use this data. To gain the most valuable insights, they must use this data immediately so they can react quickly to new information. In this workshop, you learn how to take advantage of streaming data sources to analyze and react in near real-time. You are presented with several requirements for a real-world streaming data scenario and you're tasked with creating a solution that successfully satisfies the requirements using services such as Amazon Kinesis, AWS Lambda and Amazon SNS.

  1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS re:INVENT Don’t Wait Until Tomorrow: How to Use Streaming Data to Gain Real-time Insights into Your Business ABD321
  2. Agenda (3:15 PM - 5:45 PM) • Intro to streaming data on AWS • Section 1: Capture and Visualize real-time sensor data (1 hour) • Section 2: Real-time Analysis + Alerting with Kinesis Analytics (1 hour)
  3. Data is produced continuously • Mobile Apps • Web Clickstream • Application Logs • Metering Records • IoT Sensors • Smart Buildings • [Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test
  4. The diminishing value of data
  5. Amazon Kinesis makes it easy to work with real-time streaming data Amazon Kinesis Streams • For technical developers • Collect and stream data for ordered, replayable, real-time processing Amazon Kinesis Firehose • For all developers, data scientists • Easily load massive volumes of streaming data into Amazon S3, Amazon Redshift, Amazon ES Amazon Kinesis Analytics • For all developers, data scientists • Easily analyze data streams using standard SQL queries
  6. Amazon Kinesis Streams • Reliably ingest and durably store streaming data at low cost • Build custom real-time applications to process streaming data
  7. Amazon Kinesis Firehose • Reliably ingest and deliver batched, compressed, and encrypted data to S3, Amazon Redshift, and Amazon ES • Point and click setup with zero administration and seamless elasticity
  8. Amazon Kinesis Analytics • Interact with streaming data in real time using SQL • Build fully managed and elastic stream processing applications that process data for real-time visualizations and alarms
  9. Amazon Kinesis Data Producers SDKs • Publish directly from application code via PutRecord and PutRecords APIs Kinesis Agent • Tail log files and forward lines as messages to Kinesis Streams Kinesis Producer Library (KPL) • Background process aggregates and batches messages • Producer application calls the addUserRecord method Third-party and open source • Log4j appender • Flume, fluentd source libraries
  10. Amazon Kinesis Data Consumers Direct API access • Custom application, using GetShardIterator and GetRecords APIs • Application responsible for shard processing, checkpoints, reshard operations Kinesis Client Library (KCL) • Open source library, available in several languages • Manages stream checkpointing • Manages shard-worker relationships on reshard, or consumer instance scaling AWS Lambda • Serverless stream processing • Lambda function is invoked only when messages exist on stream • One Lambda function instance per shard Third-party and open source • Spark Streaming • Storm Spout
  11. Tools: http://bit.ly/2okWPnH
  12. Workshop Questions 1. Utilization: What is the busiest toll station? 2. Promotions: Who are the most active users? 3. Support: Detect failing toll sensors.
  13. Resources • Workshop Quick start: http://amzn.to/2zO4J2I
  14. Recap • Data Visualization example
  15. Section 2: Real-Time Analytics & Alerting
  16. Amazon Kinesis Analytics Core Concepts: Connect to streaming source • Streaming data sources include Kinesis Firehose or Kinesis Streams • Input formats include JSON, CSV, variable column, unstructured text • Each input has a schema; schema is inferred, but you can edit • Reference data sources (S3) for data enrichment
  17. Amazon Kinesis Analytics Core Concepts: Write SQL code • Build streaming applications with one-to-many SQL statements • Robust SQL support and advanced analytic functions • Extensions to the SQL standard to work seamlessly with streaming data • Support for at-least-once processing semantics
  18. Amazon Kinesis Analytics Core Concepts: Continuously deliver SQL results • Send processed data to multiple destinations • Amazon S3, Amazon Redshift, Amazon ES (through Firehose) • Streams (with AWS Lambda integration for custom destinations) • End-to-end processing speed as low as sub-second • Separation of processing and data delivery
  19. Sample Architecture (diagram labels) • Toll Data (From KDG) • Kinesis Firehose Ingestion Stream • S3 Bucket • Kinesis Analytics • Anomaly Detection • Kinesis Anomaly Stream • Lambda Function • SMS Alerts on Anomaly
  20. Streaming Analytics Demo
  21. THANK YOU! Additional Resources: http://amzn.to/2AUaPvi Whitepaper: “Streaming Data Solutions on AWS”
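
Slide 9 mentions publishing from application code through the PutRecord and PutRecords APIs. The following is a minimal sketch of the batched variant, not the workshop's own code. It assumes a client object that exposes boto3's `put_records(StreamName=..., Records=[...])` call; the helper names (`to_entries`, `put_readings`), the `station_id` field, and the stream name are hypothetical and chosen only for illustration.

```python
import json
from typing import Any, Dict, Iterable, List

# PutRecords accepts at most 500 records per request.
MAX_BATCH = 500


def to_entries(readings: Iterable[Dict[str, Any]],
               key_field: str = "station_id") -> List[Dict[str, Any]]:
    """Turn toll readings into PutRecords entries (Data + PartitionKey).

    `key_field` is a hypothetical field name; records that share a
    partition key land on the same shard and keep their order.
    """
    return [
        {"Data": json.dumps(r).encode("utf-8"), "PartitionKey": str(r[key_field])}
        for r in readings
    ]


def put_readings(client: Any, stream_name: str,
                 readings: Iterable[Dict[str, Any]]) -> int:
    """Send readings in batches and return how many records were rejected.

    `client` is assumed to behave like a boto3 Kinesis client. A real
    producer would retry the rejected records; this sketch only counts them.
    """
    entries = to_entries(readings)
    failed = 0
    for start in range(0, len(entries), MAX_BATCH):
        batch = entries[start:start + MAX_BATCH]
        response = client.put_records(StreamName=stream_name, Records=batch)
        failed += response.get("FailedRecordCount", 0)
    return failed
```

With boto3 installed and credentials configured, a call might look like `put_readings(boto3.client("kinesis"), "toll-stream", readings)`. Passing the client in as an argument keeps the batching logic testable without an AWS account.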

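
Slides 10 and 19 describe a Lambda function that reads from the Kinesis anomaly stream and sends alerts. The sketch below shows one way such a function could be shaped, assuming the function is triggered by a Kinesis stream (so each record's payload arrives base64-encoded under `kinesis.data`) and that alerts go out through an SNS topic via boto3's `publish(TopicArn=..., Subject=..., Message=...)`. The `anomaly_score` field, the threshold value, and the function names are assumptions for illustration, not values taken from the workshop.

```python
import base64
import json
from typing import Any, Dict, List

# Hypothetical cut-off; the workshop's actual threshold is not given.
ANOMALY_THRESHOLD = 2.0


def decode_records(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Decode the base64 JSON payloads of a Kinesis-triggered Lambda event."""
    return [
        json.loads(base64.b64decode(rec["kinesis"]["data"]))
        for rec in event.get("Records", [])
    ]


def find_anomalies(readings: List[Dict[str, Any]],
                   threshold: float = ANOMALY_THRESHOLD) -> List[Dict[str, Any]]:
    """Keep readings whose (assumed) anomaly_score exceeds the threshold."""
    return [r for r in readings if r.get("anomaly_score", 0.0) > threshold]


def handle(event: Dict[str, Any], sns_client: Any, topic_arn: str) -> int:
    """Publish one SNS message per anomalous reading; return the alert count.

    `sns_client` is assumed to behave like a boto3 SNS client.
    """
    anomalies = find_anomalies(decode_records(event))
    for reading in anomalies:
        sns_client.publish(
            TopicArn=topic_arn,
            Subject="Toll sensor anomaly",
            Message=json.dumps(reading),
        )
    return len(anomalies)
```

In a deployed function, a thin `lambda_handler(event, context)` wrapper could create the client with `boto3.client("sns")` and read the topic ARN from an environment variable before calling `handle`. SMS delivery then depends on phone numbers being subscribed to that topic.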