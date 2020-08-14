

Getting started with streaming analytics: streaming basics (1 of 3)

In this webinar we explain some of the problems of streaming analytics and why they differ from those of batch/big data analytics. We then introduce some basic streaming concepts, such as event queues, event processors, event time vs processing time, and delivery guarantees. We end this first part of the series by presenting a few of the most common open source components for streaming (Kafka, Spark, Flink, Cassandra, and Elasticsearch) and the different options you have to run them on AWS.

Published in: Data & Analytics
License: CC Attribution-NonCommercial-ShareAlike License

  1. Getting started with streaming analytics
     Part 1 of 3: The basics of real-time streaming analytics
     Javier Ramirez, AWS Developer Advocate, @supercoco9
     © 2020, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Confidential and Trademark
  2. Agenda
     • Why real-time analytics and data streaming?
     • Challenges of streaming analytics
     • Useful concepts to reason about streaming data
     • Components of a streaming analytics pipeline
     • Overview of popular Open Source components for streaming analytics: Apache Kafka, Apache Spark, Apache Flink, Apache Cassandra, Apache HBase, Elasticsearch
     • AWS toolbox for streaming analytics: Amazon MSK, Amazon EMR, Amazon Kinesis, Amazon Keyspaces, Amazon DynamoDB, Amazon Elasticsearch Service
  3. Why streaming analytics
     • The number of “smart” devices is projected to be 200 billion by 2020 (over 100X increase in ten years)
     • 90% of the data in the world was generated in the last 2 years
     • There are 2.5 quintillion bytes of data created each day, and this pace is accelerating
     Sources: BI Intelligence Estimates; Forbes, How much data do we produce
     Data streaming technology enables a customer to ingest, process, and analyze high volumes of high-velocity data from a variety of sources
  4. The value of data diminishes over time
     [Figure: information half-life in decision-making. x-axis: time (real time, seconds, minutes, hours, days, months); y-axis: value of data to decision-making. Curve labels: preventive/predictive, actionable, reactive, historical. Regions: time-critical decisions; traditional “batch” business intelligence. Source: Perishable insights, Mike Gualtieri, Forrester]
  5. Can’t I just use batch big data analytics tools?
     https://aws.amazon.com/streaming-data/
  6. Can’t I just use batch big data analytics tools?
     • Data is never complete
     • You don’t know the volume of the data before you start
     • Low latency is expected
     • Data can come out of order
     • System should remain available during upgrades
  7. A simple problem (until you know the details)
     • I want to calculate the total and average of several numbers
  8. A simple big data problem (until you know the details)
     • I want to calculate the total and average of several numbers
     • They might be MANY numbers, more than you can store in memory, or in a single hard drive
  9. A simple streaming problem
     • I want to calculate the total and average of several numbers
     • They might be MANY numbers, more than you can store in memory, or in a single hard drive
     • The dataset is not static; new numbers are coming all the time
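
The simple streaming problem above can be sketched in a few lines. This is a minimal illustration, not the approach used in the webinar: it assumes readings arrive one at a time and that only the running count and sum need to be kept, so memory use stays constant no matter how many numbers arrive. The `RunningStats` name and the sample readings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running total and average over an unbounded stream, in constant memory."""

    count: int = 0
    total: float = 0.0

    def add(self, value: float) -> None:
        # Fold one new reading into the aggregate; the reading itself is not stored.
        self.count += 1
        self.total += value

    @property
    def average(self) -> float:
        # Defined as 0.0 for an empty stream to avoid dividing by zero.
        return self.total / self.count if self.count else 0.0


stats = RunningStats()
for reading in (4.0, 8.0, 6.0):  # stand-in for a source that never ends
    stats.add(reading)
    print(stats.count, stats.total, stats.average)
```

The later slides add the parts this sketch ignores: distribution across machines, out-of-order data, and per-period totals.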
  10. A simplish streaming problem
      • I want to calculate the total and average of several numbers
      • They might be MANY numbers, more than you can store in memory, or in a single hard drive
      • The dataset is not static; new numbers are coming all the time
      • The numbers come from different sensors, which are geo-distributed and moving. We will be adding and removing sensors all the time
  11. A quite standard streaming problem
      • I want to calculate the total and average of several numbers
      • They might be MANY numbers, more than you can store in memory, or in a single hard drive
      • The dataset is not static; new numbers are coming all the time
      • The numbers come from different sensors, which are geo-distributed and moving. We will be adding and removing sensors all the time
      • And since they use 3G and batteries, some might go quiet for a while and then send a bunch of stale data
  12. An elastic and scalable streaming problem
      • I want to calculate the total and average of several numbers
      • They might be MANY numbers, more than you can store in memory, or in a single hard drive
      • The dataset is not static; new numbers are coming all the time
      • The numbers come from different sensors, which are geo-distributed and moving. We will be adding and removing sensors all the time
      • And since they use 3G and batteries, some might go quiet for a while and then send a bunch of stale data
      • Flow will not be constant (from a few events per second to thousands)
  13. An almost real-life streaming analytics scenario
      • I want to calculate the total and average of several numbers
      • They might be MANY numbers, more than you can store in memory, or in a single hard drive
      • The dataset is not static; new numbers are coming all the time
      • The numbers come from different sensors, which are geo-distributed and moving. We will be adding and removing sensors all the time
      • And since they use 3G and batteries, some might go quiet for a while and then send a bunch of stale data
      • Flow will not be constant (from a few events per second to thousands)
      • And I don’t want just the overall total and average, but also the total per month, per week, per day, per hour, per minute…
  14. A real business use case for streaming
      • I want to calculate the total and average of several numbers
      • They might be MANY numbers, more than you can store in memory, or in a single hard drive
      • The dataset is not static; new numbers are coming all the time
      • The numbers come from different sensors, which are geo-distributed and moving. We will be adding and removing sensors all the time
      • And since they use 3G and batteries, some might go quiet for a while and then send a bunch of stale data
      • Flow will not be constant (from a few events per second to thousands)
      • And I don’t want just the overall total and average, but also the total per month, per week, per day, per hour, per minute…
      • We need pretty dashboards with current status, comparison with the past, trends, and anomaly detection
      • To run this reliably, we need advanced monitoring, alerts, and autoscaling
      • No, I am not hiring a whole new operations team to manage the system
  16. http://gunshowcomic.com/648
  17. Probably less than you think
      • ~20 lines of Java code (plus a few hundred with imports, POJOs, and boilerplate, because Java)
      • OR a simple GROUP BY statement in SQL with streaming extensions (plus a few lines of boilerplate for schema definition)
  18. Streaming analytics concepts
  19. Streaming data pipeline overview
      Stages: Ingest, Transform, Analyze, React, Persist
      What are the key requirements?
      • Durable
      • Stateful
      • Continuous
      • Fast
      • Correct
      • Reactive
      • Reliable
  20. Durability and reliability
      • Need to store intermediate data
      • You might want to be able to replay the stream
      • Self-healing architecture. If one component goes down while data is in-flight, the system needs to re-balance and data needs to be reassigned seamlessly
      • Monitoring
  21. Stateful processing
      • Working on per-element streams is relatively easy (e.g. changing the format of each item, or filtering out records based on their own properties)
      • The real fun starts when you need to do transforms/aggregations over groups of elements: group by, count, max, average, joins, filtering based on properties from related records, or complex pattern detection
      [Figure: elements over processing time, 8:00 to 14:00. Graphics from The Beam Model, by Tyler Akidau and Frances Perry. https://beam.apache.org/community/presentation-materials/]
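
To make the contrast on this slide concrete, here is a small sketch under stated assumptions: the event shape, the `-999.0` sentinel, and the `MaxPerSensor` operator are all hypothetical. The filter and the format change look only at the current element, while the max-per-sensor aggregation has to remember something between elements, and that remembered state is what a stream processor must keep durable.

```python
events = [  # hypothetical sensor readings
    {"sensor": "a", "temp_c": 20.0},
    {"sensor": "b", "temp_c": -999.0},  # assumed sentinel for a failed reading
    {"sensor": "a", "temp_c": 22.0},
    {"sensor": "b", "temp_c": 18.0},
]


def is_valid(event):
    # Stateless filter: decided from the record's own properties only.
    return event["temp_c"] > -100.0


def to_fahrenheit(event):
    # Stateless transform: changes the format of a single item.
    return {"sensor": event["sensor"], "temp_f": event["temp_c"] * 9 / 5 + 32}


class MaxPerSensor:
    """Stateful operator: keeps one running maximum per sensor key."""

    def __init__(self):
        self.max_by_sensor = {}

    def process(self, event):
        key = event["sensor"]
        previous = self.max_by_sensor.get(key)
        if previous is None or event["temp_f"] > previous:
            self.max_by_sensor[key] = event["temp_f"]
        return self.max_by_sensor[key]


operator = MaxPerSensor()
for raw in events:
    if is_valid(raw):
        operator.process(to_fahrenheit(raw))

print(operator.max_by_sensor)
```

If the process holding `max_by_sensor` crashes, the stateless steps lose nothing, but the aggregation loses its history unless the state is checkpointed or the stream can be replayed.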
  22. Continuous and fast
      • Data can come in spikes, faster than we can process it. Need to account for reliable persistent storage while in-flight
      • You will need to think about how to update a system that never stops receiving data
      • Since data is never complete, in the case of stateful computations, we need to decide when to output data (windowing)
  23. Processing-Time based windows
      [Figure: windows over processing time, 8:00 to 14:00. Graphics from The Beam Model, by Tyler Akidau and Frances Perry. https://beam.apache.org/community/presentation-materials/]
  24. Event-Time Based Windows
      [Figure: input and output panels plotting event time against processing time, 10:00 to 15:00. Graphics from The Beam Model, by Tyler Akidau and Frances Perry. https://beam.apache.org/community/presentation-materials/]
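
The difference between processing-time and event-time windows can be shown with a small sketch. It assumes fixed one-hour (tumbling) windows and records that carry both the time the event happened and the time it was processed; the record values and helper names are hypothetical and times are in seconds since midnight.

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # one-hour tumbling windows (assumed size)


def window_start(timestamp):
    # Every timestamp belongs to exactly one fixed-size window.
    return timestamp - timestamp % WINDOW_SECONDS


# (event_time, processing_time, value)
records = [
    (10 * 3600 + 50 * 60, 11 * 3600 + 5 * 60, 3),   # happened 10:50, processed 11:05
    (11 * 3600 + 10 * 60, 11 * 3600 + 11 * 60, 5),  # happened 11:10, processed 11:11
    (11 * 3600 + 40 * 60, 12 * 3600 + 2 * 60, 7),   # happened 11:40, processed 12:02
]


def sum_by_window(rows, time_index):
    # time_index 0 groups by event time, 1 groups by processing time.
    totals = defaultdict(int)
    for row in rows:
        totals[window_start(row[time_index])] += row[2]
    return dict(totals)


by_event_time = sum_by_window(records, 0)
by_processing_time = sum_by_window(records, 1)
print(by_event_time)
print(by_processing_time)
```

The same three records give different hourly totals depending on which clock defines the window, which is why delayed data makes processing-time results drift from what actually happened.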
  25. Session Windows
      [Figure: input and output panels plotting event time against processing time, 10:00 to 15:00. Graphics from The Beam Model, by Tyler Akidau and Frances Perry. https://beam.apache.org/community/presentation-materials/]
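
Session windows have no fixed size: a session lasts as long as events keep arriving, and closes after a period of inactivity. A minimal sketch, assuming a single key and that all event times are already available (a real engine does this incrementally and per key); the `sessionize` function and the 60-second gap are hypothetical.

```python
def sessionize(timestamps, gap):
    """Group event times into sessions separated by more than `gap` of inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            # Close enough to the previous event: extend the current session.
            sessions[-1].append(ts)
        else:
            # Too long since the last event (or first event): start a new session.
            sessions.append([ts])
    return sessions


clicks = [400, 0, 20, 1000, 50, 410]  # event times in seconds, out of order
print(sessionize(clicks, gap=60))
```

Sorting by event time first is what lets out-of-order input still land in the right session.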
  26. Correctness: Late-arriving data
      Event-time vs Processing-time
      [Figure: graphics from The Beam Model, by Tyler Akidau and Frances Perry. https://beam.apache.org/community/presentation-materials/]
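
One common way to handle late-arriving data is a watermark: an estimate of how far event time has progressed, after which a window is considered complete. The sketch below is an illustration only. It assumes a simple heuristic (watermark = highest event time seen minus an allowed lateness) and drops anything that arrives for a window that has already closed; the class and parameter names are hypothetical.

```python
class WatermarkedWindows:
    """Tumbling event-time sums that close once the watermark passes them."""

    def __init__(self, window_size, allowed_lateness):
        self.window_size = window_size
        self.allowed_lateness = allowed_lateness
        self.watermark = 0
        self.open_windows = {}  # window start -> running sum
        self.emitted = []       # (window start, final sum)
        self.dropped = []       # events that arrived after their window closed

    def on_event(self, event_time, value):
        start = event_time - event_time % self.window_size
        if start + self.window_size <= self.watermark:
            # The window this event belongs to was already emitted.
            self.dropped.append((event_time, value))
            return
        self.open_windows[start] = self.open_windows.get(start, 0) + value
        # Assumed heuristic: event time has progressed to the newest event
        # seen so far, minus the lateness we are willing to wait for.
        self.watermark = max(self.watermark, event_time - self.allowed_lateness)
        for window in sorted(self.open_windows):
            if window + self.window_size <= self.watermark:
                self.emitted.append((window, self.open_windows.pop(window)))


windows = WatermarkedWindows(window_size=60, allowed_lateness=30)
for event_time in (10, 70, 50, 130, 20):  # 50 is late but tolerated, 20 is too late
    windows.on_event(event_time, 1)

print(windows.emitted, windows.dropped, windows.open_windows)
```

A larger allowed lateness drops fewer events but delays every result, so the value is a trade-off between correctness and latency.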
  27. Correctness: Delivery semantics
      • Exactly once
      • At least once
      • At most once
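
The practical effect of these guarantees shows up when a message is redelivered. The sketch below assumes at-least-once delivery, where every message arrives but some arrive twice, and shows one way a consumer can still produce exactly-once results: by making its processing idempotent. The message IDs and the `IdempotentCounter` class are hypothetical, and a real system would have to persist the set of seen IDs.

```python
class IdempotentCounter:
    """Sums amounts while ignoring redelivered messages."""

    def __init__(self):
        self.seen_ids = set()
        self.total = 0

    def handle(self, message_id, amount):
        if message_id in self.seen_ids:
            return False  # duplicate delivery: already applied, skip it
        self.seen_ids.add(message_id)
        self.total += amount
        return True


# At-least-once delivery: m1 and m2 are each delivered twice.
deliveries = [("m1", 10), ("m2", 5), ("m1", 10), ("m3", 1), ("m2", 5)]

naive_total = sum(amount for _, amount in deliveries)  # counts duplicates

counter = IdempotentCounter()
for message_id, amount in deliveries:
    counter.handle(message_id, amount)

print(naive_total, counter.total)
```

With at-most-once delivery the opposite failure occurs: no duplicates, but a lost message is never retried, so the total can come out too low.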
  28. Reactive
      • All the components need to be designed for low latency
      [Figure: information half-life in decision-making (value of data to decision-making over time, from real time to months). Source: Perishable insights, Mike Gualtieri, Forrester]
  29. Components of a streaming analytics pipeline
  30. Streaming analytics components
      • Devices and/or applications that produce real-time data at high velocity
      • Data from tens of thousands of data sources can be written to a single stream
      • Data are stored in the order they were received for a set duration of time and can be replayed indefinitely during that time
      • Records are read in the order they are produced, enabling real-time analytics or streaming ETL
      • Database (NoSQL most common), Message broker, Notification system, File Storage, or Data Lake
      • Analytics dashboard
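
The stream storage behavior described on this slide (records kept in arrival order for a set duration, replayable during that time) can be sketched as an append-only log. This is a toy in-memory model under stated assumptions, not how any particular service is implemented; `StreamLog` and its methods are hypothetical names.

```python
class StreamLog:
    """Append-only log with offsets, time-based retention, and replay."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._records = []  # (offset, arrival_time, payload), in arrival order
        self._next_offset = 0

    def append(self, payload, now):
        offset = self._next_offset
        self._records.append((offset, now, payload))
        self._next_offset += 1
        return offset

    def expire(self, now):
        # Drop records older than the retention period; offsets are never reused.
        cutoff = now - self.retention_seconds
        self._records = [r for r in self._records if r[1] >= cutoff]

    def read_from(self, offset):
        # Any consumer can replay from any retained offset, as often as needed.
        return [(o, p) for o, _, p in self._records if o >= offset]


log = StreamLog(retention_seconds=100)
log.append("a", now=0)
log.append("b", now=50)
log.append("c", now=120)

first_read = log.read_from(0)
replay = log.read_from(1)   # a second consumer, or a restart, re-reads from offset 1
log.expire(now=130)         # "a" is now older than the retention period
after_expiry = log.read_from(0)
print(first_read, replay, after_expiry)
```

Because reading does not remove records, several independent consumers can process the same stream at their own pace.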
  31. The (excellent) Open Source ecosystem
  32. Ingestion/in-stream storage: Apache Kafka
      A distributed streaming platform
      Concepts: Producers, Topics, Brokers, Consumers
  33. Ingestion/in-stream storage: Apache Flume
      Distributed, reliable, and available service for collecting, aggregating, and moving large amounts of log data
  34. Stream Processing: Apache Spark
      Unified Analytics Engine for large-scale data processing
      Concepts: Driver/Workers, Data Source, Discretized Stream, Transforms, Streaming SQL, Outputs
  35. Stream Processing: Apache Spark
      Unified Analytics Engine for large-scale data processing
  36. Stream Processing: Apache Flink
      Stateful computation over Data Streams
      Concepts: Job Manager/Workers, Source, DataStream, Transforms/Operators, TableAPI/SQL, Sinks
  37. Stream Processing: Apache Flink
      Stateful computation over Data Streams
  38. Stream Processing: Apache Flink
      Stateful computation over Data Streams
  39. Stream Storage: Apache Cassandra
      Manage massive amounts of data, fast, without losing sleep
      https://cassandra.apache.org/
      Concepts: Nodes, Token Ring, Consistency Levels, Column Families
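
Two of the concepts listed here, the token ring and consistency levels, can be illustrated with a toy model. This is a sketch under loud assumptions, not Cassandra's implementation: it uses an MD5-derived token instead of Cassandra's default Murmur3 partitioner, gives each node a single token (no virtual nodes), and ignores racks and datacenters. `TokenRing` and `quorum` are hypothetical names.

```python
import bisect
import hashlib


class TokenRing:
    """Toy token ring: a key is owned by the next node clockwise from its token."""

    def __init__(self, nodes, replication_factor=2):
        self.replication_factor = replication_factor
        self._ring = sorted((self._token(node), node) for node in nodes)
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def _token(key):
        # Assumption: MD5 stands in for the real partitioner's hash function.
        return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

    def replicas(self, partition_key):
        # First replica: first node whose token is >= the key's token, wrapping around.
        start = bisect.bisect_left(self._tokens, self._token(partition_key)) % len(self._ring)
        count = min(self.replication_factor, len(self._ring))
        # Remaining replicas: the next nodes walking clockwise around the ring.
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


def quorum(replication_factor):
    # A quorum is a majority of replicas.
    return replication_factor // 2 + 1


nodes = ["node-a", "node-b", "node-c", "node-d"]
ring = TokenRing(nodes, replication_factor=3)
owners = ring.replicas("sensor-42")
print(owners, "quorum:", quorum(3))
```

Because placement depends only on the key's token, any node can work out where a row lives without asking a central coordinator.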
  40. Stream Storage: Apache Cassandra
      Manage massive amounts of data, fast, without losing sleep
  41. Stream Storage: Apache HBase
      The Hadoop database, a distributed, scalable, big data store
      https://hbase.apache.org/book.html
      “First, make sure you have enough data. If you have hundreds of millions or billions of rows, then HBase is a good candidate. If you only have a few thousand/million rows, then using a traditional RDBMS might be a better choice due to the fact that all of your data might wind up on a single node (or two) and the rest of the cluster may be sitting idle.”
      Concepts: HBase Master, Regions, Region Servers, Data Nodes, Column Families
  42. Dashboard: Elasticsearch with Kibana
      Elasticsearch is a distributed JSON-based search and analytics engine. Kibana gives shape to your data
      https://www.elastic.co/kibana
      Wikimedia has a live interactive dashboard powered by Kibana at https://wikimedia.biterg.io/
      Concepts: Master Node, Data Nodes, Shard, Index
  43. Dashboard: Grafana
      Grafana allows you to query, visualize, alert on and understand your metrics no matter where they are stored.
      https://grafana.com/grafana/
      Wikimedia also has a live interactive metrics dashboard powered by Grafana at https://grafana.wikimedia.org/
      Concepts: Data Source
  44. Challenges of data streaming components
      • Difficult to set up
      • Tricky to scale
      • Hard to achieve high availability
      • Integration requires development
      • Error prone and complex to manage
      • Expensive to maintain
  45. AWS services for streaming analytics
      Both managed services and native services
  46. Streaming real-time data with AWS
      * Some services scale up and down elastically, while others allow you to automate when to scale up/down
      ** It is possible to have a serverless data streaming pipeline, in which you pay only for what you use. In the case of managed non-serverless services, you can dynamically adapt to your traffic
  47. Services for Ingestion/in-stream storage
      • Amazon Managed Streaming for Apache Kafka: Fully managed version of Apache Kafka
      • Amazon Kinesis Data Streams: Massively scalable, elastic, and durable real-time data streaming
      • Amazon Kinesis Data Firehose: The easiest way to reliably load streaming data into data lakes, data stores, and analytics services
      • AWS Glue with serverless streaming: Simple, flexible, and cost-effective ETL
  48. Services for stream processing
      • Amazon Kinesis Data Analytics for Apache Flink: Fully managed, elastic version of Apache Flink
      • Amazon Kinesis Data Analytics for SQL Applications: Process and analyze streaming data using standard SQL
      • Amazon EMR: Easily run and scale Apache Spark and other big data frameworks. You can also run Apache Flink and Apache HBase on EMR
      • AWS Glue with serverless streaming: Simple, flexible, and cost-effective ETL. Supports Spark for serverless ETL
  49. Services for stream storage
      • Amazon Keyspaces for Apache Cassandra: Scalable, highly available, and managed Apache Cassandra compatible database service
      • Amazon DynamoDB: Fast and flexible NoSQL database service for any scale (for example, in 2017 Samsung Cloud Service was serving 300M users with a total storage of 860TB)
      • Amazon EMR: Easily run and scale Apache HBase and other big data frameworks. You can also run Apache Flink and Apache Spark on EMR
  50. Services for analytics dashboards
      • Amazon Elasticsearch Service: Fully managed, scalable, and secure Elasticsearch service
      • Amazon QuickSight: Fast, cloud-powered business intelligence service that makes it easy to deliver insights to everyone in your organization
  51. A serverless data stream (per element processing)
      Components shown: data producer, Kinesis Data Streams, Lambda service, Lambda function A, Lambda function B, Amazon SNS, DynamoDB
      • The data producer continuously streams data
      • The Lambda service continuously polls for new data, 1 poll per second
      • It automatically invokes your function(s) when data is found
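
A per-element Lambda function for this pipeline could look like the sketch below. It relies on the documented shape of a Kinesis event, where each entry in `Records` carries a base64-encoded payload under `kinesis.data`. Everything about the payload itself is an assumption: the JSON format and the `sensor_id` and `value` fields are hypothetical, and the write to DynamoDB or SNS is left out so the example runs without AWS credentials.

```python
import base64
import json


def handler(event, context):
    """Decode each Kinesis record in the batch and return one item per record."""
    items = []
    for record in event["Records"]:
        # Kinesis delivers the producer's bytes base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        reading = json.loads(payload)  # assumption: producers send JSON
        items.append({
            "sensor_id": reading["sensor_id"],  # hypothetical field
            "value": reading["value"],          # hypothetical field
            "sequence_number": record["kinesis"]["sequenceNumber"],
        })
    # A real function would write `items` to its destination here.
    return items


def fake_kinesis_event(readings):
    # Test helper (hypothetical): wraps readings the way Kinesis would deliver them.
    return {
        "Records": [
            {
                "eventSource": "aws:kinesis",
                "kinesis": {
                    "partitionKey": reading["sensor_id"],
                    "sequenceNumber": str(index),
                    "data": base64.b64encode(json.dumps(reading).encode("utf-8")).decode("ascii"),
                },
            }
            for index, reading in enumerate(readings)
        ]
    }


result = handler(fake_kinesis_event([{"sensor_id": "s1", "value": 21.5}]), None)
print(result)
```

Each invocation sees only its own batch, so this style fits per-element work; aggregations across batches need state kept somewhere else.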
  52. Fully managed stateful streaming analytics
  53. Getting Started
      • A great write-up on streaming analytics challenges: https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
      • Streaming data: https://aws.amazon.com/streaming-data/
      • Getting started with Apache Kafka/Amazon MSK: https://docs.aws.amazon.com/msk/latest/developerguide/what-is-msk.html
      • Amazon Kinesis services for streaming data: https://aws.amazon.com/kinesis/
      • Amazon Elasticsearch Service: https://aws.amazon.com/elasticsearch-service/
      • Research about models and issues in data stream systems: https://dl.acm.org/doi/10.1145/543613.543615
  54. Thanks
      Javier Ramirez, AWS Developer Advocate, @supercoco9
