
Managing Data with Volume, Velocity, and Variety with Amazon ElastiCache for Redis



  1. Michael Labib, Specialist Solutions Architect, August 2016: Managing IoT and Time Series Data with Amazon ElastiCache for Redis
  2. Learning Objectives
     - Understand Time Series Data & Challenges
     - Learn about Amazon ElastiCache for Time Series Data
     - Learn how to build Time Series Solutions with Amazon ElastiCache
     - Explore a Sensor Data Demonstration
  3. Time Series Data
  4. Time Series Data – What is it?
     - Device / Sensor Data
     - Website Clickstream events
     - Logging and metrics
     - Social Media and sentiment analysis
     - Can be analyzed upon collection or in batches
  5. Time Series Data – Challenges
     - What type of database should you use?
       - Relational or NoSQL
     - How should you model the data to support the queries and analysis needed?
     - What types of aggregations will you need?
     - What type of information do you need to gather?
     - How will the applications access the data?
     - How do you build the solution in a cost-effective manner?
     - How long should you retain the data?
     - How do you manage scalability as the amount of data and the number of sensors grow?
     - And so on…
  6. Database Families: Relational
     + Mature Technology
     + SQL is widely adopted
     + Durable and Consistent
     - Inflexible data model
     - Hard to scale
     - Performance limitations
       - Updates and other changes can be slow
       - Limited to speed of storage technology
     [Figure legend: 1 = Row, 2 = Column]
  7. Database Families: NoSQL
     + 'Schema-less' = flexible data model
     + Quickly adapt to new requirements
     + Easier to scale
     + Performance
     + Generally higher throughput and lower latency than Relational DB
     - Can be less durable and consistent
     - Query tools less mature
  8. Database Families: NoSQL, In-Memory Key-Value
     + Very fast performance
     + Sub-millisecond reads and writes
     + High throughput
     + Schema-less: flexible data model
     + Redis: advanced data structures
     - Still emerging technology
     - Query and other tools less mature
     - Fewer connectors than other database options
     - Less durable
  9. [Chart: data stores positioned by Request Rate (High to Low), Latency (Low to High), Structure (Low to High), and Data Volume (Low to High). Stores shown: Amazon RDS, Amazon S3, Amazon Glacier, Amazon CloudSearch and Amazon Elasticsearch, Amazon DynamoDB, Amazon ElastiCache, HDFS]
  10. Ingesting Time Series Data with Amazon ElastiCache
      [Diagram labels: Device Data, Streaming Real-time Data, AWS IoT, Amazon Kinesis Streams, Data Processor, AWS Lambda, Amazon EC2, Datastore, Amazon ElastiCache for Redis]
  11. Amazon ElastiCache Overview
  12. In-Memory Key-Value Store
      - High-performance
      - Redis and Memcached
      - Fully managed; Zero admin
      - Highly Available and Reliable
      - Hardened by Amazon
  13. Redis – The In-Memory Leader
      - Powerful: ~200 commands + Lua scripting
      - In-memory data structure server
      - Utility data structures: strings, lists, hashes, sets, sorted sets, bitmaps & HyperLogLogs
      - Simple
      - Atomic operations: supports transactions, has ACID properties
      - Ridiculously fast! <1ms latency for most commands
      - Highly Available: replication
      - Persistent: snapshots or append-only log
      - Open Source
  14. Fully managed service = Automated Operations
      [Diagram labels: Redis datastore hosted on Amazon EC2; Amazon ElastiCache for Redis]
  15. ElastiCache - Customer Value
      - Extreme Performance
        - Sub-millisecond access latencies
        - Engineered for Cloud Scale
      - Open Source Compatible
        - Compatible with Redis and Memcached
        - Existing code will work when you update node endpoints
      - Fully Managed
        - Automates tasks such as failed node replacement, software patching, upgrades and backups
        - CloudWatch enables you to monitor cache performance metrics
      - Secure and Hardened
        - Supports Amazon VPC and IAM for secure and fine-grained access
        - Monitors your nodes and applies security patches when necessary
      - Highly Available and Scalable
        - Multi-AZ with automatic failover to a read replica, no human intervention required
        - Easily scale your Redis (vertically) and Memcached (horizontally) environments
      - Cost Effective
        - Pay as low as US$0.017 per hour
        - Get started with 750 free hours per month of a micro node for a year
        - No cross availability zone data transfer costs
  16. Building Time Series Solutions with Amazon ElastiCache
  17. Streaming Data
      [Diagram labels: Data Sources, Amazon Kinesis Streams, AWS Lambda, Amazon ElastiCache, Amazon DynamoDB, Amazon EC2, Dashboard, Hot Data, Longer Retention]
  18. Streaming Data Analytics
      [Diagram labels: Data Sources, Amazon Kinesis Streams, Amazon EMR (Spark Streaming), Spark Redis Connector, Amazon ElastiCache, Amazon S3, Data Lake, Amazon EC2 (Dashboard), Amazon Redshift]
  19. Capturing Moving Data
      - Equity prices, clickstreams, sensor streams …
      - Amazon Kinesis and AWS Lambda are serverless
      - Amazon Kinesis + Amazon ElastiCache for Redis ensure order among incoming data
        - Kinesis provides an ordered Stream
        - Redis provides a Sorted Set
      - Fault Tolerant
        - Data Retention: data is accessible in Kinesis for 24 hours after it has been added to the stream, and retention can be extended up to 7 days
        - ElastiCache provides HA with automatic failover to a read replica
      - ElastiCache is fast and flexible
        - Can query and retrieve hot data quickly
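The ordering point on this slide can be illustrated with a small stand-in for a Redis Sorted Set. The sketch below is plain Python rather than a Redis client: `TimeSeriesSet` and its method names are hypothetical and only mimic the semantics of ZADD and ZRANGEBYSCORE (one score per member, members ordered by score), so readings that arrive out of order still read back in timestamp order.

```python
import bisect


class TimeSeriesSet:
    """In-memory stand-in for a Redis Sorted Set (illustration only)."""

    def __init__(self):
        self._scores = {}   # member -> score
        self._ordered = []  # list of (score, member), kept sorted

    def zadd(self, member, score):
        # Like ZADD: adding an existing member replaces its score.
        if member in self._scores:
            self._ordered.remove((self._scores[member], member))
        self._scores[member] = score
        bisect.insort(self._ordered, (score, member))

    def zrangebyscore(self, low, high):
        # Like ZRANGEBYSCORE with inclusive bounds on both ends.
        start = bisect.bisect_left(self._ordered, (low,))
        result = []
        for score, member in self._ordered[start:]:
            if score > high:
                break
            result.append(member)
        return result


# Readings arrive out of order; the timestamp is used as the score.
events = TimeSeriesSet()
for timestamp, reading_id in [(1003, "r3"), (1001, "r1"), (1002, "r2")]:
    events.zadd(reading_id, timestamp)

in_order = events.zrangebyscore(1001, 1003)
print(in_order)
```

Using the event timestamp as the score is what makes time-range queries cheap: the range lookup only touches the slice of members whose scores fall inside the requested window.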
  20. ElastiCache for Streaming Data Analytics
      - Access all Redis Data structures directly from Spark
      - Significantly accelerate Spark performance over disk-based solutions
      - Simplify analytics by leveraging Redis Data Structures, reducing processing overhead and complexity
      - Improve application performance by accessing aggregations and hot data from Redis
  21. Event Driven Data
      - Capture time-ordered changes from a DynamoDB table using DynamoDB Streams and propagate to ElastiCache
      - Event Sources: Amazon DynamoDB Streams, Amazon SES, Amazon S3, Amazon Cognito, Amazon SNS, Amazon CloudWatch, Amazon Kinesis Streams, Amazon API Gateway
      [Diagram labels: AWS Lambda, Hot Data, Amazon ElastiCache, Longer Retention, Amazon DynamoDB]
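As a rough illustration of propagating DynamoDB Streams changes to a cache, the sketch below turns a stream event into a list of cache operations. It relies on the stream record layout (`eventName`, `dynamodb.Keys`, `dynamodb.NewImage`, typed attribute values such as `{"N": "71.5"}`). The table key name `deviceId`, the helper names, and the choice to express results as `("HSET", ...)` / `("DEL", ...)` tuples are assumptions made for this example; a real Lambda function would send the operations to ElastiCache with a Redis client, and `NewImage` is only present when the stream view type includes new images.

```python
def unwrap(attr):
    """Convert a DynamoDB-typed attribute, e.g. {"N": "71.5"}, to a Python value."""
    type_tag, value = next(iter(attr.items()))
    if type_tag == "N":
        return float(value)
    return value  # "S" and other types pass through unchanged in this sketch


def stream_records_to_cache_ops(event):
    """Map each stream record to a cache operation, preserving record order."""
    ops = []
    for record in event["Records"]:
        keys = record["dynamodb"]["Keys"]
        device_id = unwrap(keys["deviceId"])  # hypothetical key name
        if record["eventName"] == "REMOVE":
            ops.append(("DEL", device_id))
        else:  # INSERT or MODIFY
            image = record["dynamodb"]["NewImage"]
            fields = {name: unwrap(attr) for name, attr in image.items()}
            ops.append(("HSET", device_id, fields))
    return ops


sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"deviceId": {"S": "2221"}},
                "NewImage": {
                    "deviceId": {"S": "2221"},
                    "temperature": {"N": "71.5"},
                },
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"deviceId": {"S": "2222"}}},
        },
    ]
}

cache_ops = stream_records_to_cache_ops(sample_event)
print(cache_ops)
```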
  22. Responding to Events
      - Can cover a huge variety of different data sources
        - Security events, application events, messaging events, user actions…
        - For IoT, often created when a sensor crosses a threshold
      - Lambda captures the incoming event data and routes it
        - Hot data to ElastiCache
        - Cold data to slower, disk-based databases
        - Lambda function to keep databases synchronized
      - Cost effective
        - Use the speed and flexibility of ElastiCache for hot data queries
          - No incremental cost per query; price is driven by the amount of memory available
        - Use DynamoDB or RDS for longer data retention
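One possible routing policy for the hot and cold paths is sketched below. The slide does not say how "hot" is decided, so the age cutoff `HOT_WINDOW_SECONDS`, the function name `route_event`, and the use of plain lists in place of ElastiCache and a disk-based database are all assumptions for illustration. Here every event is written to the long-retention store and recent events are additionally cached.

```python
HOT_WINDOW_SECONDS = 15 * 60  # hypothetical cutoff for "hot" data


def route_event(event, now, hot_store, cold_store):
    """Write every event to the cold store; also cache it if it is recent."""
    cold_store.append(event)
    is_hot = (now - event["timestamp"]) <= HOT_WINDOW_SECONDS
    if is_hot:
        hot_store.append(event)
    return is_hot


# Plain lists stand in for ElastiCache (hot) and DynamoDB or RDS (cold).
hot, cold = [], []
now = 10_000
route_event({"deviceId": "2221", "timestamp": 9_900}, now, hot, cold)
route_event({"deviceId": "2222", "timestamp": 1_000}, now, hot, cold)
print(len(hot), len(cold))
```

Writing everything to the cold store keeps the long-retention database complete, so the cache can be sized for the hot window alone.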
  23. Device Data
      [Diagram labels: AWS IoT Device, AWS IoT, AWS Lambda, Amazon EC2 (Dashboard), Hot Data, Amazon ElastiCache, Longer Retention, Amazon DynamoDB, Amazon Kinesis Firehose, Historical | Data Lake, Amazon S3, Cold Data, Amazon Glacier]
  24. Sensors and Data Bursts
      - “We are all glorified motion sensors. Some things only become visible to us when they undergo change” – Vera Nazarian, author
      - Sensors enable detailed understanding of conditions
        - Quickly perceive and react to changing conditions
      - Device data can be streaming (time-based delivery) or event driven (threshold-based delivery)
        - Or both! A temperature sensor might deliver regular readings and also send an alert when it becomes too hot or too cold
      - Sensor data solutions must scale to meet loads
        - Sensor clusters can generate large data bursts
        - AWS IoT auto scales
        - ElastiCache for Redis allows flexible sizing to meet changing workloads
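The "or both" case on this slide, regular readings plus threshold alerts from the same sensor, can be sketched as follows. The function name, the reporting interval, and the low threshold are hypothetical. The 95 F high threshold is borrowed from the demo data model later in the deck. An alert is emitted only when the temperature first leaves the allowed range, not on every out-of-range sample.

```python
def plan_deliveries(samples, interval, low, high):
    """Return scheduled readings and threshold alerts for (time, temp) samples."""
    deliveries = []
    last_sent = None
    in_alarm = False
    for t, temp in samples:
        # Time-based delivery: send a reading once per interval.
        if last_sent is None or t - last_sent >= interval:
            deliveries.append(("reading", t, temp))
            last_sent = t
        # Threshold-based delivery: alert when the range is first exceeded.
        out_of_range = temp > high or temp < low
        if out_of_range and not in_alarm:
            deliveries.append(("alert", t, temp))
        in_alarm = out_of_range
    return deliveries


samples = [(0, 70), (10, 72), (20, 97), (30, 98), (60, 71)]
deliveries = plan_deliveries(samples, interval=60, low=32, high=95)
print(deliveries)
```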
  25. Data Modeling for Sensors
      - Sensor data is usually of high variety
        - Different sensors, and different versions of the same sensor, collect different data in different formats
        - A single device can have many sensors
      - The flexible schema of ElastiCache for Redis allows non-disruptive evolution of data models
        - Quickly test and use new or modified data models with no need to migrate schemas, rebalance shards or other onerous administrative tasks
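To make the flexible-schema point concrete, the sketch below flattens two differently shaped sensor payloads into field/value mappings of the kind a Redis hash holds (a flat set of fields with string values). The payload shapes, the dotted naming for nested fields, and the function name `to_hash_fields` are assumptions for illustration. Nothing here talks to Redis.

```python
def to_hash_fields(payload, prefix=""):
    """Flatten a (possibly nested) payload into flat string fields."""
    fields = {}
    for key, value in payload.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Nested objects become dotted field names, e.g. "location.lat".
            fields.update(to_hash_fields(value, prefix=f"{name}."))
        else:
            fields[name] = str(value)
    return fields


# Two hypothetical firmware versions of the same sensor.
reading_v1 = {"deviceId": "2221", "temperature": 71.5}
reading_v2 = {
    "deviceId": "2222",
    "temperature": 71.5,
    "humidity": 40,
    "location": {"lat": 47.6, "lon": -122.3},
}

fields_v1 = to_hash_fields(reading_v1)
fields_v2 = to_hash_fields(reading_v2)
print(fields_v1)
print(fields_v2)
```

Because each hash carries its own field names, the newer payload can add `humidity` and `location.*` without any change to how the older readings are stored.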
  26. Sensor Demo
  27. Sensor Demo
      [Diagram labels: Intel Edison – Grove Temperature Sensor, AWS IoT Device, AWS IoT, AWS Lambda, Amazon ElastiCache, Amazon EC2 (Dashboard)]
  28. Sensor Demo - Data Model
      - Fields shown: timestamp, deviceId, temperature, humidity, climate, deviceIP, requestId
      - Data structures (updated within a transaction block using Redis MULTI)
        - Sorted Set (SensorData): Time-Series events, sort by timestamp (score)
        - Hash (deviceId): Event details
        - Strings (INCR/DECR): climate:warm, climate:cool, climate:hot, climate:cold. Aggregations for dashboards, analytics, etc.
        - Pub/Sub: if (temperature > 95 F) publish notification
      - Queries
        - Which events occurred at a particular time range?
          zrangebyscore SensorData (1410000000000 14400000000000
        - What were the last N events?
          zrevrangebyscore SensorData +inf -inf WITHSCORES LIMIT 0 5
        - Which devices match a particular pattern?
          zscan SensorData 0 MATCH 2* COUNT 100
        - What is the temperature for a specific device?
          hget 2221 temperature
        - What is all the data for a specific device?
          hgetall 2221
        - What are the totals for each climate type?
          get climate:warm
          get climate:cool
          get climate:cold
          get climate:hot
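The write path behind this data model might look like the sketch below, which queues one update per data structure and executes them together. It is written against a pipeline-style object with `zadd`, `hset`, `incr`, `publish`, and `execute` methods, which is the shape a redis-py transactional pipeline offers. To stay self-contained it runs against `RecordingPipeline`, a stub defined here that only records the queued commands. Using the device ID as the sorted set member and hash key is inferred from the queries on the slide. The `alerts` channel name and the message text are assumptions.

```python
ALERT_THRESHOLD_F = 95  # from the slide: if (temperature > 95 F) publish


class RecordingPipeline:
    """Stub that records queued commands in place of a real Redis pipeline."""

    def __init__(self):
        self.commands = []

    def zadd(self, name, mapping):
        self.commands.append(("ZADD", name, mapping))

    def hset(self, name, mapping):
        self.commands.append(("HSET", name, mapping))

    def incr(self, name):
        self.commands.append(("INCR", name))

    def publish(self, channel, message):
        self.commands.append(("PUBLISH", channel, message))

    def execute(self):
        # A real pipeline would send MULTI ... EXEC here.
        return list(self.commands)


def record_reading(pipe, reading):
    """Queue all updates for one sensor reading, then run them together."""
    device_id = reading["deviceId"]
    # Sorted set: time-series events scored by timestamp.
    pipe.zadd("SensorData", {device_id: reading["timestamp"]})
    # Hash: event details for the device.
    pipe.hset(device_id, mapping=reading)
    # String counter: running total per climate type.
    pipe.incr(f"climate:{reading['climate']}")
    # Pub/Sub: notify subscribers when the threshold is exceeded.
    if reading["temperature"] > ALERT_THRESHOLD_F:
        pipe.publish("alerts", f"{device_id} is at {reading['temperature']} F")
    return pipe.execute()


reading = {
    "deviceId": "2221",
    "timestamp": 1410000000000,
    "temperature": 97,
    "humidity": 40,
    "climate": "hot",
}
queued = record_reading(RecordingPipeline(), reading)
print([command[0] for command in queued])
```

Grouping the four updates in one transaction keeps the sorted set, hash, and counters consistent with each other, so a dashboard never sees an event in the time series without its details.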
  29. Summary
      - Time series data can present high volume, velocity and variety challenges
      - ElastiCache for Redis is a scalable, performant and cost-effective service for sensor data and other time series data sets