Real Time Analytics On AWS: Optimized Architectures

Building a real-time analytics solution has never been faster or more cost-efficient. Most organizations are trying to find a way to improve customer experience and respond to business events in real time. Importantly, they want to do this quickly and at a fraction of the price of traditional approaches. In this session we will look at how to use AWS services to best meet your real-time analytics needs.

Published in: Technology

  1. 1. Capturing windows of opportunity Optimized real-time analytics architectures Craig Stires Head, Big Data and Analytics, AWS APAC
  2. 2. Content • Building architectures to fit your windows of opportunity • Optimizing on performance vs cost
  3. 3. "Real-time" comes in all shapes and sizes
  4. 4. Real-time windows by use case
       • Zone detection: window 2 sec, source: cctv
       • Account debit: window 200 msec, source: gate + db
       • Train arrival: window 10 sec, source: track sensor
       • Balance alerts: window 3 sec, source: db + gps
       • Climate control: window 5 min, source: thermo + bms
       • Flow detection: window 2 min, source: wifi
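The windows on slide 4 can be treated as latency budgets. The following is a minimal sketch, assuming each window is the maximum acceptable end-to-end delay; the `Window` class and the `within_window` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """One real-time use case and its window, in milliseconds."""
    use_case: str
    window_ms: int
    source: str


# Values copied from slide 4, converted to milliseconds.
WINDOWS = [
    Window("Zone detection", 2_000, "cctv"),
    Window("Account debit", 200, "gate + db"),
    Window("Train arrival", 10_000, "track sensor"),
    Window("Balance alerts", 3_000, "db + gps"),
    Window("Climate control", 300_000, "thermo + bms"),
    Window("Flow detection", 120_000, "wifi"),
]


def within_window(window: Window, measured_ms: int) -> bool:
    """Hypothetical check: did a measured latency fit inside the window?"""
    return measured_ms <= window.window_ms


tightest = min(WINDOWS, key=lambda w: w.window_ms)
loosest = max(WINDOWS, key=lambda w: w.window_ms)
spread = loosest.window_ms / tightest.window_ms

print(f"Tightest: {tightest.use_case} ({tightest.window_ms} ms)")
print(f"Loosest: {loosest.use_case} ({loosest.window_ms} ms)")
print(f"Loosest window is {spread:.0f}x the tightest")
```

The spread between the tightest and loosest windows is one way to see why "real-time" does not imply a single architecture.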
  5. 5. Different categories of real-time systems Continuous Feeds Contextual Events Massive Concurrency
  6. 6. A major telco services provider in Singapore. This telco serves millions of customers in Singapore and holds stakes in numerous other telcos in the Asia Pacific region. A significant portion of its revenue comes from mobile subscribers, and the renewal period is critical in limiting customer churn. Internally, the previous launch of the iPhone 5 was considered far from successful. https://aws.amazon.com/solutions/case-studies/ (Major Singapore Telco)
  7. 7. [Architecture diagram: Major Singapore Telco] Stages: Deliver/serve, Store, Process/analyze, Consume/visualize. Services shown: Amazon Route 53, Amazon CloudFront, Elastic Load Balancing, Amazon EC2 (web servers), Amazon ElastiCache, Amazon DynamoDB, Amazon S3 (site content, data storage), Amazon CloudWatch. Data: inventory, site orders. Outcomes & insights: provision all phones on launch; schedule pickup; concurrency in the 100s of thousands.
  8. 8. Telco successfully launches and clears all iPhone 6 in two minutes. “The launch went smoothly. But the system was actually too fast: all phones were gone in two minutes!” (Launch Manager, Major Singapore Telco)
       • All iPhone 6 allocated within 2 minutes
       • Launch concurrency prepared for 100s of thousands
       • No phones misallocated or unallocated
       • Management happy not to repeat the failed launch of iPhone 5
       • Even with significant over-provisioning of infrastructure, the cost savings compared to on-premises were significant
  9. 9. Contextual Events Different categories of real-time systems Continuous Feeds Massive Concurrency
  10. 10. A global leader in retargeting. AdRoll serves more than 20,000 advertisers in over 100 countries. AdRoll serves more than 60 billion impressions every day, each in less than 100ms. The real-time bidding system is self-service and enables ROI-positive marketing initiatives. https://aws.amazon.com/solutions/case-studies/adroll/
  11. 11. [Architecture diagram: AdRoll] Stages: Ingest/collect, Store, Process/analyze, Consume/visualize. Services shown: Amazon Kinesis, Amazon EC2, Amazon DynamoDB, Apache HBase, Amazon S3 (data storage), Amazon CloudWatch. Components: real-time bidding, ad service. Data: web logs / cookies, bidding data, performance metrics. Outcomes & insights: <100ms ad service; cross-site ID; retargeting; self-service real-time bidding; …
  12. 12. AdRoll Builds Bidding Platform on AWS and Cuts Costs by 83%. AdRoll is a global leader in digital advertising retargeting products. “We’ve been able to seamlessly scale our infrastructure and reduce our fixed costs by 75% and operational costs by 83%.” (Valentino Volonghi, CTO, AdRoll)
       • Reduced annual operational costs by 83%
       • Reduced fixed costs by 75%
       • System can now handle 2 million transactions / second at peak
       • Staff now 95% focused on new product development
       • 98% of visitors leave without converting; retargeting using AWS enables an 85+% increase in return conversions
       • AdRoll manages its real-time bidding platform using Amazon EC2, Amazon DynamoDB, and Amazon S3
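To put the AdRoll figures on one scale, the sketch below converts the 60 billion impressions per day from slide 10 into an average per-second rate and compares it with the 2 million transactions per second peak from slide 12. It assumes impressions and transactions are comparable units, which the slides do not state, so the ratio is only indicative.

```python
# Figures taken from slides 10 and 12.
IMPRESSIONS_PER_DAY = 60_000_000_000
PEAK_TRANSACTIONS_PER_SECOND = 2_000_000

SECONDS_PER_DAY = 24 * 60 * 60

# Average rate if traffic were spread evenly over the day.
average_per_second = IMPRESSIONS_PER_DAY / SECONDS_PER_DAY

# Assumption: one impression corresponds to roughly one transaction.
peak_to_average = PEAK_TRANSACTIONS_PER_SECOND / average_per_second

print(f"Average: {average_per_second:,.0f} impressions / second")
print(f"Peak is about {peak_to_average:.1f}x the daily average")
```

A peak several times the average is the kind of gap that makes elastic capacity cheaper than provisioning for the peak all day.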
  13. 13. Contextual Events Different categories of real-time systems Continuous Feeds Massive Concurrency
  14. 14. Provides video monitoring hardware and software. Dropcam is the largest inbound video service on the internet, with more data uploaded per minute than YouTube. Consumers and small businesses can use Dropcam’s video platform to monitor homes, offices, or pets. Dropcam analyzes 8M activities / day with video activity recognition; this translates to processing 2.5 years of video every day, adding camera video recording (CVR) to record sessions and add cuepoints. https://aws.amazon.com/solutions/case-studies/dropcam/
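The "2.5 years of video every day" figure on slide 14 is easier to reason about as a multiple of real time. This is a rough sketch, assuming a 365-day year; the variable names are illustrative only.

```python
# Figures taken from slide 14.
VIDEO_YEARS_PROCESSED_PER_DAY = 2.5
ACTIVITIES_PER_DAY = 8_000_000

DAYS_PER_YEAR = 365  # assumption: leap years ignored
SECONDS_PER_DAY = 24 * 60 * 60

# Days of footage analyzed per calendar day, i.e. the multiple of
# real time the analysis pipeline has to sustain on average.
video_days_per_day = VIDEO_YEARS_PROCESSED_PER_DAY * DAYS_PER_YEAR

# Average rate of recognized activities.
activities_per_second = ACTIVITIES_PER_DAY / SECONDS_PER_DAY

print(f"Footage analyzed per day: {video_days_per_day} days")
print(f"Activities recognized: {activities_per_second:.1f} / second")
```

In other words, the analysis tier must keep up with roughly nine hundred video streams' worth of footage at any moment, on average.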
  15. 15. [Architecture diagram: Dropcam] Stages: Ingest/collect, Store, Process/analyze, Consume/visualize. Services and components shown: Amazon Route 53, Amazon CloudFront, Amazon ELB, web servers, Amazon EC2, Nexus streaming servers, Amazon DynamoDB, Amazon EMR, Amazon S3, Amazon CloudWatch, Camera Vision Recording. Data: video feeds, site / apps, site content, CVR data, formatted video, recorded / streaming video. Outcomes & insights: upsell to recorded data product; video analysis; activity recognition; A/B testing (Wifi drivers); …
  16. 16. Dropcam Delivers Streaming Video Content in Milliseconds Using AWS. Dropcam provides video monitoring hardware and software for customers to access over the internet. “Using AWS, we can add capacity in minutes instead of days.” (Greg Nelson, VP of Software Engineering, Dropcam)
       • Dropcam provides a video monitoring service to monitor homes and small businesses
       • As the company grew, storage for video feeds became its biggest issue
       • Dropcam has detected billions of motion events and is processing more than one PB of data every month
       • By using AWS, Dropcam can add capacity in minutes and has reduced delivery time for video events from 10 seconds to less than 50 milliseconds
  17. 17. Content • Building architectures to fit your windows of opportunity • Optimizing on performance vs cost
  18. 18. Business case determines platform design. START HERE WITH REQUIREMENTS:
       • Website must serve 1000s concurrently, with apps and video streaming
       • Users will have on-demand views of their account and ratings
       • Clickstream data will trigger customized content for each user
       Stages: Data, Ingest/collect, Store, Process/analyze, Consume/visualize, Outcomes & insights
  19. 19. Optimize content delivery ...offload your web servers
  20. 20. Service pricing (* prices shown are US East region and monthly; https://aws.amazon.com/pricing/services/)
       • Amazon CloudWatch (monitoring): Free - 10 metrics, 10 alarms, and 1,000,000 API requests + 3 dashboards of up to 50 metrics each per month at no additional charge
       • Amazon S3 (object storage): $0.03 / GB; $0.004 / 10,000 GET; prices drop after 1TB; free transfer to CloudFront; 99.999999999% durability
       • Amazon CloudFront (content delivery): $0.085 / GB; $0.0075 / 10,000 HTTP; prices drop after 10TB
       • Amazon Route 53 (traffic routing): $0.50 / zone; $0.60 / 1M latency-based routing queries; free health checks
       Outcomes & insights: static content, video streaming, monitoring, account apps, behavioral modeling, customized content, …
  21. 21. [Cost diagram: content delivery] Services shown: Amazon Route 53, Amazon CloudFront, Amazon S3 (site content), Amazon CloudWatch. Cost figures shown: $0, $30, $425, $1; total $456 (* prices shown are US East region and monthly).
       Sizing assumptions: Site content = 1TB; Concurrent users = 1,000; Monthly visits = 1,000,000; Regions = 1, US East; Monthly static d/l = 5TB
       Outcomes & insights: static content, video streaming, monitoring, account apps, behavioral modeling, customized content, …
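The $456 total on slide 21 can be approximately reproduced from the unit prices on slide 20. The sketch below rests on several assumptions not stated in the deck: 1 TB is treated as 1,000 GB, each visit causes one Route 53 query, per-request charges are ignored, and CloudWatch stays within its free tier.

```python
GB_PER_TB = 1_000  # assumption: decimal terabytes

# Unit prices from slide 20 (US East, monthly).
S3_PER_GB = 0.03
CLOUDFRONT_PER_GB = 0.085
ROUTE53_PER_ZONE = 0.50
ROUTE53_PER_MILLION_QUERIES = 0.60

# Sizing from slide 21.
site_content_tb = 1
monthly_download_tb = 5
monthly_visits = 1_000_000

s3 = site_content_tb * GB_PER_TB * S3_PER_GB
cloudfront = monthly_download_tb * GB_PER_TB * CLOUDFRONT_PER_GB
# Assumption: one hosted zone and one routing query per visit.
route53 = ROUTE53_PER_ZONE + ROUTE53_PER_MILLION_QUERIES * (monthly_visits / 1_000_000)
cloudwatch = 0.0  # assumption: usage fits in the free tier

total = s3 + cloudfront + route53 + cloudwatch

print(f"S3: ${s3:.2f}")
print(f"CloudFront: ${cloudfront:.2f}")
print(f"Route 53: ${route53:.2f}")
print(f"Total: ${total:.2f}")
```

Under these assumptions the line items land on the slide's $30, $425, and about $1, which suggests how the diagram's figures map to services.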
  22. 22. Service pricing (* prices shown are US East region and monthly; https://aws.amazon.com/pricing/services/)
       • Amazon EC2 (web app server): $0.105 / hr (c4.large - linux); $0.10 / GB
       • Elastic Load Balancing (load balancing): $0.025 / ELB-hr; $0.008 / GB data
       Outcomes & insights: static content, video streaming, monitoring, account apps, behavioral modeling, customized content, …
  23. 23. [Cost diagram: content delivery + web tier] Services shown: Amazon Route 53, Amazon CloudFront, Amazon S3 (site content), Amazon CloudWatch, Amazon ELB, web servers. Cost figures shown: $19, $30, $425, $1, $0, $342; total $817 (* prices shown are US East region and monthly).
       Sizing assumptions: Web app server = 2 x 2AZ * c4.large w/ 100GB SSD; Site content = 1TB; Concurrent users = 1,000; Monthly visits = 1,000,000; Regions = 1, US East; Monthly static d/l = 5TB
       Outcomes & insights: static content, video streaming, monitoring, account apps, behavioral modeling, customized content, …
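The web tier figures on slide 23 can be roughly reconstructed from the prices on slide 22. This sketch assumes a 720-hour (30-day) month, reads "2 x 2AZ" as two instances in each of two Availability Zones, and treats the "$0.10 / GB" line as the monthly price of the attached SSD volume. None of these readings is confirmed by the deck.

```python
HOURS_PER_MONTH = 720  # assumption: 30-day month

# Unit prices from slide 22.
C4_LARGE_PER_HOUR = 0.105
SSD_PER_GB_MONTH = 0.10   # assumption: applies to the 100GB SSD volume
ELB_PER_HOUR = 0.025

# Sizing from slide 23: "2 x 2AZ * c4.large w/ 100GB SSD".
instances = 2 * 2         # assumption: 2 instances in each of 2 AZs
ssd_gb_per_instance = 100

compute = instances * C4_LARGE_PER_HOUR * HOURS_PER_MONTH
storage = instances * ssd_gb_per_instance * SSD_PER_GB_MONTH
web_servers = compute + storage

# Hourly ELB charge only; the data volume through the ELB is not stated.
elb_hourly = ELB_PER_HOUR * HOURS_PER_MONTH

print(f"Web servers: ${web_servers:.2f}")
print(f"ELB (hourly part): ${elb_hourly:.2f}")
```

The hourly ELB charge comes to about $18, so the slide's $19 would imply roughly $1 of data processing at $0.008 / GB. Adding the slide's $342 and $19 to the $456 from slide 21 gives the $817 shown.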
  24. 24. Decouple storage everywhere ...S3 is cheap, available, and unlimited*
  25. 25. Service pricing (* prices shown are US East region and monthly; https://aws.amazon.com/pricing/services/)
       • Amazon Kinesis (data streaming): $0.015 / shard-hr; $0.014 / 1,000,000 PUT; $0.020 / 7-day / shard-hr
       • Amazon EMR (Hadoop / MPP): $0.03 / EMR-hr (m4.large); $0.12 / EC2-hr (m4.large); EMRFS direct read from S3
       • Amazon Redshift (data warehouse): $1,000 / TB / yr (3-yr RI); $0.0075 / 10,000 HTTP
       • Amazon QuickSight (BI / data visualization): $9 / user / mo (1-yr commit)
       Outcomes & insights: static content, video streaming, monitoring, account apps, behavioral modeling, customized content, …
  26. 26. [Cost diagram: content delivery + web tier + analytics] Services shown: Amazon Route 53, Amazon CloudFront, Amazon S3 (site content), Amazon CloudWatch, Amazon ELB, web servers, Amazon S3 (transactions, clickstream), Amazon Kinesis (user actions), Amazon EMR with EMRFS, Amazon Redshift, Amazon QuickSight. Cost figures shown: $19, $30, $425, $1, $0, $342, $8, $48, $20, $50, and $36 for Amazon QuickSight; total $979 (* prices shown are US East region and monthly).
       Sizing assumptions: Account transactions = 1GB; Clickstream data = 250GB; EMR w/FS = 20hr x 16/mo * c4.large w/ 100GB SSD; Business analysts = 4 pax; Web app server = 2 x 2AZ * c4.large w/ 100GB SSD; Site content = 1TB; Concurrent users = 1,000; Monthly visits = 1,000,000; Regions = 1, US East; Monthly static d/l = 5TB
       Outcomes & insights: static content, video streaming, monitoring, account apps, behavioral modeling, customized content, …
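The running totals across slides 21, 23, and 26 add up as follows. This is a sketch that takes the dollar figures from the diagrams as given. Only the QuickSight figure is recomputed, from the $9 / user / mo price and the 4 business analysts; the other four analytics figures are not attributed to specific services here because the flattened diagram does not make the mapping clear.

```python
# Tier totals as shown on slides 21 and 23.
content_delivery = 456           # slide 21 total
web_tier = 342 + 19              # web servers + ELB, slide 23

# Additional figures that first appear on slide 26
# (service-by-service mapping left unassigned).
analytics_figures = [8, 48, 20, 50]

# QuickSight: $9 / user / mo for 4 business analysts.
QUICKSIGHT_PER_USER = 9
business_analysts = 4
quicksight = QUICKSIGHT_PER_USER * business_analysts

after_web_tier = content_delivery + web_tier
total = after_web_tier + sum(analytics_figures) + quicksight

print(f"After web tier: ${after_web_tier}")
print(f"With analytics and BI: ${total}")
print(f"Under $1,000 / month: {total < 1000}")
```

The result matches the $817 and $979 totals on the slides and stays below the $1,000 per month mark that the deck uses as its headline.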
  27. 27. Service pricing (* prices shown are US East region and monthly; https://aws.amazon.com/pricing/services/)
       • Amazon ElastiCache (data caching): $0.090 / hr (cache.m3.medium)
       • Amazon DynamoDB (NoSQL / low latency): $0.0065 / hr per 10 units of write; $0.0065 / hr per 50 units of read; $0.25 / GB-mo indexed data
       • AWS Lambda (event-driven): Free - 1.6M sec/mo (256MB); $0.000000417 / sec add'l
       • Amazon SQS (simple queue): $0.80 / GB (typically useful for 1:1 producer:consumer, and up to 1,000 requests/sec)
       Outcomes & insights: static content, video streaming, monitoring, account apps, behavioral modeling, customized content, …
  28. 28. [Cost diagram: full architecture] Services shown: Amazon Route 53, Amazon CloudFront, Amazon S3 (site content), Amazon CloudWatch, Amazon ELB, web servers, Amazon S3 (transactions, clickstream), Amazon Kinesis (user actions), Amazon EMR with EMRFS, Amazon Redshift, Amazon QuickSight, AWS Lambda, Amazon DynamoDB, Amazon ElastiCache. Cost figures shown: $19, $30, $425, $1, $0, $342, $8, $48, $20, $50, and $36 for Amazon QuickSight; total $979++ (* prices shown are US East region and monthly).
       Sizing assumptions: Account transactions = 1GB; Clickstream data = 250GB; EMR w/FS = 20hr x 16/mo * c4.large w/ 100GB SSD; Business analysts = 4 pax; Web app server = 2 x 2AZ * c4.large w/ 100GB SSD; Site content = 1TB; Concurrent users = 1,000; Monthly visits = 1,000,000; Regions = 1, US East; Monthly static d/l = 5TB
       Outcomes & insights: static content, video streaming, monitoring, account apps, behavioral modeling, customized content, …
  29. 29. $1,000 can do more than ever before. START HERE WITH REQUIREMENTS. For less than $1,000/mo, you can get started on your real-time journey ... and it's minutes to start, not months.
       Sizing assumptions: Account transactions = 1GB; Clickstream data = 250GB; EMR w/FS = 20hr x 16/mo * c4.large w/ 100GB SSD; Business analysts = 4 pax; Web app server = 2 x 2AZ * c4.large w/ 100GB SSD; Site content = 1TB; Concurrent users = 1,000; Monthly visits = 1,000,000; Regions = 1, US East; Monthly static d/l = 5TB
       Outcomes & insights: static content, video streaming, monitoring, account apps, behavioral modeling, customized content, …
  30. 30. Capture your windows of opportunity
       • Measure and constrain your real-time window: real-time can range from milliseconds to minutes. Build to your expected latency and concurrency requirements.
       • Decouple, decouple, decouple: achieve the same elasticity and cost savings as our best customers by separating processing and content (CloudFront backed by S3, EMR backed by S3, Kinesis streaming into S3).
       • Services, not servers: where possible, take advantage of cloud-native services and fully-managed services.
       • Stay current on AWS: some of our 100s of new service features each year may materially change your ability to scale and save (e.g. EMRFS, Kinesis Firehose, Lambda).
