Understanding AWS Database Options (DAT201) | AWS re:Invent 2013


With AWS you can choose the right database technology and software for the job. Given the myriad of choices, from relational databases to non-relational stores, this session provides details and examples of some of the choices available to you. This session also provides details about real-world deployments from customers using Amazon RDS, Amazon ElastiCache, Amazon DynamoDB, and Amazon Redshift.

  1. 1. DAT201- Understanding AWS Database Options Sundar Raghavan – Amazon RDS Zac Sprackett – Vice President of Operations with SugarCRM Michael Thomas – Principal Software Engineer with Scopely November 13, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  2. 2. Today’s discussion • AWS Database Options and Decision Factors • Best Practice Tips and Techniques • SugarCRM • Scopely • Q&A
  3. 3. Starting with the Customer • How many of you use databases on AWS? • How many of you use Amazon RDS, Amazon DynamoDB, Amazon Redshift, or Amazon ElastiCache? • How many of you have a well-defined DR strategy for your databases? • How many of you are building geospatial and context-sensitive applications? • We suggest that you attend Werner’s keynote!
  4. 4. Introducing: Cross Region Support • RDS Snapshot Copy • All engines • 9 AWS Regions including 25 Availability Zones and growing • 46 world-wide points of presence • >10 data centers in US East alone • Regions: US GovCloud (US ITAR Region, Oregon), US West x 2 (N. California and Oregon), US East (Northern Virginia), LATAM (São Paulo), Europe West (Dublin), Asia Pacific Region (Singapore), Asia Pacific Region (Tokyo), Australia Region (Australia)
  5. 5. Zoopla “We are very happy with RDS cross region snapshot copy feature as it gives us the ability to copy our data from one AWS region to another AWS region with minimal effort. Prior to this feature, it used to take 3 days and a number of manual steps to copy our snapshots. Now we have an automated process that helps us to achieve disaster recovery capabilities in just few steps.” Joel Callaway, IT Operations Manager Zoopla Property Group Ltd, UK
  6. 6. Your Mission is Clear 1. Zero to App in ____ Minutes 2. Zero to Millions of users in ____ Days 3. Zero to “Hero” in ____ Months
  7. 7. Focus on your App
  8. 8. Your Stack Load balancer Application tier Database tier
  9. 9. Your Stack of Worries • Load balancer: Security, Scale, Availability… • Application tier: Security, Innovation, Scale, Performance, Availability… • Database tier: Security, Innovation, Scale, Transactions, Performance, Durability, Availability, Skills…
  10. 10. Spectrum of Database Options SQL NoSQL Do-it Yourself Fully Managed Low Cost Not available on AWS High Cost
  11. 11. Spectrum of Options SQL NoSQL Do-it Yourself Fully Managed
  12. 12. Spectrum of Options (SQL) • Do-it Yourself: MySQL, Oracle, SQL Server, MariaDB, Vertica, ParAccel … • Fully Managed: MySQL, Oracle, SQL Server, Amazon Redshift
  13. 13. Spectrum of Options (NoSQL) • Do-it Yourself: MongoDB, Cassandra, Redis, Memcached • Fully Managed: DynamoDB, ElastiCache (Memcached), ElastiCache (Redis), SimpleDB
  14. 14. Thinking About the Questions • Should I use MySQL or PostgreSQL? • Should I use SQL or NoSQL? • Should I use MongoDB, Cassandra, or DynamoDB? • Should I use Redis, Memcached, or ElastiCache?
  15. 15. Actually, Thinking About the Right Questions • What are my transactional and consistency needs? • What are my scale and latency needs? • What are my read/write, storage and IOPS needs? • What are my time to market and server control needs?
  16. 16. Factors to Consider
      Factors | SQL | NoSQL
      Application | App with complex business logic? | Web app with lots of users?
      Transactions | Complex txns, joins, updates? | Simple data model, updates, queries?
      Scale | Developer managed | Automatic, on-demand scaling
      Performance | Developer architected | Consistent, high performance at scale
      Availability | Architected for fail-over | Seamless and transparent
      Core Skills | SQL + Java/Ruby/Python/PHP | NoSQL + Java/Ruby/Python/PHP
      Best of both worlds: Possible to use SQL and NoSQL models in one app
  17. 17. Factors to Consider
      Self-Managed Service • Full control over the instance, db and OS parameters • Upgrades, back-ups, fail-over are yours to manage • All aspects of security are managed by you • Complex replication topologies and data management
      Managed Service • Off-load the infrastructure and software management • Automate database life-cycle with APIs • Focus on database access and app security • Limited control over replication topologies
  18. 18. Pace of Innovation – a Bonus
      RDS team launched 23+ features • SQL Server TDE, Version upgrade • Oracle TDE, Statspack, Fine grain access, 3TB/30K IOPS • Cross Region Snapshot Copy, Parallel replica, Chained replica • Multi-AZ SLA, Log access, VPC groups, …
      NoSQL team launched 10+ features • Redis engine support • Amazon DynamoDB Fine grain access control • Amazon DynamoDB local, Geospatial indexing library • Transaction library, Local secondary index, parallel scan
      Redshift team launched 20+ features • Encryption with HSM support • Audit logging, SNS notification, snapshot sharing • COPY from Amazon EMR/HDFS/SSH • Faster resize, improved concurrency, distributed tables, …
  19. 19. Amazon RDS is a managed SQL database service. • Choice of Database engines • Simple to deploy and scale • Reliable and cost effective • Without any operational burden
  20. 20. Optimizing for Developer Productivity
      Focus on the “innovation”: Schema design, Query construction, Query optimization
      Off load the “administration”: Migration, Backup and recovery, Patching, Configuration, Software upgrades, Storage upgrades, Frequent server upgrades, Hardware crash
  21. 21. Optimizing for Developer Productivity • Multiple databases per instance MySQL Manual for Read Replica • Use MySQL tools & drivers • Quickly set up Read Replicas • High availability Multi-AZ option (99.95% SLA) • Ability to promote Read replicas, Rename as Master • Diagnostics OR Amazon RDS console • Native MySQL replication • SSL for encryption over the wire • Monitor metrics • Shell, super user or direct file system access (Think security!)
  22. 22. ElastiCache is a managed caching service. • Easy to set up and operate cache clusters • Supports Memcached and Redis engines • Scale cache clusters with push button ease • Ultra fast response time for read scaling • Without any operational burden
  23. 23. ElastiCache is a Performance Booster [Diagram: Clients → Elastic Load Balancing → EC2 App Instances. App reads and cache updates go to ElastiCache (Redis master with read replica), which serves most read queries with in-memory performance. Read/write queries go to an RDS MySQL DB Instance with PIOPS for SSD performance.]
  24. 24. Amazon DynamoDB is a managed NoSQL database service. • Store and retrieve any amount of data • Scale throughput to millions of IO • Single digit millisecond latencies • Without any operational burden
  25. 25. Optimizing for Developer Productivity
      Manage tables: CreateTable, UpdateTable, DeleteTable, DescribeTable, ListTables
      “Select”, “insert”, “update” items: PutItem, GetItem, UpdateItem, DeleteItem
      Query specific items OR scan the full table: Query, Scan
      Bulk select or update (max 1MB): BatchGetItem, BatchWriteItem
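The difference between Query and Scan on this slide can be illustrated with a small in-memory model. This is a sketch only, not the DynamoDB API: the `ToyTable` class, its method names, and the `game_id`/`turn` attributes are hypothetical, and it assumes a table with a hash key plus a range key.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table with a hash key and a range key."""

    def __init__(self, hash_key, range_key):
        self.hash_key = hash_key
        self.range_key = range_key
        # hash key value -> {range key value -> item}
        self._partitions = defaultdict(dict)
        self.items_examined = 0  # counts items touched by reads

    def put_item(self, item):
        self._partitions[item[self.hash_key]][item[self.range_key]] = item

    def query(self, hash_value):
        """Read only the items stored under one hash key, in range-key order."""
        partition = self._partitions.get(hash_value, {})
        self.items_examined += len(partition)
        return [partition[k] for k in sorted(partition)]

    def scan(self, predicate=lambda item: True):
        """Read every item in the table, then filter."""
        results = []
        for partition in self._partitions.values():
            for item in partition.values():
                self.items_examined += 1
                if predicate(item):
                    results.append(item)
        return results


# Three games with 100 turns each (made-up data).
table = ToyTable(hash_key="game_id", range_key="turn")
for game in ("g1", "g2", "g3"):
    for turn in range(1, 101):
        table.put_item({"game_id": game, "turn": turn, "score": turn * 10})

table.items_examined = 0
by_query = table.query("g2")
query_cost = table.items_examined

table.items_examined = 0
by_scan = table.scan(lambda item: item["game_id"] == "g2")
scan_cost = table.items_examined

print(len(by_query), query_cost)  # 100 100
print(len(by_scan), scan_cost)    # 100 300
```

Both calls return the same 100 items, but the scan touches every item in the table to find them, which is why key design matters when access is always by a known identifier.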
  26. 26. Amazon Redshift is a managed data warehouse service. • Petabyte scale columnar database • Fast response time (~10x faster than typical relational stores) • Under $1,000 per TB per year • Without any operational burden
  27. 27. So, what are the tips and techniques for successful deployments?
  28. 28. Thousands of Successful Deployments: Two Highlights • SugarCRM (CRM Software): Zac Sprackett • Scopely (Gaming Platform): Mike Thomas
  29. 29. Crafting Loyal Customers with SugarCRM Every Customer. Every User. Every Time. S. Zachariah Sprackett, VP of Operations, SugarCRM November 13, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  30. 30. SugarCRM • Redefining Customer Relationship Management • Unique product bundling – On Premise and Hosted offerings • Manifest destiny – Source code access and SQL database per customer • Scale – From one seat customers to multi thousand seat customers • Globally distributed customer base
  31. 31. Deployment Models Traditional SaaS SugarCRM
  32. 32. Application Stack MySQL Apache PHP HTML5 & JavaScript Elastic Search Shadow Linux Email Archiving Background Jobs
  33. 33. Cloud Stacks Amazon SES ElastiCache RDS DB Instance Cloud Provider RDS DB Instance Read Replica EC2 Web Servers EC2 Job Servers Amazon S3 EC2 Elastic Search Amazon Glacier
  34. 34. Cloud Providers Route 53 EC2 HA Proxy Managed Elastic IP Cloud Stack EC2 HA Proxy
  35. 35. Management Console Globally Distributed Cloud Providers
  36. 36. Delivering On Time and On Budget • Amazon lets you easily spin up testing environments – Testing only works if you make use of it. Don’t make assumptions – Monitor everything • Change in cost model can surprise finance – Planned capital expenditures versus after the fact operational expenditures – Use reserved instances – Third party tools such as Cloudability can help alert you of issues early • Manage access keys effectively to control cost – Learn to love AWS Identity and Access Management (IAM)
  37. 37. Things to Watch Out For
      • Understand your IO requirements – Make effective use of each of instance backed, Amazon EBS and Provisioned IOPS file systems
      • Use the heck out of read replicas
      • Snapshots are incredibly useful – But not available from a read replica
      • Don’t use the default parameter group for Amazon RDS – Unless you really like restarting databases
      • Cold Standby is not instant on – Don’t get stuck waiting for deployments in a forced failover scenario
      • ElastiCache is not clustered across availability zones
      • Watch out for the SLA – 99.95% for a region even across two AZs – This doesn’t include user error
      • You still need DBAs and Ops but they get to do cooler stuff
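To put the SLA bullet in concrete terms, the arithmetic below converts an availability percentage into a downtime budget. It is a rough sketch that assumes a 30-day month and ignores how the SLA itself defines and excludes downtime; the function name is illustrative.

```python
def allowed_downtime_minutes(availability_percent, days=30):
    """Minutes of downtime permitted over `days` at the given availability."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


monthly = allowed_downtime_minutes(99.95)
yearly = allowed_downtime_minutes(99.95, days=365)
print(f"{monthly:.1f} minutes per 30-day month")  # 21.6 minutes per 30-day month
print(f"{yearly / 60:.2f} hours per year")        # 4.38 hours per year
```

A budget of roughly 20 minutes a month is easy to exhaust with a single slow failover, which is one reason to rehearse recovery rather than rely on the percentage alone.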
  38. 38. We’re Hiring Email: zac@sugarcrm.com Free Trials: http://www.sugacrm.com/try-sugar
  39. 39. Scopely Michael Thomas – Principal Software Engineer with Scopely November 13, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  40. 40. About Scopely • Our technical infrastructure allows developers to build games efficiently for both iOS and Android. • Millions of Users • Billions of Turns • All titles have reached the Top 5 in the App Store, and the last three have been #1.
  41. 41. Challenges • Build a single platform to support many different kinds of games – asynchronous turn based, single player, synchronous, etc. • Scale up and down as games are tested, launched, grow, and are retired. • We are not an infrastructure company – we must focus on building features that support game development.
  42. 42. Platform Features • Accounts / authentication • Gameplay / state persistence • Chat / messaging • In game economy • Facebook integration • Gifting • Single Player state tracking • Promotion / cross-promotion system • Statistics • Tournaments • Achievements • Email targeting • Suggested friends • In game news system • External partner integration • Invitation attribution • Push notifications • Content management • Generic storage API • Application / device configuration • AB Testing
  43. 43. Different Features/Different Requirements • Dynamic scaling (game launches, promotions, tests) • High write/read ratio (playing turns) • Transactional consistency (real money purchases) • Indexed data (user accounts) • Complex, real-time data (leaderboards)
  44. 44. Operational Data Storage (Scopely Gaming Platform) • ElastiCache: Memcached for performance, scalability, and cost savings • S3: Amazon S3 for asset and image storage • ElastiCache: Redis for fast, complex caching and message passing • DynamoDB: Amazon DynamoDB for unbounded data with heavy write load • RDS: MySQL for bounded, transactional, queryable data
  45. 45. Analytics Data Pipeline • Scopely Gaming Platform → SQS: In-Flight Events → EC2: Message Loader → S3: Staged Messages → EMR: Transformer → S3: Processed Data → EC2: Redshift Loader → Redshift Data Warehouse • RDS: Process / Job Tracking
  46. 46. Schema Mapping DSL
      from centipede.schema.table import Table
      from centipede.attributes import *

      class GemsTurn(Table):
          # Top-level game fields
          user_id = Integer, lambda message: message['Data']['GameData']['CurrentPlayerId']
          current_turn = Integer, lambda message: message['Data']['GameData']['CurrentTurn']
          end_date = Timestamp, lambda message: message['Data']['GameData']['EndDate']
          expiration = Timestamp, lambda message: message['Data']['GameData']['Expiration']
          game_id = Guid, lambda message: message['Data']['GameData']['GameId']
          resigning_user_id = Integer, lambda message: message['Data']['GameData']['ResigningPlayerId']
          start_context = Integer, lambda message: message['Data']['GameData']['StartContext']
          start_date = Timestamp, lambda message: message['Data']['GameData']['StartDate']
          status = Integer, lambda message: message['Data']['GameData']['Status']
          tournament_id = Guid, lambda message: message['Data']['GameData']['TournamentId']
          tournament_price_category = Integer, lambda message: message['Data']['GameData']['TournamentPriceCategory']
          tournament_price_paid = Integer, lambda message: message['Data']['GameData']['TournamentPricePaid']
          tutorial_type = Integer, lambda message: message['Data']['GameData']['TutorialType']
          winning_user_id = Integer, lambda message: message['Data']['GameData']['WinningPlayerId']
          # Fields of the current player's entry in the Players list
          awards = List, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['Awards']
          coins_gathered = List, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['CoinsGathered']
          custom_statistics = VarChar, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['CustomStatistics']
          has_hidden_game = Boolean, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['HasHiddenGame']
          last_nudge_date = Timestamp, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['LastNudgeDate']
          score = Integer, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['Score']
          score_for_award = Integer, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.current_user_index(message)]['ScoreForAward']
          # Field of the opponent's entry in the Players list
          opponent_user_id = Integer, lambda message: message['Data']['GameData']['Players'][GemsTurnHelper.opponent_user_index(message)]['UserId']
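A declarative mapping like the one on this slide pairs each column with a type and an extractor function. The slide does not show how `centipede` consumes that mapping, so the following is only a guess at how such a table might be applied to an event: `extract_row`, the `COERCIONS` table, the string type tags, and the sample message are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical type tags standing in for the slide's Integer/Timestamp/VarChar.
COERCIONS = {
    "Integer": int,
    "VarChar": str,
    "Timestamp": lambda v: datetime.fromtimestamp(v, tz=timezone.utc).isoformat(),
}


class GemsTurnMapping:
    """Column name -> (type tag, extractor), mirroring the slide's layout."""

    columns = {
        "user_id": ("Integer", lambda m: m["Data"]["GameData"]["CurrentPlayerId"]),
        "current_turn": ("Integer", lambda m: m["Data"]["GameData"]["CurrentTurn"]),
        "start_date": ("Timestamp", lambda m: m["Data"]["GameData"]["StartDate"]),
    }


def extract_row(mapping, message):
    """Apply every extractor, coerce the value, and tolerate missing fields."""
    row = {}
    for name, (type_tag, extractor) in mapping.columns.items():
        try:
            raw = extractor(message)
        except (KeyError, IndexError, TypeError):
            row[name] = None  # field absent in this event
            continue
        row[name] = None if raw is None else COERCIONS[type_tag](raw)
    return row


# Made-up event: CurrentTurn is deliberately missing.
message = {"Data": {"GameData": {"CurrentPlayerId": "17", "StartDate": 0}}}
row = extract_row(GemsTurnMapping, message)
print(row)
```

Keeping the extractors as plain functions means a new game with a differently shaped event only needs a new mapping class, not a new loader.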
  47. 47. Use Case: Leaderboards (Redis) • “What is my rank in today’s tournament?” • Hard to cache since a single player getting a new high score changes everyone’s rank • Highly optimized schema required 4 m2.2xlarge RDS nodes • Latency for “what is my rank” could be above 100ms • Redis sorted sets provide exactly what we need. Two m2.xlarge instances are more than enough. Rank query is now in single digit milliseconds.
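The rank lookup described on this slide corresponds to the Redis sorted-set commands `ZADD` and `ZREVRANK`. So that it runs without a Redis server, the sketch below emulates those two operations in plain Python with `bisect`; the `Leaderboard` class and the player names are hypothetical, and a Python list insert is linear where Redis's own structure is logarithmic.

```python
import bisect


class Leaderboard:
    """Toy emulation of a sorted set: unique members ordered by score."""

    def __init__(self):
        self._scores = {}   # member -> score
        self._ordered = []  # ascending list of (score, member)

    def zadd(self, member, score):
        """Insert a member or replace its existing score."""
        old = self._scores.get(member)
        if old is not None:
            idx = bisect.bisect_left(self._ordered, (old, member))
            del self._ordered[idx]
        self._scores[member] = score
        bisect.insort(self._ordered, (score, member))

    def zrevrank(self, member):
        """0-based rank with the highest score first, or None if absent."""
        score = self._scores.get(member)
        if score is None:
            return None
        idx = bisect.bisect_left(self._ordered, (score, member))
        return len(self._ordered) - 1 - idx


board = Leaderboard()
board.zadd("alice", 1200)
board.zadd("bob", 950)
board.zadd("carol", 1500)
before = [board.zrevrank(p) for p in ("alice", "bob", "carol")]

# One new high score shifts everyone else's rank, with no cache to invalidate.
board.zadd("bob", 2000)
after = [board.zrevrank(p) for p in ("alice", "bob", "carol")]

print(before)  # [1, 2, 0]
print(after)   # [2, 0, 1]
```

Because the structure stays sorted on every write, the rank is computed from a position rather than from a count over all rows, which is the property that makes the lookup cheap.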
  48. 48. Use Case: Game/Turn State (DynamoDB) • Extremely high throughput. Extremely large dataset. • Semi-structured data – each game models “state” differently. • Always queried by UserID or GameID. • Maxed out an Amazon RDS instance – instead of spending time sharding / optimizing Amazon RDS, we moved to Amazon DynamoDB. • Saves operational time and development time by not having to worry about growing games/adding new games/traffic spikes.
  49. 49. Use Case: User Accounts (MySQL on RDS) • Need to maintain uniqueness across multiple columns (email, username, etc.) • Queryable on multiple facets (email, username, external identifier) • Entire table needs to be scanned regularly (promotions) • Bounded data size
  50. 50. Use Case: Global Caching (Memcached on ElastiCache) • Cache everything possible in Memcached including both entities in Amazon DynamoDB and RDS. • Single interface providing session caching, memcached caching, and Amazon DynamoDB access encourages consistent use of caching.
  51. 51. Use Case: Global Caching (Memcached on ElastiCache)
      public class CoherentStorage
      {
          public Cache L1Cache { get; set; }
          public Cache L2Cache { get; set; }
          public DynamoClient Dynamo { get; set; }
          private readonly Games _game;

          public CoherentStorage(Games game)
          {
              _game = game;
              L1Cache = Cache.Request;
              L2Cache = Cache.GetMemcached(String.Format("{0}GameState", game));
              Dynamo = DynamoClient.Instance;
          }

          // Method bodies are not shown on the slide.
          public void Save(object instance) { }
          public void Delete(object instance) { }
          public T Get<T>(object id, bool skipCache = false, bool consistentRead = true) { }
      }
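The shape of `CoherentStorage` suggests a layered read path: a per-request cache, then Memcached, then DynamoDB. The method bodies are not shown on the slide, so the following Python sketch is one plausible reading rather than Scopely's implementation; plain dictionaries stand in for all three stores and every name is hypothetical.

```python
class LayeredStorage:
    """Two cache levels in front of a backing store (all dict stand-ins)."""

    def __init__(self, backing_store):
        self.l1 = {}                # per-request cache stand-in
        self.l2 = {}                # Memcached stand-in
        self.store = backing_store  # DynamoDB stand-in
        self.store_reads = 0        # how often the backing store was hit

    def get(self, key, skip_cache=False):
        if not skip_cache:
            if key in self.l1:
                return self.l1[key]
            if key in self.l2:
                self.l1[key] = self.l2[key]  # promote to L1
                return self.l1[key]
        self.store_reads += 1
        value = self.store.get(key)
        if value is not None:
            self.l1[key] = self.l2[key] = value
        return value

    def save(self, key, value):
        # Write through so both cache levels agree with the store.
        self.store[key] = value
        self.l1[key] = self.l2[key] = value

    def delete(self, key):
        self.store.pop(key, None)
        self.l1.pop(key, None)
        self.l2.pop(key, None)


storage = LayeredStorage({"game:1": {"turn": 3}})
first = storage.get("game:1")    # miss in both caches, reads the store
second = storage.get("game:1")   # served from L1
storage.l1.clear()               # simulate the start of a new request
third = storage.get("game:1")    # served from L2
fresh = storage.get("game:1", skip_cache=True)  # forced store read
print(storage.store_reads)  # 2
```

Putting all three layers behind one `get` is what makes caching the default path: callers have to opt out with `skip_cache` rather than remember to opt in.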
  52. 52. Tips & Traps • Know your data – use reasonable heuristics for expected data growth. • Each data storage technology introduces some level of operational and engineering overhead. Choose wisely. • Get creative with Amazon DynamoDB. • Prepare for the unexpected with Metadata columns in MySQL.
  53. 53. Please give us your feedback on this presentation DAT201 As a thank you, we will select prize winners daily for completed surveys!