DAT101 Understanding AWS Database Options - AWS re: Invent 2012

When you're handling big data in the modern world, you will come to a point where you can't just pick a “one size fits all” approach anymore. However, to get the results you want, you also don’t have to spend big money on fire-breathing hardware or expensive software. AWS offers a beautiful array of open and commercial database choices, from do-it-yourself to fully managed services which handle scaling, and gives you powerful tools to choose the right architecture. You could choose from MySQL, RDS, Oracle, SQL Server, MongoDB, DynamoDB, Cassandra, ElastiCache, Redis, and SimpleDB, and our customers use them for different use cases. Each has different strengths, and this session highlights when you would want to choose each, with examples of how we use each to solve our big data challenges and why we made those decisions. We profile some of the choices available to you (MySQL, RDS, ElastiCache, Redis, Cassandra, MongoDB and DynamoDB) and present three customer case studies on RDS, ElastiCache and DynamoDB.



  • 1. AWS Database Options and Decision Factors
       Best Practice Tips and Techniques
         • Optimizing for Manageability and Scale: Edmodo
         • Optimizing for App Velocity and Scale: Obama for America
         • Leveraging YesSQL and NoSQL: BrandVerity
       Q&A
  • 2. Before We Begin
  • 3. Easily and rapidly analyze petabytes of data
       • 1/10 the cost of traditional data warehouses
       • Automated deployment & administration
       • Compatible with popular BI tools
  • 4. Amazon Redshift architecture (diagram): common BI tools connect over JDBC/ODBC to a leader node, which sits in front of compute nodes connected by a 10GigE mesh.
       • Choose from 16 TB local disk / 128 GB RAM or 2 TB local disk / 16 GB RAM nodes
       • Configure up to 100 nodes for up to 1.6 PB
       • Data stored in columnar format for 10X I/O efficiencies and fast queries
       • Query with standard SQL and JDBC/ODBC
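
The sizing figures on slide 4 can be checked with a little arithmetic. The following is a minimal sketch, not an AWS API: the labels `large` and `small` and the function name are hypothetical, and only the per-node disk and RAM sizes and the 100-node limit come from the slide.

```python
# Node sizes quoted on the slide: (local disk in TB, RAM in GB).
# The labels "large" and "small" are hypothetical, used only for this sketch.
NODE_TYPES = {
    "large": (16, 128),
    "small": (2, 16),
}
MAX_NODES = 100  # upper bound quoted on the slide


def cluster_capacity_tb(node_type: str, nodes: int) -> int:
    """Return raw local-disk capacity in TB for a cluster of identical nodes."""
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError(f"nodes must be between 1 and {MAX_NODES}")
    disk_tb, _ram_gb = NODE_TYPES[node_type]
    return disk_tb * nodes


# 100 of the larger nodes: 1600 TB, which is the 1.6 PB figure on the slide.
print(cluster_capacity_tb("large", 100), "TB")
```

The same calculation for the smaller node type gives 200 TB at the 100-node limit.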
  • 5. Your BI Tools connect to Amazon Redshift through ODBC / JDBC PostgreSQL drivers
  • 6. 1. Zero to App in ____ Minutes 2. Zero to Millions of users in ____ Days 3. Zero to “IPO” in ____ Months
  • 7. 1. Zero to App in ____ Minutes 2. Zero to Millions of users in ____ Days 3. Zero to “IPO” in ____ Months
  • 8. Focus on your App
  • 9. Load balancer • Application tier • Database tier
  • 10. Load balancer: Security, Scale, Availability…
       Application tier: Security, Innovation, Scale, Performance, Availability…
       Database tier: Security, Innovation, Scale, Transactions, Performance, Durability, Availability, Skills…
  • 11. SQL vs. NoSQL • Do-it-Yourself vs. Fully Managed • Low Cost vs. High Cost • Not available on AWS
  • 12. SQL vs. NoSQL • Do-it-Yourself vs. Fully Managed
  • 13. SQL
       Do-it-Yourself: MySQL, Oracle, SQL Server, MariaDB, Postgres…
       Fully Managed: MySQL, Oracle, SQL Server
  • 14. NoSQL
       Do-it-Yourself: MongoDB, Cassandra, Redis, Memcache
       Fully Managed: DynamoDB, ElastiCache, SimpleDB
  • 15. Should I use SQL or NoSQL? • Should I use MySQL on EC2 or RDS? • Should I use MongoDB, Cassandra, or DynamoDB? • Should I use Redis, Memcache, or ElastiCache?
  • 16. What are my transactional and consistency needs? • What are my scale and latency needs? • What are my read/write, storage and IOPS needs? • What are my time to market and server control needs?
  • 17. Factors | SQL | NoSQL
       Application | App with complex business logic? | Web app with lots of users?
       Transactions | Complex txns, joins, updates? | Simple data model, updates, queries?
       Scale | Developer managed | Automatic, on-demand scaling
       Performance | Developer architected | Consistent, high performance at scale
       Availability | Architected for fail-over | Seamless and transparent
       Core Skills | SQL + Java/Ruby/Python/PHP | NoSQL + Java/Ruby/Python/PHP
       Best of both worlds: possible to use SQL and NoSQL models in one app
  • 18. Factors | Do it Yourself (DIY) | Fully Managed
       Replication | Granular, app managed | Transparent and configured
       Monitoring | Specific agents and custom | Automated and API driven
       Security | Root access, custom configs | Hardened by the service
       Resources | Requires more DBA resources and time | Requires less DBA resources and time
       Time to market | Sophistication vs. speed | Rapid iteration
       Core Skills | Systems, databases, monitoring | Applications, user focused
       Best of both worlds: possible to manage different tiers differently
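
As a rough illustration of how the two comparison tables (slides 17 and 18) narrow the field, the sketch below maps two yes/no answers onto the option lists from slides 13 and 14. The `Needs` type, its field names, and the `shortlist` function are hypothetical; the presenters describe decision factors, not an algorithm, so treat this as one simplified reading of the slides.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    complex_transactions: bool  # complex txns, joins, updates (slide 17)
    want_managed: bool          # prefer less DBA time over low-level control (slide 18)


# Options as listed on slides 13 and 14, keyed by (data model, fully managed?).
OPTIONS = {
    ("sql", False): ["MySQL", "Oracle", "SQL Server", "MariaDB", "Postgres"],
    ("sql", True): ["RDS for MySQL", "RDS for Oracle", "RDS for SQL Server"],
    ("nosql", False): ["MongoDB", "Cassandra", "Redis", "Memcache"],
    ("nosql", True): ["DynamoDB", "ElastiCache", "SimpleDB"],
}


def shortlist(needs: Needs) -> list:
    """Return candidate engines for a single tier of the application."""
    model = "sql" if needs.complex_transactions else "nosql"
    return OPTIONS[(model, needs.want_managed)]


print(shortlist(Needs(complex_transactions=False, want_managed=True)))
```

Because both tables end with a "best of both worlds" note, a real application would likely call something like this once per tier rather than once overall.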
  • 19. Amazon RDS is a fully managed SQL database service.
       • Choice of database engines
       • Simple to deploy and scale
       • Reliable and cost effective
       • Without any operational burden
  • 20. Focus on the “innovation”: Migration • Schema design • Query construction • Query optimization
       Offload the “administration”: Backup and recovery • Patching • Configuration • Software upgrades • Storage upgrades • Frequent server upgrades • Hardware crash
  • 21. Multiple databases per instance • Standard user accounts • Connect and query using common MySQL tools & drivers • Tune engine parameters • Import and export data using standard MySQL tools (mysqldump) • Diagnostics • Native MySQL replication • SSL for encryption over the wire • Monitor metrics
       Not available: shell, super user or direct file system access (think security!)
  • 22. ElastiCache is a fully managed Memcache caching service.
       • Easy to set up and operate
       • Scale cache clusters with push button ease
       • Ultra fast response time for read scaling
       • Without any operational burden
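
To make "read scaling" concrete, here is a minimal sketch of the common cache-aside pattern. The slide does not name a pattern, so this is an assumption about typical usage; a plain dictionary stands in for the Memcache cluster, and the class and method names are hypothetical.

```python
class CacheAside:
    """Read-through helper: check the cache first, fall back to the database."""

    def __init__(self, load_from_db):
        self._cache = {}           # stand-in for a Memcache cluster
        self._load = load_from_db  # function that reads one key from the database
        self.db_reads = 0          # counts how often the database was hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]  # cache hit: no database work
        self.db_reads += 1
        value = self._load(key)      # cache miss: read from the database
        self._cache[key] = value     # populate the cache for later readers
        return value

    def invalidate(self, key):
        """Drop a key after the underlying row changes."""
        self._cache.pop(key, None)


database = {"user:1": "Ada"}  # stand-in for a database table
cache = CacheAside(database.get)
cache.get("user:1")
cache.get("user:1")
print(cache.db_reads)  # the second read was served from the cache
```

With a real Memcache client the dictionary lookups become network calls, but the control flow stays the same.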
  • 23. Amazon DynamoDB is a fully managed NoSQL database service.
       • Store and retrieve any amount of data
       • Scale throughput to millions of IO
       • Single digit millisecond latencies
       • Without any operational burden
  • 24. Manage tables: CreateTable, UpdateTable, DeleteTable, DescribeTable, ListTables
       “Select”, “insert”, “update” items: PutItem, GetItem, UpdateItem, DeleteItem
       Bulk select or update (max 1MB): BatchGetItem, BatchWriteItem
       Query specific items OR scan the full table: Query, Scan
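
The difference between Query and Scan is easier to see with a toy model. The sketch below is an in-memory stand-in, not the DynamoDB API or an AWS SDK: `ToyTable` and its method names are hypothetical, and the crawl-style sample data (a site as hash key, a date as range key) is invented for illustration.

```python
class ToyTable:
    """In-memory stand-in for a table keyed by hash key plus range key."""

    def __init__(self):
        self._items = {}  # hash_key -> {range_key: attributes}

    def put_item(self, hash_key, range_key, attributes):
        self._items.setdefault(hash_key, {})[range_key] = attributes

    def get_item(self, hash_key, range_key):
        return self._items.get(hash_key, {}).get(range_key)

    def query(self, hash_key, low, high):
        """Touch one hash key only; return its items with low <= range key <= high."""
        rows = self._items.get(hash_key, {})
        return [(rk, rows[rk]) for rk in sorted(rows) if low <= rk <= high]

    def scan(self, predicate):
        """Walk every item in the table and filter afterwards."""
        return [
            (hk, rk, attrs)
            for hk, rows in self._items.items()
            for rk, attrs in rows.items()
            if predicate(attrs)
        ]


table = ToyTable()
table.put_item("example.com", 20121101, {"status": 200})
table.put_item("example.com", 20121115, {"status": 404})
table.put_item("example.org", 20121120, {"status": 200})

print(table.query("example.com", 20121110, 20121130))  # only one hash key is read
print(table.scan(lambda attrs: attrs["status"] == 200))  # every item is examined
```

A query's cost depends on how many items share the hash key, while a scan's cost grows with the whole table, which is why the key design matters so much.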
  • 25. So, what are the tips and techniques for successful deployments?
  • 26. Educates millions of students • Reaches millions of citizens • Analyzes billions of Ads
       Services shown: Amazon EC2, Amazon DynamoDB, Amazon ElastiCache, Amazon RDS, Amazon S3
  • 27. Educates millions of students (Kimo Rosenbaum) • Reaches millions of citizens • Analyzes billions of Ads
  • 28. Kimo Rosenbaum – Data Architect, Edmodo
  • 29. Where learning happens. • Kimo Rosenbaum • AWS re: Invent 2012
  • 30. Learning 101
       • Largest, fastest growing social platform for education
       • Secure learning network for teachers and students
       • Browser, iOS, Android
       • Free for teachers and students
  • 31. Stats 101
       • 100,000 schools
       • 14 million users
       • 7 million new users in the last year
       • 1 million visits daily
  • 32. Architecture diagram. Components shown: Amazon Route 53, Amazon CloudFront, Elastic Load Balancer, web instances in an Auto Scaling group, Amazon CloudWatch, cache instances, Amazon S3, and RDS MySQL DB instances with read replicas inside an Availability Zone.
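
An architecture with RDS read replicas usually implies that the application splits its traffic: writes go to the primary DB instance, reads are spread over the replicas. The slide does not show how Edmodo does this, so the following is only a minimal sketch of the idea, with a hypothetical `ReplicaRouter` class and made-up endpoint names.

```python
import itertools


class ReplicaRouter:
    """Send writes to the primary and rotate reads across the replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Simplification: treat any statement starting with SELECT as a read.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._replicas) if is_read else self.primary


# Endpoint names are placeholders, not real hosts.
router = ReplicaRouter(
    "primary.db.example",
    ["replica-1.db.example", "replica-2.db.example"],
)
print(router.endpoint_for("SELECT * FROM posts WHERE id = 1"))
print(router.endpoint_for("UPDATE posts SET title = 'x' WHERE id = 1"))
```

MySQL read replicas are updated asynchronously, so a read that must see a write made moments earlier may still need to go to the primary.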
  • 33. DBA 101
       • Restore from snapshot
       • Replica creation
       • Parameter tuning
       • Metrics collection
       • Know your app/data
  • 34. Educates millions of students • Reaches millions of citizens (Jay Edwards) • Analyzes billions of Ads
  • 35. Jay Edwards – Database Engineer, Obama Campaign
  • 36. Me.
       • Twitter: First dedicated DBA
       • OFA: Lead Database Engineer
       • PalominoDB: CTO & VP/Operations
  • 37. Obama for America.
       • Technically sophisticated for a campaign
         • Not “web-scale”
       • Hockey-stick++ growth
       • Downtime hurts. A lot…really, really, really a lot.
  • 38. Hockey-stick++
  • 39. OFA Architecture (diagram): ELB • ElastiCache • RDS with Multi-AZ • RDS Read Replica • DynamoDB
  • 40. Problems!
       • You always need more databases
         • OFA had 24+ schemas & 100+ RDS instances
       • You never have enough DBAs
         • OFA had 1 – 2 x 0.5 fulltime MySQL DBAs
  • 41. Why RDS?
       • Makes operational issues very easy
         • Need more replicas? BAM!
         • Upsize hardware? KAPOW!
         • Point in time restore? BIF!
  • 42. Why not RDS?
       • Hardware cap (vertical v. horizontal)
       • Sophisticated use-cases
         • Frequent topology changes
         • Multi-region replication (on their roadmap)
       • DBAs need busy work
  • 43. Educates millions of students • Reaches millions of citizens • Analyzes billions of Ads (Andy Skalet)
  • 44. Andy Skalet - CTO, BrandVerity
  • 45. Managed Services Bias
  • 46. New Products/Markets – YesSQL!
  • 47. Big Data? Cast your problem
  • 48. AWS Options
  • 49. Case Study: Crawl history
  • 50.
  • 51. Managed services let you focus on creating value
       • Amazon S3 - Very robust, handles large items, but you filter
       • Amazon DynamoDB - Extremely fast, scalable, good value
         • Must cast your problem as kvs or key + range
       • Amazon RDS - MySQL, without the headaches
       • Amazon ElastiCache - As memcached, fast kvs for small data
       • Multi column queries on big data?
         • Looking forward to the AWS solution
  • 52. Thank you • Free raghavas@amazon
  • 53. We are sincerely eager to hear your feedback on this presentation and on re:Invent. Please fill out an evaluation form when you have a chance.