Austin Scales - Nexus - Bazaarvoice's Cloud Infrastructure


Nexus is Bazaarvoice's next-generation cloud infrastructure, built on top of Amazon Web Services. Nexus is highly available and resilient, built with best practices on top of services such as VPC, Autoscaling, ELB, CloudFormation, and more.


  1. 1. Nexus: Bazaarvoice Cloud Infrastructure. Victor Trac, Cloud Architect, Bazaarvoice
  2. 2. welcome to cloudcomputing!!1
  3. 3. Bazaarvoice (not "bizarre boys")
     ● Austin-based company founded in 2005
     ● SaaS serving software that collects and displays user-generated content, crunches analytics, and extracts data.
     ● Engineering offices in Austin, NYC, London, and San Francisco
     Basic stats:
     ● Thousands of clients
     ● Hundreds of millions of pieces of content
     ● Hundreds of millions of unique visitors per month
     ● Tens of billions of page-views per month
  4. 4. Edge Traffic
  5. 5. EC2, S3, VPC, Regions, Autoscale, CloudFormation, ELB...? Does this mean anything to you?
  6. 6. VPC & Subnets
     VPC allows us to choose our internal IP space.
     Public: Default route via IGW
     Default route for all subnets to IGW:
     ● Let's call these subnets all "Public"
     ● Requires all instances to have EIPs before talking to the internet
     ● EIPs are a limited resource
     Private: Default route via instance(s) in public subnets
     Advantage: Most instances in the private subnet can talk to the internet without dealing with an EIP.
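The public/private split on slide 6 can be sketched as a small planning script. This is a minimal sketch, not the actual Nexus layout: the `plan_subnets` helper, the `10.0.0.0/16` range, the `/24` subnet size, and the zone names are all assumptions chosen for illustration.

```python
import ipaddress


def plan_subnets(vpc_cidr, zones, new_prefix=24):
    """Carve one public and one private subnet per zone out of a VPC CIDR.

    Hypothetical helper: it only plans address ranges and default routes,
    it does not call any AWS API.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of equal-sized blocks
    plan = []
    for zone in zones:
        # Public subnet: default route goes straight to the internet gateway,
        # so each instance needs its own EIP to reach the internet.
        plan.append({"zone": zone, "tier": "public",
                     "cidr": str(next(blocks)), "default_route": "igw"})
        # Private subnet: default route goes via an instance in a public
        # subnet, so instances here need no EIP.
        plan.append({"zone": zone, "tier": "private",
                     "cidr": str(next(blocks)), "default_route": "nat-instance"})
    return plan
```

Calling `plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"])` returns four entries, alternating public and private, with consecutive `/24` blocks.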
  7. 7. Security Groups
  8. 8. Autoscaling
  9. 9. Elastic Load Balancing
     ● Only Round Robin and Sticky Sessions
     ● Supports HTTP response code or basic TCP connection health checks
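To make the round-robin and health-check behaviour on slide 9 concrete, here is a toy model. It is only a sketch of the general technique and makes no claim about how ELB is implemented internally; the `RoundRobinBalancer` class and its method names are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._cycle = itertools.cycle(self.backends)

    def mark(self, backend, is_healthy):
        # In a real setup a health check (HTTP response code or TCP connect)
        # would drive this; here the caller reports the result directly.
        if is_healthy:
            self.healthy.add(backend)
        else:
            self.healthy.discard(backend)

    def next_backend(self):
        # Look at each backend at most once per call, in rotation order.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

A backend that fails its check is simply skipped in the rotation until `mark(backend, True)` is called again.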
  10. 10.
      {
        "AWSTemplateFormatVersion" : "2010-09-09",
        "Description" : "A text description for the template usage",
        "Parameters": {
          // A set of inputs used to customize the template per deployment
        },
        "Mappings": {
          // Mappings match a key to a corresponding set of named values
        },
        "Resources" : {
          // The set of AWS resources and relationships between them
        },
        "Outputs" : {
          // A set of values to be made visible to the stack creator
        }
      }
  11. 11. CloudFormation Instance Example
      {
        "AWSTemplateFormatVersion" : "2010-09-09",
        "Description" : "Create an EC2 instance running the Amazon Linux 32 bit AMI.",
        "Parameters" : {
          "KeyPair" : {
            "Description" : "The EC2 Key Pair to allow SSH access to the instance",
            "Type" : "String"
          }
        },
        "Resources" : {
          "Ec2Instance" : {
            "Type" : "AWS::EC2::Instance",
            "Properties" : {
              "KeyName" : { "Ref" : "KeyPair" },
              "ImageId" : "ami-75g0061f",
              "InstanceType" : "m1.small"
            }
          }
        },
        "Outputs" : {
          "InstanceId" : {
            "Description" : "The InstanceId of the newly created EC2 instance",
            "Value" : { "Ref" : "Ec2Instance" }
          }
        }
      }
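The instance example on slide 11 ties its sections together with `Ref`: `KeyName` points at a parameter and the output points at a resource. A quick way to see that wiring is to check it offline. The sketch below is an illustration only; `find_refs` and `undefined_refs` are hypothetical helpers, not part of any AWS tool, and the template is the slide's example copied into a Python dict (including its placeholder AMI ID).

```python
import json

# The slide-11 template as a Python dict (values copied from the slide).
TEMPLATE = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Create an EC2 instance running the Amazon Linux 32 bit AMI.",
    "Parameters": {
        "KeyPair": {
            "Description": "The EC2 Key Pair to allow SSH access to the instance",
            "Type": "String",
        }
    },
    "Resources": {
        "Ec2Instance": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "KeyName": {"Ref": "KeyPair"},
                "ImageId": "ami-75g0061f",  # placeholder ID as shown on the slide
                "InstanceType": "m1.small",
            },
        }
    },
    "Outputs": {
        "InstanceId": {
            "Description": "The InstanceId of the newly created EC2 instance",
            "Value": {"Ref": "Ec2Instance"},
        }
    },
}


def find_refs(node):
    """Yield every logical name used in a {"Ref": name} expression."""
    if isinstance(node, dict):
        if set(node) == {"Ref"}:
            yield node["Ref"]
        else:
            for value in node.values():
                yield from find_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_refs(item)


def undefined_refs(template):
    """Return Ref targets that are neither a declared parameter nor a resource."""
    declared = set(template.get("Parameters", {})) | set(template.get("Resources", {}))
    used = {name for name in find_refs(template)
            if not name.startswith("AWS::")}  # pseudo parameters such as AWS::Region
    return sorted(used - declared)


def to_json(template):
    """Serialize the dict back to the JSON text CloudFormation expects."""
    return json.dumps(template, indent=2)
```

`undefined_refs(TEMPLATE)` returns an empty list for the slide's template; removing the `KeyPair` parameter would make it return `["KeyPair"]`.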
  12. 12. IAM and Console Access
      Sign-on Credentials
      ● IAM Console login
        ○ Username, Password, and MFA time token
      Access Credentials
      ● AWS has three API types: REST, Query, & SOAP.
      ● Each API uses one or more Access Credentials
        ○ Access Keys for REST and Query APIs
        ○ X.509 Certificates for SOAP API
        ○ EC2 KeyPairs for instance SSH authentication
  13. 13. In the beginning...
      A Java application server + a MySQL database.
      Scaled by adding in another application server.
      Then we just duplicated this entire stack, giving us two "clusters".
      Scaled more by adding more and more clusters.
  14. 14. Add in AWS
  15. 15. Decentralization --> Everyone goes fast!
  16. 16. Goals
      ● Full control over AWS resources: EC2 resources, Autoscale, ELB, S3, etc.
      ● Team isolation: resources created by one team can only be modified/terminated by that team
  17. 17. 3rd Party Solution
      enStratus, RightScale, Asgard, etc.
      Good:
      ● enStratus & RightScale provide cloud-agnostic tools
      Bad:
      ● No AWS API access
      ● No AWS CLI tools & SDKs
      ● Locked into only supported services
  18. 18. Multiple Accounts
      Good:
      ● Provides for full resource control with direct API access
      ● Protects teams from one another
      ● Allows for easy accounting on a per-team basis
      ● May make it easier for external auditors to determine which teams have "production" access
      Bad:
      ● Inter-team network communications can become very complicated, relying on VPN between VPCs -> Reduced Reliability
      ● Management of networking is a possible bottleneck
      ● Shared resources may need to be redundantly built in every VPC: LDAP, DNS, Monitoring
  19. 19. Single Shared Account
      Good:
      ● Sharing of resources will be simple - just open access via security groups between teams
      ● Reliable networking between teams without need for VPN
      ● Possibly better performance due to fewer hops
      ● Certain resources can be shared: LDAP, DNS, Monitoring, etc.
      Bad:
      ● No built-in protections between teams, even with IAM
      ● Creates a centralized resource that someone has to maintain
      ● Requires us to build tools to use long-term
  20. 20. Nexus, circa August 2012
  21. 21. In more detail...
      Nexus is:
      ● AWS Infrastructure designed with best practices:
        ○ secure
        ○ highly available
        ○ multi-region
        ○ repeatable
      ● Cloud building blocks and recipes for all of Engineering
      ● A Single Account Solution
      Philosophy: Engineering teams at Bazaarvoice are free to choose their own stack, but we want to make Nexus so compelling that it is the default choice.
  22. 22. (some) Batteries Included
      Included:
      ● Bastion Hosts
      ● NAT Instances
      ● VPN Connectivity between Regions
      ● Internal DNS
      ● Monitoring*
      ● Centralized Logging*
      ● Services Discovery*
      ● Scripts & CloudFormation in GitHub to create ephemeral VPCs that look like a Managed Environment
      Dev teams provide anything required to run their app, which probably means:
      ● Puppet/Chef/etc
      ● Your actual app
      ● Deployment process
  23. 23. Nexus Regions
  24. 24. bas·tion (*/ˈbas-chən/*)
  25. 25. NAT Instances
  26. 26. Autoscaling with CloudFormation
  27. 27. Internal DNS
  28. 28. External DNS (Route53): Records in the zone are for you to use.
  29. 29. Badger
  30. 30. Cabertoss
  31. 31. Conformity -> Measurable Efficiency
  32. 32. Limitations & Risks
      ● Danger! Single Shared Account
        ○ You can wipe out all of a region with a bad script.
      ● Single NAT per AZ
        ○ Someone else downloading lots of data from the internet will affect all other instances sharing your private subnet.
      ● Single VPN Instance per VPN Destination
        ○ Similar to NAT problem, but worse.
        ○ Avoid VPN when possible
        ○ If not possible, make your VPN dependency resilient to lack of bandwidth and network blips
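One common way to make a dependency "resilient to network blips", as slide 32 asks of anything that crosses the VPN, is to retry with exponential backoff and jitter. The sketch below shows that general technique; it is not taken from the talk, and `call_with_retries`, its parameters, and its default values are assumptions for illustration.

```python
import random
import time


def call_with_retries(operation, attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call operation(), retrying transient network errors with backoff.

    Hypothetical helper: retries only ConnectionError and TimeoutError,
    doubling the delay cap on each attempt and sleeping a random amount
    up to that cap (full jitter) so many clients do not retry in lockstep.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

Wrapping a cross-region call as `call_with_retries(lambda: fetch_remote_config())` (where `fetch_remote_config` stands in for your own function) keeps short VPN outages from failing the caller outright, while still raising once the attempts are exhausted.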
  33. 33. Nexus is a catalyst:
      old and busted -> new and shiny
      ● waterfall -> agile
      ● centralized development -> distributed teams
      ● 8-10 week release cycle -> release anytime
      ● monolithic app -> services oriented architecture
      ● mysql -> cassandra
      ● solr -> elasticsearch
      ● java -> whatever
      ● dev + ops -> devops
  34. 34. Email: Twitter: @victortrac Thanks!