Scaling Drupal Horizontally and in the Cloud


Vancouver Drupal group presentation for April 25, 2013.
How to deploy Drupal on:
- multiple web servers,
- multiple web and database servers, and
- the Amazon cloud (inside a Virtual Private Cloud), covering both
  - single availability zone and
  - multiple availability zone deployments.

The session covers what you need in order to deploy Drupal on separate servers, the issues and concerns involved, and how to solve them.



Usage Rights

© All Rights Reserved

  • At the start we can add as many web servers as we want. One important part is to configure the web servers to share files. This was done by using rsync replication and mounting the /var/www/html/mysite/sites folder as a shared one on all servers (since the Drupal core is the same on all servers, I didn't want to share it). I like this solution since we will have the source code on both web instances, and not on the file storage. This makes it possible to release new "source code" (not database!) versions of Drupal modules, or to quickly change some lines on a PROD environment for debugging (as long as you block visitor traffic to that web instance, of course ;-)).
    Memcached
    Move the caching mechanism to Memcached. Memcached stores all caching data in memory, so it doesn't use the MySQL tables any longer. Also, Memcached can run in a clustered environment, so there is no need to manually flush the remote cache; the Memcached Drupal module and the Memcached daemon take care of it. Because caching moves to Memcached, the database is no longer under heavy load.
    Database server
    The database server can be just one server or a full MySQL cluster (depending on the amount of available $$$).
    File storage replication
    Another (possible) improvement to the above solution would be to store all data on NAS file storage. The NAS storage holds all data in the sites/##YOUR_SITE_NAME##/files directory. Compared with the previous solution, we don't need to sync data again. One disadvantage here: if the NAS file storage goes down, no file will be served, neither by web server 1 nor by web server 2. As with the previous solution, the problem with this solution lies in some single points of failure, like only one load balancer and possibly one MySQL server.
  • Because mod_proxy makes Apache become an (open) proxy server, and open proxy servers are dangerous both to your network and to the Internet at large, I completely disable this feature:

    ProxyRequests Off
    <Proxy *>
        Order deny,allow
        Deny from all
    </Proxy>
  • So this option is in many aspects similar to the default one, with one big difference: in this case each web server will have its own db server. The reason for this is higher site availability; if one db server is down, the second one can continue to serve customers.
    - Be sure to exclude some tables from replication:
      DrupalDB.cache%
      DrupalDB.watchdog%
      DrupalDB.temp_search_sids
      DrupalDB.temp_search_results
    - And exclude all databases that are local: mysql, test
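Those exclusions map onto standard MySQL replication filter options. A my.cnf sketch for each replica (the option names are standard MySQL; the database name follows the slide, but this exact file is an assumption):

```ini
# my.cnf on each replica -- skip Drupal's volatile tables
[mysqld]
replicate-wildcard-ignore-table = DrupalDB.cache%
replicate-wildcard-ignore-table = DrupalDB.watchdog%
replicate-ignore-table = DrupalDB.temp_search_sids
replicate-ignore-table = DrupalDB.temp_search_results
# and skip the purely local databases
replicate-ignore-db = mysql
replicate-ignore-db = test
```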
  • Almost zero upfront infrastructure investment: If you have to build a large-scale system it may cost a fortune to invest in real estate, physical security, hardware (racks, servers, routers, backup power supplies), hardware management (power management, cooling), and operations personnel. Because of the high upfront costs, the project would typically require several rounds of management approvals before it could even get started. Now, with utility-style cloud computing, there is no fixed cost or startup cost.
    Just-in-time infrastructure: In the past, if your application became popular and your systems or your infrastructure did not scale, you became a victim of your own success. Conversely, if you invested heavily and did not get popular, you became a victim of your failure. By deploying applications in the cloud with just-in-time self-provisioning, you do not have to worry about pre-procuring capacity for large-scale systems. This increases agility, lowers risk and lowers operational cost because you scale only as you grow and only pay for what you use.
    More efficient resource utilization: System administrators usually worry about procuring hardware (when they run out of capacity) and about higher infrastructure utilization (when they have excess and idle capacity). With the cloud, they can manage resources more effectively and efficiently by having the applications request and relinquish resources on demand.
    Usage-based costing: With utility-style pricing, you are billed only for the infrastructure that has been used; you are not paying for allocated but unused infrastructure. This adds a new dimension to cost savings. You can see immediate cost savings (sometimes as early as your next month's bill) when you deploy an optimization patch to update your cloud application. For example, if a caching layer can reduce your data requests by 70%, the savings begin to accrue immediately and you see the reward right in the next bill.
Moreover, if you are building platforms on the top of the cloud, you can pass on the same flexible, variable usage-based cost structure to your own customers.
  • Automation – “Scriptable infrastructure”: You can create repeatable build and deployment systems by leveraging programmable (API-driven) infrastructure.
    Auto-scaling: You can scale your applications up and down to match your unexpected demand without any human intervention. Auto-scaling encourages automation and drives more efficiency.
    Proactive Scaling: Scale your application up and down to meet your anticipated demand with proper planning and an understanding of your traffic patterns, so that you keep your costs low while scaling.
  • More Efficient Development Lifecycle: Production systems may be easily cloned for use as development and test environments. Staging environments may be easily promoted to production.
    Improved Testability: Never run out of hardware for testing. Inject and automate testing at every stage of the development process. You can spin up an “instant test lab” with pre-configured environments for only the duration of the testing phase.
    Disaster Recovery and Business Continuity: The cloud provides a lower-cost option for maintaining a fleet of DR servers and data storage. With the cloud, you can take advantage of geo-distribution and replicate the environment in another location within minutes.
  • AWS – Amazon Web Services
    Amazon Web Services (AWS) is a collection of remote computing services (also called web services) that together make up a cloud computing platform, offered over the Internet by Amazon.com.
    EC2 – Elastic Compute Cloud
    Amazon Elastic Compute Cloud (EC2) is a central part of Amazon.com's cloud computing platform. EC2 allows users to rent virtual computers on which to run their own computer applications. EC2 allows scalable deployment of applications by providing a web service through which a user can boot an Amazon Machine Image to create a virtual machine. A user can create, launch, and terminate server instances as needed, paying by the hour for active servers, hence the term "elastic". EC2 provides users with control over the geographical location of instances, which allows for latency optimization and high levels of redundancy.
    S3 – Simple Storage Service
    Amazon S3 (Simple Storage Service) is an online storage web service offered by AWS. Amazon S3 provides storage through web services interfaces (REST, SOAP, and BitTorrent).
    AMI – Amazon Machine Images
    An Amazon Machine Image (AMI) is a special type of virtual appliance which is used to instantiate (create) a virtual machine within the Amazon Elastic Compute Cloud ("EC2"). It serves as the basic unit of deployment for services delivered using EC2.
  • EBS – Elastic Block Storage
    Amazon Elastic Block Storage (EBS) provides raw block devices that can be attached to Amazon EC2 instances and used like any raw block device. In a typical use case, this would include formatting the device with a filesystem and mounting said filesystem. In addition, EBS supports a number of advanced storage features, including snapshotting and cloning.
    VPC – Virtual Private Cloud
    Amazon Virtual Private Cloud (VPC) is a commercial cloud computing service that provides a virtual private cloud, allowing enterprise customers to access the Amazon Elastic Compute Cloud over an IPsec-based virtual private network. Unlike traditional EC2 instances, which are allocated internal and external IP numbers by Amazon, the customer can assign IP numbers of their choosing from one or more subnets. By giving the user the option of selecting which AWS resources are public facing and which are not, VPC provides much more granular control over security.
    ELB – Elastic Load Balancing
    AZ – Amazon Availability Zones (Data Centers)
  • RDS – Amazon Relational Database Service
    Amazon RDS is a distributed relational database service by Amazon.com. It is a web service running "in the cloud" and provides a relational database for use in applications. It is aimed at simplifying the setup, operation, and scaling of a relational database.
    Supported engines:
    MySQL databases
    Oracle databases
    Microsoft SQL Server
    ECU – Elastic Computational Unit
    One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
    SQS – Simple Queue Service
  • Note: Because the set of IP addresses associated with an Elastic Load Balancer can change over time, you should never create an "A" record pointing at any specific IP address. If you want to use a friendly DNS name for your EIP/ELB instead of the name generated by the Elastic Load Balancing service, you should create a CNAME record for the load balancer's DNS name, or use Amazon Route 53 to create a hosted zone.
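In a BIND-style zone file, such a CNAME record might look like the sketch below; the hostname and the ELB-generated name are illustrative values, not from the presentation:

```
; Point a friendly name at the ELB's generated DNS name
www.example.com.   300   IN   CNAME   my-load-balancer-1234567890.us-west-2.elb.amazonaws.com.
```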
  • NAT Instance – By default the Private Subnets in a VPC do not have internet connectivity. They cannot be accessed over the internet, and neither can they make outbound connections to internet resources. But let's say you have set up a database on an EC2 Instance in the Private Subnet and have implemented a backup mechanism. You would want to push the backups to Amazon S3, but the Private Subnet cannot access S3 since there is no internet connectivity. You can achieve it by placing a NAT Instance in the VPC.
    1. Through the NAT Instance, outbound connectivity for Private Subnet Instances can be achieved. The Instances will still not be reachable from the internet (inbound).
    2. You need to configure the VPC Routing Table to enable all outbound internet traffic for the Private Subnet to go through the NAT Instance.
    3. AWS provides a ready NAT AMI (ami-f619c29f) which you can use to launch the NAT Instance.
    4. You can have only one NAT Instance per VPC.
  • Since you can have only one NAT Instance per VPC, you need to be aware that it becomes a Single Point Of Failure in the architecture. If the architecture depends on the NAT Instance for any critical connectivity, it is an area to be reviewed.
    1. You are also limited by the bandwidth availability of a single NAT Instance, so do not build an architecture that has high internet bandwidth requirements from the Private Subnet through NAT.
    2. You can create a network topology with multiple NAT servers.
  • IP Ranges – When setting up a VPC you are essentially fixing the network of the VPC. If the VPC requires VPN connectivity (as in most cases), care should be taken in choosing the IP range of the VPC to avoid any IP conflicts.
    Public and Private Subnets – The VPC network can be divided further into smaller network segments called Subnets. Any VPC will have at least one Subnet. You can set up a Public Subnet which will have internet connectivity. Instances launched within a Public Subnet will have both outbound and inbound (through EIP) internet connectivity through the Internet Gateway attached to the Public Subnet. Private Subnets are completely locked down; they do not have internet connectivity by default. Create a number of Public and Private Subnets depending upon your architecture. Place all public-facing servers, such as web servers and search servers, in the public subnet. Keep DB servers, cache nodes, and application servers in the private subnet.
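The subnet and routing layout described above might be sketched like this; all CIDR ranges and resource IDs are illustrative assumptions, not values from the presentation:

```
VPC 10.0.0.0/16
├── Public subnet  10.0.0.0/24
│     route table: 10.0.0.0/16 -> local
│                  0.0.0.0/0   -> Internet Gateway (igw-xxxx)
│     hosts: web servers, search servers, NAT instance, bastion
└── Private subnet 10.0.1.0/24
      route table: 10.0.0.0/16 -> local
                   0.0.0.0/0   -> NAT instance (i-xxxx)
      hosts: DB servers, cache nodes, application servers
```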
  • Use the simple GUI to build Security Groups.
    Divide your resources: Public, Web, DB, Network File Server (separate, or inside the Web group?).
    VPC Security Groups are different from normal EC2 Security Groups. With EC2 Security Groups you can control only the ingress into your EC2 Instance. With VPC Security Groups, you have the option to control both inbound and outbound traffic. When something is not accessible you have to check both the inbound and outbound rules set in the VPC Security Group.
    ELB Security Group – When you launch an ELB within a VPC, you have the option to specify a VPC Security Group to be attached to the ELB. This is not available for an ELB launched outside a VPC in normal EC2. With this additional option, you can control access to specific ELB ports from specific IP sources. On the backend EC2 Instances' Security Group, you can allow access to the VPC Security Group that you associated with the ELB.
    Internal ELB – When you launch an ELB within a VPC, you also have the additional option to launch it as an "Internal Load Balancer". You can use an "Internal Load Balancer" to load balance your application tier from the web tier above.

Scaling Drupal Horizontally and in the Cloud – Presentation Transcript

  • scaling Drupal horizontally and in the cloud
  • about me
    name: Vladimir Ilic
    email: burger.boy.daddy@gmail.com
    twitter: @burgerboydaddy
    http://burgerboydaddy.com
  • agenda
    why all of this?
    step 1: test locally -> from one server to the server farm
    step 2: multiple web and database servers
    step 3: how to join all that together and deploy the site on the Amazon Cloud and inside a Virtual Private Cloud
    Amazon terms
    benefits
    single availability zone
    multiple availability zones
  • why?
    if you want to increase site speed
    if you want your site to be responsive and to work under heavy stress
    if you want to be in control of what goes on your server
  • get it divided / decouple
    Easy to do inside a local development/hosting environment
    Just separate web, database and cache servers
    Problems:
    we can increase resources only vertically
    Not all resources are used the same way (the web server will probably die before the cache or MySQL)
    Multiple “single points of failure”
  • multiple web servers – one db
    Apache load balancer in front of 2-3 web servers; each server with an integrated APC cache
    Multiple cache servers
    Powerful MySQL server
    In real life you can use some other LB solution (this one is great for proof-of-concept moments).
    Without a dedicated file server; used bi-directional rsync replication
  • configuring Apache load balancer
    The Apache web server ships a load balancer module called mod_proxy_balancer (since version 2.2). All you need to do is enable this module and the modules mod_proxy and mod_proxy_http. Please note that without mod_proxy_http, the balancer just won't work.
    LoadModule proxy_module mod_proxy.so
    LoadModule proxy_http_module mod_proxy_http.so
    LoadModule proxy_balancer_module mod_proxy_balancer.so
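With those three modules loaded, a minimal Apache 2.2 balancer configuration might look like the sketch below; the balancer name and the web1/web2 member hostnames are assumptions for illustration, not from the slides:

```apache
# Keep the open-proxy feature disabled; a reverse proxy does not need it
ProxyRequests Off

# Define the pool of backend web servers
<Proxy balancer://drupalcluster>
    BalancerMember http://web1.internal:80
    BalancerMember http://web2.internal:80
</Proxy>

# Send all incoming traffic to the pool
ProxyPass / balancer://drupalcluster/
ProxyPassReverse / balancer://drupalcluster/
```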
  • many to many
    In this case each web server will have its own db server. The reason for this is higher site availability; if one db server is down, the second one can continue to serve customers.
  • Amazon AWS
    Why Amazon (business point of view):
    Most complete cloud solution on the market.
    Almost zero upfront infrastructure investment
    Just-in-time infrastructure
    Pay as you go – pay for what you use
    Constant price drops
    Easy to deploy and scale
    ...
  • why Amazon (technical benefits)
    Automation – “Scriptable infrastructure”: You can create repeatable build and deployment systems by leveraging programmable (API-driven) infrastructure.
    Auto-scaling: You can scale your applications up and down to match your unexpected demand without any human intervention.
    Proactive Scaling: Scale your application up and down to meet your anticipated demand; Elasticity
  • why Amazon (technical benefits)
    More Efficient Development Lifecycle: Production systems may be easily cloned for use as development and test environments.
    Improved Testability: Never run out of hardware for testing. Inject and automate testing at every stage of the development process.
    Disaster Recovery and Business Continuity: The cloud provides a lower-cost option for maintaining a fleet of DR servers and data storage.
  • understanding elasticity
  • key Amazon terms – #1
    AWS – Amazon Web Services
    Amazon Web Services (AWS) is a collection of remote computing services (also called web services) that together make up a cloud computing platform.
    EC2 – Elastic Compute Cloud
    EC2 allows users to rent virtual computers on which to run their own computer applications. EC2 allows scalable deployment of applications by providing a web service through which a user can boot an Amazon Machine Image to create a virtual machine. A user can create, launch, and terminate server instances as needed, paying by the hour for active servers, hence the term "elastic".
    S3 – Simple Storage Service
    Amazon S3 (Simple Storage Service) is an online storage web service offered by AWS.
    AMI – Amazon Machine Images
    An Amazon Machine Image (AMI) is a special type of virtual appliance which is used to instantiate (create) a virtual machine within the Amazon Elastic Compute Cloud ("EC2").
  • key Amazon terms – #2
    EBS – Elastic Block Storage
    Amazon Elastic Block Storage (EBS) provides raw block devices that can be attached to Amazon EC2 instances. They can be used like any raw block device; in a typical use case this would include formatting the device with a filesystem and mounting said filesystem.
    VPC – Virtual Private Cloud
    Amazon Virtual Private Cloud (VPC) is a commercial cloud computing service that provides a virtual private cloud. Unlike traditional EC2 instances, which are allocated internal and external IP numbers by Amazon, the customer can assign IP numbers of their choosing from one or more subnets. VPC provides much more granular control over security.
    ELB – Elastic Load Balancing
    AZ – Amazon Availability Zones (Data Centers)
  • key Amazon terms – #3
    RDS – Amazon Relational Database Service
    Amazon RDS is a distributed relational database service by Amazon.com. It is a web service running "in the cloud" and provides a relational database for use in applications.
    Supported engines: MySQL databases, Oracle databases, Microsoft SQL Server.
    ECU – Elastic Computational Unit
    One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
    SQS – Simple Queue Service
  • list of services goes on…
  • humor
    “We will launch the site on EC2 with EBS, behind an ELB, with the domain registered on Route 53. Your images will come from CloudFront, backups will go to S3, and your DB runs on RDS with Multi-AZ availability.”
  • first step first – create account
    Go to aws.amazon.com and just use your amazon.com account for a start.
    After login, go to IAM (Identity and Access Management) to add multi-factor authentication; not to your root account, but create a new account, assign privileges to it, and add MFA. After that use only the new account to log in to your AWS (with the given alias).
  • easy one – use CloudFormation
    The fastest way to get Drupal on AWS is to use the predefined templates inside the CloudFormation service. At this moment you can find 4 Drupal-specific templates:
    Drupal_Simple.template
    Drupal_Single_Instance.template
    Drupal_Single_Instance_With_RDS.template
    Drupal_Multi_AZ.template
    You can use any other template as a starting point and customize it to your needs.
  • steps after
    Create a KeyPair.
    Add your home/corporate IP as the only one allowed to access the server over port 22 (SSH).
    Create an AMI from the existing machine.
    Drop the original machine.
    Create a new EC2 instance using the just-created AMI and your key pair.
    Add an Elastic IP and associate it with your instance.
    Connect to the instance.
    Add a DNS CNAME record using the given Amazon DNS name: ec2-54-225-110-202.compute-1.amazonaws.com
  • demo
    Mandatory clouds image :-)
  • Amazon VPC – ultimate goal
    We can install the complete infrastructure required for Drupal using the public set of services:
    ELB (load balancer)
    AMI (server images)
    RDS (Amazon Relational Database Service)
    ElastiCache
    ...
    BUT
    Amazon VPC is a way to set up an isolated partition of AWS and control the network topology.
    Services: DynamoDB, ElastiCache, SQS, SES, and CloudSearch are not yet available in VPC (things change on a daily basis).
    RDS instances launched in a VPC cannot be accessed over the internet (through the endpoint). You will need a bastion server to access them.
  • EC2 / VPC instances
  • EC2 - NAT Instance
  • VPC subnetsIP Ranges - When setting up a VPC you are essentially fixing the network of the VPC.Public and Private Subnets - The VPC network can be divided further in to smaller networksegments called as Subnets. Any VPC will have at least one SubnetYou can setup a Public Subnet which will have internet connectivity. Instances launched withina Public Subnet will have both outbound and inbound (through EIP) internet connectivitythrough the Internet Gateway attached to the Public SubnetPrivate Subnets are completely locked down. They do not have internet connectivity by defaultCreate number of Public and Private Subnets depending upon your architecture.
  • VPC security groups
  • AMI images
  • EBS volumes
  • autoscaling – holy grail
    Key to elasticity is in autoscaling.
  • how to autoscale
    Install the AWS Command Line Tools from Amazon Downloads.
    Download from: http://aws.amazon.com/developertools/2535
    Note: AWS Auto Scaling needs the Amazon CloudWatch monitoring service to function. Amazon CloudWatch is billed on a usage basis.
  • step 1
    Configuring AWS Auto Scaling with AWS ELB:
    elb-create-lb my-load-balancer --headers
      --listener "lb-port=80,instance-port=8080,protocol=HTTP"
      --availability-zones us-west-2c
    lb-port -- load balancer port
    instance-port -- app server port to which the request needs to be forwarded
    my-load-balancer -- name for my load balancer
  • step 2
    Create a launch configuration:
    as-create-launch-config my-lconfig --image-id ami-e38823c8a
      --instance-type m1.small --key my-key-pair
      --group my-security-group
    my-lconfig -- name for the launch configuration
    ami-e38823c8a -- Amazon Machine Image (AMI) to be launched during scaling
    m1.small -- Amazon EC2 instance size
    my-key-pair -- key pair for the Amazon EC2 instances
    my-security-group -- security group for the instances
  • step 3
    Create an AWS Auto Scaling group:
    as-create-auto-scaling-group my-as-group --availability-zones us-west-2c
      --launch-configuration my-lconfig --max-size 11 --min-size 3 --cooldown 180
      --desired-capacity 2 --load-balancers my-load-balancer
    my-load-balancer -- LB to which the newly launched Amazon EC2 instances will be attached
    my-as-group -- name of the Auto Scaling group
    us-west-2c -- availability zone in which the auto-scaled Amazon EC2 instances will be launched
    11/3 -- maximum/minimum number of Amazon EC2 instances maintained by Auto Scaling
    Desired capacity is an important component of the as-create-auto-scaling-group command. Although it is an optional parameter, desired capacity tells Auto Scaling the number of instances you want to run initially. To adjust the number of instances you want running in your Auto Scaling group, you change the value of --desired-capacity. If you don't specify --desired-capacity, its value is the same as the minimum group size.
  • step 4
    (this step is not available in the Auto Scaling API)
    Configure the Auto Scaling triggers / alarms:
    as-create-or-update-trigger my-as-trigger
      --auto-scaling-group my-as-group --namespace "AWS/EC2"
      --measure CPUUtilization --statistic Average
      --dimensions "AutoScalingGroupName=my-as-group"
      --period 60 --lower-threshold 20 --upper-threshold 80
      --lower-breach-scale-increment=-2 --upper-breach-scale-increment=4
      --breach-duration 180
    Measure the average CPU of the Auto Scaling group.
    Scale out by 4 Amazon EC2 instances; scale down by 2 Amazon EC2 instances.
    Lower CPU limit is 20% and upper CPU limit is 80%.
  • shutdown auto scaling group
    Shutting down an auto-scaling group requires 3 commands:
    as-update-auto-scaling-group bbd4me-as-group --min-size 0
      --max-size 0 --region us-west-2
    as-describe-auto-scaling-groups bbd4me-as-group --headers
      --region us-west-2
    as-delete-auto-scaling-group bbd4me-as-group
      --force-delete --region us-west-2
  • Thank you for your attention.
    Questions?