
Scaling Drupal horizontally and in the cloud


Vancouver Drupal group presentation for April 25, 2013.
How to deploy Drupal on
- multiple web servers,
- multiple web and database servers,
and how to join all that together and deploy the site on Amazon's cloud (inside a Virtual Private Cloud), in
- a single availability zone, or
- a multiple-availability-zone deployment.

The session covers what you need in order to get Drupal deployed on separate servers, what the issues and concerns are, and how to solve them.


Scaling Drupal horizontally and in the cloud

  1. scaling Drupal horizontally and in the cloud
  2. about me. name: Vladimir Ilic; email: burger.boy.daddy@gmail.com; twitter: @burgerboydaddy; web: http://burgerboydaddy.com
  3. agenda. why all of this? step 1: test locally – from one server to a server farm; step 2: multiple web and database servers; step 3: how to join all that together and deploy the site on Amazon Cloud, inside a Virtual Private Cloud (the Amazon term, its benefits, a single availability zone, and a multiple-availability-zone deployment).
  4. why? if you want to increase site speed; if you want your site to be responsive and to work under heavy stress; if you want to be in control of what goes on your server.
  5. get it divided / decoupled. Easy to do inside a local development/hosting environment: just separate the web, database, and cache servers. Problems: we can increase resources only vertically; not all resources are used the same way (the web server will probably die before the cache or MySQL); multiple "single points of failure".
  6. multiple web servers – one db. An Apache load balancer in front of 2-3 web servers, each server with an integrated APC cache; multiple cache servers; one powerful MySQL server. In real life you can use some other LB solution (this one is great for proof-of-concept moments). Without a dedicated file server, files are kept in sync with bi-directional rsync replication – see the sketch below.
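
A minimal sketch of the two-way file sync this setup implies (hypothetical hosts and paths; a naive cron-driven pair of rsync calls where newer files win, not a real-time cluster filesystem):

    # on web1, via cron: pull newer files from web2, then push newer local files back
    rsync -azu web2:/var/www/html/mysite/sites/ /var/www/html/mysite/sites/
    rsync -azu /var/www/html/mysite/sites/ web2:/var/www/html/mysite/sites/
    # note: deletions are not propagated by this scheme
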
  7. configuring the Apache load balancer. The Apache web server ships a load balancer module called mod_proxy_balancer (since version 2.2). All you need to do is enable this module along with mod_proxy and mod_proxy_http. Please note that without mod_proxy_http, the balancer just won't work. LoadModule proxy_module mod_proxy.so LoadModule proxy_http_module mod_proxy_http.so LoadModule proxy_balancer_module mod_proxy_balancer.so
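
A minimal balancer configuration under those modules might look like this (Apache 2.2 syntax; the backend hostnames are hypothetical):

    ProxyRequests Off            # never run an open forward proxy
    <Proxy balancer://drupalcluster>
        BalancerMember http://web1.example.internal:80
        BalancerMember http://web2.example.internal:80
    </Proxy>
    ProxyPass        / balancer://drupalcluster/
    ProxyPassReverse / balancer://drupalcluster/
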
  8. many to many. In this case each web server has its own db server. The reason: higher site availability – if one db server is down, the second one can continue to serve customers.
  9. Amazon AWS. why Amazon (business point of view): the most complete cloud solution on the market; almost zero upfront infrastructure investment; just-in-time infrastructure; pay as you go – pay for what you use; constant price drops; easy to deploy and scale; …
  10. why Amazon (technical benefits). Automation – "scriptable infrastructure": you can create repeatable build and deployment systems by leveraging programmable (API-driven) infrastructure. Auto-scaling: you can scale your applications up and down to match unexpected demand without any human intervention. Proactive scaling: scale your application up and down to meet your anticipated demand. Elasticity.
  11. why Amazon (technical benefits). More efficient development lifecycle: production systems can be easily cloned for use as development and test environments. Improved testability: never run out of hardware for testing; inject and automate testing at every stage of the development process. Disaster recovery and business continuity: the cloud provides a lower-cost option for maintaining a fleet of DR servers and data storage.
  12. understanding elasticity
  13. key Amazon terms – #1. AWS – Amazon Web Services: a collection of remote computing services (also called web services) that together make up a cloud computing platform. EC2 – Elastic Compute Cloud: allows users to rent virtual computers on which to run their own applications; it allows scalable deployment of applications by providing a web service through which a user can boot an Amazon Machine Image to create a virtual machine. A user can create, launch, and terminate server instances as needed, paying by the hour for active servers – hence the term "elastic". S3 – Simple Storage Service: an online storage web service offered by AWS. AMI – Amazon Machine Image: a special type of virtual appliance used to instantiate (create) a virtual machine within EC2.
  14. key Amazon terms – #2. EBS – Elastic Block Storage: provides raw block devices that can be attached to EC2 instances and used like any raw block device; in a typical use case this includes formatting the device with a filesystem and mounting it. VPC – Virtual Private Cloud: a commercial cloud computing service that provides a virtual private cloud; unlike traditional EC2 instances, which are allocated internal and external IP numbers by Amazon, the customer can assign IP numbers of their choosing from one or more subnets, and VPC provides much more granular control over security. ELB – Elastic Load Balancing. AZ – Availability Zones (data centers).
  15. key Amazon terms – #3. RDS – Amazon Relational Database Service: a distributed relational database service by Amazon.com; a web service running "in the cloud" that provides a relational database for use in applications, supporting MySQL, Oracle, and Microsoft SQL Server databases. ECU – EC2 Compute Unit: one EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor. SQS – Simple Queue Service.
  16. list of services goes on…
  17. humor. "We will launch the site on EC2 with EBS, behind an ELB, with the domain registered on Route 53. Your images will come from CloudFront, backups will go to S3, and your DB will be on RDS with Multi-AZ availability."
  18. first step first – create an account. Go to aws.amazon.com and just use your amazon.com account for a start. After login, go to IAM (Identity and Access Management) to add multi-factor authentication – not to your root account: create a new account, assign privileges to it, and add MFA to that one. After that, use only the new account (with the given alias) to log in to your AWS.
  19. easy one – use CloudFormation. The fastest way to get Drupal on AWS is to use the predefined templates inside the CloudFormation service. At this moment you can find 4 Drupal-specific templates: Drupal_Simple.template, Drupal_Single_Instance.template, Drupal_Single_Instance_With_RDS.template, and Drupal_Multi_AZ.template. You can use any other template as a starting point and customize it to your needs.
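
As a sketch, with the 2013-era CloudFormation command line tools a stack could be launched roughly like this (the stack name and parameter names are illustrative – check the Parameters section of whichever template you pick):

    cfn-create-stack drupal-demo --template-file Drupal_Single_Instance.template \
        --parameters "KeyName=my-key-pair"
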
  20. steps after. Create a KeyPair. Allow only your home/corporate IP to access the server over port 22 (SSH). Create an AMI from the existing machine and drop the original machine. Create a new EC2 instance using the just-created AMI and your key pair. Allocate an Elastic IP and associate it with your instance. Connect to the instance. Add a DNS CNAME record using the given Amazon DNS name: ec2-54-225-110-202.compute-1.amazonaws.com. (These steps are sketched as commands below.)
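
Sketched with the legacy EC2 API tools of the period (all identifiers hypothetical):

    ec2-add-keypair my-key-pair > my-key-pair.pem            # create KeyPair
    ec2-authorize my-security-group -p 22 -s 203.0.113.4/32  # SSH only from your IP
    ec2-create-image i-0123abcd -n "drupal-base"             # create AMI from existing machine
    ec2-run-instances ami-12345678 -k my-key-pair -t m1.small -g my-security-group
    ec2-allocate-address                                     # get an Elastic IP...
    ec2-associate-address 54.225.110.202 -i i-0456efab       # ...and attach it to the instance
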
  21. demo. Mandatory clouds image :-)
  22. Amazon VPC – the ultimate goal. We can install the complete infrastructure required for Drupal using the public set of services: ELB (load balancer), AMIs (server images), RDS (Amazon Relational Database Service), Elastic Cache... BUT Amazon VPC is a way to set up an isolated partition of AWS and control the network topology. DynamoDB, ElastiCache, SQS, SES, and CloudSearch are not yet available in VPC (things change on a daily basis). RDS instances launched in a VPC cannot be accessed over the internet (through the endpoint); you will need a bastion server to access them.
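
One common way to reach such an RDS instance is an SSH tunnel through the bastion (hostnames and endpoint hypothetical):

    # forward local port 3306 to the in-VPC RDS endpoint via the bastion host
    ssh -i my-key-pair.pem -L 3306:mydb.abcdefgh.us-west-2.rds.amazonaws.com:3306 ec2-user@bastion.example.com
    mysql -h 127.0.0.1 -u drupal -p    # in a second terminal
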
  23. EC2 / VPC instances
  24. EC2 - NAT Instance
  25. VPC subnets. IP ranges – when setting up a VPC you are essentially fixing the network of the VPC. Public and private subnets – the VPC network can be divided further into smaller network segments called subnets; any VPC has at least one subnet. You can set up a public subnet which has internet connectivity: instances launched within a public subnet have both outbound and inbound (through an EIP) internet connectivity through the Internet Gateway attached to the public subnet. Private subnets are completely locked down; they have no internet connectivity by default. Create as many public and private subnets as your architecture requires. (A hypothetical set of commands follows.)
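
Roughly, with the legacy EC2 API tools (all IDs and address ranges hypothetical):

    ec2-create-vpc 10.0.0.0/16                         # fixes the VPC's network range
    ec2-create-subnet -c vpc-11111111 -i 10.0.0.0/24   # public subnet
    ec2-create-subnet -c vpc-11111111 -i 10.0.1.0/24   # private subnet
    ec2-create-internet-gateway
    ec2-attach-internet-gateway igw-22222222 -c vpc-11111111
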
  26. VPC security groups
  27. AMI images
  28. EBS volumes
  29. autoscaling – the holy grail. The key to elasticity is autoscaling.
  30. how to autoscale. Install the AWS Command Line Tools from Amazon Downloads: http://aws.amazon.com/developertools/2535. Note: AWS Auto Scaling needs the Amazon CloudWatch monitoring service to function, and Amazon CloudWatch is billed on a usage basis.
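
Those tools were plain Java command line utilities; a typical shell setup looked something like this (paths and version number are illustrative):

    export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
    export AWS_AUTO_SCALING_HOME=/opt/AutoScaling-1.0.61.0
    export PATH=$PATH:$AWS_AUTO_SCALING_HOME/bin
    export AWS_CREDENTIAL_FILE=$HOME/.aws-credentials   # contains AWSAccessKeyId=... and AWSSecretKey=...
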
  31. step 1 – configure AWS Auto Scaling with an AWS ELB: elb-create-lb my-load-balancer --headers --listener "lb-port=80,instance-port=8080,protocol=HTTP" --availability-zones us-west-2c. Here lb-port is the load balancer port, instance-port is the app server port to which requests are forwarded, and my-load-balancer is the name of the load balancer.
  32. step 2 – create a launch configuration: as-create-launch-config my-lconfig --image-id ami-e38823c8a --instance-type m1.small --key my-key-pair --group my-security-group. Here my-lconfig is the name of the launch configuration, ami-e38823c8a the Amazon Machine Image (AMI) to be launched during scaling, m1.small the EC2 instance size, my-key-pair the key pair, and my-security-group the security group for the instances.
  33. step 3 – create an Auto Scaling group: as-create-auto-scaling-group my-as-group --availability-zones us-west-2c --launch-configuration my-lconfig --max-size 11 --min-size 3 --cooldown 180 --desired-capacity 2 --load-balancers my-load-balancer. Here my-load-balancer is the LB to which newly launched EC2 instances are attached, my-as-group the name of the Auto Scaling group, us-west-2c the availability zone in which the auto-scaled EC2 instances are launched, and 11/3 the maximum/minimum number of EC2 instances maintained by Auto Scaling. Desired capacity is an important component of the as-create-auto-scaling-group command: although it is an optional parameter, it tells Auto Scaling how many instances you want running initially. To adjust the number of instances running in your Auto Scaling group, change the value of --desired-capacity; if you don't specify it, its value defaults to the minimum group size.
  34. step 4 – configure the Auto Scaling triggers/alarms (this step is not available in the Auto Scaling API): as-create-or-update-trigger my-as-trigger --auto-scaling-group my-as-group --namespace "AWS/EC2" --measure CPUUtilization --statistic Average --dimensions "AutoScalingGroupName=my-as-group" --period 60 --lower-threshold 20 --upper-threshold 80 --lower-breach-increment=-2 --upper-breach-increment 4 --breach-duration 180. This measures the average CPU of the Auto Scaling group, scales out by 4 EC2 instances, and scales down by 2 EC2 instances; the lower CPU limit is 20% and the upper CPU limit is 80%.
  35. shutdown the auto scaling group – requires 3 commands: as-update-auto-scaling-group bbd4me-as-group --min-size 0 --max-size 0 --region us-west-2; as-describe-auto-scaling-groups bbd4me-as-group --headers --region us-west-2; as-delete-auto-scaling-group bbd4me-as-group --force-delete --region us-west-2
  36. Thank you for your attention. Questions?

Editor's Notes

  • At the start we can add as many web servers as we want. One important part is to configure the web servers to share files. This was done using rsync replication, mounting the /var/www/html/mysite/sites folder as shared on all servers (expecting that Drupal core is the same on all servers, I didn't want to share it). I like this solution since we have the source code on all web instances, not on the file storage. This makes it possible to release new "source code" (not database!) versions of Drupal modules, or to quickly change some lines on a PROD environment for debugging (as long as you block visitor traffic to that web instance, of course ;-)). Memcached: move the caching mechanism to Memcached, which stores all caching data in memory, so it no longer uses the MySQL tables. Memcached can also run in a clustered environment, so there is no need to manually flush the remote cache; the Memcache Drupal module and the memcached daemon take care of it. Because caching moves to Memcached, the databases are no longer under heavy load. Database server: the database server can be a single machine or a full MySQL cluster (depending on the amount of available $$$). File storage replication: another possible improvement to the above solution is to store all data on NAS file storage, which holds everything in the sites/##YOUR_SITE_NAME##/files directory. Compared with the previous solution, we don't need to sync data again. One disadvantage: if the NAS storage goes down, no files will be served – not by web server 1 nor by web server 2. As with the previous solution, the problem lies in single points of failure: only one load balancer and possibly one MySQL server.
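
    As a sketch, wiring Drupal 7 to the Memcache module happens in settings.php; the module path and server IPs below are hypothetical:

      $conf['cache_backends'][]    = 'sites/all/modules/memcache/memcache.inc';
      $conf['cache_default_class'] = 'MemCacheDrupal';
      // keep the form cache in the database; it must stay consistent across web servers
      $conf['cache_class_cache_form'] = 'DrupalDatabaseCache';
      $conf['memcache_servers'] = array(
        '10.0.1.10:11211' => 'default',
        '10.0.1.11:11211' => 'default',
      );
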
  • Because mod_proxy turns Apache into an (open) proxy server, and open proxy servers are dangerous both to your network and to the Internet at large, I completely disable this feature:  ProxyRequests Off <Proxy *> Order deny,allow Deny from all </Proxy>
  • So this option is in many aspects similar to the default one, with one big difference: each web server has its own db server. The reason: higher site availability – if one db server is down, the second one can continue to serve customers. Be sure to exclude some tables from replication: DrupalDB.cache%, DrupalDB.watchdog%, DrupalDB.temp_search_sids, DrupalDB.temp_search_results. Also exclude the databases that are local: mysql and test.
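
    A hypothetical my.cnf excerpt for one node of such a master-master pair (the other node would use server-id = 2 and auto_increment_offset = 2):

      [mysqld]
      server-id                   = 1
      log_bin                     = mysql-bin
      auto_increment_increment    = 2   # avoid primary-key collisions between the two masters
      auto_increment_offset       = 1
      replicate-wild-ignore-table = DrupalDB.cache%
      replicate-wild-ignore-table = DrupalDB.watchdog%
      replicate-ignore-table      = DrupalDB.temp_search_sids
      replicate-ignore-table      = DrupalDB.temp_search_results
      replicate-ignore-db         = mysql
      replicate-ignore-db         = test
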
  • Almost zero upfront infrastructure investment: If you have to build a large-scale system it may cost a fortune to invest in real estate, physical security, hardware (racks, servers, routers, backup power supplies), hardware management (power management, cooling), and operations personnel. Because of the high upfront costs, the project would typically require several rounds of management approvals before the project could even get started. Now, with utility-style cloud computing, there is no fixed cost or startup cost. Just-in-time Infrastructure: In the past, if your application became popular and your systems or your infrastructure did not scale you became a victim of your own success. Conversely, if you invested heavily and did not get popular, you became a victim of your failure. By deploying applications in-the-cloud with just-in-time self-provisioning, you do not have to worry about pre-procuring capacity for large-scale systems. This increases agility, lowers risk and lowers operational cost because you scale only as you grow and only pay for what you use. More efficient resource utilization: System administrators usually worry about procuring hardware (when they run out of capacity) and higher infrastructure utilization (when they have excess and idle capacity). With the cloud, they can manage resources more effectively and efficiently by having the applications request and relinquish resources on-demand. Usage-based costing: With utility-style pricing, you are billed only for the infrastructure that has been used. You are not paying for allocated but unused infrastructure. This adds a new dimension to cost savings. You can see immediate cost savings (sometimes as early as your next month’s bill) when you deploy an optimization patch to update your cloud application. For example, if a caching layer can reduce your data requests by 70%, the savings begin to accrue immediately and you see the reward right in the next bill. Moreover, if you are building platforms on the top of the cloud, you can pass on the same flexible, variable usage-based cost structure to your own customers.
  • Automation – "scriptable infrastructure": you can create repeatable build and deployment systems by leveraging programmable (API-driven) infrastructure. Auto-scaling: you can scale your applications up and down to match unexpected demand without any human intervention; auto-scaling encourages automation and drives more efficiency. Proactive scaling: scale your application up and down to meet your anticipated demand, with proper planning and understanding of your traffic patterns, so that you keep your costs low while scaling.
  • More efficient development lifecycle: production systems may be easily cloned for use as development and test environments, and staging environments may be easily promoted to production. Improved testability: never run out of hardware for testing; inject and automate testing at every stage during the development process. You can spin up an "instant test lab" with pre-configured environments only for the duration of the testing phase. Disaster recovery and business continuity: the cloud provides a lower-cost option for maintaining a fleet of DR servers and data storage; with the cloud, you can take advantage of geo-distribution and replicate the environment in another location within minutes.
  • AWS – Amazon Web Services: a collection of remote computing services (also called web services) that together make up a cloud computing platform, offered over the Internet by Amazon.com. EC2 – Elastic Compute Cloud: a central part of Amazon.com's cloud computing platform. EC2 allows users to rent virtual computers on which to run their own computer applications, and allows scalable deployment of applications by providing a web service through which a user can boot an Amazon Machine Image to create a virtual machine. A user can create, launch, and terminate server instances as needed, paying by the hour for active servers – hence the term "elastic". EC2 provides users with control over the geographical location of instances, which allows for latency optimization and high levels of redundancy. S3 – Simple Storage Service: an online storage web service offered by AWS; it provides storage through web services interfaces (REST, SOAP, and BitTorrent). AMI – Amazon Machine Image: a special type of virtual appliance used to instantiate (create) a virtual machine within EC2; it serves as the basic unit of deployment for services delivered using EC2.
  • EBS – Elastic Block Storage: provides raw block devices that can be attached to EC2 instances and used like any raw block device; in a typical use case this includes formatting the device with a filesystem and mounting it. In addition, EBS supports a number of advanced storage features, including snapshotting and cloning. VPC – Virtual Private Cloud: a commercial cloud computing service that provides a virtual private cloud, allowing enterprise customers to access EC2 over an IPsec-based virtual private network. Unlike traditional EC2 instances, which are allocated internal and external IP numbers by Amazon, the customer can assign IP numbers of their choosing from one or more subnets. By giving the user the option of selecting which AWS resources are public-facing and which are not, VPC provides much more granular control over security. ELB – Elastic Load Balancing. AZ – Availability Zones (data centers).
  • RDS – Amazon Relational Database Service: a distributed relational database service by Amazon.com. It is a web service running "in the cloud" that provides a relational database for use in applications, aimed at simplifying the setup, operation, and scaling of a relational database. Supported engines: MySQL, Oracle, and Microsoft SQL Server. ECU – EC2 Compute Unit: one EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor. SQS – Simple Queue Service.
  • Note: Because the set of IP addresses associated with an Elastic Load Balancer can change over time, you should never create an "A" record with any specific IP address. If you want to use a friendly DNS name for your EIP/ELB instead of the name generated by the Elastic Load Balancing service, you should create a CNAME record for the LoadBalancer DNS name, or use Amazon Route 53 to create a hosted zone.
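
    In zone-file terms the record would look like this (names hypothetical):

      ; point the site at the load balancer's DNS name, never at one of its IPs
      www.example.com.  300  IN  CNAME  my-load-balancer-1234567890.us-west-2.elb.amazonaws.com.
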
  • NAT Instance – by default the private subnets in a VPC do not have internet connectivity: they cannot be accessed over the internet, and neither can they make outbound connections to internet resources. But let's say you have set up a database on an EC2 instance in the private subnet and implemented a backup mechanism; you would want to push the backups to Amazon S3, but the private subnet cannot access S3 since there is no internet connectivity. You can achieve it by placing a NAT Instance in the VPC: 1. through the NAT Instance, outbound connectivity for private-subnet instances can be achieved, while the instances still are not reachable from the internet (inbound); 2. you need to configure the VPC routing table so that all outbound internet traffic from the private subnet goes through the NAT Instance; 3. AWS provides a ready NAT AMI (ami-f619c29f) which you can use to launch the NAT Instance; 4. you can have only one NAT Instance per VPC.
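
    The routing-table step, sketched with the legacy EC2 API tools (IDs hypothetical):

      # send the private subnet's internet-bound traffic through the NAT instance
      ec2-create-route rtb-33333333 -r 0.0.0.0/0 -i i-0123abcd
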
  • Since you can have only one NAT Instance per VPC, you need to be aware that it becomes a single point of failure in the architecture; if the architecture depends on the NAT Instance for any critical connectivity, it is an area to be reviewed. 1. You are also limited by the bandwidth of a single NAT Instance, so do not build an architecture that has heavy internet bandwidth requirements from the private subnet through NAT. 2. You can create a network topology with multiple NAT servers.
  • IP ranges – when setting up a VPC you are essentially fixing the network of the VPC, and if the VPC requires VPN connectivity (as in most cases), care should be taken to choose the IP range of the VPC so as to avoid any IP conflicts. Public and private subnets – the VPC network can be divided further into smaller network segments called subnets, and any VPC will have at least one subnet. You can set up a public subnet which will have internet connectivity: instances launched within a public subnet will have both outbound and inbound (through an EIP) internet connectivity through the Internet Gateway attached to the public subnet. Private subnets are completely locked down; they do not have internet connectivity by default. Create as many public and private subnets as your architecture requires. Place all public-facing servers, such as web servers and search servers, in the public subnet; keep DB servers, cache nodes, and application servers in the private subnet.
  • Use the simple GUI to build security groups. Divide your resources: public, web, DB, network file server (separate, or inside the web group?). VPC security groups are different from normal EC2 security groups: with EC2 security groups you can control only the ingress into your EC2 instance, while with VPC security groups you have the option to control both inbound and outbound traffic. When something is not accessible, you have to check both the inbound and outbound rules set in the VPC security group. ELB security group – when you launch an ELB within a VPC, you have the option to specify a VPC security group to be attached to the ELB (this is not available for ELBs launched outside VPC, in normal EC2). With this additional option you can control access to specific ELB ports from specific IP sources, and on the backend EC2 instances' security group you can allow access from the VPC security group that you associated with the ELB. Internal ELB – when you launch an ELB within a VPC, you also have the additional option to launch it as an "internal load balancer", which you can use to load balance your application tier from the web tier above.
