AWS migration: getting to Data Center heaven with AWS and Chef

Juan Vicente Herrera Ruiz de Alejo
@jvicenteherrera

About Me
● @jvicenteherrera
● juan.vicente.herrera@gmail.com
● http://www.linkedin.com/in/jvherrera
● http://juanvicenteherrera.eu

Description
● Social feed aggregation/recommendation app
● Client developed by a Global Fortune 500 company that makes video consoles, TVs and, many years ago, Walkmans…
● Expected by the end of 2013: around 1,000,000 newly registered users on the platform and 170,000 daily active users (DAU)
● All servers run in AWS; deployments and configuration management are handled by Chef.

System stats
Main Components
– Custom API (Java)
– Beanstalk
– RabbitMQ
– Redis
– MongoDB (sharding)
EC2
– Production env: reserved instances for the minimum configuration, on-demand instances for scale-out
– Staging env: reserved instances for ½ day
– Elastic Load Balancers
– Security Groups and ACLs
– One key pair per subnet
– Current EC2 region is US East

Architecture/Infrastructure
[Diagram: VPC subnets, each drawn with a security group. Labels: DEV, Stage APP, Stage DB, Prod APP; DNS; VPN; NAT instances for DEV, Stage and Prod; public Nexus, Git server, Chef and Jenkins; Nagios forwarder; an ELB in front of the web servers in Stage and another in Prod.]

APP and DB VPC
[Diagram: two VPC subnets, Prod APP and Prod DB, each with its own security group. Labels: MongoDB config servers; two MongoDB replica sets (set1 and set2), each with Mongodb1, Mongodb2 and an arbiter (Mongoarb); MySQL master and slave; LDAP master and slave; ELB1 in front of the APP1 servers; App2 master and slave; Solr master and slave; Varnish master and slave; Alfresco; APP2 servers; APP3 servers; Redis master and slave; Logs; RabbitMQ servers.]

Improvements achieved (I)
● The APIs are stateless, so scaling out is easy. Nodes are created with Chef (knife).
● Tight integration with Chef ensures the same configuration in every environment and avoids misconfigurations in production. Bootstrapping EC2 instances with Chef is fully integrated with knife.
● A quick, reliable way to create an exact mirror of production (staging) with Chef and CloudFormation
– Before AWS/Chef → creating a staging env took 6 weeks
– After AWS/Chef → creating a staging env takes less than 1 day

Improvements achieved (II)
● Cost savings on non-production environments
– Before AWS/Chef → environments up 24×7
– After AWS/Chef → environments up 8 hours a day on working days (cron scripts that use the API tools)
– Python script example
● Outage recovery plan based on node snapshots (MongoDB) or Chef (the other nodes, which are stateless)
● Very quick response and customized consulting for the project from the Amazon team.
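
The script behind the "8 hours on working days" schedule is not included in the deck. As a rough sketch only, the decision a cron job would take each time it runs could look like the following. The 09:00 start hour and both function names are assumptions made for illustration, and the actual start/stop calls to the API tools are deliberately left out.

```python
from datetime import datetime

WORK_START_HOUR = 9  # assumption: the working day starts at 09:00
WORK_HOURS = 8       # "8 hours / working days" from the slide


def should_be_running(now: datetime) -> bool:
    """Return True if a non-production environment should be up at `now`."""
    is_working_day = now.weekday() < 5  # Monday=0 ... Friday=4
    in_window = WORK_START_HOUR <= now.hour < WORK_START_HOUR + WORK_HOURS
    return is_working_day and in_window


def planned_action(now: datetime, currently_running: bool) -> str:
    """Decide what the cron job should do: 'start', 'stop' or 'none'."""
    wanted = should_be_running(now)
    if wanted and not currently_running:
        return "start"
    if not wanted and currently_running:
        return "stop"
    return "none"
```

With this schedule an environment is up 40 of the 168 hours in a week, roughly a quarter of the 24×7 figure.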

Example: create a new node
Staging example with dynamic IP (DHCP)
knife ec2 server create -I ami-af71f8c6 -r "role[apache]" -f m1.medium \
    --region us-east-1 -S scp-staging -i /Users/juanvi/keypairs/scp-staging.pem \
    -g sg-2418e54b -s subnet-919cecfc -x ec2-user -N stapp-apache-Test -E staging
Staging example with static IP
ec2-run-instances ami-af71f8c6 -k vpc-public-10-234-1 -g sg-379e6d58 \
    -s subnet-cb9596a0 -t m1.xlarge --private-ip-address 10.234.2.204
knife bootstrap 10.234.2.204 -i /Users/juanvi/keypairs/scp-staging.pem \
    -r "role[webserver]" -N STAGING-public-webserver2 -x ec2-user -E staging --sudo
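
When several identical nodes are needed for a scale-out, the same knife call can be generated rather than typed. The sketch below only builds the argument lists and runs nothing. It uses the flags that appear in the dynamic-IP example, while the function names and the numbered node-name scheme are hypothetical.

```python
import shlex


def knife_create_command(node_name, role, flavor, ami, subnet, security_group,
                         ssh_key, identity_file, environment,
                         region="us-east-1", ssh_user="ec2-user"):
    """Build the argv list for one `knife ec2 server create` call."""
    return [
        "knife", "ec2", "server", "create",
        "-I", ami, "-r", f"role[{role}]", "-f", flavor, "--region", region,
        "-S", ssh_key, "-i", identity_file, "-g", security_group,
        "-s", subnet, "-x", ssh_user, "-N", node_name, "-E", environment,
    ]


def scale_out_commands(count, name_prefix, **node_options):
    """Return one command per new node, named <prefix>-1 ... <prefix>-<count>."""
    return [
        knife_create_command(node_name=f"{name_prefix}-{i}", **node_options)
        for i in range(1, count + 1)
    ]


if __name__ == "__main__":
    commands = scale_out_commands(
        2, "stapp-apache", role="apache", flavor="m1.medium",
        ami="ami-af71f8c6", subnet="subnet-919cecfc",
        security_group="sg-2418e54b", ssh_key="scp-staging",
        identity_file="/Users/juanvi/keypairs/scp-staging.pem",
        environment="staging",
    )
    for argv in commands:
        print(shlex.join(argv))  # review the commands before running them
```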

What we have learned
● Running servers in more than one availability zone is strongly recommended, to avoid total downtime if one zone has an outage (for example us-east-1a and us-east-1d).
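
One simple way to follow the multi-zone advice is to place nodes round-robin across the zones. This is an illustrative sketch, not the project's tooling: the node names and both helper functions are hypothetical, and only the two zone names come from the slide.

```python
from itertools import cycle


def assign_zones(node_names, zones):
    """Spread nodes across availability zones in round-robin order."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    return dict(zip(node_names, cycle(zones)))


def survivors(placement, failed_zone):
    """Return the nodes still available if `failed_zone` goes down."""
    return [node for node, zone in placement.items() if zone != failed_zone]


placement = assign_zones(["api1", "api2", "api3", "api4"],
                         ["us-east-1a", "us-east-1d"])
# Losing us-east-1a still leaves half of the API nodes serving traffic.
remaining = survivors(placement, "us-east-1a")
```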

What we have learned (II)
● For some balanced services, use TCP instead of HTTP. Balancing internal requests across our API nodes over TCP solved problems caused by HTTP requests whose sessions were never closed. We use HTTP balancing only for requests that reach the public Apache.
● With HTTP balancing mode, many Apache connections were not closed properly, and within a few hours we reached the connection limit. Switching the ELB to TCP balancing mode solved it.
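
The listener protocol is a property of the load balancer definition, so the TCP/HTTP split can be written down explicitly. The sketch below builds classic ELB resources in CloudFormation's JSON shape. The port numbers, the helper name and the reuse of the subnet IDs from the staging examples are assumptions, not the project's real settings.

```python
import json


def elb_resource(subnet_ids, protocol, port, internal):
    """Build a classic ELB resource with a single listener."""
    if protocol not in ("TCP", "HTTP"):
        raise ValueError("this sketch only handles TCP and HTTP listeners")
    properties = {
        "Subnets": list(subnet_ids),
        "Listeners": [{
            "LoadBalancerPort": str(port),
            "InstancePort": str(port),
            "Protocol": protocol,
        }],
    }
    if internal:
        properties["Scheme"] = "internal"  # not reachable from the internet
    return {
        "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
        "Properties": properties,
    }


# Internal API traffic balanced over TCP (port 8080 is an assumed value).
api_elb = elb_resource(["subnet-919cecfc"], "TCP", 8080, internal=True)
# Public Apache traffic balanced over HTTP.
web_elb = elb_resource(["subnet-cb9596a0"], "HTTP", 80, internal=False)

print(json.dumps({"ApiElb": api_elb, "WebElb": web_elb}, indent=2))
```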

What we have learned (III)
● Use CloudFormation to create network resources automatically.
– Before CloudFormation → every resource was created one by one
– After CloudFormation → all the nodes and network resources of an entire environment are created in a single run
– CloudFormation example
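
The deck's own CloudFormation example is not reproduced in the slides. To give an idea of what "one execution" covers, here is a minimal sketch that generates a template with a VPC plus one subnet and one security group per tier. The CIDR ranges, logical names and the helper function are assumptions chosen to resemble the 10.234.x.x addresses used elsewhere in the deck.

```python
import json


def network_template(vpc_cidr, subnets):
    """Build a CloudFormation template: one VPC, then a subnet and a
    security group for every entry of `subnets` (name -> (cidr, zone))."""
    resources = {
        "Vpc": {"Type": "AWS::EC2::VPC", "Properties": {"CidrBlock": vpc_cidr}},
    }
    for name, (cidr, zone) in subnets.items():
        resources[f"{name}Subnet"] = {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                "VpcId": {"Ref": "Vpc"},
                "CidrBlock": cidr,
                "AvailabilityZone": zone,
            },
        }
        resources[f"{name}SecurityGroup"] = {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {
                "GroupDescription": f"{name} subnet",
                "VpcId": {"Ref": "Vpc"},
            },
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


template = network_template(
    "10.234.0.0/16",  # assumed VPC range
    {
        "StagingApp": ("10.234.2.0/24", "us-east-1a"),
        "StagingDb": ("10.234.3.0/24", "us-east-1d"),
    },
)
print(json.dumps(template, indent=2))
```

The resulting JSON can be saved to a file and used as the template body of a stack, so a whole environment's network is created or torn down together.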

What we have learned (IV)
● Analyze performance tests to choose the minimum number of nodes that will run 24×7, and their sizes, so you know which instances to reserve. Reserved instances reduce the cost to 2/3.
– Before AWS/Chef → performance tests were limited because servers were too expensive to provision; tests were simulated.
– After AWS/Chef → high-powered instances are available on a pay-per-use basis for just a few hours or days, at reduced cost
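
A short worked calculation shows how the reserved baseline and the on-demand burst add up. It reads "reduce the cost to 2/3" as "a reserved instance costs two thirds of the on-demand price". The hourly price, node counts and function name below are hypothetical.

```python
HOURS_PER_MONTH = 730    # average month: 8760 hours / 12
RESERVED_FACTOR = 2 / 3  # reserved cost relative to on-demand, per the slide


def monthly_cost(baseline_nodes, burst_nodes, burst_hours, on_demand_hourly):
    """Reserved baseline running 24x7 plus on-demand nodes for a few hours."""
    reserved = (baseline_nodes * HOURS_PER_MONTH
                * on_demand_hourly * RESERVED_FACTOR)
    burst = burst_nodes * burst_hours * on_demand_hourly
    return reserved + burst


# Hypothetical figures: 3 baseline nodes at $0.30/hour on demand,
# plus 4 extra nodes for a 10-hour performance test.
with_reserved = monthly_cost(3, 4, 10, 0.30)
all_on_demand = 3 * HOURS_PER_MONTH * 0.30 + 4 * 10 * 0.30
print(f"reserved baseline: ${with_reserved:.2f}, all on demand: ${all_on_demand:.2f}")
```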

What we have learned (V)
● Prefer a large number of small instances running close to 100% CPU usage over a few powerful machines with wasted resources; when load increases, launch new nodes and balance requests among them.
● Ask for the load balancers to be pre-warmed if you expect an exponential increase in requests
● Ask support to raise the initial limit on the number of EC2 instances that can run simultaneously (20)
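
Sizing a fleet of small instances, and checking it against the initial limit of 20, is simple arithmetic. The following sketch assumes the per-node capacity is known from performance tests. The capacity figures, the 90% utilization target and both function names are illustrative.

```python
import math

DEFAULT_INSTANCE_LIMIT = 20  # initial EC2 limit quoted on the slide


def nodes_needed(requests_per_second, capacity_per_node, target_utilization=0.9):
    """Smallest number of nodes that keeps each one at or below the target."""
    if capacity_per_node <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("capacity must be positive and utilization in (0, 1]")
    usable_capacity = capacity_per_node * target_utilization
    return max(1, math.ceil(requests_per_second / usable_capacity))


def needs_limit_increase(planned_nodes, limit=DEFAULT_INSTANCE_LIMIT):
    """True if the plan exceeds the account's instance limit."""
    return planned_nodes > limit


normal_load = nodes_needed(1000, 100)  # hypothetical: 100 req/s per small node
peak_load = nodes_needed(2500, 100)
if needs_limit_increase(peak_load):
    print(f"{peak_load} nodes planned: ask support to raise the limit first")
```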

Things to consider
• You must adapt to the instance sizes, whose resources (CPU, RAM...) are predefined and not customizable
• You have no control over the evolution of the products your service depends on
• You don't have access to the logs of some instances (for example load balancers)
• Danger of lock-in to AWS services and the consequent difficulty of migrating to another data center.

Thanks for your attention
● @jvicenteherrera
● juan.vicente.herrera@gmail.com
● http://www.linkedin.com/in/jvherrera
● http://juanvicenteherrera.eu
