OSOM Operations in the Cloud

Transcript

  • 1. Operations in the Cloud. Marius Stuparu, DevOps @ SDL Language Weaver
  • 2. Summary ● The Cloud, Amazon Web Services ● Best Practices in the Cloud ● High Availability and Scalability in the Cloud ● Alternative Open Source Solutions
  • 3. What is cloud computing? IaaS, PaaS, SaaS
  • 4. Who is using it?
  • 5. Why? - Zynga thought 200 thousand daily active users on FarmVille would count as a success (2009). - 1 million net new users every single week - Within a year, FarmVille had more than 50 million monthly active users - CityVille on AWS was able to scale up to ~60 million monthly active users in the first 2 months!
  • 6. AWS ● Amazon started really simple ● EC2 - virtual machine service ● S3 - storage service
  • 7. Elastic Block Store (EBS) ● local storage is volatile ● use EBS for persistent storage (network-accessible block storage volumes) ● try to keep persistent data on S3 or RDS; EBS performance varies
  • 8. Elastic Block Store (EBS)
  • 9. Elastic Load Balancers. Cool things: ● ELB across Availability Zones ● SSL termination
  • 10. Relational Database Service ● RDS (Multi-AZ availability, fail-over ~5 min) ● Easy to launch replicas and offload read traffic (3 clicks away) ● Backup using PITR (point-in-time recovery), snapshots
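Offloading read traffic to replicas usually needs a small routing decision in the application. The sketch below is one possible way to do it, not the presenter's code: the `ReadWriteRouter` class and the `*.db.example` host names are hypothetical, and it assumes a statement is a read if it starts with `SELECT`. Replicas may lag behind the primary, so reads that must see a just-written row should still go to the primary.

```python
import itertools


class ReadWriteRouter:
    """Route writes to the primary and rotate reads across replicas.

    Hypothetical helper for illustration; endpoints are plain strings.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        words = sql.split(None, 1)
        # Assumption: only statements starting with SELECT count as reads.
        if words and words[0].upper() == "SELECT":
            return next(self._readers)
        return self.primary


router = ReadWriteRouter(
    "primary.db.example",
    ["replica-1.db.example", "replica-2.db.example"],
)
print(router.endpoint_for("SELECT * FROM users"))
print(router.endpoint_for("UPDATE users SET name = 'x'"))
```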
  • 11. ● Infrastructure as code ● Configuration management ● Orchestration ● Automated Provisioning/Auto Scaling ● Repeatable/Reproducible Servers (cloning your servers)
  • 12. April 21, 2011: Server Down
  • 13. Eliminate Single Points of Failure ● architect around these problems ● decouple your components (queues) ● build asynchronous systems and scale horizontally ● make your applications as stateless as possible ● use multiple cloud providers (AWS, Rackspace, GoGrid, Linode)
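The queue-based decoupling on this slide can be sketched in a single process. This is only an illustration under stated assumptions: an in-memory `queue.Queue` stands in for a hosted queue service, the function name `run_pipeline` is hypothetical, and the "work" is a placeholder doubling of each job. The point is that producers only enqueue, workers hold no state of their own, and adding workers scales the consumer side horizontally.

```python
import queue
import threading


def run_pipeline(jobs, worker_count=3):
    """Process jobs through a queue with a pool of stateless workers."""
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = work.get()
            if job is None:  # sentinel: no more work for this worker
                work.task_done()
                break
            processed = job * 2  # placeholder for real processing
            with results_lock:
                results.append(processed)
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for job in jobs:  # the producer only enqueues and never waits on a worker
        work.put(job)
    for _ in threads:  # one sentinel per worker
        work.put(None)
    work.join()
    for t in threads:
        t.join()
    return sorted(results)


print(run_pipeline([1, 2, 3, 4]))
```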
  • 14. Netflix Chaos Monkey
  • 15. All about being fast ● Cache database responses, objects, fully formed HTML (ElastiCache) ● CDN ● Follow the Sun ● Try to touch metal only when necessary; use local storage or SAN, avoid NFS
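Caching database responses typically follows a cache-aside pattern: look in the cache first and query the database only on a miss or after the entry expires. The sketch below keeps the cache in a local dictionary purely for illustration; a shared cache such as the one named on the slide would replace the dictionary. `TTLCache`, `get_or_load`, and `slow_query` are hypothetical names.

```python
import time


class TTLCache:
    """Cache-aside helper with per-entry expiry (in-process sketch)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # hit: the loader (the database) is not touched
        value = loader(key)  # miss or expired: go to the source
        self._store[key] = (value, now + self.ttl)
        return value


def slow_query(user_id):
    # Stand-in for a real database call.
    return {"id": user_id, "name": "user-%s" % user_id}


cache = TTLCache(ttl_seconds=60)
print(cache.get_or_load(7, slow_query))
print(cache.get_or_load(7, slow_query))  # served from the cache
```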
  • 16. DNS Management ● Route 53 LBR ● LBR = Latency Based Routing ● what it does: routes each user to the closest (lowest-latency) server that runs your application
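The selection rule behind latency-based routing can be illustrated with a few lines of code. This is a toy model and not how the DNS service is implemented: the function `pick_region`, the region names, and the latency figures are all assumptions made for the example. It only shows the idea of answering with the lowest-latency region among those that are currently healthy.

```python
def pick_region(latencies_ms, healthy=None):
    """Return the region with the lowest measured latency.

    latencies_ms: mapping of region name -> latency in milliseconds
    healthy: optional set of regions allowed to receive traffic
    """
    candidates = {
        region: ms
        for region, ms in latencies_ms.items()
        if healthy is None or region in healthy
    }
    if not candidates:
        raise ValueError("no healthy region available")
    return min(candidates, key=candidates.get)


# Hypothetical measurements for one user.
measured = {"us-east-1": 120.0, "eu-west-1": 35.0, "ap-southeast-1": 240.0}
print(pick_region(measured))
print(pick_region(measured, healthy={"us-east-1", "ap-southeast-1"}))
```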
  • 17. Monitor and graph everything ● RightScale collectd, AWS CloudWatch ● New Relic ● Pingdom, Catchpoint, Uptrends ● Nagios, Cacti, Zabbix ● Splunk
  • 18. AWS CloudWatch
  • 19. New Relic - RUM
  • 20. New Relic - Application Monitoring
  • 21. Open Source Alternatives ● Foreman (no logo) - web UI for Puppet
  • 22. Other useful tools ● Git (GitHub) ● Vagrant ● If you have a Python stack, look at boto (Python interface to Amazon Web Services)
  • 23. Q&A
  • 24. Thank You!
  • 25. Slides: http://www.slideshare.net/mstuparu/osom-operations-in-the-cloud Contact information: marius@ec2.ro / mstuparu@sdl.com