Rails infrastructure

Presentation Transcript

  • Rails Infrastructure
    http://omarqureshi.net
    @omarqureshi
  • Topics Covered
      • Lots of facepalm
      • Rackspace
      • Linux distribution choices
      • Automation and Orchestration
      • Logging
  • Edison Nation
      • Distributed team (US/Canada/UK)
      • Old (2009) and poorly maintained application
      • Rails 2.3 app
      • (previous) focus on churn
      • 3 Rails developers (+ 1 designer and an intern)
      • >100,000 members
      • Little in-house sysadmin experience
  • Additional Quirks
      • Used Ruby 1.8.7, since God does not play nicely with Ruby Enterprise Edition and we couldn’t use 1.9 because of Rails 2.3
      • Provisioning process was terribly slow
      • Very little caching
      • Quite a lot of server-generated JS
  • SURPRISE!
  • Featured on Nightline
      • No warning (announced pretty late EST)
      • No preparation time (engineers had already signed off for the night)
      • Couldn’t provision servers in time to deal with the traffic spike (and we would have needed a lot of them)
  • Load balancer recorded 3000 concurrent requests including assets, or around 300 excluding assets
  • The Stack
  • Figuring out the bottlenecks
  • Nginx kept serving, though the responses were 502 errors
  • Post-mortem of the requests that did make it through made it look like the application servers were to blame
  • Database was under heavy load but by no means the bottleneck
  • Make better use of the application server pool
  • Got some quick wins in the code by caching more and moving jQuery to Google
  • <script src="//ajax.googleapis.com/ajax/libs/jquery/1.6.2/jquery.min.js"></script>
  • Get rid of any server-generated JS
  • Pretty much re-trained myself to be a systems administrator
  • Completely re-think the way we do Operations
  • What components make up a solid multi-server setup?
  • Load balancing
  • TLS SNI Extension
  • Theoretically only have two load balancers for ALL domains
  • Simplified SSL Nginx config
        server {
            listen 443 ssl;
            server_name www.edisonnation.com;
            ssl_certificate /path/to/cert/en.com.cert;
            ssl_certificate_key /path/to/cert/en.com.key;
        }

        server {
            listen 443 ssl;
            server_name www.edisonnation.vn;
            ssl_certificate /path/to/cert/en.vn.cert;
            ssl_certificate_key /path/to/cert/en.vn.key;
        }
  • Windows XP + Internet Explorer
      • Internet Explorer 6-8 on Windows XP does not support SNI, unlike modern OS + browser combinations
      • Ignores the server name for HTTPS
      • Will give you an invalid SSL certificate error when browsing
  • Rackspace (v2) Load Balancer
      • SSL termination at the Load Balancer
          • No need to serve HTTPS traffic from Nginx any more - X-Forwarded-Proto tells Rails if the page is supposed to be encrypted
          • Less processing required here
          • Less complexity managing certificates and Nginx configs
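
With SSL terminated at the load balancer, the application only sees plain HTTP and has to trust the X-Forwarded-Proto header to know what the client used. A minimal sketch of that check follows, assuming a Rack-style environment hash; the helper name `forwarded_https?` is hypothetical and is not something Rails or Rackspace provides under that name.

```ruby
# Returns true when the load balancer reports that the client connected over HTTPS.
# `env` is a Rack-style environment hash, where request headers carry an HTTP_ prefix.
def forwarded_https?(env)
  # The header can hold a comma-separated list when several proxies are chained;
  # this sketch assumes the first entry is the one closest to the client.
  proto = env["HTTP_X_FORWARDED_PROTO"].to_s.split(",").first.to_s.strip.downcase
  proto == "https"
end

puts forwarded_https?({ "HTTP_X_FORWARDED_PROTO" => "https" })
puts forwarded_https?({ "HTTP_X_FORWARDED_PROTO" => "http" })
puts forwarded_https?({})
```

The header is only trustworthy if the application servers accept traffic solely from the load balancer; otherwise a client could set it directly.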
  • Split up the application servers
  • Move Nginx to its own machine and reverse proxy back to Unicorn app servers
  • New stack
  • Switch Unicorn to use TCP sockets rather than Unix sockets
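
Once Nginx runs on a separate machine it can no longer reach a Unix domain socket on the app server, so Unicorn has to listen on a TCP port. A sketch of what that change might look like in a Unicorn config file; the path, port, worker count and timeout are all assumed values, not the ones from the talk.

```ruby
# config/unicorn.rb (hypothetical path and values)
worker_processes 4

# Before: Nginx on the same machine talked to Unicorn over a Unix domain socket.
# listen "/tmp/unicorn.sock", :backlog => 64

# After: Nginx lives on another machine, so Unicorn listens on a TCP port instead.
listen "0.0.0.0:8080", :backlog => 64, :tcp_nopush => true

timeout 30
```

Binding to a private interface address rather than 0.0.0.0 is usually the safer choice when the app servers should only be reachable from the Nginx machine.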
  • Linux
  • Debian Squeeze
  • Why Debian?
      • Pick the most stable distribution
      • Debian is pretty stable, plus you can use Lucid Lynx packages for anything cutting edge that you need
      • However, God requires you to use a custom kernel before it will work properly: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=609004
  • Ubuntu LTS is also a viable choice, as is any RHEL
  • Basically, anything where the packages aren’t crazy and support is still there (not Arch/Fedora/non-LTS Ubuntu)
  • Packaging
  • We don’t image servers (but may start doing so)
  • Provisioning tools should be able to build a server on any hardware
  • Never build from source
      • Either package yourself or get from a reliable source
      • Ditch RVM (though they now have binary rubies - anyone tried?)
      • Check out Brightbox Next Generation Ubuntu packages: http://wiki.brightbox.co.uk/docs:ruby-ng
  • Pin everything else
        Package: *
        Pin: release a=squeeze-backports
        Pin-Priority: 200

        Package: puppet
        Pin: release a=squeeze-backports
        Pin-Priority: 900

        Package: puppet-common
        Pin: release a=squeeze-backports
        Pin-Priority: 900
  • Server build time decreased from 45 minutes to < 15 minutes
  • How do we provision servers?
  • A small bash script + Puppet
  • Bash script does basic pinning and installs essential packages (Ruby + Emacs + Puppet + puppet-el)
  • Works very well since we use Hetzner EX4S servers for non-critical systems
  • Hetzner + (Xen/OpenVZ) == FANTASTIC
  • (See me at the end if you want to talk about provisioning some more)
  • Managing Puppet
  • Puppet is always running rather than run on demand
  • Encourage developers to document infrastructure changes
  • Still unsure about how to go about Puppet testing
  • Campfire reporting
  • Orchestration
  • MCollective
  • STOMP server connects all of our servers together
  • MCollective executes Remote Procedure Calls
  • Great for pushing out urgent Puppet updates
  • Also great for Munin
        #!/bin/bash
        str="includedir /etc/munin/munin-conf.d"
        for addr in `/usr/bin/mco facts ipaddress | awk '{gsub("found", ""); print $1}' | grep "^[0-9]"`
        do
          fqdn=`/usr/bin/mco facts fqdn -F ipaddress=$addr | grep "^W" | awk '{print $1}'`
          str="$str
        [$fqdn]
            address $addr
            use_node_name yes
        "
        done
        echo "$str" > /etc/munin/munin.conf
        /usr/sbin/service munin-node restart
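
The generation step in that script can be separated from the MCollective lookup, which makes it easier to test. The sketch below is an illustration rather than the author's code: it assumes the fqdn-to-address pairs have already been collected (for example from `mco facts`) into a hash, and the function name `munin_conf` and the example hosts are hypothetical.

```ruby
# Build the text of munin.conf from a { fqdn => ip_address } map.
# The includedir default matches the path used in the slide's script.
def munin_conf(nodes, includedir = "/etc/munin/munin-conf.d")
  lines = ["includedir #{includedir}", ""]
  # Sort by fqdn so the generated file is stable between runs.
  nodes.sort.each do |fqdn, addr|
    lines << "[#{fqdn}]"
    lines << "    address #{addr}"
    lines << "    use_node_name yes"
    lines << ""
  end
  lines.join("\n")
end

# Example hosts are made up for illustration.
nodes = { "web1.example.com" => "10.0.0.11", "app1.example.com" => "10.0.0.21" }
puts munin_conf(nodes)
```

Sorting the nodes keeps the file stable between runs, so a diff only shows servers that were actually added or removed.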
  • No longer have to manually maintain Munin
  • Can be used for other painful tasks, such as making sure packages are up to date on all the servers
  • RPC libraries are written in Ruby
  • Service management
  • M/Monit
  • Not free; however, extremely worthwhile. Can hook into shell scripts
  • Log management
  • Graylog2
  • Java JAR with a Rails frontend and Elasticsearch + Mongo backend
  • Deals with exception management
  • Can do analytics on logs
  • Specify streams of logs (e.g. 404 errors)
  • No longer have to juggle lots of files which exist on different machines
  • A little tricky to set up
  • Use the gelf-rb gem sparingly in your Rails app and NOT as your main logger
  • Found out that the log requests were not threaded
  • For us, gelf-rb ONLY sends exception notifications
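
To make "only exception notifications" concrete, here is a rough sketch of the kind of message that ends up in Graylog for a single exception. It does not use gelf-rb itself: the payload is built by hand from the field names in the GELF specification, and the function name, the "1.1" version string, the `_facility` value and the host name are all assumptions for illustration.

```ruby
require "json"
require "socket"

# Build a GELF-style payload for one exception (hypothetical helper, not gelf-rb's API).
def gelf_for_exception(error, host = Socket.gethostname)
  {
    "version"       => "1.1",
    "host"          => host,
    "short_message" => "#{error.class}: #{error.message}",
    # backtrace is nil for an exception that was never raised
    "full_message"  => (error.backtrace || []).join("\n"),
    "timestamp"     => Time.now.to_f,
    "level"         => 3,        # syslog severity: error
    "_facility"     => "rails"   # additional fields carry a leading underscore
  }
end

begin
  raise ArgumentError, "bad input"
rescue => e
  puts JSON.generate(gelf_for_exception(e, "app1.example.com"))
end
```

Because a message like this is only built when something actually fails, a slow or blocking send affects very few requests, which is presumably why limiting the gem to exceptions was acceptable.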
  • Introducing Logstashd
  • Written by the awesome Jordan Sissel (FPM)
  • Nginx doesn’t support sending to Graylog out of the box
  • Logstashd acts as a log tailing and transporting mechanism
  • Runs in its own process, so threading doesn’t matter so much
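
The "tailing" half of that job amounts to turning each new access-log line into structured fields before shipping it on. A small sketch of that parsing step, assuming Nginx's default `combined` log format; the constant and function names are hypothetical, and the transport to Graylog is deliberately left out.

```ruby
# Matches one line of Nginx's default "combined" access log format.
COMBINED = /\A(\S+) - (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+) "([^"]*)" "([^"]*)"\z/

# Returns a hash of fields, or nil when the line does not match the format.
def parse_access_line(line)
  m = COMBINED.match(line.chomp)
  return nil unless m
  {
    "remote_addr" => m[1],
    "time_local"  => m[3],
    "request"     => m[4],
    "status"      => m[5].to_i,
    "bytes"       => m[6].to_i,
    "referer"     => m[7],
    "user_agent"  => m[8]
  }
end

line = '203.0.113.7 - - [10/Oct/2012:13:55:36 +0000] "GET /missing HTTP/1.1" 404 162 "-" "curl/7.26.0"'
event = parse_access_line(line)
puts event["status"]
puts event["request"]
```

With the status available as a field, something like a stream of 404 errors becomes a simple filter on the receiving side.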
  • What’s left?
  • Upgrade to Rails 3
  • Great benefits with Rails 3 such as Dalli for memcached failovers and Lograge
  • Oh yeah - asset pipeline!
  • Implement read slaves for backups
  • Make Jenkins do our deployment
  • Better caching solutions - maybe Varnish / conditional GET
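
Conditional GET lets the server skip sending a body when the client already holds the current version. The sketch below shows the idea with an ETag outside of any framework; the function name `conditional_get` and the use of an MD5 digest of the body as the ETag are assumptions for illustration.

```ruby
require "digest/md5"

# Returns [status, headers, body] in the style of a Rack response.
# `if_none_match` is the value of the client's If-None-Match header, or nil.
def conditional_get(body, if_none_match)
  etag = %("#{Digest::MD5.hexdigest(body)}")
  if if_none_match == etag
    # Client copy is current: send headers only.
    [304, { "ETag" => etag }, ""]
  else
    [200, { "ETag" => etag }, body]
  end
end

page = "<h1>Hello</h1>"
status, headers, _body = conditional_get(page, nil)
puts status
status, _headers, _body = conditional_get(page, headers["ETag"])
puts status
```

In a Rails controller the same behaviour is available through `fresh_when` and `stale?`, which also avoid rendering the view when the response would be a 304.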
  • Re-implement TLS SNI once Windows XP security updates stop
  • Handle large spikes better
  • Autoscaling?
  • Using AWS as an additional cloud failover
  • Hybrid Dedicated and Cloud for production