spkr8.com/t/13191



                        Ops for Developers
                Or: How I Learned To Stop Worrying And Love The Shell

                              Ben Klang
                        bklang@mojolingo.com




Friday, August 10, 12
Prologue

                        Introductions



Who Am I?
                              Ben Klang
                        @bklang Github/Twitter
                        bklang@mojolingo.com




What are my passions?

                        • Telephony Applications
                        • Information Security
                        • Performance and Availability Design
                        • Open Source

What do I do?


                        • Today I write code and
                          run Mojo Lingo
                        • But Yesterday...


This was my world




Ops Culture



“I am allergic to downtime”



It’s About Risk

                        • If something breaks, it will be my pager that
                          goes off at 2am
                        • New software == New ways to break
                        • If I can’t see it, I can’t manage it or monitor
                          it and it will break



Agenda
                        •   9:00 - 10:30
                            •   Operating Systems & Hardware
                            •   All About Bootup
                        •   10:30 - 11:00: Break
                        •   11:00 - 12:30
                            •   Observing a Running System
                            •   Optimization/Tuning
                        •   12:30 - 1:30: Lunch
                        •   1:30 - 3:00
                            •   Autopsy of an HTTP Request
                            •   Dealing with Murphy
                        •   3:00 - 3:30: Break
                        •   3:30 - 5:00
                            •   Scaling Up
                            •   Deploying Apps
                            •   Audience Requests

Part I

                        Operating Systems &
                            Hardware


OS History Lesson
                         BSD, System V, Linux and Windows




                        • UNICS (Sep. 1969), soon renamed “Unix Time
                          Sharing System Version 1”
                        • UNIX Time Sharing System Version 5 (Jun. 1974)
                          • UNIX Sys III (Nov. 1981), then
                            UNIX Sys V (Jan. 1983)
                          • 1BSD (Mar. 1978), then
                            4.3BSD (Jun. 1986)




Hardware Components



Common Architectures
                        • Intel x86 (i386, x86_64)
                        • SPARC
                        • POWER
                        • ARM

                        • But none of this really matters anymore
CPU Configurations

                        • Individual CPU
                        • SMP: Symmetric Multi-Processing
                        • Multiple Cores
                        • Hyperthreading/Virtual Cores

(Virtual) Memory
                        • RAM + Swap = Available Memory
                        • Swapping strategies vary across OSes
                        • What your code sees is a complete
                          virtualization of this
                        • x86/32-bit processes can typically only “see”
                          3GB of RAM from a 4GB address space (the
                          kernel reserves the rest)
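
The arithmetic behind that last bullet can be checked in a few lines. This is only a sketch: the 1GB kernel reservation is an assumption (it matches the common Linux 3G/1G split; other operating systems and kernel builds divide the space differently).

```python
# Address-space arithmetic for a 32-bit process.
GIB = 2 ** 30

ADDRESS_BITS = 32
# Assumption: the kernel reserves 1 GiB of every process's address space,
# as in the common Linux 3G/1G split. Other systems use other splits.
KERNEL_RESERVED = 1 * GIB

total_address_space = 2 ** ADDRESS_BITS
user_visible = total_address_space - KERNEL_RESERVED

print(total_address_space // GIB)  # 4
print(user_visible // GIB)         # 3
```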


Storage Types

                        • Local Storage (SATA, SAS, USB, Firewire)
                        • Network Storage (NFS, SMB, iSCSI, AOE)
                        • Storage Network (FibreChannel, Fabrics)


Networking
                        • LAN (100Mbit still common; 1Gbit standard;
                          10Gbit and 100Gbit on horizon)
                        • WAN (T-1, Frame Relay, ATM, MetroE)
                        • Important Characteristics
                         • Throughput
                         • Loss
                         • Delay
Part II

                        All About Bootup



Phases

                        • BIOS
                        • Kernel Bootstrap
                        • Hardware Detection
                        • Init System

System Services
                        • Varies by OS
                        • Common: SysV Init Scripts; /etc/inittab; rc.local
                        • Solaris: SMF
                        • Ubuntu: Upstart
                        • Debian: SysV default; Upstart optional
                        • OSX: launchd
                        • RedHat/CentOS: SysV Init Scripts
SysV Init Scripts
                        • Created in /etc/init.d; Symlinked into
                          runlevel directories
                        • Symlinks prefixed with special characters to
                          control startup/shutdown order
                          • Prefixed with “S” or “K” to start or stop
                            service in each level
                          • Numeric prefix determines order
                        • /etc/rc3.d/S10sshd -> /etc/init.d/sshd
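
The naming convention above can be made concrete with a small parser. This is a minimal sketch, not how init itself is implemented; the helper names (parse_rc_link, start_order) and the sample link names other than S10sshd are made up for illustration.

```python
import re
from typing import NamedTuple


class RcLink(NamedTuple):
    action: str   # "start" for S links, "stop" for K links
    order: int    # numeric prefix; lower numbers run first
    service: str  # name of the script in /etc/init.d


_RC_NAME = re.compile(r"^([SK])(\d+)(.+)$")


def parse_rc_link(name: str) -> RcLink:
    """Split a runlevel symlink name such as 'S10sshd' into its parts."""
    match = _RC_NAME.match(name)
    if match is None:
        raise ValueError(f"not an S/K runlevel link: {name!r}")
    kind, order, service = match.groups()
    return RcLink("start" if kind == "S" else "stop", int(order), service)


def start_order(names):
    """Return the services started in a runlevel, in execution order."""
    links = [parse_rc_link(name) for name in names]
    starts = [link for link in links if link.action == "start"]
    starts.sort(key=lambda link: (link.order, link.service))
    return [link.service for link in starts]


# Hypothetical contents of a runlevel directory.
print(start_order(["S55sshd", "K20nfs", "S10network", "S12syslog"]))
# ['network', 'syslog', 'sshd']
```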
rc.local

                        • Single “dumb” startup script
                        • Run at end of system startup
                        • Quick/dirty mechanism to start something
                          at bootup




/etc/inittab

                        • The original process supervisor
                        • Not (easily) scriptable
                        • Starts a process in a given runlevel
                        • Restarts the process when it dies
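
Each /etc/inittab entry is a single line of the form id:runlevels:action:process, and the "respawn" action is what gives the restart-on-death behavior. The sketch below parses such lines; the helper names and the sample entries (including the myapp path) are hypothetical.

```python
from typing import NamedTuple, Optional


class InittabEntry(NamedTuple):
    ident: str      # short unique label for the entry
    runlevels: str  # e.g. "2345": runlevels in which the entry applies
    action: str     # e.g. "respawn", "wait", "initdefault"
    process: str    # command line to run (may be empty)


def parse_inittab_line(line: str) -> Optional[InittabEntry]:
    """Parse one 'id:runlevels:action:process' line; None for blanks/comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    # The process field may itself contain colons, so split at most 3 times.
    parts = stripped.split(":", 3)
    if len(parts) != 4:
        raise ValueError(f"malformed inittab line: {line!r}")
    return InittabEntry(*parts)


def respawned_in(entries, runlevel: str):
    """Commands that init keeps alive (restarts on exit) in a runlevel."""
    return [e.process for e in entries
            if e.action == "respawn" and runlevel in e.runlevels]


# Hypothetical sample file contents.
SAMPLE = """\
# default runlevel
id:3:initdefault:
1:2345:respawn:/sbin/mingetty tty1
ma:345:respawn:/usr/local/bin/myapp --foreground
"""

entries = [e for e in map(parse_inittab_line, SAMPLE.splitlines()) if e]
print(respawned_in(entries, "3"))
```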

Supervisor Processes

                        • Solaris SMF
                        • Ubuntu Upstart
                        • OSX launchd
                        • daemontools

Ruby Integrations

                        • Supervisor Processes
                         • Bluepill
                         • God
                        • Startup Script Generator
                         • Foreman

Choosing a Boot Mechanism
                        • Is automatic recovery desirable?
                          (Hint: sometimes it’s not)
                        • Does it integrate with monitoring?
                        • Is it a one-off that will get forgotten?
                        • Does it integrate into OS startup/shutdown?
                        • How much work to integrate with your app?
Part III

   Observing a Running System



Common Tools
                        • top
                        • free
                        • vmstat
                        • netstat
                        • fuser
                        • ps
                        • sar (not always installed by default)
Power Tools
                        • lsof
                        • iostat
                        • iftop
                        • pstree
                        • Tracing tools
                         • strace
                         • tcpdump/wireshark
Observing CPU
                        • Go-to tools: top, ps
                        • CPU is not just about computation
                        • Most Important:
                          %user, %system, %nice, %idle, %wait
                        • Other: hardware/software interrupts,
                          “stolen” time (especially on EC2)
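
On Linux, tools like top derive these percentages from the tick counters on the first line of /proc/stat, sampled twice. The sketch below does the same calculation on two captured lines. The sample numbers are invented, and the field order is the Linux one, which is an assumption for other systems.

```python
# Column order of the aggregate "cpu" line in Linux's /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> dict:
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Older kernels report fewer columns; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_percentages(before: dict, after: dict) -> dict:
    """Share of elapsed CPU time spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Invented samples, as if read from /proc/stat a few seconds apart.
first = parse_cpu_line("cpu  1000 0 500 8000 500 0 0 0")
second = parse_cpu_line("cpu  1600 0 700 8100 600 0 0 0")
for name, pct in cpu_percentages(first, second).items():
    print(f"%{name}: {pct:.1f}")
```

In this made-up interval the host spent 60% of its time in user code and 20% in the kernel, with 10% idle and 10% waiting on I/O.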


The Mystical Load Avg.
                        • Broken into 1, 5 and 15 minute averages
                        • Gives a coarse view of overall system load
                        • Based on the number of processes waiting
                          for CPU time
                        • Rule of thumb: stay below the number of
                          CPUs in a system (e.g. a 4 CPU host should
                          be below a 4.00 load average)
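
The rule of thumb amounts to dividing the load average by the CPU count. A minimal sketch follows; the function names are made up, and os.getloadavg() is only available on Unix-like systems.

```python
import os


def load_per_cpu(load_average: float, cpu_count: int) -> float:
    """Normalize a load average by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def is_overloaded(load_average: float, cpu_count: int) -> bool:
    """Rule of thumb from the slide: the load should stay below the CPU count."""
    return load_per_cpu(load_average, cpu_count) >= 1.0


def current_load_per_cpu() -> float:
    """Normalized 15-minute load of this host (Unix only)."""
    one, five, fifteen = os.getloadavg()
    return load_per_cpu(fifteen, os.cpu_count() or 1)


print(is_overloaded(3.2, 4))  # False: a 4 CPU host below 4.00
print(is_overloaded(6.0, 4))  # True
```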


When am I CPU bound?

                        • 15 minute load average exceeding the
                          number of non-HT processors
                        • %user + %system consistently above 90%


Observing RAM

                        • Go-to tools: top, vmstat
                        • Available memory isn’t just “Free”
                        • Buffers + Cache fill to consume available
                          RAM (this is a good thing!)




RAM vs. Swap

                        • RAM is the amount of physical memory
                        • Swap is disk used to augment RAM
                        • Swap is orders of magnitude slower
                        • Some VM types have no meaningful swap
                        • Rule of thumb: pretend swap doesn’t exist

Paging Strategies

                        • Solaris: Page in advance
                        • Linux: Page on demand (last resort)
                        • Windows: Craziness


When am I memory bound?
                        • Free + buffers + cache < 15% of RAM
                        • Swap utilization above 10% of available swap
                          (Linux only)
                        • Check for high disk utilization to confirm
                          “thrashing”
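
The two thresholds above can be checked against the contents of /proc/meminfo on Linux. This is a sketch under that assumption; the sample text is invented and the function names are hypothetical.

```python
def parse_meminfo(text: str) -> dict:
    """Map /proc/meminfo field names to their numeric values (kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[name.strip()] = int(fields[0])
    return values


def memory_pressure(info: dict) -> dict:
    """Apply the slide's rules of thumb to parsed meminfo values."""
    reclaimable = info["MemFree"] + info["Buffers"] + info["Cached"]
    available_pct = 100.0 * reclaimable / info["MemTotal"]
    swap_total = info.get("SwapTotal", 0)
    swap_used = swap_total - info.get("SwapFree", 0)
    swap_used_pct = 100.0 * swap_used / swap_total if swap_total else 0.0
    return {
        "available_pct": available_pct,
        "swap_used_pct": swap_used_pct,
        # Free + buffers + cache under 15% of RAM, or swap use above 10%.
        "memory_bound": available_pct < 15.0 or swap_used_pct > 10.0,
    }


# Invented sample in /proc/meminfo format.
SAMPLE = """\
MemTotal:        8000000 kB
MemFree:          200000 kB
Buffers:          100000 kB
Cached:           500000 kB
SwapTotal:       2000000 kB
SwapFree:        1900000 kB
"""

print(memory_pressure(parse_meminfo(SAMPLE)))
```

Here only 10% of RAM is free or reclaimable, so the host counts as memory bound even though swap use is a modest 5%.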



Observing Disk

                        • Go-to tools: iostat, top
                        • Disk is usually hardest thing to observe
                        • Better in recent Linux kernels (> 2.6.20)


RAID
                        • Redundant Array of Inexpensive Disks
                        • Different strategies have different
                          performance/durability tradeoffs
                         • RAID-0
                         • RAID-1
                         • RAID-10
                         • RAID-5
                         • RAID-6
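
One side of the performance/durability tradeoff is how much raw capacity each level gives up for redundancy. The sketch below computes usable capacity for arrays of identical disks; it says nothing about performance, and the function name is made up.

```python
def usable_capacity(level: str, disks: int, disk_size: float) -> float:
    """Usable capacity of an array of identical disks, in units of disk_size."""
    if level == "RAID-0":     # striping only, no redundancy
        minimum, usable_disks = 2, disks
    elif level == "RAID-1":   # every disk is a full mirror
        minimum, usable_disks = 2, 1
    elif level == "RAID-10":  # stripe across mirrored pairs
        minimum, usable_disks = 4, disks / 2
    elif level == "RAID-5":   # one disk's worth of parity
        minimum, usable_disks = 3, disks - 1
    elif level == "RAID-6":   # two disks' worth of parity
        minimum, usable_disks = 4, disks - 2
    else:
        raise ValueError(f"unknown RAID level: {level}")
    if disks < minimum:
        raise ValueError(f"{level} needs at least {minimum} disks")
    if level == "RAID-10" and disks % 2:
        raise ValueError("RAID-10 needs an even number of disks")
    return usable_disks * disk_size


# Four 2 TB disks under each strategy.
for level in ("RAID-0", "RAID-1", "RAID-10", "RAID-5", "RAID-6"):
    print(level, usable_capacity(level, 4, 2.0), "TB")
```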
When am I disk bound?

                        • %wait is consistently above 10% to 20%
                        • ... though %wait can be network too
                        • SCSI and FC command queues are long
                        • Known failure mode: disk more than 85%
                          full causes tremendous VFS overhead
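
The 85% threshold is easy to watch for from a script. A minimal sketch using the standard library follows; note that the figure may differ slightly from what df reports, since df accounts for blocks reserved for root. The function names are hypothetical.

```python
import shutil


def percent_used(total: int, used: int) -> float:
    """Used space as a percentage of total space."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def nearly_full(path: str, threshold: float = 85.0) -> bool:
    """True when the filesystem holding `path` is above the threshold."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return percent_used(usage.total, usage.used) > threshold
```

Calling nearly_full("/var") from a cron job or a monitoring check is one way to get a warning before the overhead shows up.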



Observing Network

                        • Go-to tools: netstat, iftop, wireshark
                        • Be wary of choke-points
                         • Switch interconnects
                         • WAN links
                         • Firewalls

Link Optimization
                        • Use Jumbo Frames for Gbit+ links
                        • Port aggregation for throughput:
                         • Best: many-to-many
                         • Good: one-to-many
                         • Useless: one-to-one
                         • ... but still useful for HA
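
The reason one-to-one traffic gains nothing from aggregation is that a given flow is usually pinned to a single member link. The sketch below assumes links are chosen by hashing the source/destination address pair, which is a common policy but not the only one; crc32 is a stand-in for whatever hash a real switch or bonding driver uses.

```python
import zlib


def pick_link(src: str, dst: str, link_count: int) -> int:
    """Choose a member link for a flow by hashing its address pair."""
    key = f"{src}->{dst}".encode()
    return zlib.crc32(key) % link_count


def links_used(flows, link_count: int) -> set:
    """Set of member links that carry at least one of the given flows."""
    return {pick_link(src, dst, link_count) for src, dst in flows}


LINKS = 4

# One-to-one: every packet has the same address pair, so one link does all the work.
one_to_one = [("10.0.0.1", "10.0.0.2")] * 1000
print(len(links_used(one_to_one, LINKS)))

# Many-to-many: distinct pairs spread across the member links.
many_to_many = [(f"10.0.0.{s}", f"10.0.1.{d}")
                for s in range(1, 21) for d in range(1, 21)]
print(len(links_used(many_to_many, LINKS)))
```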
When am I network bound?
                        • This one is easy: 99% of the time this is link
                          saturation
                        • Gotchas: which link?
                        • Addendum: loss/delay (especially for TCP)
                          can wreak havoc on throughput
                        • ... but usually only a problem across WAN
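
Delay hurts TCP because a sender can have at most one window of data in flight per round trip. A back-of-the-envelope sketch follows; the window size and round-trip times are illustrative assumptions, and packet loss (not modeled here) lowers the bound further.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one TCP connection's throughput: one window per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds


WINDOW = 65535  # largest TCP window without window scaling, in bytes

lan = max_tcp_throughput_bps(WINDOW, 0.0005)  # assumed 0.5 ms round trip
wan = max_tcp_throughput_bps(WINDOW, 0.100)   # assumed 100 ms round trip

print(f"LAN: {lan / 1e6:.0f} Mbit/s")  # about 1049 Mbit/s
print(f"WAN: {wan / 1e6:.1f} Mbit/s")  # about 5.2 Mbit/s
```

The same connection that can fill a gigabit LAN tops out near 5 Mbit/s across a 100 ms WAN path, no matter how fast the links are.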

Part IV

                          Optimization &
                        Performance Tuning


Hardware Options
                        • A.K.A. “Throw hardware at it”
                        • Not the first thing to try
                         • Are the services tuned? SQL queries,
                            application behavior, caching options
                         • Is something broken, causing
                            performance degradation?


Hardware Options
                        • RAM is usually the single biggest
                          performance win (cost/benefit tradeoff)
                        • Faster disk is next best
                        • Then look at CPU and/or Network
                        • ...but do the work to figure out why your
                          performance is limited in the first place


Kernel Tunables
                        • Not as necessary as in the “old days”
                        • Almost all settings can be adjusted at
                          runtime on Linux, Solaris
                        • Most valuable settings are buffer limits or
                          counters/timers
                        • There be dragons! Read carefully before
                          twisting these knobs


Environment Settings
                        • ulimits
                         • max files
                         • stack size
                         • memory limits
                         • core dumps
                         • others
                        • Still subject to system-wide (kernel) limits
Environment limits


                        • Hard limits cannot be raised by
                          unprivileged users
                        • PAM configuration may also be in effect
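
A process can inspect and adjust its own limits, which is a quick way to see the soft/hard distinction in action. The sketch below uses Python's standard resource module (Unix only) on the open-files limit; the helper names are made up.

```python
import resource


def describe(limit: int) -> str:
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def raise_soft_nofile(wanted: int) -> int:
    """Raise the soft open-files limit toward `wanted`, capped by the hard limit.

    Returns the soft limit in effect afterwards. An unprivileged process
    can move its soft limit up to the hard limit but not beyond it.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough; leave it alone
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open files:", describe(soft), "soft /", describe(hard), "hard")
```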


Application Tunables
                        • There are not many for C-Ruby
                        • JVM has many
                         • Mostly related to how RAM is allocated
                            and garbage collected
                         • Very dependent on application
                         • Any time an “xVM” is involved, there is
                            probably a tunable (JVM, CLR)
                         • But we are developers! Tune/profile your
                            app before looking to the environment
Performance Management Tools
                        • sysstat (sar)
                        • SNMP (and related tools like Cacti)
                        • Integrated Monitoring + Trending tools
                         • Zabbix
                         • OpenNMS
                         • and a plethora of commercial tools
Part V

                        Putting It All Together
                        Autopsy of a single HTTP request, end-to-end




Live Demo/Whiteboard



Part VI

                        Pulling It All Apart
                          Anticipating Murphy and his Law




Most Common Pitfalls
                        • Disk Full
                        • DNS Unavailable/Slow
                        • Insufficient RAM
                        • Suboptimal Service Configuration
                        • Firewall misconfiguration
                        • Archaic: Network mismatch (Full/Half Duplex)
DNS and Performance

                        • Possibly the most overlooked performance impact
                        • Everything uses DNS
                        • If you make nothing else redundant, make
                          this redundant!
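
A slow or failing resolver is easy to miss because the delay is charged to whatever made the lookup. One way to make it visible is to time resolution directly. This is a minimal sketch; the function name is made up, and the resolver parameter exists only so the timing logic can be exercised without a network.

```python
import socket
import time


def timed_lookup(hostname: str, resolver=socket.getaddrinfo):
    """Resolve hostname and return (elapsed_seconds, error_or_None)."""
    started = time.monotonic()
    try:
        resolver(hostname, None)
        error = None
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        error = exc
    return time.monotonic() - started, error
```

For example, timed_lookup("db.internal.example") would report how long the system resolver took for that (hypothetical) name, including any timeout spent waiting on an unreachable DNS server.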




Part VII

                        Scaling Up



Horizontal or Vertical?

                        • Vertical: Making one server/instance go
                          faster
                        • Horizontal: Parallelizing requests to get
                          more things done in the same amount of
                          time




Clustering
                        • Parallelizing requests to increase overall
                          throughput: horizontal scaling
                        • Techniques to make information more
                          available:
                          • Caching (memcache, file-based caching)
                          • Distribute data sets
                          • Replication
Distributing Data

                        • Replication
                         • Split Reads (One writer/master; multiple
                            slaves/readers)
                         • Multiple Masters (dangerous!)
                         • Sharding (must consider HA)

Failover/HA

                        • Consistency requires concept of Quorum
                        • Losing partition gets killed: STONITH
                        • Multi-master systems ignore this at the
                          cost of potential non-determinism
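
The quorum rule itself is small: a partition may keep running only if it can see a strict majority of the cluster. A minimal sketch follows (the function name is made up; real cluster managers add tie-breakers and fencing on top of this).

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """True when this partition holds a strict majority of the cluster."""
    if not 0 <= reachable_nodes <= cluster_size:
        raise ValueError("reachable_nodes must be between 0 and cluster_size")
    return reachable_nodes > cluster_size // 2


# A 5-node cluster split 3/2: only the larger side keeps quorum.
print(has_quorum(3, 5), has_quorum(2, 5))  # True False

# A 4-node cluster split 2/2: neither side has a majority.
print(has_quorum(2, 4), has_quorum(2, 4))  # False False
```

The even split is why clusters are usually built with an odd number of voting members.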




Tuning Services
                        • Some VM types (especially JVM or CLR)
                          have tunables for memory consumption
                        • Databases usually have memory settings
                         • These can make dramatic differences
                         • Very workload dependent
                        • Deep troubleshooting: strace, wireshark
Part VIII

                        Deploying Applications



12 Factor Application
                        • Deployability starts with application design
                         • Clear line between configuration and logic
                         • Permit easy horizontal scaling
                         • Are OS-agnostic (yay Ruby!)
                         • Minimize differences between dev and prod
                         • http://12factor.net - by Heroku cofounder
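
The "clear line between configuration and logic" usually means reading settings from the environment rather than hard-coding them. A minimal sketch follows; the variable names (DATABASE_URL, PORT, LOG_LEVEL) and defaults are illustrative assumptions, not something the deck prescribes.

```python
import os


def load_config(environ=os.environ) -> dict:
    """Build application settings from environment variables."""
    return {
        # Required: raises KeyError at startup if it is missing.
        "database_url": environ["DATABASE_URL"],
        # Optional settings fall back to development-friendly defaults.
        "port": int(environ.get("PORT", "3000")),
        "log_level": environ.get("LOG_LEVEL", "info"),
    }


# The same code runs in dev and prod; only the environment differs.
dev = load_config({"DATABASE_URL": "postgres://localhost/app_dev"})
prod = load_config({
    "DATABASE_URL": "postgres://db.internal.example/app",
    "PORT": "8080",
    "LOG_LEVEL": "warn",
})
print(dev["port"], prod["port"])  # 3000 8080
```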
Deployment Tools

                        • Capistrano
                         • The de facto standard
                         • Requires effort to set up, test
                         • Requires integration with system startup
                         • Most flexible

Deployment Tools

                        • “Move it to the cloud”
                         • Heroku
                         • Cloud Foundry


How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 

Recently uploaded (20)

Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
Project Management Semester Long Project - Acuity
Project Management Semester Long Project - AcuityProject Management Semester Long Project - Acuity
Project Management Semester Long Project - Acuity
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 

Ops for Developers

  • 1. Ops for Developers
      Or: How I Learned To Stop Worrying And Love The Shell
      Ben Klang, bklang@mojolingo.com
      spkr8.com/t/13191
      Friday, August 10, 12
  • 2. Prologue: Introductions
  • 3. Who Am I?
      • Ben Klang
      • @bklang (Github/Twitter)
      • bklang@mojolingo.com
  • 4. What are my passions?
      • Telephony Applications
      • Information Security
      • Performance and Availability Design
      • Open Source
  • 5. What do I do?
      • Today I write code and run Mojo Lingo
      • But Yesterday...
  • 6. This was my world
  • 8. “I am allergic to downtime”
  • 9. It’s About Risk
      • If something breaks, it will be my pager that goes off at 2am
      • New software == New ways to break
      • If I can’t see it, I can’t manage it or monitor it and it will break
  • 10. Agenda
      • 9:00 - 10:30
          • Operating Systems & Hardware
          • All About Bootup
      • 10:30 - 11:00: Break
      • 11:00 - 12:30
          • Observing a Running System
          • Optimization/Tuning
      • 12:30 - 1:30: Lunch
      • 1:30 - 3:00
          • Autopsy of an HTTP Request
          • Dealing with Murphy
      • 3:00 - 3:30: Break
      • 3:30 - 5:00
          • Scaling Up
          • Deploying Apps
          • Audience Requests
  • 11. Part I: Operating Systems & Hardware
  • 12. OS History Lesson
      BSD, System V, Linux and Windows
  • 13. Unix family timeline (figure)
      • UNICS (Sep. 1969), soon renamed “Unix Time Sharing System Version 1”
      • UNIX Time Sharing System Version 5 (Jun. 1974)
      • 1BSD (Mar. 1978)
      • UNIX Sys III (Nov. 1981)
      • UNIX Sys V (Jan. 1983)
      • 4.3BSD (Jun. 1986)
  • 16. Common Architectures
      • Intel x86 (i386, x86_64)
      • SPARC
      • POWER
      • ARM
      • But none of this really matters anymore
  • 17. CPU Configurations
      • Individual CPU
      • SMP: Symmetric Multi-Processing
      • Multiple Cores
      • Hyperthreading/Virtual Cores
  • 18. (Virtual) Memory
      • RAM + Swap = Available Memory
      • Swapping strategies vary across OSes
      • What your code sees is a complete virtualization of this
      • x86/32-bit processes can only “see” 3GB of RAM from a 4GB address space
  • 19. Storage Types
      • Local Storage (SATA, SAS, USB, Firewire)
      • Network Storage (NFS, SMB, iSCSI, AOE)
      • Storage Network (FibreChannel, Fabrics)
  • 20. Networking
      • LAN (100Mb still common; 1Gbit standard; 10Gb and 100Gb on horizon)
      • WAN (T-1, Frame Relay, ATM, MetroE)
      • Important Characteristics
          • Throughput
          • Loss
          • Delay
  • 21. Part II: All About Bootup
  • 22. Phases
      • BIOS
      • Kernel Bootstrap
      • Hardware Detection
      • Init System
  • 23. System Services
      • Varies by OS
      • Common: SysV Init Scripts; /etc/inittab; rc.local
      • Solaris: SMF
      • Ubuntu: Upstart
      • Debian: SysV default; Upstart optional
      • OSX: launchd
      • RedHat/CentOS: SysV Init Scripts
  • 24. SysV Init Scripts
      • Created in /etc/init.d; Symlinked into runlevel directories
      • Symlinks prefixed with special characters to control startup/shutdown order
          • Prefixed with “S” or “K” to start or stop service in each level
          • Numeric prefix determines order
          • /etc/rc3.d/S10sshd -> /etc/init.d/sshd
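The naming scheme on the SysV slide can be sketched in a few lines of Python. This is only an illustration, not how any particular init implementation works: the link names in the example are hypothetical, and the sketch assumes links run in plain lexical order of their file names, which matches numeric order when every prefix has the same number of digits.

```python
import re

# Matches names such as "S10sshd" or "K20nfs": action letter, order, service.
RC_LINK = re.compile(r"^([SK])(\d+)(.+)$")


def parse_rc_link(name):
    """Split an rc symlink name into (action, order, service)."""
    match = RC_LINK.match(name)
    if match is None:
        raise ValueError(f"not an S/K runlevel link: {name!r}")
    letter, order, service = match.groups()
    action = "start" if letter == "S" else "stop"
    return action, int(order), service


def start_order(names):
    """Return the services started in a runlevel, in execution order.

    Assumption: links are processed in lexical order of their names.
    """
    starts = sorted(n for n in names if n.startswith("S"))
    return [parse_rc_link(n)[2] for n in starts]


if __name__ == "__main__":
    # Hypothetical contents of a runlevel directory such as /etc/rc3.d
    links = ["S55cron", "K20nfs", "S10sshd", "S12syslog"]
    print(start_order(links))  # ['sshd', 'syslog', 'cron']
```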
  • 25. rc.local
      • Single “dumb” startup script
      • Run at end of system startup
      • Quick/dirty mechanism to start something at bootup
  • 26. /etc/inittab
      • The original process supervisor
      • Not (easily) scriptable
      • Starts a process in a given runlevel
      • Restarts the process when it dies
  • 27. Supervisor Processes
      • Solaris SMF
      • Ubuntu Upstart
      • OSX launchd
      • daemontools
  • 28. Ruby Integrations
      • Supervisor Processes
          • Bluepill
          • God
      • Startup Script Generator
          • Foreman
  • 29. Choosing a Boot Mechanism
      • Is automatic recovery desirable? (Hint: sometimes it’s not)
      • Does it integrate with monitoring?
      • Is it a one-off that will get forgotten?
      • Does it integrate into OS startup/shutdown?
      • How much work to integrate with your app?
  • 30. Part III: Observing a Running System
  • 31. Common Tools
      • top
      • free
      • vmstat
      • netstat
      • fuser
      • ps
      • sar (not always installed by default)
  • 32. Power Tools
      • lsof
      • iostat
      • iftop
      • pstree
      • Tracing tools
          • strace
          • tcpdump/wireshark
  • 33. Observing CPU
      • Go-to tools: top, ps
      • CPU is not just about computation
      • Most Important: %user, %system, %nice, %idle, %wait
      • Other: hardware/software interrupts, “stolen” time (especially on EC2)
  • 34. The Mystical Load Avg.
      • Broken into 1, 5 and 15 minute averages
      • Gives a coarse view of overall system load
      • Based on # processes waiting for CPU time
      • Rule of thumb: stay below the number of CPUs in a system (e.g. a 4 CPU host should be below a 4.00 load average)
  • 35. When am I CPU bound?
      • 15 minute load average exceeding the number of non-HT processors
      • %user + %system consistently above 90%
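As a rough sketch of the load-average rule of thumb, the check below compares the 15-minute average with a CPU count. One assumption to note: `os.cpu_count()` reports logical CPUs, so hyperthreads are included and the count may need adjusting to match the "non-HT processors" wording on the slide.

```python
import os


def is_cpu_bound(load15, cpu_count):
    """Rule of thumb: 15-minute load average above the CPU count."""
    return load15 > cpu_count


if __name__ == "__main__":
    # os.getloadavg() is only available on Unix-like systems.
    load1, load5, load15 = os.getloadavg()
    # Logical CPUs, hyperthreads included; halve it if that suits your hardware.
    cpus = os.cpu_count() or 1
    print(f"load: {load1:.2f} {load5:.2f} {load15:.2f} on {cpus} logical CPUs")
    print("CPU bound?", is_cpu_bound(load15, cpus))
```

A single reading says little; the slide's wording ("consistently") suggests sampling this over time before drawing conclusions.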
  • 36. Observing RAM
      • Go-to tools: top, vmstat
      • Available memory isn’t just “Free”
      • Buffers + Cache fill to consume available RAM (this is a good thing!)
  • 37. RAM vs. Swap
      • RAM is the amount of physical memory
      • Swap is disk used to augment RAM
      • Swap is orders of magnitude slower
      • Some VM types have no meaningful swap
      • Rule of thumb: pretend swap doesn’t exist
  • 38. Paging Strategies
      • Solaris: Page in advance
      • Linux: Page on demand (last resort)
      • Windows: Craziness
  • 39. When am I memory bound?
      • Free + buffers + cache < 15% of RAM
      • Swap utilization above 10% avail. swap (Linux only)
      • Check for high disk utilization to confirm “thrashing”
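The two memory thresholds can be checked against the fields Linux exposes in /proc/meminfo. The sketch below is a minimal illustration that works on text in that format; the sample numbers are made up, and reading "swap utilization" as used swap divided by total swap is an assumption about what the slide means.

```python
# Made-up sample in /proc/meminfo format (values in kB).
SAMPLE = """MemTotal: 8000000 kB
MemFree: 400000 kB
Buffers: 100000 kB
Cached: 500000 kB
SwapTotal: 2000000 kB
SwapFree: 1500000 kB"""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def memory_pressure(info, ram_floor=0.15, swap_ceiling=0.10):
    """Apply both thresholds; returns (low_ram, heavy_swap)."""
    reclaimable = info["MemFree"] + info["Buffers"] + info["Cached"]
    low_ram = reclaimable / info["MemTotal"] < ram_floor
    swap_total = info.get("SwapTotal", 0)
    swap_used = swap_total - info.get("SwapFree", 0)
    heavy_swap = swap_total > 0 and swap_used / swap_total > swap_ceiling
    return low_ram, heavy_swap


if __name__ == "__main__":
    # On Linux, replace SAMPLE with open("/proc/meminfo").read().
    print(memory_pressure(parse_meminfo(SAMPLE)))  # (True, True)
```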
  • 40. Observing Disk
      • Go-to tools: iostat, top
      • Disk is usually hardest thing to observe
      • Better in recent Linux kernels (> 2.6.20)
  • 41. RAID
      • Redundant Array of Inexpensive Drives
      • Different strategies have different performance/durability tradeoffs
          • RAID-0
          • RAID-1
          • RAID-10
          • RAID-5
          • RAID-6
  • 42. When am I disk bound?
      • %wait is consistently above 10% to 20%
          • ... though %wait can be network too
      • SCSI and FC command queues are long
      • Known failure mode: disk more than 85% full causes tremendous VFS overhead
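The 85% figure is easy to watch for. The following sketch uses Python's `shutil.disk_usage`; the choice of the root filesystem is only an example, and the threshold is simply the number from the slide, not a universal constant.

```python
import shutil


def fraction_used(total, free):
    """Share of the filesystem that is not free, between 0 and 1."""
    return (total - free) / total


def is_too_full(total, free, threshold=0.85):
    """True when usage exceeds the threshold (85% by default)."""
    return fraction_used(total, free) > threshold


if __name__ == "__main__":
    usage = shutil.disk_usage("/")  # named tuple: total, used, free (bytes)
    pct = 100 * fraction_used(usage.total, usage.free)
    print(f"/ is {pct:.1f}% full; over threshold: "
          f"{is_too_full(usage.total, usage.free)}")
```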
  • 43. Observing Network
      • Go-to tools: netstat, iftop, wireshark
      • Be wary of choke-points
          • Switch interconnects
          • WAN links
          • Firewalls
  • 44. Link Optimization
      • Use Jumbo Frames for Gbit+ links
      • Port aggregation for throughput:
          • Best: many-to-many
          • Good: one-to-many
          • Useless: one-to-one
          • ... but still useful for HA
  • 45. When am I network bound?
      • This one is easy: 99% of the time this is link saturation
      • Gotchas: which link?
      • Addendum: loss/delay (especially for TCP) can wreak havoc on throughput
          • ... but usually only a problem across WAN
  • 46. Part IV: Optimization & Performance Tuning
  • 47. Hardware Options
      • A.K.A. “Throw hardware at it”
      • Not the first thing to try
      • Are the services tuned? SQL queries, application behavior, caching options
      • Is something broken, causing performance degradation?
  • 48. Hardware Options
      • RAM is usually the single biggest performance win (cost/benefit tradeoff)
      • Faster disk is next best
      • Then look at CPU and/or Network
      • ...but do the work to figure out why your performance is limited in the first place
  • 49. Kernel Tunables
      • Not as necessary as in the “old days”
      • Almost all settings can be adjusted at runtime on Linux, Solaris
      • Most valuable settings are buffer limits or counters/timers
      • There be dragons! Read carefully before twisting these knobs
  • 50. Environment Settings
      • ulimits
          • max files
          • stack size
          • memory limits
          • core dumps
          • others
      • Still subject to system-wide (kernel) limits
  • 51. Environment limits
      • Hard limits cannot be raised by unprivileged users
      • PAM configuration may also be in effect
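To make the soft/hard distinction concrete, here is a small sketch using Python's Unix-only `resource` module. The target of 65536 open files is an arbitrary example value, and the helper name `max_soft_limit` is invented for this illustration. An unprivileged process can move its soft limit up to, but not past, the hard limit.

```python
def max_soft_limit(requested, hard, infinity):
    """Highest soft limit an unprivileged process could ask for.

    The soft limit may be raised only as far as the hard limit.
    """
    if hard == infinity:
        return requested
    return min(requested, hard)


if __name__ == "__main__":
    import resource  # Unix-only standard library module

    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"open files: soft={soft} hard={hard}")
    # 65536 is an arbitrary example target.
    target = max_soft_limit(65536, hard, resource.RLIM_INFINITY)
    if soft != resource.RLIM_INFINITY and target > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError) as err:
            # System-wide kernel limits can still refuse the request.
            print("could not raise soft limit:", err)
    print("soft limit now", resource.getrlimit(resource.RLIMIT_NOFILE)[0])
```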
  • 52. Application Tunables
      • There are not many for C-Ruby
      • JVM has many
      • Mostly related to how RAM is allocated and garbage collected
      • Very dependent on application
      • Any time an “xVM” is involved, there is probably a tunable (JVM, CLR)
      • But we are developers! Tune/profile your app before looking to the environment
  • 53. Performance Management Tools
      • sysstat (sar)
      • SNMP (and related tools like Cacti)
      • Integrated Monitoring + Trending tools
          • Zabbix
          • OpenNMS
          • and a plethora of commercial tools
  • 54. Part V: Putting It All Together
      Autopsy of a single HTTP request, end-to-end
  • 56. Part VI: Pulling It All Apart
      Anticipating Murphy and his Law
  • 57. Most Common Pitfalls
      • Disk Full
      • DNS Unavailable/Slow
      • Insufficient RAM
      • Suboptimal Service Configuration
      • Firewall misconfiguration
      • Archaic: Network mismatch (Full/Half Duplex)
  • 58. DNS and Performance
      • Possibly most-overlooked perf. impact
      • Everything uses DNS
      • If you make nothing else redundant, make this redundant!
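One way to see the DNS cost is to time a lookup through the system resolver. The sketch below is illustrative only: the `resolver` parameter is an invention of this example so the timing logic can be exercised without a network, and a single measurement may be served from a local cache.

```python
import socket
import time


def timed_lookup(host, resolver=socket.getaddrinfo):
    """Return (seconds, ok) for one name lookup.

    `resolver` defaults to the system resolver; it is injectable only
    so this sketch can be tried with a stand-in function.
    """
    started = time.perf_counter()
    try:
        resolver(host, None)
        ok = True
    except OSError:  # socket.gaierror is a subclass of OSError
        ok = False
    return time.perf_counter() - started, ok


if __name__ == "__main__":
    seconds, ok = timed_lookup("localhost")
    print(f"lookup ok={ok} took {seconds * 1000:.1f} ms")
```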
  • 59. Part VII: Scaling Up
  • 60. Horizontal or Vertical?
      • Vertical: Making one server/instance go faster
      • Horizontal: Parallelizing requests to get more things done in the same amount of time
  • 61. Clustering
      • Parallelizing requests to increase overall throughput: horizontal scaling
      • Techniques to make information more available:
          • Caching (memcache, file-based caching)
          • Distribute data sets
          • Replication
  • 62. Distributing Data
      • Replication
          • Split Reads (One writer/master; multiple slaves/readers)
          • Multiple Masters (dangerous!)
      • Sharding (must consider HA)
  • 63. Failover/HA
      • Consistency requires concept of Quorum
      • Losing partition gets killed: STONITH
      • Multi-master systems ignore this at the cost of potential non-determinism
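The quorum idea reduces to a strict-majority test: after a network split, at most one partition can hold more than half of the members, so at most one side keeps running. A minimal sketch, assuming every member has one equal vote (real cluster managers may weight votes or use tie-breakers):

```python
def has_quorum(reachable, cluster_size):
    """True when this partition holds a strict majority of the members."""
    return reachable > cluster_size // 2


if __name__ == "__main__":
    # A 5-node cluster split 3/2: only the larger side keeps quorum.
    print(has_quorum(3, 5))  # True
    print(has_quorum(2, 5))  # False
    # An even split of 4 nodes leaves neither side with quorum.
    print(has_quorum(2, 4))  # False
```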
  • 64. Tuning Services
      • Some VM types (especially JVM or CLR) have tunables for memory consumption
      • Databases usually have memory settings
      • These can make dramatic differences
      • Very workload dependent
      • Deep troubleshooting: strace, wireshark
  • 65. Part VIII: Deploying Applications
  • 66. 12 Factor Application
      • Deployability starts with application design
      • Clear line between configuration and logic
      • Permit easy horizontal scaling
      • Are OS-agnostic (yay Ruby!)
      • Minimize differences between dev and prod
      • http://12factor.net - by Heroku cofounder
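The "clear line between configuration and logic" point is often implemented by reading settings from environment variables. The sketch below shows the idea in Python; the variable names and defaults are illustrative assumptions, not taken from the slides.

```python
import os


def load_config(environ=os.environ):
    """Build the app configuration from environment variables.

    All variable names and defaults here are hypothetical examples.
    """
    return {
        "database_url": environ.get("DATABASE_URL", "sqlite:///dev.db"),
        "workers": int(environ.get("WEB_CONCURRENCY", "2")),
        "debug": environ.get("APP_DEBUG", "0") == "1",
    }


if __name__ == "__main__":
    # The same code runs in dev and prod; only the environment differs.
    print(load_config({"WEB_CONCURRENCY": "8", "APP_DEBUG": "1"}))
```

Passing the environment in as a parameter keeps the function easy to exercise with a plain dict, which helps keep dev and prod behaviour aligned.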
  • 67. Deployment Tools
      • Capistrano
          • The de facto standard
          • Requires effort to set up, test
          • Requires integration with system startup
          • Most flexible
  • 68. Deployment Tools
      • “Move it to the cloud”
          • Heroku
          • Cloud Foundry