SlideShare a Scribd company logo
Artur Bergman
          sky@crucially.net
• Wikia Inc
  – We are hiring
  – Community/Bizdev in Germany
  – Engineers in Poland
  – http://www.wikia.com/wiki/hiring
• O’Reilly Radar
  – http://radar.oreilly.com/artur/
The value of operations
•   Google
•   Orkut
•   Friendster
•   Myspace
Benefits
•   Users trust your brand
•   They rely on you
•   They spend more time on your site
•   Bad operations wastes R&D money

• Fixed amount of time + faster site =
  more page views
Stepchild of Engineering
• Product development
• Engineering
• Operations
  – Sysadmins?
• Why?
Operations Engineering
• It is engineering
• Google terminology -
  – Site Reliability Engineer
• Sure there are sysadmins too, people
  mananing NOCs and datacenters
• Provide career growth
Good Engineers
•   Detail Oriented
•   Aspire to be operational engineers
•   Stubborn
•   Can steer their inner ADD
    – Interrupt driven
• Not the same as good developers
Danger signs
• Thinks operation is a path to
  development engineering
  – Fire them
• Want people dedicated to the task
• A good operations engineer should
  spend some time in development
• A good development engineer MUST
  spend some time in operations
Debugging
• 9 Rules of debugging
• http://www.debuggingrules.com/Poster_
  download.html
  – Yes the font is horrible
Rule 1:
       Understand the system
•   Complexity Kills
•   No excuse
•   If you write it, you must know it
•   If you run it, you must know it
•   If you buy it, you must know it
Rule 3:
      Quit thinking and look
• quot;It is a capital mistake to theorize before
  one has data. Insensibly one begins to
  twist facts to suit theories, instead of
  theories to suit facts.”
Rule 3:
        Quit thinking and look
•   What do you look at?
•   The importance of monitoring
•   Monitoring
•   Monitoring
•   Monitoring
My my, confusing term
• Monitoring
• Alerting
• Trending
Monitoring
•   Collects data
•   Puts into databases
•   Makes it available for you
•   Active collection
•   Passive interaction
Alerting
• Acts on monitoring data
• Severe alerts
  – Active
  – Needs action
• Passive alerts
  – Things that need to be done but not right now
• DO NOT OVER ALERT
• DO NOT CRY WOLF
Wikia alerting strategy
•   When the site is slow
•   Or down
•   We send emails and do phone calls
•   Europe and US West coast
•   Looking to hire in East Asia
•   No night time
Trending
• Long term
• Capacity planning
Monitor Tools
•   Nagios
•   Cacti
•   MRTG
•   Hyperic
•   Cricket
•   Ganglia
External Monitoring
• Use one, tells you what your clients see
  every x minutes
• Keynote
• Gomez
• Websitepulse (cheap - easy - I like
  them; no annoying salesforce)
Nagios
•   Alerting
•   Hassle
•   C CGI??
•   Doesn’t
    scale
Hyperic
• Most exciting open source tool
• Agent base - self configured
• Baseline alerting
Cricket MRTG Cacti
• Impossible to configure
• You need to write tools to do it
• Especially Cacti
  – Somewhat more pleasant than clawing out
    your eyes
Ganglia
• We love ganglia
• Automatically graphs everything you
  want - just works
• Large scale clusters
• Multicast
• Zero config
• RRD
http://ganglia.wikimedia.org/
•   270 hosts
•   880 CPU
•   2 clusters
•   1.2 TB of Memory
http://ganglia.wikimedia.org
Custom Ganglia Gmetrics
• Write your own

gmetric --name='Oldest query' --type=int32
--units='sec' --dmax=65 --value=`echo '
show processlist' | mysql -uroot -ppass |
grep -v Sleep | grep -v 'system user' | head -2 |
tail -1 | cut -f 6`
Custom Ganglia Gmetrics
• Or Learn Unix

gmetric --name='Oldest query' --type=int32
--units='sec' --dmax=65 --value=`echo '
show processlist' | mysql -uroot -ppass |
grep -v Sleep | grep -v 'system user' | head -2 |
tail -1 | cut -f 6`
Custom Ganglia Gmetrics
• Write your own

gmetric --name='Oldest query' --type=int32
--units='sec' --dmax=65 --value=`echo '
show processlist' | mysql -uroot -ppass |
grep -v Sleep | grep -v 'system user' | head -2 |
tail -1 | cut -f 6`
Custom Ganglia Gmetrics
• Write your own

gmetric --name='Oldest query' --type=int32
--units='sec' --dmax=65 --value=`echo '
show processlist' | mysql -uroot -ppass |
grep -v Sleep | grep -v 'system user' | head -2 |
tail -1 | cut -f 6`
Custom Ganglia Gmetrics
• Write your own

gmetric --name='Oldest query' --type=int32
--units='sec' --dmax=65 --value=`echo '
show processlist' | mysql -uroot -ppass |
grep -v Sleep | grep -v 'system user' | head -2 |
tail -1 | cut -f 6`
Something is wrong

• Don’t worry, data warehouse




                      QuickTime™ and a
            TIFF (Uncompressed) decompressor
               are needed to see this picture.
tcpdump / waveshark
•   If you suspect the network
•   Don’t just suspect
•   LOOK AT IT
•   Tcpdump / waveshark will tell you
    – If your packets are lost, delayed or
      corrupted
    – Your windowing is wrong
Rule 4: Divde and Conquer
• Look at the problems in turn
• Split between people
• Go in the order you suspect is the most
  likely
Rule 5:
 Change one thing at a time
• I cannot stress this enough
• IF YOU DO NOT THEN YOU HAVE
  FAILED TO IDENTIFY THE PROBLEM
Rule 6:
        Keep an audit trail
• You might be making things worse
• Good for the root cause analysis
• Have your shell log all commands
  – Good practice anyway
• Version control
Rule 9:
    If you didn’t fix it, it ain’t fixed
•   You must do something to fix a problem
•   Or it will bite you again
•   And again
•   And again
•   They don’t just appear and disappear
•   Except BGP route convergence :)
Process
• You need a little
• Don’t worry
Don’t forget
Complexity kills
•   Design against it
•   Reuse components
•   Define standards
•   Have a few images that all machines
    look like - reimage machines every now
    and then for the heck of it.
    – EC2 forces you to do this
MTBF
Meduim Time Between Failure
• Actually mostly irrelevant
• Dealing with failure is more important
• Target the right uptime
  – Complexity scales exponatially with
    required uptime
• Don’t kid yourself, you don’t need 5
  nines
MTTR
  Medium Time To Recovery
• Important
• Noone cares if you fail once a minute
  – If you recover in 50 ms
• If you are down 1 minute a week, you
  are still going to hit 4 nines (99.99%)
• Failures happen, plan how to deal with
  them
Problem found
• If it is critical, start a phone conversation
• Use IRC to communicate technical data
• One person liasons with non technical
  staff
• One person specifically in command
• Sleep scheduling ( audit log important )
Post crisis
• Root cause analysis
  – Just find out what went wrong
  – And how to avoid it
  – Or fix it faster next time if you can’t
• Keep track of your uptime
Automation
•   All machines are created equal
•   Seriously
•   If you manually make changes
•   You are wrong
    – Unless you know what you are doing
Best practices
•   Version control
•   Gold images
•   Centralised authentication
•   Time Sync ( NTP )
•   Central logging
•   ( All of this applies for virtual machines
    too!)
cfengine
•   Standard automation tool
•   Written in C
•   Not much support
•   Very good
•   Very annoying
contro :
      l
  s te
   i      = ( mys te )
                 i        domain = (
  mysite .count y )
               r
  sysadm = (mark )          netmask = (
  255.255.255.0 )          ac i
                             t onsequence =
  (         mounta ll       mount nfo
                                  i
      addmounts          mounta l
                                l        lnks
                                          i
  )        mountpat rn = / ie) (
                      te     $(s t /$ host))
 homepat r = ( u? )
          te n
Puppet
•   New hip kid on the block
•   Written in ruby
•   Better support?
•   Much nicer syntax
•   Easier to extend
def ne yumrepo (enab
   i                 led = true)
{c i i
    onf gfle
{ /e c
 quot; t /yum.repos /
               .d $name.repo”: mode
  => 644,
source => quot; yum/repos
             /        /$name. repoquot;,
ensure => $enab led ? {
true => fl ,
         ie
defau t=> absent
      l                  }
}}
cobb er
                        l
• Automatic PXE Installer
    – Uses kickstart files
•   Redhat Enterprise
•   Centos
•   Fedora
•   Some support for debian
cobbler
cobbler system add
  --name=xen8
  --mac=00:19:B9:EE:6D:0A
  --ip=10.10.30.208
  --profile=Centos-5-x86_64
  --kopts='ksdevice=00:19:B9:EE:6D:0A
      console=ttyS1,57600 console=tty0'
cobbler
cobbler system add
  --name=xen8
  --mac=00:19:B9:EE:6D:0A
  --ip=10.10.30.208
  --profile=Centos-5-x86_64
  --kopts='ksdevice=00:19:B9:EE:6D:0A
      console=ttyS1,57600 console=tty0’
koan
• Client install tool
  – Xen
  – Or OS re-image


koan --server=10.10.30.205 --virt --
  profile=virt_fc6 --virt-name=otrs
Your datacenter
• Keep it tidy
   – Label things, keep cables as short as possible
   – Have a switch in each rack
• If you are small without dedicated DC staff
  you need
   – Remote control power switches
   – Remote console!
Virtualization
•   Please use it
•   Managing becomes much easier
•   Power consumption
•   Need a new test box
    – The requestor can have it in minutes
Power consumption
• Maybe not as important in Europe
• 8 core machines are more efficient than
  1 core
• But memcache uses 1 core and all RAM
• Get more RAM and virtualise
Our network admin boxes
•   1 Xen CPU for Vyatta
•   1 Xen CPU for LVS
•   1 Xen CPU for Squid - Carp
•   1 Xen CPU for Squid
•   1 Xen CPU for Monitoring
•   1 Xen CPU for network tasks

• We can have more of these and a loss of one
  affects us less
Vyatta
• Opensource router
  – Really like it
  – No need to use Cisco
LVS
•   Linux Virtual Server
•   Low level load balancer
•   HA
•   Fast
•   Doesn’t inspire people to put things in
    the only place that is hard to scale
Squid Carp
• Squids configured to hash the urls and
  send them to specific backend
• Very little configuration done
• Logging of UDP - no disk IO
Squid
• As a reverse web accelerator
• 90 % of our hits served from RAM in less than
  1 ms
• Same as wikipedia
• We only use RAM cache ( unlike wikipedia)
• Cached per user
• If not cacheable - cache for a second to
  redue backend effect
App servers
• 1 xen cpu for memcache ( 5 GB Ram)
• 1 xen cpu for squid ( 5GB Ram )
• 6 xen cpus for apache (6 GB Ram )

• More power efficient, less affected by
  loss
• Applications can’t affect each other
Databases
• Keep developers on short leash
• Report bad queries
• Fear object relational mappers
Outsourcing
• As much as possible
• The younger you are as a company the
  less risk
  – When you have no users, you have no
    value
• VCs don’t like having their money go
  into Capex
What I want from Vendors
• They do what they tell me
• They do what I tell them

• No annoying up sells, no premium
  services
  – I know more about what you are selling
    than you
Services we use
• Amazon EC2 and S3
• Panther-Express
Panther Express
• Fantastic Content Distribution Network
• Cheap, simple price list
  – Take note akamai
• Cut delivery time to Europe by 70%
• We let our images be cached 1 second
  to redue load
EC2 and S3
•   We save all our binlogs to S3
•   We save database dumps to S3
•   We have monitors running from EC2
•   We plan to build a datawarehouse
    cluster on EC2
EC2 Requires Automation
• Machine is blank when you bring it up
• Download database dump from S3 and
  replicate up - automatically
• Use puppet
• Amazon saves you hardware
  headaches
  – But complexity is still a problem
Thank you

More Related Content

Viewers also liked

Web engineering - An overview about HTML
Web engineering -  An overview about HTMLWeb engineering -  An overview about HTML
Web engineering - An overview about HTML
Nosheen Qamar
 
Web Engineering - Web Application Testing
Web Engineering - Web Application TestingWeb Engineering - Web Application Testing
Web Engineering - Web Application Testing
Nosheen Qamar
 
Web application testing with Selenium
Web application testing with SeleniumWeb application testing with Selenium
Web application testing with Selenium
Kerry Buckley
 
Web App Testing - A Practical Approach
Web App Testing - A Practical ApproachWeb App Testing - A Practical Approach
Web App Testing - A Practical Approach
Walter Mamed
 
Testing Web Applications
Testing Web ApplicationsTesting Web Applications
Testing Web Applications
Seth McLaughlin
 
Web Application Testing
Web Application TestingWeb Application Testing
Web Application TestingRicha Goel
 
Selenium Testing Project report
Selenium Testing Project reportSelenium Testing Project report
Selenium Testing Project report
Kapil Rajpurohit
 
Software testing basic concepts
Software testing basic conceptsSoftware testing basic concepts
Software testing basic conceptsHưng Hoàng
 
Testing concepts ppt
Testing concepts pptTesting concepts ppt
Testing concepts pptRathna Priya
 
Software Testing Fundamentals
Software Testing FundamentalsSoftware Testing Fundamentals
Software Testing FundamentalsChankey Pathak
 
Software testing ppt
Software testing pptSoftware testing ppt
Software testing ppt
Heritage Institute Of Tech,India
 

Viewers also liked (12)

Web engineering - An overview about HTML
Web engineering -  An overview about HTMLWeb engineering -  An overview about HTML
Web engineering - An overview about HTML
 
Web Engineering - Web Application Testing
Web Engineering - Web Application TestingWeb Engineering - Web Application Testing
Web Engineering - Web Application Testing
 
Web application testing with Selenium
Web application testing with SeleniumWeb application testing with Selenium
Web application testing with Selenium
 
Web App Testing - A Practical Approach
Web App Testing - A Practical ApproachWeb App Testing - A Practical Approach
Web App Testing - A Practical Approach
 
Testing Web Applications
Testing Web ApplicationsTesting Web Applications
Testing Web Applications
 
Web Application Testing
Web Application TestingWeb Application Testing
Web Application Testing
 
Testing web application
Testing web applicationTesting web application
Testing web application
 
Selenium Testing Project report
Selenium Testing Project reportSelenium Testing Project report
Selenium Testing Project report
 
Software testing basic concepts
Software testing basic conceptsSoftware testing basic concepts
Software testing basic concepts
 
Testing concepts ppt
Testing concepts pptTesting concepts ppt
Testing concepts ppt
 
Software Testing Fundamentals
Software Testing FundamentalsSoftware Testing Fundamentals
Software Testing Fundamentals
 
Software testing ppt
Software testing pptSoftware testing ppt
Software testing ppt
 

Similar to Web 2.0 Performance and Reliability: How to Run Large Web Apps

Make Your Life Easier With Maatkit
Make Your Life Easier With MaatkitMake Your Life Easier With Maatkit
Make Your Life Easier With MaatkitMySQLConference
 
Drizzle Talk
Drizzle TalkDrizzle Talk
Drizzle Talk
Brian Aker
 
How the JDeveloper team test JDeveloper at UKOUG'08
How the JDeveloper team test JDeveloper at UKOUG'08How the JDeveloper team test JDeveloper at UKOUG'08
How the JDeveloper team test JDeveloper at UKOUG'08
kingsfleet
 
All The Little Pieces
All The Little PiecesAll The Little Pieces
All The Little Pieces
Andrei Zmievski
 
Tips on High Performance Server Programming
Tips on High Performance Server ProgrammingTips on High Performance Server Programming
Tips on High Performance Server Programming
Joshua Zhu
 
Becoming a Power User
Becoming a Power UserBecoming a Power User
Becoming a Power User
Michigan Nonprofit Association
 
The Automation Factory
The Automation FactoryThe Automation Factory
The Automation Factory
Nathan Milford
 
Understanding and hiding your operations
Understanding and hiding your operationsUnderstanding and hiding your operations
Understanding and hiding your operations
Daniel López Jiménez
 
The Current State of Asynchronous Processing With Ruby
The Current State of Asynchronous Processing With RubyThe Current State of Asynchronous Processing With Ruby
The Current State of Asynchronous Processing With Ruby
mattmatt
 
Nevmug Lighthouse Automation7.1
Nevmug   Lighthouse   Automation7.1Nevmug   Lighthouse   Automation7.1
Nevmug Lighthouse Automation7.1
csharney
 
Just In Time Scalability Agile Methods To Support Massive Growth Presentation
Just In Time Scalability  Agile Methods To Support Massive Growth PresentationJust In Time Scalability  Agile Methods To Support Massive Growth Presentation
Just In Time Scalability Agile Methods To Support Massive Growth Presentation
Long Nguyen
 
Practical project automation
Practical project automationPractical project automation
Practical project automation
Reinout van Rees
 
Securing Rails
Securing RailsSecuring Rails
Securing Rails
Alex Payne
 
Secure Programming With Static Analysis
Secure Programming With Static AnalysisSecure Programming With Static Analysis
Secure Programming With Static AnalysisConSanFrancisco123
 
When Crypto Attacks! (Yahoo 2009)
When Crypto Attacks! (Yahoo 2009)When Crypto Attacks! (Yahoo 2009)
When Crypto Attacks! (Yahoo 2009)
Nate Lawson
 
Nsd, il tuo compagno di viaggio quando Domino va in crash
Nsd, il tuo compagno di viaggio quando Domino va in crashNsd, il tuo compagno di viaggio quando Domino va in crash
Nsd, il tuo compagno di viaggio quando Domino va in crash
Fabio Pignatti
 
Scaling Rails with memcached
Scaling Rails with memcachedScaling Rails with memcached
Scaling Rails with memcachedelliando dias
 
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelKernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Anne Nicolas
 
Tools and Tips to Diagnose Performance Issues
Tools and Tips to Diagnose Performance IssuesTools and Tips to Diagnose Performance Issues
Tools and Tips to Diagnose Performance Issues
Claudio Miranda
 

Similar to Web 2.0 Performance and Reliability: How to Run Large Web Apps (20)

Make Your Life Easier With Maatkit
Make Your Life Easier With MaatkitMake Your Life Easier With Maatkit
Make Your Life Easier With Maatkit
 
Drizzle Talk
Drizzle TalkDrizzle Talk
Drizzle Talk
 
How the JDeveloper team test JDeveloper at UKOUG'08
How the JDeveloper team test JDeveloper at UKOUG'08How the JDeveloper team test JDeveloper at UKOUG'08
How the JDeveloper team test JDeveloper at UKOUG'08
 
All The Little Pieces
All The Little PiecesAll The Little Pieces
All The Little Pieces
 
Tips on High Performance Server Programming
Tips on High Performance Server ProgrammingTips on High Performance Server Programming
Tips on High Performance Server Programming
 
Becoming a Power User
Becoming a Power UserBecoming a Power User
Becoming a Power User
 
The Automation Factory
The Automation FactoryThe Automation Factory
The Automation Factory
 
Understanding and hiding your operations
Understanding and hiding your operationsUnderstanding and hiding your operations
Understanding and hiding your operations
 
The Current State of Asynchronous Processing With Ruby
The Current State of Asynchronous Processing With RubyThe Current State of Asynchronous Processing With Ruby
The Current State of Asynchronous Processing With Ruby
 
Nevmug Lighthouse Automation7.1
Nevmug   Lighthouse   Automation7.1Nevmug   Lighthouse   Automation7.1
Nevmug Lighthouse Automation7.1
 
Just In Time Scalability Agile Methods To Support Massive Growth Presentation
Just In Time Scalability  Agile Methods To Support Massive Growth PresentationJust In Time Scalability  Agile Methods To Support Massive Growth Presentation
Just In Time Scalability Agile Methods To Support Massive Growth Presentation
 
Practical project automation
Practical project automationPractical project automation
Practical project automation
 
Securing Rails
Securing RailsSecuring Rails
Securing Rails
 
Os Wilhelm
Os WilhelmOs Wilhelm
Os Wilhelm
 
Secure Programming With Static Analysis
Secure Programming With Static AnalysisSecure Programming With Static Analysis
Secure Programming With Static Analysis
 
When Crypto Attacks! (Yahoo 2009)
When Crypto Attacks! (Yahoo 2009)When Crypto Attacks! (Yahoo 2009)
When Crypto Attacks! (Yahoo 2009)
 
Nsd, il tuo compagno di viaggio quando Domino va in crash
Nsd, il tuo compagno di viaggio quando Domino va in crashNsd, il tuo compagno di viaggio quando Domino va in crash
Nsd, il tuo compagno di viaggio quando Domino va in crash
 
Scaling Rails with memcached
Scaling Rails with memcachedScaling Rails with memcached
Scaling Rails with memcached
 
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelKernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernel
 
Tools and Tips to Diagnose Performance Issues
Tools and Tips to Diagnose Performance IssuesTools and Tips to Diagnose Performance Issues
Tools and Tips to Diagnose Performance Issues
 

More from adunne

Seedcamp Overview
Seedcamp OverviewSeedcamp Overview
Seedcamp Overview
adunne
 
Netvibes Preview
Netvibes PreviewNetvibes Preview
Netvibes Preview
adunne
 
Community Practices: From Forums to Social Networks
Community Practices: From Forums to Social NetworksCommunity Practices: From Forums to Social Networks
Community Practices: From Forums to Social Networks
adunne
 
Designing Tag Navigation
Designing Tag NavigationDesigning Tag Navigation
Designing Tag Navigation
adunne
 
Social Commerce and Community
Social Commerce and CommunitySocial Commerce and Community
Social Commerce and Community
adunne
 
The Starfish and the Spider
The Starfish and the SpiderThe Starfish and the Spider
The Starfish and the Spider
adunne
 
Ginger Preview
Ginger PreviewGinger Preview
Ginger Preview
adunne
 
Add Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with SolrAdd Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with Solr
adunne
 
The Impact of Mobile Web 2.0 on the Telecoms Industry
The Impact of Mobile Web 2.0 on the Telecoms IndustryThe Impact of Mobile Web 2.0 on the Telecoms Industry
The Impact of Mobile Web 2.0 on the Telecoms Industry
adunne
 
Building Web 2.0: Next-Generation Data Centers
Building Web 2.0: Next-Generation Data CentersBuilding Web 2.0: Next-Generation Data Centers
Building Web 2.0: Next-Generation Data Centers
adunne
 
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
adunne
 
Designing for a Web of Data
Designing for a Web of DataDesigning for a Web of Data
Designing for a Web of Data
adunne
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Apps
adunne
 
Disrupting the Platform: Harnessing social analytics and other musings on the...
Disrupting the Platform: Harnessing social analytics and other musings on the...Disrupting the Platform: Harnessing social analytics and other musings on the...
Disrupting the Platform: Harnessing social analytics and other musings on the...
adunne
 
Your User's Privacy
Your User's PrivacyYour User's Privacy
Your User's Privacy
adunne
 
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data SetUnder the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
adunne
 
Scalable Web Architectures: Common Patterns and Approaches
Scalable Web Architectures: Common Patterns and ApproachesScalable Web Architectures: Common Patterns and Approaches
Scalable Web Architectures: Common Patterns and Approaches
adunne
 
Trends in Search Engine Optimization and Search Engine Marketing
Trends in Search Engine Optimization and Search Engine MarketingTrends in Search Engine Optimization and Search Engine Marketing
Trends in Search Engine Optimization and Search Engine Marketing
adunne
 
Wuala, P2P Online Storage
Wuala, P2P Online StorageWuala, P2P Online Storage
Wuala, P2P Online Storage
adunne
 
Breaking Down The Barriers: Design for Accessibility
Breaking Down The Barriers: Design for AccessibilityBreaking Down The Barriers: Design for Accessibility
Breaking Down The Barriers: Design for Accessibility
adunne
 

More from adunne (20)

Seedcamp Overview
Seedcamp OverviewSeedcamp Overview
Seedcamp Overview
 
Netvibes Preview
Netvibes PreviewNetvibes Preview
Netvibes Preview
 
Community Practices: From Forums to Social Networks
Community Practices: From Forums to Social NetworksCommunity Practices: From Forums to Social Networks
Community Practices: From Forums to Social Networks
 
Designing Tag Navigation
Designing Tag NavigationDesigning Tag Navigation
Designing Tag Navigation
 
Social Commerce and Community
Social Commerce and CommunitySocial Commerce and Community
Social Commerce and Community
 
The Starfish and the Spider
The Starfish and the SpiderThe Starfish and the Spider
The Starfish and the Spider
 
Ginger Preview
Ginger PreviewGinger Preview
Ginger Preview
 
Add Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with SolrAdd Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with Solr
 
The Impact of Mobile Web 2.0 on the Telecoms Industry
The Impact of Mobile Web 2.0 on the Telecoms IndustryThe Impact of Mobile Web 2.0 on the Telecoms Industry
The Impact of Mobile Web 2.0 on the Telecoms Industry
 
Building Web 2.0: Next-Generation Data Centers
Building Web 2.0: Next-Generation Data CentersBuilding Web 2.0: Next-Generation Data Centers
Building Web 2.0: Next-Generation Data Centers
 
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
 
Designing for a Web of Data
Designing for a Web of DataDesigning for a Web of Data
Designing for a Web of Data
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Apps
 
Disrupting the Platform: Harnessing social analytics and other musings on the...
Disrupting the Platform: Harnessing social analytics and other musings on the...Disrupting the Platform: Harnessing social analytics and other musings on the...
Disrupting the Platform: Harnessing social analytics and other musings on the...
 
Your User's Privacy
Your User's PrivacyYour User's Privacy
Your User's Privacy
 
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data SetUnder the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
 
Scalable Web Architectures: Common Patterns and Approaches
Scalable Web Architectures: Common Patterns and ApproachesScalable Web Architectures: Common Patterns and Approaches
Scalable Web Architectures: Common Patterns and Approaches
 
Trends in Search Engine Optimization and Search Engine Marketing
Trends in Search Engine Optimization and Search Engine MarketingTrends in Search Engine Optimization and Search Engine Marketing
Trends in Search Engine Optimization and Search Engine Marketing
 
Wuala, P2P Online Storage
Wuala, P2P Online StorageWuala, P2P Online Storage
Wuala, P2P Online Storage
 
Breaking Down The Barriers: Design for Accessibility
Breaking Down The Barriers: Design for AccessibilityBreaking Down The Barriers: Design for Accessibility
Breaking Down The Barriers: Design for Accessibility
 

Recently uploaded

一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
taqyed
 
VAT Registration Outlined In UAE: Benefits and Requirements
VAT Registration Outlined In UAE: Benefits and RequirementsVAT Registration Outlined In UAE: Benefits and Requirements
VAT Registration Outlined In UAE: Benefits and Requirements
uae taxgpt
 
The-McKinsey-7S-Framework. strategic management
The-McKinsey-7S-Framework. strategic managementThe-McKinsey-7S-Framework. strategic management
The-McKinsey-7S-Framework. strategic management
Bojamma2
 
Skye Residences | Extended Stay Residences Near Toronto Airport
Skye Residences | Extended Stay Residences Near Toronto AirportSkye Residences | Extended Stay Residences Near Toronto Airport
Skye Residences | Extended Stay Residences Near Toronto Airport
marketingjdass
 
Project File Report BBA 6th semester.pdf
Project File Report BBA 6th semester.pdfProject File Report BBA 6th semester.pdf
Project File Report BBA 6th semester.pdf
RajPriye
 
Affordable Stationery Printing Services in Jaipur | Navpack n Print
Affordable Stationery Printing Services in Jaipur | Navpack n PrintAffordable Stationery Printing Services in Jaipur | Navpack n Print
Affordable Stationery Printing Services in Jaipur | Navpack n Print
Navpack & Print
 
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
BBPMedia1
 
Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...
dylandmeas
 
ModelingMarketingStrategiesMKS.CollumbiaUniversitypdf
ModelingMarketingStrategiesMKS.CollumbiaUniversitypdfModelingMarketingStrategiesMKS.CollumbiaUniversitypdf
ModelingMarketingStrategiesMKS.CollumbiaUniversitypdf
fisherameliaisabella
 
ikea_woodgreen_petscharity_cat-alogue_digital.pdf
ikea_woodgreen_petscharity_cat-alogue_digital.pdfikea_woodgreen_petscharity_cat-alogue_digital.pdf
ikea_woodgreen_petscharity_cat-alogue_digital.pdf
agatadrynko
 
Sustainability: Balancing the Environment, Equity & Economy
Sustainability: Balancing the Environment, Equity & EconomySustainability: Balancing the Environment, Equity & Economy
Sustainability: Balancing the Environment, Equity & Economy
Operational Excellence Consulting
 
What are the main advantages of using HR recruiter services.pdf
What are the main advantages of using HR recruiter services.pdfWhat are the main advantages of using HR recruiter services.pdf
What are the main advantages of using HR recruiter services.pdf
HumanResourceDimensi1
 
What is the TDS Return Filing Due Date for FY 2024-25.pdf
What is the TDS Return Filing Due Date for FY 2024-25.pdfWhat is the TDS Return Filing Due Date for FY 2024-25.pdf
What is the TDS Return Filing Due Date for FY 2024-25.pdf
seoforlegalpillers
 
20240425_ TJ Communications Credentials_compressed.pdf
20240425_ TJ Communications Credentials_compressed.pdf20240425_ TJ Communications Credentials_compressed.pdf
20240425_ TJ Communications Credentials_compressed.pdf
tjcomstrang
 
Cree_Rey_BrandIdentityKit.PDF_PersonalBd
Cree_Rey_BrandIdentityKit.PDF_PersonalBdCree_Rey_BrandIdentityKit.PDF_PersonalBd
Cree_Rey_BrandIdentityKit.PDF_PersonalBd
creerey
 
Attending a job Interview for B1 and B2 Englsih learners
Attending a job Interview for B1 and B2 Englsih learnersAttending a job Interview for B1 and B2 Englsih learners
Attending a job Interview for B1 and B2 Englsih learners
Erika906060
 
The effects of customers service quality and online reviews on customer loyal...
The effects of customers service quality and online reviews on customer loyal...The effects of customers service quality and online reviews on customer loyal...
The effects of customers service quality and online reviews on customer loyal...
balatucanapplelovely
 
LA HUG - Video Testimonials with Chynna Morgan - June 2024
LA HUG - Video Testimonials with Chynna Morgan - June 2024LA HUG - Video Testimonials with Chynna Morgan - June 2024
LA HUG - Video Testimonials with Chynna Morgan - June 2024
Lital Barkan
 
Memorandum Of Association Constitution of Company.ppt
Memorandum Of Association Constitution of Company.pptMemorandum Of Association Constitution of Company.ppt
Memorandum Of Association Constitution of Company.ppt
seri bangash
 
FINAL PRESENTATION.pptx12143241324134134
FINAL PRESENTATION.pptx12143241324134134FINAL PRESENTATION.pptx12143241324134134
FINAL PRESENTATION.pptx12143241324134134
LR1709MUSIC
 

Recently uploaded (20)

一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
 
VAT Registration Outlined In UAE: Benefits and Requirements
VAT Registration Outlined In UAE: Benefits and RequirementsVAT Registration Outlined In UAE: Benefits and Requirements
VAT Registration Outlined In UAE: Benefits and Requirements
 
The-McKinsey-7S-Framework. strategic management
The-McKinsey-7S-Framework. strategic managementThe-McKinsey-7S-Framework. strategic management
The-McKinsey-7S-Framework. strategic management
 
Skye Residences | Extended Stay Residences Near Toronto Airport
Skye Residences | Extended Stay Residences Near Toronto AirportSkye Residences | Extended Stay Residences Near Toronto Airport
Skye Residences | Extended Stay Residences Near Toronto Airport
 
Project File Report BBA 6th semester.pdf
Project File Report BBA 6th semester.pdfProject File Report BBA 6th semester.pdf
Project File Report BBA 6th semester.pdf
 
Affordable Stationery Printing Services in Jaipur | Navpack n Print
Affordable Stationery Printing Services in Jaipur | Navpack n PrintAffordable Stationery Printing Services in Jaipur | Navpack n Print
Affordable Stationery Printing Services in Jaipur | Navpack n Print
 
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
 
Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...
 
ModelingMarketingStrategiesMKS.CollumbiaUniversitypdf
ModelingMarketingStrategiesMKS.CollumbiaUniversitypdfModelingMarketingStrategiesMKS.CollumbiaUniversitypdf
ModelingMarketingStrategiesMKS.CollumbiaUniversitypdf
 
ikea_woodgreen_petscharity_cat-alogue_digital.pdf
ikea_woodgreen_petscharity_cat-alogue_digital.pdfikea_woodgreen_petscharity_cat-alogue_digital.pdf
ikea_woodgreen_petscharity_cat-alogue_digital.pdf
 
Sustainability: Balancing the Environment, Equity & Economy
Sustainability: Balancing the Environment, Equity & EconomySustainability: Balancing the Environment, Equity & Economy
Sustainability: Balancing the Environment, Equity & Economy
 
What are the main advantages of using HR recruiter services.pdf
What are the main advantages of using HR recruiter services.pdfWhat are the main advantages of using HR recruiter services.pdf
What are the main advantages of using HR recruiter services.pdf
 
What is the TDS Return Filing Due Date for FY 2024-25.pdf
What is the TDS Return Filing Due Date for FY 2024-25.pdfWhat is the TDS Return Filing Due Date for FY 2024-25.pdf
What is the TDS Return Filing Due Date for FY 2024-25.pdf
 
20240425_ TJ Communications Credentials_compressed.pdf
20240425_ TJ Communications Credentials_compressed.pdf20240425_ TJ Communications Credentials_compressed.pdf
20240425_ TJ Communications Credentials_compressed.pdf
 
Cree_Rey_BrandIdentityKit.PDF_PersonalBd
Cree_Rey_BrandIdentityKit.PDF_PersonalBdCree_Rey_BrandIdentityKit.PDF_PersonalBd
Cree_Rey_BrandIdentityKit.PDF_PersonalBd
 
Attending a job Interview for B1 and B2 Englsih learners
Attending a job Interview for B1 and B2 Englsih learnersAttending a job Interview for B1 and B2 Englsih learners
Attending a job Interview for B1 and B2 Englsih learners
 
The effects of customers service quality and online reviews on customer loyal...
The effects of customers service quality and online reviews on customer loyal...The effects of customers service quality and online reviews on customer loyal...
The effects of customers service quality and online reviews on customer loyal...
 
LA HUG - Video Testimonials with Chynna Morgan - June 2024
LA HUG - Video Testimonials with Chynna Morgan - June 2024LA HUG - Video Testimonials with Chynna Morgan - June 2024
LA HUG - Video Testimonials with Chynna Morgan - June 2024
 
Memorandum Of Association Constitution of Company.ppt
Memorandum Of Association Constitution of Company.pptMemorandum Of Association Constitution of Company.ppt
Memorandum Of Association Constitution of Company.ppt
 
FINAL PRESENTATION.pptx12143241324134134
FINAL PRESENTATION.pptx12143241324134134FINAL PRESENTATION.pptx12143241324134134
FINAL PRESENTATION.pptx12143241324134134
 

Web 2.0 Performance and Reliability: How to Run Large Web Apps

  • 1. Artur Bergman sky@crucially.net • Wikia Inc – We are hiring – Community/Bizdev in Germany – Engineers in Poland – http://www.wikia.com/wiki/hiring • O’Reilly Radar – http://radar.oreilly.com/artur/
  • 2. The value of operations • Google • Orkut • Friendster • Myspace
  • 3. Benefits • Users trust your brand • They rely on you • They spend more time on your site • Bad operations wastes R&D money • Fixed amount of time + faster site = more page views
  • 4. Stepchild of Engineering • Product development • Engineering • Operations – Sysadmins? • Why?
  • 5. Operations Engineering • It is engineering • Google terminology - – Site Reliability Engineer • Sure there are sysadmins too, people mananing NOCs and datacenters • Provide career growth
  • 6. Good Engineers • Detail Oriented • Aspire to be operational engineers • Stubborn • Can steer their inner ADD – Interrupt driven • Not the same as good developers
  • 7. Danger signs • Thinks operation is a path to development engineering – Fire them • Want people dedicated to the task • A good operations engineer should spend some time in development • A good development engineer MUST spend some time in operations
  • 8.
  • 9. Debugging • 9 Rules of debugging • http://www.debuggingrules.com/Poster_ download.html – Yes the font is horrible
  • 10. Rule 1: Understand the system • Complexity Kills • No excuse • If you write it, you must know it • If you run it, you must know it • If you buy it, you must know it
  • 11. Rule 3: Quit thinking and look • quot;It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”
  • 12. Rule 3: Quit thinking and look • What do you look at? • The importance of monitoring • Monitoring • Monitoring • Monitoring
  • 13. My my, confusing term • Monitoring • Alerting • Trending
  • 14. Monitoring • Collects data • Puts into databases • Makes it available for you • Active collection • Passive interaction
  • 15. Alerting • Acts on monitoring data • Severe alerts – Active – Needs action • Passive alerts – Things that need to be done but not right now • DO NOT OVER ALERT • DO NOT CRY WOLF
  • 16. Wikia alerting strategy • When the site is slow • Or down • We send emails and do phone calls • Europe and US West coast • Looking to hire in East Asia • No night time
  • 17. Trending • Long term • Capacity planning
  • 18. Monitor Tools • Nagios • Cacti • MRTG • Hyperic • Cricket • Ganglia
  • 19. External Monitoring • Use one, tells you what your clients see every x minutes • Keynote • Gomez • Websitepulse (cheap - easy - I like them; no annoying salesforce)
  • 20. Nagios • Alerting • Hassle • C CGI?? • Doesn’t scale
  • 21. Hyperic • Most exciting open source tool • Agent base - self configured • Baseline alerting
  • 22. Cricket MRTG Cacti • Impossible to configure • You need to write tools to do it • Especially Cacti – Somewhat more pleasant than clawing out your eyes
  • 23. Ganglia • We love ganglia • Automatically graphs everything you want - just works • Large scale clusters • Multicast • Zero config • RRD
  • 24. http://ganglia.wikimedia.org/ • 270 hosts • 880 CPU • 2 clusters • 1.2 TB of Memory
  • 26. Custom Ganglia Gmetrics • Write your own gmetric --name='Oldest query' --type=int32 --units='sec' --dmax=65 --value=`echo ' show processlist' | mysql -uroot -ppass | grep -v Sleep | grep -v 'system user' | head -2 | tail -1 | cut -f 6`
  • 27. Custom Ganglia Gmetrics • Or Learn Unix gmetric --name='Oldest query' --type=int32 --units='sec' --dmax=65 --value=`echo ' show processlist' | mysql -uroot -ppass | grep -v Sleep | grep -v 'system user' | head -2 | tail -1 | cut -f 6`
  • 28. Custom Ganglia Gmetrics • Write your own gmetric --name='Oldest query' --type=int32 --units='sec' --dmax=65 --value=`echo ' show processlist' | mysql -uroot -ppass | grep -v Sleep | grep -v 'system user' | head -2 | tail -1 | cut -f 6`
  • 29. Custom Ganglia Gmetrics • Write your own gmetric --name='Oldest query' --type=int32 --units='sec' --dmax=65 --value=`echo ' show processlist' | mysql -uroot -ppass | grep -v Sleep | grep -v 'system user' | head -2 | tail -1 | cut -f 6`
  • 30. Custom Ganglia Gmetrics • Write your own gmetric --name='Oldest query' --type=int32 --units='sec' --dmax=65 --value=`echo ' show processlist' | mysql -uroot -ppass | grep -v Sleep | grep -v 'system user' | head -2 | tail -1 | cut -f 6`
  • 31. Something is wrong • Don’t worry, data warehouse QuickTime™ and a TIFF (Uncompressed) decompressor are needed to see this picture.
  • 32. tcpdump / waveshark • If you suspect the network • Don’t just suspect • LOOK AT IT • Tcpdump / waveshark will tell you – If your packets are lost, delayed or corrupted – Your windowing is wrong
  • 33. Rule 4: Divde and Conquer • Look at the problems in turn • Split between people • Go in the order you suspect is the most likely
  • 34. Rule 5: Change one thing at a time • I cannot stress this enough • IF YOU DO NOT THEN YOU HAVE FAILED TO IDENTIFY THE PROBLEM
  • 35. Rule 6: Keep an audit trail • You might be making things worse • Good for the root cause analysis • Have your shell log all commands – Good practice anyway • Version control
  • 36. Rule 9: If you didn’t fix it, it ain’t fixed • You must do something to fix a problem • Or it will bite you again • And again • And again • They don’t just appear and disappear • Except BGP route convergence :)
  • 37. Process • You need a little • Don’t worry
  • 39. Complexity kills • Design against it • Reuse components • Define standards • Have a few images that all machines look like - reimage machines every now and then for the heck of it. – EC2 forces you to do this
  • 40. MTBF Meduim Time Between Failure • Actually mostly irrelevant • Dealing with failure is more important • Target the right uptime – Complexity scales exponatially with required uptime • Don’t kid yourself, you don’t need 5 nines
  • 41. MTTR Medium Time To Recovery • Important • Noone cares if you fail once a minute – If you recover in 50 ms • If you are down 1 minute a week, you are still going to hit 4 nines (99.99%) • Failures happen, plan how to deal with them
  • 42. Problem found • If it is critical, start a phone conversation • Use IRC to communicate technical data • One person liasons with non technical staff • One person specifically in command • Sleep scheduling ( audit log important )
  • 43. Post crisis • Root cause analysis – Just find out what went wrong – And how to avoid it – Or fix it faster next time if you can’t • Keep track of your uptime
  • 44. Automation • All machines are created equal • Seriously • If you manually make changes • You are wrong – Unless you know what you are doing
  • 45. Best practices • Version control • Gold images • Centralised authentication • Time Sync ( NTP ) • Central logging • ( All of this applies for virtual machines too!)
  • 46. cfengine • Standard automation tool • Written in C • Not much support • Very good • Very annoying
  • 47. contro : l s te i = ( mys te ) i domain = ( mysite .count y ) r sysadm = (mark ) netmask = ( 255.255.255.0 ) ac i t onsequence = ( mounta ll mount nfo i addmounts mounta l l lnks i ) mountpat rn = / ie) ( te $(s t /$ host)) homepat r = ( u? ) te n
  • 48. Puppet • New hip kid on the block • Written in ruby • Better support? • Much nicer syntax • Easier to extend
  • 49. def ne yumrepo (enab i led = true) {c i i onf gfle { /e c quot; t /yum.repos / .d $name.repo”: mode => 644, source => quot; yum/repos / /$name. repoquot;, ensure => $enab led ? { true => fl , ie defau t=> absent l } }}
  • 50. cobb er l • Automatic PXE Installer – Uses kickstart files • Redhat Enterprise • Centos • Fedora • Some support for debian
  • 51. cobbler cobbler system add --name=xen8 --mac=00:19:B9:EE:6D:0A --ip=10.10.30.208 --profile=Centos-5-x86_64 --kopts='ksdevice=00:19:B9:EE:6D:0A console=ttyS1,57600 console=tty0'
  • 52. cobbler cobbler system add --name=xen8 --mac=00:19:B9:EE:6D:0A --ip=10.10.30.208 --profile=Centos-5-x86_64 --kopts='ksdevice=00:19:B9:EE:6D:0A console=ttyS1,57600 console=tty0’
  • 53. koan • Client install tool – Xen – Or OS re-image koan --server=10.10.30.205 --virt -- profile=virt_fc6 --virt-name=otrs
  • 54. Your datacenter • Keep it tidy – Label things, keep cables as short as possible – Have a switch in each rack • If you are small without dedicated DC staff you need – Remote control power switches – Remote console!
  • 55. Virtualization • Please use it • Managing becomes much easier • Power consumption • Need a new test box – The requestor can have it in minutes
  • 56. Power consumption • Maybe not as important in Europe • 8 core machines are more efficient than 1 core • But memcache uses 1 core and all RAM • Get more RAM and virtualise
  • 57. Our network admin boxes • 1 Xen CPU for Vyatta • 1 Xen CPU for LVS • 1 Xen CPU for Squid - Carp • 1 Xen CPU for Squid • 1 Xen CPU for Monitoring • 1 Xen CPU for network tasks • We can have more of these and a loss of one affects us less
  • 58. Vyatta • Opensource router – Really like it – No need to use Cisco
  • 59. LVS • Linux Virtual Server • Low level load balancer • HA • Fast • Doesn’t inspire people to put things in the only place that is hard to scale
  • 60. Squid Carp • Squids configured to hash the urls and send them to specific backend • Very little configuration done • Logging of UDP - no disk IO
  • 61. Squid • As a reverse web accelerator • 90 % of our hits served from RAM in less than 1 ms • Same as wikipedia • We only use RAM cache ( unlike wikipedia) • Cached per user • If not cacheable - cache for a second to redue backend effect
  • 62. App servers • 1 xen cpu for memcache ( 5 GB Ram) • 1 xen cpu for squid ( 5GB Ram ) • 6 xen cpus for apache (6 GB Ram ) • More power efficient, less affected by loss • Applications can’t affect each other
  • 63. Databases • Keep developers on short leash • Report bad queries • Fear object relational mappers
  • 64. Outsourcing • As much as possible • The younger you are as a company the less risk – When you have no users, you have no value • VCs don’t like having their money go into Capex
  • 65. What I want from Vendors • They do what they tell me • They do what I tell them • No annoying up sells, no premium services – I know more about what you are selling than you
  • 66. Services we use • Amazon EC2 and S3 • Panther-Express
  • 67. Panther Express • Fantastic Content Distribution Network • Cheap, simple price list – Take note akamai • Cut delivery time to Europe by 70% • We let our images be cached 1 second to redue load
  • 68. EC2 and S3 • We save all our binlogs to S3 • We save database dumps to S3 • We have monitors running from EC2 • We plan to build a datawarehouse cluster on EC2
  • 69. EC2 Requires Automation • Machine is blank when you bring it up • Download database dump from S3 and replicate up - automatically • Use puppet • Amazon saves you hardware headaches – But complexity is still a problem