SlideShare a Scribd company logo
1 of 69
Download to read offline
Scalable, Good,
    Cheap
 a tale of sexiness, puppets, shell
         scripts, and python
From this...
...to this!
Get your infrastructure
     started right!

  (not just preparing for incident and
         rapid event response)
Who we are?

Avleen Vig (@avleen)
 ● Senior Systems Engineer at Etsy
 ● Good at: Scaling frontends, python
 ● Previous companies: WooMe, Google, Earthlink

Marc Cluet (@lynxman)
● Senior Systems Engineer at WooMe
● Good at: Backend scaling, bash/python, languages
● Previous companies: RTFX, Tiscali, World Online
Overview
● Workflow

● Why planning for scaling is important

● How do you choose your software

● Setting up your infrastructure

● Managing your infrastructure
The background

● Larger startup, $32m in funding
● 6 million+ active users
● Dozens of developers
● 6 systems administrators
● 4 DBAs
● 10+ code releases every day
● Geographically distributed employees
   ○ Brooklyn HQ
   ○ Satellites in Berlin, San Francisco
   ○ Small number of remote employees
The background

● Small, funded start up
● 6 python developers
● 2 front end developers
● 3 systems administrators
● 1 DBA (moustache included)
● Multiple code releases every day
● Geographically distributed employees
   ○ Berlin, Copenhagen, Leeds, London, Los Angeles,
     Oakland, Paris, Portland, Zagreb
Workflow

● Ticket systems
   ○ Ticket, or it didn't happen!

● Documentation
   ○ Wikis are good

● Don't Repeat Yourself
   ○ If you keep doing the same thing manually, automate

● Version control everything
   ○ All of your scripts
   ○ All of your configurations
Workflow

● Everything will change

● Technical debt vs Premature optimisation
   ○ If you try to be too accurate too early, you'll fail
Team integration

● Be sure to hire the right people
   ○ Beer recruitment interview

● Encourage speed
   ○ Release soon and release often

● Embrace mistakes as part of your day to day
   ○ Learn to work with it

● Ask for peer reviews for important components
   ○ Helps sanity checking your logic

● Developers, Sysadmins, DBAs, one team
Team communication

● Team communication is the most critical factor

● Make sure everyone is in the loop

● Useful applications
   ○ IRC
   ○ Skype
   ○ email
   ○ shout!

● Don't be afraid to use the phone to avoid miscommunication
Layering! Not just for haircuts.

Separate your systems

● Front end

● Application

● Database

● Caching
Choosing your software

● What does your software need to do?
  ○ FastCGI / HTTP proxy? Use nginx
  ○ PHP processing? Use apache

● What expertise do you already have?
  ○ Stick to what you're 100% good at

● Don't rewrite everything
  ○ If it does 70% of what you need it's good for you
Release management

● Fast and furious

● Automate, automate, automate

● Script your deploys and rollbacks

● Continuous deployment

● MTTR vs MTBF
MTTR vs MTBF
Logging

● Centralize your logging

  ○ syslog-ng

● Parsing web logs - the secret troubleshooting weapon

  ○ SQL

  ○ Splunk
Web logs in a database!
CREATE TABLE access (
    ip inet,
    hostname text,
    username text,
    date timestamp without time zone,
    method text,
    path text,
    protocol text,
    status integer,
    size integer,
    referrer text,
    useragent text,
    clienttime double precision,
    backendtime double precision,
    backendip inet,
    backendport integer,
    backendstatus integer,
    ssl_cipher text,
    ssl_protocol text,
    scheme text
);
Web logs in a database!
Monitoring

● Alerting vs Trend analysis
Monitoring

● Alerting vs Trend analysis
   ○ Nagios is great for raising alerts on problems
Monitoring

● Alerting vs Trend analysis

  ○ Nagios is great for raising alerts on problems

  ○ Ganglia is great at long term trend analysis

  ○ Know when something is out of the "ordinary"
Monitoring

● Alerting vs Trend analysis

  ○ Nagios is great for raising alerts on problems

  ○ Ganglia is great at long term trend analysis

  ○ Know when something is out of the "ordinary"

● What should you monitor?

  ○ Anything which breaks once

  ○ Customer facing services
Monitoring

● Alerting vs Trend analysis

  ○ Nagios is great for raising alerts on problems

  ○ Ganglia is great at long term trend analysis

  ○ Know when something is out of the "ordinary"

● What should you graph?

  ○ Everything! If it moves, graph it.

  ○ Customer facing rates and statistics
Monitoring

Get statistics from your logs:

● PostgreSQL: pgfouine

● MySQL: mk-query-digest

● Web servers: webalizer, awstats, urchin

● Custom applications: Do it yourself! Integrate with Ganglia
Monitoring
Caching

● Caches are disposable
Caching

● Caches are disposable

● But what about the thundering herd?
The importance of scaling
The importance of scaling
 ● August 2003 Northeastern US and Canada blackout

   ○ Caused by poor process execution

   ○ Lack of good monitoring

   ○ Poor scaling
The importance of scaling
The importance of scaling
 ● Massive destruction avoided!

   ○ 256 power stations automatically shut down

   ○ 85% after disconnecting from the grid

   ○ Power lost but plants saved!
Caching

● Caches are disposable

● But what about the thundering herd?

  ○ Increase backend capacity along with cache capacity

  ○ Plan for cache failure

  ○ Reduce demand when cache fails
Caching

● Find out how your caching software works

  ○ Memcache + peep!

  ○ Is it better with lots of keys and small objects?

  ○ Or fewer keys and large objects?

  ○ How is memory allocated?
Caching

● Caches are disposable
   ○ Solved!

● But what about the thundering herd?
   ○ Solved!

● Now we get into database scaling!
   ○ Over to Marc...
Databases

                Databases...


      or how to live and die dangerously
Databases

            SQL or NoSQL?
Databases

● SQL
   ○ Gives you transactional consistency
   ○ Good known system
   ○ Hard to scale

● NoSQL
   ○ Transactionally consistent "eventually"
   ○ New cool system
   ○ Easy to scale
Databases

● SQL
   ○ Gives you transactional consistency
   ○ Good known system
   ○ Hard to scale

● NoSQL
   ○ Transactionally consistent "eventually"
   ○ New cool system
   ○ Easy to scale

               You may end up using BOTH!
Databases

● Be smart about your table design
Databases

● Be smart about your table design
   ○ Keep it simple but modular to avoid surprises
You need to design your database right!
Databases

● Be smart about your table design
   ○ Keep it simple but modular to avoid surprises
   ○ Don't abuse many-to-many tables, they will just give you
     hell
Databases

● Be smart about your table design
   ○ Keep it simple but modular to avoid surprises
   ○ Don't abuse many-to-many tables, they will just give you
     hell

● YOU WILL GET IT WRONG
   ○ You'll need to redesign parts of your DB semi-regularly
   ○ Be prepared for the unexpected
Databases

The read dilemma

● As the tables grow so do read times and memory.
  Several options:

  ○ Check your slow query log, tune indexes

  ○ Partition to read smaller numbers of rows

  ○ Master / Slave, but this adds replication lag!
Databases

The read dilemma

● As the tables grow so do read times and memory.
  Several options:

  ○ Check your slow query log, tune indexes
     ■ Single most common problem with slow queries and
       capacity
     ■ Be careful about foreign keys
Databases

The read dilemma

● As the tables grow so do read times and memory.
  Several options:

  ○ Check your slow query log, tune indexes

  ○ Partition to read smaller numbers of rows
     ■ By range (date, id)
     ■ By hash (usernames)
     ■ By anything you can imagine!
Databases

The write conundrum

● As the database grows so do writes

● Writes are bound by disk I/O
  ○ RAID1+0 helps

● Don't shoot yourself in the foot!
   ○ Don't try to solve this early
   ○ Have monitoring ready to foresee this issue
   ○ Bring pizza
Databases

Divide writes!
● Remember about modular? This is it
Databases

How to give a consistent view to the servers?

Use a query director!

● pgbouncer on Postgres

● gizzard on MySQL
Web frontend

● Hardware load balancers - Good but expensive!

● Software load balancers - Good and cheap! (more pizza)

  ○ Web server frontends
    ■ nginx, lighttpd, apache

  ○ Reverse proxies
     ■ varnish, squid

  ○ Kernel stuff
     ■ Linux ipvs
Web frontend

Which way should I go?

● Web servers as load balancers
  ○ Gives you nice add on features
  ○ You can offload some process in the frontend
  ○ Buffering problems

● Reverse proxies
   ○ Caching stuff is good
   ○ Fast reaction time
   ○ No buffering problems
Web frontend

Divide your web clusters!

● You can send different requests to different clusters

● You can use an API call to connect between them
Configuration management

● Be ready to mass scale
   ○ Keep all your machines in line

● Automated server installs
   ○ Use it to install new software
   ○ Also to rapidly deploy new versions
Writing tools

● If you do something more than 2 times it's worth scripting

● Write small tools when you need them

● Stick to one or two languages
   ○ And be good at them
Writing tools

● Even better

● Have your scripts repo in a cvs and push it everywhere
Backups

● It's important to have backups
Backups

● It's important to have backups

● It's even more important to exercise them!
   ○ Having backups without testing recovery is like having no
      backups
Backups

● It's important to have backups

● It's even more important to exercise them!
   ○ Having backups without testing recovery is like having no
      backups

● How can we exercise backups for cheap?
Backups

● It's important to have backups

● It's even more important to exercise them!
   ○ Having backups without testing recovery is like having no
      backups

● How can we exercise backups for cheap?
  ○ Cloud computing!
Cloud computing

● Cloud computing help us recreate our platform on the cloud

● Giving us a more than credible recovery scenario

● Also very useful to spawn more instances if we run into
  problems
Interesting things to read

Wikipedia
● http://en.wikipedia.org/wiki/DevOps

Web Operations and Capacity Planning
● http://kitchensoap.com

High scalability (if you get there)
 ● http://highscalability.com/

If you really fancy databases, explain extended
 ● http://explainextended.com/
Questions?


   Work at Etsy!           Work at WooMe!
http://etsy.com/jobs   http://bit.ly/work4woome




     @avleen                @lynxman

More Related Content

Similar to Scalable, good, cheap

Piano Media - approach to data gathering and processing
Piano Media - approach to data gathering and processingPiano Media - approach to data gathering and processing
Piano Media - approach to data gathering and processingMartinStrycek
 
Austin bdug 2011_01_27_small_and_big_data
Austin bdug 2011_01_27_small_and_big_dataAustin bdug 2011_01_27_small_and_big_data
Austin bdug 2011_01_27_small_and_big_dataAlex Pinkin
 
Running Neo4j in Production: Tips, Tricks and Optimizations
Running Neo4j in Production:  Tips, Tricks and OptimizationsRunning Neo4j in Production:  Tips, Tricks and Optimizations
Running Neo4j in Production: Tips, Tricks and OptimizationsNick Manning
 
Running Neo4j in Production: Tips, Tricks and Optimizations
Running Neo4j in Production:  Tips, Tricks and OptimizationsRunning Neo4j in Production:  Tips, Tricks and Optimizations
Running Neo4j in Production: Tips, Tricks and OptimizationsNick Manning
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB
 
Kibana+ElasticSearch+LogStash to handle Log messages on Prod servers
Kibana+ElasticSearch+LogStash to handle Log messages on Prod serversKibana+ElasticSearch+LogStash to handle Log messages on Prod servers
Kibana+ElasticSearch+LogStash to handle Log messages on Prod serversHYS Enterprise
 
Monitoring Big Data Systems - "The Simple Way"
Monitoring Big Data Systems - "The Simple Way"Monitoring Big Data Systems - "The Simple Way"
Monitoring Big Data Systems - "The Simple Way"Demi Ben-Ari
 
Distributed systems and consistency
Distributed systems and consistencyDistributed systems and consistency
Distributed systems and consistencyseldo
 
What drives Innovation? Innovations And Technological Solutions for the Distr...
What drives Innovation? Innovations And Technological Solutions for the Distr...What drives Innovation? Innovations And Technological Solutions for the Distr...
What drives Innovation? Innovations And Technological Solutions for the Distr...Stefano Fago
 
Dirty Data? Clean it up! - Rocky Mountain DataCon 2016
Dirty Data? Clean it up! - Rocky Mountain DataCon 2016Dirty Data? Clean it up! - Rocky Mountain DataCon 2016
Dirty Data? Clean it up! - Rocky Mountain DataCon 2016Dan Lynn
 
My talk at Linux Piter 2015
My talk at Linux Piter 2015My talk at Linux Piter 2015
My talk at Linux Piter 2015Alex Chistyakov
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixC4Media
 
Monitoring and automation
Monitoring and automationMonitoring and automation
Monitoring and automationRicardo Bánffy
 
Aws uk ug #8 not everything that happens in vegas stay in vegas
Aws uk ug #8   not everything that happens in vegas stay in vegasAws uk ug #8   not everything that happens in vegas stay in vegas
Aws uk ug #8 not everything that happens in vegas stay in vegasPeter Mounce
 
TRHUG 2015 - Veloxity Big Data Migration Use Case
TRHUG 2015 - Veloxity Big Data Migration Use CaseTRHUG 2015 - Veloxity Big Data Migration Use Case
TRHUG 2015 - Veloxity Big Data Migration Use CaseHakan Ilter
 
Dirty data? Clean it up! - Datapalooza Denver 2016
Dirty data? Clean it up! - Datapalooza Denver 2016Dirty data? Clean it up! - Datapalooza Denver 2016
Dirty data? Clean it up! - Datapalooza Denver 2016Dan Lynn
 
Solving the Database Problem
Solving the Database ProblemSolving the Database Problem
Solving the Database ProblemJay Gordon
 
Services, tools & practices for a software house
Services, tools & practices for a software houseServices, tools & practices for a software house
Services, tools & practices for a software houseParis Apostolopoulos
 
Quick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
Quick dive into the big data pool without drowning - Demi Ben-Ari @ PanoraysQuick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
Quick dive into the big data pool without drowning - Demi Ben-Ari @ PanoraysDemi Ben-Ari
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3 Omid Vahdaty
 

Similar to Scalable, good, cheap (20)

Piano Media - approach to data gathering and processing
Piano Media - approach to data gathering and processingPiano Media - approach to data gathering and processing
Piano Media - approach to data gathering and processing
 
Austin bdug 2011_01_27_small_and_big_data
Austin bdug 2011_01_27_small_and_big_dataAustin bdug 2011_01_27_small_and_big_data
Austin bdug 2011_01_27_small_and_big_data
 
Running Neo4j in Production: Tips, Tricks and Optimizations
Running Neo4j in Production:  Tips, Tricks and OptimizationsRunning Neo4j in Production:  Tips, Tricks and Optimizations
Running Neo4j in Production: Tips, Tricks and Optimizations
 
Running Neo4j in Production: Tips, Tricks and Optimizations
Running Neo4j in Production:  Tips, Tricks and OptimizationsRunning Neo4j in Production:  Tips, Tricks and Optimizations
Running Neo4j in Production: Tips, Tricks and Optimizations
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
 
Kibana+ElasticSearch+LogStash to handle Log messages on Prod servers
Kibana+ElasticSearch+LogStash to handle Log messages on Prod serversKibana+ElasticSearch+LogStash to handle Log messages on Prod servers
Kibana+ElasticSearch+LogStash to handle Log messages on Prod servers
 
Monitoring Big Data Systems - "The Simple Way"
Monitoring Big Data Systems - "The Simple Way"Monitoring Big Data Systems - "The Simple Way"
Monitoring Big Data Systems - "The Simple Way"
 
Distributed systems and consistency
Distributed systems and consistencyDistributed systems and consistency
Distributed systems and consistency
 
What drives Innovation? Innovations And Technological Solutions for the Distr...
What drives Innovation? Innovations And Technological Solutions for the Distr...What drives Innovation? Innovations And Technological Solutions for the Distr...
What drives Innovation? Innovations And Technological Solutions for the Distr...
 
Dirty Data? Clean it up! - Rocky Mountain DataCon 2016
Dirty Data? Clean it up! - Rocky Mountain DataCon 2016Dirty Data? Clean it up! - Rocky Mountain DataCon 2016
Dirty Data? Clean it up! - Rocky Mountain DataCon 2016
 
My talk at Linux Piter 2015
My talk at Linux Piter 2015My talk at Linux Piter 2015
My talk at Linux Piter 2015
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFix
 
Monitoring and automation
Monitoring and automationMonitoring and automation
Monitoring and automation
 
Aws uk ug #8 not everything that happens in vegas stay in vegas
Aws uk ug #8   not everything that happens in vegas stay in vegasAws uk ug #8   not everything that happens in vegas stay in vegas
Aws uk ug #8 not everything that happens in vegas stay in vegas
 
TRHUG 2015 - Veloxity Big Data Migration Use Case
TRHUG 2015 - Veloxity Big Data Migration Use CaseTRHUG 2015 - Veloxity Big Data Migration Use Case
TRHUG 2015 - Veloxity Big Data Migration Use Case
 
Dirty data? Clean it up! - Datapalooza Denver 2016
Dirty data? Clean it up! - Datapalooza Denver 2016Dirty data? Clean it up! - Datapalooza Denver 2016
Dirty data? Clean it up! - Datapalooza Denver 2016
 
Solving the Database Problem
Solving the Database ProblemSolving the Database Problem
Solving the Database Problem
 
Services, tools & practices for a software house
Services, tools & practices for a software houseServices, tools & practices for a software house
Services, tools & practices for a software house
 
Quick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
Quick dive into the big data pool without drowning - Demi Ben-Ari @ PanoraysQuick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
Quick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
 

More from Marc Cluet

Your Kernel and You
Your Kernel and YouYour Kernel and You
Your Kernel and YouMarc Cluet
 
Managing DevOps teams, staying alive
Managing DevOps teams, staying aliveManaging DevOps teams, staying alive
Managing DevOps teams, staying aliveMarc Cluet
 
The DevOps journey - How to get there painlessly
The DevOps journey - How to get there painlesslyThe DevOps journey - How to get there painlessly
The DevOps journey - How to get there painlesslyMarc Cluet
 
Elastic Beanstalk, usos prácticos y conceptos
Elastic Beanstalk, usos prácticos y conceptosElastic Beanstalk, usos prácticos y conceptos
Elastic Beanstalk, usos prácticos y conceptosMarc Cluet
 
Service discovery and puppet
Service discovery and puppetService discovery and puppet
Service discovery and puppetMarc Cluet
 
Puppet Camp London Fall 2015 - Service Discovery and Puppet
Puppet Camp London Fall 2015 - Service Discovery and PuppetPuppet Camp London Fall 2015 - Service Discovery and Puppet
Puppet Camp London Fall 2015 - Service Discovery and PuppetMarc Cluet
 
Puppet and your Metadata - PuppetCamp London 2015
Puppet and your Metadata - PuppetCamp London 2015Puppet and your Metadata - PuppetCamp London 2015
Puppet and your Metadata - PuppetCamp London 2015Marc Cluet
 
Consul First Steps
Consul First StepsConsul First Steps
Consul First StepsMarc Cluet
 
Autoscaling Best Practices - WebPerf Barcelona Oct 2014
Autoscaling Best Practices - WebPerf Barcelona Oct 2014Autoscaling Best Practices - WebPerf Barcelona Oct 2014
Autoscaling Best Practices - WebPerf Barcelona Oct 2014Marc Cluet
 
Microservices and the Cloud - DevOps Cardiff Meetup
Microservices and the Cloud - DevOps Cardiff MeetupMicroservices and the Cloud - DevOps Cardiff Meetup
Microservices and the Cloud - DevOps Cardiff MeetupMarc Cluet
 
Microservices and the Cloud
Microservices and the CloudMicroservices and the Cloud
Microservices and the CloudMarc Cluet
 
How to implement microservices
How to implement microservicesHow to implement microservices
How to implement microservicesMarc Cluet
 
A Metadata Ocean in Chef and Puppet
A Metadata Ocean in Chef and PuppetA Metadata Ocean in Chef and Puppet
A Metadata Ocean in Chef and PuppetMarc Cluet
 
Autoscaling Best Practices
Autoscaling Best PracticesAutoscaling Best Practices
Autoscaling Best PracticesMarc Cluet
 
Rackspace Hack Night - Vagrant & Packer
Rackspace Hack Night - Vagrant & PackerRackspace Hack Night - Vagrant & Packer
Rackspace Hack Night - Vagrant & PackerMarc Cluet
 
Innovation in the Cloud - Rackspace Zurich Event
Innovation in the Cloud - Rackspace Zurich EventInnovation in the Cloud - Rackspace Zurich Event
Innovation in the Cloud - Rackspace Zurich EventMarc Cluet
 
Introduction to DevOps - Rackspace tech night
Introduction to DevOps - Rackspace tech nightIntroduction to DevOps - Rackspace tech night
Introduction to DevOps - Rackspace tech nightMarc Cluet
 
Hadoop operations
Hadoop operationsHadoop operations
Hadoop operationsMarc Cluet
 
Introduction to hadoop
Introduction to hadoopIntroduction to hadoop
Introduction to hadoopMarc Cluet
 
Ssh that wonderful thing
Ssh that wonderful thingSsh that wonderful thing
Ssh that wonderful thingMarc Cluet
 

More from Marc Cluet (20)

Your Kernel and You
Your Kernel and YouYour Kernel and You
Your Kernel and You
 
Managing DevOps teams, staying alive
Managing DevOps teams, staying aliveManaging DevOps teams, staying alive
Managing DevOps teams, staying alive
 
The DevOps journey - How to get there painlessly
The DevOps journey - How to get there painlesslyThe DevOps journey - How to get there painlessly
The DevOps journey - How to get there painlessly
 
Elastic Beanstalk, usos prácticos y conceptos
Elastic Beanstalk, usos prácticos y conceptosElastic Beanstalk, usos prácticos y conceptos
Elastic Beanstalk, usos prácticos y conceptos
 
Service discovery and puppet
Service discovery and puppetService discovery and puppet
Service discovery and puppet
 
Puppet Camp London Fall 2015 - Service Discovery and Puppet
Puppet Camp London Fall 2015 - Service Discovery and PuppetPuppet Camp London Fall 2015 - Service Discovery and Puppet
Puppet Camp London Fall 2015 - Service Discovery and Puppet
 
Puppet and your Metadata - PuppetCamp London 2015
Puppet and your Metadata - PuppetCamp London 2015Puppet and your Metadata - PuppetCamp London 2015
Puppet and your Metadata - PuppetCamp London 2015
 
Consul First Steps
Consul First StepsConsul First Steps
Consul First Steps
 
Autoscaling Best Practices - WebPerf Barcelona Oct 2014
Autoscaling Best Practices - WebPerf Barcelona Oct 2014Autoscaling Best Practices - WebPerf Barcelona Oct 2014
Autoscaling Best Practices - WebPerf Barcelona Oct 2014
 
Microservices and the Cloud - DevOps Cardiff Meetup
Microservices and the Cloud - DevOps Cardiff MeetupMicroservices and the Cloud - DevOps Cardiff Meetup
Microservices and the Cloud - DevOps Cardiff Meetup
 
Microservices and the Cloud
Microservices and the CloudMicroservices and the Cloud
Microservices and the Cloud
 
How to implement microservices
How to implement microservicesHow to implement microservices
How to implement microservices
 
A Metadata Ocean in Chef and Puppet
A Metadata Ocean in Chef and PuppetA Metadata Ocean in Chef and Puppet
A Metadata Ocean in Chef and Puppet
 
Autoscaling Best Practices
Autoscaling Best PracticesAutoscaling Best Practices
Autoscaling Best Practices
 
Rackspace Hack Night - Vagrant & Packer
Rackspace Hack Night - Vagrant & PackerRackspace Hack Night - Vagrant & Packer
Rackspace Hack Night - Vagrant & Packer
 
Innovation in the Cloud - Rackspace Zurich Event
Innovation in the Cloud - Rackspace Zurich EventInnovation in the Cloud - Rackspace Zurich Event
Innovation in the Cloud - Rackspace Zurich Event
 
Introduction to DevOps - Rackspace tech night
Introduction to DevOps - Rackspace tech nightIntroduction to DevOps - Rackspace tech night
Introduction to DevOps - Rackspace tech night
 
Hadoop operations
Hadoop operationsHadoop operations
Hadoop operations
 
Introduction to hadoop
Introduction to hadoopIntroduction to hadoop
Introduction to hadoop
 
Ssh that wonderful thing
Ssh that wonderful thingSsh that wonderful thing
Ssh that wonderful thing
 

Recently uploaded

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 

Recently uploaded (20)

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 

Scalable, good, cheap

  • 1. Scalable, Good, Cheap a tale of sexiness, puppets, shell scripts, and python
  • 3.
  • 5.
  • 6. Get your infrastructure started right! (not just preparing for incident and rapid event response)
  • 7. Who we are? Avleen Vig (@avleen) ● Senior Systems Engineer at Etsy ● Good at: Scaling frontends, python ● Previous companies: WooMe, Google, Earthlink Marc Cluet (@lynxman) ● Senior Systems Engineer at WooMe ● Good at: Backend scaling, bash/python, languages ● Previous companies: RTFX, Tiscali, World Online
  • 8. Overview ● Workflow ● Why planning for scaling is important ● How do you choose your software ● Setting up your infrastructure ● Managing your infrastructure
  • 9. The background ● Larger startup, $32m in funding ● 6 million+ active users ● Dozens of developers ● 6 systems administrators ● 4 DBAs ● 10+ code releases every day ● Geographically distributed employees ○ Brooklyn HQ ○ Satellites in Berlin, San Francisco ○ Small number of remote employees
  • 10. The background ● Small, funded start up ● 6 python developers ● 2 front end developers ● 3 systems administrators ● 1 DBA (moustache included) ● Multiple code releases every day ● Geographically distributed employees ○ Berlin, Copenhagen, Leeds, London, Los Angeles, Oakland, Paris, Portland, Zagreb
  • 11. Workflow ● Ticket systems ○ Ticket, or it didn't happen! ● Documentation ○ Wikis are good ● Don't Repeat Yourself ○ If you keep doing the same thing manually, automate ● Version control everything ○ All of your scripts ○ All of your configurations
  • 12. Workflow ● Everything will change ● Technical debt vs Premature optimisation ○ If you try to be too accurate too early, you'll fail
  • 13. Team integration ● Be sure to hire the right people ○ Beer recruitment interview ● Encourage speed ○ Release soon and release often ● Embrace mistakes as part of your day to day ○ Learn to work with it ● Ask for peer reviews for important components ○ Helps sanity checking your logic ● Developers, Sysadmins, DBAs, one team
  • 14. Team communication ● Team communication is the most critical factor ● Make sure everyone is in the loop ● Useful applications ○ IRC ○ Skype ○ email ○ shout! ● Don't be afraid to use the phone to avoid miscommunication
  • 15. Layering! Not just for haircuts. Separate your systems ● Front end ● Application ● Database ● Caching
  • 16. Choosing your software ● What does your software need to do? ○ FastCGI / HTTP proxy? Use nginx ○ PHP processing? Use apache ● What expertise do you already have? ○ Stick to what you're 100% good at ● Don't rewrite everything ○ If it does 70% of what you need it's good for you
  • 17. Release management ● Fast and furious ● Automate, automate, automate ● Script your deploys and rollbacks ● Continuous deployment ● MTTR vs MTBF
  • 19.
  • 20. Logging ● Centralize your logging ○ syslog-ng ● Parsing web logs - the secret troubleshooting weapon ○ SQL ○ Splunk
  • 21. Web logs in a database! CREATE TABLE access ( ip inet, hostname text, username text, date timestamp without time zone, method text, path text, protocol text, status integer, size integer, referrer text, useragent text, clienttime double precision, backendtime double precision, backendip inet, backendport integer, backendstatus integer, ssl_cipher text, ssl_protocol text, scheme text );
  • 22. Web logs in a database!
  • 23. Monitoring ● Alerting vs Trend analysis
  • 24.
  • 25. Monitoring ● Alerting vs Trend analysis ○ Nagios is great for raising alerts on problems
  • 26.
  • 27. Monitoring ● Alerting vs Trend analysis ○ Nagios is great for raising alerts on problems ○ Ganglia is great at long term trend analysis ○ Know when something is out of the "ordinary"
  • 28. Monitoring ● Alerting vs Trend analysis ○ Nagios is great for raising alerts on problems ○ Ganglia is great at long term trend analysis ○ Know when something is out of the "ordinary" ● What should you monitor? ○ Anything which breaks once ○ Customer facing services
  • 29. Monitoring ● Alerting vs Trend analysis ○ Nagios is great for raising alerts on problems ○ Ganglia is great at long term trend analysis ○ Know when something is out of the "ordinary" ● What should you graph? ○ Everything! If it moves, graph it. ○ Customer facing rates and statistics
  • 30. Monitoring Get statistics from your logs: ● PostgreSQL: pgfouine ● MySQL: mk-query-digest ● Web servers: webalizer, awstats, urchin ● Custom applications: Do it yourself! Integrate with Ganglia
  • 33.
  • 34. Caching ● Caches are disposable ● But what about the thundering herd?
  • 35. The importance of scaling
  • 36. The importance of scaling ● August 2003 Northeastern US and Canada blackout ○ Caused by poor process execution ○ Lack of good monitoring ○ Poor scaling
  • 37. The importance of scaling
  • 38. The importance of scaling ● Massive destruction avoided! ○ 256 power stations automatically shut down ○ 85% after disconnecting from the grid ○ Power lost but plants saved!
  • 39. Caching ● Caches are disposable ● But what about the thundering herd? ○ Increase backend capacity along with cache capacity ○ Plan for cache failure ○ Reduce demand when cache fails
  • 40. Caching ● Find out how your caching software works ○ Memcache + peep! ○ Is it better with lots of keys and small objects? ○ Or fewer keys and large objects? ○ How is memory allocated?
  • 41. Caching ● Caches are disposable ○ Solved! ● But what about the thundering herd? ○ Solved! ● Now we get into database scaling! ○ Over to Marc...
  • 42. Databases Databases... or how to live and die dangerously
  • 43. Databases SQL or NoSQL?
  • 44. Databases ● SQL ○ Gives you transactional consistency ○ Good known system ○ Hard to scale ● NoSQL ○ Transactionally consistent "eventually" ○ New cool system ○ Easy to scale
  • 45. Databases ● SQL ○ Gives you transactional consistency ○ Good known system ○ Hard to scale ● NoSQL ○ Transactionally consistent "eventually" ○ New cool system ○ Easy to scale You may end up using BOTH!
  • 46. Databases ● Be smart about your table design
  • 47. Databases ● Be smart about your table design ○ Keep it simple but modular to avoid surprises
  • 48. You need to design your database right!
  • 49. Databases ● Be smart about your table design ○ Keep it simple but modular to avoid surprises ○ Don't abuse many-to-many tables, they will just give you hell
  • 50. Databases ● Be smart about your table design ○ Keep it simple but modular to avoid surprises ○ Don't abuse many-to-many tables, they will just give you hell ● YOU WILL GET IT WRONG ○ You'll need to redesign parts of your DB semi-regularly ○ Be prepared for the unexpected
  • 51. Databases The read dilemma ● As the tables grow so do read times and memory. Several options: ○ Check your slow query log, tune indexes ○ Partition to read smaller numbers of rows ○ Master / Slave, but this adds replication lag!
  • 52. Databases The read dilemma ● As the tables grow so do read times and memory. Several options: ○ Check your slow query log, tune indexes ■ Single most common problem with slow queries and capacity ■ Be careful about foreign keys
  • 53. Databases The read dilemma ● As the tables grow so do read times and memory. Several options: ○ Check your slow query log, tune indexes ○ Partition to read smaller numbers of rows ■ By range (date, id) ■ By hash (usernames) ■ By anything you can imagine!
  • 54. Databases The write conundrum ● As the database grows so do writes ● Writes are bound by disk I/O ○ RAID1+0 helps ● Don't shoot yourself in the foot! ○ Don't try to solve this early ○ Have monitoring ready to foresee this issue ○ Bring pizza
  • 55. Databases Divide writes! ● Remember about modular? This is it
  • 56. Databases How to give a consistent view to the servers? Use a query director! ● pgbouncer on Postgres ● gizzard on MySQL
  • 57. Web frontend ● Hardware load balancers - Good but expensive! ● Software load balancers - Good and cheap! (more pizza) ○ Web server frontends ■ nginx, lighttpd, apache ○ Reverse proxies ■ varnish, squid ○ Kernel stuff ■ Linux ipvs
  • 58. Web frontend Which way should I go? ● Web servers as load balancers ○ Gives you nice add on features ○ You can offload some process in the frontend ○ Buffering problems ● Reverse proxies ○ Caching stuff is good ○ Fast reaction time ○ No buffering problems
  • 59. Web frontend Divide your web clusters! ● You can send different requests to different clusters ● You can use an API call to connect between them
  • 60. Configuration management ● Be ready to mass scale ○ Keep all your machines in line ● Automated server installs ○ Use it to install new software ○ Also to rapidly deploy new versions
  • 61. Writing tools ● If you do something more than 2 times it's worth scripting ● Write small tools when you need them ● Stick to one or two languages ○ And be good at them
  • 62. Writing tools ● Even better ● Have your scripts repo in a cvs and push it everywhere
  • 63. Backups ● It's important to have backups
  • 64. Backups ● It's important to have backups ● It's even more important to exercise them! ○ Having backups without testing recovery is like having no backups
  • 65. Backups ● It's important to have backups ● It's even more important to exercise them! ○ Having backups without testing recovery is like having no backups ● How can we exercise backups for cheap?
  • 66. Backups ● It's important to have backups ● It's even more important to exercise them! ○ Having backups without testing recovery is like having no backups ● How can we exercise backups for cheap? ○ Cloud computing!
  • 67. Cloud computing ● Cloud computing help us recreate our platform on the cloud ● Giving us a more than credible recovery scenario ● Also very useful to spawn more instances if we run into problems
  • 68. Interesting things to read Wikipedia ● http://en.wikipedia.org/wiki/DevOps Web Operations and Capacity Planning ● http://kitchensoap.com High scalability (if you get there) ● http://highscalability.com/ If you really fancy databases, explain extended ● http://explainextended.com/
  • 69. Questions? Work at Etsy! Work at WooMe! http://etsy.com/jobs http://bit.ly/work4woome @avleen @lynxman