Release Workflow


Diego Muñoz
diego@tuenti.com
http://twitter.com/Kartones
Agenda
•   Some numbers
•   Release workflow
•   Some tools
•   Some rants
Some numbers
•   +12M users
•   +100 usage minutes / day (avg)
•   +200M chat messages / day
•   +4M photos uploaded / day (peaks)
•   +40,000M page views / month
•   +35K requests / sec (peaks)
•   +1K servers
•   +250 employees (~60% techies)
•   +15K files in the repositories
•   +8K Tests
Release Workflow



Branch → Code → Test → Integrate → Release → Stabilize
Release Workflow: Branch
•   8-12 branches per release
•   Current record: 29 branches
•   Repository per functional area (be, fe, stats, …)
•   Avg. # lines modified per release: 63K
Release Workflow: Code + Test
•   Scrum (or at least Agile)
•   As TDD as possible
•   Labs
•   A/B Testing
•   PoCs
•   Dark launch
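
A/B testing and dark launches both need a stable way to decide which users see what. The following is a minimal sketch, assuming hash-based bucketing and reading "dark launch" as percentage-gated exposure; the function names, experiment names and percentages are hypothetical and not taken from Tuenti's code.

```python
import hashlib


def _stable_hash(label: str, user_id: int) -> int:
    """Hash (label, user) to an integer that is the same on every run and machine."""
    key = f"{label}:{user_id}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def ab_bucket(user_id: int, experiment: str, variants=("control", "treatment")) -> str:
    """Deterministically map a user to one variant of an experiment."""
    return variants[_stable_hash(experiment, user_id) % len(variants)]


def dark_launch_enabled(user_id: int, feature: str, percent: int) -> bool:
    """Enable a feature for roughly `percent`% of users, stable per user."""
    return _stable_hash(feature, user_id) % 100 < percent
```

Hashing the experiment name together with the user id keeps a user's bucket stable across requests while keeping different experiments independent of each other.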
Release Workflow: Integrate
•   Repo always available
•   Only merge with 100% tests ok / TFW approval
•   QA Regression & manual tests
•   Fix possible merge/integration problems ASAP
Release Workflow: Release
• 2 releases per week (Tuesdays & Thursdays)
• Latest stable changeset from Integration,
  taken at 11 AM the previous day
• Release doc, pre-release meetings
• Staging servers to test with live data
• We are searching for a Release Manager ;)
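
The release schedule above can be sketched as a small date calculation: find the next Tuesday or Thursday, then take 11 AM of the day before as the changeset cutoff. This is an illustrative sketch only; the function names are hypothetical and time zones are ignored.

```python
from datetime import datetime, timedelta

# datetime.weekday(): Monday is 0, so Tuesday is 1 and Thursday is 3.
RELEASE_WEEKDAYS = {1, 3}


def next_release_day(now: datetime) -> datetime:
    """Midnight of the first Tuesday or Thursday on or after `now`'s date."""
    day = now.replace(hour=0, minute=0, second=0, microsecond=0)
    while day.weekday() not in RELEASE_WEEKDAYS:
        day += timedelta(days=1)
    return day


def changeset_cutoff(release_day: datetime) -> datetime:
    """11 AM on the day before the release."""
    previous = release_day - timedelta(days=1)
    return previous.replace(hour=11, minute=0, second=0, microsecond=0)
```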
Release Workflow: Stabilize
•   First code push: 8 AM
•   Release window: 3 hours (normal scenario)
•   Live error stabilization or rollback
•   Representatives from all involved teams
Some tools
DVCS: Mercurial
•   http://mercurial.selenic.com/
•   Syntax similar to SVN (our old system)
•   Easy API to plug our plugins and hooks
•   100% cross-platform (Git is now too, but was not back then)
• Commit hooks to check syntax, coding standards…
• Bottleneck:
    – Push/pulls through VPN are slow
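
A commit hook that checks coding standards boils down to "scan the changed files, reject the commit on any violation". The sketch below assumes three example rules (trailing whitespace, tabs, line length); the rules and function names are hypothetical, and the wiring into Mercurial's `[hooks]` configuration is left out.

```python
def check_coding_standards(path: str, text: str, max_len: int = 120) -> list[str]:
    """Return one message per violation; an empty list means the file passes."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line != line.rstrip():
            problems.append(f"{path}:{number}: trailing whitespace")
        if "\t" in line:
            problems.append(f"{path}:{number}: tab character")
        if len(line) > max_len:
            problems.append(f"{path}:{number}: line longer than {max_len} characters")
    return problems


def hook_exit_code(files: dict[str, str]) -> int:
    """Print every violation and return non-zero if any file fails."""
    failures = [
        message
        for path, text in files.items()
        for message in check_coding_standards(path, text)
    ]
    for message in failures:
        print(message)
    return 1 if failures else 0
```

An external pre-commit style hook in Mercurial rejects the operation when the hook command exits with a non-zero status, which is why the sketch reduces everything to an exit code.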
Issue Tracking: Trac
•   http://trac.edgewall.org/
•   User Stories tasks
•   Bugs
•   Wiki
•   Extensible through plugins
•   Bottleneck: Sometimes slow, code viewing not
    optimal
Testing: PHPUnit
• http://www.phpunit.de
• Some caveats
  – Mocking just ‘works’
  – PHP process spawning PHP tests
• We have:
  – Vastly improved mocking framework
  – Shell scripts that isolate test batteries
  – Better integration with Selenium
• Bottleneck: Our current FEFW does not cope
  perfectly with PHPUnit/Selenium
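
Isolating test batteries means each battery runs in a fresh process, so static state such as singletons cannot leak from one battery to the next. The slides mention shell scripts for this; the sketch below shows the same idea in Python, with hypothetical function names and caller-supplied commands.

```python
import subprocess
import sys


def run_battery_isolated(command: list[str], timeout: int = 600) -> bool:
    """Run one test battery in its own process; True when it exits with status 0."""
    result = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    return result.returncode == 0


def run_all(batteries: dict[str, list[str]]) -> dict[str, bool]:
    """Run every battery in a separate process and report pass/fail per battery."""
    return {name: run_battery_isolated(cmd) for name, cmd in batteries.items()}
```

Each value in `batteries` is the full command line for one battery, for example the test runner plus a directory of tests. `sys.executable` is handy for trying the sketch out with stand-in commands such as `[sys.executable, "-c", "raise SystemExit(0)"]`.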
CI: Jenkins
• http://jenkins-ci.org/
• Previously Hudson too
• CI servers farm
• Parallelization, plugins, special reports, custom
  tuning
• Bottleneck: Acceptance tests
Storage: MySQL
• Live site storage
• Dev. env. storage
  – 1 DB per user (to run tests)
  – 1 shared DB (common faked data)
• Bottleneck: Slow when running tests
File copying: RSync
•   http://rsync.samba.org/
•   Deployment of code (live & dev)
•   Sends deltas/diffs
•   Really fast
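
A deployment step built on rsync can be as small as assembling the command line. The sketch below uses the standard `-a` (archive), `-z` (compress), `--delete` and `--dry-run` options; the host name and paths in the usage note are hypothetical, and the actual flags used at Tuenti are not stated in the slides.

```python
def rsync_command(source_dir: str, host: str, target_dir: str, dry_run: bool = False) -> list[str]:
    """Build an rsync invocation that mirrors source_dir onto host:target_dir."""
    cmd = ["rsync", "-az", "--delete"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without copying
    # A trailing slash on the source copies the directory's contents,
    # not the directory itself.
    cmd += [source_dir.rstrip("/") + "/", f"{host}:{target_dir}"]
    return cmd
```

For example, `rsync_command("build", "web01", "/var/www", dry_run=True)` yields a command that can be passed to `subprocess.run` to preview a deployment.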
Configuration: Puppet
•   http://puppetlabs.com/
•   Production machines
•   Jenkins nodes
•   VM management / Dev web servers config
Stats: Hadoop
•   http://hadoop.apache.org/
•   Dedicated cluster
•   HBase
•   Hive
VirtualBox
•   https://www.virtualbox.org/
•   Multi-platform
•   Accurate and fast emulation of Windows OS
•   Bottleneck: RAM of host machine
Search: Sphinx
•   http://sphinxsearch.com/
•   Non-realtime (index based)
•   Very fast
•   Bottleneck: Index generation on dev & test env.
Caching: Memcached
• http://memcached.org/
• Dev. Behaviour == live behaviour
• Tuenti has sent improvements & patches
  – UDP patch, random ports
• Bottleneck: 32GB RAM / machine practical limit
Some rants
Statics versioning: The good
• Browser caching problems gone
• Transparent & easy to use by developers
• Easy to deactivate for testing
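
One common way to version statics is to derive the version from the file content, so the URL changes exactly when the file does. This is a minimal sketch under that assumption; the query-string format and the function name are hypothetical, and the slides do not say which scheme Tuenti uses.

```python
import hashlib


def versioned_url(path: str, content: bytes, enabled: bool = True) -> str:
    """Append a short content hash so a changed file gets a new URL."""
    if not enabled:
        return path  # versioning switched off, e.g. while testing
    version = hashlib.sha256(content).hexdigest()[:8]
    return f"{path}?v={version}"
```

Because the version comes from the content, an unchanged file keeps its URL and stays cached in the browser, while any edit produces a fresh URL.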
Statics versioning: The bad
• Dangerous if not careful with
  non-production-ready files
• Waste of bandwidth
• No dev versioning == browser caching
  problems
File bundling: The good
•   Fewer files == faster download & deploy
•   Big text file == better HTTP Gzip
•   Build time only (no need for dev)
•   Combines perfectly with minification and
    versioning
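
The "big text file == better HTTP Gzip" point can be checked with a quick measurement: compress each file separately, then compress the concatenation, and compare. The sketch below is illustrative only; the function names are hypothetical and real savings depend on the files.

```python
import gzip


def gzip_size(data: bytes) -> int:
    """Size in bytes of the gzip-compressed data."""
    return len(gzip.compress(data))


def compare_bundling(files: list[bytes]) -> tuple[int, int]:
    """Return (sum of per-file gzip sizes, gzip size of the concatenated bundle)."""
    separate = sum(gzip_size(content) for content in files)
    bundled = gzip_size(b"\n".join(files))
    return separate, bundled
```

A bundle pays the gzip header cost once and lets the compressor reuse patterns shared between files, so similar JS or CSS files usually compress better together.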
File bundling: The bad
• Firebug debugging harder (big single files)
• One syntax error breaks multiple bundled files
  – Minified + Bundled + Error == Real pain
• Hierarchy needed to avoid bundling the same
  files multiple times
  – Web, external/public page, mobile…
• Dev-only errors!
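
The bundle hierarchy can be sketched as parent/child bundles where a child only ships the files its ancestors do not already provide. The data layout (`parent` and `files` keys), the bundle names and the file names below are hypothetical.

```python
def resolve_bundles(bundles: dict[str, dict]) -> dict[str, list[str]]:
    """For each bundle, keep only the files not already provided by its ancestors."""

    def inherited(name: str) -> set[str]:
        # Collect every file declared by the parent chain of `name`.
        parent = bundles[name].get("parent")
        if parent is None:
            return set()
        return inherited(parent) | set(bundles[parent]["files"])

    resolved = {}
    for name, spec in bundles.items():
        seen = inherited(name)
        own = []
        for path in spec["files"]:
            if path not in seen:  # skips ancestor files and in-bundle repeats
                seen.add(path)
                own.append(path)
        resolved[name] = own
    return resolved
```

With a `common` bundle as the parent of `web` and `mobile`, a file listed in both `common` and `web` ends up only in `common`.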
Our Build script
•   Localization
•   Minification
•   Bundling
•   Versioning
•   Statics deployment to CDNs
•   Deltas of changes or full build
•   …
•   Bottleneck: Build time
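
"Deltas of changes or full build" suggests the build can skip files that have not changed. One way to do that, sketched here as an assumption rather than a description of Tuenti's script, is to keep a manifest of content hashes from the previous build and compare against it; the names are hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Content hash used to detect changes between builds."""
    return hashlib.sha256(content).hexdigest()


def files_to_rebuild(current: dict[str, bytes], manifest: dict[str, str], full: bool = False) -> list[str]:
    """Paths to process: everything on a full build, else only new or changed files."""
    if full:
        return sorted(current)
    return sorted(
        path
        for path, content in current.items()
        if manifest.get(path) != fingerprint(content)
    )
```

`manifest` maps each path to the fingerprint recorded by the previous build; after a successful build it would be rewritten from `current`.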
HipHop
• Migrating old code to fully support HipHop
  – With PHP 5.3
• Obvious speed improvements
• Also nice for static code analysis
Our Chat
•   Erlang (server) + JavaScript (client)
•   Jabber protocol
•   Tweaked ejabberd (3.5x faster)
•   200M msgs/day, 1M concurrent users peak,…
•   20 machines, ~5 instances per machine
•   Same behaviour dev/live is critical

• We’re searching for Erlang experts ;)
exit(0);


 Sounds interesting?
http://jobs.tuenti.com


Editor's Notes

  • #7 TDD: easier on the backend; the frontend gets hard once you enter visual (acceptance) tests
  • #9 Monday: too much traffic. Friday: the weekend starts the next day, so it is safer not to release in case something goes wrong. (First redesign story)
  • #10 Shared Gdocs spreadsheet in which QA adds bugs and engineers check and mark them
  • #13 Partially due to our architecture: We’re working on master-slave architectures to make writes on master but reads on slaves, failover…
  • #14 Yes, we use Singletons. The problem is PHP running PHP and thus keeping state between test batteries. We're working on adding more testability features to the FEFW
  • #15 We use XEN (http://xen.org/) for Windows virtualization on Jenkins builds. Nodes are not virtualized because of a 20% performance loss
  • #16 Mock everything (unit/integration). Reuse data if possible (acceptance)
  • #20 Now all dev laptops have 8GB RAM and old ones can ask for an upgrade
  • #21 Indexed in 5-15 min normal scenario, worst case 1h max/limit
  • #22 UDP + random ports instead of TCP biggest win
  • #26 JS & CSS
  • #27 IE's 32 CSS files limitation is the best example of dev-only problems. Split into multiple lines instead of just one (eases IE debugging)
  • #28 Migration from single SH + PHP script to Ant parallelized script almost done. Build time: 1 minute!