Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Aviran Mordo
Head of Engineering
@aviranm
linkedin.com/in/aviran
aviransplace.com
The Road to
Continuous Delivery
Wix In Numbers
• Over 66,000,000 users
• Static storage is >2PB of data
• 3 Data centers + 3 Clouds (Google, Amazon)
• 2B ...
Traditional Dev Pipeline
Product Dev QA Operations
16:06
Traditional Dev Pipeline
Product Dev QA Operations
Waterfall
Long development cycle
Time waste (Wait)
Late feedback
Hard to fix
1-2 Releases a year
Scrum
Lean
Agile
SCRUM
XP
Scrum != Agile
Lets Go Back In Time
Where We Were
• Working traditional waterfall
• With fear of change
• With low product quality
• With slow development vel...
16:06
Production System
Approach to Production
• Build only what is needed
• Stop if something goes wrong
• Eliminate anything w...
Seeing Waste
SevenWastes of
Manufacturing
• Inventory
• Extra Processing
• Overproduction
• Transportation
• Waiting
• Mot...
Sometimes
16%
Rarely
19%
Never
45%
Always
7%
Often
13%
The Biggest Source Of Waste
Features and functions used in a typica...
Lean Product development
Top 5 Most-Used Commands in Microsoft Word
• Paste
• Save
• Copy
• Undo
• Bold
Paste itself accou...
Scaling challenges – Product
Product Minimum Viable Product (MVP)
• Does MVP meet your product standards?
• What about too...
Get out of thought land
• The law of failure
Most new “its” will fail even if they are flawlessly
executed
• Invest less, ...
Continuous Delivery
Risk
• Waterfall - minimize number of deployments
• CD - minimize number of changes and impact in $$
Risk = #deployments
*...
Small Development Iterations
• No Waterfall
• No Scrum
• No Iterations
• No long documents
• Build something small
• When ...
Product / Dev / QA / Ops
boundaries are going down
What Is The Common Denominator?
• Product manager
• Project manager
• QA
• Operations
• DBA
CD is culture & mindset
• Trust the developers
- Empower developers to change production
- Developer knows his system best...
Dev Centric Culture –
Involve the Developer
• Product definition (with product)
• Development (with architect)
• Testing (...
• The process for releasing/deploying software MUST be
repeatable and reliable
• Automate everything!
• If something's dif...
Test Driven Development
• No new code is pushed to Git without being fully tested
- We currently have over 40,000 automate...
What people think of TDD
• TDD slows down development
• With TDD we write more code (product + test code).
• TDD has no si...
What people think of TDD
• TDD slows down development
• With TDD we write more code (product + test code).
• TDD has no si...
TDD Actual impact on development
• We develop products faster
• Removes fear of change
• Easier to enter some one else’s p...
Guidelines for successful TDD
• Tests should run on project checkout to a
random computer.
• Tests should be debugged on a...
Refactoring
Is Refactoring Rework?
Absolutely NOT !
• Refactoring is the outcome of learning
• Refactoring is the cornerstone of impro...
Refactoring
Refactor from inside out
• Small iterations with tests
• Refactor small methods make
sure the tests don’t brea...
16:06
Code branch
New Code Old Code
FT
Opened
Yes No
Usage example
Simple “if” statement in your code
Feature Toggles
• Everyone develops on the Trunk
• Every piece of code can get to production at anytime
• Unused new code ...
DB Schema Changes Without Downtime
• Adding columns
- Use another table link by primary key
- Use blob field for schema fl...
New DB schema with data migration
• Plan a lazy migration path controlled by feature toggle
1. Write to old / Read from ol...
Feature Toggle Strategies
(gradual expose users)
• Company employees
• Specific users or group of users
• Percentage of tr...
Feature Toggle Override
• By specific server
• Used to test system load
• New database flows/migration
• Refactoring that ...
A/B Test
A/B Test
• Every new feature is A/B tested
• We open the new feature to a % of users
- Define KPIs to check if the new fea...
An interesting site effect on product
How many times did you have the conversion “what is
better”?
- Put the menu on top /...
Marking users for persistent UX
• Anonymous user
- Toss is randomly determined
- Can not guarantee persistent experience i...
• Do not mix anonymous and registered tests
• AB test parentage of users with optional filters
• New Users Only (Registere...
A/B Test Features
• A/B Test Override
• Start
• Stop
• Pause
• Bots are always excluded from the test
Wix PETRI
NOT !!!
Gradual Deployment
• Assume two components
• We shutdown one and install on it the
new version. It is not active yet
• Do ...
Self Test / Post Deployment
TestAfter each server deployment run a self test before deploying the next
server.
• Checking ...
Tools - App-info – Self Test
Backward and Forward compatible
• Assume two components
• We release a new version of one
• Now Rollback the other…
A 1.1
...
A Story on Wix Time Machine
Time machine event =
• Deployment capabilities : “no click” deployment
- Dozens of services , 130+ servers, 3 Data Centers...
17,000 Deployments (production changes) a year
Double the velocity from last year
Every 7 minutes production changes its s...
Do You Have The Guts To
Deploy 60 Times A Day?
CD – prepare to invest…..
• Dev infrastructure - Refactor , Refactor, Refactor
• Testing infrastructure & know how
• Deplo...
How does it work – CD Practices
• Test driven development
• Small Development Iterations
• Backwards and Forwards compatib...
Tools - App-info - Dashboard
Tools - App-info –
Running Experiments
App-Info Resource Pools
Tools – Monitoring - New Relic
Tools – Frying Pan
Tools – Lifecycle To Rule Them All
Where are we today?
• We have re-written our flash editor product as an HTML 5 editor
- In just 4 months
• Introduced Wix ...
Aviran Mordo
Head of Back-end Engineering
@aviranm
linkedin.com/in/aviran
aviransplace.com
The Road to
Continuous Delivery...
How would you do it?
How will you change Wix session encryption key
with as little service interruption as possible
• Encr...
Road to Continuous Delivery - Wix.com
Road to Continuous Delivery - Wix.com
Upcoming SlideShare
Loading in …5
×

Road to Continuous Delivery - Wix.com

1,866 views

Published on

Guide to continuous delivery and the journey wix.com had made transitioning to DevOps and continuous delivery culture making ~100 production changes daily

Published in: Software
  • Be the first to comment

Road to Continuous Delivery - Wix.com

  1. 1. Aviran Mordo Head of Engineering @aviranm linkedin.com/in/aviran aviransplace.com The Road to Continuous Delivery
  2. 2. Wix In Numbers • Over 66,000,000 users • Static storage is >2PB of data • 3 Data centers + 3 Clouds (Google, Amazon) • 2B HTTP requests/day • 1000 people work atWix
  3. 3. Traditional Dev Pipeline Product Dev QA Operations
  4. 4. 16:06
  5. 5. Traditional Dev Pipeline Product Dev QA Operations
  6. 6. Waterfall Long development cycle Time waste (Wait) Late feedback Hard to fix 1-2 Releases a year
  7. 7. Scrum
  8. 8. Lean Agile SCRUM XP Scrum != Agile
  9. 9. Lets Go Back In Time
  10. 10. Where We Were • Working traditional waterfall • With fear of change • With low product quality • With slow development velocity • With tradition enterprise development lifecycle -Three months of a “VERSION” development and QA - Six months of crisis mode stabilizing system
  11. 11. 16:06
  12. 12. Production System Approach to Production • Build only what is needed • Stop if something goes wrong • Eliminate anything which does not add value Philosophy of Work • Respect for workers • Full utilization of workers capabilities • Entrust workers with responsibility & authority Taiichi Ohno (1912-1990)
  13. 13. Seeing Waste SevenWastes of Manufacturing • Inventory • Extra Processing • Overproduction • Transportation • Waiting • Motion • Defects SevenWastes of Software Development • Partially DoneWork • Paperwork • Extra Features • Building the wrong thing • Waiting for information • Task switching • Defects
  14. 14. Sometimes 16% Rarely 19% Never 45% Always 7% Often 13% The Biggest Source Of Waste Features and functions used in a typical system Often or Always used: 20% Rarely or never used: 64%
  15. 15. Lean Product development Top 5 Most-Used Commands in Microsoft Word • Paste • Save • Copy • Undo • Bold Paste itself accounts for more than 11% of all commands used, and has more than twice as much usage as the #2 entry on the list, Save. 32% of the total command usage
  16. 16. Scaling challenges – Product Product Minimum Viable Product (MVP) • Does MVP meet your product standards? • What about tooltip, help,first time ux, etc.. ? • And that can win in a/b test … To Be Implemente d
  17. 17. Get out of thought land • The law of failure Most new “its” will fail even if they are flawlessly executed • Invest less, in-touch less , better ability to admit it fail Data beats opinions - let the customer decide Make sure you building the right it before build it right Quick Feedback
  18. 18. Continuous Delivery
  19. 19. Risk • Waterfall - minimize number of deployments • CD - minimize number of changes and impact in $$ Risk = #deployments * chance of something going wrong (~ number of changes) * impact of something wrong in $$
  20. 20. Small Development Iterations • No Waterfall • No Scrum • No Iterations • No long documents • Build something small • When it is ready, deploy it - Measure it - Then fix it - Repeat, until Dev, Product and Customers are happy
  21. 21. Product / Dev / QA / Ops boundaries are going down
  22. 22. What Is The Common Denominator? • Product manager • Project manager • QA • Operations • DBA
  23. 23. CD is culture & mindset • Trust the developers - Empower developers to change production - Developer knows his system best • Automation as a default choice - No more “is it worth to automate ?” - Everything should be automated • Welcome to the twilight zone - Product/Dev/QA boundaries are going down - Everyone need to care about everything - Less formality : Corridor - IN , Meeting Room - Out
  24. 24. Dev Centric Culture – Involve the Developer • Product definition (with product) • Development (with architect) • Testing (with QA developers) • Deployment / Rollback(with ops) • Monitoring / BI (with BI team) • DevOps – to enable deployment and rollback, fully automated Developer Product QA ManagementOperation BI Support Circle
  25. 25. • The process for releasing/deploying software MUST be repeatable and reliable • Automate everything! • If something's difficult or painful, do it more often • Keep everything in source control • Done means “released” • Built in quality • Everybody has responsibility for the release process • Improve continuously Continuous Delivery principles
  26. 26. Test Driven Development • No new code is pushed to Git without being fully tested - We currently have over 40,000 automated tests • Before fixing a bug first write a test to reproduce the bug • Cover legacy (untested) systems with Integration tests
  27. 27. What people think of TDD • TDD slows down development • With TDD we write more code (product + test code). • TDD has no significant impact on quality
  28. 28. What people think of TDD • TDD slows down development • With TDD we write more code (product + test code). • TDD has no significant impact on quality
  29. 29. TDD Actual impact on development • We develop products faster • Removes fear of change • Easier to enter some one else’s project • Do we still need QA? (Yes, they code automation tests) - We don’t have QA for back-end applications • Writing a feature is 10-30% slower, 45-90% less bugs • 50% faster to reach production. • Considerably less time to fix bugs (almost no need for debuger)
  30. 30. Guidelines for successful TDD • Tests should run on project checkout to a random computer. • Tests should be debugged on a developer’s machine • Tests should run fast • Tests have to be readable – They are the project’s specs • Fixture is evil!
  31. 31. Refactoring
  32. 32. Is Refactoring Rework? Absolutely NOT ! • Refactoring is the outcome of learning • Refactoring is the cornerstone of improvement • Refactoring builds the capacity to change • Refactoring doesn’t cost, it pays
  33. 33. Refactoring Refactor from inside out • Small iterations with tests • Refactor small methods make sure the tests don’t break • Deploy often Re-write from the outside in • Write from scratch (one piece at a time) • Code duplication sometimes needed (temporary) • Protected by Feature Toggle Before refactoring cover everything with tests - Legacy code usually covered by IT tests
  34. 34. 16:06
  35. 35. Code branch New Code Old Code FT Opened Yes No
  36. 36. Usage example Simple “if” statement in your code
  37. 37. Feature Toggles • Everyone develops on the Trunk • Every piece of code can get to production at anytime • Unused new code can go to production – no harm done • Operational new code goes with a guard – use new or old code by feature toggle
  38. 38. DB Schema Changes Without Downtime • Adding columns - Use another table link by primary key - Use blob field for schema flexibility • Removing fields - Stop using. Do not do any DB schema changes
  39. 39. New DB schema with data migration • Plan a lazy migration path controlled by feature toggle 1. Write to old / Read from old 2. Write to both / Read from old 3. Write to both / Read from new, fallback to old Backward compatibility is a must 4. Write to new / Read from new, fallback to old 5. Eagerly migrate data in the background 6. Write to new / Read from new
  40. 40. Feature Toggle Strategies (gradual expose users) • Company employees • Specific users or group of users • Percentage of traffic • By GEO • By Language • By user-agent • User Profile based • By context (site id or some kind of hash on site id)
  41. 41. Feature Toggle Override • By specific server • Used to test system load • New database flows/migration • Refactoring that may affect performance and memory usage • By Url parameter • Enable internal testing • Product acceptance • Faking GEO • By FT cookie value • Testing • When working with API on a single page application
  42. 42. A/B Test
  43. 43. A/B Test • Every new feature is A/B tested • We open the new feature to a % of users - Define KPIs to check if the new feature is better - If it is better, keep it - If worse, check why and improve - impact of flaws is just for % of our users
  44. 44. An interesting site effect on product How many times did you have the conversion “what is better”? - Put the menu on top / on the side Well, how about building both and A/B Testing?
  45. 45. Marking users for persistent UX • Anonymous user - Toss is randomly determined - Can not guarantee persistent experience if changing browser • Registered User - Toss is determined by the user ID - Guarantee toss persistency across browsers - Allows setting additional tossing criteria (for example new users only) - Only use this for sections that a user has to be authenticated
  46. 46. • Do not mix anonymous and registered tests • AB test parentage of users with optional filters • New Users Only (Registered users only) • By language • By GEO • By Browser • user-agent • OS • Any other criteria you have on your users
  47. 47. A/B Test Features • A/B Test Override • Start • Stop • Pause • Bots are always excluded from the test
  48. 48. Wix PETRI
  49. 49. NOT !!!
  50. 50. Gradual Deployment • Assume two components • We shutdown one and install on it the new version. It is not active yet • Do self test • Activate the new server it is passes self test • Continue deploying the other servers, a few at a time, checking each one with self test A 1.1 B 1.1 A 1.1 B 1.2 A 1.1 A 1.1 B 1.1 B 1.1 A 1.1 A 1.1 B 1.1 B 1.2 A 1.1 B 1.2 A 1.1 A 1.1 B 1.1 B 1.2 A 1.1 B 1.1 A 1.1 A 1.1 B 1.1 B 1.2
  51. 51. Self Test / Post Deployment TestAfter each server deployment run a self test before deploying the next server. • Checking server configuration and topology - Make sure DB is accessible - Is the schema the one I expect - Access required local resources (files, config, templates, etc’) - Access remote resources - RPC / REST endpoints reachable and operational • Server will refuse requests unless it passes the self test • Allow a way to skip self test (and continue deployment)
  52. 52. Tools - App-info – Self Test
  53. 53. Backward and Forward compatible • Assume two components • We release a new version of one • Now Rollback the other… A 1.1 B 1.2 A 1.2 B 1.1A 1.1A 1.1 B 1.1 B 1.2 A 1.2A 1.1 B 1.1B 1.1 A 1.1 B 1.1A 1.1A 1.1 B 1.1B 1.1 A 1.0 A 1.2A 1.1 B 1.2B 1.1 B 1.2 A 1.2 A 1.2A 1.1 B 1.2B 1.1 B 1.0
  54. 54. A Story on Wix Time Machine
  55. 55. Time machine event = • Deployment capabilities : “no click” deployment - Dozens of services , 130+ servers, 3 Data Centers • Backward and forward compatibility at the extreme field test case - Mixed versions of services / DB with no service downtime • Empowerment - The power we give to individual • Risk taken and failure embracement
  56. 56. 17,000 Deployments (production changes) a year Double the velocity from last year Every 7 minutes production changes its state (during working hours)
  57. 57. Do You Have The Guts To Deploy 60 Times A Day?
  58. 58. CD – prepare to invest….. • Dev infrastructure - Refactor , Refactor, Refactor • Testing infrastructure & know how • Deployment infrastructure & tools • Automation , Automation , Automation • Monitoring (business and technical) • hundreds of aspects • thresholds use is a Must • Monitor business KPIs • Internal & external • Endless Tuning & learning
  59. 59. How does it work – CD Practices • Test driven development • Small Development Iterations • Backwards and Forwards compatible • Gradual Deployment & Self-Test • Feature Toggle • A/B Testing • Exception Classification • Production visibility
  60. 60. Tools - App-info - Dashboard
  61. 61. Tools - App-info – Running Experiments
  62. 62. App-Info Resource Pools
  63. 63. Tools – Monitoring - New Relic
  64. 64. Tools – Frying Pan
  65. 65. Tools – Lifecycle To Rule Them All
  66. 66. Where are we today? • We have re-written our flash editor product as an HTML 5 editor - In just 4 months • Introduced Wix 3rd party applications (developers API) - In just 6 weeks • We are easily replacing significant parts of our infrastructure • And we are doing ~60 releases a day! • Production state changes every 7 minutes.
  67. 67. Aviran Mordo Head of Back-end Engineering @aviranm linkedin.com/in/aviran aviransplace.com The Road to Continuous Delivery Read more: The Road to Continuous Delivery: http://goo.gl/K6zEK Dev-Centric Culture: http://goo.gl/0Vo70t
  68. 68. How would you do it? How will you change Wix session encryption key with as little service interruption as possible • Encryption key is currently hard coded in the framework • All the services have the encryption key • User server creates a session • Services can renew a session • External services not in the framework also have the encryption key

×