Wix Dev-Centric Culture
Aviran Mordo
Head Of Back-End Engineering @ Wix
@aviranm
http://www.linkedin.com/in/aviran
Wix Dev-Centric Culture And Continuous Delivery

How Wix is doing continuous delivery and our Dev-Centric culture to support that

  • Today I’m going to tell you some of the strategies we use that allow us to deploy 10 times a day
  • Wix is a web publishing platform
  • Taiichi Ohno was Toyota's chief of production in the post-WWII period and the main developer of the Toyota Production System (TPS).
  • Lean Product development
  • Management's role is to help developers do their work
  • One of the key components of successful CD
  • Full load on a single server
  • Override size limitation by setting a cookie on the client
  • The link to purchase in the editor was causing a drop in conversion because users went there too soon, without intent

    1. Wix Dev-Centric Culture
        Aviran Mordo, Head Of Back-End Engineering @ Wix
        @aviranm
        http://www.linkedin.com/in/aviran
    3. Wix In Numbers
        • Over 45,000,000 users
          – >1M new users/month
        • Static storage is >800TB of data
          – >1.5TB new files/day
        • 3 data centers + 2 clouds (Google, Amazon)
          – ~300 servers
        • >700M HTTP requests/day
        • ~600 people work at Wix
          – Of which ~200 in R&D
    4. Traditional Dev Pipeline: Product, Dev, QA, Operations
    6. Product, Dev, QA, Operations
    8. SCRUM
    9. Lean, Agile, SCRUM, XP
        SCRUM != Agile
    10. Jan 2014: deployments (production changes) per month
        Every 9 minutes production changes its state (during working hours)
    11. Do You Have The Guts To Deploy 60 Times A Day?
    13. Where We Were
        • We were working in traditional waterfall
        • With fear of change
          – It is working, why touch it?
          – Uploading a release means downtime and bugs!
        • With low product quality
        • With slow development velocity
        • With a traditional enterprise development lifecycle
          – Three months of “VERSION” development and QA
          – Six months of crisis mode, cleaning up bugs and stabilizing the system
    15. Taiichi Ohno
    16. Lean Product development
        “Top 5 Most-Used Commands in Microsoft Word
        • Paste
        • Save
        • Copy
        • Undo
        • Bold
        These five commands account for around 32% of the total command use in Word. Paste itself accounts for more than 11% of all commands used, and has more than twice as much usage as the #2 entry on the list, Save. Beyond the top 10 commands, the curve flattens out considerably. The percentage difference in usage between the #100 command ("Accept Change") and the #400 command ("Reset Picture") is about the same in difference between #1 and #11 ("Change Font Size").”
    17. Scaling challenges – Product
        • Minimum Viable Product (MVP)
          – Does the MVP meet your product standards?
          – What about tooltips, help, first-time UX, etc.?
          – How do you define a product that can be developed in a day?
          – And that can win an A/B test…
        • To be implemented
    18. Get out of thought land
        • The law of failure
          – Most new “its” will fail even if they are flawlessly executed
        • Invest less and get less attached, so it is easier to admit that it failed
          – Data beats opinions: let the customer decide
        • Make sure you are building the right “it” before you build “it” right
        • Quick feedback
    20. Risk
        • Waterfall: minimize the number of deployments
        • CD: minimize the number of changes and the impact in $$
        Risk = #deployments * chance of something going wrong (~ number of changes) * impact of something going wrong in $$
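The risk formula on this slide can be tried out with a few numbers. The sketch below is only an illustration: the function name and every figure in it are hypothetical, chosen to show how fewer changes per deployment and a smaller blast radius can outweigh a higher deployment count.

```python
def deployment_risk(deployments, chance_of_failure, impact_dollars):
    """Risk = #deployments * chance of something going wrong * impact in $$."""
    return deployments * chance_of_failure * impact_dollars


# Hypothetical numbers, not measurements from the talk.
# One big release: many changes (high chance of failure), large impact.
big_release = deployment_risk(deployments=1, chance_of_failure=0.5, impact_dollars=100_000)

# Fifty small releases: few changes each, impact limited to a small slice of users.
small_releases = deployment_risk(deployments=50, chance_of_failure=0.01, impact_dollars=1_000)

print(big_release, small_releases)
```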
    21. Small Development Iterations
        • No Waterfall
        • No Scrum
        • No Iterations
        • No long documents
        • Build something small
        • When it is ready, deploy it
          – Measure it
          – Then fix it
          – Again
          – And again, until Dev, Product and Customers are happy
        • Then start changing it
          – Again, as a small change
    22. Product/Dev/QA/Ops boundaries are going down
    23. What Is The Common Denominator?
        • Product manager
        • Project manager
        • QA
        • Operations
        • DBA
    24. CD is culture & mindset
        • Trust the developers
          – Empower developers to change production
          – The developer knows the system best
        • Automation as the default choice
          – No more “is it worth automating?”
          – Everything should be automated
        • Welcome to the twilight zone
          – Product/Dev/QA boundaries are going down
          – Everyone needs to care about everything
          – Less formality: Corridor - In, Meeting Room - Out
    25. Dev Centric Culture – Involve The Developer
        • Product definition (with product)
        • Development (with architect)
        • Testing (with QA developers)
        • Deployment / Rollback (with operations)
        • Monitoring / BI (with BI team)
        • DevOps – to enable deployment and rollback, fully automated
    26. Continuous Delivery – Key points
        • Abandon the “VERSION” paradigm – move to a feature-centric methodology
        • Make small and frequent releases as soon as possible
        • Automate everything – TDD/CI/CD
        • Measure everything
          – A/B test every new feature
          – Monitor real KPIs (business, not CPU)
        • Deploy without downtime
    27. Test Driven Development
        • No new code is pushed to Git without being fully tested
          – We currently have around 10,000 automated tests
        • Before fixing a bug, first write a test that reproduces the bug
        • Cover legacy (untested) systems with integration tests
    28. What people think of TDD
        • TDD slows down development
        • With TDD we write more code (product + test code)
        • TDD has no significant impact on quality
    30. TDD Actual impact on development
        • We develop products faster
        • Removes fear of change
        • Easier to enter someone else’s project
        • Do we still need QA? (Yes, they code automation tests)
          – We don’t have QA for back-end applications
        • Writing a feature is 10-30% slower, with 45-90% fewer bugs
        • 50% faster to reach production
        • Considerably less time to fix bugs
    32. Is Refactoring Rework? Absolutely NOT!
        • Refactoring is the outcome of learning
        • Refactoring is the cornerstone of improvement
        • Refactoring builds the capacity to change
        • Refactoring doesn’t cost, it pays
    33. Refactoring
        • Refactor from the inside out
          – Small iterations with tests
          – Refactor small methods and make sure the tests don’t break
          – Deploy often
        • Re-write from the outside in
          – Write from scratch (one piece at a time)
          – Code duplication is sometimes needed (temporarily)
          – Protected by a Feature Toggle
        Before refactoring, make sure everything is covered with tests. Legacy code is usually covered by integration (IT) tests.
    35. Code branch
        Is the feature toggle (FT) opened? Yes: New Code. No: Old Code.
    36. Usage example
        Simple “if” statement in your code
    37. Feature Toggles
        • Everyone develops on the trunk
        • Every piece of code can get to production at any time
    38. Feature Toggle to the rescue
        • Unused new code can go to production – no harm done
        • Operational new code goes with a guard – use new or old code by feature toggle
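The “simple if statement” guard described on the last few slides might look roughly like this. It is a minimal sketch: `FeatureToggles`, the toggle name `new_menu` and the two render functions are hypothetical and are not Wix's actual toggle API.

```python
class FeatureToggles:
    """Hypothetical in-memory toggle registry; a real one would be configurable at runtime."""

    def __init__(self, opened=None):
        self._opened = set(opened or [])

    def open(self, name):
        self._opened.add(name)

    def close(self, name):
        self._opened.discard(name)

    def is_open(self, name):
        return name in self._opened


def render_menu_old():
    return "menu:side"


def render_menu_new():
    return "menu:top"


def render_menu(toggles):
    # The guard: new code ships to production but runs only when the toggle is opened.
    if toggles.is_open("new_menu"):
        return render_menu_new()
    return render_menu_old()
```

With the toggle closed, `render_menu` keeps returning the old result, so the new code path can sit in the trunk and in production without affecting users.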
    40. DB Schema Changes Without Downtime
        • Adding columns
          – Use another table linked by primary key
          – Use a blob field for schema flexibility
        • Removing fields
          – Stop using them. Do not make any DB schema changes
    41. New DB schema with data migration
        • Plan a lazy migration path controlled by feature toggle
          1. Write to old / Read from old
          2. Write to both / Read from old
          3. Write to both / Read from new, fallback to old
          4. Write to new / Read from new, fallback to old
          5. Eagerly migrate data in the background
          6. Write to new / Read from new
        • Backward compatibility is a must
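The six migration steps above can be sketched as a small store wrapper whose behaviour depends on the current phase. Everything here is an assumption made for illustration: the class and phase names are hypothetical, plain dicts stand in for the old and new schemas, and in practice the phase would be driven by a feature toggle rather than set by hand.

```python
from enum import IntEnum


class Phase(IntEnum):
    OLD_ONLY = 1                  # 1. write to old / read from old
    WRITE_BOTH_READ_OLD = 2       # 2. write to both / read from old
    WRITE_BOTH_READ_NEW = 3       # 3. write to both / read from new, fallback to old
    WRITE_NEW_READ_FALLBACK = 4   # 4. write to new / read from new, fallback to old
    NEW_ONLY = 6                  # 6. write to new / read from new (5 is the background job)


class MigratingStore:
    def __init__(self, old, new, phase):
        self.old, self.new, self.phase = old, new, phase

    def write(self, key, value):
        if self.phase <= Phase.WRITE_BOTH_READ_NEW:
            self.old[key] = value
        if self.phase >= Phase.WRITE_BOTH_READ_OLD:
            self.new[key] = value

    def read(self, key):
        if self.phase <= Phase.WRITE_BOTH_READ_OLD:
            return self.old.get(key)
        value = self.new.get(key)
        if value is None and self.phase < Phase.NEW_ONLY:
            value = self.old.get(key)  # not migrated yet: fall back to the old schema
        return value

    def migrate_in_background(self):
        # 5. eager migration: copy whatever was never rewritten, keeping newer values.
        for key, value in self.old.items():
            self.new.setdefault(key, value)
```

Because each phase only changes one side (reads or writes), the toggle can be moved back one step at any point without losing data, which is what makes the path reversible.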
    42. Feature Toggle Strategies (gradually exposing users)
        • Company employees
        • Specific users or group of users
        • Percentage of traffic
        • By GEO
        • By language
        • By user-agent
        • User profile based
        • By context (site id or some kind of hash on site id)
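Two of these strategies, “company employees” and “percentage of traffic by a hash on site id”, could be combined along these lines. This is a sketch under stated assumptions: the function names, the e-mail domain check and the choice of SHA-256 are all hypothetical, not taken from the talk.

```python
import hashlib


def bucket(site_id, buckets=100):
    """Map a site id to a stable bucket in [0, buckets) using a hash."""
    digest = hashlib.sha256(site_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def is_open_for(site_id, percentage, user_email="", employee_domain="example.com"):
    # Strategy: company employees always see the feature (hypothetical domain check).
    if user_email.endswith("@" + employee_domain):
        return True
    # Strategy: percentage of traffic, keyed on a hash of the site id so the
    # same site always gets the same answer.
    return bucket(site_id) < percentage
```

Hashing the site id instead of rolling a random number per request keeps the exposure consistent: raising the percentage only adds sites, it never flips an already exposed site back.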
    43. Feature Toggle Override
        • By specific server
          – Used to test system load
          – New database flows/migration
          – Refactoring that may affect performance and memory usage
        • By URL parameter
          – Enable internal testing
          – Product acceptance
          – Faking GEO
        • By FT cookie value
          – Testing
          – When working with API on a single page application
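An override by URL parameter or cookie can be sketched as a lookup that runs before the normal toggle decision. The `ft_` key prefix, the `"true"`/`"false"` values and the rule that the URL parameter wins over the cookie are assumptions made for this example.

```python
def resolve_toggle(name, default, url_params=None, cookies=None):
    """Return the toggle state, letting a URL parameter or cookie override the default."""
    key = "ft_" + name  # hypothetical naming convention
    for source in (url_params or {}, cookies or {}):
        raw = source.get(key)
        if raw is not None:
            return raw.lower() == "true"
    return default
```

For example, a tester could open a page with `?ft_new_menu=true` to see the new code on a server where the toggle is still closed for everyone else.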
    44. Wix PETRI
    45. A/B Tests
    46. A/B Test
        • Every new feature is A/B tested
        • We open the new feature to a % of users
          – Define KPIs to check if the new feature is better or worse
          – If it is better, we keep it
          – If worse, we check why and improve
          – If we find flaws, the impact is just for a % of our users (kind of a Feature Toggle)
    47. An interesting side effect on product
        • How many times did you have the conversation “what is better”?
          – Put the menu on top / on the side
        • Well, how about building both and A/B testing?
    48. Marking users with toss value in a cookie
        • Anonymous user
          – Toss is randomly determined
          – Cannot guarantee a persistent experience if the user changes browser
        • Registered user
          – Toss is determined by the user ID
          – Guarantees toss persistency across browsers
          – Allows setting additional tossing criteria (for example new users only)
          – Only use this for sections where the user has to be authenticated
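The two kinds of toss can be sketched as follows. The cookie name, the 0-99 range and the use of SHA-256 are assumptions for illustration; a plain dict stands in for the browser's cookie jar.

```python
import hashlib
import random


def registered_toss(user_id, test_name):
    """Deterministic toss in [0, 100): same user and test always give the same value."""
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def anonymous_toss(cookies, test_name, rng=random):
    """Random toss stored in a cookie, so it persists only within this browser."""
    key = "toss_" + test_name  # hypothetical cookie name
    if key not in cookies:
        cookies[key] = str(rng.randrange(100))
    return int(cookies[key])
```

The registered toss needs no storage at all, which is why it survives a change of browser, while the anonymous toss is lost as soon as the cookie is.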
    49. • Do not mix anonymous and registered tests
        • A/B test a percentage of users with optional filters
          – New users only (registered users only)
          – By language
          – By GEO
          – By browser
          – User-agent
          – OS
          – Any other criteria you have on your users
    50. A/B Test Features
        • A/B Test Override
          – Allows setting the value of a test for validation
          – Helps support staff experience what users are experiencing
        • Override methods
          – Via URL parameter
          – Via cookie
        • Start/stop test
        • Pause tests
        • Bots always get “A”
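A variant assignment that honours an override and always gives bots “A” might look like this. It is a sketch only: the bot check is a deliberately naive substring match, and `assign_variant` and its parameters are hypothetical names.

```python
def is_bot(user_agent):
    # Naive, hypothetical bot detection; real detection is more involved.
    ua = user_agent.lower()
    return any(marker in ua for marker in ("bot", "crawler", "spider"))


def assign_variant(toss, percent_b, user_agent="", override=None):
    """Pick "A" or "B" for a toss in [0, 100)."""
    if override in ("A", "B"):
        return override          # explicit override (URL parameter or cookie) wins
    if is_bot(user_agent):
        return "A"               # bots always get "A"
    return "B" if toss < percent_b else "A"
```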
    51. NOT !!!
    52. Gradual Deployment
        • Assume two components
        • We shut down one and install the new version on it. It is not active yet
        • Do a self test
        • Activate the new server if it passes the self test
        • Continue deploying the other servers, a few at a time, checking each one with a self test
        [Diagram: servers running component A at version 1.1 and component B, with B servers upgraded from 1.1 to 1.2 step by step]
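The rollout loop described on this slide can be sketched as below. The callables `install`, `self_test` and `activate` are hypothetical hooks supplied by the caller, and the sketch handles one server at a time for simplicity, where the slide talks about a few at a time.

```python
def gradual_deploy(servers, install, self_test, activate):
    """Deploy server by server; stop at the first server that fails its self test.

    Returns (activated_servers, failed_server_or_None).
    """
    activated = []
    for server in servers:
        install(server)              # new version is installed but not active yet
        if not self_test(server):
            return activated, server  # halt: remaining servers keep the old version
        activate(server)
        activated.append(server)
    return activated, None
```

Stopping at the first failure means a bad build reaches at most one inactive server, and the rest of the cluster keeps serving the previous version.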
    53. Self Test / Post Deployment Test
        After each server deployment, run a self test before deploying the next server.
        • Checking server configuration and topology
          – Make sure the database is accessible (DB connection string)
          – Is the schema the one I expect?
          – Access required local resources (data files, other config files, templates, etc.)
          – Access remote resources – RPC / REST endpoints reachable and operational
        • Server will refuse requests unless it passes the self test
        • Allow a way to skip the self test (and continue deployment)
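A self test of this kind can be sketched as a set of named checks that gate request handling. The `Server` class, the check names and the 503 response are assumptions for illustration; real checks would open a DB connection, read template files, call remote endpoints and so on.

```python
def run_self_test(checks):
    """Run every named check; return the names of the ones that failed or raised."""
    failures = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failures.append(name)
    return failures


class Server:
    def __init__(self, checks, skip_self_test=False):
        # skip_self_test is the escape hatch that lets a deployment continue anyway.
        self.failures = [] if skip_self_test else run_self_test(checks)

    def handle(self, request):
        if self.failures:
            # Refuse requests until the self test passes.
            return 503, "self test failed: " + ", ".join(self.failures)
        return 200, "ok"
```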
    54. Tools - App-info – Self Test
    55. Backward and Forward compatible
        • Assume two components
        • We release a new version of one
        • Now roll back the other…
        [Diagram: mixed versions of components A (1.0, 1.1, 1.2) and B (1.0, 1.1, 1.2) running side by side]
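One common way to let mixed versions talk to each other is for readers to ignore fields they do not know and to default fields that are missing. The sketch below assumes JSON messages and a hypothetical `language` field added in version 1.2; it illustrates the idea and is not a description of how Wix services actually exchange data.

```python
import json


def read_v1_1(payload):
    """Version 1.1 reader: forward compatible, it ignores fields added later."""
    data = json.loads(payload)
    return {"site_id": data["site_id"]}


def read_v1_2(payload):
    """Version 1.2 reader: backward compatible, it defaults the field 1.1 never sends."""
    data = json.loads(payload)
    return {"site_id": data["site_id"], "language": data.get("language", "en")}
```

With readers written this way, either component can be upgraded or rolled back independently without the other one failing on the message format.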
    56. Time machine event =
        • Deployment capabilities: “no click” deployment
          – Dozens of services, 130+ servers over 3 data centers
        • Backward and forward compatibility in the most extreme field test
          – Mixed versions of services / DB with no service downtime
        • Empowerment
          – The power we give to individuals
        • Taking risks and embracing failure
    57. CD – prepare to invest…
        • Dev infrastructure - Refactor, Refactor, Refactor
        • Testing infrastructure & know-how
        • Deployment infrastructure & tools
        • Automation, Automation, Automation
        • Monitoring (business and technical)
          – Hundreds of aspects – use of thresholds is a must
          – Monitor business KPIs
          – Internal & external
          – Endless tuning & learning
    58. How does it work – CD Practices
        • Test driven development
        • Small development iterations
        • Backwards and forwards compatible
        • Gradual deployment & self test
        • Feature Toggle
        • A/B testing
        • Exception classification
        • Production visibility
    59. Tools - App-info - Dashboard
    60. Tools - App-info – Running Experiments
    61. Tools – Monitoring - New Relic
    62. Tools – Frying Pan
    63. Tools – Lifecycle To Rule Them All
    64. Where are we today?
        • We have re-written our Flash editor product as an HTML5 editor
          – In just 4 months
        • Introduced Wix 3rd party applications (developers API)
          – In just 6 weeks
        • We are easily replacing significant parts of our infrastructure
        • And we are doing ~50 releases a day!
        • Production state changes every 9 minutes
    65. Aviran Mordo
        @aviranm
        http://www.linkedin.com/in/aviran
        http://www.aviransplace.com
        Read more:
        The Road To Continuous Delivery: http://goo.gl/K6zEK
        Dev-Centric Culture: http://goo.gl/0Vo70t
