How Etsy Builds Automated Regression Suites


Slides from the NYC Metro Selenium Users Meetup group:

Published in: Technology
  • Intro, excited, what I do at Etsy.
    - started 6 months ago, job title, nerdy... I'm an Automation Engineer.
    - Dad, Christmas, testing for 5 years
    - combined super powers, Testing and Automation team of three
    - building our team
    - prior to July 2010, no dedicated Testing team. Noah flatly refused to show me even a line of code from the previous test suite.
    - no idea what these old tests did, what prompted a total re-design of a new set of functional tests. click
  • - Debugging took a long time
    - Functional tests were written by different engineers at different times
    - no customer focus
    - Python client driver working with Selenium-RC version 1
    - Tests used randomized data
    - Tests failed intermittently, results were not deterministic
    - Tests were running against 5 deploys per day when Etsy had 20 Engineers

    Fast forward to May 2011, and this is what our automation suite looks like now: click
  • Fast forward to May 2011, and this is what our automation suite looks like:
    - Functional Tests are written by a dedicated Automation Engineer
    - Customer focus is paramount to us!
    - Use cases are written in a BDD framework
    - Ruby client driver working with Selenium-RC
    - Data-driven tests
    - Results are nearly always green, rarely fail
    - Debugging takes very little time
    - Running against 25 deploys per day, Etsy has about 70 Engineers

    So when it came to building a completely new set of functional tests, we decided to divide features on Etsy into service tiers. click
  • Incidentally this is how we prioritize our bugs too. So we started automating the business critical features first in Tier 1. So to give you an idea some tier 1 services include:
    - Sign in, registration, checkout, listing process
  • These are still important but a lower priority than our features in Tier 1. We’re not as screwed if these services are unavailable. Examples:
    Showcase, Treasuries, Circles and Activity Feeds, Convos
  • How many of you are testing a website that you don't or wouldn't use in your day to day life? I used to work for a financial company before Etsy, and I never once used that website outside of my daily work. Think about that.
    At Etsy we're all using the website, we are our own customers and that's why we care so much about quality. If shit's broke, it's affecting us as well, not only our customers.
    - 2 hour crafting session
    - reiterate having a customer focus is imperative
    - functional tests are written from the perspective of ourselves as customers.

    - functional tests are written in Cucumber and Ruby, and we run them with Selenium-RC 1. We haven't yet migrated over to Selenium 2 but that is something that's definitely on the cards for us. I also mentioned earlier that our team is quite small, so managing a large test lab with multiple machines across different browsers is not a practical solution for us, given we don't have the capacity or the resources to maintain it. In fact we probably never will do it because we don't need to. What we do instead, and this is a great way to avoid the overhead involved with maintaining a large test environment, is we run our tests headlessly.

    Headless testing was a completely foreign concept
    - financial testing background, visual test results, great for auditing purposes, troubleshooting (everyone likes to see the exact steps where their tests failed)
    - record and playback individual test runs
  • Xvfb - virtual frame buffer, which is an X server that maintains a virtual display in memory. We run our functional tests and point them to our live QA server, which drives a Firefox instance (and this FF instance is running in an Xvfb virtual desktop).

    - reiterate only talking about functional tests, plethora of other tests running in our CI environment, unit, integration, network, smoke.
    - functional tests for now only on end-to-end mission critical components of the website.
    - we get a lot of confidence running all these tests approximately every time we deploy which is around 25 pushes a day.

    So before we start the Selenium server we make sure we've installed Xvfb, we select a unique display number (usually 99) and we run Xvfb on this display with the access control off.
    - :99 means that the server will listen for connections as server number 99
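The headless setup described in these notes (start Xvfb on a display, then point the browser at it through DISPLAY) can be sketched in Ruby, the language of the current suite. This is a minimal sketch and not Etsy's code: the helper names `xvfb_command`, `start_xvfb` and `with_display` are hypothetical, and it assumes the Xvfb binary is installed and on the PATH.

```ruby
# Sketch only: helper names are hypothetical, not taken from the talk.

# Build the argv for Xvfb on the given display with access control off (-ac).
def xvfb_command(display_number)
  ["Xvfb", ":#{display_number}", "-ac"]
end

# Start Xvfb in the background and return its process id.
# Requires the Xvfb binary to be installed and on the PATH.
def start_xvfb(display_number)
  Process.spawn(*xvfb_command(display_number), out: File::NULL, err: File::NULL)
end

# Point DISPLAY at the virtual display for the duration of the block,
# then restore whatever value was there before.
def with_display(display_number)
  previous = ENV["DISPLAY"]
  ENV["DISPLAY"] = ":#{display_number}"
  yield
ensure
  ENV["DISPLAY"] = previous
end

# Possible usage (commented out because it needs Xvfb and a test suite):
# pid = start_xvfb(99)
# with_display(99) { system("cucumber", "--tags", "@register") }
# Process.kill("TERM", pid)
```

Any process started inside the `with_display` block inherits DISPLAY=:99, which is what makes Firefox render into the in-memory display instead of a real screen.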
  • Cucumber, BEHAVIOUR-DRIVEN DEVELOPMENT, dead easy to use, written in plain English.
    - goals at Etsy, enabling virtually anyone in the company to jump in and write tests.
    - support and community staff who know our website inside and out
    - we have product managers acceptance testing cycles.
    - is super easy when you're using Cucumber.
    - don't need to know how to code to use it.
    - philosophy, everyone is responsible for the quality of the product. Testing and Automation Team - not the QA team, because everyone owns quality.
    - If you don't own it, you don't work at Etsy.
  • So back to Cucumber,

    1. First we describe the expected behaviour in plain text following the given, when, then principle.
  • Cucumber command drives the underlying RSpec commands
    - here we want to set username, password and email during registration for our new_buyer role
    - pass in new_buyer as role type
    - meta programming for being able to use dynamic roles, only place we use this and we keep our tests as dumb as possible.
    - username_of_new_buyer defined in another helper function (layers of abstraction) which I’ll show later when I talk about data
    - The step in yellow is a guard assertion which tells us what we should/should not expect
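The dynamic-role lookup in these notes resolves helpers such as `username_of_new_buyer` from the role name captured by the step. The slide does this with `eval`; the sketch below swaps in `public_send`, a different technique that gives the same lookup while only ever calling methods that already exist. The module name, the class name and all credential values are made up for illustration.

```ruby
# Hypothetical credentials helper. The method names follow the
# <field>_of_<role> pattern from the notes; the values are invented.
module CredentialsHelper
  def username_of_new_buyer
    "misspiggy"
  end

  def password_of_new_buyer
    "example-password"
  end

  def email_address_of_new_buyer
    "misspiggy@example.com"
  end

  # Resolve the three role-specific helpers by name.
  # public_send only calls existing public methods, so an unknown role
  # raises NoMethodError instead of evaluating arbitrary text.
  def credentials_for(role)
    %w[username password email_address].map do |field|
      public_send("#{field}_of_#{role}")
    end
  end
end

# Stand-in for the object the step definitions run in.
class FakeWorld
  include CredentialsHelper
end

username, password, email = FakeWorld.new.credentials_for("new_buyer")
puts "#{username} / #{email}"
```

Adding a new role then means adding three methods that follow the naming pattern; the step definition itself stays untouched.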
  • Drill deeper down into our layers, this is a helper module called sign in register. We do this because we want to reuse the same functions, i.e. we want our tests to register with different login credentials as we have different tests and we don’t want to reuse the same usernames.
  • We use guard assertions to check expected behavior with the application under test. Here we have a problem area where sometimes if we don’t set the username to be unique an error message will display and our tests will fail. This gets reported back to our test results which makes it easier for us to determine the point of failure. If it’s a duplicate username issue, that’s okay because we know this could happen though it’s not common.
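The idea of a guard assertion (check a known problem area early and report it by name) can be shown without a browser. In this minimal sketch `FakePage` is a hypothetical stand-in for the Selenium client that mimics only `is_element_present` and `get_text`, and `GuardFailure` and `guard_against_duplicate_username` are illustrative names, not part of the suite in the talk.

```ruby
# Stand-in for the Selenium client so the sketch runs without a browser.
# It mimics only the two calls used by the guard step.
class FakePage
  def initialize(elements)
    @elements = elements # locator => visible text
  end

  def is_element_present(locator)
    @elements.key?(locator)
  end

  def get_text(locator)
    @elements.fetch(locator)
  end
end

class GuardFailure < StandardError; end

USERNAME_ERROR = "css=span#username-error"
DUPLICATE_USERNAME = /Sorry, that username is taken\./

# Raise a descriptive error when the duplicate-username message is showing,
# so the report names the cause instead of failing at a later step.
def guard_against_duplicate_username(page, username)
  return unless page.is_element_present(USERNAME_ERROR)
  return unless page.get_text(USERNAME_ERROR) =~ DUPLICATE_USERNAME

  raise GuardFailure, "registration blocked: username #{username.inspect} is already taken"
end

clean_page = FakePage.new({})
guard_against_duplicate_username(clean_page, "misspiggy42") # passes silently
```

Because the failure message carries the username, a red build caused by a data collision can be told apart from a real regression at a glance.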
  • Talk about test run
  • - old functional test suite used to use random data, which sometimes generated random results
    - now use pre-created data or we generate data on the fly
    - used by our tests and no-one else. A lot of our tests rely on unique registration details, for example my username is mdnetto and once I'm registered I can't register under the same username twice. So when we add our tests to our continuous integration environment we concatenate the build number to a pre-defined username.
    - code showing a helper module we've created which takes the build number and adds it to the end of our standard new buyer username (misspiggy).
  • - heavily involved in what's called continuous deployment
    - we push code all day, every day, making small frequent changes to our production code base 24/7.
    - change sets are small which means should we need to roll anything back or fix it forward it's not a big deal.
    - never branch, don't deal with merging issues. not stopping and starting servers or taking the website down for hours.
    - deployment cycles last for around 20 mins. of that 20 mins we’ve settled on about 10 minutes as the longest time frame that the automated tests can run during a push (we're talking unit, smoke, integration, network and functional). This leaves us enough time to re-run the tests once more during a deployment, without going too far past the 20 minute time limit.

    - speed, Jan this year (over a billion page views)
    - busiest month to date was Nov, 2010 as this was the lead up to Christmas
    - engineers deployed around 700 times to prod
    - that's around 70 unique people committing code and 60 unique individuals deploying to production.

    We're able to do this by keeping our deployment process simple (it's a single button).
    - over-communicate, IRC and email
    - commit code by jumping into an IRC channel called push and update the channel topic with your name, adding yourself to what we call the push train (which is a group of 2-5 pushers, code pushers not drug pushers). Usually the first person to jump on the train is responsible for driving the train (i.e. hitting that deploy button).
    - key concept: deploying code is easy. Once your code is ready to go, you go to Deployinator (an in-house deployment dashboard) and push the button to get it on QA. click
  • A push to QA automatically triggers the tests to run on our CI system (Jenkins).
  • - split our tests up into logical subsets
    - distribute across 10 machines 'slaves'
    - we run tests concurrently
    - functional tests we set up multiple instances of selenium running on each machine in order to keep our tests running for no longer than 5 mins (remember these tests are blocking deploys). If we ran all these tests (unit, functional, smoke, integration and network) back to back they would take about half an hour to execute.
    - concurrent slaves are our friends.
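Splitting a suite into subsets so that several workers can run them concurrently can be sketched as follows. The notes only say the subsets are "logical", so the round-robin split here is an assumption chosen for simplicity, and the feature file names are hypothetical.

```ruby
# Deal the feature files out round-robin so each worker gets a similar share.
# Sorting first keeps the split stable from one build to the next.
def partition(feature_files, worker_count)
  buckets = Array.new(worker_count) { [] }
  feature_files.sort.each_with_index do |file, index|
    buckets[index % worker_count] << file
  end
  buckets
end

# Hypothetical feature files, named after services mentioned in the talk.
features = %w[
  features/checkout.feature
  features/listing.feature
  features/register.feature
  features/sign_in.feature
  features/treasury.feature
]

partition(features, 2).each_with_index do |bucket, worker|
  puts "worker #{worker}: #{bucket.join(' ')}"
end
```

With an even split, wall-clock time approaches the total run time divided by the number of workers, which is how a suite that would take about half an hour back to back can fit inside a few minutes. A split weighted by each file's past duration would balance better than counting files, at the cost of tracking timings.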
  • Talk about project Jenkins view

    So getting back to our push queue, the person driving the push train will wait and watch Jenkins to make sure the tests running against QA are all good and return green results. If they're green they end up looking like this...
  • When we drill down we can see individual test results.
  • So assuming all the tests are green,
    - train driver then pushes to Princess (staging env which uses prod hardware and data stores).
    - manually verify their changes here, confirm changes, ready to go live, train driver hits the “Prod” button and the code is live
    - everyone in IRC knows who pushed what code, complete with a link to the diff
    - email notifications

    - how is stuff not broken all the time? We develop our code behind config flags, push it to dark (usually in dev env first) then either gradually release it live to a percentage of users or flip it live (via the config flag). So before we release code to production in 99.9% of cases we're testing it on ourselves in Dev first.

    Challenges:
  • We only have one dedicated automation engineer
  • - 20 min deploys, just ship mentality
    - Scalable (anyone can use cucumber)
    - Everything is automated, we get alerts with every test pass/fail
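The deploy notes above mention developing behind config flags: code is pushed dark, then either released to a percentage of users or flipped live. The talk does not show how the flags are implemented, so this is a minimal sketch under assumptions: the flag names, the `FLAGS` table and the CRC32 bucketing are all invented for illustration.

```ruby
require "zlib"

# Hypothetical flag table: :off, :on, or a percentage of users (0..100).
FLAGS = {
  "new_checkout" => 10,   # live for roughly 10% of users
  "new_search"   => :off, # pushed dark
  "new_header"   => :on   # flipped live for everyone
}.freeze

# Map a user to a stable bucket in 0..99 per flag, so the same user
# gets the same answer on every request.
def bucket_for(flag, user_id)
  Zlib.crc32("#{flag}:#{user_id}") % 100
end

# Unknown flags are treated as off, so unfinished code stays dark by default.
def enabled?(flag, user_id, flags = FLAGS)
  setting = flags.fetch(flag, :off)
  return true if setting == :on
  return false if setting == :off

  bucket_for(flag, user_id) < setting
end

puts enabled?("new_header", 42)
puts enabled?("new_search", 42)
```

Raising the percentage only ever adds users to the enabled group, because each user's bucket stays fixed; rolling back is a config change rather than a code revert.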
  • How Etsy Builds Automated Regression Suites

    1. Automated Regression Suites in a CI Environment by Michelle D’Netto
    2. Etsy’s ‘Ye Olde’ Automation Test Suite
         * Python and Selenium-RC
         * Random data
         * Non-deterministic results
         * 5 deploys, 20 engineers
    3. How Etsy does Automation Now
         * Automation Engineer
         * BDD framework
         * Ruby with Selenium-RC
         * Data-driven tests
         * 25 deploys, 70 Engineers
    4. Tiered Services
         T1:
         * Business Critical Features
         * If these services are broken we’re screwed
    5. Tiered Services
         T2:
         * Important but not mission critical services
    6. Customer Focus...
         Customer focus... is paramount to us.
    7. Headless Testing
         $ Xvfb :99 -ac
         $ export DISPLAY=:99
    8. At Etsy...
         At Etsy... everyone is responsible for quality.
    9. Cucumber
         @register
         Feature: Register for an Etsy account as a new buyer
           As an Etsy buyer
           I want to register for an account
           So that I can buy items on Etsy

           Scenario: I register for an Etsy account, confirm my registration and sign into Etsy
             Given I am new to Etsy
             When I type in the registration url
             And I select my new_buyer account credentials (username, password, email)
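    10. Ruby Step Definition
          And I select my new_buyer account credentials (username, password, email)

          # The literal parentheses in the step text are escaped so that only the role is captured.
          And /^I select my (.*) account credentials \(username, password, email\)$/ do |role|
            # Look up the role-specific helper methods by name, e.g. username_of_new_buyer.
            register eval("username_of_#{role}"),
                     eval("password_of_#{role}"),
                     eval("email_address_of_#{role}")
            # Guard assertion: fail fast if the username was already taken.
            steps %Q{Then the duplicate username message should not appear}
          end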
    11. Signin Register Helper
          # Module names must be constants, so the name starts with a capital letter.
          module Signin_Register_Helper
            # Type each registration value into its form field.
            def register_for_etsy opt
              @selenium.type opt[:username_field], opt[:username]
              @selenium.type opt[:password_field], opt[:password]
              @selenium.type opt[:confirm_password_field], opt[:confirm_password]
              @selenium.type opt[:email_address_field], opt[:email_address]
            end

            def register username, password, email_address
              register_for_etsy({
                :username_field => username_field_for_registration,
                :password_field => password_field_for_registration,
                :confirm_password_field => confirm_password_field,
                :email_address_field => email_address_field,
                :email_address => email_address,
                :username => username,
    12. Guard Assertions
          steps %Q{Then the duplicate username message should not appear}

          And /^the duplicate username message should not appear$/ do
            # Only inspect the message text when the error element is on the page.
            @selenium.is_element_present("css=span#username-error") &&
              @selenium.get_text("css=span#username-error").should_not(match(/Sorry, that username is taken\./))
          end
    13. Selenium Test Runs
    14. Username Error
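    15. Data Driven Tests
          module Unique_ID_Helper
            # Base username plus build number, so each CI run registers a fresh user.
            def username_of_new_buyer
              uid_base_new_buyer_username + uid
            end

            # Jenkins exposes the build number as the BUILD_NUMBER environment variable.
            def uid
              ENV["BUILD_NUMBER"]
            end

            def uid_base_new_buyer_username
              "misspiggy"
            end
          end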
    16. Continuous Integration
          Our deployment environment requires a lot of trust, transparency, communication, coordination, and discipline across the team. - Chad Dickerson (CTO @ Etsy)
    17. Deployinator
    18. Test Slaves
    19. Jenkins Projects
    20. Green Test Builds
    21. Cucumber Test Results reported through Jenkins
    22. Challenges
          We can't see any test results!
    23. Advantages
    24. Read More....
    25. Q&A