DEPLOYING NEXT-GENERATION SYSTEMS
    TWILIO ENGINEERING
WHY NEXT GENERATION SYSTEMS?
        #twiliocon
[Chart: Twilio Transactions Per Month]
  Dec 2009        365,782
  Dec 2010     25,679,631
  Dec 2011     67,186,111
  Dec 2012    123,090,344
WHY ZERO DOWNTIME?
       #twiliocon
HI. I’M TIM
I’m Director of Engineering at Twilio




                              #twiliocon
THE CHALLENGE
• Design, build, deploy replacements of 2 core systems.
    ➡ They must be HA.

    ➡ They must be horizontally-scalable.



... Oh, and don’t lose a single billing event or API request in the process.


                                                      #twiliocon
HI. I’M FRANK
I’m API-Team Engineering Lead at Twilio




                              #twiliocon
OUT WITH THE OLD...

        • Monolithic Codebase


        • Serving Millions of API Requests




                         #twiliocon
... AND IN WITH THE NEW

• Python + Flask

• Designed to serve billions of
  API requests

• Zero Downtime, Zero Regressions



                                    #twiliocon
TEST, TEST, TEST

• Unit & Functional tests for local development


• The same tests run in our Staging Cluster


• The same cluster tests run against both API frameworks



                                                  #twiliocon
A TALE OF TWO APIS
                     DIFF RESULTS


CLUSTER
 TESTS



                  #twiliocon
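NGINX.CONF
# Map the HTTP header X-Requested-Api-Stack: <value>
# to a named location in nginx.
# Requests without the header, or with any other value, go to @php.

map $http_x_requested_api_stack $requested_stack_default_php {
    default @php;
    python  @python;
}

# Python stack
location @python {
    proxy_pass http://127.0.0.1:5555;
}

# PHP stack (the default)
location @php {
    proxy_pass http://127.0.0.1:12345;
}

# "Kwijibo" is a file name that is not expected to exist, so try_files
# always falls through to the named location chosen by the map above.
location ~ / {
    try_files Kwijibo $requested_stack_default_php;
}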
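
A rough client-side sketch of how a test harness might exercise this routing: it builds the same request twice, once per stack, by setting the X-Requested-Api-Stack header. Only the header name and the "python" value come from the config above. The helper name build_stack_request, the base URL, and the path are illustrative assumptions, not Twilio's actual tooling.

```python
from urllib.request import Request

STACK_HEADER = "X-Requested-Api-Stack"  # header name from the nginx map above


def build_stack_request(base_url, path, stack=None):
    """Build a GET request that asks nginx for a specific API stack.

    stack=None sends no header, so nginx falls back to the default (@php).
    base_url and path are placeholders for illustration only.
    """
    request = Request(base_url.rstrip("/") + path, method="GET")
    if stack is not None:
        request.add_header(STACK_HEADER, stack)
    return request


# One request per stack; pass each to urllib.request.urlopen() to send it.
legacy = build_stack_request("http://localhost:8080", "/2010-04-01/Accounts")
candidate = build_stack_request(
    "http://localhost:8080", "/2010-04-01/Accounts", stack="python"
)
```

Because the stack is chosen per request, the same test suite can run against both stacks on one host without changing what ordinary customer traffic sees.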
SPOT THE DIFFERENCE
                 GET /2010-04-01/Accounts/ACaaaaaaaaaaaaaaaaaaaaaaaaa/Applications/APaaaaaaaaaaaaaaaaaaa




{'TwilioResponse':                                                          {'TwilioResponse':
[{'Application': [                                                          [{'Application': [
 {'Sid': 'APaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'},                              {'Sid': 'APaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'},
     {'DateCreated': 'Mon, 22 Aug 2011 20:58:45 +0000'},                           {'DateCreated': 'Mon, 22 Aug 2011 20:58:45 +0000'},
     {'DateUpdated': 'Mon, 22 Aug 2011 20:58:45 +0000'},                           {'DateUpdated': 'Mon, 22 Aug 2011 20:58:45 +0000'},
     {'AccountSid': 'ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'},                         {'AccountSid': 'ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'},
     {'FriendlyName': 'Application Friendly Name'},                                {'FriendlyName': 'Application Friendly Name'},
     {'ApiVersion': '2010-04-01'},                                                 {'ApiVersion': '2010-04-01'},
     {'VoiceUrl': 'http://www.example.com/voice'},                                 {'VoiceUrl': 'http://www.example.com/voice'},
     {'VoiceMethod': 'GET'},                                                       {'VoiceMethod': 'GET'},
     {'VoiceFallbackUrl': 'http://www.example.com/voice-callback'},                {'VoiceFallbackUrl': 'http://www.example.com/voice-callback'},
     {'VoiceFallbackMethod': 'GET'},                                               {'VoiceFallbackMethod': 'GET'},
     {'StatusCallback': 'http://www.example.com/status-callback'},                 {'StatusCallback': 'http://www.example.com/status-callback'},
     {'StatusCallbackMethod': 'GET'},                                              {'StatusCallbackMethod': 'GET'},
     {'VoiceCallerIdLookup': 'false'},                                             {'VoiceCallerIdLookup': 'False'},
     {'SmsUrl': 'http://www.example.com/sms'},                                     {'SmsUrl': 'http://www.example.com/sms'},
     {'SmsMethod': 'GET'},                                                         {'SmsMethod': 'GET'},
     {'SmsFallbackUrl': 'http://www.example.com/sms-fallback'},                    {'SmsFallbackUrl': 'http://www.example.com/sms-fallback'},
     {'SmsFallbackMethod': 'GET'},                                                 {'SmsFallbackMethod': 'GET'},
     {'SmsStatusCallback': 'http://www.example.com/sms-status-callback'},          {'SmsStatusCallback': 'http://www.example.com/sms-status-callback'},
SPOT THE DIFFERENCE
    GET /2010-04-01/Accounts/ACaaaaaaaaaaaaaaaaaaaaaaaaa/Applications/APaaaaaaaaaaaaaaaaaaa

  {'TwilioResponse':
  [{'Application': [
   {'Sid': 'APaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'},
       {'DateCreated': 'Mon, 22 Aug 2011 20:58:45 +0000'},
       {'DateUpdated': 'Mon, 22 Aug 2011 20:58:45 +0000'},
       {'AccountSid': 'ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'},
       {'FriendlyName': 'Application Friendly Name'},
       {'ApiVersion': '2010-04-01'},
       {'VoiceUrl': 'http://www.example.com/voice'},
       {'VoiceMethod': 'GET'},
       {'VoiceFallbackUrl': 'http://www.example.com/voice-callback'},
       {'VoiceFallbackMethod': 'GET'},
       {'StatusCallback': 'http://www.example.com/status-callback'},
       {'StatusCallbackMethod': 'GET'},
-      {'VoiceCallerIdLookup': 'false'},
?                               ^

+      {'VoiceCallerIdLookup': 'False'},
?                               ^
       {'SmsUrl': 'http://www.example.com/sms'},
       {'SmsMethod': 'GET'},
       {'SmsFallbackUrl': 'http://www.example.com/sms-fallback'},
       {'SmsFallbackMethod': 'GET'},
       {'SmsStatusCallback': 'http://www.example.com/sms-status-callback'},
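
The "-", "+" and "?" lines above match the format produced by Python's difflib.ndiff. A minimal sketch of such a diff test follows, assuming both responses have already been parsed into Python structures. The function name diff_responses and the sample dictionaries are hypothetical, and this is not claimed to be the tool used in the talk.

```python
import difflib
import pprint


def diff_responses(legacy, candidate):
    """Return the ndiff lines that differ between two parsed API responses."""
    legacy_lines = pprint.pformat(legacy, width=60).splitlines()
    candidate_lines = pprint.pformat(candidate, width=60).splitlines()
    delta = difflib.ndiff(legacy_lines, candidate_lines)
    # Keep removed (-), added (+) and hint (?) lines; drop unchanged ones.
    return [line for line in delta if not line.startswith("  ")]


# Sample data modelled on the slide; only VoiceCallerIdLookup differs.
legacy = {
    "Sid": "APaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "VoiceCallerIdLookup": "false",
    "SmsMethod": "GET",
}
candidate = dict(legacy, VoiceCallerIdLookup="False")

for line in diff_responses(legacy, candidate):
    print(line.rstrip("\n"))
```

An empty result means the two stacks agree on that request. Any remaining line points at a regression such as the 'false' versus 'False' casing shown on the slide.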
ALL GROWN UP
       API
    UNIT TESTS
 FUNCTIONAL TESTS
    DIFF TESTS
  CANARY DEPLOY
       API
                  #twiliocon
Flask-RESTful
• A simple REST resource framework for Python Flask applications


• Simplifies argument parsing, generating output, & defining resources


• ORM / Library independent. Its only dependency is Flask itself.


            http://www.twilio.com/open-source
                                                  #twiliocon
HI. I’M ADAM
I’m MicroTransactions-Team Engineering Lead at Twilio




                                     #twiliocon
REWIND TO 2009
NERDY TELECOM STARTUP SEEKS RELIABLE HA
         #twiliocon
MICRO-TRANSACTIONS @TWILIO




                   #twiliocon
TRANSACTIONS @TWILIO V1

[Diagram, labels only, left to right: TX QUEUE, three DEQUEUERs, TX MYSQL (BALANCE UPDATER), AGG, three DEQUEUERs, AGG (AGGREGATION)]
[Diagram, labels only]
  TX-
  TX-
  MYSQL, MYSQL, MYSQL: INSERT only log data
  REDIS, REDIS: Automatically increments balance and counters
  POST-FLIGHT
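
The split between an insert-only log and an in-flight counter store can be sketched as follows. This is a single-process illustration under loud assumptions: an in-memory SQLite table stands in for MySQL, a lock-guarded dictionary stands in for Redis, amounts are integers in hypothetical micro-units, and every name (InFlightStore, process_transaction, tx_log) is invented for the example.

```python
import sqlite3
import threading
from collections import defaultdict


class InFlightStore:
    """Stand-in for Redis: integer counters with atomic increments."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = defaultdict(int)

    def incr_by(self, key, amount):
        with self._lock:  # one writer at a time, like a single Redis command
            self._counters[key] += amount
            return self._counters[key]


def process_transaction(log_db, store, account_sid, amount_micros):
    """Append the transaction to the log, then bump the in-flight counters."""
    # INSERT only: existing log rows are never updated or deleted.
    log_db.execute(
        "INSERT INTO tx_log (account_sid, amount_micros) VALUES (?, ?)",
        (account_sid, amount_micros),
    )
    log_db.commit()
    store.incr_by(f"tx_count:{account_sid}", 1)
    return store.incr_by(f"balance:{account_sid}", -amount_micros)


log_db = sqlite3.connect(":memory:")  # stand-in for the MySQL log database
log_db.execute("CREATE TABLE tx_log (account_sid TEXT, amount_micros INTEGER)")
store = InFlightStore()
store.incr_by("balance:AC-demo", 1_000_000)  # seed a starting balance
remaining = process_transaction(log_db, store, "AC-demo", 7_500)
```

Keeping the database append-only means the log write never contends with a balance update. The running balance and counters live in the in-flight store, where each increment is a single atomic operation.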
ONLINE SYSTEM UPGRADES

• Double book-keeping with
  Shadow Mode

• Rollback support with
  Account Flags & Feature Flags

• Gradual rollout



                                  #twiliocon
1. SHADOW MODE

[Diagram, labels only: TX QUEUE, TX-MASTER-DEQUEUER (ATOMIC faucet controls), NGB QUEUE and LEGACY QUEUE, NGB-DEQUEUER and LEGACY-DEQUEUER]
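
A single-threaded sketch of the shadow-mode faucet: while it is open, every transaction is copied to both queues. When it is closed, both queues are drained and a report is returned. The class name Faucet, its methods, and the in-memory deques are assumptions made for illustration. A production version would need real queues and real atomicity guarantees.

```python
from collections import deque


class Faucet:
    """Fans each transaction out to both billing queues while open."""

    def __init__(self):
        self.is_open = True
        self.ngb_queue = deque()
        self.legacy_queue = deque()

    def dispatch(self, tx):
        """Copy tx to both queues; refuse it (leave it upstream) when closed."""
        if not self.is_open:
            return False
        # Both appends happen together, so both systems see the same stream.
        self.ngb_queue.append(tx)
        self.legacy_queue.append(tx)
        return True

    def close_and_drain(self, ngb_handler, legacy_handler):
        """Stop intake, drain both queues, and report how many each processed."""
        self.is_open = False
        report = {"ngb": 0, "legacy": 0}
        while self.ngb_queue:
            ngb_handler(self.ngb_queue.popleft())
            report["ngb"] += 1
        while self.legacy_queue:
            legacy_handler(self.legacy_queue.popleft())
            report["legacy"] += 1
        return report


faucet = Faucet()
for tx in ({"account": "AC-demo", "amount": 1}, {"account": "AC-demo", "amount": 2}):
    faucet.dispatch(tx)

ngb_seen, legacy_seen = [], []
report = faucet.close_and_drain(ngb_seen.append, legacy_seen.append)
```

Once the drain finishes, both systems have processed exactly the same transactions, which is the precondition for comparing their books.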
COMPARING FAUCETS: OLD SYSTEM VS. NEW SYSTEM

Sid      Account Sid    Old System    New System
SV...    AC...             29.5500       29.5500
SV...    AC...            144.8200      144.8200
SV...    AC...             35.6700       35.6700
SV...    AC...             30.4200       30.4200
SV...    AC...            106.9100      106.9100
SV...    AC...            109.7900      109.7900
SV...    AC...              5.5900        5.5900
SV...    AC...             10.8400       10.8400
SV...    AC...             74.0250       73.9450
SV...    AC...             29.00         49.00
SV...    AC...             13.9600       13.1200
SV...    AC...             71.3800       91.2400
SV...    AC...            671.0650      646.5050
SV...    AC...             71.2600       71.2600
SV...    AC...             44.5000       44.5000
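
Checking the books after a drain can be sketched as a row-by-row comparison. The helper name find_mismatches is hypothetical. The amounts are a subset of the old/new pairs from the comparison table, and Decimal is used so that values such as 74.0250 compare exactly.

```python
from decimal import Decimal


def find_mismatches(rows):
    """Return (row_index, old, new, difference) for rows that disagree."""
    mismatches = []
    for index, (old, new) in enumerate(rows):
        old_amount, new_amount = Decimal(old), Decimal(new)
        if old_amount != new_amount:
            mismatches.append(
                (index, old_amount, new_amount, new_amount - old_amount)
            )
    return mismatches


# Old/new pairs copied from the comparison table; Sids are elided on the slide.
rows = [
    ("29.5500", "29.5500"),
    ("74.0250", "73.9450"),
    ("29.00", "49.00"),
    ("13.9600", "13.1200"),
    ("71.3800", "91.2400"),
    ("671.0650", "646.5050"),
    ("71.2600", "71.2600"),
]
mismatches = find_mismatches(rows)
safe_to_enable_accounts = not mismatches
```

With these rows, five of the seven pairs disagree, so no account should be moved to the new system yet.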
2. ROLLBACK SUPPORT
     Toggle Account Features




                               #twiliocon
2. ROLLBACK SUPPORT
      Toggle Cluster Features




                                #twiliocon
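
The two rollback levels can be sketched as a routing function: a cluster-wide feature flag acts as a kill switch, and a per-account flag opts individual accounts in. The flag name next_gen_billing, the function billing_system_for, and the dictionary layout are assumptions for illustration only.

```python
NGB_FEATURE = "next_gen_billing"  # hypothetical flag name


def billing_system_for(account_sid, cluster_flags, account_flags):
    """Pick the billing system for one account.

    The cluster-wide flag is a kill switch for the whole new system. The
    per-account flag opts a single account in while the cluster flag is on.
    """
    if not cluster_flags.get(NGB_FEATURE, False):
        return "legacy"
    if NGB_FEATURE in account_flags.get(account_sid, set()):
        return "ngb"
    return "legacy"


cluster_flags = {NGB_FEATURE: True}
account_flags = {"AC-demo-1": {NGB_FEATURE}}

before = billing_system_for("AC-demo-1", cluster_flags, account_flags)

# Roll back a single account by clearing its flag.
account_flags["AC-demo-1"].discard(NGB_FEATURE)
after_account_rollback = billing_system_for("AC-demo-1", cluster_flags, account_flags)

# Roll back everything by clearing the cluster-wide flag.
account_flags["AC-demo-1"].add(NGB_FEATURE)
cluster_flags[NGB_FEATURE] = False
after_cluster_rollback = billing_system_for("AC-demo-1", cluster_flags, account_flags)
```

Because both rollbacks are data changes rather than deploys, returning to the legacy system does not require shipping new code.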
3. GRADUAL ROLLOUT (PRACTICE

[Chart: # of Errors and Accounts on Next Gen Billing, Week 1 through Week 8]
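
One possible way to pick which accounts move at each step of a gradual rollout is stable hash bucketing, sketched below. The talk does not say how accounts were selected, so the bucketing scheme and the names rollout_bucket and is_on_next_gen are assumptions. The property worth noting is that an account enabled at 1% stays enabled at 5% and beyond.

```python
import hashlib


def rollout_bucket(account_sid):
    """Map an account to a stable bucket in the range 0..9999."""
    digest = hashlib.sha256(account_sid.encode("utf-8")).hexdigest()
    return int(digest, 16) % 10_000


def is_on_next_gen(account_sid, rollout_percent):
    """True when the account falls inside the current rollout percentage."""
    return rollout_bucket(account_sid) < rollout_percent * 100


# Hypothetical account identifiers, used only to show the selection growing.
sample_accounts = [f"AC-demo-{i}" for i in range(1000)]
enabled_at_1 = [sid for sid in sample_accounts if is_on_next_gen(sid, 1)]
enabled_at_5 = [sid for sid in sample_accounts if is_on_next_gen(sid, 5)]
```

Setting the percentage to 0 is equivalent to practice mode with no accounts on the new system, and 100 enables every account.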
BETTER TOOLS

                  Problem                                Solution


• Redis killed by Linux OOM Killer       • Rollback account flags

• Inaccurate balances on NGB             • Fix bug, deploy, compare

• Credit card purchases failing          • Rollback account flags

• Failing to write log entries for NGB   • Rollback feature flags



                                                   #twiliocon
THANK YOU

COME ASK US QUESTIONS!

Deploying Next Gen Systems with Zero Downtime


Editor's Notes

  • #44 Let's rewind to 2009 and take a look at what we built.
  • #45 At Twilio, we nerd out on billing every day. It's odd and quirky, but super powerful, and mission-critical to the advancement of Twilio. We decided to invest in building this early on in the company.
  • #47 High availability: always be processing.
  • #48 Realtime scoreboards, usage, metrics.
  • #49, #50 Two piggy banks. Single-process dequeuers, limited by the number of transactions you can process on a single database.
  • #52 Once we understood the problems, we went to the drawing board and built something new to replace the infrastructure alongside it. <button> A system that disconnects our dequeuers from our processors. <button> Processors are powered by a REST API and use status codes for success and error resolution. <button> We only insert into our databases, as a log server. <button> And to fix our realtime issue, we're using Redis as an in-flight datastore to atomically process metrics as we process transactions.
  • #56, #57, #58 Now we have two systems side by side, but we need to compare the two. <click> Double book-keeping lets us compare balances and metrics. If we have a bug, <click> we need to roll back. And we don't want to do it all at once. <click>
  • #59, #60 With so much throughput, we couldn't just shut down the billing system. We also couldn't lose a billing event. <click> So we built an abstraction between the two systems that would allow us to atomically control transactions. <click> When the faucet is turned off, it waits until both queues are drained and sends us a report. <click>
  • #61 Check the books. Can we turn any accounts online?
  • #62 Nope, we should not enable any accounts yet.
  • #63 If we had moved accounts, with a click, we can migrate them back.
  • #64 Or if the cluster catches fire, we can turn off the entire new system and reroute billing traffic back to its legacy system.
  • #65 <TODO> Graph build out each week with the story. Practice is good. We tested our platform thoroughly in a practice mode with no account flags turned on. As we progressed and fixed the edge cases, we migrated 1%, 5%, all the way up to all accounts over a period of time. Planning with your tools lets you build a gradual deployment with ease.
  • #66 Just to follow up: better tools equal better deployments. When we had issues with our new in-flight store, we had a way to roll back. When we were seeing discrepancies in balances, we would investigate, fix, deploy, and compare. So you get the idea: to migrate to a new micro-payments platform, we must engineer tools that let us migrate back and forth with ease so that we can spend time on the solutions.