ACLU Partners with Tag1 to Raise Most-Ever $120M in Donations at Mission-Critical Moments

News events led to dramatically increased traffic, causing the ACLU’s donation platform to go down under load, impacting revenue and supporter engagement at a critical time for the organization. Performance tuning under normal circumstances is difficult, but even more so while under extreme load and experiencing downtime with millions of dollars being lost by the hour.

The ACLU called on Tag1’s Technical Architecture and Leadership team to perform emergency support and rescue work, both to get the ACLU Action website back online as quickly as possible and to help it withstand even bigger traffic spikes in the future. The results of Tag1’s efforts: a 30-fold increase in donations (from a yearly average of $4 million to $120 million), $24 million in donations on a single weekend, 57% faster database response times, and a 900% increase in throughput measured in requests per minute. In addition, the systems now stay online and perform quickly under extreme load.

Published in: Technology


  1. 1. ACLU.org in 2017: Handling a Big Year. Patrick Jensen (ACLU), Narayan Newton (Tag1 Consulting), & Matthew Cheney (Pantheon)
  2. 2. ACLU ● Nonprofit founded 1920 with over 3 million supporters ● Defend individual rights and liberties ● Famous cases ○ Led fight against Japanese-American internment camps ○ 1996 Communications Decency Act ○ Marriage equality
  3. 3. ACLU Action Website ● Act ○ Sign petitions ○ Send messages ○ Request legal aid ● Support ○ Donate ○ Sign up to volunteer ● Accomplished via form submissions ● Drupal 6 (now Drupal 7)
  4. 4. Before Pantheon Instability and Uncertainty 2013 ● Database Strain ○ Using core Drupal search ● Hardware upgrades took weeks ● Maintenance was onerous ○ test and development environments ○ infrastructure (e.g. varnish)
  5. 5. Hosting Websites is Hard Work ● Need to Know Lots of Technology ○ Linux, LXC, NGINX, MariaDB, PHP, Redis, Solr, Git, Varnish, New Relic ● Need to Do Lots of Things ○ Workflow, Branches, Backups, Scalability, Performance, Security ● 24 hours a day, 7 days a week
  6. 6. What Does Git Have to do with Civil Rights?
  7. 7. Putting Organizational Mission at Top of Stack There is already so much to do! ● The World is Already Full of Challenges ● Don’t be “ambitious” about a backup system or your load balancers ● Leverage the Experience of Others ● Be the Pyramidion you want to be in the world!
  8. 8. That Is Why Folks Like the ACLU Use Drupal Stand on the Shoulders of Giants ● Leverage the Expertise of Others ○ Drupal Core ○ Contrib Modules ○ External Libraries ● Benefit from Community of Practice ○ Best Practices, Security Process, Performance, Documentation
  9. 9. And Why Folks Use Managed Cloud Services Free up Time & Resources to Focus ● Drupal is Getting More Complicated & The Web is Getting More Ambitious ● Leverage Pre-Built Feature Sets ○ Redis (Object Caching), Solr (Search Indexing), Dev->Test->Live (Workflow) ● Use Best In Class Security Processes + Performance/Scalability Tooling
  10. 10. And Be Prepared. Now and in the Future. Behold the Power of Containerization!
  11. 11. And Be Prepared. Now and in the Future. Behold the Power of Containerization!
  12. 12. And Be Prepared. Now and in the Future. Behold the Power of Containerization!
  13. 13. Be Prepared. You Never Know What Is Going to Happen (Andrew Lowery)
  14. 14. Donald Trump Elected ● Donations in the 5 days after election ■ 2012: $25,000 ■ 2016: $7,200,000 ● Page views Nov. 9 - 13 ■ 2015: 400,000 ■ 2016: 4,250,000
  15. 15. Nov 16, 2016: The wake-up call [Chart: form submissions per minute, with site outage marked]
  16. 16. Nov 16, 2016: The wake-up call. 300 form submissions per minute
  17. 17. Post-Maddow Emergency Improvements
  18. 18. Outage Review Tag1 Consulting was brought in to review the outage after the Rachel Maddow interview. Specifically: ● Fabian Franz (d.o.: fabianx) ● Narayan Newton (d.o.: nnewton) ● Jeremy Andrews (d.o.: Jeremy) The overall issue was clear and somewhat ongoing. The team immediately transitioned into developing and deploying fixes.
  19. 19. Example Query Fix

      +----+-------------+-------+--------+----------------------+---------+---------+---------------------+--------+----------------------------------------------+
      | id | select_type | table | type   | possible_keys        | key     | key_len | ref                 | rows   | Extra                                        |
      +----+-------------+-------+--------+----------------------+---------+---------+---------------------+--------+----------------------------------------------+
      | 1  | SIMPLE      | fo    | ALL    | NULL                 | NULL    | NULL    | NULL                | 282880 | Using where; Using temporary; Using filesort |
      | 1  | SIMPLE      | o     | eq_ref | PRIMARY,order_status | PRIMARY | 4       | aclu.fo.oid         | 1      | Using where                                  |
      | 1  | SIMPLE      | os    | eq_ref | PRIMARY              | PRIMARY | 98      | aclu.o.order_status | 1      |                                              |
      +----+-------------+-------+--------+----------------------+---------+---------+---------------------+--------+----------------------------------------------+

      SELECT o.order_id, o.uid, o.billing_first_name, o.billing_last_name,
             o.order_total, o.order_status, o.created, os.title
      FROM uc_orders o
      INNER JOIN fundraiser_og fo ON fo.oid = o.order_id AND fo.gid IN (8888,9999)
      LEFT JOIN uc_order_statuses os ON o.order_status = os.order_status_id
      WHERE o.order_status IN ('refunded', 'pending', 'processing', 'payment_received', 'completed')
      ORDER BY o.order_id DESC
      LIMIT 0, 30;
  20. 20. Index Solution

      +----+-------------+-------+--------+----------------------+--------------+---------+---------------------+------+---------------------------------------+
      | id | select_type | table | type   | possible_keys        | key          | key_len | ref                 | rows | Extra                                 |
      +----+-------------+-------+--------+----------------------+--------------+---------+---------------------+------+---------------------------------------+
      | 1  | SIMPLE      | o     | range  | PRIMARY,order_status | order_status | 98      | NULL                | 76   | Using index condition; Using filesort |
      | 1  | SIMPLE      | os    | eq_ref | PRIMARY              | PRIMARY      | 98      | aclu.o.order_status | 1    |                                       |
      | 1  | SIMPLE      | fo    | ref    | test                 | test         | 4       | aclu.o.order_id     | 1    | Using where; Using index              |
      +----+-------------+-------+--------+----------------------+--------------+---------+---------------------+------+---------------------------------------+

      + db_add_primary_key($ret, 'fundraiser_og', array('oid', 'gid', 'nid'));
      + db_add_index($ret, 'fundraiser_og', 'idx_gid', array('gid'));
      + db_add_index($ret, 'fundraiser_og', 'idx_nid', array('nid'));

      ALTER TABLE fundraiser_og ADD INDEX test (oid,gid,nid);
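The effect of the index on slide 20 can be reproduced in miniature. The sketch below is only an illustration under assumptions: it uses Python's built-in SQLite rather than the MariaDB server from the talk, a simplified single-table lookup instead of the full join, and made-up rows. It keeps the slide's table, column, and index names and prints the query plan before and after the composite index is added.

```python
import sqlite3

# In-memory SQLite stand-in for the real MariaDB table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fundraiser_og (oid INTEGER, gid INTEGER, nid INTEGER)")
# Made-up rows, only so the table is not empty.
conn.executemany(
    "INSERT INTO fundraiser_og VALUES (?, ?, ?)",
    [(i, 8888 if i % 2 else 9999, i) for i in range(1000)],
)

# Simplified lookup by order id, the column the join on slide 19 uses.
QUERY = "SELECT gid FROM fundraiser_og WHERE oid = ?"


def plan(connection):
    """Return the planner's description of how it would run QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    # The last column of each row is the human-readable detail.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX test ON fundraiser_og (oid, gid, nid)")
plan_after = plan(conn)   # the composite index can now satisfy the lookup

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions and from MariaDB's EXPLAIN output, but the shift from a full scan to an index search mirrors the change from `type: ALL` to `type: ref` for `fo` in the two tables above.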
  21. 21. Result
  22. 22. Patchset Results Before Patchset: ~ 1400ms response time After Patchset: ~ 650ms response time
  23. 23. It Works on My Local (cluster) Performance Testing For Complex Sites ● Performance Testing is Complicated ○ Varnish/CDN ○ Redis/APC ○ PHP, MariaDB ● Production Parity Testing! ● But Replicating a Cluster is Hard Work ○ Nobody has time for that!
  24. 24. Let the Robots Do the Work! They already do so much. What’s a little more SysAdmin?
  25. 25. On-Demand Environments Are the Solution
  26. 26. Surviving and Learning from Even Bigger Traffic Spikes
  27. 27. Source: http://fortune.com/2017/01/31/uber-boycott-trump/
  28. 28. Traffic spiked to 85x normal levels
  29. 29. How did our site handle the traffic? [Chart: form submissions per minute, with site outage marked]
  30. 30. Mitigating a Site Outage
  31. 31. Load Testing
  32. 32. Results: Before Code Changes vs. After Code Changes
  33. 33. Payment Gateway Toolkit ● curl_log ○ Adding verbose logging to the curl requests ○ Logging to a table in the DB ○ In-flight sanitization of user information ● curl_loadbalance ○ Decaying ticket-based curl endpoints load balancer ○ Removes failing endpoints for a window of time after X failures ○ Specifically designed to always have at least one endpoint
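The failure-window behavior described for curl_loadbalance on slide 33 could look roughly like the sketch below. It is not the module's code: the class name, parameters, and gateway names are hypothetical, the "decaying ticket" scheme is not modeled because the slides do not describe it, and plain round-robin is swapped in for endpoint selection. It only shows two of the listed properties: an endpoint is removed for a window of time after X failures, and a pick always returns some endpoint.

```python
import time


class EndpointBalancer:
    """Hypothetical round-robin balancer with temporary bans (illustration only)."""

    def __init__(self, endpoints, max_failures=3, cooldown=60.0, clock=None):
        self.endpoints = list(endpoints)
        self.max_failures = max_failures      # the "X failures" threshold
        self.cooldown = cooldown              # length of the removal window, seconds
        self.clock = clock or time.monotonic  # injectable so the demo can fake time
        self.failures = {e: 0 for e in self.endpoints}
        self.banned_until = {e: 0.0 for e in self.endpoints}
        self._next = 0

    def pick(self):
        now = self.clock()
        healthy = [e for e in self.endpoints if self.banned_until[e] <= now]
        if not healthy:
            # Never return nothing: fall back to the endpoint whose ban ends soonest.
            return min(self.endpoints, key=lambda e: self.banned_until[e])
        choice = healthy[self._next % len(healthy)]
        self._next += 1
        return choice

    def report_failure(self, endpoint):
        self.failures[endpoint] += 1
        if self.failures[endpoint] >= self.max_failures:
            self.banned_until[endpoint] = self.clock() + self.cooldown
            self.failures[endpoint] = 0

    def report_success(self, endpoint):
        self.failures[endpoint] = 0


# Demo with a fake clock and made-up gateway names.
now = [0.0]
lb = EndpointBalancer(["gw-a", "gw-b"], max_failures=2, cooldown=30.0,
                      clock=lambda: now[0])
lb.report_failure("gw-a")
lb.report_failure("gw-a")                           # gw-a is now banned for 30s
picks_during_ban = {lb.pick() for _ in range(4)}    # only gw-b is offered
lb.report_failure("gw-b")
lb.report_failure("gw-b")                           # both endpoints are banned
fallback = lb.pick()                                # still returns an endpoint
now[0] = 31.0                                       # the window has passed
picks_after_window = {lb.pick() for _ in range(4)}  # both are offered again
```

Injecting the clock keeps the example deterministic; a real caller would leave `clock` unset and report failures from its HTTP client's error handling.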
  34. 34. Performance Next Steps ● query_cache ○ Caching “shim” that adds db_query caching to contrib modules without patching them ○ Ability to map queries to a single base query ○ Moves read-only traffic from the DB to the object cache ● rate_limit ○ An in-Drupal solution to rate limiting specific types of requests ○ Webform protection ○ Search protection
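The query_cache idea on slide 34, moving repeated read-only queries from the database to an object cache, can be sketched as a read-through cache. This is an assumption-laden illustration, not the module itself: the `QueryCache` class is hypothetical, a plain dictionary stands in for an object cache such as Redis, SQLite stands in for MariaDB, and the mapping of queries to a single base query is not modeled.

```python
import sqlite3


class QueryCache:
    """Hypothetical read-through cache in front of a database connection."""

    def __init__(self, conn):
        self.conn = conn
        self.store = {}   # stands in for the object cache (assumption)
        self.db_hits = 0  # counts queries that actually reached the database

    def query(self, sql, params=()):
        key = (sql, tuple(params))
        if key not in self.store:
            self.db_hits += 1
            self.store[key] = self.conn.execute(sql, params).fetchall()
        return self.store[key]

    def invalidate(self):
        """Drop everything, e.g. after a write to the underlying tables."""
        self.store.clear()


# Demo table reusing names from the query on slide 19; the row is made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE uc_order_statuses (order_status_id TEXT PRIMARY KEY, title TEXT)"
)
conn.execute("INSERT INTO uc_order_statuses VALUES ('completed', 'Completed')")

cache = QueryCache(conn)
sql = "SELECT title FROM uc_order_statuses WHERE order_status_id = ?"
first = cache.query(sql, ("completed",))   # goes to the database
second = cache.query(sql, ("completed",))  # served from the cache
hits_after_two_reads = cache.db_hits
cache.invalidate()
third = cache.query(sql, ("completed",))   # goes to the database again
```

Keying on the exact SQL text plus its bound parameters is the simplest safe choice; any normalization of the query text would need care not to merge queries that differ inside string literals.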
  35. 35. The Payoff
  36. 36. Source: http://fortune.com/2018/01/06/google-microsoft-amazon-internet-association-net-neutrality/
  37. 37. Previous Failures [Chart: form submissions per minute, with site outages marked]
  38. 38. Dec 2017: No Failure at 1,900 submissions/min [Chart: form submissions per minute, with site outages marked]
  39. 39. “The ACLU is ready. We have to be. We’re in for the fight of our lives.” Anthony Romero, ACLU Exec. Dir.
  40. 40. Questions
  41. 41. Join us for contribution sprints Friday, April 13, 2018 ● Mentored Core sprint ● First time sprinter workshop ● General sprint ● Each 9:00-12:00, Room: Stolz 2 ● #drupalsprint
  42. 42. What did you think? Locate this session at the DrupalCon Nashville website: http://nashville2018.drupal.org/schedule Take the Survey! https://www.surveymonkey.com/r/DrupalConNashville