Optimized for change:
                Architecture @ Etsy
                Kellan Elliott-McCrea
                @kellan
                CTO, Etsy




Monday, June 18, 12
Launched June 18, 2005
                      875,000 active sellers
                      33.5MM items for sale
                      $65.9MM in sales, in May
                      1.4B page views, in May
                      102 engineers
                      32 releases, last Friday



LAMP
                                                 any questions?




8BitLit, http://www.etsy.com/listing/90066890/
Why?

3 inevitabilities we design for:

         1. Things break, unexpectedly
         2. What we're building changes
         3. We don't get to start over



2 years of change.




Architectural Principles
               * Don't bet against the future.
               * Our customers are humans.
               * Simplicity always wins, in the end.
               * Favor global over local optimization.
               * Ambiguity kills momentum.
               * Make failure cheap.
               * Technical debt is an inevitable by-product
               of shipping code.
               * Optimize for change.


Cleverness
Ckrickett, http://www.etsy.com/listing/90611466
Complex systems and change
         1. Distributed systems are inherently complex.

       2. The outcome of change in complex systems is hard to
       predict.

        3. The outcomes of small, frequent, measurable changes
       are easier to predict and easier to recover from, and they
       promote learning.




Ckrickett, http://www.etsy.com/listing/90611466
Continuous deployment, Metrics
Driven Development, Blameless
        Post-Mortems



Ckrickett, http://www.etsy.com/listing/90611466
Continuous deployment: Small,
            frequent changes to production




Ckrickett, http://www.etsy.com/listing/90611466
Continuous Deployment:
                                  No branching.

       “All existing revision control systems were
       built by people who build installed
       software”
       - Paul Hammond,
       Always Ship Trunk, Velocity 2010




Continuous Deployment:

                       feature flags
               if ($cfg['awesome_new_search']) {
                   # new hotness
                   $rsp = do_solr();
               } else {
                   # boring old stuff
                   $rsp = do_grep();
               }



Continuous Deployment:
                      Ramp-ups
                      (on top of feature flags)


        1. Launch to staff only
        2. Launch to 1% of all users
        3. Launch to members of a beta group
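
The ramp-up stages above can be sketched as one check layered on a flag. This is a minimal Python sketch, not Etsy's PHP implementation: the `FLAGS` layout and the `bucket` and `is_enabled` names are hypothetical, and hashing the user id into 100 buckets is an assumed way to pick a stable 1% of users.

```python
import hashlib

# Hypothetical flag config; every key and field name here is an assumption.
FLAGS = {
    "awesome_new_search": {"staff": True, "percent": 1, "groups": {"beta"}},
}

def bucket(flag: str, user_id: int) -> int:
    """Map (flag, user) to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100

def is_enabled(flag: str, user_id: int, is_staff: bool = False,
               groups: frozenset = frozenset()) -> bool:
    cfg = FLAGS.get(flag)
    if cfg is None:
        return False                      # unknown flags stay off
    if is_staff and cfg.get("staff"):
        return True                       # stage 1: staff only
    if groups & cfg.get("groups", set()):
        return True                       # stage 3: beta group members
    # stage 2: a fixed percentage of all users, stable per user
    return bucket(flag, user_id) < cfg.get("percent", 0)
```

Because the bucket depends only on the flag name and user id, a user stays in or out of the 1% across requests, and raising `percent` only ever adds users.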




Continuous Deployment:


                      any engineer can launch a feature to

                      1% of users


Continuous Deployment:


           ~200 experiments
           live right now


Metrics driven development:

           introspection isn’t
           optional.
           measure everything,
           log everything
Metrics driven development:

           Metrics happen when
           you make it easy. And
           visible.

Metrics driven development:
   Teach computers to read graphs



                      holtWintersConfidence(Upper|Lower)
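
A rough idea of what "reading a graph" by machine can look like: flag points that fall outside a confidence band around the recent trend. This Python sketch is a deliberate simplification, not the Holt-Winters forecasting the slide names: it uses plain exponential smoothing with no trend or seasonal terms, and `confidence_bands`, `alpha`, and `width` are illustrative names.

```python
def confidence_bands(series, alpha=0.3, width=3.0):
    """Return (value, lower, upper, is_anomaly) for each point after the first.

    level: exponentially smoothed value of the series
    dev:   exponentially smoothed absolute deviation from the level
    """
    level = series[0]
    dev = 0.0
    out = []
    for x in series[1:]:
        lower, upper = level - width * dev, level + width * dev
        # No deviation estimate yet means we cannot judge the point.
        is_anomaly = dev > 0 and not (lower <= x <= upper)
        out.append((x, lower, upper, is_anomaly))
        # Update the estimates only after judging the point.
        dev = alpha * abs(x - level) + (1 - alpha) * dev
        level = alpha * x + (1 - alpha) * level
    return out

rows = confidence_bands([10, 11, 10, 11, 10, 11, 10, 50])
flags = [is_anomaly for (_, _, _, is_anomaly) in rows]
```

Here only the final jump to 50 lands outside the band; the small oscillation before it stays inside.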




Metrics driven development:
    More info: http://www.slideshare.net/
    mikebrittain/metricsdriven-engineering




Optimize for MTTR, not MTBF


How?

Etsy



Etsy

                                        EMR/S3
                      PCI

                            BCP, Cold


inbound request

  [Architecture diagram; component labels grouped by column]

  * CDNs - diversified at the DNS level
  * Internet providers - diversified at borders
  * Etsy: network appliances
  * etsy.com/, api.etsy.com, /atlas: apache, php application, MySQL,
    search, memcache, async http, StatsD, sqlite, gearman
  * bcn.etsy.com: apache, logs, logrotate, HDFS, analytics
  * etsystatic.com/ photos: Squid, apache, php, imstor, NFS
  * AWS: analytics, imstor; EMR, S3, JRuby/Cascading, PHP, MySQL
  * logs; server/OS, hardware
  * MySQL: dbindex, dbshards, dbaux, dbdata
  * search: Thrift, Jetty, Solr slaves, Solr master
  * datasets: HBase, sharded MySQL
  * mail out: SMTP, X-Yarnblaster
  * PCI: via jsonp, no privileged access
  * etc

CDNs: Put a slider on it




   Just works via weighted DNS
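
One way to picture the slider: each CDN gets a weight, and each DNS answer is drawn in proportion to it. A minimal Python sketch under that assumption; the CDN names and `pick_cdn` are hypothetical, and real weighted DNS is configured at the DNS provider rather than in application code.

```python
import random

def pick_cdn(weights, rng=random):
    """Choose one CDN name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Moving the "slider": shift traffic by changing the weights only.
mostly_a = {"cdn_a": 90, "cdn_b": 10}
all_a = {"cdn_a": 1, "cdn_b": 0}      # cdn_b drained completely
```

Draining a provider is then a weight change to 0, with no deploy on the application side.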


Apache
        * Well known
        * PHP is native
        * apache_note
        * fast start time
        * cheap in place replacement
        * .htaccess
        * Challenge: memory usage




Apache: apache_note
              apache_note('etsy_uaid', $id);

              Additive! insanely useful!
              introspection through the lifecycle




Apache: log format

      LogFormat "%{X-Forwarded-For}i %{True-Client-IP}i %l %u %t \"%r\"
      %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %{etsy_shop_id}n
      %{etsy_uaid}n %V %{etsy_ab_selections}n
      %{etsy_request_uuid}n %{etsy_api_consumer_key}n
      %{etsy_api_method_name}n %{php_memory_usage_bytes}n
      %{php_time_microsec}n %D" combined



Etsy: the App

         * 487,000 lines of PHP
         * 214,000 lines of JavaScript
         * Monolithic codebase
         * 3 front ends: Etsy.com, API, Atlas




Etsy: the App

         * routing handled by Apache
         * scripts fronting OO PHP5
         * PHP, fast by default
         * opcode caching
         * Challenge: liveliness when calling services




Etsy: coding patterns

         * lightweight, home-rolled “framework”
         * ORM handles DAO across backends
         * config and feature flags systems used
         everywhere
         * small slow moving datasets stored as PHP
         arrays
         * A/B tests
         * Smarty
         * StatsD
         * Concurrency
         * memcache

Etsy: A/B tests

        * beaconed
        * inserted into logs via apache_note
        * conditionalized on feature flags
        * nightly reports on conversion, bounce rate,
        etc
        * nightly reports on page speed, memory
        usage, etc




Etsy: Smarty

        * pre-compiled
        * pre-compiled per language




Etsy: StatsD


      StatsD::increment("logins.success");
      StatsD::timing("gearman.time", $msec);


       * 340,000 application metrics
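
For readers unfamiliar with StatsD, the two calls above map to small UDP datagrams in the `name:value|type` format, with `c` for counters and `ms` for timers. A minimal Python sketch, not Etsy's PHP client; the `StatsClient` class is illustrative, and 8125 is the customary StatsD port.

```python
import socket

def format_metric(name, value, kind):
    """Build one StatsD datagram, e.g. b'logins.success:1|c'."""
    return f"{name}:{value}|{kind}".encode("ascii")

class StatsClient:
    """Fire-and-forget UDP sender (illustrative, not the real client)."""

    def __init__(self, host="127.0.0.1", port=8125):
        self.addr = (host, port)
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def increment(self, name, count=1):
        self._send(format_metric(name, count, "c"))

    def timing(self, name, msec):
        self._send(format_metric(name, msec, "ms"))

    def _send(self, payload):
        try:
            self.sock.sendto(payload, self.addr)
        except OSError:
            pass  # a metrics failure must never break the request
```

UDP keeps instrumentation cheap: the sender does not wait for the collector, which is part of what makes adding a metric a one-line change.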




Etsy: Concurrency

        * no native concurrency in PHP
        * asynchronous HTTP calls
        * Gearman




Etsy: Async HTTP calls

       * curl_multi_exec
       * non-blocking, per-request timeouts
       * used for optional aspects of a page
       * curl against http://localhost to avoid
       network overhead
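
The pattern above (fire the optional requests concurrently, give each its own timeout, render without whatever is late) can be sketched as follows. This uses Python threads rather than `curl_multi_exec`, and `fetch_optional` and the demo fetchers are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_optional(fetchers):
    """fetchers maps name -> (callable, timeout_seconds).

    Returns name -> result, with None for anything that failed or was late.
    """
    start = time.monotonic()
    pool = ThreadPoolExecutor(max_workers=max(1, len(fetchers)))
    futures = {name: (pool.submit(fn), t) for name, (fn, t) in fetchers.items()}
    results = {}
    for name, (fut, timeout_s) in futures.items():
        # Each timeout is measured from the common start, not from this loop step.
        remaining = max(0.0, start + timeout_s - time.monotonic())
        try:
            results[name] = fut.result(timeout=remaining)
        except Exception:
            results[name] = None   # optional content: drop it and move on
    pool.shutdown(wait=False)      # do not block the page on stragglers
    return results

# Demo fetchers standing in for HTTP calls.
def _fast():
    return "ok"

def _slow():
    time.sleep(0.5)
    return "late"

def _broken():
    raise RuntimeError("service down")

page_parts = fetch_optional({
    "recommendations": (_fast, 0.2),
    "recently_viewed": (_slow, 0.05),
    "shop_badge": (_broken, 0.2),
})
```

Only the module that answered in time is rendered; the slow and failing ones come back as `None` and the page omits them.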




Etsy: Gearman

      * language agnostic job server
      * don’t use an MQ when you want a job
      server
      * 150 job types
      * persistent jobs flushed to MySQL, read
      from memory
      * non-persistent jobs just stored in memory
      * NP queue is wicked fast.




Etsy: Gearman

      * scaling CPU of cron jobs
      * denormalizing data
      * pushing to 3rd party services




Etsy: Challenges

      * Apache memory usage
      * liveliness talking to services, no
      concurrency, blocking by default




Etsy: graph of distributed failure




Etsy: Challenges
      * Apache memory usage
      * liveliness talking to services: no
      concurrency, blocking by default



    Enforce liveliness with a judicious
           application of force



Etsy: judicious application of force


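      // /proc/self/statm fields are page counts: size, resident, shared, ...
      list($v, $res, $shar) = explode(' ', @file_get_contents('/proc/self/statm'));
      $mine = $res - $shar;   // resident pages not shared with other processes
      if ($mine > $cfg['sizelimit']) {
        $pid = getmypid();
        @exec("kill -USR1 $pid");   // signal this Apache process with USR1
      }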




Etsy: judicious application of force

        Bowhunter
        * Find long running PHP processes
        * Try to avoid those mid-post


        open(APACHE, "/usr/bin/curl -s http://localhost/server-
        status|") || die "$!";




Etsy: judicious application of force


        Query_killer
        * Same idea, long running queries
        * MySQL “SHOW PROCESSLIST”
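
The core of such a tool is a filter over the process list. A minimal Python sketch, assuming rows shaped like the `Id`, `Command`, `Time`, and `Info` columns of `SHOW PROCESSLIST`; `queries_to_kill` and the threshold are hypothetical, and the real Query_killer is not shown in the slides.

```python
def queries_to_kill(processlist, max_seconds):
    """Return KILL QUERY statements for queries running longer than max_seconds.

    processlist: iterable of dicts with at least Id, Command and Time keys.
    """
    statements = []
    for row in processlist:
        if row["Command"] != "Query":
            continue                      # skip Sleep, Binlog Dump, etc.
        if row["Time"] > max_seconds:
            # KILL QUERY ends the statement but keeps the connection open.
            statements.append(f"KILL QUERY {int(row['Id'])}")
    return statements

sample = [
    {"Id": 11, "Command": "Sleep", "Time": 900, "Info": None},
    {"Id": 12, "Command": "Query", "Time": 83, "Info": "DELETE FROM ..."},
    {"Id": 13, "Command": "Query", "Time": 0, "Info": "SELECT 1"},
]
to_run = queries_to_kill(sample, max_seconds=10)
```

Returning statements instead of executing them keeps the selection logic testable and lets the caller decide whether to log, alert, or actually kill.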




Memcache

       * Caching, obviously
       * Cache invalidation is hard
       * Write buffering
       * multi_get
       * rate limits




Memcache

       * atomic INCR is awesome
       * slice your time windows to reduce risk of
       cache eviction
       * we’ve been unlucky, lots of segfaults :(
       * multi_get slows down the more boxes in the
       pool
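
One reading of the INCR and time-slicing bullets is a rate limiter whose window is split into several short-lived counters, so that losing one key to eviction loses only one slice of the count. A minimal Python sketch under that assumption; `FakeCache` is an in-process stand-in for memcached's add/incr/get behaviour, and all names are hypothetical.

```python
import time

class FakeCache:
    """In-process stand-in for memcached add/incr/get semantics."""

    def __init__(self):
        self.data = {}

    def add(self, key, value):
        if key in self.data:
            return False          # add only succeeds for new keys
        self.data[key] = value
        return True

    def incr(self, key, delta=1):
        if key not in self.data:
            return None           # incr fails on a missing key
        self.data[key] += delta
        return self.data[key]

    def get(self, key):
        return self.data.get(key)

def allow(cache, who, limit, window_s=60, slices=6, now=None):
    """Count this hit and report whether `who` is still within `limit`."""
    now = time.time() if now is None else now
    slice_s = window_s / slices
    current = int(now // slice_s)
    key = f"rl:{who}:{current}"
    if not cache.add(key, 1):     # first hit in this slice creates the counter
        cache.incr(key)           # later hits bump it atomically
    total = sum(cache.get(f"rl:{who}:{current - i}") or 0 for i in range(slices))
    return total <= limit
```

With a real memcached client, each slice key would also carry an expiry a little longer than the window so that old counters age out on their own.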




MySQL: By the numbers

      * 25K+ queries/sec avg
      * 3TB InnoDB buffer pool
      * 15TB + data stored
      * 50 servers
      * 99.99% queries under 1ms




MySQL: a NotMuchSQL server
      * no joins
      * no foreign keys
      * no transactions or locks
      * no sub-selects
      * store data like you want to read it.
      * also: no auto_increment




MySQL: a NotMuchSQL server




               “Normalization is for sissies.”
                          - Cal Henderson, Flickr




MySQL: scale horizontally

        * objects sharded by key
        * lookups maintained in dbindex (MySQL is a
        FAST key-value store)
        * avoid key hashing, range partitions, and
        partitioning functions


        more: http://www.slideshare.net/jgoulah/the-etsy-shard-architecture-starts-with-s-and-ends-with-hard
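
The difference between an index lookup and key hashing is that placement is stored rather than computed, so an object can be moved, or a shard added, without rehashing everything. A minimal Python sketch of that idea; the in-memory `lookup` dict stands in for the dbindex tables, and the class, method, and shard names are hypothetical.

```python
class ShardIndex:
    """Stored object-to-shard mapping (stand-in for a dbindex lookup table)."""

    def __init__(self, shards):
        self.shards = list(shards)
        self.lookup = {}                  # object id -> shard name

    def shard_for(self, object_id):
        shard = self.lookup.get(object_id)
        if shard is None:
            # Naive placement for new objects; any policy works because
            # the decision is recorded, not recomputed.
            shard = self.shards[len(self.lookup) % len(self.shards)]
            self.lookup[object_id] = shard
        return shard

    def move(self, object_id, shard):
        """Rebalance a single object by updating its index row."""
        self.lookup[object_id] = shard

idx = ShardIndex(["shard_001", "shard_002"])
first = idx.shard_for(8528531)
idx.shards.append("shard_003")            # adding capacity...
still = idx.shard_for(8528531)            # ...does not move existing objects
```

With `hash(id) % n`, changing `n` would relocate most objects at once; here existing rows keep pointing where they already live.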




MySQL: Master-Master

        * objects hashed to a side, avoid split brain
        * allows in place schema upgrades without
        slave promotion
        * simplified capacity planning


        more: http://codeascraft.etsy.com/2012/04/20/two-sides-for-salvation/




MySQL: Introspection
  web0038 : [Mon Jun 18 09:58:38 2012] [error] [client 10.101.1.12]
  [C6kds9y1MVptEDMoOe5KCYha9VWl] [error] [ORM_LONG_QUERY] [/var/etsy/
  current/phplib/EtsyORM/Query/RawSql.php:752] [15877310] Query exceeded 10
  seconds: long_query_time=83.0927 long_query_string='/* [etsy_shard_005_A] [/
  remove_favorite_listing.php] */ DELETE FROM `users_favoritelistings` WHERE
  `user_id` = ? AND `listing_id` = ?' long_query_trace='#10 __construct() /EtsyModel/
  UserFavoriteListingMirror.php:310 #4 delete() /EtsyModel/UserFavoriteListing.php:39
  #3 delete() /EtsyModel/User.php:1840 #2 unfavoriteListing() /Controller/
  Favorites.php:344 #1 removeFavoriteListingRecord() /Controller/Favorites.php:94 #0
  performRemoveFavoriteListing() /var/etsy/current/htdocs/remove_favorite_listing.php:
  9', referer: http://www.etsy.com/people/kellanem/favorites?page=5




   SQL Comments are awesome!
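
The log line above shows each query prefixed with a comment naming the shard and the originating script. A minimal Python sketch of producing such a prefix; `tag_query` is a hypothetical helper, not Etsy's ORM code.

```python
def tag_query(sql, shard, script):
    """Prefix a SQL statement with a comment identifying where it came from."""
    def clean(text):
        # Keep the tag from opening or closing a comment early.
        return text.replace("*/", "").replace("/*", "")
    return f"/* [{clean(shard)}] [{clean(script)}] */ {sql}"

tagged = tag_query(
    "DELETE FROM `users_favoritelistings` WHERE `user_id` = ? AND `listing_id` = ?",
    "etsy_shard_005_A",
    "/remove_favorite_listing.php",
)
```

Because the comment travels with the statement, it shows up in slow-query logs and the process list, which is what makes a long-running query traceable to a page.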



MySQL: Deletes are expensive


        * update objects to state='deleted'
        * use partitions
        * truncatenator - on ext3, hard link file, move,
        delete slowly.




Anatomy of a feature: Shop Stats




Anatomy of a feature: Shop Stats



              “Never get into a land war in Asia, and never
                build an analytics tool on top of MySQL.”




Anatomy of a feature: Shop Stats


        * buffer writes in Memcache using
        predictable keys
        * flush to MySQL tables periodically via cron
        * bake old data into all possible date ranges,
        and archive to S3
        * truncate tables
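
The buffering steps above can be sketched like this. A Python dict stands in for Memcache, a list stands in for the MySQL table, and the key layout `shopstats:<shop_id>:<hour>` is an assumption; the point is only that a cron job can rebuild the keys it needs without scanning the cache.

```python
def key_for(shop_id, hour):
    """Predictable key: anyone who knows the shop and hour can rebuild it."""
    return f"shopstats:{shop_id}:{hour}"

def record_view(cache, shop_id, hour):
    """Buffer one page view in the cache instead of writing to the database."""
    k = key_for(shop_id, hour)
    cache[k] = cache.get(k, 0) + 1

def flush(cache, shop_ids, hour, table):
    """Cron step: move the buffered counts for one hour into the table."""
    for shop_id in shop_ids:
        count = cache.pop(key_for(shop_id, hour), 0)
        if count:
            table.append((shop_id, hour, count))

cache, table = {}, []
for _ in range(3):
    record_view(cache, 42, "2012-06-18T09")
record_view(cache, 7, "2012-06-18T09")
flush(cache, [7, 42, 99], "2012-06-18T09", table)
```

Many page views collapse into one row per shop per period, which is what keeps the write load on MySQL small.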




bcn.etsy.com: beaconed event stream

        * Server-side and javascript event stream
        * At least one per page view
        * Apache serving static assets
        * Aggregated on HDFS via logrotate
        * Archived on S3
        * Analyzed via JRuby/Cascading on Hadoop
        * Doesn’t use: Flume, Scribe, etc




bcn.etsy.com: beaconed event stream

    {"event_guid":"c2ffb51808b.6d2be52959ef","user_id":
    8528531,"php_event_name":"s2","php_unique_id":"4fdf1cb5d5c078.37523961","php_event_dat
    e":"18/Jun/2012:08:19:01","locale_currency_code":"USD","pref_language":"en-
    US","region":"US","detected_region":"US","accept-languages":"en-
    US,en","isMobileDevice":"0","isMobileSupported":"0","isTabletSupported":"0","isTouch":"0","isEt
    syApp":"0","listing_ids":[60274277,101504389,98682771,88585080],"cids":
    [14103953,14239293,14247717,14209614],"query":"blue","keywords":
    ["blue","blue","blue","blue"],"position":1,"replay_number":1,"s2_cached":
    1,"php_ab_test_names":"orm_record_instance_caching;mobile_detector.all_blackberry;multila
    ng_shops_listings.view;ga_replacement_cookie;disable_search_autosuggest;admin_toolbar;tra
    nslations.live_translations;ab_analytics_test;search_type_experiment;search_ads.max_replays_
    less;search_diversity_experiment;search_cached_listing_cards;placefinder.cache_memcached_
    migration;search_stream_a;search_all_items_ignores_supplies;search_default_type;search.two
    _cluster_deploy;search_parameter_sample;thrift_category2_transform;search.similar_listing_b
    rowse_page;orm_replicant_safe_find_many;bottom_first;foreign_language_carousel;search.rel
    ated_searches_all_items;weddings.srp_promos;search_log_page_position;newrelic;clientlog;go
    ogle_analytics_async;personalized_endpoint;search_no_dropdown;community_nav_popout;se
    curity_settings;search_changes_tooltip;inline_listing_hearts;framelogger;log_normal;analytics_
    second_beacon;analytics_second_beacon_privileged;analytics_second_beacon_mobile","php_a
    b_var_names":"1;1;1;1;control;1;0;A;ponycorn_v3;1;threshold_off;1;1;1;0;all_sans_supplies;
    0;1;1;1;1;0;top;0;0;1;0;1;0;1;1;1;0;1;1;1;0;1;0;1","php_ab_selector_names":"




Search
               [Diagram: search architecture]

               * Search Master: BitTorrent to distribute indexes
               * Search Slave01, Search Slave02, ... Search SlaveNN:
               100% of all indexes on each slave
               * Web01, Web02, ... WebNN reach the slaves via Thrift,
               with server affinity to improve cache hit ratio;
               search just returns ids
               * web servers hydrate IDs via multi-get, ignore a few
               failures (databases and memcache)
               * incremental index, every 7 minutes, avoid even
               numbered cron times
               * pull via cron, push via gearman
               * denormalized listing store, transition from MySQL to
               HBase, not user facing

Search
               * Solr trunk
               * Custom ranking via crunched datasets
               * BitSet fields for personalized search
               * Scaling the JVM
               * 32% of visits, 40% of sales
               * Also powers categories, unshardable
               queries
               * Next time, just use HTTP
               * Up next: custom codecs
               * Avoiding sharding


Search
               * JVM slow start
               * Search deployinator does rolling restart
               * HotSpot and GC cause unpredictable
               throughput
               * Overfetch - ask multiple servers, go with 1st
               response
               * Index size is important. Don’t store too
               much.
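
The overfetch bullet (ask several servers, take the first answer) can be sketched as below. This is a minimal Python illustration using threads; `overfetch` and the demo backends are hypothetical, and the deck's Thrift search client is not shown.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def overfetch(query, backends):
    """Send the same query to every backend; return the first success."""
    if not backends:
        raise ValueError("need at least one backend")
    pool = ThreadPoolExecutor(max_workers=len(backends))
    pending = {pool.submit(backend, query) for backend in backends}
    try:
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                if fut.exception() is None:
                    return fut.result()   # first good answer wins
        raise RuntimeError("all backends failed")
    finally:
        pool.shutdown(wait=False)         # do not wait for the slower replicas

# Demo backends standing in for search slaves.
def _slow(q):
    time.sleep(0.3)                       # e.g. a replica paused in GC
    return ["slow", q]

def _fast(q):
    time.sleep(0.01)
    return ["fast", q]

def _down(q):
    raise IOError("replica down")

winner = overfetch("blue", [_slow, _fast])
fallback = overfetch("blue", [_down, _fast])
```

The trade is extra load on the search tier in exchange for hiding one replica's GC pause or failure from the user.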




Photos
                                                      * 400 million photos
                                                      * Uploaded locally, then
                                                      streamed to S3
                                                      * GraphicsMagick FTW
                                                      * Working set is tiny, served
                                                      out of Squid
                                                      * 2% read failure rate during
                                                      full S3 outage.
                                                      * 0% write failure rate
                                                      during full S3 outage.




JonathanOtis, http://www.etsy.com/listing/96361102/

Technology no longer part of the stack

       * Python Twisted
       * PostgreSQL and stored procedures
       * Scala and MongoDB
       * Clojure and Tokyo Tyrant
       * Rails
       * ActiveMQ
       * RabbitMQ
       * a "Routes" framework
       * building RPMs
       * Lighttpd

Takeaways
       1. A few simple, boring, well known
       components
       2. Extensive instrumentation
       3. Rapid iteration and feedback loops
       4. Human centric
       5. A few tweaks on the classics for scale
       6. Technology supports business goals

Questions?

       More info:
       http://codeascraft.etsy.com
       http://slideshare.net/etsy
       http://github.com/etsy
       http://www.etsy.com/jobs
       kellan@etsy.com


More Related Content

Similar to Architecting for Change: QCONNYC 2012

BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...
BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...
BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...BigDataCloud
 
STP201 Efficiency at Scale - AWS re: Invent 2012
STP201 Efficiency at Scale - AWS re: Invent 2012STP201 Efficiency at Scale - AWS re: Invent 2012
STP201 Efficiency at Scale - AWS re: Invent 2012Amazon Web Services
 
OSDC 2017 | Something Openshift Kubernetes Containers by Kristian Köhntopp
OSDC 2017 | Something Openshift Kubernetes Containers by Kristian KöhntoppOSDC 2017 | Something Openshift Kubernetes Containers by Kristian Köhntopp
OSDC 2017 | Something Openshift Kubernetes Containers by Kristian KöhntoppNETWAYS
 
Camel and JBoss
Camel and JBossCamel and JBoss
Camel and JBossJBug Italy
 
89025069 mike-krieger-instagram-at-the-airbnb-tech-talk-on-scaling-instagram
89025069 mike-krieger-instagram-at-the-airbnb-tech-talk-on-scaling-instagram89025069 mike-krieger-instagram-at-the-airbnb-tech-talk-on-scaling-instagram
89025069 mike-krieger-instagram-at-the-airbnb-tech-talk-on-scaling-instagramferreroroche11
 
Scaling Instagram
Scaling InstagramScaling Instagram
Scaling Instagramiammutex
 
Scaling Rails with memcached
Scaling Rails with memcachedScaling Rails with memcached
Scaling Rails with memcachedelliando dias
 
Engineering Change
Engineering ChangeEngineering Change
Engineering ChangeKellan
 
Multi Master PostgreSQL Cluster on Kubernetes
Multi Master PostgreSQL Cluster on KubernetesMulti Master PostgreSQL Cluster on Kubernetes
Multi Master PostgreSQL Cluster on KubernetesOhyama Masanori
 
Log everything!
Log everything!Log everything!
Log everything!ICANS GmbH
 
Klmug presentation - Simple Analytics with MongoDB
Klmug presentation - Simple Analytics with MongoDBKlmug presentation - Simple Analytics with MongoDB
Klmug presentation - Simple Analytics with MongoDBRoss Affandy
 
Pinterest arch summit august 2012 - scaling pinterest
Pinterest arch summit   august 2012 - scaling pinterestPinterest arch summit   august 2012 - scaling pinterest
Pinterest arch summit august 2012 - scaling pinterestdrewz lin
 
3rd meetup - Intro to Amazon EMR
3rd meetup - Intro to Amazon EMR3rd meetup - Intro to Amazon EMR
3rd meetup - Intro to Amazon EMRFaizan Javed
 
Cost effective BigData Processing on Amazon EC2
Cost effective BigData Processing on Amazon EC2Cost effective BigData Processing on Amazon EC2
Cost effective BigData Processing on Amazon EC2Sujee Maniyam
 
Big Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyBig Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyNati Shalom
 
SQL or NoSQL, that is the question!
SQL or NoSQL, that is the question!SQL or NoSQL, that is the question!
SQL or NoSQL, that is the question!Andraz Tori
 
Applying AI to Performance Engineering: Shift-Left, Shift-Right, Self-Healing
Applying AI to Performance Engineering: Shift-Left, Shift-Right, Self-HealingApplying AI to Performance Engineering: Shift-Left, Shift-Right, Self-Healing
Applying AI to Performance Engineering: Shift-Left, Shift-Right, Self-HealingAndreas Grabner
 

Similar to Architecting for Change: QCONNYC 2012 (20)

BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...
BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...
BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...
 
STP201 Efficiency at Scale - AWS re: Invent 2012
STP201 Efficiency at Scale - AWS re: Invent 2012STP201 Efficiency at Scale - AWS re: Invent 2012
STP201 Efficiency at Scale - AWS re: Invent 2012
 
OSDC 2017 | Something Openshift Kubernetes Containers by Kristian Köhntopp
OSDC 2017 | Something Openshift Kubernetes Containers by Kristian KöhntoppOSDC 2017 | Something Openshift Kubernetes Containers by Kristian Köhntopp
OSDC 2017 | Something Openshift Kubernetes Containers by Kristian Köhntopp
 
Big data nyu
Big data nyuBig data nyu
Big data nyu
 
Camel and JBoss
Camel and JBossCamel and JBoss
Camel and JBoss
 
89025069 mike-krieger-instagram-at-the-airbnb-tech-talk-on-scaling-instagram
89025069 mike-krieger-instagram-at-the-airbnb-tech-talk-on-scaling-instagram89025069 mike-krieger-instagram-at-the-airbnb-tech-talk-on-scaling-instagram
89025069 mike-krieger-instagram-at-the-airbnb-tech-talk-on-scaling-instagram
 
Scaling Instagram
Scaling InstagramScaling Instagram
Scaling Instagram
 
Scaling Rails with memcached
Scaling Rails with memcachedScaling Rails with memcached
Scaling Rails with memcached
 
Engineering Change
Engineering ChangeEngineering Change
Engineering Change
 
Multi Master PostgreSQL Cluster on Kubernetes
Multi Master PostgreSQL Cluster on KubernetesMulti Master PostgreSQL Cluster on Kubernetes
Multi Master PostgreSQL Cluster on Kubernetes
 
Log everything!
Log everything!Log everything!
Log everything!
 
Klmug presentation - Simple Analytics with MongoDB
Klmug presentation - Simple Analytics with MongoDBKlmug presentation - Simple Analytics with MongoDB
Klmug presentation - Simple Analytics with MongoDB
 
Pinterest arch summit august 2012 - scaling pinterest
Pinterest arch summit   august 2012 - scaling pinterestPinterest arch summit   august 2012 - scaling pinterest
Pinterest arch summit august 2012 - scaling pinterest
 
3rd meetup - Intro to Amazon EMR
3rd meetup - Intro to Amazon EMR3rd meetup - Intro to Amazon EMR
3rd meetup - Intro to Amazon EMR
 
Cost effective BigData Processing on Amazon EC2
Cost effective BigData Processing on Amazon EC2Cost effective BigData Processing on Amazon EC2
Cost effective BigData Processing on Amazon EC2
 
KubeSecOps
KubeSecOpsKubeSecOps
KubeSecOps
 
Big Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyBig Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case Study
 
SQL or NoSQL, that is the question!
SQL or NoSQL, that is the question!SQL or NoSQL, that is the question!
SQL or NoSQL, that is the question!
 
Applying AI to Performance Engineering: Shift-Left, Shift-Right, Self-Healing
Applying AI to Performance Engineering: Shift-Left, Shift-Right, Self-HealingApplying AI to Performance Engineering: Shift-Left, Shift-Right, Self-Healing
Applying AI to Performance Engineering: Shift-Left, Shift-Right, Self-Healing
 
April JavaScript Tools
April JavaScript ToolsApril JavaScript Tools
April JavaScript Tools
 

More from Kellan

More women in engineering: Something that ACTUALLY WORKED.
More women in engineering: Something that ACTUALLY WORKED.More women in engineering: Something that ACTUALLY WORKED.
More women in engineering: Something that ACTUALLY WORKED.Kellan
 
Optimizing for change: Taking risks safely & e-commerce
Optimizing for change: Taking risks safely & e-commerceOptimizing for change: Taking risks safely & e-commerce
Optimizing for change: Taking risks safely & e-commerceKellan
 
Optimizing for change: Taking risks safely & e-commerce
Optimizing for change: Taking risks safely & e-commerceOptimizing for change: Taking risks safely & e-commerce
Optimizing for change: Taking risks safely & e-commerceKellan
 
More women in engineering: Something that ACTUALLY WORKED.
More women in engineering: Something that ACTUALLY WORKED.More women in engineering: Something that ACTUALLY WORKED.
More women in engineering: Something that ACTUALLY WORKED.Kellan
 
Future of handmade
Future of handmadeFuture of handmade
Future of handmadeKellan
 
Metrics driven engineering (velocity 2011)
Metrics driven engineering (velocity 2011)Metrics driven engineering (velocity 2011)
Metrics driven engineering (velocity 2011)Kellan
 
Solving the "Brooklyn Problem"
Solving the "Brooklyn Problem" Solving the "Brooklyn Problem"
Solving the "Brooklyn Problem" Kellan
 
Social Software For Robots
Social Software For RobotsSocial Software For Robots
Social Software For RobotsKellan
 
Beyond REST? Building data services with XMPP
Beyond REST? Building data services with XMPPBeyond REST? Building data services with XMPP
Beyond REST? Building data services with XMPPKellan
 
Advanced OAuth Wrangling
Advanced OAuth WranglingAdvanced OAuth Wrangling
Advanced OAuth WranglingKellan
 
Casual Privacy (Ignite Web2.0 Expo)
Casual Privacy (Ignite Web2.0 Expo)Casual Privacy (Ignite Web2.0 Expo)
Casual Privacy (Ignite Web2.0 Expo)Kellan
 

More from Kellan (11)

More women in engineering: Something that ACTUALLY WORKED.
More women in engineering: Something that ACTUALLY WORKED.More women in engineering: Something that ACTUALLY WORKED.
More women in engineering: Something that ACTUALLY WORKED.
 
Optimizing for change: Taking risks safely & e-commerce
Optimizing for change: Taking risks safely & e-commerceOptimizing for change: Taking risks safely & e-commerce
Optimizing for change: Taking risks safely & e-commerce
 
Optimizing for change: Taking risks safely & e-commerce
Optimizing for change: Taking risks safely & e-commerceOptimizing for change: Taking risks safely & e-commerce
Optimizing for change: Taking risks safely & e-commerce
 
More women in engineering: Something that ACTUALLY WORKED.
More women in engineering: Something that ACTUALLY WORKED.More women in engineering: Something that ACTUALLY WORKED.
More women in engineering: Something that ACTUALLY WORKED.
 
Future of handmade
Future of handmadeFuture of handmade
Future of handmade
 
Metrics driven engineering (velocity 2011)
Metrics driven engineering (velocity 2011)Metrics driven engineering (velocity 2011)
Metrics driven engineering (velocity 2011)
 
Solving the "Brooklyn Problem"
Solving the "Brooklyn Problem" Solving the "Brooklyn Problem"
Solving the "Brooklyn Problem"
 
Social Software For Robots
Social Software For RobotsSocial Software For Robots
Social Software For Robots
 
Beyond REST? Building data services with XMPP
Beyond REST? Building data services with XMPPBeyond REST? Building data services with XMPP
Beyond REST? Building data services with XMPP
 
Advanced OAuth Wrangling
Advanced OAuth WranglingAdvanced OAuth Wrangling
Advanced OAuth Wrangling
 
Casual Privacy (Ignite Web2.0 Expo)
Casual Privacy (Ignite Web2.0 Expo)Casual Privacy (Ignite Web2.0 Expo)
Casual Privacy (Ignite Web2.0 Expo)
 

Architecting for Change: QCONNYC 2012

  • 1. Optimized for change: Architecture @ Etsy Kellan Elliott-McCrea @kellan CTO, Etsy Monday, June 18, 12
  • 3. Launched June 18, 2005 875,000 active sellers 33.5MM items for sale $65.9MM in sales, in May 1.4B page views, in May 102 engineers 32 releases, last Friday
  • 4. LAMP any questions? 8BitLit, http://www.etsy.com/listing/90066890/
  • 6. 3 inevitabilities we design for: 1. Things break, unexpectedly 2. What we're building changes 3. We don't get to start over
  • 7. 2 years of change.
  • 8. Architectural Principles * Don't bet against the future. * Our customers are humans. * Simplicity always wins, in the end. * Favor global vs local optimization. * Ambiguity kills momentum. * Make failure cheap. * Technical debt is an inevitable by-product of shipping code. * Optimize for change.
  • 10. Complex systems and change 1. Distributed systems are inherently complex. 2. The outcome of change in complex systems is hard to predict. 3. The outcomes of small, frequent, measurable changes are easier to predict, easier to recover from, and promote learning. Ckrickett, http://www.etsy.com/listing/90611466
  • 11. Continuous deployment, Metrics Driven Development, Blameless Post-Mortems Ckrickett, http://www.etsy.com/listing/90611466
  • 12. Continuous deployment: Small, frequent changes to production Ckrickett, http://www.etsy.com/listing/90611466
  • 13. Continuous Deployment: No branching. “All existing revision control systems were built by people who build installed software” - Paul Hammond, Always Ship Trunk, Velocity 2010
  • 14. Continuous Deployment: feature flags
        if ($cfg['awesome_new_search']) {
            # new hotness
            $rsp = do_solr();
        } else {
            # boring old stuff
            $rsp = do_grep();
        }
  • 15. Continuous Deployment: Ramp-ups (on top of feature flags) 1. Launch to staff only 2. Launch to 1% of all users 3. Launch to members of a beta group
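
The ramp-up stages on slide 15 could be layered on a feature flag roughly as follows. This is a minimal sketch, not Etsy's implementation: the config keys (`enabled`, `staff`, `percent`, `groups`), the user array shape, and both function names are assumptions made for illustration.

```php
<?php
// Hypothetical ramp-up check on top of a feature flag.
// Config keys and function names are illustrative assumptions.

/** Map a user id to a stable bucket in [0, 100) for one feature. */
function rampup_bucket(string $feature, int $userId): int
{
    // Mask to 31 bits so the result is non-negative on every platform;
    // salting with the feature name decorrelates buckets across features.
    return (crc32($feature . ':' . $userId) & 0x7fffffff) % 100;
}

/** Decide whether $user sees $feature under the ramp-up rules in $cfg. */
function feature_enabled(array $cfg, string $feature, array $user): bool
{
    $flag = $cfg[$feature] ?? null;
    if ($flag === null || empty($flag['enabled'])) {
        return false;                 // flag missing or off: nobody sees it
    }
    if (!empty($flag['staff']) && !empty($user['is_staff'])) {
        return true;                  // staff-only stage
    }
    if (array_intersect($flag['groups'] ?? [], $user['groups'] ?? [])) {
        return true;                  // member of an enabled beta group
    }
    // Percentage stage: the same user always lands in the same bucket,
    // so raising 'percent' only ever adds users.
    return rampup_bucket($feature, $user['id']) < ($flag['percent'] ?? 0);
}
```

Because the bucket is derived from the user id rather than drawn at random per request, a user who sees the feature at 1% keeps seeing it as the percentage grows.
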
  • 16. Continuous Deployment: any engineer can launch a feature to 1% of users
  • 17. Continuous Deployment: ~200 experiments live right now
  • 18. Metrics driven development: introspection isn’t optional. measure everything, log everything
  • 19. Metrics driven development: Metrics happen when you make it easy. And visible.
  • 20. Metrics driven development: Teach computer to read graphs holtWintersConfidence(Upper|Lower)
  • 21. Metrics driven development: More info: http://www.slideshare.net/mikebrittain/metricsdriven-engineering
  • 22. Optimize for MTTR, not MTBF
  • 25. Etsy EMR/S3 PCI BCP, Cold
  • 26. [Architecture diagram. Inbound requests reach Etsy through CDNs (diversified at the DNS level) and Internet providers (diversified at borders), then Etsy network appliances. Labels shown: etsy.com, api.etsy.com, /atlas, etsystatic.com, bcn.etsy.com; apache, PHP application, Squid, imstor, NFS, memcache, async http, StatsD, sqlite, gearman, logrotate, logs, mail out (SMTP); MySQL (dbindex, dbshards, dbaux, dbdata; sharded MySQL) and PCI hardware; search via Thrift, Jetty, Solr master and Solr slaves; analytics on AWS with EMR, S3 (photos, logs), JRuby/Cascading, HDFS, HBase; X-Yarnblaster datasets via jsonp, no privileged access.]
  • 27. CDNs: Put a slider on it Just works via weighted DNS
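
Weighted DNS is a feature of the DNS provider, so there is no application code behind the slider on slide 27. The sketch below only illustrates the arithmetic a weighted choice performs; the function name and the CDN labels `cdn_a` and `cdn_b` are hypothetical.

```php
<?php
// Illustration of weighted selection, the idea behind a weighted-DNS "slider".
// pick_weighted() and the CDN labels are hypothetical.

/**
 * Pick a key from $weights (name => non-negative weight).
 * $r is a number in [0, 1]; passing it in keeps the function deterministic.
 */
function pick_weighted(array $weights, float $r): string
{
    $total = array_sum($weights);
    if ($total <= 0) {
        throw new InvalidArgumentException('weights must sum to a positive number');
    }
    $threshold = $r * $total;
    $running = 0.0;
    foreach ($weights as $name => $weight) {
        $running += $weight;
        if ($threshold < $running) {
            return (string) $name;    // first entry whose cumulative weight exceeds the threshold
        }
    }
    // Only reached when $r is 1.0 (or by float rounding): fall back to the last entry.
    return (string) array_key_last($weights);
}
```

Moving the slider amounts to changing the weights, for example from `['cdn_a' => 70, 'cdn_b' => 30]` to `['cdn_a' => 0, 'cdn_b' => 100]` to drain one provider; in application code `$r` would come from something like `mt_rand() / mt_getrandmax()`.
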
  • 28. Apache * Well known * PHP is native * apache_note * fast start time * cheap in place replacement * .htaccess * Challenge: memory usage
  • 29. Apache: apache_note apache_note('etsy_uaid', $id); introspection through the life cycle. Additive! Insanely useful!
  • 30. Apache: log format LogFormat "%{X-Forwarded-For}i %{True-Client-IP}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %{etsy_shop_id}n %{etsy_uaid}n %V %{etsy_ab_selections}n %{etsy_request_uuid}n %{etsy_api_consumer_key}n %{etsy_api_method_name}n %{php_memory_usage_bytes}n %{php_time_microsec}n %D" combined
  • 31. Etsy: the App * 487,000 lines of PHP * 214,000 lines of Javascript * Monolithic codebase * 3 front ends, Etsy.com, API, Atlas
  • 32. Etsy: the App * routing handled by Apache * scripts fronting OO PHP5 * PHP, fast by default * opcode caching * Challenge: liveliness when calling services
  • 33. Etsy: coding patterns * light weight, home rolled “framework” * ORM handles DAO across backends * config and feature flags systems used everywhere * small slow moving datasets stored as PHP arrays * A/B tests * Smarty * StatsD * Concurrency * memcache
  • 34. Etsy: A/B tests * beaconed * inserted into logs via apache_note * conditionalized on feature flags * nightly reports on conversion, bounce rate, etc * nightly reports on page speed, memory usage, etc
  • 35. Etsy: Smarty * pre-compiled * pre-compiled per language
  • 36. Etsy: StatsD StatsD::increment("logins.success"); StatsD::timing("gearman.time", $msec); * 340,000 application metrics
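
What calls like `StatsD::increment` and `StatsD::timing` put on the wire can be sketched with plain functions. This is an illustration under stated assumptions, not Etsy's client: the function names are hypothetical, and the host and port defaults (`127.0.0.1`, `8125`) are assumed. The datagram format follows the StatsD line protocol, `<metric>:<value>|<type>`, with an optional `|@<rate>` suffix for sampled counters.

```php
<?php
// Hypothetical helpers showing the StatsD line protocol; not Etsy's client.

/** Format a counter datagram, e.g. "logins.success:1|c". */
function statsd_counter(string $metric, int $delta = 1, float $sampleRate = 1.0): string
{
    $line = sprintf('%s:%d|c', $metric, $delta);
    // Sampled counters carry the rate so the server can scale them back up.
    return $sampleRate < 1.0 ? $line . '|@' . $sampleRate : $line;
}

/** Format a timer datagram in milliseconds, e.g. "gearman.time:42|ms". */
function statsd_timing(string $metric, int $msec): string
{
    return sprintf('%s:%d|ms', $metric, $msec);
}

/** Fire-and-forget send over UDP; returns false instead of raising on failure. */
function statsd_send(string $line, string $host = '127.0.0.1', int $port = 8125): bool
{
    $sock = @fsockopen('udp://' . $host, $port, $errno, $errstr, 0.1);
    if ($sock === false) {
        return false;                 // metrics must never break the page
    }
    $ok = @fwrite($sock, $line) !== false;
    fclose($sock);
    return $ok;
}
```

UDP keeps instrumentation cheap: the caller does not wait for an acknowledgement, which is part of what makes it practical to measure everything.
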
  • 37. Etsy: Concurrency * no native concurrency in PHP * asynchronous HTTP calls * Gearman
  • 38. Etsy: Async HTTP calls * curl_multi_exec * non-blocking, per request time outs * used for optional aspects of a page * curl against http://localhost to avoid network overhead
  • 39. Etsy: Gearman * language agnostic job server * don’t use an MQ when you want a job server * 150 job types * persistent jobs flushed to MySQL, read from memory * non-persistent jobs just stored in memory * NP queue is wicked fast.
  • 40. Etsy: Gearman * scaling CPU of cron jobs * denormalizing data * pushing to 3rd party services
  • 41. Etsy: Challenges * Apache memory usage * liveliness talking to services, no concurrency, blocking by default
  • 42. Etsy: graph of distributed failure
  • 43. Etsy: Challenges * Apache memory usage * liveliness talking to services: no concurrency, blocking by default Enforce liveliness with a judicious application of force
  • 44. Etsy: judicious application of force
        // /proc/self/statm reports sizes in pages: total, resident, shared, ...
        list($v, $res, $shar) = explode(' ', @file_get_contents('/proc/self/statm'));
        $mine = $res - $shar;   // pages private to this process
        if ($mine > $cfg['sizelimit']) {
            $pid = getmypid();
            @exec("kill -USR1 $pid");
        }
  • 45. Etsy: judicious application of force Bowhunter * Find long running PHP processes * Try to avoid those mid-post open(APACHE, "/usr/bin/curl -s http://localhost/server-status|") || die "$!";
  • 46. Etsy: judicious application of force Query_killer * Same idea, long running queries * MySQL “SHOW PROCESSLIST();”
  • 47. Memcache * Caching, obviously * Cache invalidation is hard * Write buffering * multi_get * rate limits
  • 48. Memcache * atomic INCR is awesome * slice your time windows to reduce risk of cache eviction * we’ve been unlucky, lots of segfaults :(
 * multi_get slows down the more boxes in the pool
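
One reading of "slice your time windows" on slide 48 is to keep a rate-limit counter per small slice of time and sum the recent slices, so that an evicted key loses only one slice of history rather than the whole window. The sketch below follows that interpretation, which is an assumption. `CounterStore` is a hypothetical in-memory stand-in for a cache with atomic increment and multi-get; the key layout and function names are also illustrative.

```php
<?php
// Sketch of a sliced-window rate limiter. CounterStore stands in for memcache;
// class, function names, and the "rl:<who>:<slice>" key layout are assumptions.

final class CounterStore
{
    private array $counts = [];

    /** Stand-in for an atomic INCR that creates missing keys at 1. */
    public function incr(string $key): int
    {
        $this->counts[$key] = ($this->counts[$key] ?? 0) + 1;
        return $this->counts[$key];
    }

    /** Stand-in for multi_get: returns only the keys that exist. */
    public function getMulti(array $keys): array
    {
        return array_intersect_key($this->counts, array_flip($keys));
    }

    /** Simulate the cache evicting one key. */
    public function evict(string $key): void
    {
        unset($this->counts[$key]);
    }
}

/** Keys for the current slice and the ($slices - 1) slices before it. */
function slice_keys(string $who, int $now, int $sliceSecs, int $slices): array
{
    $current = intdiv($now, $sliceSecs);
    $keys = [];
    for ($i = 0; $i < $slices; $i++) {
        $keys[] = sprintf('rl:%s:%d', $who, $current - $i);
    }
    return $keys;
}

/** Count this request, then allow it if the window total is within $limit. */
function allow_request(CounterStore $store, string $who, int $limit, int $now,
                       int $sliceSecs = 60, int $slices = 60): bool
{
    $keys = slice_keys($who, $now, $sliceSecs, $slices);
    $store->incr($keys[0]);                         // one atomic increment per request
    $used = array_sum($store->getMulti($keys));     // one multi-get to total the window
    return $used <= $limit;
}
```

Note that in this sketch a denied request still counts against the limit; whether that is desirable depends on the endpoint being protected.
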
  • 49. MySQL: By the numbers * 25K+ queries/sec avg * 3TB InnoDB buffer pool * 15TB + data stored * 50 servers * 99.99% queries under 1ms
  • 50. MySQL: a NotMuchSQL server * no joins * no foreign keys * no transactions or locks * no sub-selects * store data like you want to read it. * also: no auto_increment
  • 51. MySQL: a NotMuchSQL server “Normalization is for sissies.” - Cal Henderson, Flickr
  • 52. MySQL: scale horizontally * objects sharded by key * lookups maintained in dbindex (MySQL is a FAST key-value store) * avoid key hashing, range partitions, and partitioning functions more: http://www.slideshare.net/jgoulah/the-etsy-shard-architecture-starts-with-s-and-ends-with-hard
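
The index-lookup approach on slide 52 can be illustrated with a plain array standing in for the dbindex table. The sketch assumes a user-id-to-shard mapping and a least-loaded placement rule for new objects; those details, the function names, and the shard names are hypothetical rather than taken from the deck.

```php
<?php
// Sketch of lookup-based sharding. $index stands in for a dbindex table
// (user_id => shard name); names and the placement rule are assumptions.

/** Place a new object on the least-loaded shard and record it in the index. */
function assign_shard(array &$index, int $userId, array $shards): string
{
    $load = array_fill_keys($shards, 0);
    foreach ($index as $shard) {
        if (isset($load[$shard])) {
            $load[$shard]++;
        }
    }
    $best = null;
    foreach ($load as $shard => $count) {
        if ($best === null || $count < $load[$best]) {
            $best = $shard;           // ties go to the shard listed first
        }
    }
    $index[$userId] = $best;
    return $best;
}

/** Resolve an object's shard with a key-value lookup; null if unknown. */
function lookup_shard(array $index, int $userId): ?string
{
    return $index[$userId] ?? null;
}

/** Moving an object is an index update, not a change to a hashing scheme. */
function move_object(array &$index, int $userId, string $toShard): void
{
    $index[$userId] = $toShard;
}
```

Because placement is stored rather than computed, adding a shard or rebalancing a hot one touches only the affected index rows; a hash or range function would instead remap many keys at once.
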
  • 53. MySQL: Master-Master * objects hashed to a side, avoid split brain * allows in place schema upgrades without slave promotion * simplified capacity planning more: http://codeascraft.etsy.com/2012/04/20/two-sides-for-salvation/
  • 54. MySQL: Introspection web0038 : [Mon Jun 18 09:58:38 2012] [error] [client 10.101.1.12] [C6kds9y1MVptEDMoOe5KCYha9VWl] [error] [ORM_LONG_QUERY] [/var/etsy/current/phplib/EtsyORM/Query/RawSql.php:752] [15877310] Query exceeded 10 seconds: long_query_time=83.0927 long_query_string='/* [etsy_shard_005_A] [/remove_favorite_listing.php] */ DELETE FROM `users_favoritelistings` WHERE `user_id` = ? AND `listing_id` = ?' long_query_trace='#10 __construct() /EtsyModel/UserFavoriteListingMirror.php:310 #4 delete() /EtsyModel/UserFavoriteListing.php:39 #3 delete() /EtsyModel/User.php:1840 #2 unfavoriteListing() /Controller/Favorites.php:344 #1 removeFavoriteListingRecord() /Controller/Favorites.php:94 #0 performRemoveFavoriteListing() /var/etsy/current/htdocs/remove_favorite_listing.php:9', referer: http://www.etsy.com/people/kellanem/favorites?page=5 SQL Comments are awesome!
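
The log line on slide 54 shows each query prefixed with a comment naming the shard and the originating script. A helper producing that prefix might look like the sketch below; `annotate_query` is a hypothetical name, and only the comment layout is taken from the log line.

```php
<?php
// Hypothetical helper that tags a query with its shard and calling script,
// mirroring the "/* [shard] [script] */" prefix visible in the slide's log line.

function annotate_query(string $sql, string $shard, string $script): string
{
    // Remove comment delimiters from the labels so they cannot end the
    // comment early and leak into the statement.
    $clean = static fn (string $s): string => str_replace(['*/', '/*'], '', $s);
    return sprintf('/* [%s] [%s] */ %s', $clean($shard), $clean($script), $sql);
}
```

Since the comment travels with the statement, it shows up wherever the query text does, including the MySQL process list and slow-query output, which ties a slow query back to the page that issued it.
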
  • 55. MySQL: Deletes are expensive * update objects to state=‘deleted’ * use partitions * truncatenator - on ext3, hard link file, move, delete slowly.
  • 56. Anatomy of a feature: Shop Stats
  • 57. Anatomy of a feature: Shop Stats “Never get into a land war in Asia, and never build an analytics tool on top of MySQL.”
  • 58. Anatomy of a feature: Shop Stats * buffer writes in Memcache using predictable keys * flush to MySQL tables periodically via cron * bake old data into all possible date ranges, and archive to S3 * truncate tables
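
The write-buffering steps on slide 58 depend on keys the flush job can recompute without scanning the cache. A minimal sketch of that idea follows, with several assumptions: `StatsBuffer` is an in-memory stand-in for memcache, the `shopstats:<shop>:<hour>` key layout and hourly bucketing are invented for illustration, and the flush returns rows instead of writing to MySQL.

```php
<?php
// Sketch of buffering counters under predictable keys and flushing them in bulk.
// StatsBuffer stands in for memcache; key layout and bucketing are assumptions.

final class StatsBuffer
{
    private array $data = [];

    public function incr(string $key, int $by = 1): void
    {
        $this->data[$key] = ($this->data[$key] ?? 0) + $by;
    }

    /** Read a counter and clear it, returning 0 when the key is absent. */
    public function take(string $key): int
    {
        $value = $this->data[$key] ?? 0;
        unset($this->data[$key]);
        return $value;
    }
}

/** The key is fully determined by shop and hour, so a cron job can rebuild it. */
function stats_key(int $shopId, string $hourBucket): string
{
    return "shopstats:$shopId:$hourBucket";
}

/** Hot path: one cheap increment per page view, no database write. */
function record_view(StatsBuffer $buf, int $shopId, int $timestamp): void
{
    $buf->incr(stats_key($shopId, gmdate('YmdH', $timestamp)));
}

/** Cron path: drain one hour's counters into rows destined for MySQL. */
function flush_bucket(StatsBuffer $buf, array $shopIds, string $hourBucket): array
{
    $rows = [];
    foreach ($shopIds as $shopId) {
        $views = $buf->take(stats_key($shopId, $hourBucket));
        if ($views > 0) {
            $rows[] = ['shop_id' => $shopId, 'bucket' => $hourBucket, 'views' => $views];
        }
    }
    return $rows;
}
```

A real flush would also need to tolerate evicted keys and increments arriving between the read and the clear, which this stand-in does not model.
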
  • 60. bcn.etsy.com: beaconed event stream * Server-side and javascript event stream * At least one per page view * Apache serving static assets * Aggregated on HDFS via logrotate * Archived on S3 * Analyzed via JRuby/Cascading on Hadoop * Doesn’t use: Flume, Scribe, etc
  • 61. bcn.etsy.com: beaconed event stream {"event_guid":"c2ffb51808b.6d2be52959ef","user_id":8528531,"php_event_name":"s2","php_unique_id":"4fdf1cb5d5c078.37523961","php_event_date":"18/Jun/2012:08:19:01","locale_currency_code":"USD","pref_language":"en-US","region":"US","detected_region":"US","accept-languages":"en-US,en","isMobileDevice":"0","isMobileSupported":"0","isTabletSupported":"0","isTouch":"0","isEtsyApp":"0","listing_ids":[60274277,101504389,98682771,88585080],"cids":[14103953,14239293,14247717,14209614],"query":"blue","keywords":["blue","blue","blue","blue"],"position":1,"replay_number":1,"s2_cached":1,"php_ab_test_names":"orm_record_instance_caching;mobile_detector.all_blackberry;multilang_shops_listings.view;ga_replacement_cookie;disable_search_autosuggest;admin_toolbar;translations.live_translations;ab_analytics_test;search_type_experiment;search_ads.max_replays_less;search_diversity_experiment;search_cached_listing_cards;placefinder.cache_memcached_migration;search_stream_a;search_all_items_ignores_supplies;search_default_type;search.two_cluster_deploy;search_parameter_sample;thrift_category2_transform;search.similar_listing_browse_page;orm_replicant_safe_find_many;bottom_first;foreign_language_carousel;search.related_searches_all_items;weddings.srp_promos;search_log_page_position;newrelic;clientlog;google_analytics_async;personalized_endpoint;search_no_dropdown;community_nav_popout;security_settings;search_changes_tooltip;inline_listing_hearts;framelogger;log_normal;analytics_second_beacon;analytics_second_beacon_privileged;analytics_second_beacon_mobile","php_ab_var_names":"1;1;1;1;control;1;0;A;ponycorn_v3;1;threshold_off;1;1;1;0;all_sans_supplies;0;1;1;1;1;0;top;0;0;1;0;1;0;1;1;1;0;1;1;1;0;1;0;1","php_ab_selector_names":"
  • 62. Search [Diagram: Search Master, Search Slave01 through SlaveNN, Web01 through WebNN] * BitTorrent to distribute indexes * Thrift, with server affinity to improve cache hit ratio, just returns ids * 100% of all indexes on each slave * incremental index, every 7 minutes, avoid even numbered cron times * hydrate IDs via multi-get, ignore a few failures * pull via cron, push via gearman * denormalized listing store, databases and memcache * transition from MySQL to HBase, not user facing
  • 63. Search * Solr trunk * Custom ranking via crunched datasets * BitSet fields for personalized search * Scaling the JVM * 32% of visits, 40% of sales * Also powers categories, unshardable queries * Next time, just use HTTP * Up next: custom codecs * Avoiding sharding
  • 64. Search * JVM slow start * Search deployinator does rolling restart * HotSpot and GC cause unpredictable throughput * Overfetch - ask multiple servers, go with 1st response * Index size is important. Don’t store too much.
  • 65. Photos * 400 million photos * Uploaded locally, then streamed to S3 * GraphicsMagick FTW * Working set is tiny, served out of Squid * 2% read failure rate during full S3 outage. * 0% write failure rate during full S3 outage. JonathanOtis, http://www.etsy.com/listing/96361102/
  • 66. Technology no longer part of the stack * Python Twisted * PostgreSQL and stored procedures * Scala and MongoDB * Clojure and Tokyo Tyrant * Rails * ActiveMQ * RabbitMQ * a "Routes" framework * building RPMs * Lighttpd
  • 67. Take aways 1. A few simple, boring, well known components 2. Extensive instrumentation 3. Rapid iteration and feedback loops 4. Human centric 5. A few tweaks on the classics for scale 6. Technology supports business goals
  • 68. Questions? More info: http://codeascraft.etsy.com http://slideshare.net/etsy http://github.com/etsy http://www.etsy.com/jobs kellan@etsy.com