How Flipkart Scales PHP
Tips & Tricks
About Flipkart


• Flipkart.com
• India’s largest eCommerce website*
• Millions of requests/day
About Me


• Siddhartha Reddy
• Tech Lead at Flipkart
• Taking care of Search & Browse experience
Today



• Flipkart.com’s Architecture
   • How each component addresses scale
Flipkart.com’s Architecture

[Diagram: Load Balancer → Web Server(s) → Database, Session Store, CMS, Search, Caches, OMS]
What is Scaling?


• Handle large amount of traffic
• High availability
• Good response times
How Flipkart Scales
• Logging and measuring
• Horizontal Scaling
   • Load-balancing
• High Availability setup
• Web-server optimizations
• Caching
Logging & Measuring
Traffic
Servers: Chuck Norris Style
Servers: Spartan Style
Horizontal Scaling

[Diagram: Traffic → Load-balancer → web-1, web-2, ..., web-n]
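A minimal sketch of the idea in the diagram: a load-balancer hands each request to the next web server in rotation and skips servers marked as down. The class and method names are hypothetical, and round-robin is an assumed policy; the slides do not say which balancing policy is used.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in rotation, skipping ones marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        self.down.add(backend)

    def mark_up(self, backend):
        self.down.discard(backend)

    def pick(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            backend = next(self._cycle)
            if backend not in self.down:
                return backend
        raise RuntimeError("no healthy backend available")


lb = RoundRobinBalancer(["web-1", "web-2", "web-3"])
first_round = [lb.pick() for _ in range(3)]
lb.mark_down("web-2")  # e.g. an availability check failed
second_round = [lb.pick() for _ in range(4)]
```

Adding capacity then amounts to adding another name to the backend list.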
Load Balancer
• Internal binary services (Apache Thrift)
    • Software: HAProxy
    • Transport layer (TCP) load balancer
    • TCP connect for availability check
• Internal web services (HTTP)
    • Software: Varnish
    • Application Layer (HTTP) load balancer
    • Configurable/scriptable caching
    • HTTP GET for availability check
• External web traffic (HTTP/HTTPS)
    • Software: Nginx
    • Application Layer (HTTP/HTTPS) load balancer
    • Caches and serves static content (JS/images)
    • HTTP GET for availability check
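The two kinds of availability check named above differ in what they prove. A rough sketch, using only the Python standard library (the function names and the "2xx means healthy" rule are assumptions, not the actual HAProxy, Varnish or Nginx behaviour):

```python
import http.client
import socket


def tcp_check(host, port, timeout=1.0):
    """Transport-layer check: healthy if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(host, port, path="/", timeout=1.0):
    """Application-layer check: healthy if an HTTP GET returns a 2xx status."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return 200 <= conn.getresponse().status < 300
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

A TCP connect only shows that something is listening on the port, which is all a balancer can know about a binary Thrift service without speaking its protocol. An HTTP GET also shows that the application can answer a request.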
Load Balancer: Session Affinity?

[Diagram: two requests, (1) and (2), carry the same session ID (SID: 123) to the load-balancer, which forwards them to different web servers among web-1, web-2, ..., web-n]
HTTP Load Balancing: Session Affinity

• (+) Sessions can be stored on the web server

• (+) Different code on different web servers

    • Useful for testing new code on a subset of users

• (-) If any web server goes down, need to migrate sessions

• (-) Not easy to take a web server out-of-rotation
HTTP Load Balancer: Session Non-affinity
• Sessions stored in a common session-store
• (+) Easy to manage -- all web servers have identical state
• (+) If a web server goes down -- no problem
• (+) Can take any web server out-of-rotation
• (-) Can’t deploy to a subset of servers for testing
• (-) Central session-store could be slower
HTTP Load Balancer: Session Some-affinity


• Sessions sticky to a subset (group) of machines
• Use a common session-store
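One way to picture "some-affinity": hash the session ID to pick a group of machines, then rotate among the servers inside that group. This is only a sketch; the hashing scheme, class name and group layout are assumptions, and session data is assumed to live in the common session-store rather than on any web server.

```python
import hashlib
import itertools


class GroupAffinityRouter:
    """Pins each session to a group of servers, not to a single server."""

    def __init__(self, groups):
        self.groups = [list(group) for group in groups]
        self._cycles = [itertools.cycle(group) for group in self.groups]

    def group_index(self, session_id):
        # A stable hash, so every balancer maps a given SID to the same group.
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.groups)

    def route(self, session_id):
        # Any server in the group will do, because sessions are stored centrally.
        return next(self._cycles[self.group_index(session_id)])


router = GroupAffinityRouter([["web-1", "web-2"], ["web-3", "web-4"]])
servers_for_sid_123 = [router.route("123") for _ in range(4)]
```

With this layout, new code can be deployed to one group to test it on a subset of users, and a single server can still leave rotation without losing sessions.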
Load Balancer: Hardware?

• Expensive

• Inflexible to manage

• They run Linux anyway!
Flipkart’s Load-Balancers

[Diagram: two load-balancer machines, with a Heartbeat running between their bond:0 interfaces]
• Active: bond:1 (10.3.1.1), bond:0 (10.3.0.1), bond:0:0 (10.3.0.0)
• Hot Stand-by: bond:1 (10.3.1.2), bond:0 (10.3.0.2)
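A simulation-only sketch of the stand-by side of such a pair. It assumes that bond:0:0 (10.3.0.0) is a shared address that moves to the stand-by when heartbeats stop; the slides show the addresses but not the takeover rule, and the class name and timeout are hypothetical.

```python
class StandbyMonitor:
    """Hot stand-by side of a heartbeat pair (simulation only)."""

    def __init__(self, virtual_ip, timeout_seconds):
        self.virtual_ip = virtual_ip
        self.timeout_seconds = timeout_seconds
        self.last_heartbeat = None
        self.holds_virtual_ip = False

    def on_heartbeat(self, now):
        self.last_heartbeat = now

    def tick(self, now):
        """Take over the shared address once the active node has been silent too long."""
        if self.holds_virtual_ip or self.last_heartbeat is None:
            return
        if now - self.last_heartbeat > self.timeout_seconds:
            # A real setup would bring up the address alias here;
            # this sketch only records the decision.
            self.holds_virtual_ip = True


standby = StandbyMonitor(virtual_ip="10.3.0.0", timeout_seconds=3)
standby.on_heartbeat(now=0)
standby.tick(now=2)  # last heartbeat 2s ago: stay passive
passive_at_2 = standby.holds_virtual_ip
standby.tick(now=5)  # silent for 5s, longer than the 3s timeout: take over
active_at_5 = standby.holds_virtual_ip
```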
On the Web Servers
• PHP5
• Apache/mod_php
• Apache/mod_fcgid (FastCGI)
• PHP-FPM
    • PHP FastCGI Process Manager
    • Great process management
    • Security
    • Configurability & control
    • Adaptive process spawning (5.3.3RC1+)
On the Web Server: fk-w3-agent

• Simple Java “middleware” daemon
• Deployed on each web server
• Communicates with PHP through local socket
• Hosts pluggable “handlers”
fk-w3-agent: Why?
• “I wish PHP could ...”
• Logging
• Connection pools
• Parallelization
• Config & discovery
• Running statistics
fk-w3-agent: Logging
• We like to log, a lot
• Logs need to be flushed to disk
• PHP resources wasted
    • waiting for disk I/O
    • waiting for lock on log files

    • or logs go missing
fk-w3-agent: Logging


• PHP processes send logs to fk-w3-agent
• fk-w3-agent writes logs to files
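The hand-off can be pictured with an in-process queue: request handlers enqueue log lines and return at once, while a single background writer owns the file. This is a sketch of the pattern only. The real fk-w3-agent is a separate Java daemon reached over a local socket, and the class name here is hypothetical.

```python
import queue
import threading


class AsyncLogWriter:
    """Accepts log lines without blocking; one thread writes them to disk."""

    def __init__(self, path):
        self._path = path
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def log(self, line):
        # Returns immediately; the caller never waits for disk I/O.
        self._queue.put(line)

    def close(self):
        self._queue.put(None)  # sentinel: stop after draining earlier lines
        self._thread.join()

    def _drain(self):
        # A single writer, so request handlers never contend for a file lock.
        with open(self._path, "a", encoding="utf-8") as log_file:
            while True:
                line = self._queue.get()
                if line is None:
                    break
                log_file.write(line + "\n")
```

Usage is `writer = AsyncLogWriter(path)`, then `writer.log("...")` from request code, and `writer.close()` on shutdown so queued lines are not lost.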
fk-w3-agent: Connection Pooling

• Many service calls per request
• Cost of recreating connections is high
• PHP does not support connection pools*
fk-w3-agent: Connection Pooling

• fk-w3-agent maintains persistent connection pools to various services
• PHP accesses services by communicating with fk-w3-agent
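A minimal connection-pool sketch, assuming a long-lived process that opens a fixed number of connections once and lends them out per call. The `connect` factory and the fake connection objects are placeholders; the agent's real pool implementation is not described in the slides.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Opens `size` connections up front and reuses them."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # waits if all are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back instead of closing it


created = []


def fake_connect():
    # Stand-in for an expensive connection setup to a backend service.
    conn = object()
    created.append(conn)
    return conn


pool = ConnectionPool(fake_connect, size=2)
for _ in range(10):
    with pool.connection() as conn:
        pass  # a service call would go here
```

Ten calls reuse the same two connections, which is the saving the slide is after. A production pool would also need to detect and replace broken connections.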
fk-w3-agent: Parallelization


• Service calls can be parallelized
• Multi-threading in PHP is ...
fk-w3-agent: Parallelization

• PHP sends a request to fk-w3-agent
• fk-w3-agent uses multiple threads to hit services in parallel
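The fan-out step can be sketched with a thread pool: independent service calls are submitted together, so the total wait is roughly that of the slowest call rather than the sum of all of them. The service names come from the architecture slide; the fake calls, delays and return values are invented for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_services_in_parallel(calls, max_workers=8):
    """calls: dict of name -> zero-argument callable. Returns name -> result."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {name: executor.submit(fn) for name, fn in calls.items()}
        return {name: future.result() for name, future in futures.items()}


def fake_service(value, delay=0.3):
    # Stand-in for a network call that takes `delay` seconds.
    def call():
        time.sleep(delay)
        return value
    return call


started = time.monotonic()
results = call_services_in_parallel({
    "search": fake_service("hits"),
    "cms": fake_service("banner"),
    "oms": fake_service("stock"),
})
elapsed = time.monotonic() - started  # close to 0.3s, not 0.9s
```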
fk-w3-agent: Config & Discovery

• Change config without deployment?
• Simple: store config in a database
• But: reloading config on each request is expensive
fk-w3-agent: Config & Discovery

• fk-w3-agent loads config from database
   • Caches it in memory
   • Keeps it up-to-date
• PHP gets config from fk-w3-agent
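A sketch of the caching idea: load the config once, answer every lookup from memory, and swap in a fresh copy when told to refresh. How fk-w3-agent actually keeps its copy up-to-date (polling, notifications) is not stated in the slides, so `refresh()` is simply called by hand here, and the config key is made up.

```python
import threading


class ConfigCache:
    """Serves config from memory; touches the database only on refresh."""

    def __init__(self, load_from_database):
        self._load = load_from_database
        self._lock = threading.Lock()
        self._config = self._load()

    def get(self, key, default=None):
        with self._lock:
            return self._config.get(key, default)

    def refresh(self):
        fresh = self._load()  # the slow database read happens outside the lock
        with self._lock:
            self._config = fresh


database = {"search.timeout_ms": 200}  # stand-in for the config database
loads = []


def load_from_database():
    loads.append(1)
    return dict(database)


cache = ConfigCache(load_from_database)
before = [cache.get("search.timeout_ms") for _ in range(1000)]
database["search.timeout_ms"] = 350   # config changed, no deployment
stale = cache.get("search.timeout_ms")
cache.refresh()
after = cache.get("search.timeout_ms")
```

A thousand lookups cost one database read, and a config change becomes visible after the next refresh.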
fk-w3-agent: Running Statistics

• PHP sends basic stats to fk-w3-agent for every request
• fk-w3-agent aggregates these stats
• Used for monitoring
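A sketch of per-request stats aggregation. The slides do not say which stats are sent, so this assumes a request path and a duration in milliseconds, and keeps a count, mean and maximum for each path; all names are hypothetical.

```python
import threading
from collections import defaultdict


class RunningStats:
    """Aggregates per-request timings into per-endpoint summaries."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = defaultdict(int)
        self._total_ms = defaultdict(float)
        self._max_ms = defaultdict(float)

    def record(self, endpoint, duration_ms):
        with self._lock:
            self._count[endpoint] += 1
            self._total_ms[endpoint] += duration_ms
            self._max_ms[endpoint] = max(self._max_ms[endpoint], duration_ms)

    def snapshot(self):
        with self._lock:
            return {
                endpoint: {
                    "count": count,
                    "mean_ms": self._total_ms[endpoint] / count,
                    "max_ms": self._max_ms[endpoint],
                }
                for endpoint, count in self._count.items()
            }


stats = RunningStats()
for duration_ms in (100, 200, 300):
    stats.record("/search", duration_ms)
stats.record("/product", 50)
summary = stats.snapshot()
```

A monitoring system can then poll `snapshot()` instead of parsing every request log.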
To Cache or not to Cache?
• “There are only two hard things in Computer Science: cache invalidation and naming things.” -- Tim Bray quoting Phil Karlton
• “There are only two hard problems in Computer Science: cache invalidation, naming things and off-by-1 errors.”
Cache Invalidation

• Objects updated => Notification Message
• Remove updated objects from cache

• Replace updated objects in cache
• Aggressively populate cold cache
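The three moves above (remove on update, replace on update, warm a cold cache) can be sketched against a plain dictionary. The notification transport and the real cache backends are left out; `load_object` and the product keys are hypothetical stand-ins for a read from the source of truth.

```python
class ObjectCache:
    """Toy cache showing remove-on-update, replace-on-update and warming."""

    def __init__(self, load_object):
        self._load = load_object
        self._entries = {}

    def get(self, key):
        if key not in self._entries:
            self._entries[key] = self._load(key)  # cache miss: go to the source
        return self._entries[key]

    def on_update_notification(self, key, replace=False):
        if replace:
            self._entries[key] = self._load(key)  # replace with the new value
        else:
            self._entries.pop(key, None)  # remove; the next get() reloads it

    def warm(self, keys):
        # Populate a cold cache up front instead of waiting for misses.
        for key in keys:
            self._entries[key] = self._load(key)


source = {"book-1": 250, "book-2": 400}  # stand-in for the source of truth
cache = ObjectCache(lambda key: source[key])
cache.warm(["book-1", "book-2"])

source["book-1"] = 199
stale = cache.get("book-1")            # still the old value
cache.on_update_notification("book-1")
fresh = cache.get("book-1")            # reloaded after removal

source["book-2"] = 380
cache.on_update_notification("book-2", replace=True)
replaced = cache.get("book-2")
```

Removing is cheaper for the notifier, while replacing spares the first reader after an update from a cache miss.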
Caches at Flipkart

• Memcached -- in-memory caching
• Redis -- persistent store/cache
• Varnish -- HTTP caching
• fk-w3-agent -- config caching
Let’s continue the discussion

• @sids on Twitter
• siddhartha@flipkart.com
• #phpcloud on irc.freenode.net
