HTTP accelerator
          Cachos 2.0
    Luis Henrique Okama
         Tollef Fog Heen
          Mario Carvalho
What is the problem?
  GET / HTTP/1.1
    Hang on, I need to look up a few hundred things in my
    database and then do a lot of editing controlled by a
    scripting language
  HEAD / HTTP/1.1
    Hang on, I need to look up a few hundred things in my
    database and then do a lot of editing controlled by a
    scripting language, and then I will throw the result
    away.
Why are CMS-es slow?

   Complex content generation process
   Single database prevents clustering
   Expensive software ditto

This means: we need server side caching
Why are the existing solutions not good enough?

  Squid, ancient design, forward proxy
  Apache, not what it's built for, not what it's good at
  Akamai (and similar), expensive, vendor lock-in
What is Varnish?

 Dedicated HTTP accelerator
 Focus on server-side speedups
 Policy control
 High performance
 Varnish Configuration Language
 Shared memory log
High performance

 11 syscalls + 7 locks for a cache hit
 Work with the OS, not against it.
 Multi-CPU, multi-core
 64 bit
 Use advanced OS features:
    Accept filters
    madvise(MADV_RANDOM)
    kqueue, epoll
 Don't copy data if you don't have to
 Use workspaces, not malloc/free
 Compiled configuration
Policy control

  Override TTLs
  Add, remove or change headers
  Strip cookies
  Rewrite URLs
  Invalidate objects in the cache
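
The policy controls listed above could look roughly like the following. This is a minimal sketch, not the presenters' configuration: it assumes VCL 4.x syntax (which differs from the older syntax on the "VCL Example" slide), and the backend address, the `purgers` ACL, the `/old/`, `/new/` and `/static/` URL prefixes, and the `X-Cache-Policy` header are all hypothetical names chosen for illustration.

```vcl
vcl 4.0;

# Hypothetical backend; replace with the real origin server.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# Hypothetical list of clients allowed to invalidate objects.
acl purgers {
    "127.0.0.1";
}

sub vcl_recv {
    # Invalidate objects in the cache: accept PURGE only from trusted clients.
    if (req.method == "PURGE") {
        if (client.ip !~ purgers) {
            return (synth(405, "Not allowed"));
        }
        return (purge);
    }

    # Rewrite URLs before the cache lookup.
    set req.url = regsub(req.url, "^/old/", "/new/");

    # Strip cookies so static content can be served from the cache.
    if (req.url ~ "^/static/") {
        unset req.http.Cookie;
    }
}

sub vcl_backend_response {
    if (bereq.url ~ "^/static/") {
        # Override the TTL supplied by the backend.
        set beresp.ttl = 1h;
        # Remove a header that would otherwise prevent caching.
        unset beresp.http.Set-Cookie;
        # Add a header, here only to mark which policy matched.
        set beresp.http.X-Cache-Policy = "static-1h";
    }
}
```

Because neither subroutine ends with an explicit `return`, requests that do not match these rules fall through to Varnish's built-in logic.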
Varnish Configuration Language

 Simple domain specific language
 Translated to C, then compiled to a binary shared object
     Transparently!
 Dynamically loaded
 Multiple configs loaded concurrently
 Instant switch from one VCL to another.
     Can be done from VCL
VCL Example

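  # Older VCL syntax, as on the original slide
  # (obj.* in vcl_fetch, bare pass/deliver keywords).
  acl some_acl {
      "10.0.0.0"/8;
  }

  sub vcl_recv {
      # Bypass the cache for clients in the ACL.
      if (client.ip ~ some_acl) {
          pass;
      }

      # Rewrite the Host header from foo.com to bar.org.
      if (req.http.host ~ "foo\.com$") {
          set req.http.host =
              regsub(req.http.host,
                  "foo\.com$", "bar.org");
      }
  }

  sub vcl_fetch {
      # Cache valid backend responses for ten minutes.
      if (obj.valid) {
          set obj.ttl = 10m;
          deliver;
      }
  }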
Shared memory logfile

 Fast
 Custom log tailers
    varnishtop
    varnishlog
    varnishhist
    varnishncsa
    varnishstat
Globo.com cache infrastructure
Comparative
  scalability
  setup time
  cost
  open source
  throughput
  physical space
Some data
Conclusion
  software customization
  working set
Real life performance
Questions
Contacts

  Luis H. Okama
  okama@corp.globo.com (www.globo.com)
  Mario Carvalho
  mariocar@corp.globo.com (www.globo.com)
  Tollef Fog Heen
  tfheen@redpill-linpro.com (www.varnish-cache.com)

  Find us at the Globo.com stand

Globo.com & Varnish
