SlideShare a Scribd company logo
1 of 145
Download to read offline
Scalable Web
   Architectures
Common Patterns & Approaches

          Cal Henderson
Hello




Web 2.0 Expo Berlin, 5th November 2007           2
Scalable Web Architectures?


                   What does scalable mean?

                        What’s an architecture?

                           An answer in 12 parts

Web 2.0 Expo Berlin, 5th November 2007             3
1.

                                 Scaling

Web 2.0 Expo Berlin, 5th November 2007        4
Scalability – myths and lies


      • What is scalability?




Web 2.0 Expo Berlin, 5th November 2007   5
Scalability – myths and lies


      • What is scalability not ?




Web 2.0 Expo Berlin, 5th November 2007   6
Scalability – myths and lies


      • What is scalability not ?
            – Raw Speed / Performance
            – HA / BCP
            – Technology X
            – Protocol Y



Web 2.0 Expo Berlin, 5th November 2007   7
Scalability – myths and lies


      • So what is scalability?




Web 2.0 Expo Berlin, 5th November 2007   8
Scalability – myths and lies


      • So what is scalability?
            – Traffic growth
            – Dataset growth
            – Maintainability




Web 2.0 Expo Berlin, 5th November 2007   9
Today
      • Two goals of application architecture:


                                         Scale
                                          HA
Web 2.0 Expo Berlin, 5th November 2007           10
Today
      • Three goals of application architecture:


                                         Scale
                                          HA
                               Performance
Web 2.0 Expo Berlin, 5th November 2007             11
Scalability

      • Two kinds:
            – Vertical (get bigger)
            – Horizontal (get more)




Web 2.0 Expo Berlin, 5th November 2007   12
Big Irons




                  Sunfire E20k             PowerEdge SC1435
             36x 1.8GHz processors       Dualcore 1.8 GHz processor
                                              Around $1,500
             $450,000 - $2,500,000

Web 2.0 Expo Berlin, 5th November 2007                                13
Cost vs Cost




Web 2.0 Expo Berlin, 5th November 2007   14
That’s OK
      • Sometimes vertical scaling is right

      • Buying a bigger box is quick (ish)
      • Redesigning software is not

      • Running out of MySQL performance?
            – Spend months on data federation
            – Or, Just buy a ton more RAM

Web 2.0 Expo Berlin, 5th November 2007          15
The H & the V


      • But we’ll mostly talk horizontal
            – Else this is going to be boring




Web 2.0 Expo Berlin, 5th November 2007          16
2.

                Architecture

Web 2.0 Expo Berlin, 5th November 2007        17
Architectures then?


      • The way the bits fit together
      • What grows where
      • The trade-offs between good/fast/cheap




Web 2.0 Expo Berlin, 5th November 2007           18
LAMP
      • We’re mostly talking about LAMP
            –   Linux
            –   Apache (or LightHTTPd)
            –   MySQL (or Postgres)
            –   PHP (or Perl, Python, Ruby)


      •   All open source
      •   All well supported
      •   All used in large operations
      •   (Same rules apply elsewhere)
Web 2.0 Expo Berlin, 5th November 2007        19
Simple web apps

      • A Web Application
                        – Or “Web Site” in Web 1.0 terminology




                         Interwebnet     App server   Database




Web 2.0 Expo Berlin, 5th November 2007                           20
Simple web apps

      • A Web Application
                        – Or “Web Site” in Web 1.0 terminology
                                                                 Storage array



                                                                 Cache


                         Interwobnet     App server   Database   AJAX!!!1




Web 2.0 Expo Berlin, 5th November 2007                                      21
App servers

      • App servers scale in two ways:




Web 2.0 Expo Berlin, 5th November 2007   22
App servers

      • App servers scale in two ways:

            – Really well




Web 2.0 Expo Berlin, 5th November 2007   23
App servers

      • App servers scale in two ways:

            – Really well

            – Quite badly




Web 2.0 Expo Berlin, 5th November 2007   24
App servers
      • Sessions!
            – (State)

            – Local sessions == bad
                  • When they move == quite bad


            – Centralized sessions == good

            – No sessions at all == awesome!

Web 2.0 Expo Berlin, 5th November 2007            25
Local sessions
      • Stored on disk
            – PHP sessions


      • Stored in memory
            – Shared memory block (APC)


      • Bad!
            – Can’t move users
            – Can’t avoid hotspots
            – Not fault tolerant

Web 2.0 Expo Berlin, 5th November 2007    26
Mobile local sessions
      • Custom built
            – Store last session location in cookie
            – If we hit a different server, pull our session
              information across


      • If your load balancer has sticky sessions,
        you can still get hotspots
            – Depends on volume – fewer heavier users
              hurt more

Web 2.0 Expo Berlin, 5th November 2007                         27
Remote centralized sessions
      • Store in a central database
            – Or an in-memory cache

      • No porting around of session data
      • No need for sticky sessions
      • No hot spots

      • Need to be able to scale the data store
            – But we’ve pushed the issue down the stack

Web 2.0 Expo Berlin, 5th November 2007                    28
No sessions
      • Stash it all in a cookie!

      • Sign it for safety
            – $data = $user_id . ‘-’ . $user_name;
            – $time = time();
            – $sig = sha1($secret . $time . $data);
            – $cookie = base64(“$sig-$time-$data”);

            – Timestamp means it’s simple to expire it
Web 2.0 Expo Berlin, 5th November 2007                   29
Super slim sessions
      • If you need more than the cookie (login status,
        user id, username), then pull their account row
        from the DB
            – Or from the account cache

      • None of the drawbacks of sessions
      • Avoids the overhead of a query per page
            – Great for high-volume pages which need little
              personalization
            – Turns out you can stick quite a lot in a cookie too
            – Pack with base64 and it’s easy to delimit fields

Web 2.0 Expo Berlin, 5th November 2007                              30
App servers

      • The Rasmus way
            – App server has ‘shared nothing’
            – Responsibility pushed down the stack
                  • Ooh, the stack




Web 2.0 Expo Berlin, 5th November 2007               31
Trifle




Web 2.0 Expo Berlin, 5th November 2007   32
Trifle
                                          Fruit / Presentation



                                           Cream / Markup



                                         Custard / Page Logic



                                         Jelly / Business Logic



                                          Sponge / Database



Web 2.0 Expo Berlin, 5th November 2007                            33
Trifle
                                          Fruit / Presentation



                                           Cream / Markup



                                         Custard / Page Logic



                                         Jelly / Business Logic



                                          Sponge / Database



Web 2.0 Expo Berlin, 5th November 2007                            34
App servers




Web 2.0 Expo Berlin, 5th November 2007   35
App servers




Web 2.0 Expo Berlin, 5th November 2007   36
App servers




Web 2.0 Expo Berlin, 5th November 2007   37
Well, that was easy

      • Scaling the web app server part is easy

      • The rest is the trickier part
            – Database
            – Serving static content
            – Storing static content



Web 2.0 Expo Berlin, 5th November 2007            38
The others
      • Other services scale similarly to web apps
            – That is, horizontally


      • The canonical examples:
            – Image conversion
            – Audio transcoding
            – Video transcoding
            – Web crawling
            – Compute!
Web 2.0 Expo Berlin, 5th November 2007               39
Amazon
      • Let’s talk about Amazon
            – S3 - Storage
            – EC2 – Compute! (XEN based)
            – SQS – Queueing

      • All horizontal

      • Cheap when small
            – Not cheap at scale

Web 2.0 Expo Berlin, 5th November 2007     40
3.

        Load Balancing

Web 2.0 Expo Berlin, 5th November 2007        41
Load balancing

      • If we have multiple nodes in a class, we
        need to balance between them

      • Hardware or software
      • Layer 4 or 7



Web 2.0 Expo Berlin, 5th November 2007             42
Hardware LB
      • A hardware appliance
            – Often a pair with heartbeats for HA


      • Expensive!
            – But offers high performance


      • Many brands
            – Alteon, Cisco, Netscalar, Foundry, etc
            – L7 - web switches, content switches, etc
Web 2.0 Expo Berlin, 5th November 2007                   43
Software LB
      • Just some software
            – Still needs hardware to run on
            – But can run on existing servers


      • Harder to have HA
            – Often people stick hardware LB’s in front
            – But Wackamole helps here



Web 2.0 Expo Berlin, 5th November 2007                    44
Software LB
      • Lots of options
            – Pound
            – Perlbal
            – Apache with mod_proxy


      • Wackamole with mod_backhand
            – http://backhand.org/wackamole/
            – http://backhand.org/mod_backhand/


Web 2.0 Expo Berlin, 5th November 2007            45
Wackamole




Web 2.0 Expo Berlin, 5th November 2007   46
Wackamole




Web 2.0 Expo Berlin, 5th November 2007   47
The layers
      • Layer 4
            – A ‘dumb’ balance


      • Layer 7
            – A ‘smart’ balance


      • OSI stack, routers, etc


Web 2.0 Expo Berlin, 5th November 2007   48
4.

                             Queuing

Web 2.0 Expo Berlin, 5th November 2007        49
Parallelizable == easy!

      • If we can transcode/crawl in parallel, it’s
        easy
            – But think about queuing
            – And asynchronous systems
            – The web ain’t built for slow things
            – But still, a simple problem



Web 2.0 Expo Berlin, 5th November 2007                50
Synchronous systems




Web 2.0 Expo Berlin, 5th November 2007   51
Asynchronous systems




Web 2.0 Expo Berlin, 5th November 2007   52
Helps with peak periods




Web 2.0 Expo Berlin, 5th November 2007   53
Synchronous systems




Web 2.0 Expo Berlin, 5th November 2007   54
Asynchronous systems




Web 2.0 Expo Berlin, 5th November 2007   55
Asynchronous systems




Web 2.0 Expo Berlin, 5th November 2007   56
5.

      Relational Data

Web 2.0 Expo Berlin, 5th November 2007        57
Databases

      • Unless we’re doing a lot of file serving, the
        database is the toughest part to scale

      • If we can, best to avoid the issue
        altogether and just buy bigger hardware

      • Dual Opteron/Intel64 systems with 16+GB
        of RAM can get you a long way

Web 2.0 Expo Berlin, 5th November 2007                  58
More read power

      • Web apps typically have a read/write ratio
        of somewhere between 80/20 and 90/10
      • If we can scale read capacity, we can
        solve a lot of situations
      • MySQL replication!




Web 2.0 Expo Berlin, 5th November 2007               59
Master-Slave Replication




Web 2.0 Expo Berlin, 5th November 2007   60
Master-Slave Replication


                                         Reads and Writes




                                         Reads




Web 2.0 Expo Berlin, 5th November 2007                      61
Master-Slave Replication




Web 2.0 Expo Berlin, 5th November 2007   62
Master-Slave Replication




Web 2.0 Expo Berlin, 5th November 2007   63
Master-Slave Replication




Web 2.0 Expo Berlin, 5th November 2007   64
Master-Slave Replication




Web 2.0 Expo Berlin, 5th November 2007   65
Master-Slave Replication




Web 2.0 Expo Berlin, 5th November 2007   66
Master-Slave Replication




Web 2.0 Expo Berlin, 5th November 2007   67
Master-Slave Replication




Web 2.0 Expo Berlin, 5th November 2007   68
Master-Slave Replication




Web 2.0 Expo Berlin, 5th November 2007   69
6.

                               Caching

Web 2.0 Expo Berlin, 5th November 2007        70
Caching
      • Caching avoids needing to scale!
            – Or makes it cheaper


      • Simple stuff
            – mod_perl / shared memory
                  • Invalidation is hard
            – MySQL query cache
                  • Bad performance (in most cases)


Web 2.0 Expo Berlin, 5th November 2007                71
Caching

      • Getting more complicated…
            – Write-through cache
            – Write-back cache
            – Sideline cache




Web 2.0 Expo Berlin, 5th November 2007   72
Write-through cache




Web 2.0 Expo Berlin, 5th November 2007   73
Write-back cache




Web 2.0 Expo Berlin, 5th November 2007   74
Sideline cache




Web 2.0 Expo Berlin, 5th November 2007   75
Sideline cache
      • Easy to implement
            – Just add app logic


      • Need to manually invalidate cache
            – Well designed code makes it easy


      • Memcached
            – From Danga (LiveJournal)
            – http://www.danga.com/memcached/
Web 2.0 Expo Berlin, 5th November 2007           76
Memcache schemes

      • Layer 4
            – Good: Cache can be local on a machine
            – Bad: Invalidation gets more expensive with
              node count
            – Bad: Cache space wasted by duplicate
              objects




Web 2.0 Expo Berlin, 5th November 2007                     77
Memcache schemes

      • Layer 7
            – Good: No wasted space
            – Good: linearly scaling invalidation
            – Bad: Multiple, remote connections
                  • Can be avoided with a proxy layer
                        – Gets more complicated
                           » Last indentation level!




Web 2.0 Expo Berlin, 5th November 2007                  78
7.

                              HA Data

Web 2.0 Expo Berlin, 5th November 2007        79
But what about HA?




Web 2.0 Expo Berlin, 5th November 2007   80
But what about HA?




Web 2.0 Expo Berlin, 5th November 2007   81
SPOF!

      • The key to HA is avoiding SPOFs
            – Identify
            – Eliminate


      • Some stuff is hard to solve
            – Fix it further up the tree
                  • Dual DCs solves Router/Switch SPOF


Web 2.0 Expo Berlin, 5th November 2007                   82
Master-Master




Web 2.0 Expo Berlin, 5th November 2007   83
Master-Master
      • Either hot/warm or hot/hot

      • Writes can go to either
            – But avoid collisions
            – No auto-inc columns for hot/hot
                  • Bad for hot/warm too
                  • Unless you have MySQL 5
                        – But you can’t rely on the ordering!
            – Design schema/access to avoid collisions
                  • Hashing users to servers

Web 2.0 Expo Berlin, 5th November 2007                          84
Rings
      • Master-master is just a small ring
            – With 2 nodes


      • Bigger rings are possible
            – But not a mesh!
            – Each slave may only have a single master
            – Unless you build some kind of manual
              replication


Web 2.0 Expo Berlin, 5th November 2007                   85
Rings




Web 2.0 Expo Berlin, 5th November 2007   86
Rings




Web 2.0 Expo Berlin, 5th November 2007   87
Dual trees

      • Master-master is good for HA
            – But we can’t scale out the reads (or writes!)


      • We often need to combine the read
        scaling with HA

      • We can simply combine the two models

Web 2.0 Expo Berlin, 5th November 2007                        88
Dual trees




Web 2.0 Expo Berlin, 5th November 2007   89
Cost models
      • There’s a problem here
            – We need to always have 200% capacity to
              avoid a SPOF
                  • 400% for dual sites!
            – This costs too much


      • Solution is straight forward
            – Make sure clusters are bigger than 2


Web 2.0 Expo Berlin, 5th November 2007                  90
N+M
      • N+M
            – N = nodes needed to run the system
            – M = nodes we can afford to lose


      • Having M as big as N starts to suck
            – If we could make each node smaller, we can
              increase N while M stays constant
            – (We assume smaller nodes are cheaper)


Web 2.0 Expo Berlin, 5th November 2007                     91
1+1 = 200% hardware




Web 2.0 Expo Berlin, 5th November 2007   92
3+1 = 133% hardware




Web 2.0 Expo Berlin, 5th November 2007   93
Meshed masters
      • Not possible with regular MySQL out-of-
        the-box today

      • But there is hope!
            – NBD (MySQL Cluster) allows a mesh
            – Support for replication out to slaves in a
              coming version
                  • RSN!


Web 2.0 Expo Berlin, 5th November 2007                     94
8.

                      Federation

Web 2.0 Expo Berlin, 5th November 2007        95
Data federation

      • At some point, you need more writes
            – This is tough
            – Each cluster of servers has limited write
              capacity


      • Just add more clusters!



Web 2.0 Expo Berlin, 5th November 2007                    96
Simple things first
      • Vertical partitioning
            – Divide tables into sets that never get joined
            – Split these sets onto different server clusters
            – Voila!


      • Logical limits
            – When you run out of non-joining groups
            – When a single table grows too large


Web 2.0 Expo Berlin, 5th November 2007                          97
Data federation
      • Split up large tables, organized by some
        primary object
            – Usually users


      • Put all of a user’s data on one ‘cluster’
            – Or shard, or cell


      • Have one central cluster for lookups

Web 2.0 Expo Berlin, 5th November 2007              98
Data federation




Web 2.0 Expo Berlin, 5th November 2007   99
Data federation
      • Need more capacity?
            – Just add shards!
            – Don’t assign to shards based on user_id!


      • For resource leveling as time goes on, we
        want to be able to move objects between
        shards
            – Maybe – not everyone does this
            – ‘Lockable’ objects
Web 2.0 Expo Berlin, 5th November 2007                   100
The wordpress.com approach
      • Hash users into one of n buckets
            – Where n is a power of 2


      • Put all the buckets on one server

      • When you run out of capacity, split the buckets
        across two servers
      • Then you run out of capacity, split the buckets
        across four servers
      • Etc
Web 2.0 Expo Berlin, 5th November 2007                    101
Data federation
      • Heterogeneous hardware is fine
            – Just give a larger/smaller proportion of objects
              depending on hardware


      • Bigger/faster hardware for paying users
            – A common approach
            – Can also allocate faster app servers via magic
              cookies at the LB


Web 2.0 Expo Berlin, 5th November 2007                           102
Downsides
      • Need to keep stuff in the right place
      • App logic gets more complicated
      • More clusters to manage
            – Backups, etc
      • More database connections needed per
        page
            – Proxy can solve this, but complicated
      • The dual table issue
            – Avoid walking the shards!

Web 2.0 Expo Berlin, 5th November 2007                103
Bottom line


              Data federation is how
               large applications are
                      scaled


Web 2.0 Expo Berlin, 5th November 2007   104
Bottom line
      • It’s hard, but not impossible

      • Good software design makes it easier
            – Abstraction!

      • Master-master pairs for shards give us HA

      • Master-master trees work for central
        cluster (many reads, few writes)
Web 2.0 Expo Berlin, 5th November 2007              105
9.

               Multi-site HA

Web 2.0 Expo Berlin, 5th November 2007        106
Multiple Datacenters
      • Having multiple datacenters is hard
            – Not just with MySQL

      • Hot/warm with MySQL slaved setup
            – But manual (reconfig on failure)


      • Hot/hot with master-master
            – But dangerous (each site has a SPOF)

      • Hot/hot with sync/async manual replication
            – But tough (big engineering task)

Web 2.0 Expo Berlin, 5th November 2007               107
Multiple Datacenters




Web 2.0 Expo Berlin, 5th November 2007   108
GSLB

      • Multiple sites need to be balanced
            – Global Server Load Balancing


      • Easiest are AkaDNS-like services
            – Performance rotations
            – Balance rotations



Web 2.0 Expo Berlin, 5th November 2007       109
10.

              Serving Files

Web 2.0 Expo Berlin, 5th November 2007         110
Serving lots of files
      • Serving lots of files is not too tough
            – Just buy lots of machines and load balance!


      • We’re IO bound – need more spindles!
            – But keeping many copies of data in sync is
              hard
            – And sometimes we have other per-request
              overhead (like auth)


Web 2.0 Expo Berlin, 5th November 2007                      111
Reverse proxy




Web 2.0 Expo Berlin, 5th November 2007   112
Reverse proxy
      • Serving out of memory is fast!
            – And our caching proxies can have disks too
            – Fast or otherwise
                  • More spindles is better
                  • We stay in sync automatically


      • We can parallelize it!
            – 50 cache servers gives us 50 times the serving rate of
              the origin server
            – Assuming the working set is small enough to fit in
              memory in the cache cluster

Web 2.0 Expo Berlin, 5th November 2007                                 113
Invalidation

      • Dealing with invalidation is tricky

      • We can prod the cache servers directly to
        clear stuff out
            – Scales badly – need to clear asset from every
              server – doesn’t work well for 100 caches



Web 2.0 Expo Berlin, 5th November 2007                        114
Invalidation
      • We can change the URLs of modified
        resources
            – And let the old ones drop out cache naturally
            – Or prod them out, for sensitive data

      • Good approach!
            – Avoids browser cache staleness
            – Hello Akamai (and other CDNs)
            – Read more:
                  • http://www.thinkvitamin.com/features/webapps/serving-javascript-fast


Web 2.0 Expo Berlin, 5th November 2007                                                     115
Reverse proxy

      • Choices
            – L7 load balancer & Squid
                  • http://www.squid-cache.org/

            – mod_proxy & mod_cache
                  • http://www.apache.org/

            – Perlbal and Memcache?
                  • http://www.danga.com/

Web 2.0 Expo Berlin, 5th November 2007            116
High overhead serving
      • What if you need to authenticate your
        asset serving?
            – Private photos
            – Private data
            – Subscriber-only files


      • Two main approaches
            – Proxies w/ tokens
            – Path translation
Web 2.0 Expo Berlin, 5th November 2007          117
Perlbal backhanding

      • Perlbal can do redirection magic
            – Client sends request to Perbal
            – Perlbl plugin verifies user credentials
                  • token, cookies, whatever
                  • tokens avoid data-store access
            – Perlbal goes to pick up the file from elsewhere
            – Transparent to user


Web 2.0 Expo Berlin, 5th November 2007                          118
Perlbal backhanding




Web 2.0 Expo Berlin, 5th November 2007   119
Perlbal backhanding

      • Doesn’t keep database around while
        serving
      • Doesn’t keep app server around while
        serving
      • User doesn’t find out how to access asset
        directly


Web 2.0 Expo Berlin, 5th November 2007              120
Permission URLs
      • But why bother!?
      • If we bake the auth into the URL then it
        saves the auth step
      • We can do the auth on the web app
        servers when creating HTML
      • Just need some magic to translate to
        paths
      • We don’t want paths to be guessable

Web 2.0 Expo Berlin, 5th November 2007             121
Permission URLs




Web 2.0 Expo Berlin, 5th November 2007   122
Permission URLs




                (or mod_perl)




Web 2.0 Expo Berlin, 5th November 2007   123
Permission URLs
      • Downsides
            – URL gives permission for life
            – Unless you bake in tokens
                  • Tokens tend to be non-expirable
                        – We don’t want to track every token
                           » Too much overhead
                  • But can still expire

      • Upsides
            – It works
            – Scales nicely
Web 2.0 Expo Berlin, 5th November 2007                         124
11.

                Storing Files

Web 2.0 Expo Berlin, 5th November 2007         125
Storing lots of files

      • Storing files is easy!
            – Get a big disk
            – Get a bigger disk
            – Uh oh!


      • Horizontal scaling is the key
            – Again

Web 2.0 Expo Berlin, 5th November 2007   126
Connecting to storage
      • NFS
            – Stateful == Sucks
            – Hard mounts vs Soft mounts, INTR


      • SMB / CIFS / Samba
            – Turn off MSRPC & WINS (NetBOIS NS)
            – Stateful but degrades gracefully


      • HTTP
            – Stateless == Yay!
            – Just use Apache
Web 2.0 Expo Berlin, 5th November 2007             127
Multiple volumes
      • Volumes are limited in total size
            – Except (in theory) under ZFS & others


      • Sometimes we need multiple volumes for
        performance reasons
            – When using RAID with single/dual parity


      • At some point, we need multiple volumes

Web 2.0 Expo Berlin, 5th November 2007                  128
Multiple volumes




Web 2.0 Expo Berlin, 5th November 2007   129
Multiple hosts

      • Further down the road, a single host will
        be too small
      • Total throughput of machine becomes an
        issue
      • Even physical space can start to matter
      • So we need to be able to use multiple
        hosts

Web 2.0 Expo Berlin, 5th November 2007              130
Multiple hosts




Web 2.0 Expo Berlin, 5th November 2007   131
HA Storage

      • HA is important for assets too
            – We can back stuff up
            – But we tend to want hot redundancy


      • RAID is good
            – RAID 5 is cheap, RAID 10 is fast



Web 2.0 Expo Berlin, 5th November 2007             132
HA Storage
      • But whole machines can fail
      • So we stick assets on multiple machines

      • In this case, we can ignore RAID
            – In failure case, we serve from alternative
              source
            – But need to weigh up the rebuild time and
              effort against the risk
            – Store more than 2 copies?
Web 2.0 Expo Berlin, 5th November 2007                     133
HA Storage




Web 2.0 Expo Berlin, 5th November 2007   134
Self repairing systems
      • When something fails, repairing can be a
        pain
            – RAID rebuilds by itself, but machine
              replication doesn’t

      • The big appliances self heal
            – NetApp, StorEdge, etc

      • So does MogileFS (reaper)

Web 2.0 Expo Berlin, 5th November 2007               135
12.

                      Field Work

Web 2.0 Expo Berlin, 5th November 2007         136
Real world examples

      • Flickr
            – Because I know it


      • LiveJournal
            – Because everyone copies it




Web 2.0 Expo Berlin, 5th November 2007     137
Flickr
  Architecture




Web 2.0 Expo Berlin, 5th November 2007   138
Flickr
  Architecture




Web 2.0 Expo Berlin, 5th November 2007   139
LiveJournal
  Architecture




Web 2.0 Expo Berlin, 5th November 2007   140
LiveJournal
  Architecture




Web 2.0 Expo Berlin, 5th November 2007   141
Buy my book!




Web 2.0 Expo Berlin, 5th November 2007   142
Or buy Theo’s




Web 2.0 Expo Berlin, 5th November 2007   143
The end!




Web 2.0 Expo Berlin, 5th November 2007   144
Awesome!


                    These slides are available online:
                           iamcal.com/talks/




Web 2.0 Expo Berlin, 5th November 2007                   145

More Related Content

What's hot

Cmg06 utilization is useless
Cmg06 utilization is uselessCmg06 utilization is useless
Cmg06 utilization is uselessAdrian Cockcroft
 
Cloud Computing & Scaling Web Apps
Cloud Computing & Scaling Web AppsCloud Computing & Scaling Web Apps
Cloud Computing & Scaling Web AppsMark Slingsby
 
Azure Boot Camp 21.04.2018 SQL Server in Azure Iaas PaaS on-prem Lars Platzdasch
Azure Boot Camp 21.04.2018 SQL Server in Azure Iaas PaaS on-prem Lars PlatzdaschAzure Boot Camp 21.04.2018 SQL Server in Azure Iaas PaaS on-prem Lars Platzdasch
Azure Boot Camp 21.04.2018 SQL Server in Azure Iaas PaaS on-prem Lars PlatzdaschLars Platzdasch
 
Introduction to AWS Database Services
Introduction to AWS Database ServicesIntroduction to AWS Database Services
Introduction to AWS Database ServicesAmazon Web Services
 
Power BI with Essbase in the Oracle Cloud
Power BI with Essbase in the Oracle CloudPower BI with Essbase in the Oracle Cloud
Power BI with Essbase in the Oracle CloudKellyn Pot'Vin-Gorman
 
Knowledge share about scalable application architecture
Knowledge share about scalable application architectureKnowledge share about scalable application architecture
Knowledge share about scalable application architectureAHM Pervej Kabir
 
Divide and conquer in the cloud
Divide and conquer in the cloudDivide and conquer in the cloud
Divide and conquer in the cloudJustin Swanhart
 
How to scale up, out or down in Windows Azure - Webinar
How to scale up, out or down in Windows Azure - WebinarHow to scale up, out or down in Windows Azure - Webinar
How to scale up, out or down in Windows Azure - WebinarCommon Sense
 
Why new hardware may not make Oracle databases faster
Why new hardware may not make Oracle databases fasterWhy new hardware may not make Oracle databases faster
Why new hardware may not make Oracle databases fasterSolarWinds
 
Running Spark and MapReduce together in Production
Running Spark and MapReduce together in ProductionRunning Spark and MapReduce together in Production
Running Spark and MapReduce together in ProductionDataWorks Summit
 
How to scale up, out or down in Windows Azure
How to scale up, out or down in Windows AzureHow to scale up, out or down in Windows Azure
How to scale up, out or down in Windows AzureCommon Sense
 
Building a Scalable Architecture for web apps
Building a Scalable Architecture for web appsBuilding a Scalable Architecture for web apps
Building a Scalable Architecture for web appsDirecti Group
 
Diagnosing MySQL performance problems
Diagnosing  MySQL performance problemsDiagnosing  MySQL performance problems
Diagnosing MySQL performance problemsJustin Swanhart
 
MySQL Performance Tuning: The Perfect Scalability (OOW2019)
MySQL Performance Tuning: The Perfect Scalability (OOW2019)MySQL Performance Tuning: The Perfect Scalability (OOW2019)
MySQL Performance Tuning: The Perfect Scalability (OOW2019)Mirko Ortensi
 
Cloudy in Indonesia: Java and Cloud
Cloudy in Indonesia: Java and CloudCloudy in Indonesia: Java and Cloud
Cloudy in Indonesia: Java and CloudEberhard Wolff
 
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...Continuent
 
Implementing High Availability Caching with Memcached
Implementing High Availability Caching with MemcachedImplementing High Availability Caching with Memcached
Implementing High Availability Caching with MemcachedGear6
 
Profiling and Tuning a Web Application - The Dirty Details
Profiling and Tuning a Web Application - The Dirty DetailsProfiling and Tuning a Web Application - The Dirty Details
Profiling and Tuning a Web Application - The Dirty DetailsAchievers Tech
 

What's hot (20)

Cmg06 utilization is useless
Cmg06 utilization is uselessCmg06 utilization is useless
Cmg06 utilization is useless
 
Cloud Computing & Scaling Web Apps
Cloud Computing & Scaling Web AppsCloud Computing & Scaling Web Apps
Cloud Computing & Scaling Web Apps
 
Azure Boot Camp 21.04.2018 SQL Server in Azure Iaas PaaS on-prem Lars Platzdasch
Azure Boot Camp 21.04.2018 SQL Server in Azure Iaas PaaS on-prem Lars PlatzdaschAzure Boot Camp 21.04.2018 SQL Server in Azure Iaas PaaS on-prem Lars Platzdasch
Azure Boot Camp 21.04.2018 SQL Server in Azure Iaas PaaS on-prem Lars Platzdasch
 
Introduction to AWS Database Services
Introduction to AWS Database ServicesIntroduction to AWS Database Services
Introduction to AWS Database Services
 
Power BI with Essbase in the Oracle Cloud
Power BI with Essbase in the Oracle CloudPower BI with Essbase in the Oracle Cloud
Power BI with Essbase in the Oracle Cloud
 
Knowledge share about scalable application architecture
Knowledge share about scalable application architectureKnowledge share about scalable application architecture
Knowledge share about scalable application architecture
 
Divide and conquer in the cloud
Divide and conquer in the cloudDivide and conquer in the cloud
Divide and conquer in the cloud
 
How to scale up, out or down in Windows Azure - Webinar
How to scale up, out or down in Windows Azure - WebinarHow to scale up, out or down in Windows Azure - Webinar
How to scale up, out or down in Windows Azure - Webinar
 
Why new hardware may not make Oracle databases faster
Why new hardware may not make Oracle databases fasterWhy new hardware may not make Oracle databases faster
Why new hardware may not make Oracle databases faster
 
Move to azure
Move to azureMove to azure
Move to azure
 
Running Spark and MapReduce together in Production
Running Spark and MapReduce together in ProductionRunning Spark and MapReduce together in Production
Running Spark and MapReduce together in Production
 
How to scale up, out or down in Windows Azure
How to scale up, out or down in Windows AzureHow to scale up, out or down in Windows Azure
How to scale up, out or down in Windows Azure
 
Building a Scalable Architecture for web apps
Building a Scalable Architecture for web appsBuilding a Scalable Architecture for web apps
Building a Scalable Architecture for web apps
 
Diagnosing MySQL performance problems
Diagnosing  MySQL performance problemsDiagnosing  MySQL performance problems
Diagnosing MySQL performance problems
 
MySQL Performance Tuning: The Perfect Scalability (OOW2019)
MySQL Performance Tuning: The Perfect Scalability (OOW2019)MySQL Performance Tuning: The Perfect Scalability (OOW2019)
MySQL Performance Tuning: The Perfect Scalability (OOW2019)
 
Cloudy in Indonesia: Java and Cloud
Cloudy in Indonesia: Java and CloudCloudy in Indonesia: Java and Cloud
Cloudy in Indonesia: Java and Cloud
 
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...
 
Implementing High Availability Caching with Memcached
Implementing High Availability Caching with MemcachedImplementing High Availability Caching with Memcached
Implementing High Availability Caching with Memcached
 
Profiling and Tuning a Web Application - The Dirty Details
Profiling and Tuning a Web Application - The Dirty DetailsProfiling and Tuning a Web Application - The Dirty Details
Profiling and Tuning a Web Application - The Dirty Details
 
Scaling PHP apps
Scaling PHP appsScaling PHP apps
Scaling PHP apps
 

Viewers also liked

Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYCScalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYCCal Henderson
 
Building Big on the Web
Building Big on the WebBuilding Big on the Web
Building Big on the WebCal Henderson
 
Scalable Web Architectures - Common Patterns & Approaches
Scalable Web Architectures - Common Patterns & ApproachesScalable Web Architectures - Common Patterns & Approaches
Scalable Web Architectures - Common Patterns & ApproachesCal Henderson
 
flickr's architecture & php
flickr's architecture & php flickr's architecture & php
flickr's architecture & php coolpics
 
Php internal architecture
Php internal architecturePhp internal architecture
Php internal architectureElizabeth Smith
 
Services Oriented Architecture with PHP and MySQL
Services Oriented Architecture with PHP and MySQLServices Oriented Architecture with PHP and MySQL
Services Oriented Architecture with PHP and MySQLJoe Stump
 
Web App Scaffolding - FOWA London 2010
Web App Scaffolding - FOWA London 2010Web App Scaffolding - FOWA London 2010
Web App Scaffolding - FOWA London 2010Cal Henderson
 
Enterprise PHP Architecture through Design Patterns and Modularization (Midwe...
Enterprise PHP Architecture through Design Patterns and Modularization (Midwe...Enterprise PHP Architecture through Design Patterns and Modularization (Midwe...
Enterprise PHP Architecture through Design Patterns and Modularization (Midwe...Aaron Saray
 
6 3 tier architecture php
6 3 tier architecture php6 3 tier architecture php
6 3 tier architecture phpcefour
 
Anatomy of a Modern PHP Application Architecture
Anatomy of a Modern PHP Application Architecture Anatomy of a Modern PHP Application Architecture
Anatomy of a Modern PHP Application Architecture AppDynamics
 
Taming the resource tiger
Taming the resource tigerTaming the resource tiger
Taming the resource tigerElizabeth Smith
 
Islamic law of inheritance (faraid) in malaysia
Islamic law of inheritance (faraid) in malaysiaIslamic law of inheritance (faraid) in malaysia
Islamic law of inheritance (faraid) in malaysiaan nur
 
Sep Nasiri "Upwork PHP Architecture"
Sep Nasiri "Upwork PHP Architecture"Sep Nasiri "Upwork PHP Architecture"
Sep Nasiri "Upwork PHP Architecture"Fwdays
 
General principles of inheritance under Muslim law - Rules relating to Islami...
General principles of inheritance under Muslim law - Rules relating to Islami...General principles of inheritance under Muslim law - Rules relating to Islami...
General principles of inheritance under Muslim law - Rules relating to Islami...Legal
 
Domain Driven Design using Laravel
Domain Driven Design using LaravelDomain Driven Design using Laravel
Domain Driven Design using Laravelwajrcs
 
Software Design Patterns in Laravel by Phill Sparks
Software Design Patterns in Laravel by Phill SparksSoftware Design Patterns in Laravel by Phill Sparks
Software Design Patterns in Laravel by Phill SparksPhill Sparks
 
Fault tolerance in distributed systems
Fault tolerance in distributed systemsFault tolerance in distributed systems
Fault tolerance in distributed systemssumitjain2013
 

Viewers also liked (20)

Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYCScalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
 
Building Big on the Web
Building Big on the WebBuilding Big on the Web
Building Big on the Web
 
Scalable Web Architectures - Common Patterns & Approaches
Scalable Web Architectures - Common Patterns & ApproachesScalable Web Architectures - Common Patterns & Approaches
Scalable Web Architectures - Common Patterns & Approaches
 
flickr's architecture & php
flickr's architecture & php flickr's architecture & php
flickr's architecture & php
 
Php internal architecture
Php internal architecturePhp internal architecture
Php internal architecture
 
Scafolding pics
Scafolding picsScafolding pics
Scafolding pics
 
Services Oriented Architecture with PHP and MySQL
Services Oriented Architecture with PHP and MySQLServices Oriented Architecture with PHP and MySQL
Services Oriented Architecture with PHP and MySQL
 
Web App Scaffolding - FOWA London 2010
Web App Scaffolding - FOWA London 2010Web App Scaffolding - FOWA London 2010
Web App Scaffolding - FOWA London 2010
 
Enterprise PHP Architecture through Design Patterns and Modularization (Midwe...
Enterprise PHP Architecture through Design Patterns and Modularization (Midwe...Enterprise PHP Architecture through Design Patterns and Modularization (Midwe...
Enterprise PHP Architecture through Design Patterns and Modularization (Midwe...
 
6 3 tier architecture php
6 3 tier architecture php6 3 tier architecture php
6 3 tier architecture php
 
Anatomy of a Modern PHP Application Architecture
Anatomy of a Modern PHP Application Architecture Anatomy of a Modern PHP Application Architecture
Anatomy of a Modern PHP Application Architecture
 
Taming the resource tiger
Taming the resource tigerTaming the resource tiger
Taming the resource tiger
 
Islamic law of inheritance (faraid) in malaysia
Islamic law of inheritance (faraid) in malaysiaIslamic law of inheritance (faraid) in malaysia
Islamic law of inheritance (faraid) in malaysia
 
Sep Nasiri "Upwork PHP Architecture"
Sep Nasiri "Upwork PHP Architecture"Sep Nasiri "Upwork PHP Architecture"
Sep Nasiri "Upwork PHP Architecture"
 
Laws of inheritance
Laws of inheritanceLaws of inheritance
Laws of inheritance
 
General principles of inheritance under Muslim law - Rules relating to Islami...
General principles of inheritance under Muslim law - Rules relating to Islami...General principles of inheritance under Muslim law - Rules relating to Islami...
General principles of inheritance under Muslim law - Rules relating to Islami...
 
Domain Driven Design using Laravel
Domain Driven Design using LaravelDomain Driven Design using Laravel
Domain Driven Design using Laravel
 
Software Design Patterns in Laravel by Phill Sparks
Software Design Patterns in Laravel by Phill SparksSoftware Design Patterns in Laravel by Phill Sparks
Software Design Patterns in Laravel by Phill Sparks
 
Fault tolerance in distributed systems
Fault tolerance in distributed systemsFault tolerance in distributed systems
Fault tolerance in distributed systems
 
Debt: The Inheritance No One Wants | Securian
Debt: The Inheritance No One Wants | SecurianDebt: The Inheritance No One Wants | Securian
Debt: The Inheritance No One Wants | Securian
 

Similar to Scalable Web Architectures: Common Patterns and Approaches

Pstrong Cybera 29 Sept 2008
Pstrong Cybera 29 Sept 2008Pstrong Cybera 29 Sept 2008
Pstrong Cybera 29 Sept 2008Cybera Inc.
 
Migrating from PHP4 To PHP5 - Zend Webinar
Migrating from PHP4 To PHP5 - Zend WebinarMigrating from PHP4 To PHP5 - Zend Webinar
Migrating from PHP4 To PHP5 - Zend WebinarIvo Jansch
 
WepApps mit Play! - Nichts leichter als das
WepApps mit Play! - Nichts leichter als dasWepApps mit Play! - Nichts leichter als das
WepApps mit Play! - Nichts leichter als dasenpit GmbH & Co. KG
 
WepApps mit Play! - Nichts leichter als das
WepApps mit Play! - Nichts leichter als dasWepApps mit Play! - Nichts leichter als das
WepApps mit Play! - Nichts leichter als dasAndreas Koop
 
From One to a Cluster
From One to a ClusterFrom One to a Cluster
From One to a Clusterguestd34230
 
Future Proof Development
Future Proof DevelopmentFuture Proof Development
Future Proof DevelopmentJeff Segars
 
High Performance Php My Sql Scaling Techniques
High Performance Php My Sql Scaling TechniquesHigh Performance Php My Sql Scaling Techniques
High Performance Php My Sql Scaling TechniquesZendCon
 
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data SetUnder the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data Setadunne
 
symfony: An Open-Source Framework for Professionals (PHP Day 2008)
symfony: An Open-Source Framework for Professionals (PHP Day 2008)symfony: An Open-Source Framework for Professionals (PHP Day 2008)
symfony: An Open-Source Framework for Professionals (PHP Day 2008)Fabien Potencier
 
Jon Arne Sæterås - Give Responsive Design a mobile performance boost
Jon Arne Sæterås - Give Responsive Design a mobile performance boost Jon Arne Sæterås - Give Responsive Design a mobile performance boost
Jon Arne Sæterås - Give Responsive Design a mobile performance boost DevConFu
 
MySQL Community Meetup in China : Innovation driven by the Community
MySQL Community Meetup in China : Innovation driven by the CommunityMySQL Community Meetup in China : Innovation driven by the Community
MySQL Community Meetup in China : Innovation driven by the CommunityFrederic Descamps
 
Egl Rui Ajax World
Egl Rui Ajax WorldEgl Rui Ajax World
Egl Rui Ajax Worldrajivmordani
 
What's New in Oracle BI for Developers
What's New in Oracle BI for DevelopersWhat's New in Oracle BI for Developers
What's New in Oracle BI for DevelopersDatavail
 

Similar to Scalable Web Architectures: Common Patterns and Approaches (20)

Symfony for non-techies
Symfony for non-techiesSymfony for non-techies
Symfony for non-techies
 
Pstrong Cybera 29 Sept 2008
Pstrong Cybera 29 Sept 2008Pstrong Cybera 29 Sept 2008
Pstrong Cybera 29 Sept 2008
 
Migrating from PHP4 To PHP5 - Zend Webinar
Migrating from PHP4 To PHP5 - Zend WebinarMigrating from PHP4 To PHP5 - Zend Webinar
Migrating from PHP4 To PHP5 - Zend Webinar
 
Ruby on Rails
Ruby on RailsRuby on Rails
Ruby on Rails
 
XS Oracle 2009 Intro Slides
XS Oracle 2009 Intro SlidesXS Oracle 2009 Intro Slides
XS Oracle 2009 Intro Slides
 
WepApps mit Play! - Nichts leichter als das
WepApps mit Play! - Nichts leichter als dasWepApps mit Play! - Nichts leichter als das
WepApps mit Play! - Nichts leichter als das
 
WepApps mit Play! - Nichts leichter als das
WepApps mit Play! - Nichts leichter als dasWepApps mit Play! - Nichts leichter als das
WepApps mit Play! - Nichts leichter als das
 
From One to a Cluster
From One to a ClusterFrom One to a Cluster
From One to a Cluster
 
Future Proof Development
Future Proof DevelopmentFuture Proof Development
Future Proof Development
 
XS Oracle 2009 PVOps
XS Oracle 2009 PVOpsXS Oracle 2009 PVOps
XS Oracle 2009 PVOps
 
MySQL Aquarium Paris
MySQL Aquarium ParisMySQL Aquarium Paris
MySQL Aquarium Paris
 
High Performance Php My Sql Scaling Techniques
High Performance Php My Sql Scaling TechniquesHigh Performance Php My Sql Scaling Techniques
High Performance Php My Sql Scaling Techniques
 
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data SetUnder the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
Under the Hood: How Geonames Aggregates Over 35 Sources into One Data Set
 
symfony: An Open-Source Framework for Professionals (PHP Day 2008)
symfony: An Open-Source Framework for Professionals (PHP Day 2008)symfony: An Open-Source Framework for Professionals (PHP Day 2008)
symfony: An Open-Source Framework for Professionals (PHP Day 2008)
 
Advanced Deployment
Advanced DeploymentAdvanced Deployment
Advanced Deployment
 
Qtp9.5
Qtp9.5Qtp9.5
Qtp9.5
 
Jon Arne Sæterås - Give Responsive Design a mobile performance boost
Jon Arne Sæterås - Give Responsive Design a mobile performance boost Jon Arne Sæterås - Give Responsive Design a mobile performance boost
Jon Arne Sæterås - Give Responsive Design a mobile performance boost
 
MySQL Community Meetup in China : Innovation driven by the Community
MySQL Community Meetup in China : Innovation driven by the CommunityMySQL Community Meetup in China : Innovation driven by the Community
MySQL Community Meetup in China : Innovation driven by the Community
 
Egl Rui Ajax World
Egl Rui Ajax WorldEgl Rui Ajax World
Egl Rui Ajax World
 
What's New in Oracle BI for Developers
What's New in Oracle BI for DevelopersWhat's New in Oracle BI for Developers
What's New in Oracle BI for Developers
 

More from adunne

Seedcamp Overview
Seedcamp OverviewSeedcamp Overview
Seedcamp Overviewadunne
 
Netvibes Preview
Netvibes PreviewNetvibes Preview
Netvibes Previewadunne
 
Community Practices: From Forums to Social Networks
Community Practices: From Forums to Social NetworksCommunity Practices: From Forums to Social Networks
Community Practices: From Forums to Social Networksadunne
 
Designing Tag Navigation
Designing Tag NavigationDesigning Tag Navigation
Designing Tag Navigationadunne
 
Social Commerce and Community
Social Commerce and CommunitySocial Commerce and Community
Social Commerce and Communityadunne
 
The Starfish and the Spider
The Starfish and the SpiderThe Starfish and the Spider
The Starfish and the Spideradunne
 
Ginger Preview
Ginger PreviewGinger Preview
Ginger Previewadunne
 
Add Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with SolrAdd Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with Solradunne
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Appsadunne
 
The Impact of Mobile Web 2.0 on the Telecoms Industry
The Impact of Mobile Web 2.0 on the Telecoms IndustryThe Impact of Mobile Web 2.0 on the Telecoms Industry
The Impact of Mobile Web 2.0 on the Telecoms Industryadunne
 
Building Web 2.0: Next-Generation Data Centers
Building Web 2.0: Next-Generation Data CentersBuilding Web 2.0: Next-Generation Data Centers
Building Web 2.0: Next-Generation Data Centersadunne
 
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...adunne
 
Designing for a Web of Data
Designing for a Web of DataDesigning for a Web of Data
Designing for a Web of Dataadunne
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Appsadunne
 
Disrupting the Platform: Harnessing social analytics and other musings on the...
Disrupting the Platform: Harnessing social analytics and other musings on the...Disrupting the Platform: Harnessing social analytics and other musings on the...
Disrupting the Platform: Harnessing social analytics and other musings on the...adunne
 
Your User's Privacy
Your User's PrivacyYour User's Privacy
Your User's Privacyadunne
 
Trends in Search Engine Optimization and Search Engine Marketing
Trends in Search Engine Optimization and Search Engine MarketingTrends in Search Engine Optimization and Search Engine Marketing
Trends in Search Engine Optimization and Search Engine Marketingadunne
 
Wuala, P2P Online Storage
Wuala, P2P Online StorageWuala, P2P Online Storage
Wuala, P2P Online Storageadunne
 
Breaking Down The Barriers: Design for Accessibility
Breaking Down The Barriers: Design for AccessibilityBreaking Down The Barriers: Design for Accessibility
Breaking Down The Barriers: Design for Accessibilityadunne
 
Web 2.0 Design Patterns, Models and Analysis
Web 2.0 Design Patterns, Models and AnalysisWeb 2.0 Design Patterns, Models and Analysis
Web 2.0 Design Patterns, Models and Analysisadunne
 

More from adunne (20)

Seedcamp Overview
Seedcamp OverviewSeedcamp Overview
Seedcamp Overview
 
Netvibes Preview
Netvibes PreviewNetvibes Preview
Netvibes Preview
 
Community Practices: From Forums to Social Networks
Community Practices: From Forums to Social NetworksCommunity Practices: From Forums to Social Networks
Community Practices: From Forums to Social Networks
 
Designing Tag Navigation
Designing Tag NavigationDesigning Tag Navigation
Designing Tag Navigation
 
Social Commerce and Community
Social Commerce and CommunitySocial Commerce and Community
Social Commerce and Community
 
The Starfish and the Spider
The Starfish and the SpiderThe Starfish and the Spider
The Starfish and the Spider
 
Ginger Preview
Ginger PreviewGinger Preview
Ginger Preview
 
Add Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with SolrAdd Powerful Full Text Search to Your Web App with Solr
Add Powerful Full Text Search to Your Web App with Solr
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Apps
 
The Impact of Mobile Web 2.0 on the Telecoms Industry
The Impact of Mobile Web 2.0 on the Telecoms IndustryThe Impact of Mobile Web 2.0 on the Telecoms Industry
The Impact of Mobile Web 2.0 on the Telecoms Industry
 
Building Web 2.0: Next-Generation Data Centers
Building Web 2.0: Next-Generation Data CentersBuilding Web 2.0: Next-Generation Data Centers
Building Web 2.0: Next-Generation Data Centers
 
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
Killing the Org Chart: Organizational, Cultural and Leadership Models on the ...
 
Designing for a Web of Data
Designing for a Web of DataDesigning for a Web of Data
Designing for a Web of Data
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Apps
 
Disrupting the Platform: Harnessing social analytics and other musings on the...
Disrupting the Platform: Harnessing social analytics and other musings on the...Disrupting the Platform: Harnessing social analytics and other musings on the...
Disrupting the Platform: Harnessing social analytics and other musings on the...
 
Your User's Privacy
Your User's PrivacyYour User's Privacy
Your User's Privacy
 
Trends in Search Engine Optimization and Search Engine Marketing
Trends in Search Engine Optimization and Search Engine MarketingTrends in Search Engine Optimization and Search Engine Marketing
Trends in Search Engine Optimization and Search Engine Marketing
 
Wuala, P2P Online Storage
Wuala, P2P Online StorageWuala, P2P Online Storage
Wuala, P2P Online Storage
 
Breaking Down The Barriers: Design for Accessibility
Breaking Down The Barriers: Design for AccessibilityBreaking Down The Barriers: Design for Accessibility
Breaking Down The Barriers: Design for Accessibility
 
Web 2.0 Design Patterns, Models and Analysis
Web 2.0 Design Patterns, Models and AnalysisWeb 2.0 Design Patterns, Models and Analysis
Web 2.0 Design Patterns, Models and Analysis
 

Recently uploaded

The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 

Recently uploaded (20)

The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 

Scalable Web Architectures: Common Patterns and Approaches

  • 1. Scalable Web Architectures Common Patterns & Approaches Cal Henderson
  • 2. Hello Web 2.0 Expo Berlin, 5th November 2007 2
  • 3. Scalable Web Architectures? What does scalable mean? What’s an architecture? An answer in 12 parts Web 2.0 Expo Berlin, 5th November 2007 3
  • 4. 1. Scaling Web 2.0 Expo Berlin, 5th November 2007 4
  • 5. Scalability – myths and lies • What is scalability? Web 2.0 Expo Berlin, 5th November 2007 5
  • 6. Scalability – myths and lies • What is scalability not ? Web 2.0 Expo Berlin, 5th November 2007 6
  • 7. Scalability – myths and lies • What is scalability not ? – Raw Speed / Performance – HA / BCP – Technology X – Protocol Y Web 2.0 Expo Berlin, 5th November 2007 7
  • 8. Scalability – myths and lies • So what is scalability? Web 2.0 Expo Berlin, 5th November 2007 8
  • 9. Scalability – myths and lies • So what is scalability? – Traffic growth – Dataset growth – Maintainability Web 2.0 Expo Berlin, 5th November 2007 9
  • 10. Today • Two goals of application architecture: Scale HA Web 2.0 Expo Berlin, 5th November 2007 10
  • 11. Today • Three goals of application architecture: Scale HA Performance Web 2.0 Expo Berlin, 5th November 2007 11
  • 12. Scalability • Two kinds: – Vertical (get bigger) – Horizontal (get more) Web 2.0 Expo Berlin, 5th November 2007 12
  • 13. Big Irons Sunfire E20k PowerEdge SC1435 36x 1.8GHz processors Dualcore 1.8 GHz processor Around $1,500 $450,000 - $2,500,000 Web 2.0 Expo Berlin, 5th November 2007 13
  • 14. Cost vs Cost Web 2.0 Expo Berlin, 5th November 2007 14
  • 15. That’s OK • Sometimes vertical scaling is right • Buying a bigger box is quick (ish) • Redesigning software is not • Running out of MySQL performance? – Spend months on data federation – Or, Just buy a ton more RAM Web 2.0 Expo Berlin, 5th November 2007 15
  • 16. The H & the V • But we’ll mostly talk horizontal – Else this is going to be boring Web 2.0 Expo Berlin, 5th November 2007 16
  • 17. 2. Architecture Web 2.0 Expo Berlin, 5th November 2007 17
  • 18. Architectures then? • The way the bits fit together • What grows where • The trade-offs between good/fast/cheap Web 2.0 Expo Berlin, 5th November 2007 18
  • 19. LAMP • We’re mostly talking about LAMP – Linux – Apache (or LightHTTPd) – MySQL (or Postgres) – PHP (or Perl, Python, Ruby) • All open source • All well supported • All used in large operations • (Same rules apply elsewhere) Web 2.0 Expo Berlin, 5th November 2007 19
  • 20. Simple web apps • A Web Application – Or “Web Site” in Web 1.0 terminology Interwebnet App server Database Web 2.0 Expo Berlin, 5th November 2007 20
  • 21. Simple web apps • A Web Application – Or “Web Site” in Web 1.0 terminology Storage array Cache Interwobnet App server Database AJAX!!!1 Web 2.0 Expo Berlin, 5th November 2007 21
  • 22. App servers • App servers scale in two ways: Web 2.0 Expo Berlin, 5th November 2007 22
  • 23. App servers • App servers scale in two ways: – Really well Web 2.0 Expo Berlin, 5th November 2007 23
  • 24. App servers • App servers scale in two ways: – Really well – Quite badly Web 2.0 Expo Berlin, 5th November 2007 24
  • 25. App servers • Sessions! – (State) – Local sessions == bad • When they move == quite bad – Centralized sessions == good – No sessions at all == awesome! Web 2.0 Expo Berlin, 5th November 2007 25
  • 26. Local sessions • Stored on disk – PHP sessions • Stored in memory – Shared memory block (APC) • Bad! – Can’t move users – Can’t avoid hotspots – Not fault tolerant Web 2.0 Expo Berlin, 5th November 2007 26
  • 27. Mobile local sessions • Custom built – Store last session location in cookie – If we hit a different server, pull our session information across • If your load balancer has sticky sessions, you can still get hotspots – Depends on volume – fewer heavier users hurt more Web 2.0 Expo Berlin, 5th November 2007 27
  • 28. Remote centralized sessions • Store in a central database – Or an in-memory cache • No porting around of session data • No need for sticky sessions • No hot spots • Need to be able to scale the data store – But we’ve pushed the issue down the stack Web 2.0 Expo Berlin, 5th November 2007 28
  • 29. No sessions • Stash it all in a cookie! • Sign it for safety – $data = $user_id . ‘-’ . $user_name; – $time = time(); – $sig = sha1($secret . $time . $data); – $cookie = base64(“$sig-$time-$data”); – Timestamp means it’s simple to expire it Web 2.0 Expo Berlin, 5th November 2007 29
  • 30. Super slim sessions • If you need more than the cookie (login status, user id, username), then pull their account row from the DB – Or from the account cache • None of the drawbacks of sessions • Avoids the overhead of a query per page – Great for high-volume pages which need little personalization – Turns out you can stick quite a lot in a cookie too – Pack with base64 and it’s easy to delimit fields Web 2.0 Expo Berlin, 5th November 2007 30
  • 31. App servers • The Rasmus way – App server has ‘shared nothing’ – Responsibility pushed down the stack • Ooh, the stack Web 2.0 Expo Berlin, 5th November 2007 31
  • 32. Trifle Web 2.0 Expo Berlin, 5th November 2007 32
  • 33. Trifle Fruit / Presentation Cream / Markup Custard / Page Logic Jelly / Business Logic Sponge / Database Web 2.0 Expo Berlin, 5th November 2007 33
  • 34. Trifle Fruit / Presentation Cream / Markup Custard / Page Logic Jelly / Business Logic Sponge / Database Web 2.0 Expo Berlin, 5th November 2007 34
  • 35. App servers Web 2.0 Expo Berlin, 5th November 2007 35
  • 36. App servers Web 2.0 Expo Berlin, 5th November 2007 36
  • 37. App servers Web 2.0 Expo Berlin, 5th November 2007 37
  • 38. Well, that was easy • Scaling the web app server part is easy • The rest is the trickier part – Database – Serving static content – Storing static content Web 2.0 Expo Berlin, 5th November 2007 38
  • 39. The others • Other services scale similarly to web apps – That is, horizontally • The canonical examples: – Image conversion – Audio transcoding – Video transcoding – Web crawling – Compute! Web 2.0 Expo Berlin, 5th November 2007 39
  • 40. Amazon • Let’s talk about Amazon – S3 - Storage – EC2 – Compute! (XEN based) – SQS – Queueing • All horizontal • Cheap when small – Not cheap at scale Web 2.0 Expo Berlin, 5th November 2007 40
  • 41. 3. Load Balancing Web 2.0 Expo Berlin, 5th November 2007 41
  • 42. Load balancing • If we have multiple nodes in a class, we need to balance between them • Hardware or software • Layer 4 or 7 Web 2.0 Expo Berlin, 5th November 2007 42
  • 43. Hardware LB • A hardware appliance – Often a pair with heartbeats for HA • Expensive! – But offers high performance • Many brands – Alteon, Cisco, Netscalar, Foundry, etc – L7 - web switches, content switches, etc Web 2.0 Expo Berlin, 5th November 2007 43
  • 44. Software LB • Just some software – Still needs hardware to run on – But can run on existing servers • Harder to have HA – Often people stick hardware LB’s in front – But Wackamole helps here Web 2.0 Expo Berlin, 5th November 2007 44
  • 45. Software LB • Lots of options – Pound – Perlbal – Apache with mod_proxy • Wackamole with mod_backhand – http://backhand.org/wackamole/ – http://backhand.org/mod_backhand/ Web 2.0 Expo Berlin, 5th November 2007 45
  • 46. Wackamole Web 2.0 Expo Berlin, 5th November 2007 46
  • 47. Wackamole Web 2.0 Expo Berlin, 5th November 2007 47
  • 48. The layers • Layer 4 – A ‘dumb’ balance • Layer 7 – A ‘smart’ balance • OSI stack, routers, etc Web 2.0 Expo Berlin, 5th November 2007 48
  • 49. 4. Queuing Web 2.0 Expo Berlin, 5th November 2007 49
  • 50. Parallelizable == easy! • If we can transcode/crawl in parallel, it’s easy – But think about queuing – And asynchronous systems – The web ain’t built for slow things – But still, a simple problem Web 2.0 Expo Berlin, 5th November 2007 50
  • 51. Synchronous systems Web 2.0 Expo Berlin, 5th November 2007 51
  • 52. Asynchronous systems Web 2.0 Expo Berlin, 5th November 2007 52
  • 53. Helps with peak periods Web 2.0 Expo Berlin, 5th November 2007 53
  • 54. Synchronous systems Web 2.0 Expo Berlin, 5th November 2007 54
  • 55. Asynchronous systems Web 2.0 Expo Berlin, 5th November 2007 55
  • 56. Asynchronous systems Web 2.0 Expo Berlin, 5th November 2007 56
  • 57. 5. Relational Data Web 2.0 Expo Berlin, 5th November 2007 57
  • 58. Databases • Unless we’re doing a lot of file serving, the database is the toughest part to scale • If we can, best to avoid the issue altogether and just buy bigger hardware • Dual Opteron/Intel64 systems with 16+GB of RAM can get you a long way Web 2.0 Expo Berlin, 5th November 2007 58
  • 59. More read power • Web apps typically have a read/write ratio of somewhere between 80/20 and 90/10 • If we can scale read capacity, we can solve a lot of situations • MySQL replication! Web 2.0 Expo Berlin, 5th November 2007 59
  • 60. Master-Slave Replication Web 2.0 Expo Berlin, 5th November 2007 60
  • 61. Master-Slave Replication Reads and Writes Reads Web 2.0 Expo Berlin, 5th November 2007 61
  • 62. Master-Slave Replication Web 2.0 Expo Berlin, 5th November 2007 62
  • 63. Master-Slave Replication Web 2.0 Expo Berlin, 5th November 2007 63
  • 64. Master-Slave Replication Web 2.0 Expo Berlin, 5th November 2007 64
  • 65. Master-Slave Replication Web 2.0 Expo Berlin, 5th November 2007 65
  • 66. Master-Slave Replication Web 2.0 Expo Berlin, 5th November 2007 66
  • 67. Master-Slave Replication Web 2.0 Expo Berlin, 5th November 2007 67
  • 68. Master-Slave Replication Web 2.0 Expo Berlin, 5th November 2007 68
  • 69. Master-Slave Replication Web 2.0 Expo Berlin, 5th November 2007 69
  • 70. 6. Caching Web 2.0 Expo Berlin, 5th November 2007 70
  • 71. Caching • Caching avoids needing to scale! – Or makes it cheaper • Simple stuff – mod_perl / shared memory • Invalidation is hard – MySQL query cache • Bad performance (in most cases) Web 2.0 Expo Berlin, 5th November 2007 71
  • 72. Caching • Getting more complicated… – Write-through cache – Write-back cache – Sideline cache Web 2.0 Expo Berlin, 5th November 2007 72
  • 73. Write-through cache Web 2.0 Expo Berlin, 5th November 2007 73
  • 74. Write-back cache Web 2.0 Expo Berlin, 5th November 2007 74
  • 75. Sideline cache Web 2.0 Expo Berlin, 5th November 2007 75
  • 76. Sideline cache • Easy to implement – Just add app logic • Need to manually invalidate cache – Well designed code makes it easy • Memcached – From Danga (LiveJournal) – http://www.danga.com/memcached/ Web 2.0 Expo Berlin, 5th November 2007 76
  • 77. Memcache schemes • Layer 4 – Good: Cache can be local on a machine – Bad: Invalidation gets more expensive with node count – Bad: Cache space wasted by duplicate objects Web 2.0 Expo Berlin, 5th November 2007 77
  • 78. Memcache schemes • Layer 7 – Good: No wasted space – Good: linearly scaling invalidation – Bad: Multiple, remote connections • Can be avoided with a proxy layer – Gets more complicated » Last indentation level! Web 2.0 Expo Berlin, 5th November 2007 78
  • 79. 7. HA Data Web 2.0 Expo Berlin, 5th November 2007 79
  • 80. But what about HA? Web 2.0 Expo Berlin, 5th November 2007 80
  • 81. But what about HA? Web 2.0 Expo Berlin, 5th November 2007 81
  • 82. SPOF! • The key to HA is avoiding SPOFs – Identify – Eliminate • Some stuff is hard to solve – Fix it further up the tree • Dual DCs solves Router/Switch SPOF Web 2.0 Expo Berlin, 5th November 2007 82
  • 83. Master-Master Web 2.0 Expo Berlin, 5th November 2007 83
  • 84. Master-Master • Either hot/warm or hot/hot • Writes can go to either – But avoid collisions – No auto-inc columns for hot/hot • Bad for hot/warm too • Unless you have MySQL 5 – But you can’t rely on the ordering! – Design schema/access to avoid collisions • Hashing users to servers Web 2.0 Expo Berlin, 5th November 2007 84
  • 85. Rings • Master-master is just a small ring – With 2 nodes • Bigger rings are possible – But not a mesh! – Each slave may only have a single master – Unless you build some kind of manual replication Web 2.0 Expo Berlin, 5th November 2007 85
  • 86. Rings Web 2.0 Expo Berlin, 5th November 2007 86
  • 87. Rings Web 2.0 Expo Berlin, 5th November 2007 87
  • 88. Dual trees • Master-master is good for HA – But we can’t scale out the reads (or writes!) • We often need to combine the read scaling with HA • We can simply combine the two models Web 2.0 Expo Berlin, 5th November 2007 88
  • 89. Dual trees Web 2.0 Expo Berlin, 5th November 2007 89
  • 90. Cost models • There’s a problem here – We need to always have 200% capacity to avoid a SPOF • 400% for dual sites! – This costs too much • Solution is straight forward – Make sure clusters are bigger than 2 Web 2.0 Expo Berlin, 5th November 2007 90
  • 91. N+M • N+M – N = nodes needed to run the system – M = nodes we can afford to lose • Having M as big as N starts to suck – If we could make each node smaller, we can increase N while M stays constant – (We assume smaller nodes are cheaper) Web 2.0 Expo Berlin, 5th November 2007 91
  • 92. 1+1 = 200% hardware Web 2.0 Expo Berlin, 5th November 2007 92
  • 93. 3+1 = 133% hardware Web 2.0 Expo Berlin, 5th November 2007 93
  • 94. Meshed masters • Not possible with regular MySQL out-of- the-box today • But there is hope! – NBD (MySQL Cluster) allows a mesh – Support for replication out to slaves in a coming version • RSN! Web 2.0 Expo Berlin, 5th November 2007 94
  • 95. 8. Federation Web 2.0 Expo Berlin, 5th November 2007 95
  • 96. Data federation • At some point, you need more writes – This is tough – Each cluster of servers has limited write capacity • Just add more clusters! Web 2.0 Expo Berlin, 5th November 2007 96
  • 97. Simple things first • Vertical partitioning – Divide tables into sets that never get joined – Split these sets onto different server clusters – Voila! • Logical limits – When you run out of non-joining groups – When a single table grows too large Web 2.0 Expo Berlin, 5th November 2007 97
  • 98. Data federation • Split up large tables, organized by some primary object – Usually users • Put all of a user’s data on one ‘cluster’ – Or shard, or cell • Have one central cluster for lookups Web 2.0 Expo Berlin, 5th November 2007 98
  • 99. Data federation Web 2.0 Expo Berlin, 5th November 2007 99
  • 100. Data federation • Need more capacity? – Just add shards! – Don’t assign to shards based on user_id! • For resource leveling as time goes on, we want to be able to move objects between shards – Maybe – not everyone does this – ‘Lockable’ objects Web 2.0 Expo Berlin, 5th November 2007 100
  • 101. The wordpress.com approach • Hash users into one of n buckets – Where n is a power of 2 • Put all the buckets on one server • When you run out of capacity, split the buckets across two servers • Then you run out of capacity, split the buckets across four servers • Etc Web 2.0 Expo Berlin, 5th November 2007 101
  • 102. Data federation • Heterogeneous hardware is fine – Just give a larger/smaller proportion of objects depending on hardware • Bigger/faster hardware for paying users – A common approach – Can also allocate faster app servers via magic cookies at the LB Web 2.0 Expo Berlin, 5th November 2007 102
  • 103. Downsides • Need to keep stuff in the right place • App logic gets more complicated • More clusters to manage – Backups, etc • More database connections needed per page – Proxy can solve this, but complicated • The dual table issue – Avoid walking the shards! Web 2.0 Expo Berlin, 5th November 2007 103
  • 104. Bottom line Data federation is how large applications are scaled Web 2.0 Expo Berlin, 5th November 2007 104
  • 105. Bottom line • It’s hard, but not impossible • Good software design makes it easier – Abstraction! • Master-master pairs for shards give us HA • Master-master trees work for central cluster (many reads, few writes) Web 2.0 Expo Berlin, 5th November 2007 105
  • 106. 9. Multi-site HA Web 2.0 Expo Berlin, 5th November 2007 106
  • 107. Multiple Datacenters • Having multiple datacenters is hard – Not just with MySQL • Hot/warm with MySQL slaved setup – But manual (reconfig on failure) • Hot/hot with master-master – But dangerous (each site has a SPOF) • Hot/hot with sync/async manual replication – But tough (big engineering task) Web 2.0 Expo Berlin, 5th November 2007 107
  • 108. Multiple Datacenters Web 2.0 Expo Berlin, 5th November 2007 108
  • 109. GSLB • Multiple sites need to be balanced – Global Server Load Balancing • Easiest are AkaDNS-like services – Performance rotations – Balance rotations Web 2.0 Expo Berlin, 5th November 2007 109
  • 110. 10. Serving Files Web 2.0 Expo Berlin, 5th November 2007 110
  • 111. Serving lots of files • Serving lots of files is not too tough – Just buy lots of machines and load balance! • We’re IO bound – need more spindles! – But keeping many copies of data in sync is hard – And sometimes we have other per-request overhead (like auth) Web 2.0 Expo Berlin, 5th November 2007 111
  • 112. Reverse proxy Web 2.0 Expo Berlin, 5th November 2007 112
  • 113. Reverse proxy • Serving out of memory is fast! – And our caching proxies can have disks too – Fast or otherwise • More spindles is better • We stay in sync automatically • We can parallelize it! – 50 cache servers gives us 50 times the serving rate of the origin server – Assuming the working set is small enough to fit in memory in the cache cluster Web 2.0 Expo Berlin, 5th November 2007 113
  • 114. Invalidation • Dealing with invalidation is tricky • We can prod the cache servers directly to clear stuff out – Scales badly – need to clear asset from every server – doesn’t work well for 100 caches Web 2.0 Expo Berlin, 5th November 2007 114
  • 115. Invalidation • We can change the URLs of modified resources – And let the old ones drop out cache naturally – Or prod them out, for sensitive data • Good approach! – Avoids browser cache staleness – Hello Akamai (and other CDNs) – Read more: • http://www.thinkvitamin.com/features/webapps/serving-javascript-fast Web 2.0 Expo Berlin, 5th November 2007 115
  • 116. Reverse proxy • Choices – L7 load balancer & Squid • http://www.squid-cache.org/ – mod_proxy & mod_cache • http://www.apache.org/ – Perlbal and Memcache? • http://www.danga.com/ Web 2.0 Expo Berlin, 5th November 2007 116
  • 117. High overhead serving • What if you need to authenticate your asset serving? – Private photos – Private data – Subscriber-only files • Two main approaches – Proxies w/ tokens – Path translation Web 2.0 Expo Berlin, 5th November 2007 117
  • 118. Perlbal backhanding • Perlbal can do redirection magic – Client sends request to Perbal – Perlbl plugin verifies user credentials • token, cookies, whatever • tokens avoid data-store access – Perlbal goes to pick up the file from elsewhere – Transparent to user Web 2.0 Expo Berlin, 5th November 2007 118
  • 119. Perlbal backhanding Web 2.0 Expo Berlin, 5th November 2007 119
  • 120. Perlbal backhanding • Doesn’t keep database around while serving • Doesn’t keep app server around while serving • User doesn’t find out how to access asset directly Web 2.0 Expo Berlin, 5th November 2007 120
  • 121. Permission URLs • But why bother!? • If we bake the auth into the URL then it saves the auth step • We can do the auth on the web app servers when creating HTML • Just need some magic to translate to paths • We don’t want paths to be guessable Web 2.0 Expo Berlin, 5th November 2007 121
  • 122. Permission URLs Web 2.0 Expo Berlin, 5th November 2007 122
  • 123. Permission URLs (or mod_perl) Web 2.0 Expo Berlin, 5th November 2007 123
  • 124. Permission URLs • Downsides – URL gives permission for life – Unless you bake in tokens • Tokens tend to be non-expirable – We don’t want to track every token » Too much overhead • But can still expire • Upsides – It works – Scales nicely Web 2.0 Expo Berlin, 5th November 2007 124
  • 125. 11. Storing Files Web 2.0 Expo Berlin, 5th November 2007 125
  • 126. Storing lots of files • Storing files is easy! – Get a big disk – Get a bigger disk – Uh oh! • Horizontal scaling is the key – Again Web 2.0 Expo Berlin, 5th November 2007 126
  • 127. Connecting to storage • NFS – Stateful == Sucks – Hard mounts vs Soft mounts, INTR • SMB / CIFS / Samba – Turn off MSRPC & WINS (NetBOIS NS) – Stateful but degrades gracefully • HTTP – Stateless == Yay! – Just use Apache Web 2.0 Expo Berlin, 5th November 2007 127
  • 128. Multiple volumes • Volumes are limited in total size – Except (in theory) under ZFS & others • Sometimes we need multiple volumes for performance reasons – When using RAID with single/dual parity • At some point, we need multiple volumes Web 2.0 Expo Berlin, 5th November 2007 128
  • 129. Multiple volumes Web 2.0 Expo Berlin, 5th November 2007 129
  • 130. Multiple hosts • Further down the road, a single host will be too small • Total throughput of machine becomes an issue • Even physical space can start to matter • So we need to be able to use multiple hosts Web 2.0 Expo Berlin, 5th November 2007 130
  • 131. Multiple hosts Web 2.0 Expo Berlin, 5th November 2007 131
  • 132. HA Storage • HA is important for assets too – We can back stuff up – But we tend to want hot redundancy • RAID is good – RAID 5 is cheap, RAID 10 is fast Web 2.0 Expo Berlin, 5th November 2007 132
  • 133. HA Storage • But whole machines can fail • So we stick assets on multiple machines • In this case, we can ignore RAID – In failure case, we serve from alternative source – But need to weigh up the rebuild time and effort against the risk – Store more than 2 copies? Web 2.0 Expo Berlin, 5th November 2007 133
  • 134. HA Storage Web 2.0 Expo Berlin, 5th November 2007 134
  • 135. Self repairing systems • When something fails, repairing can be a pain – RAID rebuilds by itself, but machine replication doesn’t • The big appliances self heal – NetApp, StorEdge, etc • So does MogileFS (reaper) Web 2.0 Expo Berlin, 5th November 2007 135
  • 136. 12. Field Work Web 2.0 Expo Berlin, 5th November 2007 136
  • 137. Real world examples • Flickr – Because I know it • LiveJournal – Because everyone copies it Web 2.0 Expo Berlin, 5th November 2007 137
  • 138. Flickr Architecture Web 2.0 Expo Berlin, 5th November 2007 138
  • 139. Flickr Architecture Web 2.0 Expo Berlin, 5th November 2007 139
  • 140. LiveJournal Architecture Web 2.0 Expo Berlin, 5th November 2007 140
  • 141. LiveJournal Architecture Web 2.0 Expo Berlin, 5th November 2007 141
  • 142. Buy my book! Web 2.0 Expo Berlin, 5th November 2007 142
  • 143. Or buy Theo’s Web 2.0 Expo Berlin, 5th November 2007 143
  • 144. The end! Web 2.0 Expo Berlin, 5th November 2007 144
  • 145. Awesome! These slides are available online: iamcal.com/talks/ Web 2.0 Expo Berlin, 5th November 2007 145