Caching, sharding, distributing - Scaling best practices.
    18.11.2009

    Lars Jankowfsky

    CTO swoodoo AG




The German travel meta search engine Swoodoo was hit by heavy load spikes caused by TV advertisements. Learn which caching, hosting, and database strategies we implemented successfully and which did not work well. The talk covers file-based caching, APC, memcached, and sharded database layouts, as well as our experiences with fully virtualized hosting.

Transcript of "Caching, sharding, distributing - Scaling best practices"

  1. Caching, sharding, distributing - Scaling best practices.
          18.11.2009
          Lars Jankowfsky
          CTO swoodoo AG
          Wednesday, November 18, 2009
  2. About me:
          PHP, C++, Developer, Software Architect since 1992
          PHP since 1998
          Many successful projects from 2 to 20 developers
          Currently running three projects using eXtreme Programming
          CTO and (Co-)Founder swoodoo AG
          (Co-)Founder OXID eSales AG
  3. LOAD?
          Average 17, Maximum 138
  4. Scaling?
          Scaling, Distributing, Caching, Sharding
  5. (c) istockphoto
  6. Scaling
  7. SOA - Scaling
          Diagram labels: Your App (shown six times)
  8. SOA - Scaling
          Diagram labels: GUI/Frontend, API, Your App, Engine, Database
  9. SOA - Scaling
          Diagram labels: GUI/Frontend (three times), API (twice), Engine, Database
  10. SOA - PRO
          Scalable!
          You can add servers where you need them
          Easier to maintain
          More robust, easy to introduce HA
          Cloud...
  11. SOA - CON
          A lot of work....
          Difficult to test when doing TDD
          Complex deployment
  12. Distributing
  13. Virtual Machines - Distributing
          Diagram labels, in extraction order: GUI, API, Engine, Server 1, GUI, API, Server 2, DB, GUI, API
  14. Virtual Machines - Distributing
          Diagram labels, in extraction order: GUI, API, API, GUI, Server 1, API, Server 2, API, Server 2, GUI, API, API, GUI, Engine, Server 1, GUI, Server 2, DB, Server 2, GUI
  15. Virtual Machines - PRO
          Easy to distribute on new hardware as needed
          Isolated, separated services even on one machine
          Easy to install when using templates (DB, GUI...)
          Very good for testing, staging
  16. Virtual Machines - CON
          Hardware failure....
          Costs (at least for VMware)
          Performance penalty (15%)
          Limitations (VMware only 4 CPUs, vSphere 8...)
          Some resources can't be virtualized (Disk I/O)
  17. Caching
  18. (no text on this slide)
  19. Caching
          Diagram labels: GUI/Frontend, API, Engine, Database
  20. Files - PRO
          simple, easy to start with
          good for a "share nothing" architecture
  21. Files - CON
          hits the HDD
          consumes memory (file system cache)
          local cache, can't be reused by different servers
          manual handling of expiration
          serialization penalty
  22. APC - PRO
          Opcode cache
          Invalidation and size limits are handled automatically
          good for a "share nothing" architecture
  23. APC - CON
          bloats web server (Apache) process memory
          local cache, can't be reused by different servers
  24. memcached - PRO
          can be used by several servers
          Invalidation and size limits are handled automatically
  25. memcached - CON
          network roundtrip penalty
          serialization penalty
  26. Conclusion - Caching
          Diagram labels: File System, APC, memcached
  27. Conclusion - Caching
          Diagram labels, in extraction order: APC, memcached, opcode cache, rarely used, local data
  28. Sharding
  29. Single Table - Database
          Diagram label: Data
  30. Single Table - PRO
          simple, easy to start with
  31. Single Table - CON
          slow
          read/write locking is problematic
          doesn't scale properly
  32. Offline/Online Table - Database
          Online, read only, MYISAM
          Offline, write only, INNODB
          Once per hour
  33. Offline/Online Table - PRO
          simple architecture
          separation between read & write access
          very fast reads
  34. Offline/Online Table - CON
          writes not scalable
          generation process will take longer with more data
          "stale" data might occur in read table, no "live" feeling
          after generation, the read table is "cold" again. Slow!
  35. Sharding #1 Generation - Database
          Flight Server
          master (MEMORY), shown four times
          slave (MYISAM), shown four times
  36. Sharding #1 Generation - PRO
          Scalable!
          Still fast with hundreds of millions of records
          Separates database logic from the system, easily scalable
          Moving, adding, deleting shards on the fly
          A query can be run on various machines in parallel -> Fast!
  37. Sharding #1 Generation - CON
          Queries are limited by shards, you can't join all shards
          Complex to develop, special "protocol" needed for the queries
          Custom queries not possible, no SQL any more in your app.
          Difficult to maintain data (import, export, purge...)
          After failure or power loss it takes a while to rebuild tables
          Memory table leak
  38. Sharding #2 Generation - Database
          Flight Server
          master (INNODB), shown four times
          slave (MYISAM), shown four times
  39. Sharding #2 Generation - PRO
          More stable (INNODB vs. MEMORY)
          Fast failover
          Slave hardware can be used for production shards
  40. Sharding #2 Generation - CON
          Slower (MEMORY faster than INNODB)
          but that's ok, we got additional machines (slaves..)
  41. "Questions?"
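
The caching trade-offs on slides 20 to 27 (a local cache cannot be shared between servers, while a shared cache costs a network round trip and serialization) can be illustrated with a two-tier lookup. The following is a minimal sketch and not the implementation used in the talk: `LocalCache`, `SharedCache`, and `cached_fetch` are hypothetical names, both caches are plain in-process dictionaries standing in for a local store such as APC and a networked store such as memcached, and the talk itself targets PHP while this sketch uses Python.

```python
import pickle
import time


class LocalCache:
    """Per-process cache; a stand-in for a local store such as APC."""

    def __init__(self):
        self._items = {}  # key -> (expires_at, stored_value)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._items.get(key)
        if entry is None or entry[0] <= now:
            self._items.pop(key, None)  # drop expired entries lazily
            return None
        return entry[1]

    def set(self, key, value, ttl, now=None):
        now = time.time() if now is None else now
        self._items[key] = (now + ttl, value)


class SharedCache(LocalCache):
    """Stand-in for a networked cache such as memcached. Values are stored
    as bytes, which is where the serialization penalty comes from."""

    def get(self, key, now=None):
        blob = super().get(key, now)
        return None if blob is None else pickle.loads(blob)

    def set(self, key, value, ttl, now=None):
        super().set(key, pickle.dumps(value), ttl, now)


def cached_fetch(key, local, shared, load, ttl=60):
    """Look in the local cache first, then the shared cache, then call load().

    Note: a value of None is treated as a miss, so None cannot be cached here.
    """
    value = local.get(key)
    if value is not None:
        return value
    value = shared.get(key)
    if value is None:
        value = load(key)            # the expensive source, e.g. a database
        shared.set(key, value, ttl)  # other servers can now reuse the result
    local.set(key, value, ttl)       # later hits on this server skip the network
    return value
```

With two `LocalCache` instances representing two web servers and one `SharedCache` between them, the second server's first request is answered from the shared tier without calling `load` again.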
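
Slide 36 notes that a sharded query can run on several machines in parallel, and slide 37 notes that the application no longer issues plain SQL across all data. A rough sketch of that idea follows. It makes several assumptions that are not in the slides: records are routed by a CRC32 hash of a route string, each shard is an in-memory list rather than a database server, and the names `shard_for`, `ShardedFlights`, and `cheapest` are hypothetical. How swoodoo actually assigned records to shards is not stated in the source.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def shard_for(key, shard_count):
    """Map a key to a shard index. crc32 is stable across processes,
    unlike Python's built-in hash() for strings."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


class ShardedFlights:
    def __init__(self, shard_count):
        # Each list stands in for one database shard.
        self.shards = [[] for _ in range(shard_count)]

    def insert(self, flight):
        route = flight["origin"] + "-" + flight["destination"]
        self.shards[shard_for(route, len(self.shards))].append(flight)

    def cheapest(self, max_price, limit=3):
        """Scatter the query to every shard in parallel, then merge."""

        def query_one(shard):
            hits = [f for f in shard if f["price"] <= max_price]
            # Each shard returns only its own top results.
            return sorted(hits, key=lambda f: f["price"])[:limit]

        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            partials = list(pool.map(query_one, self.shards))

        # The merge step replaces what a single SQL ORDER BY ... LIMIT would do.
        merged = [f for part in partials for f in part]
        return sorted(merged, key=lambda f: f["price"])[:limit]
```

Plain modulo routing as used here forces most records to move when the shard count changes, so adding or removing shards on the fly, as slide 36 describes, would need a different scheme such as a lookup table from key ranges to shards.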