MNPHP Scalable Architecture 101 - Feb 3 2011

An overview of scaling out your system, starting from a single server and covering many of the options you may face.


Transcript

  • 1. Scalable Architectures 101, MNPHP, Feb 3, 2011
      Mike Willbanks
      Blog: http://blog.digitalstruct.com
      Twitter: mwillbanks
      IRC: lubs on freenode
  • 2. Scalability?
      Your application is growing, your systems are slowing and growth is inevitable...
      • Where do we go from here?
          • Load Balancing
          • Web Servers
          • Database Servers
          • Cache Servers
          • Job Servers
          • DNS Servers
          • CDN Servers
          • Front-End Performance
  • 9. The Beginning...
      Single Server Syndrome
      • One Server, Many Functions
          • Web Server, Database Server, Cache Server, Job Server, DNS Server, Mail Server...
      • How we know it's time
          • iostat, CPU load, overall degradation
  • 10. The Next Step...
      Single Separation Syndrome
      • Separation of Web and Database
          • Fixes the main disk I/O bottleneck.
      • However, we still can't handle our current I/O, CPU load or number of requests on our web server.
  • 11. Load Balancing
  • 12. Load Balancing Our Environment
  • 13. Several Options
      • DNS Rotation (Little to No Cost)
          • Not very reliable, but works on a small scale.
      • Software Based (Commodity Server Cost)
          • HAProxy, Pound, Varnish, Squid, Wackamole, Perlbal, Web Server Proxy...
      • Hardware Based (High Cost Appliance)
          • Several vendors, ranging based on need.
              • A10, F5, etc.
  • 14. Routing Types of Load Balancers
      • Round Robin
      • Static
      • Least Connections
      • Source
      • IP
      • Basic Authentication
      • URI
      • URI Parameter
      • Header
      • Cookie
      • Regular Expression
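
Two of the routing types listed above, round robin and least connections, can be sketched in a few lines. This is a minimal illustration, not how any particular load balancer implements them; the class names and the `web1`/`web2` backend labels are hypothetical.

```python
from itertools import cycle


class RoundRobin:
    """Hand out backends in a fixed rotation, one per request."""

    def __init__(self, backends):
        self._rotation = cycle(backends)

    def pick(self):
        return next(self._rotation)


class LeastConnections:
    """Pick the backend that currently has the fewest in-flight requests."""

    def __init__(self, backends):
        # Hypothetical bookkeeping: open connection count per backend.
        self.active = {backend: 0 for backend in backends}

    def pick(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when the request finishes so the count drops again.
        self.active[backend] -= 1
```

Round robin ignores how busy each server is, which is fine when requests cost about the same; least connections adapts when some requests are much slower than others.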
  • 24. Open Source Software Options
      • Out of the many options, we will focus on 3:
          • HAProxy – By and large one of the most popular.
          • Pound – Said to be great for medium-traffic sites.
          • Varnish – A caching solution that also does load balancing.
  • 27. HAProxy
      • Pros
          • Extremely full featured
          • Very well known
          • Handles just about every type of routing
          • Several examples online
          • Has a web-based GUI
      • Cons
          • No native SSL support (use Stunnel)
          • Setup can be complex and take a lot of time
  • 33. Sample HAProxy Configuration
      global
          log 127.0.0.1 local0
          log 127.0.0.1 local1 notice
          maxconn 4096
          user haproxy
          group haproxy
          daemon
      defaults
          log global
          mode http
          option httplog
          option dontlognull
          retries 3
          option redispatch
          maxconn 2000
          contimeout 5000
          clitimeout 50000
          srvtimeout 50000
      listen localhost 0.0.0.0:80
          option httpchk GET /
          balance roundrobin
          cookie SERVERID
          server serv1 0.0.0.0:8080 check inter 2000 rise 2 fall 5
          server serv2 0.0.0.0:8080 check inter 2000 rise 2 fall 5
          option httpclose
          stats enable
          stats uri /lb?stats
          stats realm haproxy
          stats auth test:test
  • 34. Pound
      • Pros
          • chroot support
          • Native SSL support
          • Insanely simple setup
          • Supports virtually all types of routing
          • Many online tutorials
      • Cons
          • No native SSL support (use Stunnel)
          • Setup can be complex and take a lot of time
  • 40. Sample Pound Configuration
      User "www-data"
      Group "www-data"
      LogLevel 1
      Alive 30
      Control "/var/run/pound/poundctl.socket"
      ListenHTTP
          Address 127.0.0.1
          Port 80
          xHTTP 0
          Service
              BackEnd
                  Address 127.0.0.1
                  Port 8080
              End
              BackEnd
                  Address 127.0.0.1
                  Port 8080
              End
          End
      End
  • 41. Varnish
      • Pros
          • Supports front-end caching
          • Fairly simple setup
          • Extremely well known
          • Many online tutorials
          • Large suite of tools (varnishstat, varnishtop, varnishlog, varnishreplay, varnishncsa)
      • Cons
          • No native SSL support (use Pound or Stunnel)
          • If you want a web GUI you must PAY
  • 47. Sample Varnish Configuration
      backend default1 {
          .host = "127.0.0.1";
          .port = "8080";
          .probe = {
              .url = "/";
              .interval = 5s;
              .timeout = 1s;
              .window = 5;
              .threshold = 3;
          }
      }
      backend default2 {
          .host = "127.0.0.1";
          .port = "8080";
          .probe = {
              .url = "/";
              .interval = 5s;
              .timeout = 1s;
              .window = 5;
              .threshold = 3;
          }
      }
      director default round-robin {
          { .backend = default1; }
          { .backend = default2; }
      }
      sub vcl_recv {
          if (req.http.host ~ "^127.0.0.1$") {
              set req.backend = default;
          }
      }
  • 48. What We Need to Remember
      • Web Servers
          • One always needs to be available
          • Don't use SSL on the web server level!
      • Headers
          • Pass a header indicating whether SSL is on or not
          • Client IP is likely in X-Forwarded-For
          • If using Virtual Hosts, pass the Host header
      • Sessions
          • Need a solution if not using sticky routing
  • 52. Web Servers
  • 53. Several Options
      • Apache
      • IIS
      • Nginx
      • Lighttpd
      • etc.
  • 58. Configuration
      • Server name should be the same on all servers
          • Make a server alias so you can reach individual servers w/o load balancing
      • Each configuration SHOULD or MUST be the same.
      • Client IP will likely be in X-Forwarded-For.
      • SSL state will not be in $_SERVER['HTTPS']; look for it in an HTTP_ header instead.
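
Behind a load balancer the application has to recover the client IP and the SSL state from forwarded headers. The sketch below shows the idea in a language-neutral way; it assumes the balancer sets `X-Forwarded-For` and a protocol header named `X-Forwarded-Proto` (a common convention, but the header name is an assumption here, not something the slides specify). The function names are hypothetical.

```python
def client_ip(headers, remote_addr):
    """Return the left-most X-Forwarded-For entry, else the socket address."""
    forwarded = headers.get("X-Forwarded-For", "")
    first = forwarded.split(",")[0].strip()
    return first or remote_addr


def is_https(headers):
    """True when the balancer reports that the original request used SSL."""
    # Assumption: the balancer sets X-Forwarded-Proto to "http" or "https".
    return headers.get("X-Forwarded-Proto", "").lower() == "https"
```

Only trust these headers on requests that actually arrive from your own load balancer, since any client can send them.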
  • 61. What We Need to Remember
      • Files
          • All web servers need our files.
          • Static content could be tagged in version control.
          • Static content may need a file server / CDN / etc.
          • User-generated content on an NFS mount, or served from the cloud or a CDN.
      • Sessions
          • All web servers need access to our sessions.
          • Remember disk is slow and the database will be a bottleneck. How about distributed caching?
  • 66. Other Thoughts
      • Running PHP on your web server may be a resource hog; you may want to offload static content requests to nginx, lighttpd or some other lightweight web server.
          • Proxying to your main web servers works great for the hard-working processes, while static content is served from the lightweight server.
  • 67. Database Servers
  • 68. Where We All Start
      Single Database Server
      • Lots of options and steps as we move forward.
  • 69. Replication
      Single Master, Single Slave
      • Write code that can write to the master and read from the slave.
          • Exception: be smart; don't write to the master and then read from the slave on the table you just wrote to.
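
The read/write split and its exception can be sketched as a small router. This is an illustration under stated assumptions: `ConnectionRouter` and its methods are hypothetical names, the connection handles are whatever your database library returns, and the "written tables" set is meant to live for a single request.

```python
class ConnectionRouter:
    """Send writes to the master and reads to the slave, except for
    tables already written during this request (replication may lag)."""

    def __init__(self, master, slave):
        self.master = master
        self.slave = slave
        self._written_tables = set()

    def for_write(self, table):
        # Remember the table so later reads in this request avoid the slave.
        self._written_tables.add(table)
        return self.master

    def for_read(self, table):
        if table in self._written_tables:
            return self.master
        return self.slave
```

Create one router per request so the pinning to the master does not outlive the window in which lag actually matters.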
  • 70. Multiple Slaves
      Single Master, Multiple Slaves
      • It is a great time to start implementing connection pooling.
  • 71. Multiple Masters
      Multiple Masters, Multiple Slaves
      • Do NOT write to both masters at once with MySQL!
      • Be warned: auto-increment settings must now change so that IDs generated on different masters do not conflict.
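
MySQL exposes the `auto_increment_increment` and `auto_increment_offset` server variables for this purpose: each master gets its own offset and all share the same increment. The sketch below only simulates the resulting ID sequences; the `id_sequence` helper is hypothetical and simplifies how the server actually allocates values.

```python
from itertools import count, islice


def id_sequence(offset, increment):
    """Simulate the IDs one master generates: offset, offset + increment, ..."""
    return count(start=offset, step=increment)


# Two masters: same increment (number of masters), different offsets.
master_a = list(islice(id_sequence(offset=1, increment=2), 4))
master_b = list(islice(id_sequence(offset=2, increment=2), 4))
```

With two masters, one produces only odd IDs and the other only even IDs, so rows inserted on either side never collide on the primary key.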
  • 73. Partitioning
      Segmenting your Data
      • Vertical Partitioning
          • Move less-accessed columns, large data columns and columns unlikely to appear in a WHERE clause to other tables.
      • Horizontal Partitioning
          • Done by moving rows into different tables.
              • Based on Range, Date, User or Interlaced
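
Horizontal partitioning comes down to a function that maps a row to the table holding it. A minimal sketch of two of the schemes named above, interlaced (modulo on the user ID) and date based; the table names `messages_N` and `events_YYYY_MM` are hypothetical.

```python
from datetime import date


def table_for_user(user_id, partitions=4):
    """Interlaced partitioning: spread users evenly across N tables."""
    return f"messages_{user_id % partitions}"


def table_for_date(day):
    """Date partitioning: one table per calendar month."""
    return f"events_{day:%Y_%m}"
```

For example, `table_for_user(10)` yields `messages_2` and `table_for_date(date(2011, 2, 3))` yields `events_2011_02`. Changing the partition count later means moving rows, so pick it with growth in mind.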
  • 74. What We Need to Remember
      • Replication
          • There may be a lag!
          • All reports / read queries should go here
          • Don't read here directly after a write
              • Transactions / Lag / etc.
      • Sessions
          • Never store sessions in the DB
              • Large binlogs, garbage collection causes slow queries, queue may fill up and cause a crash or max connections.
  • 77. Cache Servers (not full page)
  • 78. Caching
      “Caching is imperative in scaling and performance”
      • Single Server
          • Shared Memory: APC / XCache / etc.
          • File Based: Files / SQLite / etc.
          • Not highly scalable, great for configuration files.
      • Distributed
          • Memcached, Redis, etc.
          • Set up consistent hashing.
      • Do not cache what cannot be re-created.
  • 82. Caching
      In The Beginning
      • Single Caching Server
      • Start by caching fetches; on write, invalidate the cached entry and write the new value, so reads always come from the cache.
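
The read and write paths described above can be sketched as follows. Plain dictionaries stand in for the database and the cache server, and `CachedUserStore` is a hypothetical name; a real setup would talk to a cache such as Memcached through a client library.

```python
class CachedUserStore:
    """Read through the cache; refresh the cached entry on every write."""

    def __init__(self, database, cache):
        self.database = database  # dict standing in for the database
        self.cache = cache        # dict standing in for the cache server

    def get(self, user_id):
        key = f"user:{user_id}"
        if key in self.cache:
            return self.cache[key]          # cache hit
        value = self.database[user_id]      # cache miss: fetch from the DB
        self.cache[key] = value
        return value

    def save(self, user_id, value):
        self.database[user_id] = value
        # Replace the stale entry so the next read sees the new value.
        self.cache[f"user:{user_id}"] = value
```

Because the database stays the source of truth, losing the cache only costs a round of misses, which matches the rule not to cache what cannot be re-created.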
  • 84. Distributed Caching
      Distributed Mania
      • Write based on consistent hashing (a hash of the key that you are writing)
      • The server depends on the hash.
      • Hint – use the memcached PECL extension.
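
To make "the server depends on the hash" concrete, here is a small consistent-hash ring. It is a sketch of the general technique, not of what any particular memcached client does internally; `HashRing`, the `replicas` count and the `cache1`-style server labels are assumptions for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Map keys to servers so that adding a server moves only some keys."""

    def __init__(self, servers, replicas=100):
        # Each server gets several points on the ring for an even spread.
        self._ring = sorted(
            (self._hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def server_for(self, key):
        # First ring point at or after the key's hash, wrapping around.
        index = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[index][1]
```

When a server is added, a key either stays where it was or moves to the new server; with a plain `hash(key) % server_count` scheme nearly every key would move. In practice the client library can usually do this for you.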
  • 87. The Read / Write Process
      In its simplest form...
  • 88. What We Need to Remember
      • Replicated or not...
      • Elasticity
          • Consistent hashing – cannot add or remove servers w/o losing data
      • Sessions
          • Store me here... please please please!
      • Memory Caches
          • Durability – if it fails, it's gone!
          • Ensure dedicated memory!
          • If you run out of memory, does it remove an old entry and add the new one, or refuse anything new?
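
The last question above describes two different full-cache policies. The sketch below contrasts them in one small class; `BoundedCache` and its `evict_oldest` flag are hypothetical, and capacity is counted in entries rather than bytes to keep the example short.

```python
from collections import OrderedDict


class BoundedCache:
    """A fixed-size cache that either evicts the least recently used
    entry or refuses new entries once it is full."""

    def __init__(self, capacity, evict_oldest=True):
        self.capacity = capacity
        self.evict_oldest = evict_oldest
        self._items = OrderedDict()

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
            self._items[key] = value
            return True
        if len(self._items) >= self.capacity:
            if not self.evict_oldest:
                return False                  # full: reject the new entry
            self._items.popitem(last=False)   # full: drop the oldest entry
        self._items[key] = value
        return True

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)          # mark as recently used
        return self._items[key]
```

Which behaviour you want depends on the data: eviction suits re-creatable cache entries, while silent eviction of sessions logs users out.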
  • 92. Job Servers
  • 93. “Message queues and mailboxes are software-engineering components used for interprocess communication, or for inter-thread communication within the same process. They use a queue for messaging – the passing of control or of content.” http://en.wikipedia.org/wiki/Message_queue
  • 94. Messages are Everywhere
  • 95. What are Message Queues
      • A FIFO buffer
      • Asynchronous push / pull
      • An application framework for sending and receiving messages.
      • A way to communicate between applications / systems.
      • A way to decouple components.
      • A way to offload work.
  • 101. Where We All Start
      Single Job Server
      • Lots of options and steps as we move forward.
      Diagram: Producer → (queue) → Message Queue Server → (receive) → Consumer
  • 102. Distributed Job Servers
      Distributed Mania
      • Load balance a message queue for scale
      • Can continue to create more workers
      Diagram: several Producers → several Message Queue Servers → several Consumers
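
The producer, queue and worker roles can be sketched inside a single process. This is only an in-process stand-in: Python's `queue.Queue` plays the part of the message queue server, threads play the workers, and uppercasing a string stands in for real work such as resizing an image. The `run` and `worker` names are hypothetical.

```python
import queue
import threading


def worker(jobs, results):
    """Consume jobs until a None sentinel says there is no more work."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        results.append(job.upper())  # stand-in for the real work
        jobs.task_done()


def run(messages, worker_count=3):
    jobs = queue.Queue()             # FIFO buffer between producer and workers
    results = []
    threads = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for message in messages:         # the producer side
        jobs.put(message)
    for _ in threads:                # one sentinel per worker
        jobs.put(None)
    jobs.join()
    for thread in threads:
        thread.join()
    return results
```

The producer returns as soon as a job is queued, and throughput scales by raising `worker_count`, which is the same lever as adding consumers behind a real queue server.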
  • 104. Why are Message Queues Useful?
      • Asynchronous Processing
      • Communication between Applications / Systems
      • Image Resizing
      • Video Processing
      • Sending out Emails
      • Auto-Scaling Virtual Instances
      • Log Analysis
      • The list goes on...
  • 112. What We Need to Remember
      • Replication or not?
      • You need to keep your workers running
          • Supervisord or monit or some other monitoring...
      • Don't offload things just to offload
          • If it needs to be real-time rather than near real-time, a queue is not a good place for it – however, your boss does not need to know :)
  • 114. DNS Servers
  • 115. What to do
      • Just about every domain registrar runs DNS
          • DO NOT RUN YOUR OWN!
      • Anycast DNS
          • Anycast is a network addressing and routing scheme whereby data is routed to the "nearest" or "best" destination as viewed by the routing topology.
          • It's sexy, it's sweet and it is FAST!
          • A “cheaper” provider is DNS Made Easy.
              • Yes, the interface is ugly.
  • 118. What to look for...
      • Wildcard support
      • Failover / Distributed
      • CNAME support
      • TXT support
      • Name Server support
  • 123. CDN Servers
  • 124. Why Use a CDN
      • Free your bandwidth
      • Free your server from serving basic files
      • Distributed servers around the globe
  • 127. What you need to know
      • Origin Pull
          • The CDN pulls content from your own web server and stores it on its nodes.
      • PoP Push
          • You upload the content to something like S3, which has a CDN like CloudFront on top of it.
  • 128. What's the best?
      • Depends on your need...
      • Origin Pull is great if you want to maintain all of the content in your web server.
      • PoP Push is great for storing things like user-generated content.
  • 131. Front-End Performance
  • 132. Discussion Points
      • Tactics
          • Minification (JavaScript / CSS)
          • CSS Sprites
          • GZIP
          • Cookies are evil
          • Parallel downloads (using subdomains for serving)
          • HTTP Expires
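
For the parallel-downloads tactic, each asset should always map to the same subdomain, otherwise the browser caches the same file under several URLs. A minimal sketch, assuming two hypothetical static hosts (`static1.example.com` and `static2.example.com`) and a hypothetical `asset_url` helper:

```python
import zlib


def asset_url(path, hosts=("static1.example.com", "static2.example.com")):
    """Pick a host from a stable hash of the path, so a given asset
    is always served from the same subdomain."""
    index = zlib.crc32(path.encode("utf-8")) % len(hosts)
    return f"http://{hosts[index]}{path}"
```

Serving these assets from cookie-free subdomains also covers the "cookies are evil" point, since static requests then carry no cookie headers.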
  • 138. Discussion Points
      • Tools
          • YSlow
          • Firebug
          • Google Page Speed
          • Google Webmaster Tools
  • 142. Questions?
      Mike Willbanks
      Blog: http://blog.digitalstruct.com
      Twitter: mwillbanks
      IRC: lubs on freenode
