Scalable Web Architectures and Infrastructure

Published in: Technology

  1. Scalable Web Architectures and Infrastructure
     Chris Munt, M/Gateway Developments Ltd
  2. Topics
     • Web Servers
         • Microsoft IIS, Apache, Sun JSWS
         • Development trends
     • Web application development/run-time environments
         • CSP, WebLink
         • Dedicated: JSP, ASP.NET, PHP etc.
         • Pre-existing: Perl, Python, Ruby etc.
     • Creating high-performance and scalable web infrastructure
  3. Evolution
     • “Since 1999, the web has grown from a document retrieval system into an application delivery system.”
         • Douglas Crockford at AjaxWorld 2008
     • Implications for:
         • Web Server extensibility
         • Resilience
         • Performance and scalability
         • Architecture
  4. Web Servers: Market Share
     • 2005:
         • Apache 71%
         • IIS 20%
     • 2008:
         • Apache 50%
         • IIS 35%
     • Source: Netcraft
  5. Internet Information Services (IIS)
     • v1.0 introduced as a free add-on for Windows NT v3.51 (mid 1990s)
         • Single multi-threaded process
     • Extensibility
         • Common Gateway Interface (CGI)
         • Internet Server Application Programming Interface (ISAPI)
  6. IIS: CGI
     • Modules implemented as stand-alone scripts/executables
     • Application requests processed in a separate server process
     • Non-optimal but secure
         • Overhead of starting a new process to serve each request; overhead of inter-process communication
         • Application crash does not impact the hosting web server
     • Architecturally the same for all web servers
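
The isolation property described on the CGI slide can be sketched in a few lines. This is a minimal illustration, not how IIS or any real web server is implemented: the function name `run_cgi_style` and the two handler strings are hypothetical, and a fresh Python interpreter stands in for the stand-alone CGI executable. Each request pays for a new process, but a crashing handler only costs that one request.

```python
import subprocess
import sys
from typing import Tuple


def run_cgi_style(script_source: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Run handler code in a fresh process, as a CGI host would (illustrative)."""
    completed = subprocess.run(
        [sys.executable, "-c", script_source],  # new process per request: the CGI overhead
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if completed.returncode != 0:
        # The handler died, but the hosting process is unaffected.
        return 500, "Internal Server Error"
    return 200, completed.stdout


# Hypothetical handlers: one well-behaved, one that terminates abnormally.
GOOD_HANDLER = 'print("Content-Type: text/plain"); print(); print("hello")'
CRASHING_HANDLER = "import os; os._exit(1)"
```

Calling `run_cgi_style(CRASHING_HANDLER)` returns an error status to the caller while the host keeps running, which is the trade-off the slide summarises as "non-optimal but secure".
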
  7. IIS: ISAPI
     • Modules implemented as Windows DLLs
         • Can be a Filter (e.g. GZIP) or Extension (e.g. PHP, CSP)
     • Application requests serviced in the process space of the IIS host
     • Optimal but there are risks
         • No overhead in process management and inter-process communication
         • Application crash does impact the hosting web server
     • Supported for all versions of IIS
         • The only web server API (excluding CGI) that’s completely backwards (and forwards) compatible!
  8. IIS v5.0 & Windows 2000
     • Concept of isolation levels (or application protection levels)
         • Low
             • All in same process
         • Medium
             • ISAPI extensions run in a separate process
         • High
             • ISAPI extensions run in a separate process per application
             • An application is broadly defined in terms of its path
  9. IIS v6.0 & Windows 2003
     • Concept of Worker Process Isolation and Application pools
         • Application pool
             • Applications associated with one or more worker processes
         • Web Gardens
             • Multiple worker processes supporting an application
             • Not to be confused with web farms, where multiple web server installations manage the work load
         • Process recycling
         • Process idle-time timeout
  10. IIS v7.0 & Windows 2008
     • Major upgrade
         • Previewed in Vista: A good reason to get Vista
         • Modular architecture
             • Administrators choose the modules required
         • Improved security
         • Application pools
         • New configuration schema, XML based
     • New API
         • ISAPI still supported (as a supplied module)
         • Third-party modules can be added
  11. IIS Resources
     • www.microsoft.com
     • www.iis.net
         • IIS developers participate
  12. Apache
     • v1.0 consolidated in mid-90s
         • Robert McCool
         • Successor to NCSA HTTPd (National Center for Supercomputing Applications)
     • Modular architecture
     • Multi-threaded for Windows; multi-process for UNIX
     • Extensibility
         • Common Gateway Interface (CGI)
         • Apache API
  13. Apache: API
     • Modules implemented as Dynamic Shared Objects (DSOs)
         • DLLs for Windows, Shared Objects/Libraries for UNIX, Shareable Images for OpenVMS
     • Modules can process any phase of the request/response cycle
         • Request handling (e.g. mod_csp, mod_weblink)
         • Authentication (e.g. mod_auth)
         • Security (e.g. mod_ssl)
         • Filters (e.g. mod_deflate)
         • Miscellaneous (e.g. mod_rewrite)
  14. Apache: v2.0 (2002)
     • Substantial rewrite of core with improved modularization
     • Multi-threaded for Windows and OpenVMS
     • Support for UNIX threading
         • Multi-process/Multi-threaded hybrid server
     • New API
         • Use mod_csp2.so instead of mod_csp.so
  15. Apache: v2.2 (2005)
     • Improved modules, particularly for authorization
     • Same API but binary incompatibilities
         • Use mod_csp22.so instead of mod_csp2.so
  16. Web Servers based on Apache
     • HP Secure Web Server (HPSWS)
         • For OpenVMS
         • v1.3-1 based on Apache v1.3.26
         • v2.1-1 based on Apache v2.0.52
         • Tomcat (Java/JSP), Perl, PHP
         • CSP
     • Many others
  17. Apache: Resources
     • www.apache.org
  18. Sun JSWS
     • Netscape Enterprise (mid 1990s)
         • FastTrack: lightweight offering
     • iPlanet (late 1990s)
         • America Online, Sun
     • Sun ONE (early 2000s)
     • Sun Java System Web Server (mid 2000s)
         • Resources: www.sun.com
  19. Sun JSWS
     • Multi-threaded for Windows
     • Multi-process/Multi-threaded hybrid for UNIX
     • Extensibility
         • Common Gateway Interface (CGI)
         • Netscape Server Application Programming Interface (NSAPI)
             • Modules implemented as DLLs for Windows, Shared Objects/Libraries for UNIX
             • Binary incompatibility between Netscape Enterprise v2 and v3
  20. Trends: All web servers
     • Multi-process/Multi-threaded servers
     • Increased flexibility/extensibility through modularization
     • Better security
  21. Trends: Implications for WebLink and CSP
     • State-aware mode (preserve mode 1 for CSP) increasingly non-optimal
         • Connection pool distributed amongst multiple processes, with no control over the relationship between requests and processes
         • Using Network Service Daemon (NSD) in single process mode becomes essential
             • Channel load into single multi-threaded process
             • Maintain single process pool
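
The problem with state-aware sessions on a multi-process web server can be pictured with a toy model. This is only a sketch under stated assumptions: the function `workers_seen_by_session` is hypothetical, and round-robin dispatch is used purely to make the example deterministic. A real web server gives no guarantee at all about which worker process receives a given request, which is the point the slide makes.

```python
from itertools import cycle


def workers_seen_by_session(num_workers: int, num_requests: int) -> set:
    """Return the ids of the worker processes that handled one session's requests.

    Dispatch is modelled as round-robin for illustration only; the session has
    no control over which worker (and hence which connection pool) it reaches.
    """
    dispatcher = cycle(range(num_workers))
    return {next(dispatcher) for _ in range(num_requests)}
```

With several workers, one session's requests are spread over several separate connection pools, so state pinned to a particular connection cannot be relied upon. With a single process (the role played by the NSD in single process mode), every request reaches the same pool.
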
  22. Web Standards?
     • Applications using AJAX techniques generate much more HTTP traffic than conventional web applications
         • Poor performance unless the connection between client and server is kept open between requests
     • HTTP v1.0: Asymmetric protocol by default
         • Client opens connection to server then sends request
         • Server sends response then closes the connection to the client
  23. HTTP Keep-Alive
     • Connection kept open between requests
         • HTTP v1.0: Default off
             • Connection: Keep-Alive (switch on)
         • HTTP v1.1: Default on
             • Connection: Close (switch off)
     • Must send response size notification
         • Content-Length
         • Chunked transfer (HTTP v1.1)
             • Transfer-Encoding: chunked
         • WebLink/CSP Gateway will do this automatically
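
The two ways of telling the client where a response ends, so that the connection can stay open, can be sketched as follows. The function names are hypothetical and the status line and headers are reduced to the bare minimum; this only illustrates the framing, not what the WebLink/CSP Gateway actually emits.

```python
def frame_with_content_length(body: bytes) -> bytes:
    """Frame a complete response body whose size is known up front."""
    head = b"HTTP/1.1 200 OK\r\nContent-Length: " + str(len(body)).encode("ascii")
    return head + b"\r\n\r\n" + body


def frame_chunked(parts) -> bytes:
    """Frame a body produced piece by piece using HTTP/1.1 chunked transfer."""
    out = [b"HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n"]
    for part in parts:
        if part:  # skip empty pieces: a zero-length chunk would end the body early
            size_line = format(len(part), "x").encode("ascii")  # chunk size in hex
            out.append(size_line + b"\r\n" + part + b"\r\n")
    out.append(b"0\r\n\r\n")  # terminating zero-length chunk
    return b"".join(out)
```

Content-Length suits responses that are fully buffered before sending; chunked transfer lets the server start sending before the total size is known, while still allowing the connection to be reused afterwards.
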
  24. HTTP Connections: The current status
     • Standard specifies the maximum number of simultaneous connections to a given server
         • HTTP v1.0: usually 4
         • HTTP v1.1: 2 (RFC 2616, Section 8.1.4)
     • Objective: to improve response times and avoid congestion
     • Can change setting in browser configuration
         • Inappropriate for web-based applications
         • Proxies will implement the standard
  25. HTTP Connections: The future
     • Now recognised that high-bandwidth connections are commonplace
         • Key development since HTTP v1.1, which was drafted in January 1997
         • Client-side bandwidth is no longer the gating factor in connection speed
     • IE v8 will almost certainly increase the number of connections to 6
         • Direct response to the needs of AJAX applications
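
A rough way to see why the connection limit matters for AJAX-heavy pages: if requests take about the same time and at most a fixed number can be in flight to one server, they complete in "waves". The helper names and the figures in the usage note are illustrative assumptions, not measurements.

```python
import math


def request_waves(num_requests: int, max_connections: int) -> int:
    """Number of sequential waves needed to issue all requests to one server."""
    if max_connections < 1:
        raise ValueError("max_connections must be at least 1")
    return math.ceil(num_requests / max_connections)


def estimated_load_time(num_requests: int, max_connections: int,
                        seconds_per_request: float) -> float:
    """Crude estimate: every wave costs one request's worth of latency."""
    return request_waves(num_requests, max_connections) * seconds_per_request
```

Under this simplified model, 12 requests need 6 waves with the HTTP v1.1 limit of 2 connections but only 2 waves with 6 connections.
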
  26. Application Development: CSP & WebLink
     • WebLink (1996); CSP (~2000)
     • Implemented over CGI and Web Server APIs
     • Proxy to Caché
         • Responses generated entirely in Caché
     • Support for state-aware sessions
         • Migration of legacy M/Caché code to the web
     • WebLink Event Broker (1998)
         • Early incarnation of AJAX-like techniques
         • In-form scriptable communication with server
             • Initially Java applet based; then XMLHTTP
         • CSP equivalent: Hyperevents
  27. Application Development: JSP
     • Specified by Sun
     • Apache Tomcat
         • Web container or application server
         • Implements Java Servlet and JSP
         • Apache mod_jk (Jakarta) manages communication between Apache and Tomcat
     • Database access
         • JDBC, web services
  28. Application Development: ASP & ASP.NET
     • Microsoft IIS
     • Classic ASP (~1996)
         • Script based and interpreted
     • ASP.NET (~2002)
         • Compiled, dependent on .NET Framework
     • Database access
         • ADO.NET (base class library)
         • ODBC data provider
         • Web Services
  29. Application Development: PHP
     • PHP (PHP: Hypertext Preprocessor)
     • Created 1994
     • Used for over 20 million web sites
     • Most popular Apache module
     • M-like associative arrays
     • Increasing OO capability
     • Interfaces to numerous SQL-based databases
         • MySQL popular choice
  30. Application Development: Perl
     • Created 1987
     • General purpose scripting language
     • Emphasis on text processing
         • Suited to the needs of web programming
     • DBI (Database Interface) modules
  31. Application Development: Python
     • Created 1991
     • General purpose scripting environment
         • Some implementations include compiler
     • Multi-paradigm programming environment
         • Functional
         • Object Oriented
     • Large standard library
         • Modules for processing web requests
         • Modules for database access
  32. Application Development: Ruby (on Rails)
     • Created mid 1990s
     • General purpose
     • Multi-paradigm programming environment
         • Functional
         • Object Oriented (many ideas from Perl and Smalltalk)
     • Ruby on Rails: Created 2004
         • Complete web application development framework
         • Consists of several packages
             • ActiveRecord: Object Relational mapping
  33. MGWSI Gateway
     • Uniform/Normalized interface to Caché
     • Underpins Enterprise Web Developer (EWD)
     • Currently supported:
         • PHP (m_php)
         • JSP (m_jsp)
         • ASP.NET (m_aspx)
     • Future support anticipated:
         • Ruby (m_ruby)
         • Python (m_python)
         • Perl (m_perl)
     • www.mgateway.com
  34. Application Development: Scalability I
     • Non CSP/WebLink environments
         • Requests processed and responses generated on web server host
         • Possible multiple round trips to database
         • Will need to increase capacity of web server tier sooner
             • Load-balancing techniques
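
One of the simplest load-balancing techniques for growing the web server tier is round-robin selection with fail-over. The sketch below is a hypothetical illustration (the class `RoundRobinBalancer` and the server names in the usage note are invented); production load balancers add health checks, weighting and session handling on top of this idea.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out web servers in turn, skipping any that are marked as down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._ring = cycle(self._servers)

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self):
        # Try each server at most once per call so a full outage is detected.
        for _ in range(len(self._servers)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy web servers available")
```

For example, a balancer built over `["web1", "web2", "web3"]` cycles through all three, and after `mark_down("web2")` it keeps alternating between the remaining two.
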
  35. Application Development: Scalability II
     • CSP/WebLink
         • Web server extension as intelligent proxy/router
         • Requests processed and complete response generated in Caché
         • Single round trip to database per request/response cycle
         • Integrated scripting environment and database ideal for web application run-time environment
         • Better performance/throughput per web server
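
The round-trip argument made on the two scalability slides can be expressed as simple arithmetic. The function name and the figures in the usage note are assumptions chosen for illustration, not measurements of any product.

```python
def response_time(db_round_trips: int, round_trip_seconds: float,
                  processing_seconds: float) -> float:
    """Crude model: fixed processing cost plus one network round trip per database call."""
    return processing_seconds + db_round_trips * round_trip_seconds
```

Assuming 10 ms of processing and 2 ms per round trip, a page built with 5 database round trips costs about 20 ms in this model, against about 12 ms when the whole response is produced in a single round trip. The difference grows with every extra query the page makes from the web server tier.
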
  36. Scalable Web Applications: Conclusion
     • Keep web servers updated
     • Keep an eye on developments in standards
     • Use HTTP Keep-Alive for applications using AJAX techniques
     • Don’t use state-aware mode in WebLink or CSP
     • Follow these guidelines and future scalability options will remain open
         • Server clusters/farms
         • Load-balancing/Fail-Over
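
The Keep-Alive recommendation can be tried out locally with Python's standard library. This is a small self-contained sketch: the handler class and the function `fetch_twice_on_one_connection` are hypothetical names, and the throwaway local server stands in for a real web server. Two requests are sent over a single client connection, and the client's local socket address is recorded each time so that reuse of the connection can be checked.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep connections open

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # response size is required
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_twice_on_one_connection():
    """Send two GET requests over one connection to a throwaway local server."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        results = []
        for _ in range(2):
            conn.request("GET", "/")
            response = conn.getresponse()
            body = response.read()
            # The socket is still open here only because the connection was kept alive.
            results.append((response.status, body, conn.sock.getsockname()))
        conn.close()
        return results
    finally:
        server.shutdown()
        server.server_close()
```

If both results report the same local socket address, the second request reused the connection opened for the first, avoiding a second TCP handshake.
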
