High performance for a Web server that receive a large numbers of requests is critical success factor for a web site, but in many cases the Web server is only “tip of the iceberg” of a very large heterogeneous systems, with lots of components and technologies. This talk present best practices to design an high availability and high performance web site. The presentation will cover load balancing, Web server acceleration, and efficient management of dynamic data, that can be adopted by any sites to improve performance and availability. We also describe common mistake implemented in the web application framework that create performance limitations and bottleneck. The presentation will describe how to define monitors metrics of the service , that are the “eyes” of operation departments, and the implementation of the “red button”
3. Introduction: The value of an IT
Service
Beolink.org!
From an ITIL perspective, the value is composed by two
components: utility (fitness for purpose) and warranty
(fitness for use.)!
!
Utility is a "[f]unctionality offered by a product or service to
meet a particular need. Utility is often summarized as 'what
it does'."!
!
Warranty as "[a] promise or guarantee that a product or
service will meet its agreed requirements" and as "derived
from the positive effect of being available when needed, in
sufficient capacity, and dependably in terms of continuity
and security."!
3!
9/5/11!
4. Introduction Beolink.org!
Do you have the new Amazon web
site ?!
!
4.3 million pages/day !
= 50 pages/s, I don’t think so !!
!
1000 pages/s = 86 mil pages/day!
Interesting figure…!
4!
9/5/11!
5. Introduction Beolink.org!
How fast is fast
enough? !
5!
9/5/11!
6. Introduction: user perception Beolink.org!
80% of the end-user response time is spent on the front-end. !
!
Most of this time is tied up in downloading all the components in the page:
images, stylesheets, scripts, Flash, etc. Reducing the number of components
in turn reduces the number of HTTP requests required to render the page!
(see Netflix case studies)!
Request! Render!
css loading, asset loading, javascript loading!
Response!
Time!
6!
9/5/11!
7. Introduction Beolink.org!
Do you know your
Application Architecture ? !
7!
9/5/11!
8. Introduction: Architecture [ˈɑːkɪˌtɛktʃə] Beolink.org!
A software architecture is an abstraction of the run-time
elements of a software system during some phases of its
operation. A system may be composed of many levels of
abstraction and many phases of operation, each with its own
software architecture.!
!
A software architecture is defined by a configuration of
architectural elements--components, connectors, and data--
constrained in their relationships in order to achieve a desired set
of architectural properties.!
!
A component is an abstract unit of software instructions and
internal state that provide a transformation of data via its interface.!
!
A connector is an abstract mechanism that mediates
communication, coordination, or cooperation among components.!
!
Datum is an element of information that is transferred from a
component, or received by a component, via a connector.!
!
[1] Roy Filding theis!
8!
9/5/11!
9. Introduction: Web Architecture Beolink.org!
Presentation Layer supports a client, the
system needs to have a presentation layer through Presentation!
which the user can submit operations and obtain a
result.!
!
Middleware!
Middleware Layer is just a level of indirection
between presentation and other layers of the
system. It introduces an additional layer of
business logic encompassing all underlying Application!
systems. !
!
Application layer establishes what operations
can be performed over the system and how they Resource!
take place. It takes care of enforcing the business
rules and establishing the business processes. !
!
Resource (Datum) deals with the organization
(storage, indexing, and retrieval) of the data
necessary to support the application logic. !
9!
9/5/11!
19. Design:: Yahoo! Rules Beolink.org!
The Exceptional Performance team (Yahoo!) has identified a number of
best practices to make web pages fast. The list includes 35 best
practices divided into 7 categories.!
!
25 % without modifying infrastructure!
Content! Server! Cookie! CSS! Javascript! Images! Mobile!
• Make Fewer • Use get for • Reduce • Put • Put scripts at • Make • Keep
HTTP Req! Ajax Req! Cookie Size! Stylesheets Botton! favicon.ico components
• Make Ajax • Flush Buffer • Use Cookie- at Top! • Make small and under 25 kb!
Cacheable! Early! Gree • Avoid CSS javascript and Cacheable! • Pack
• Reduce the • …! Domains! exp! CSS ext! • Optimize! components
number of • …! • Avoid Filters! • Minify! • Do not Scale into a
DOM! • …! • …! Image in multipart
• …! HTML! document!
• …!
19!
9/5/11!
21. Design: Load Distribution Application!
Resource!
Beolink.org!
q DNS!
- Low TTL!
- GEO IP!
- Round Robin!
- More than one IP!
- Response base on system load!
- Split Components across
Domains!
q Load Balancer!
- TCP/IP!
- Layer 7!
- SSL!
q Anycast!
- Up to 32 systems per IP!
- High availability on WAN !
21!
9/5/11!
22. Design: Proxy/Caching Application!
Resource!
Beolink.org!
120,000"
100,000"
q Configure ETags!
80,000"
Throughput)
q Add expiration or 60,000"
40,000"
Cache-control
Header! 20,000"
0"
ATS"2.1.9" Nginx"0.8.53" Varnish"2.1.5"
q Extension modules! Req"/"sec"
q Reverse Proxy base
on url or domains!
q Redirect on
business logic
(Middleware)!
22!
9/5/11!
23. Design: WebServer Application!
Resource!
Beolink.org!
q VirtualHost with dedicated IP!
q Compress content !
q Process Model!
q Number of Process!
q Number of clients! Apache optimization!
!
q Spare..! Remove unneeded modules!
Set AllowOverride to None!
Avoid FollowSymLinks and SymLinksIfOwnerMatch!
Avoid content negotiation (Multiview)!
MaxClients =(Total Memory - Operating System
q KeepAlive and KeepAliveTimeout! Memory ) / Size Per Apache process.!
!
The KeepAlive directive allows MinSpareServers, MaxSpareServers, and StartServers:!
multiple requests to be sent over the Apache can spawn a maximum of 32 child processes per
second!
same TCP connection. It changes
model from Request to User!
23!
9/5/11!
24. Design: Content Delivery Network Application!
Resource!
Beolink.org!
A content delivery network or
content distribution network
(CDN) is a system of computers
containing copies of data placed
at various nodes of a network.!
!
!
!
q Preload!
q Access Control !
24!
9/5/11!
25. Design: Modular Application Logic Beolink.org!
q Distribution!
! Presentation!
q Accelerator!
q Session and Cookies!
q More instance on same Resource!
system!
!
!
25!
9/5/11!
26. Design: Distribution Beolink.org!
!
q Shared!
All components have all the
functions (round robin)!
q Function/resource!
Components are grouped by
function/resource!
q User!
Components are divided by cluster
of users!
26!
9/5/11!
29. Design: More instance Beolink.org!
q Better CPU Usage!
Many applications do not
support parallelization, then
it is not possible to
implement a scale up
approach (deploying the
applications on larger
servers)!
!
!
q Better Memory Usage!
Many applications still use 32
bits or have fixed internal
data structure!
29!
9/5/11!
30. Design: Modular Application Logic! Beolink.org!
q Choose the correct
algorithms and data
structures!
dqueue vs list, hash vs trees, locks
vs read/write locks, bloom filter!
q Memory allocation!
Reuse memory, stack vs heap,
tcmalloc!
q Make fewer system calls!
Larger writes and reads!
32. Design: Filesystem Beolink.org!
q Distributed
Filesystem!
!- Local copy/cache!
!- Parallel!
!
!
q Replication!
!- rsync+inotify!
!- DRDB!
!!
!
Do not use file system as a !
COMMUNICATION PROTOCOL !!
32!
9/5/11!
33. Design: Database Beolink.org!
q Partitioning!
Small table!
Many systems!
!
q Replication!
Separation btw Read and Write!
Different Index on different system! Tungsten Replicator!
!
q NoSQL!
Key=value!
No schema!
!
!
33!
9/5/11!
34. Design: Hierarchical Storage Beolink.org!
q Directory Server!
Split in domain!
Multi Master!
Right Index !
Avoid COS!
!
!
q JSR!
Distribution!
!
!
!
!
Do not use Directory Servers as a !
!
A STANDARD DATABASE !!
34!
9/5/11!
35. Design: Profiling Beolink.org!
The 20% of the code is responsible for 80% of the results (time)!
q Find bottleneck before production! Cmd line: top, htop, vmstat, dstat,
strace!
Statistical: Oprofile, google profile!
q Dynamic program analysis! Instrumentiing: valgrind's callgrind,
gprof!
q Show frequency of called functions ! !
q Show usage of lines in code!
q Show duration of function calls!
35!
9/5/11!
36. Design: Capacity Beolink.org!
Small TLD Zone performance evaluation
240,419 records
q Find relation btw
application tasks and
resources usage!
!
q Find max Load !
36!
9/5/11!
42. Design: Enterprise Beolink.org!
DNS GEO IP!
Load Balancer! Load Balancer! Load Balancer!
Redirect to !
Zone n!
Auth! cache! cache!
locators!
Apps Servers! Apps Servers!
Memcached! Memcached!
Meta Data!
DS slaves!
DB CR DB CR
Slaves! Slaves! Slaves! Slaves!
Zone 1! …! Zone n!
Replication to Region X!
Ds Master!
DB Master! CR Master!
Region 1!
42!
9/5/11!
44. SO: Monitoring Beolink.org!
System!
q Internal! Requests stats (per second, per IP, …)!
System! Concurrent Users!
System load (cpu, memory,disk I/O,..)!
Internal state! Number of Processes!
!
Traffic !
!! APPS!
Memory per instance!
q External! Number of elements in Data structure!
Communication Timeout!
User experience, time !
spent for each component Database!
(components time loading, Connection!
Query time!
rendering, execution,…)! Query per Table!
Top query!
!
!
q Alarm!
Define key performance
Indicator on System
Capacity !
!
44!
9/5/11!
45. SO: The red button Beolink.org!
q Improve capacity !!
Remove controls!
Increase number of systems!
Dynamic to Static !
Increase TTL of cache!
Async Operation!
Operations Queue!
!
q Disconnection!
Exclusion of Cluster of users!
Exclusion IP list!
Bandwidth reduction!
Web site Sections closure!
!
45!
9/5/11!
55. Cloud Stack Beolink.org!
Things must change!!
!
q Web UI for users, affiliates, marketing,
operations!
q Agile machine management is part of the API!
q Scale up -and down!
q Live upgrade of running system!
q Persistence with key-value stores!
q A Petabyte filesystem is part of the application!
q MapReduce jobs close the loop!
q Developers deploy to the cloud to test!
Original work:!
http://svn.apache.org/repos/asf/labs/clouds/
apache_cloud_computing_edition.pdf!
55!
56. I look forward to meeting you…! Beolink.org!
XVIII European AFS meeting 2011
HAMBURG – GERMANY
4-7 October
Who should attend:
§ Everyone interested in deploying a globally accessible
file system
§ Everyone interested in learning more about real
world usage of Kerberos authentication in single
realm and federated single sign-on environments
§ Everyone who wants to share their knowledge and
experience with other members of the AFS and
Kerberos communities
§ Everyone who wants to find out the latest
developments affecting AFS and Kerberos
More Info: http://www.openafs.org/
56!
9/5/11!
57. Thank you
manfred@freemails.ch!
!
Beolink.org!