Thanks to Massimo for the very informative presentation of the technology roadmap that awaits us. With your permission, I’d like to spend the next few minutes talking about two things: how we at GS see the change that our industry is going through (and no, I’m not referring to the sub-prime crisis...), and how we are responding to it.
Ultra-Scalable and Blazing-Fast: The Sun Fire x4450-Intel 7460-XAP GigaSpaces Platform
Scaling up with Commodity HW®
Scale up Benchmark Report
Shay Hassidim, Deputy CTO, GigaSpaces
January 2009
Scaling up with commodity HW® - Benchmark Target
“Scaling up with commodity HW®”
Test how GigaSpaces-based applications scale up on the Sun Fire x4450-Intel 7460-XAP GigaSpaces Platform, and how well mission-critical applications utilize the new Intel 45-nm technology with 4 CPUs of 6 cores each.
Scale up Benchmark Results Highlights - Throughput
16,223 page generations/sec with 6 ms latency over LAN
3 web servers
2 x4450 machines
HPC Risk Calculation
Monte Carlo simulation
Near-linear scalability (up to 32 concurrent workers)
Speeding up the calculation by 2640% (compared to one worker)
Calculating 4096 portfolios in 100 seconds (about 41 calculations/sec).
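The HPC figures above can be cross-checked with a little arithmetic. The sketch below assumes that "2640%" means the 32-worker run finished about 26.4 times faster than a single worker; that reading and the helper names are assumptions, not something the benchmark report states.

```python
def throughput(items: int, seconds: float) -> float:
    """Items completed per second."""
    return items / seconds


def parallel_efficiency(speedup: float, workers: int) -> float:
    """Fraction of ideal linear speedup that was actually achieved."""
    return speedup / workers


# Figures from the slide: 4096 portfolios calculated in 100 seconds.
rate = throughput(4096, 100)

# Assumption: "2640%" is read as a 26.4x speedup over one worker.
efficiency = parallel_efficiency(26.4, 32)

print(f"{rate:.2f} calculations/sec")
print(f"{efficiency:.3f} of ideal linear scaling")
```

Under that reading, the 32-worker run reaches a bit over 80% of ideal linear scaling, which is consistent with the "near-linear" claim.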
Servers Used
Since we are running a scale-up benchmark, we have a fixed number of machines.
Server ID | Vendor | Model | CPU Type and Memory | # of CPU | # of Cores | Clock speed | Running
1 | Sun | Sun Fire X4450, 4 socket | Intel Dunnington X7460, 32 GB RAM | 4 (6 cores each) | 24 | 2.66GHz | GigaSpaces
2 | Sun | Sun Fire X4450, 4 socket | Intel Dunnington X7460, 32 GB RAM | 4 (6 cores each) | 24 | 2.66GHz | GigaSpaces
3 | Intel White box | 4 socket | Intel Dunnington X7460, 32 GB RAM | 4 (6 cores each) | 24 | 2.66GHz | GigaSpaces Clients
4 | Sun | Sun Fire X4150, 2 socket | Intel Xeon, 16 GB RAM | 2 (2 cores each) | 4 | 3.16GHz | mySQL Database, Apache Load-Balancer
Technology Stack under Test
Ethernet (1gE)
Sun Fire x4450
Intel Dunnington X7460, 4 CPUs (6 cores each)
Sun Solaris update 6
GigaSpaces XAP 6.2.2
Sun mySQL 5
Apache LB 2.2.9
Sun JDK 1.6
Introduction: Space Based Architecture – Business logic and data collocated
[Diagram: data is pushed into the backend system, an In-Memory-Data-Grid with collocated Processing Units. Primary 1, Primary 2 and Primary 3 each replicate to Backup 1, Backup 2 and Backup 3. A service collects results for reporting.]
<1 millisecond latency for 20 users running with 1000 write operations/sec including HA.
<0.4 millisecond latency for 20 users running with 1000 write operations/sec excluding HA.
True linear scalability ratio up to 8 users (8K writes/sec)
>0.8 scalability ratio for 22 users (22K writes/sec)
All the results above were taken with:
Total of 20,000 operations/sec
4 K object payload
ping latency 0.1 ms.
The test object had 8 fields, 3 of them indexed (4 String fields, 2 Long fields, 2 Integer fields). The test class did not implement Externalizable and did not have any special tuning or optimization to reduce its footprint or speed up its serialization.
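The slides quote scalability ratios without defining the term. A common reading, assumed here, is measured throughput divided by ideal linear throughput (number of users times the single-user rate). The function name and the 1000 writes/sec per-user rate used as the baseline are illustrative assumptions drawn from the figures above.

```python
def scalability_ratio(measured_ops: float, users: int, single_user_ops: float) -> float:
    """Measured throughput relative to perfect linear scaling (1.0 = linear)."""
    return measured_ops / (users * single_user_ops)


PER_USER_OPS = 1000  # assumed baseline: each user drives 1000 writes/sec

# 8 users sustaining 8K writes/sec scale linearly under this definition.
ratio_8 = scalability_ratio(8_000, 8, PER_USER_OPS)

# A ratio above 0.8 at 22 users implies at least this much sustained throughput.
min_ops_22 = 0.8 * 22 * PER_USER_OPS

print(f"ratio at 8 users: {ratio_8:.2f}")
print(f"minimum writes/sec at 22 users for ratio 0.8: {min_ops_22:.0f}")
```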
Web Application Benchmark – Physical Deployment Topology
[Diagram: a white box client running JMeter connects over a switched Ethernet LAN to an X4150 running the Apache Load Balancer and mySQL, which connects over a switched Ethernet LAN to two X4450 machines, each running GigaSpaces with 4 spaces, web servers and services.]
Web Application Benchmark Results – Latency, Scalability
Only a 20% drop with up to 20 users hitting the system at 7000 requests/sec, with 2.8 ms latency
Web Application Benchmark Results - Capacity
The users factor is 50: every LAN-based user equals 50 WAN-based users due to the inherent latency of the internet (minimum latency over the WAN is 100 ms, over the LAN 2 ms).
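The users factor follows directly from the two latency figures. The sketch below reproduces that arithmetic and applies it to the 20 LAN users of the benchmark. It assumes each user waits for a response before sending the next request, so the request rate per user is inversely proportional to latency; the function name is hypothetical.

```python
def users_factor(wan_latency_ms: float, lan_latency_ms: float) -> float:
    """How many WAN users generate the load of one LAN user."""
    return wan_latency_ms / lan_latency_ms


factor = users_factor(wan_latency_ms=100, lan_latency_ms=2)

lan_users = 20  # LAN-based benchmark users
equivalent_wan_users = lan_users * factor

print(f"users factor: {factor:.0f}")
print(f"{lan_users} LAN users ~ {equivalent_wan_users:.0f} WAN users")
```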
1. It is recommended to disable the automatic interrupt rate adjustment in the Solaris IP network stack by adding the following entry to the /etc/system file:
set dld:dld_opt=2
The dld:dld_opt parameter controls interrupt scheduling within the IP network stack. By default, the IP code tries to adjust the interrupt rate automatically, with the intention of smoothing the rate and reducing delays for low and moderate network traffic. At high interrupt rates, automatic adjustment is often counterproductive and causes unnecessary work in the IP stack. Setting the option to the value “2” turns automatic adjustment off.
2. For the 1GbE network infrastructure, it is recommended to disable interrupt throttling on the e1000g interfaces by adding the following two entries to the /kernel/drv/e1000g.conf file:
intr_adaptive=0,0,0,0,0,0,0,0;
intr_throttling_rate=0,0,0,0,0,0,0,0;
The intr_adaptive and intr_throttling_rate variables define how interrupt blanking (coalescing) is implemented by the driver for each NIC that it controls. Each element in the list is the value for a specific NIC, from e1000g0 to e1000g15 (higher-numbered NICs will likely not be present). The table examples show 16 possible NICs being changed, which is not necessary in almost all cases.
If interrupt blanking is disabled, packets are processed by the driver as soon as they arrive. If interrupt blanking is enabled, packets are processed by the driver when the interrupt is issued. Enabling blanking lets the network stack delay the processing of single packets on the assumption that additional packets will follow shortly, so that they can all be processed with a single interrupt. This adds latency, but reduces CPU utilization by reducing the number of interrupts that need to be handled.
The variable intr_adaptive disables (0) or enables (1) the interrupt blanking mechanism provided by the Solaris Generic LAN Driver (GLD) framework. When this tunable is disabled, the intr_throttling_rate parameter is available to configure interrupt blanking manually.
The variable intr_throttling_rate (0 - 65535) specifies the inter-interrupt delay used to realize interrupt blanking (coalescing). It takes effect only when intr_adaptive is disabled. Smaller values of intr_throttling_rate mean higher interrupt rates and less coalescing; larger values mean lower interrupt rates and more coalescing. The value 0 indicates that there is no interrupt coalescing: an interrupt is fired immediately when a packet is received.
For systems handling lower message rates, it is recommended to use the defaults (intr_adaptive set to 1, which disables intr_throttling_rate). Systems that handle very high, relatively constant loads, or that are very latency sensitive, should set intr_adaptive and intr_throttling_rate to 0 only for the NIC(s) carrying that load.
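The trade-off between latency and interrupt count can be illustrated with a toy model. This is a simplified sketch, not the e1000g driver's actual algorithm: it assumes an interrupt fires a fixed delay after the first pending packet arrives and then handles every packet received up to that moment. The function name and the microsecond units are hypothetical and do not correspond to the units of intr_throttling_rate.

```python
def simulate_coalescing(arrivals_us, delay_us):
    """Return (interrupt_count, mean_added_latency_us) for sorted arrival times."""
    if not arrivals_us:
        return 0, 0.0
    interrupts = 0
    total_wait = 0.0
    i = 0
    n = len(arrivals_us)
    while i < n:
        # The interrupt fires delay_us after the first unhandled packet.
        fire_at = arrivals_us[i] + delay_us
        interrupts += 1
        # Every packet that has arrived by then is handled in this interrupt.
        while i < n and arrivals_us[i] <= fire_at:
            total_wait += fire_at - arrivals_us[i]
            i += 1
    return interrupts, total_wait / n


# Hypothetical steady load: one packet every 10 microseconds, 1000 packets.
arrivals = list(range(0, 10_000, 10))

for delay in (0, 50):
    count, wait = simulate_coalescing(arrivals, delay)
    print(f"delay={delay}us interrupts={count} mean added latency={wait:.2f}us")
```

With a delay of 0 every packet triggers its own interrupt and no latency is added, which matches the behavior described for latency-sensitive systems. A non-zero delay cuts the number of interrupts at the cost of extra waiting time per packet.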