Lots of proof: CouchDB, ElasticGrid, and every week 3 to 4 new GigaSpaces clients on Amazon EC2. The maturity shows in the SLAs too:

Level  Class                   Availability  Downtime (minutes/year)
1      Unmanaged               90%           50,000
2      Managed                 99%           5,000
3      Well-managed            99.9%         500
4      Fault-tolerant          99.99%        50
5      High Availability       99.999%       5
6      Very High Availability  99.9999%      0.5
7      Ultra Availability      99.99999%     0.05

For comparison: 99.95% -> roughly 250 minutes of downtime per year.
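The downtime column follows directly from the availability percentage; a minimal sketch of the arithmetic (class and method names are my own, the 525,600 minutes/year figure assumes a 365-day year):

```java
// Downtime per year implied by an availability percentage.
public class AvailabilityDowntime {
    static double downtimeMinutesPerYear(double availabilityPercent) {
        double minutesPerYear = 365 * 24 * 60; // 525,600
        return (1.0 - availabilityPercent / 100.0) * minutesPerYear;
    }

    public static void main(String[] args) {
        System.out.printf("99.9%%   -> %.1f min/year%n", downtimeMinutesPerYear(99.9));
        System.out.printf("99.95%%  -> %.1f min/year%n", downtimeMinutesPerYear(99.95));
        System.out.printf("99.999%% -> %.1f min/year%n", downtimeMinutesPerYear(99.999));
    }
}
```

Running it reproduces the table: 99.999% ("High Availability") comes out at about 5.3 minutes of downtime per year, and 99.95% at about 263 minutes, i.e. the "roughly 250 minutes" quoted above.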
This doesn't mean you have to fully rewrite your application. It does mean you need to undergo a paradigm shift and change your application accordingly.
Transactional / operational data: writes to disk mean contention, so system performance will degrade when adding more capacity or when needing more throughput.

Caching — numbers everyone should know: http://highscalability.com/numbers-everyone-should-know

Writes are expensive!
- The datastore is transactional: writes require disk access
- Disk access means disk seeks
- Rule of thumb: 10ms per disk seek
- Simple math: 1s / 10ms = 100 seeks/sec maximum
- Depends on:
  * the size and shape of your data
  * doing work in batches (batch puts and gets)

Reads are cheap!
- Reads do not need to be transactional, just consistent
- Data is read from disk once, then it's easily cached
- All subsequent reads come straight from memory
- Rule of thumb: 250us for 1MB of data from memory
- Simple math: 1s / 250us = 4GB/sec maximum
  * for a 1MB entity, that's 4,000 fetches/sec

L1 cache reference                           0.5 ns
Branch mispredict                              5 ns
L2 cache reference                             7 ns
Mutex lock/unlock                            100 ns
Main memory reference                        100 ns
Compress 1K bytes with Zippy              10,000 ns
Send 2K bytes over 1 Gbps network         20,000 ns
Read 1 MB sequentially from memory       250,000 ns
Round trip within same datacenter        500,000 ns
Disk seek                             10,000,000 ns
Read 1 MB sequentially from network   10,000,000 ns
Read 1 MB sequentially from disk      30,000,000 ns
Send packet CA->Netherlands->CA      150,000,000 ns

Writes are 40 times more expensive than reads. Global shared data is expensive — this is a fundamental limitation of distributed systems. Lock contention on heavily written shared objects kills performance, as transactions become serialized and slow. Architect for scaling writes: optimize for low write contention, optimize wide, and make writes as parallel as you can.
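The read path described above — pay the disk seek once, then serve every subsequent read from memory — can be sketched as a minimal read-through cache. This is an illustration, not GigaSpaces' actual caching layer; the backing-store function stands in for a disk-based datastore lookup:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal read-through cache sketch: the backing store is consulted only on
// the first read of a key; all subsequent reads are served from memory.
public class ReadThroughCache<K, V> {
    private final Map<K, V> memory = new ConcurrentHashMap<>();
    private final Function<K, V> backingStore; // e.g. a disk-based datastore lookup

    public ReadThroughCache(Function<K, V> backingStore) {
        this.backingStore = backingStore;
    }

    public V get(K key) {
        // computeIfAbsent invokes the backing store at most once per key
        return memory.computeIfAbsent(key, backingStore);
    }
}
```

With a 10ms disk seek versus a ~250us in-memory fetch of a full 1MB entity, every read served from the map instead of the store is roughly a 40x win — which is exactly why the slide says reads are cheap once cached.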
Decouples the application from the deployment environment. Since deployment and application virtualization are in place, you can:
- outsource testing, disaster recovery, etc. to a public cloud
- scale out to a public cloud at peak times
Time to value: the amount of time it takes you to achieve/show value. Take quote from ROI presentation.

Savings calculations:
- Web tier: 10 machines at peak, 3 on average => saving of 7 machines
- Business logic: using commodity instead of high-end machines; 100 at peak, 10 on average => saving of 90 machines
- Messaging: cost of non-linear scaling => 6x throughput
- Data tier: cost of non-linear scaling => 6x throughput
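The web-tier and business-logic savings are peak-minus-average arithmetic: with elastic capacity you pay for average load instead of provisioning for peak. A sketch (machine counts from the slide; class and method names are my own):

```java
// Machines saved by paying for average load instead of provisioning for peak.
public class ElasticSavings {
    static int machinesSaved(int peakMachines, int averageMachines) {
        return peakMachines - averageMachines;
    }

    public static void main(String[] args) {
        System.out.println("Web tier:       " + machinesSaved(10, 3) + " machines saved");
        System.out.println("Business logic: " + machinesSaved(100, 10) + " machines saved");
    }
}
```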
GigaSpaces XAP proved far superior to the traditional JEE-based implementation in terms of both throughput and latency. On the same hardware (quad-core AMD machines with RH Linux), the GigaSpaces implementation delivered 6 times the throughput with up to 10 times lower latency => at least 6x fewer machines. The effect of the latency improvement on machine count is harder to measure, but it is expected to reduce the number of machines even further.