MILLION MONKEYS PROJECT
- Randomly recreated Shakespeare
- Open source
- Good metric for CPU and memory
EC2 SPECIFICATIONS

Instance Name   | Memory | EC2 Compute Units/Cores | Platform | I/O Performance
Small           | 1.7 GB | 1 EC2 on 1 Core         | 32-bit   | Moderate
Large           | 7.5 GB | 4 EC2 on 2 Cores        | 64-bit   | High
Extra Large     | 15 GB  | 8 EC2 on 8 Cores        | 64-bit   | High
High-CPU Medium | 1.7 GB | 5 EC2 on 2 Cores        | 32-bit   | Moderate
High-CPU Large  | 7 GB   | 20 EC2 on 8 Cores       | 64-bit   | High
Quad XL         | 23 GB  | 33.5 on 8 Cores         | 64-bit   | Very High

EC2 Compute Unit (ECU): One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
EC2 PERFORMANCE
- My 2.66 GHz Core 2 Duo did 50,000,000,000 character groups
BREAKDOWNS
- Original project would have run in 3 days 9 hours
- Took 1.5 months before
- 20 node cluster costs $45.44 per day
- 5 day run cost $317
- 11 day run cost $528
ENGINEERING FOR THE CLOUD
- Establish if a good fit
- Test the EC2 performance
- Figure out a unit or widget
- Find the most cost efficient EC2 performer with price per unit/widget
- Engineer with Spot Instances in mind
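
The "price per unit/widget" step can be sketched roughly as follows. This is a minimal illustration, not the method used in the project: the candidate names, hourly prices, and throughput figures are all hypothetical placeholders, and in practice they would come from your own benchmark runs and current EC2 pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One instance type under evaluation (all values hypothetical)."""
    name: str
    price_per_hour: float   # USD per hour, placeholder value
    units_per_hour: float   # measured work units ("widgets") per hour, placeholder value

    @property
    def cost_per_unit(self) -> float:
        # Dollars spent to produce a single unit of work.
        return self.price_per_hour / self.units_per_hour


def cheapest_per_unit(candidates):
    """Return the candidate with the lowest cost per unit of work."""
    return min(candidates, key=lambda c: c.cost_per_unit)


# Hypothetical benchmark results, for illustration only.
candidates = [
    Candidate("type-a", price_per_hour=0.10, units_per_hour=1_000.0),
    Candidate("type-b", price_per_hour=0.40, units_per_hour=5_000.0),
    Candidate("type-c", price_per_hour=0.80, units_per_hour=7_000.0),
]

best = cheapest_per_unit(candidates)
print(f"{best.name}: ${best.cost_per_unit:.6f} per unit")
```

With these placeholder numbers the mid-priced type wins, which is the point of the exercise: neither the cheapest nor the fastest instance is automatically the most cost efficient.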
CONCLUSIONS
- Spot Instance Saves
  - From $2.20 to $1.30 per hour
  - Saved $1,000 in one run
- Hadoop/EMR Scalability
  - 95% efficiency at 2-5 nodes
  - 87% efficiency at 10 nodes
  - 84% efficiency at 20 nodes
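
As a rough sketch of what these figures imply, the snippet below computes the relative Spot saving and the speedup suggested by each efficiency number. It rests on two assumptions the slide does not state: that both hourly rates cover the same capacity, and that "efficiency" means speedup divided by node count (the usual definition of parallel efficiency). The function names are illustrative only.

```python
def spot_saving_fraction(on_demand_rate: float, spot_rate: float) -> float:
    """Fraction of the hourly cost saved by paying the Spot rate instead."""
    return (on_demand_rate - spot_rate) / on_demand_rate


def effective_speedup(nodes: int, efficiency: float) -> float:
    """Speedup over one node, ASSUMING efficiency = speedup / nodes."""
    return nodes * efficiency


# Hourly rates taken from the slide: $2.20 down to $1.30.
saving = spot_saving_fraction(2.20, 1.30)
print(f"Spot saving: {saving:.1%}")

# Efficiency figures from the slide; 5 nodes stands in for the "2-5 nodes" range.
for nodes, efficiency in [(5, 0.95), (10, 0.87), (20, 0.84)]:
    speedup = effective_speedup(nodes, efficiency)
    print(f"{nodes} nodes at {efficiency:.0%}: about {speedup:.2f}x one node")
```

Under those assumptions the Spot rate is roughly 41% cheaper per hour, and a 20 node cluster does the work of about 16.8 ideal nodes.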
MORE INFORMATION
- http://www.jesse-anderson.com/2012/02/ec2-performance-spot-instance-roi-and-emr-scalability/
- @jessetanderson