HotPads.com on AWS Matthew Corgan President May 27, 2009

AWS Customer Presentation - HotPads

Matt Corgan, Co-Founder and Director of Technology, HotPads.com talks at AWS Start-Up Event in Washington DC about their use of AWS.

Published in: Technology
1 Comment
  • That's really great for a company without IT infrastructure

  1. HotPads.com on AWS
     Matthew Corgan, President
     May 27, 2009
  2. What is HotPads?
     • Real Estate search engine
       • Launched in May 2005 in Washington, DC
         • Used The Planet for hosting until December 2008
       • 9 employees, 6 engineers
       • 800,000 visits/month
       • 4.5 million page-views/month
       • 3.5 million real estate listings updated daily
       • Java and MySQL
  3. AWS costs in April
     • EC2 instances : $7,400
     • S3 : $1,500
     • EBS : $500
     • CloudFront : $460
     • EIPs : $8
     • RightScale : $500
       • 3rd-party management console
     • SQS : in development
     • Reserved instances : still evaluating
  4. Site components
     [Architecture diagram; only its labels survive: HotPads.com, load balancer, web, job, indexing, MapTile, messaging, and database servers sized S/L/XL, with MEM and EBS attached; public S3; CloudFront (CF) locations VA, TX, CA, International]
  5. S3 – better for larger objects
     • Latency > 10ms or even > 100ms
     • Memcached latency below 1ms
     • $0.15 per GB-month storage
     • $1 per 1mm GETs
     • $1 per 100k PUTs
     • Ex: 67 KB object (600px image)
       • PUT cost ~= storage cost ~= download cost
     • Ex: 6.7 KB object (15px thumbnail)
       • GET cost ~= storage cost ~= download cost
       • Careful! – PUT cost is 10x the storage and transfer costs
  6. S3 – April usage
     • Photos
       • 330 GB downloaded @ $.15/GB = $49
       • 55mm GETs @ $1/mm = $55
       • 42mm PUTs @ $1/100k = $420!
     • Database backups
       • 4.4 TB stored @ $.15/GB = $660
         • Probably too many copies stored
     • Maptiles
       • ~$100 for downloads and GETs
  7. CloudFront
     • HotPads uses it for:
       • Static files : great
       • Map tiles : ok
       • Photos : toss-up, but we use it anyway
         • Many photos are only viewed once
         • A CloudFront miss has to go back to S3, so a cache miss may take longer than going to S3 directly
         • Pay for 2 GETs on a miss
         • Maybe pay for 2x the transfer cost (not sure)
         • But it makes frequently viewed listings faster
  8. EC2 breakdown
     • EC2 (currently all “memory” instance types)
       • Load balancers, HAProxy, 2 small = $150
       • Web servers, Tomcat, 3-5 large = $1,200
         • Scale out 11am to midnight
       • Job servers, Tomcat, 5 large = ~$1,500
       • Index servers, Tomcat, 1 X-large, 1 large = ~$900
       • MySQL masters, 1 X-large, 2 large = ~$1,200
       • MySQL slaves, 1 X-large, 2 large = ~$1,200
       • Messaging server, ActiveMQ, 1 large = ~$300
       • Map tile creation servers, Tilecache, 1 large = ~$300
       • Development/testing/migration servers = ~$600
     • 8GB Memcached on permanent webs/jobs
  9. EBS – used for all databases
     • Cons
       • Black box: hard to determine the best usage
       • Adds costs above using local drives (but not too much)
       • Less bandwidth (not usually important for databases)
     • Pros
       • Lower average latency
       • Especially fast random writes
       • Snapshot backups allow for very short write-locks and only storing diffs
       • Ability to clone and hibernate databases
       • Redundancy
         • We had lost the local disks on a live master database twice
  10. Database utilization
     • I/O bound
     • RAIDing multiple volumes didn’t help much
     • Testing multiple drives with 1 schema per drive
  11. SimpleDB
     • Pros
       • Stand-alone DB servers are often drastically underutilized and a pain to administer, back up, and restore after failure
       • SimpleDB is schema-less
         • MySQL schema changes are a major problem
     • Cons
       • Binary stored values can’t be interpreted by a generic GUI, and have to be encoded by the client
       • Tied to EC2 for latency reasons
       • Eventual consistency when accessed from different EC2 nodes
       • “Column” names (may??) inflate storage size
       • Must partition a table before it hits 10 GB
  12. Reserved Instances
     • Pros
       • Get 1 year for the cost of 6 months
       • Guaranteed to get an instance
         • yes – we have been denied
     • Cons
       • Tied to particular instance type
         • Your needs may change
         • Amazon may introduce more appropriate instance types
  13. How does AppEngine compare?
     • Benefits?
       • Low cost, no idle instances sitting around
       • No Linux administration
     • Why don’t we use it?
       • Java deployments limited to 1,000 files
       • Cannot spawn threads
         • Several areas of HotPads are multi-threaded for a 10x request latency improvement
       • Request limit of 30 seconds: no long jobs
       • Our indexes need a big, long-lived heap
     • Amazon lets you innovate more, and that’s our goal.
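
The S3 rules of thumb on the "S3 – better for larger objects" and "S3 – April usage" slides can be checked with a little arithmetic. The sketch below is an illustration added here, not part of the original deck. It uses only the 2009 prices quoted on those slides ($0.15 per GB-month, $0.15 per GB downloaded, $1 per 1mm GETs, $1 per 100k PUTs) and assumes decimal gigabytes; the function and variable names are made up for this example.

```python
# Prices as quoted on the slides (2009 figures from the deck, not current AWS pricing).
STORAGE_PER_GB_MONTH = 0.15      # USD per GB-month stored
TRANSFER_PER_GB = 0.15           # USD per GB downloaded (rate used on the April usage slide)
GET_PRICE = 1.0 / 1_000_000      # USD per GET ($1 per 1mm)
PUT_PRICE = 1.0 / 100_000        # USD per PUT ($1 per 100k)

BYTES_PER_GB = 10**9             # assumption: decimal gigabytes


def per_object_costs(size_bytes):
    """Return the per-object cost components in USD for one object of this size."""
    gb = size_bytes / BYTES_PER_GB
    return {
        "put": PUT_PRICE,
        "get": GET_PRICE,
        "storage_month": gb * STORAGE_PER_GB_MONTH,
        "download": gb * TRANSFER_PER_GB,
    }


def break_even_size_bytes(request_price, price_per_gb):
    """Object size at which one request costs as much as one GB-priced charge."""
    return request_price / price_per_gb * BYTES_PER_GB


image = per_object_costs(67_000)   # the 67 KB example
thumb = per_object_costs(6_700)    # the 6.7 KB example

# April photo line items, recomputed from the request and transfer counts.
downloads = 330 * TRANSFER_PER_GB
gets = 55_000_000 * GET_PRICE
puts = 42_000_000 * PUT_PRICE

if __name__ == "__main__":
    print("PUT break-even size (bytes):", round(break_even_size_bytes(PUT_PRICE, STORAGE_PER_GB_MONTH)))
    print("GET break-even size (bytes):", round(break_even_size_bytes(GET_PRICE, STORAGE_PER_GB_MONTH)))
    print("thumbnail PUT / storage ratio:", round(thumb["put"] / thumb["storage_month"], 1))
    print("April photos: downloads, GETs, PUTs =", round(downloads, 2), round(gets, 2), round(puts, 2))
```

Under these assumptions the break-even sizes land near the 67 KB and 6.7 KB examples, which is why small objects that are written often are dominated by PUT charges rather than by storage or transfer.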

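The AppEngine slide says several areas of HotPads are multi-threaded for a large request latency improvement. The deck does not show how, so the following is only a minimal sketch of the general idea: when one request needs several independent blocking lookups, running them on threads makes the request take about as long as the slowest lookup instead of the sum of all of them. It is written in Python for brevity even though HotPads runs Java; `lookup`, the shard list, and the 50 ms delay are hypothetical stand-ins.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def lookup(shard):
    """Stand-in for one blocking backend call (hypothetical; simulated with sleep)."""
    time.sleep(0.05)
    return shard * 2


def handle_sequential(shards):
    """Serve the request by calling each backend one after another."""
    return [lookup(s) for s in shards]


def handle_threaded(shards):
    """Serve the same request with one worker thread per backend call."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # pool.map preserves input order, so results match the sequential version.
        return list(pool.map(lookup, shards))


def timed(fn, arg):
    """Run fn(arg) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(arg)
    return result, time.perf_counter() - start


shards = list(range(10))
seq_result, seq_time = timed(handle_sequential, shards)
thr_result, thr_time = timed(handle_threaded, shards)

if __name__ == "__main__":
    print(f"sequential: {seq_time:.3f}s  threaded: {thr_time:.3f}s")
```

The speed-up in this toy case comes entirely from overlapping waits, which is the pattern a platform that cannot spawn threads rules out.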