AWS Customer Presentation - HotPads

Matt Corgan, Co-Founder and Director of Technology, HotPads.com talks at AWS Start-Up Event in Washington DC about their use of AWS.

Transcript

  • 1. HotPads.com on AWS. Matthew Corgan, President. May 27, 2009
  • 2. What is HotPads?
    • Real Estate search engine
      • Launched in May, 2005 in Washington, DC
        • Used The Planet for hosting until December, 2008
      • 9 employees, 6 engineers
      • 800,000 visits/month
      • 4.5 million page-views/month
      • 3.5 million real estate listings updated daily
      • Java and MySQL
  • 3. AWS costs in April
    • EC2 instances : $7,400
    • S3 : $1,500
    • EBS : $500
    • CloudFront : $460
    • EIPs : $8
    • RightScale : $500
      • 3rd-party management console
    • SQS : in development
    • Reserved instances : still evaluating
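
As a quick check on the figures above, the line items can be totaled to see how the April bill splits up. This is only a sketch of the arithmetic; the variable and function names are illustrative and not from the presentation.

```python
# Line items from slide 3, in USD for April.
april_costs = {
    "EC2 instances": 7400,
    "S3": 1500,
    "EBS": 500,
    "CloudFront": 460,
    "EIPs": 8,
    "RightScale": 500,
}

# Total monthly spend across all listed items.
total = sum(april_costs.values())


def share(item):
    """Fraction of the total bill attributed to one line item."""
    return april_costs[item] / total


print(f"Total: ${total:,}")
print(f"EC2 share: {share('EC2 instances'):.0%}")
```

On these numbers, EC2 instances account for roughly 70% of the bill.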
  • 4. Site components
    • [Architecture diagram; layout not recoverable from the transcript. Labels shown: HotPads.com, Load balancer, Web, Job, Indexing, MapTile, Messaging, Databases, MEM, EBS, S3 (Public and HotPads.com), CF (VA, TX, CA, International); instance sizes S, L, XL]
  • 5. S3 – better for larger objects
    • Latency > 10ms or even > 100ms
    • Memcached latency below 1ms
    • $0.15 per GB-month storage
    • $1 per 1mm GETs
    • $1 per 100k PUTs
    • Ex: 67 KB object (600px image)
      • PUT cost ~= storage cost ~= download cost
    • Ex: 6.7 KB object (15px thumbnail)
      • GET cost ~= storage cost ~= download cost
      • Careful! – PUT cost is 10x the storage and transfer costs
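
The two examples on this slide can be reproduced with a few lines of arithmetic. The sketch below uses the prices quoted on the slide, plus the $0.15/GB transfer rate quoted on the next slide; treating 1 GB as 10^9 bytes is an assumption, and the function name is illustrative.

```python
# Prices as quoted in the slides (USD).
STORAGE_PER_GB_MONTH = 0.15
TRANSFER_PER_GB = 0.15          # download rate quoted on slide 6
GET_COST = 1.0 / 1_000_000      # $1 per 1mm GETs
PUT_COST = 1.0 / 100_000        # $1 per 100k PUTs
BYTES_PER_GB = 1e9              # assumption: decimal gigabytes


def per_object_costs(size_kb):
    """Cost of one PUT, one GET, one month of storage, and one download."""
    size_gb = size_kb * 1000 / BYTES_PER_GB
    return {
        "put": PUT_COST,
        "get": GET_COST,
        "storage_month": size_gb * STORAGE_PER_GB_MONTH,
        "download": size_gb * TRANSFER_PER_GB,
    }


image = per_object_costs(67)       # 600px image
thumbnail = per_object_costs(6.7)  # thumbnail

for name, costs in (("image", image), ("thumbnail", thumbnail)):
    print(name, {k: f"${v:.7f}" for k, v in costs.items()})
```

For the 67 KB image, one PUT costs about the same as a month of storage or one download. For the 6.7 KB thumbnail, storage and transfer shrink tenfold but the PUT price does not, which is where the 10x warning comes from.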
  • 6. S3 – April usage
    • Photos
      • 330 GB downloaded @ $.15/GB = $49
      • 55mm GETs @ $1/mm = $55
      • 42mm PUTs @ $1/100k = $420!
    • Database backups
      • 4.4 TB stored @ $.15/GB = $660
        • Probably too many copies stored
    • Maptiles
      • ~$100 for downloads and GETs
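
The April figures follow directly from the unit prices on slide 5 ($1 per 1mm GETs, $1 per 100k PUTs, $0.15 per GB). A minimal sketch of that arithmetic, assuming 1 TB = 1,000 GB; the dictionary keys are illustrative labels.

```python
GB_PER_TB = 1000  # assumption: decimal units

# Usage figures from the slide multiplied by the slide 5 unit prices (USD).
s3_april = {
    "photo downloads": 330 * 0.15,                    # 330 GB at $0.15/GB
    "photo GETs": 55_000_000 / 1_000_000 * 1.00,      # $1 per million GETs
    "photo PUTs": 42_000_000 / 100_000 * 1.00,        # $1 per 100k PUTs
    "database backups": 4.4 * GB_PER_TB * 0.15,       # 4.4 TB stored
    "map tiles": 100.0,                               # approximate, per the slide
}

itemized_total = sum(s3_april.values())

for item, cost in s3_april.items():
    print(f"{item:18s} ${cost:8.2f}")
print(f"{'itemized total':18s} ${itemized_total:8.2f}")
```

The itemized lines come to roughly $1,285, somewhat below the $1,500 S3 total on slide 3; the slides do not itemize the difference.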
  • 7. CloudFront
    • HotPads uses for:
      • Static files : great
      • Map tiles : ok
      • Photos : toss-up, but we use anyway
        • Many photos are only viewed once
        • CloudFront miss has to go back to S3, so cache miss may take longer than going to S3 directly
        • Pay for 2 GETs on a miss
        • Maybe pay for 2x the transfer cost (not sure)
        • But, makes frequently viewed listings faster
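
The "2 GETs on a miss" point can be made concrete by counting billed requests as a function of cache hit ratio. This is a simplified model, not Amazon's billing logic: it assumes every viewer request is billed once by CloudFront and that each miss adds exactly one S3 GET. The function names are hypothetical.

```python
def billed_requests_via_cloudfront(requests, hit_ratio):
    """Requests billed when serving through CloudFront (simplified model).

    Every request is billed at the edge; each cache miss also
    triggers one GET against the S3 origin.
    """
    misses = round(requests * (1 - hit_ratio))
    return requests + misses


def billed_requests_direct(requests):
    """Requests billed when serving straight from S3."""
    return requests


# Photos that are only viewed once never hit the cache.
print(billed_requests_via_cloudfront(1000, hit_ratio=0.0))  # every request misses
print(billed_requests_via_cloudfront(1000, hit_ratio=0.9))  # popular listings
print(billed_requests_direct(1000))
```

Under this model the request overhead approaches 2x for content that is rarely re-requested and shrinks toward 1x as the hit ratio rises, which matches the "toss-up" verdict for photos.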
  • 8. EC2 breakdown
    • EC2 (currently all “memory” instance types)
      • Load balancers, HAProxy, 2 small = $150
      • Web servers, Tomcat, 3-5 large = $1,200
        • Scale out 11am to Midnight
      • Job servers, Tomcat, 5 large = ~$1,500
      • Index servers, Tomcat, 1 X-large, 1 large = ~$900
      • MySQL masters, 1 X-large, 2 large = ~$1,200
      • MySQL slaves, 1 X-large, 2 large = ~$1,200
      • Messaging server, ActiveMQ, 1 large = ~$300
      • Map tile creation servers, Tilecache, 1 large = ~$300
      • Development/testing/migration servers = ~$600
    • 8GB Memcached on permanent webs/jobs
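
Summing the per-role figures gives a rough cross-check against the EC2 line on slide 3. The sketch below just adds up the approximate monthly costs from this slide; the role labels are shortened for readability.

```python
# Approximate monthly EC2 cost per role, in USD, from slide 8.
ec2_by_role = {
    "load balancers": 150,
    "web servers": 1200,
    "job servers": 1500,
    "index servers": 900,
    "MySQL masters": 1200,
    "MySQL slaves": 1200,
    "messaging server": 300,
    "map tile creation": 300,
    "dev/test/migration": 600,
}

ec2_total = sum(ec2_by_role.values())

# Database servers (masters plus slaves) as a fraction of EC2 spend.
database_share = (ec2_by_role["MySQL masters"] + ec2_by_role["MySQL slaves"]) / ec2_total

print(f"EC2 total: ${ec2_total:,}")
print(f"Database share: {database_share:.0%}")
```

The roles add up to about $7,350, consistent with the $7,400 EC2 figure reported for April.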
  • 9. EBS – used for all databases
    • Cons
      • Black box: hard to determine the best usage
      • Adds costs above using local drives (but not too much)
      • Less bandwidth (not usually important for databases)
    • Pros
      • Lower average latency
      • Especially fast random writes
      • Snapshot backups allow for very short write-locks and only storing diffs
      • Ability to clone and hibernate databases
      • Redundancy
        • We had lost the local disks on a live master database twice
  • 10. Database utilization
    • I/O bound
    • RAIDing multiple volumes didn’t help much
    • Testing multiple drives with 1 schema per drive
  • 11. SimpleDB
    • Pros
      • Stand-alone DB servers are often drastically underutilized and a pain to administer, backup, and restore after failure
      • SimpleDB is schema-less
        • MySQL schema changes are a major problem
    • Cons
      • Binary stored values can’t be interpreted by a generic GUI and have to be encoded by the client
      • Tied to EC2 for latency reasons
      • Eventual consistency when accessed from different EC2 nodes
      • “Column” names may inflate storage size (unconfirmed)
      • Must partition a table before it hits 10 GB
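
Two of the cons above, client-side encoding of binary values and partitioning before a table reaches 10 GB, can be handled in the application layer. The sketch below shows one possible approach using only the Python standard library: Base64 for encoding and a hash of the item name to pick a partition. It does not call SimpleDB; the `listings` base name, the partition count, and the function names are all hypothetical.

```python
import base64
import hashlib


def encode_value(raw: bytes) -> str:
    """Encode binary data as ASCII text so it can be stored as a string value."""
    return base64.b64encode(raw).decode("ascii")


def decode_value(stored: str) -> bytes:
    """Reverse encode_value on read."""
    return base64.b64decode(stored)


def partition_for(item_name: str, base: str = "listings", partitions: int = 4) -> str:
    """Pick a partition name deterministically from the item name.

    Hashing spreads items evenly, so each partition holds roughly
    1/partitions of the data.
    """
    digest = hashlib.md5(item_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % partitions
    return f"{base}_{index}"


payload = bytes([0, 255, 16, 32])
stored = encode_value(payload)
print(stored, decode_value(stored) == payload)
print(partition_for("listing-12345"))
```

A fixed partition count is the simplest choice but means re-hashing items if the count changes later; the encoded values also remain opaque to a generic GUI, as the slide notes.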
  • 12. Reserved Instances
    • Pros
      • Get 1 year for the cost of 6 months
      • Guaranteed to get an instance
        • yes – we have been denied
    • Cons
      • Tied to particular instance type
        • Your needs may change
        • Amazon may introduce more appropriate instance types
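
Taking "1 year for the cost of 6 months" at face value, the break-even point is simple: a reservation pays off only if the instance would otherwise run on demand for more than six months of the year. A small sketch under that assumption; the $300 monthly figure is the approximate cost of one large instance from slide 8, and the function name is illustrative.

```python
def cheaper_option(months_used, on_demand_monthly):
    """Compare one year of reserved pricing against paying on demand.

    Assumes, per the slide, that a reserved year costs the same as
    six months of on-demand usage.
    """
    reserved_cost = 6 * on_demand_monthly
    on_demand_cost = months_used * on_demand_monthly
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


# A server that runs all year versus one needed only for a short project.
print(cheaper_option(12, 300))
print(cheaper_option(4, 300))
```

This ignores the cons listed above: if needs change or a better instance type appears mid-year, the unused reservation is still paid for.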
  • 13. How does AppEngine compare?
    • Benefits?
      • Low cost, no idle instances sitting around
      • No Linux administration
    • Why don’t we use it?
      • Java deployments limited to 1,000 files
      • Cannot spawn threads
        • Several areas of HotPads are multi-threaded for a 10x request latency improvement
      • Request limit of 30 seconds: no long jobs
      • Our indexes need a big, long-lived heap
    • Amazon lets you innovate more, and that’s our goal.
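
The threading point on the last slide is easiest to see with a fan-out pattern: when a request needs several independent lookups, running them in parallel makes the latency roughly that of the slowest lookup instead of the sum. The slides do not say what HotPads parallelizes, and its stack is Java, so the sketch below is only an illustration of the general technique; `search_shard`, the shard count, and the simulated delay are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard_id, query):
    """Stand-in for one I/O-bound lookup (e.g. a database or index call)."""
    time.sleep(0.05)  # simulated wait on the network or disk
    return f"{query}@shard{shard_id}"


def search_all(query, shards=10):
    """Run one lookup per shard in parallel and collect results in shard order."""
    with ThreadPoolExecutor(max_workers=shards) as pool:
        return list(pool.map(lambda shard: search_shard(shard, query), range(shards)))


start = time.perf_counter()
results = search_all("washington-dc")
elapsed = time.perf_counter() - start

print(len(results), "results in", f"{elapsed:.2f}s")
```

Run sequentially, ten 50 ms lookups would take about half a second; in parallel they finish in roughly the time of one. A platform that cannot spawn threads rules this pattern out.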