eBay From Ground Level to the Clouds


The mighty cloud draws businesses and developers who seek its agility and productivity. But which type of cloud is best? We moved eBay Marketplace, a major eCommerce site, from a traditional infrastructure to a cloud model. We will present the strategic, technical, and cost factors we weighed when choosing between cloud and automation, and between porting applications and rewriting them. We will explain why we ended up with a hybrid: developing our own internal cloud while leveraging the massive infrastructure of public cloud providers.

  • Virtual Data Center:
      • External cloud looks like an extension of eBay DC
      • All traffic goes through VPN (or private peering)
      • Internal IP space is shared between two domains
      • eBay’s DNS zones are delegated to cloud providers
      • DDOS/IDS is on eBay’s side
      • Most transparent model, but creates a lot of technical issues (good when application complexity requires it)
    Public Shared Cloud:
      • External cloud looks like a 3rd party
      • All traffic goes through Internet (or private peering)
      • eBay’s internal IP space is not accessible
      • Two DNS/IP management points
      • DDOS/IDS on public cloud side?
      • Most “cloud like” model, but has more limitations (good for isolated use cases)
  • Before:
      • Mesh of application dependencies (build time and run time)
      • One build per deployment environment
      • Big monolithic deliverables
      • Highly latency sensitive because of DB dependencies
    Ongoing:
      • Decomposition of applications into services
      • Modularization of code base (OSGi)
      • No more train releases
      • Build Once, Deploy Everywhere
      • Refactoring of DB dependencies behind services
      • Formal declaration of dependencies
      • ‘Cloud friendly’
    Future:
      • Migration of some data into “cloud friendly” DB (MongoDB, Cassandra, HBase, …)
      • Redesign of platform services (e.g. logging) to be less infrastructure dependent
      • ‘Cloud ready’
  • Implications:
      • IP space does not identify members of an application
      • Cannot use application name as a label on cables
      • Change in asset management (e.g. fulfillment, chargeback)
      • Less flexibility in h/w choice or customization (however, changing from small to large VM is faster)
      • Stricter isolation requirement to support multi-tenancy (virtual environments)
  • Today:
      • Missing features (Monitoring, DNS Mgt., LB Mgt., software deployment, PaaS level features…)
      • Not managing full lifecycle (focused on customer facing functions)
      • Impedance mismatch (ISP profile vs. eBay)
      • Infrastructure dependencies have scalability implications (mostly around network isolation)
      • Gaps but catching up fast: opportunity to contribute
    Future:
      • Adopt as much as possible and contribute
      • Increase open source footprint as maturity/feature set improves
      • Keep abstraction layer to provide eBay’s specific flavor and PDLC integration
      • Adopt gradually but keep eBay’s abstraction
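
The notes above recommend keeping an abstraction layer in front of whichever cloud backend serves a request. A minimal sketch of that idea follows. Every name in it (`ComputeProvider`, `InternalCloud`, `PublicCloud`, `CloudFacade`, `provision`) is hypothetical and chosen for illustration; none of them come from the talk or from any real provider SDK, and the backends are in-memory stubs.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Server:
    provider: str   # which backend created the server
    server_id: str  # backend-specific identifier
    size: str       # e.g. "small" or "large"


class ComputeProvider(ABC):
    """Hypothetical backend interface (an assumption, not a real API)."""

    name = "abstract"

    def __init__(self) -> None:
        self._count = 0

    def _next_id(self, app: str) -> str:
        # Generate a simple sequential identifier per backend.
        self._count += 1
        return f"{self.name}-{app}-{self._count}"

    @abstractmethod
    def create_server(self, app: str, size: str) -> Server:
        ...


class InternalCloud(ComputeProvider):
    """Stub standing in for an internally built cloud."""

    name = "internal"

    def create_server(self, app: str, size: str) -> Server:
        return Server(self.name, self._next_id(app), size)


class PublicCloud(ComputeProvider):
    """Stub standing in for an external provider."""

    name = "public"

    def create_server(self, app: str, size: str) -> Server:
        return Server(self.name, self._next_id(app), size)


class CloudFacade:
    """Single entry point that hides which backend serves a request."""

    def __init__(self, providers: Dict[str, ComputeProvider], default: str) -> None:
        if default not in providers:
            raise ValueError(f"unknown default provider: {default}")
        self._providers = providers
        self._default = default

    def provision(self, app: str, size: str, placement: Optional[str] = None) -> Server:
        # Callers name a placement only when they need one; otherwise
        # the facade applies its own default policy.
        provider = self._providers[placement or self._default]
        return provider.create_server(app, size)


if __name__ == "__main__":
    facade = CloudFacade(
        {"internal": InternalCloud(), "public": PublicCloud()},
        default="internal",
    )
    print(facade.provision("search", "small"))
    print(facade.provision("search", "large", placement="public"))
```

Because callers depend only on the facade, a backend can be swapped or added without changing application code, which is one way to read "adopt gradually but keep eBay’s abstraction".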

    1. 200M live listings; 22B page views/day; 9 PB of data; 6,000 application servers; 250M queries/day; 94M active users; 23M SLOC; $62B 2010 gross merchandise volume; 75B database calls/day
    2. Beta; PCI Compliant; PCI Compliant; Production; Research; Skunkworks; QA
    3. DR. Number of servers required based on utilization for 8 pools
    4. Even at 4x the internal cost, public cloud would save money. Chart axis: cloud cost to internal cost ratio. Chart regions: internal cost is dominant; external cost is dominant
    5. Private, Public, or Hybrid? Build, Buy, or Build + OSS?
    6. Service Catalog vs. REST APIs; Ticket driven run book automation vs. Model driven closed loop automation; Configuration Management Database (CMDB) vs. Distributed state management; Chargeback vs. Pay as you go; Server Virtualization vs. Multitenant infrastructure with secure isolation
    7. Cannot be automated: The task requires human involvement (e.g. racking and wiring). No support for automation: Component lacks API or requires UI based actions (e.g. checkpoint). Limited rate of change: Configuration requires restart, reload, file sync (e.g. Bind, ISC DHCP). No permission: Configuration requires special credential/role (e.g. firewall, network)
    8. Diagram labels: Application (App, App, App) with its own Spare and Infra; Application (App, App, App) with Global resource pool and Shared infrastructure
    9. First flow: request {nb servers, model, app}; order; receive & rack & wire; label (app); deliver; repurpose. Timing labels: “several” weeks, 1 w, 2-3 w. Second flow: request {nb servers, model}; order; receive pre-racked, pre-wired; deliver to cache; request {nb servers, model, app}; deliver; repurpose. Timing labels: quarterly, 45 min, 1 day, 2-3 w
    10. Diagram labels (the first group appears twice): IaaS/PaaS API; Resource Allocation; Distributed State; orchestration; Application Access Point; AuthN/AuthZ; Controllers; Compute Cluster Pool. Controllers: Compute Mgt., DNS Mgt., LB Mgt., Monitoring, Network Prov., Image/Pkg Repo, Software Dist. Open Source Solution (OpenStack / CloudStack)
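
Slides 3 and 4 tie the cost argument to utilization: internal capacity is sized for peak demand, while cloud capacity can follow actual usage. The sketch below works through that arithmetic under a deliberately simple assumption that is not taken from the talk: internal cost scales with peak server count, cloud cost scales with average servers in use, and a single `cost_ratio` multiplier captures how much more a cloud server costs. All function names and figures are hypothetical.

```python
def internal_cost(peak_servers: float, unit_cost: float) -> float:
    """Internal capacity is paid for at peak size, used or not (assumption)."""
    return peak_servers * unit_cost


def cloud_cost(avg_servers_used: float, unit_cost: float, cost_ratio: float) -> float:
    """Cloud capacity is paid for only while in use, at cost_ratio times
    the internal unit cost (assumption)."""
    return avg_servers_used * unit_cost * cost_ratio


def break_even_ratio(peak_servers: float, avg_servers_used: float) -> float:
    """Cost ratio at which both options cost the same: 1 / utilization."""
    if avg_servers_used <= 0 or avg_servers_used > peak_servers:
        raise ValueError("average usage must be in (0, peak_servers]")
    return peak_servers / avg_servers_used


if __name__ == "__main__":
    # Hypothetical pool: sized for 100 servers, 20 in use on average.
    peak, avg, unit = 100, 20, 1.0
    print("internal:", internal_cost(peak, unit))
    print("cloud at 4x:", cloud_cost(avg, unit, 4.0))
    print("break-even ratio:", break_even_ratio(peak, avg))
```

Under this model, a pool running at 20% average utilization breaks even at a 5x cost ratio, so a 4x cloud price still comes out cheaper. At 25% utilization or above, the internal option wins at 4x.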