The eBay Architecture Striking a balance between site stability, feature velocity, performance, and cost  Presented By:  R...
What we ’re up against <ul><ul><li>eBay users worldwide trade more than $1812 worth of goods every second   </li></ul></ul...
Site Statistics: in a typical  day … June 1999 Q3 2006 Growth Outbound Emails 1 M 35 M 35x Total Page Views 54 M 1 B 19x P...
eBay ’s Exponential Growth 212 Million Users 1999 2000 2001 2002 2003 1998 2004 2005 105 Million Listings 2006
Velocity of eBay -- Software Development Process <ul><li>Our site is our product.  We change it incrementally through impl...
Systemic Requirements Maintainability   Faster Product Delivery Enable rapid business innovation Enable seamless growth De...
Architectural Lessons <ul><li>Scale Out, Not Up </li></ul><ul><ul><li>Horizontal scaling at every tier. </li></ul></ul><ul...
Ongoing Platform Evolution … V1 V2.0 V2.4 V3 V2.3 eBay architecture versions Registered Users 212M V4 V2.5
V1.0  1995-September 1997 <ul><li>Built over a weekend in Pierre Omidyar ’ s living room in  1995 </li></ul><ul><li>System...
V2.0  September 1997- February 1999 <ul><li>3-tiered conceptual architecture (separation of bus/pres and db access tiers) ...
V2.1  February 1999-November 1999 <ul><li>Servers grouped into pools (small soldiers) </li></ul><ul><li>Resonate used for ...
V2.3  June 1999-November 1999 <ul><li>Second Database added for failover </li></ul><ul><li>CGI pools, Listings, Pages, and...
V2.4  November 1999-April 2001 <ul><li>Database &quot;split&quot; technology. </li></ul><ul><li>Logically partition databa...
V2.5  April 2001  –  December 2002 November, 1999 December, 2002 <ul><li>Horizontal scalability through database splits </...
Now that we have the Database taken care of…. <ul><li>Application Server </li></ul><ul><ul><li>Monolithic 2-tier Architect...
V3 – Replace C++/ISAPI with Java  2002-present <ul><li>Re-wrote the entire application in J2EE application server framewor...
Scaling the Data Tier
Scaling the Data Tier: Overview <ul><li>Spread the Load </li></ul><ul><ul><li>Segmentation by function. </li></ul></ul><ul...
Scaling the Data Tier:  Functional Segmentation <ul><li>Segment databases into functional areas </li></ul><ul><ul><li>User...
Scaling the Data Tier:  Horizontal Split <ul><li>Split databases horizontally by primary access path. </li></ul><ul><li>Di...
Scaling the Data Tier:  Logical Database Hosts  <ul><li>Separate Application notion of a database from physical implementa...
Scaling the Data Tier:  Minimize DB Resources <ul><li>No business logic in database </li></ul><ul><ul><li>No stored proced...
Scaling the Data Tier:  Minimize DB Transactions <ul><li>Auto-commit for vast majority of DB writes </li></ul><ul><li>Abso...
Scaling the Application Tier
Scaling the Application Tier – Overview <ul><li>Spread the Load </li></ul><ul><ul><li>Segmentation by function. </li></ul>...
Scaling the Application Tier – Massively Scaling J2EE <ul><li>Step 1 - Throw out most of J2EE </li></ul><ul><ul><li>eBay s...
Scaling the Application Tier – Tiered Application Model BO/BOF AO/AOF (View) Business Logic XML Model Building Logic Comma...
Scaling the Application Tier – Data Access Layer (DAL) <ul><li>What is the DAL? </li></ul><ul><ul><li>eBay ’s internally-d...
Scaling the Application Tier – Vertical Code Partitioning <ul><li>Partition code into functional areas </li></ul><ul><ul><...
Scaling the Application Tier – Functional Segmentation IIS IIS ViewItem Pool   http://cgi.ebay.com … … … User Acct Caty20+...
Scaling the Application Tier – Platform Decoupling <ul><li>Domain Partitioning for Deployment </li></ul><ul><ul><li>Decoup...
Scaling Search
Scaling Search – Overview <ul><li>In 2002, eBay search had reached its limits </li></ul><ul><ul><li>Cost of scaling third-...
Scaling Search – Voyager <ul><li>Real-time feeder infrastructure </li></ul><ul><ul><li>Reliable multicast from primary dat...
Scaling Search – Voyager
Scaling Operations
Scaling Operations – Code Deployment <ul><li>Demanding Requirements </li></ul><ul><ul><li>Entire site rolled every 2 weeks...
Scaling Operations – Monitoring <ul><li>Centralized Activity Logging (CAL) </li></ul><ul><ul><li>Transaction oriented logg...
Recap <ul><li>Enabling seamless growth </li></ul><ul><li>Massive Database and Code Scalability </li></ul><ul><li>Deliverin...
Upcoming SlideShare
Loading in …5
×

The eBay Architecture: Striking a Balance between Site Stability, Feature Velocity, Performance, and Cost

22,120 views

Published on

eBay architects Randy Shoup and Dan Pritchett give a guided tour of the eBay architecture. They cover the evolution of the technology stack from Perl to C++ to Java. And they discuss scaling strategies for the data tier, application tier, search, and operations.

Published in: Technology
2 Comments
7 Likes
Statistics
Notes
  • Future of KiJiJi: http://www.slideshare.net/ishmelev/kijiji-strategy
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • Ebay's latest 'purge' last week of some 15,000 sellers is ostensibly being 'justified' by their propagandists as being for 'site stability'.
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
No Downloads
Views
Total views
22,120
On SlideShare
0
From Embeds
0
Number of Embeds
1,681
Actions
Shares
0
Downloads
147
Comments
2
Likes
7
Embeds 0
No embeds

No notes for slide
  • A few more statistics for you. In 1999, eBay was already a giant in the ecommerce space, but things were just really starting to pick up. &lt;CLICK&gt; Now, we are at a tremendous scale. &lt;CLICK&gt; The growth is just amazing. In only 5 years, we have seen 1600% growth in traffic. And because of all of the added functionality of the site, we have seen even higher growth rates in other areas: For instance, hosting pictures was a growing issue for our users. So we added eBay Picture Services. That ’ s one of the reasons that the Network Utilization has grown so quickly. Availability: You might be saying “ It only changed by roughly 3%, that ’ s not a 50x improvement ” You ’ re right, but let ’ s think about what that number really means. First, a loss of availability does not mean that the entire site is unreachable. It means that at least one of our customers might be impacted and unable to search, list, view or buy a product. 97% uptime = 43 minutes (interrupted service) per day. That ’ s a lot, and that ’ s what we were living with in 1999. Today… 99.94% availability = 50 seconds of interrupted service per day Remember that math the next time someone asks you why it ’ s so hard to improve availability. ;-)
  • Exponential Growth. Business people just love graphs like that don ’ t they? But for us technologist, exponential growth means something very different. I ’ m trained as a nuclear engineer, so it means something even more interesting to me… But that ’ s what we ’ ve been building for over the last 10 years. The first 5 were pretty easy, we had a small group of collectors using the site to trade within a community of like-minded individuals. But after that, things started to become more challenging. We had people starting to quit their jobs and work full-time selling on eBay, we had businesses using us as a new sales channel, and we had our buyers that had found one of the most diverse sets of products every offered in one place. We needed a little process to make all of the changes that this growing community was asking us to produce.
  • This sort of guaranteed delivery makes it possible for the business to reliably plan marketing campaigns and monitor the success of their initiatives. The velocity of code change, while maintaining reliability is unlike anything I had every seen before joining eBay.
  • Registered Users: Almost 40,000/day Listings Approximately 100M/quarter or over 1MM/day GMS Approximately $10B/year; $30MMsold/day
  • V1 was like any system you or I would build at home
  • V2 moved to industry standard technology
  • Then we added redundancy at the application tier and a real search engine
  • We added redundancy at the database tier and more capacity
  • We logically split to the databases and began to move the physical locations as well
  • And we could now migrate the databases to separate, less expensive servers, without affecting the running system negatively.
  • INTRODUCTION 3 -- Objectives List the specific skills or knowledge attendees should gain from this presentation. Write in terms of what the attendee will be able to do with the information when they leave the session – as opposed to all the features of the product. This is an important distinction. Effective Presentation TIPS The purpose of your presentation should be to provide information that will assist attendees in their work back on the job. This requires that you have some level of understanding of the type of work they do and the information they would need to carry out the job effectively and efficiently. So you should think in terms of what they will be doing back on the job. **see below ** The following is a list of topic categories Support Services, Professional Services, and Sales groups have indicated are critical in performing their jobs: SUPPORT SERVICES – Solution Center Engineers, Field and Support Engineers How to identify appropriate uses for the product The product&apos;s relative position within the product family How to recognize &amp; identify the component(s) and sub-component(s) of the product How to install the product How to configure the product How to identify &amp; correct common problems How to identify &amp; correct problems arising from integration with other Sun (or non-Sun) products (continued) But everything wasn ’ t perfect. We were now hitting scaling limits of different types. - code maintenance - code complexity - search
  • More than any specific technology, we needed to re-write the application for the realities of today.
  • Maybe these all become individual slides?
  • Most people don ’ t think of search when they think of eBay, but it ’ s one of the most challenging Information Retrieval projects in the world. We have 6M new items entering and exiting the site every day. And each item is updated on average 5 times during it ’ s lifetime. No off-the-shelf product today can handle the job.
  • The architecture of Search is pretty interesting, and not just because I designed it. The system is built to function regardless of failures within the feeding stream, and every part of the system can be scaled horizontally.
  • The eBay Architecture: Striking a Balance between Site Stability, Feature Velocity, Performance, and Cost

    1. 1. The eBay Architecture Striking a balance between site stability, feature velocity, performance, and cost Presented By: Randy Shoup and Dan Pritchett Date: February 9, 2007 Amazon Friday Learning Series
    2. 2. What we ’re up against <ul><ul><li>eBay users worldwide trade more than $1812 worth of goods every second  </li></ul></ul><ul><ul><li>eBay averages over 1 billion page views per day </li></ul></ul><ul><ul><li>At any given time, there are approximately 100 million listings on the site </li></ul></ul><ul><ul><li>eBay stores over 2 Petabytes of data – over 200 times the size of the Library of Congress! </li></ul></ul><ul><ul><li>The eBay platform handles 3 billion API calls per month </li></ul></ul><ul><li>eBay manages … </li></ul><ul><ul><li>Over 212,000,000 registered users </li></ul></ul><ul><ul><li>Over 1 Billion photos </li></ul></ul><ul><li>In a dynamic environment </li></ul><ul><ul><li>300+ features per quarter </li></ul></ul><ul><ul><li>We roll 100,000+ lines of code every two weeks </li></ul></ul><ul><li>In 33 countries, in seven languages, 24x7 </li></ul>>26 Billion SQL executions/day! An SUV is sold every 5 minutes A sporting good sells every 2 seconds Over ½ Million pounds of Kimchi are sold every year!
    3. 3. Site Statistics: in a typical day … June 1999 Q3 2006 Growth Outbound Emails 1 M 35 M 35x Total Page Views 54 M 1 B 19x Peak Network Utilization 268 Mbps 16 Gbps 59x API Calls 0 100 M N/A Availability ~97% 99.94% 50x 43 mins/day 50 sec/day
    4. 4. eBay ’s Exponential Growth 212 Million Users 1999 2000 2001 2002 2003 1998 2004 2005 105 Million Listings 2006
    5. 5. Velocity of eBay -- Software Development Process <ul><li>Our site is our product. We change it incrementally through implementing new features. </li></ul><ul><li>Very predictable development process – trains leave on-time at regular intervals (weekly). </li></ul><ul><li>Parallel development process with significant output -- 100,000 LOC per release. </li></ul><ul><li>Always on – over 99.94% available. </li></ul>6M LOC 100K LOC/Wk 99.94% 212M Users 300+ Features Per Quarter <ul><ul><ul><li>All while supporting a 24x7 environment </li></ul></ul></ul>
    6. 6. Systemic Requirements Maintainability Faster Product Delivery Enable rapid business innovation Enable seamless growth Deliver quality functionality at accelerating rates Architect for the future 10X Growth Availability Reliability Massive Scalability Security
    7. 7. Architectural Lessons <ul><li>Scale Out, Not Up </li></ul><ul><ul><li>Horizontal scaling at every tier. </li></ul></ul><ul><ul><li>Functional decomposition. </li></ul></ul><ul><li>Prefer Asynchronous Integration </li></ul><ul><ul><li>Minimize availability coupling. </li></ul></ul><ul><ul><li>Improve scaling options. </li></ul></ul><ul><li>Virtualize Components </li></ul><ul><ul><li>Reduce physical dependencies. </li></ul></ul><ul><ul><li>Improve deployment flexibility. </li></ul></ul><ul><li>Design for Failure </li></ul><ul><ul><li>Automated failure detection and notification. </li></ul></ul><ul><ul><li>“ Limp mode” operation of business features. </li></ul></ul>
    8. 8. Ongoing Platform Evolution … V1 V2.0 V2.4 V3 V2.3 eBay architecture versions Registered Users 212M V4 V2.5
    9. 9. V1.0 1995-September 1997 <ul><li>Built over a weekend in Pierre Omidyar ’ s living room in 1995 </li></ul><ul><li>System hardware was made up of parts that could be bought at Fry's </li></ul><ul><li>Every item was a separate file, generated by a Perl script </li></ul><ul><li>No search functionality, only category browsing </li></ul>This system maxed out at 50,000 active items
    10. 10. V2.0 September 1997- February 1999 <ul><li>3-tiered conceptual architecture (separation of bus/pres and db access tiers) </li></ul><ul><li>2-tiered physical implementation (no application server) </li></ul><ul><li>C++ Library (eBayISAPI.dll) running on IIS on Windows </li></ul><ul><li>Microsoft index server used for search </li></ul><ul><li>Items migrated from GDBM to an Oracle database on Solaris </li></ul>
    11. 11. V2.1 February 1999-November 1999 <ul><li>Servers grouped into pools (small soldiers) </li></ul><ul><li>Resonate used for front end load balancing and failover </li></ul><ul><li>Search functionality moved to the Thunderstone indexing system </li></ul><ul><li>Back-end Oracle database server scaled vertically to a larger machine (Sun E10000) </li></ul>
    12. 12. V2.3 June 1999-November 1999 <ul><li>Second Database added for failover </li></ul><ul><li>CGI pools, Listings, Pages, and Search continued to scale horizontally </li></ul>However … By November 1999, the database servers approached their limits of physical growth.
    13. 13. V2.4 November 1999-April 2001 <ul><li>Database &quot;split&quot; technology. </li></ul><ul><li>Logically partition database into separate instances. </li></ul><ul><li>Horizontal scalability through 2000, but not beyond. </li></ul>
    14. 14. V2.5 April 2001 – December 2002 November, 1999 December, 2002 <ul><li>Horizontal scalability through database splits </li></ul><ul><li>Items split by category </li></ul><ul><li>SPOF elimination </li></ul>Bear Bull
    15. 15. Now that we have the Database taken care of…. <ul><li>Application Server </li></ul><ul><ul><li>Monolithic 2-tier Architecture </li></ul></ul><ul><ul><li>3.3 Million Line C++ ISAPI DLL (150MB binary) </li></ul></ul><ul><ul><li>Hundreds of developers, all working on the same code </li></ul></ul><ul><ul><li>Hitting compiler limits on number of methods per class (!!) </li></ul></ul>
    16. 16. V3 – Replace C++/ISAPI with Java 2002-present <ul><li>Re-wrote the entire application in J2EE application server framework </li></ul><ul><ul><li>Gave us a chance to architect the code for reuse and separation of duties </li></ul></ul><ul><li>Leveraged the MSXML framework for the presentation layer </li></ul><ul><ul><li>Minimizing the development cost for migration </li></ul></ul><ul><li>Implemented a development kernel as a foundation for programmers </li></ul><ul><ul><li>Allowed for rapid training and deployment of new engineers </li></ul></ul>
    17. 17. Scaling the Data Tier
    18. 18. Scaling the Data Tier: Overview <ul><li>Spread the Load </li></ul><ul><ul><li>Segmentation by function. </li></ul></ul><ul><ul><li>Horizontal splits within functions. </li></ul></ul><ul><li>Minimize the Work </li></ul><ul><ul><li>Limit in database work </li></ul></ul><ul><li>The Tricks to Scaling </li></ul><ul><ul><li>How to survive without transactions. </li></ul></ul><ul><ul><li>Creating alternate database structures. </li></ul></ul>
    19. 19. Scaling the Data Tier: Functional Segmentation <ul><li>Segment databases into functional areas </li></ul><ul><ul><li>User hosts </li></ul></ul><ul><ul><li>Item hosts </li></ul></ul><ul><ul><li>Account hosts </li></ul></ul><ul><ul><li>Feedback hosts </li></ul></ul><ul><ul><li>Transaction hosts </li></ul></ul><ul><ul><li>And about 70 more functional categories </li></ul></ul><ul><li>Rationale </li></ul><ul><ul><li>Partitions data by different scaling / usage characteristics </li></ul></ul><ul><ul><li>Supports functional decoupling and isolation </li></ul></ul>
    20. 20. Scaling the Data Tier: Horizontal Split <ul><li>Split databases horizontally by primary access path. </li></ul><ul><li>Different patterns for different use cases </li></ul><ul><ul><li>Write Master/Read Slaves </li></ul></ul><ul><ul><li>Segmentation by data; Two approaches </li></ul></ul><ul><ul><ul><li>Modulo on a key, typically the primary key. </li></ul></ul></ul><ul><ul><ul><ul><li>Simple data location if you know the key </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Not so simple if you don ’t. </li></ul></ul></ul></ul><ul><ul><ul><li>Map to data location </li></ul></ul></ul><ul><ul><ul><ul><li>Supports multiple keys. </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Doubles reads required to locate data. </li></ul></ul></ul></ul><ul><ul><ul><ul><li>SPOF elimination on map structure is complex. </li></ul></ul></ul></ul><ul><li>Rationale </li></ul><ul><ul><li>Horizontal scaling of transactional load. </li></ul></ul><ul><ul><li>Segment business impact on database outage. </li></ul></ul>
    21. 21. Scaling the Data Tier: Logical Database Hosts <ul><li>Separate Application notion of a database from physical implementation </li></ul><ul><li>Databases may be combined and separated with no code changes </li></ul><ul><li>Reduce cost of creating multiple environments (Dev, QA, …) </li></ul>DB2 DB3 DB1 Application Servers Attributes Catalogs Rules CATY 1..N User Account Feedback Misc API SCRATCH
    22. 22. Scaling the Data Tier: Minimize DB Resources <ul><li>No business logic in database </li></ul><ul><ul><li>No stored procedures </li></ul></ul><ul><ul><li>Only very simple triggers (default value population) </li></ul></ul><ul><li>Move CPU-intensive work to applications </li></ul><ul><ul><li>Referential Integrity </li></ul></ul><ul><ul><li>Joins </li></ul></ul><ul><ul><li>Sorting </li></ul></ul><ul><li>Extensive use of prepared statements and bind variables </li></ul>
    23. 23. Scaling the Data Tier: Minimize DB Transactions <ul><li>Auto-commit for vast majority of DB writes </li></ul><ul><li>Absolutely no client side transactions </li></ul><ul><ul><li>Single database transactions managed through anonymous PL/SQL blocks. </li></ul></ul><ul><ul><li>No distributed transactions. </li></ul></ul><ul><li>How do we pull it off? </li></ul><ul><ul><li>Careful ordering of DB operations </li></ul></ul><ul><ul><li>Recovery through </li></ul></ul><ul><ul><ul><li>Asynchronous recovery events </li></ul></ul></ul><ul><ul><ul><li>Reconciliation batch </li></ul></ul></ul><ul><ul><ul><li>Failover to async flow </li></ul></ul></ul><ul><li>Rationale </li></ul><ul><ul><li>Avoid deadlocks </li></ul></ul><ul><ul><li>Avoid coupling availability </li></ul></ul><ul><ul><li>Update concurrency </li></ul></ul><ul><ul><li>Seamless handling of splits </li></ul></ul>
    24. 24. Scaling the Application Tier
    25. 25. Scaling the Application Tier – Overview <ul><li>Spread the Load </li></ul><ul><ul><li>Segmentation by function. </li></ul></ul><ul><ul><li>Horizontal load-balancing within functions. </li></ul></ul><ul><li>Minimize dependencies </li></ul><ul><ul><li>Between applications </li></ul></ul><ul><ul><li>Between functional areas </li></ul></ul><ul><ul><li>From applications to data tier resources </li></ul></ul><ul><li>Virtualize data access </li></ul>
    26. 26. Scaling the Application Tier – Massively Scaling J2EE <ul><li>Step 1 - Throw out most of J2EE </li></ul><ul><ul><li>eBay scales on servlets and a rewritten connection pool. </li></ul></ul><ul><li>Step 2 – Keep Application Tier Completely Stateless </li></ul><ul><ul><li>No session state in application tier </li></ul></ul><ul><ul><li>Transient state maintained in cookie or scratch database </li></ul></ul><ul><li>Step 3 – Cache Where Possible </li></ul><ul><ul><li>Cache common metadata across requests, with sophisticated cache refresh procedures </li></ul></ul><ul><ul><li>Cache reload from local storage </li></ul></ul><ul><ul><li>Cache request data in ThreadLocal </li></ul></ul>
    27. 27. Scaling the Application Tier – Tiered Application Model BO/BOF AO/AOF (View) Business Logic XML Model Building Logic Command (View) DO/DAO XSL Business Tier Presentation Tier Integration Tier Data Access Layer (DAL) <ul><li>Strictly partition application into tiers </li></ul><ul><ul><li>Presentation </li></ul></ul><ul><ul><li>Business </li></ul></ul><ul><ul><li>Integration </li></ul></ul>
    28. 28. Scaling the Application Tier – Data Access Layer (DAL) <ul><li>What is the DAL? </li></ul><ul><ul><li>eBay ’s internally-developed pure Java OR mapping solution. </li></ul></ul><ul><ul><li>All CRUD (Create Read Update Delete) operations are performed through DAL ’s abstraction of the data. </li></ul></ul><ul><ul><li>Enables horizontal scaling of the Data tier without application code changes </li></ul></ul><ul><li>Dynamic Data Routing abstracts application developers from </li></ul><ul><ul><li>Database splits </li></ul></ul><ul><ul><li>Logical / Physical Hosts </li></ul></ul><ul><ul><li>Markdown </li></ul></ul><ul><ul><li>Graceful degradation </li></ul></ul><ul><li>Extensive JDBC Prepared Statements cached by DataSources </li></ul>
    29. 29. Scaling the Application Tier – Vertical Code Partitioning <ul><li>Partition code into functional areas </li></ul><ul><ul><li>Application is specific to a single area (Selling, Buying, etc.) </li></ul></ul><ul><ul><li>Domain contains common business logic across Applications </li></ul></ul><ul><li>Restrict inter-dependencies </li></ul><ul><ul><li>Applications depend on Domains, not on other Applications </li></ul></ul><ul><ul><li>No dependencies among shared Domains </li></ul></ul>Core-Domain PersonalizationDomain LookupDomain UserValidationDomain SharedBillingDomain SharedSearchDomain API Domain SharedBuyingDomain myEbayDomain UserApplication BuyingDomain BillingDomain SearchDomain SellingDomain UserDomain SellingApplication BuyingApplication SearchApplication BillingApplication Applications Shared Domains
    30. 30. Scaling the Application Tier – Functional Segmentation IIS IIS ViewItem Pool http://cgi.ebay.com … … … User Acct Caty20+ Caty1 AppServers … IIS IIS CGI0 CGI5 AS AS AS AS SYI Pool http://cgi5.ebay.com … IIS WebServers Load Balancing Load Balancing Load Balancing <ul><li>Segment functions into separate application pools </li></ul><ul><ul><li>Minimizes / isolates DB dependencies </li></ul></ul><ul><ul><li>Allows for parallel development, deployment, and monitoring </li></ul></ul>AS AS
    31. 31. Scaling the Application Tier – Platform Decoupling <ul><li>Domain Partitioning for Deployment </li></ul><ul><ul><li>Decouple non-transactional domains from transactional flows </li></ul></ul><ul><ul><ul><li>Search and billing domains are not required in transaction processing. </li></ul></ul></ul><ul><ul><ul><li>Fraud domain is required but easier to manage as separate deployment. </li></ul></ul></ul><ul><ul><li>Integrate with a combination of asynchronous EDA and synchronous SOA patterns. </li></ul></ul>Transaction Platform Billing Search Fraud EDA EDA SOA
    32. 32. Scaling Search
    33. 33. Scaling Search – Overview <ul><li>In 2002, eBay search had reached its limits </li></ul><ul><ul><li>Cost of scaling third-party search engine had become prohibitive </li></ul></ul><ul><ul><li>9 hours to update the index </li></ul></ul><ul><ul><li>Running on largest systems vendor sold – and still not keeping up </li></ul></ul><ul><li>eBay has unique search requirements </li></ul><ul><ul><li>Real-time updates </li></ul></ul><ul><ul><ul><li>Update item on any change (list, bid, sale, etc.) </li></ul></ul></ul><ul><ul><ul><li>Users expect changes to be visible immediately </li></ul></ul></ul><ul><ul><li>Exhaustive recall </li></ul></ul><ul><ul><ul><li>Sellers notice if search results miss any item </li></ul></ul></ul><ul><ul><ul><li>Search results require data ( “histograms”) from every matching item </li></ul></ul></ul><ul><ul><li>Flexible data storage </li></ul></ul><ul><ul><ul><li>Keywords </li></ul></ul></ul><ul><ul><ul><li>Structured categories and attributes </li></ul></ul></ul><ul><li>No off-the-shelf product met these needs </li></ul>
    34. 34. Scaling Search – Voyager <ul><li>Real-time feeder infrastructure </li></ul><ul><ul><li>Reliable multicast from primary database to search nodes </li></ul></ul><ul><li>Real-time indexing </li></ul><ul><ul><li>Search nodes update index in real time from messages </li></ul></ul><ul><li>In-memory search index </li></ul><ul><li>Horizontal segmentation </li></ul><ul><ul><li>Search index divided into N slices ( “columns”) </li></ul></ul><ul><ul><li>Each slice is replicated to M instances ( “rows”) </li></ul></ul><ul><ul><li>Aggregator parallelizes query over all N slices, load-balances over M instances </li></ul></ul><ul><li>Caching </li></ul><ul><ul><li>Cache results for highly expensive and frequently used queries </li></ul></ul>
    35. 35. Scaling Search – Voyager
    36. 36. Scaling Operations
    37. 37. Scaling Operations – Code Deployment <ul><li>Demanding Requirements </li></ul><ul><ul><li>Entire site rolled every 2 weeks </li></ul></ul><ul><ul><li>All deployments require staged rollout with immediate rollback if necessary. </li></ul></ul><ul><ul><li>More than 100 WAR configurations. </li></ul></ul><ul><ul><li>Dependencies exist between pools during some deployment operations. </li></ul></ul><ul><ul><li>More than 15,000 instances across eight physical data centers. </li></ul></ul><ul><li>Rollout Plan </li></ul><ul><ul><li>Custom application that works from dependencies provided by projects. </li></ul></ul><ul><ul><li>Creates transitive closure of dependencies. </li></ul></ul><ul><ul><li>Generates rollout plan for Turbo Roller. </li></ul></ul><ul><li>Automated Rollout Tool ( “Turbo Roller”) </li></ul><ul><ul><li>Manages full deployment cycle onto all application servers. </li></ul></ul><ul><ul><li>Executes rollout plan. </li></ul></ul><ul><ul><li>Built in checkpoints during rollout, including approvals. </li></ul></ul><ul><ul><li>Optimized rollback, including full rollback of dependent pools. </li></ul></ul>
    38. 38. Scaling Operations – Monitoring <ul><li>Centralized Activity Logging (CAL) </li></ul><ul><ul><li>Transaction oriented logging per application server </li></ul></ul><ul><ul><ul><li>Transaction boundary starts at request. Nested transactions supported. </li></ul></ul></ul><ul><ul><ul><li>Detailed logging of all application activity, especially database and other external resources. </li></ul></ul></ul><ul><ul><ul><li>Application generated information and exceptions can be reported. </li></ul></ul></ul><ul><ul><li>Logging streams gathered and broadcast on a message bus. </li></ul></ul><ul><ul><ul><li>Subscriber to log to files (1.5TB/day) </li></ul></ul></ul><ul><ul><ul><li>Subscriber to capture exceptions and generate operational alerts. </li></ul></ul></ul><ul><ul><ul><li>Subscriber for real time application state monitoring. </li></ul></ul></ul><ul><ul><li>Extensive Reporting </li></ul></ul><ul><ul><ul><li>Reports on transactions (page and database) per pool. </li></ul></ul></ul><ul><ul><ul><li>Relationships between URL ’s and external resources. </li></ul></ul></ul><ul><ul><ul><li>Inverted relationships between databases and pools/URL ’s. </li></ul></ul></ul><ul><ul><ul><li>Data cube reporting on several key metrics available in near real time. </li></ul></ul></ul>
    39. 39. Recap <ul><li>Enabling seamless growth </li></ul><ul><li>Massive Database and Code Scalability </li></ul><ul><li>Delivering quality functionality at accelerating rates </li></ul><ul><li>Further streamline and optimize the eBay development model </li></ul><ul><li>Enabling rapid business innovation </li></ul>Maintainability Faster Product Delivery Architecting for the future 10X Growth Availability Reliability Massive Scalability Security

    ×