Your SlideShare is downloading. ×

eBay Architecture

11,479

Published on

1 Comment
10 Likes
Statistics
Notes
No Downloads
Views
Total Views
11,479
On Slideshare
0
From Embeds
0
Number of Embeds
6
Actions
Shares
0
Downloads
618
Comments
1
Likes
10
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. eBay  Architecture  Scalability  with  Agility  Tony  Ng  Director,  Systems  Architecture  October  2011  
  • 2. About  Me   •  eBay  –  Systems  Architecture  and  Engineering   •  Yahoo!  –  Social,  Developer  PlaEorms,  YQL   •  Sun  Microsystems  –  J2EE,  GlassFish,  JSRs   •  Author  of  books  on  J2EE,  SOA      2  
  • 3. eBay  Stats   •  97  million  acQve  users   •  62B  Gross  Merchandise  Volume  in  2010   •  200  million  items  for  sale  in  50,000  categories   •  A  cell  phone  is  sold  every  5  seconds  in  US   •  An  iPad  sold  every  2.2  minutes  in  US   •  A  pair  of  shoes  sold  every  9  seconds  in  US   •  A  passenger  vehicle  sold  every  2  minutes   •  A  motorcycle  sold  every  6  minutes   http://www.ebayinc.com/factsheets3  
  • 4. eBay  Scale   •  9  Petabytes  of  data  storage   •  10,000  applicaQon  servers   •  44  million  lines  of  code   •  2  billion  pictures   •  A  typical  day   –  75B  database  calls   –  4B  page  views   –  250B  search  queries   –  Billions  of  service  calls   –  Hundreds  of  millions  of  internal  asynchronous  events  4  
  • 5. History  of  Technology   ·∙ Java ·∙ V4 Components ·∙ Services ·∙ Internal Cloud ·∙ Platform ·∙ Java ·∙ XSL ·∙ Layered ·∙ Horizontal Scale Architecture Maturity Innovation Potential ·∙ Some APIs ·∙ Perl/C++ Agility / TTM ·∙ Inline HTML ·∙ Monolithic ·∙ Vertical Scale ·∙ Walled Garden 2009+ 2001 2005 1995 1999
  • 6. eBay  Scalable  Architecture   •  ParQQon  everything   –  Databases,  applicaQon  Qer,  search  engine   •  Stateless  preference   –  No  session  state  in  app  Qer   •  Asynchronous  processing   –  Event  streams,  batch   •  Manage  failures   –  Central  applicaQon  logging   –  Mark  downs  6  
  • 7. Next  Challenges   •  To  stay  compeQQve,  we  need  to  deliver  quality   features  and  innovaQons  at  acceleraQng  paces   •  Complexity  as  our  codebase  grows   •  Improve  developer  producQvity   •  Enable  faster  Qme-­‐to-­‐market  while  maintaining  site   stability  7  
  • 8. Scalability  with  Agility   •  Strategy  1:  AutomaQon  with  Cloud   •  Strategy  2:  Next  Gen  Service  OrientaQon   •  Strategy  3:  Modularity  8  
  • 9. Automa=on  with  Cloud  9  
  • 10. Hardware  Acquisi=on   request order receive & deliver { servers, rack & wiremodel, app } Label (app) “several” weeks 2-3 w 1w repurpose request order Receive deliver to request deliver {servers, pre-racked cache {servers, model } Pre-wired model, app }quarterly minutes 2-3 w 1 day repurpose 10  
  • 11. Improving  U=liza=on   DR Number of servers required based on utilization for 8 pools 11  
  • 12. Infrastructure  Virtualiza=on  Application App App App Application App App App Spare spare spare spare Global resource pool Infra Infra Infra Infra Shared infrastructure
  • 13. eBay  Cloud   Self Service Automation Capacity Management Portal Resource Allocation Virtualization Spare Capacity pool Hardware Acquisition provisioning in minutes Improved Time to Market
  • 14. Architecture  Decision  14  
  • 15. Infrastructure  &  PlaIorm  as  a  service   Higher developer productivity Full application level automation Enables innovation on new platforms Platform As A Service Infrastructure Automated Life Cycle Management level automation Front End, Search Back End, Generic Platform Infrastructure As A Service Automated Operations Virtualized & Common Infrastructure 15  
  • 16. Model  Driven  Deployment  Automa=on  LB Pool LB Pool •  Desired configuration is specified in the expected state and persisted in CMS Server Server Server Server Server Server •  Upon approval, the orchestration will configure the site to reflect the desired configuration. Reconciliation Expected Current State State •  Updated site configuration is discovered based on Comparison detection of configuration events •  Reconciliation between the Orchestration Discovery expected and current state allows to verify the proper configuration. •  On going validation allows the detection of out of band changes. Site 16  
  • 17. Open  Source  Integra=on   IaaS/PaaS API IaaS/PaaS API Distribute Distribute orchestrat Resource orchestrat Resource d d ion Allocation ion Allocation State State Applicatio Access Applicatio Access AuthN/ AuthN/ n Point n Point AuthZ AuthZ Controller Controller Controller Controller Compute Cluster Pool Compute Cluster Pool Controller Controller Controller Controller Controller Controller Compute Mgt. DNS Mgt. LB Mgt. Monitor ing Open  Source   Network Image/Pkg Software SoluQon   Prov Repo Dist. (openstack  /  Cloudstack)  
  • 18. Applica=on  Architecture   Before Ongoing Future “Cloud ‘Cloud Friendly” ready’
  • 19. Next  Gen  Service  Orienta=on  19  
  • 20. Services  @  eBay  •  It’s a journey !•  History •  One of the first to expose APIs /Services •  In early 2007, embarked on service orienting our entire ecommerce platform, whether the functionality is internal or external •  Support REST style as well as SOA style •  Have close to 300 services now and more on the way •  Early adopter of SOA governance automation (Discovery vs. control) 20
  • 21. Architecture  Vision   Customer Experience Core  Experience   Custom  Experiences   Channels   Application Platform Services Login   Iden=ty   Catalog   Search   List   Pricing   Offer   ADs  Messages   Cart   Coupons  Payment  Shipping  CS   Technology Platform App   Data  Access   Dev  Tools   Presenta Messaging   SOA   Cloud   Stack   Layer   =on   Operations Infrastructure Layer Power   Data  Center   Hardware   Network   Database   Tools   Opera=ons  
  • 22. Challenges   •  MulQple  data  formats   •  Latency   •  Service  consumer  producQvity  22  
  • 23. Challenge  1:  Mul=ple  Data  Formats   •  Mix  of  user  preferences   –  SOAP   –  XML  /  HTTP   –  JSON   –  Name-­‐Value  Pair  (NV)   •  Service  developers  don’t  want  to  write  extra  code  to  do   conversions;  too  much  maintenance  impact   •  Key  observaQons:   –  Users  ask  for  whatever  data  format  they  want.   –  Anything  you  can  express  in  XML,  you  can  express  in  other   formats   –  Complete  mapping  from  XML  structures  to  NV  and  JSON  23  
  • 24. Solu=on:  Pluggable  Data  Formats  Using  JAXB   Uniform interface XML Pluggable formats A single Instance ofJSON XML Directly Service Impl Ser/Deser module NV deserialized into pipeline JAXB Passed to JSON Java objects others NV SOA frameworkOtherformats No intermediate format, 24   Avoids extra conversion
  • 25. Challenge  2:  Latency   •  For  large  datasets,  there  can  be  nasty  latencies.   – Not  fixed  by  compressing  or  using  Fast  Infoset   2MB structured response payload 350 300 250 200 150 100 50 0 Wire Time (msec)25  
  • 26. Solu=on:  Binary  Formats  •  Evaluated  binary  formats:   •  Google  Protocol  Buffers,  Avro,  ThriY  •  Numbers  look  promising  (serializaQon,  deserializaQon)  •  New  challenges  with  these:   •  Each  has  its  own  schema  (type  definiQon  language)  to   model  types  and  messages   •  Each  has  its  own  code  genera=on  for  language  bindings   •  NOT  directly  compaQble  with  JAXB  beans   •  eBay  SOA  plaEorm  uses  WSDL/XML  Schema  (XSD)  data   modeling,  and  JAXB  language  bindings  26  
  • 27. Compare  Popular  Binary  Formats  Protobuf Avro Thrift•  Own IDL/schema •  JSON based Schema •  Own IDL/schema•  Sequence numbers for each •  Schema prepended to the message •  Sequence numbers for each element on the wire element•  Compact binary representation on •  Compact binary representation on •  Compact binary representation on the wire the wire the wire•  Most XML schema elements are •  Most XML schema elements are •  Most XML schema elements are mappable to equivalents, except mappable to equivalent, except mappable to equivalents, except polymorphic constructs polymorphic constructs polymorphic constructs•  Versioning is similar to XML, a bit •  Versioning is easier •  Versioning is similar to XML, a bit more complex in implementing due more complex in implementing due to sequence numbers to sequence numbers Inheritance Self-­‐ / Complex   Unions   References   Polymorph Types   (Choice  Type)  (Trees)   Enums   ism   Inline  A`achment  Protobuf   Yes   No   Yes   Yes   No   No   Yes  (with  Avro   Yes   Yes   workaround)   Yes   No   No  ThriY   Yes   No   No   No   No   No  XML   27   Yes   Yes   Yes   Yes   Yes   Yes  (MIME-­‐TYPE)  
  • 28. Comparison  of  Data  Formats   Response data: 50 items x 75 fields (about 8000 objects)200180160140120100 Size (KB) 80 Wire time (msec) 60 40 20 0 JSON XML Fast Infoset Protobuf28  
  • 29. Latency  Improvements   200 180 160 140 120 100 80 Wire Time(msec) 60 40 20 0 XML XML no XML flat PB no PB flat poly poly29  
  • 30. Challenge  3:    Service  Consumer  Produc=vity   •  Large,  complex  requests  and  responses   •  Get  exactly  what  they  want  in  data  returned  from  services   •  Lack  of  consistency  in  service  interface  convenQons  and  data   access  pajerns   •  Real  client  applicaQons  make  calls  to  mulQple  services  at  a   Qme   –  Serial  calls  increase  latency.  Managing  parallel  calls  is  complex   •  Impedance  mismatch  between  service  interface  and  client   needs   –  Too  much  data  is  returned   –  1  +  n  calls  to  get  detailed  data  30  
  • 31. Sneak  Preview:     •  New  technology  from  eBay   •  Plan  to  open  source  soon   •  SQL  +  JSON  based  scripQng  language  for  aggregaQon   and  orchestraQon  of  service  calls   •  Filtering  and  projecQons  of  responses   •  Async  orchestraQon  engine   –  AutomaQc  parallelizaQon,  fork  /  join  31  
  • 32. What  ql.io  Enables   •  Create  consumer-­‐controlled  interfaces     -  fix/patch  APIs  on  the  fly   •  Filter  and  project  responses     -  use  a  declaraQve  language   •  Bring  in  consistency     -  offer  RESTful  shims  with  simpler  syntax   •  Aggregate  mul=ple  APIs   -  such  as  batching   •  Orchestrate  requests   -  without  worrying  about  async  forks  and  joins  32  
  • 33. ql.io  Examples   •  Simple  Select   –  select  *  from  ebay.finding.items  where  keywords=‘ipad’   •  Field  ProjecQons   –  select  Qtle,  itemId  from  ebay.finding.items  where   keywords=‘ipad’   •  Sub-­‐Select   –  select  e.Title,  e.ItemID  from  ebay.item.details  as  e  where   e.itemId  in  (select  itemId  from  ebay.finding.items  where   keywords  =  ‘ipad’)  33  
  • 34. ql.io  Batch  Example   itemId  =  select  itemId  from  ebay.finding.items  where  keywords  =  ferrari  limit  1;   item  =  select  *  from  ebay.shopping.singleitem  where  itemId  =  {itemId};   user  =  select  *  from  ebay.shopping.userprofile  where  userId  =  sallamar;   tradingItem  =  select  *  from  ebay.trading.geQtem  where  itemId  =  {itemId};   bestOffers  =  select  *  from  ebay.trading.bestoffers  where  itemId  =  {itemId};   bidders  =  select  *  from  ebay.trading.getallbidders  where  itemId  =  {itemId};   return  {          "user"  :  "{user}",        "item"  :  "{item}”,        "tradingItem"  :  "{tradingItem}",        "bidders"  :  "{bidders}",        "bestOffers"  :  "{bestOffers}"   };  34  
  • 35. ql.io  Demo  35  
  • 36. Modularity  36  
  • 37. Key  modularity  concepts  for  soYware  •  Building  blocks  •  Re-­‐use  •  Granularity  •  Dependencies  •  EncapsulaQon  •  ComposiQon  •  Versioning   Source: http://techdistrict.kirkk.com/2010/04/22/granularity-architectures-nemesis/ Author: Kirk Knoernschild37
  • 38. Challenges  for  Large  Enterprises  •  Some  stats  on  the  eBay  code  base   –  ~  44  million  of  lines  of  code  and  growing   –  Hundreds  of  thousands  of  classes   –  Tens  of  thousands  of  packages   –  ~  4,000+  jars  •  We  have  too  many  dependencies  and  Qght  coupling   in  our  code   –  Everyone  sees  everyone  else   –  Everyone  affects  everyone  else    38
  • 39. Challenges  for  Large  Enterprises   •  Developer  producQvity/agility  suffers  as  the  knowledge  goes  down   –  Changes  ripple  throughout  the  system   –  Fallouts  from  changes/features  are  difficult  to  resolve   –  Developers  slow  down  and  become  risk  averse   knowledge complexity code size39  
  • 40. Our  Goals  with  Modularity  Efforts  •  Tame  complexity    •  Organize  our  code  base  in  loose  coupling  fashion   – Coarse-­‐grained  modules:  number  majers!   – DeclaraQve  coupling  contract   – Ability  to  hide  internals  •  Establish  clear  code  ownership,  boundaries  and   dependencies  •  Allow  different  components  (and  teams)  evolve  at   different  speeds  •  Increase  development  agility  40
  • 41. Modularity  Solu=ons  Evalua=on   •  Evaluated  OSGi,  Maven,  Jigsaw  and  JBoss  Module   •  Criteria  include:   –  Modularity  enforcement   –  End-­‐to-­‐end  development   –  MigraQon  concerns   –  AdopQon   –  Maturity   •  Selected  OSGi  41  
  • 42. OSGi  @  eBay   •  Modularize  plaEorm  into  OSGi  bundles  with  well-­‐defined  imports  and   exports   •  Challenges:  split  packages,  Classloader  contructs   •  Source  to  binary  dependencies   •  Refresh  end-­‐to-­‐end  development  life  cycle   pull/push SCM pull Command line IDE build (CI) consume publish/consume Deployment Server runtime Repository packaging deploy42  
  • 43. Lessons  Learned  •  OSGi  learning  curve  is  sQll  fairly  steep   –  large  group  of  developers  with  varying  skill  levels  •  End-­‐to-­‐end  development  lifecycle   –  Tools  may  not  work  well  together.  Leverage  OSGi  tools  like  bnd  •  Conversion/migraQon  of  exisQng  code  base   –  Not  starQng  from  vacuum   –  Cost  to  rewrite  /  refactor  code   –  We  cannot  afford  disrupQon  to  business  meanwhile:  “change  parts   while  the  car  is  running”  •  SemanQc  versioning  adopQon  is  important  43  
  • 44. Overall  Summary   •  Strategies   –  Deployment  Agility:  AutomaQon  with  Cloud   –  Development  Agility:  Next  gen  Service  OrientaQon   –  Taming  complexity:  Modularity   •  Systems  quality  &  scalable  architecture  as  key   foundaQon   •  Complexity  management  and  developer  producQvity   becomes  increasingly  important   •  Strike  balance  between  agility  and  stability  44  
  • 45. eBay Open Source •  eBay has been a strong supporter of Open Source model and community •  Check out http://eBayOpenSource.org •  Mission is to open source some of the best of breed technologies that were developed originally within eBay Inc. •  Under a liberal open source license. •  These projects are generic technology projects and several years of development effort has gone into them to mature them. (Coming soon)45

×