Wix Architecture at Scale - QCon London 2014
Upcoming SlideShare
Loading in...5
×
 

Like this? Share it with your network

Share

Wix Architecture at Scale - QCon London 2014

on

  • 24,302 views

In this talk I will go over Wix's architecture, how we evolved our system to be highly available even at the worst case scenarios when everything can break, how we built a self-healing eventual ...

In this talk I will go over Wix's architecture, how we evolved our system to be highly available even at the worst case scenarios when everything can break, how we built a self-healing eventual consistency system for website data distribution and will show some of the patterns we use that helps us render 45M websites while maintaining a relatively low number of servers.

Statistics

Views

Total Views
24,302
Views on SlideShare
24,264
Embed Views
38

Actions

Likes
2
Downloads
23
Comments
0

4 Embeds 38

http://lanyrd.com 31
http://www.feedspot.com 5
https://twitter.com 1
http://feedly.com 1

Accessibility

Categories

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment
  • How many built a website?
  • Editor – Read immediately after write – Small working setViewer optimize for readsWe fight for every ms. Page view = many resource downloading
  • Read-only services only if it is part of the same business functionality
  • Immutable data helps handle eventual consistencyMySql is a great key-value storeNot all data is equal (only 6% of websites are edited 3 months after creation)
  • Revision keep data safe from poisoning Pay in storage and management
  • Highly optimized code – every ms count
  • Tech vendor lock is a myth,  easy to change the api (small dev effort).  Invest in data distribution.Evaluation of new platform starts by putting the data.
  • We can change the arrows as we want
  • Save pages on JSONUpload to static storage
  • Explain what is JSON and what is HTML
  • UPS dies, secondary power source connected to the same UPS
  • Due to error or bad configuration

Wix Architecture at Scale - QCon London 2014 Presentation Transcript

  • 1. Wix Architecture at Scale Aviran Mordo Head of Back-End Engineering @ Wix @aviranm linkedin.com/in/aviran aviransplace.com
  • 2. Wix in Numbers Over 45,000,000 users 1M new users/month Static storage is >800TB of data 1.5TB new files/day 3 data centers + 2 clouds (Google, Amazon) 300 servers 700M HTTP requests/day 600 people work at Wix, of which ~ 200 in R&D
  • 3. Initial Architecture Tomcat, Hibernate, custom web framework Built for fast development Stateful login (Tomcat session), Ehcache, file uploads No consideration for performance, scalability and testing Intended for short-term use Lighttpd (file serving) Wix (Tomcat) MySQL DB
  • 4. The Monolithic Giant One monolithic server that handled everything Dependency between features Changes in unrelated areas of the system caused deployment of the whole system Failure in unrelated areas will cause system wide downtime
  • 5. Breaking the System Apart
  • 6. Concerns and SLA Edit websites Data Validation Security / Authentication Data consistency Lots of data View sites, created by Wix editor High availability Serving Media High availability High performance High performance High traffic volume Lots of static files Long tail Very high traffic volume Viewport optimization Cacheable data
  • 7. Wix Segmentation Networking 1. Editor Segment 2. Media Segment 3. Public Segment
  • 8. Making SOA Guidelines Each service has its own database (if one is needed) Only one service can write to a specific DB There may be additional read-only services that directly accesses the DB (for performance reasons) Services are stateless No DB transactions Cache is not a building block, but an optimization
  • 9. 1. Editor Segment
  • 10. Editor Server Immutable JSON pages (~2.5M / day) Site revisions Active – standby MySQL cross datacenters Editor Server MySQL Active Sites MySQL Archive
  • 11. Protect The Data Protect against DB outage with fast recovery = replication Protect against data poisoning/corruption = revisions / backup Make the data available at all times = data distribution to multiple locations / providers
  • 12. Saving Editor Data Browser Save Page(s) 200 OK Editor Server Upload Static Grid Notify Save Page MySQL Active Sites MySQL Archive DC replication MySQL Active Sites MySQL Archive Download Page Notify Archive (Google) Google Cloud Storage Archive (Amazon)
  • 13. Self Healing Process Browser Save Page(s) 200 OK Editor Server Upload Static Grid Notify Save Page MySQL Active Sites MySQL Archive DC replication MySQL Active Sites MySQL Archive Download Page Notify Archive (Google) Google Cloud Storage Archive (Amazon)
  • 14. No DB Transactions Save each page (JSON) as an atomic operation Page ID is a content based hash (immutable/idempotent) Finalize transaction by sending site header (list of pages) Can generate orphaned pages, not a problem in practice
  • 15. 2. Media Segment
  • 16. Prospero – Wix Media Storage 800TB user media files 3M files uploaded daily 500M metadata records Dynamic media processing • Picture resize, crop and sharpen “on the fly” • Watermark • Audio format conversion
  • 17. Prospero Eventual consistent distributed file system Multi datacenter aware Automatic fallback cross DC Run on commodity servers & cloud
  • 18. Prospero – Wix Media Manager Tampa Google Cloud x36 x36 T x32 T Second fallback First fallback Austin CDN get image.jpg If not in CDN x36 x36 T x32 T
  • 19. 3. Public Segment
  • 20. Public Segment Roles Routing (resolve URLs) www.example.com HTML Renderer Dispatching (to a renderer) HTML SEO Renderer Flash SEO Renderer Flash Renderer Sitemap Renderer Robots.txt Renderer Rendering (HTML,XML,TXT) Public Server
  • 21. Public SLA Response time <100ms at peak traffic
  • 22. Publish A Site Publish site header (a map of pages for a site) Publish routing table Publish site header / routes Editor Segment Public Segment
  • 23. Built For Speed Minimize out-of-service hops (2 DB, 1 RPC) Lookup tables are cached in memory, updated every 5 minutes Denormalized data – optimize for read by primary key (MySQL) Minimize business logic
  • 24. How a Page Gets Rendered Bootstrap HTML template that contains only data Only JavaScript imports JSON data (site-header + dynamic data) No “real” HTML view
  • 25. Offload rendering work to the browser
  • 26. The average Intel Core i750 can push up to 7 GFLOPS without overclocking
  • 27. Why JSON? Easy to parse in JavaScript and Java/Scala Fairly compact text format Highly compressible (5:1 even for small payloads) Easy to fix rendering bugs (just deploy a new client code)
  • 28. Minimum Number of Public Servers Needed to Serve 45M Sites 4
  • 29. Public SLA Be Available 99.99999%
  • 30. Serving a Site – Sunny Day Browser http://example.wix.com HTTP Request HTML Resources / Media CDN Notify site view HTTP Request LB Store HTML to cache Public Rendere r Archiv e Statics
  • 31. Serving a Site – DC Lost Browser http://example.wix.com CDN Statics HTTP Request LB Archiv e LB Public Public Rendere r Rendere r Change DNS
  • 32. Serving a Site – Public Lost Browser http://example.wix.com HTTP Request CDN HTML Get Cached HTML Version LB Public Rendere r Archiv e Statics
  • 33. Living in the Browser Browser http://example.wix.com HTTP Request Fallback Editor LB Public Rendere r HTML JSON / Media CDN Fallback Archiv e Statics
  • 34. Summary Identify your critical path and concerns Build redundancy in critical path (for availability) De-normalize data (for performance) Minimize out-of-process hops (for performance) Take advantage of client’s CPU power
  • 35. Q&A http://goo.gl/Oo3lGr Aviran Mordo Head of Back-End Engineering @ Wix @aviranm linkedin.com/in/aviran aviransplace.com