Inside
Architecture
Lanyrd's
Andrew Godwin
Web Engineer, Lanyrd
@andrewgodwin
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
http://www.infoq.com/presentations
/lanyrd-architecture
Presented at QCon London
www.qconlondon.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
WHO AM I?
Andrew Godwin
Web developer
Systems administrator
Technical architect
Django core developer
LANYRD: THE EARLY YEARS
The Origin Story
LANYRD: THE EARLY YEARS
2010 2011 2012 2013
June 2010
LANYRD: THE EARLY YEARS
2010 2011 2012 2013
August 2010
Good music on, an orange juice and some
CSS fun in front of me, we have an apartment
in Casablanca! (for a week or two anyway :)
” ”
@natbat
7:19 pm, 18 August 2010
LANYRD: THE EARLY YEARS
2010 2011 2012 2013
August 2010
We launched lanyrd.com/ ! Go easy on it,
the log files are going a bit nuts,
who knew Twitter was viral?
” ”
@simonw
10:52 am, 31 August 2010
LANYRD: THE EARLY YEARS
2010 2011 2012 2013
August 2010
Right... this clearly isn't sustainable. Going to
have to switch the site in to read only mode
for a few hours, sorry everyone!
” ”
@simonw
11:35 am, 31 August 2010
LANYRD: THE EARLY YEARS
2010 2011 2012 2013
January 2011
Natalie and Simon start three months of
YCombinator, in California.
LANYRD: THE EARLY YEARS
2010 2011 2012 2013
September 2011
Lanyrd closes a $1.4 million seed funding
round, moves back to London.
LANYRD TODAY
2010 2011 2012 2013
March 2013
∙ Conferences
∙ Profile pages
∙ Emails
∙ Coverage
∙ Topics
∙ Guides
∙ Mobile app∙ Dashboard
LANYRD TODAY
2010 2011 2012 2013
March 2013
LANYRD TODAY
2010 2011 2012 2013
March 2013
LANYRD TODAY
2010 2011 2012 2013
March 2013
LANYRD TODAY
2010 2011 2012 2013
March 2013
LANYRD TODAY
2010 2011 2012 2013
March 2013
LANYRD TODAY
2010 2011 2012 2013
March 2013
Key dynamic parts:
Users tracking/attending events
Users tracking each other
Users tracking topics and guides
THE STACK TODAY
What we run on
THE STACK TODAY
Browser
Nginx
HAProxy
Varnish
Gunicorn
Main site runtime
Amazon S3
Celery
Task workers
Redis
PostgreSQL Solr
SSL Termination
Web Cache
Load balancer
Static files & uploads
Tasks, Set calcs
Search and facetingMain data store
Memcached
Fragment caching
THE STACK TODAY
Lanyrd is almost entirely Django (Python)
Background tasks use Celery, a Django task queue
Management tasks/cron jobs also run inside the framework
The Django application is served by Gunicorn containers
THE STACK TODAY
Main data store for everything except uploads
We run a master and a replicated slave
Around 80GB of data in five databases
Each server runs on a RAID 1 disk array
PostgreSQL
THE STACK TODAY
Task queue transport for Celery and tweet listeners
Contains user sets for every conference, user and topic
Used for efficient narrowing of queries before Solr is hit
Redis
THE STACK TODAY
Stores conferences, users, sessions and more
Very rich metadata on each item
Heavy use of sharding thoroughout the site
Solr
We run a master and a replicated slave
THE STACK TODAY
First point of call for all requests
Caches most anonymous requests
Enforces read-only mode if enabled
Varnish
One used and one hot spare at all times
THE STACK TODAY
Sits behind Varnish
Distributes load amongst frontend servers
Re-routes requests during deploys
HAProxy
Two in use at all times, identically configured
THE STACK TODAY
Stores all uploaded files from users
Upload forms post directly to S3
Serves all static assets for the site (images, CSS, JS)
S3
Static assets are versioned with hash to help cache break
THE STACK TODAY
Browser
Nginx
HAProxy
Varnish
Gunicorn
Main site runtime
Amazon S3
Celery
Task workers
Redis
PostgreSQL Solr
SSL Termination
Web Cache
Load balancer
Static files & uploads
Tasks, Set calcs
Search and facetingMain data store
Memcached
Fragment caching
THE STACK BEFORE
What we've eliminated
THE STACK BEFORE
Stored analytics, logs and some other data
Lack of schema meant some bad data persisted
Poor complex query performance
MongoDB
Useful for quick prototyping
THE STACK BEFORE
Primary data store for things not in MongoDB
Very poor complex query performance
No advanced field types
MySQL
Full database locks during schema changes
A TALE OF TWO DBS
The Great Move of 2012
A TALE OF TWO DBS
Amazon EC2
MySQL
Softlayer
PostgreSQL
A TALE OF TWO DBS
Why?
Predictable loading means EC2 unnecessary
Better I/O throughput
Both moves required database downtime
A TALE OF TWO DBS
How?
Replicate Solr and Redis across to new servers
Enter read-only mode
Dump MySQL data
Convert MySQL dump into PostgreSQL dump
Load PostgreSQL dump
Re-point DNS, proxy requests from old servers
Exit read-only mode
A TALE OF TWO DBS
Time in read-only mode: 1 ½ hours
Downtime: 0 hours
CONTENT IS KING
The Advantages of Content
CONTENT IS KING
Read-only mode is entirely viable
An hour or two at most
Everyone logged out
Varnish blocks POSTs, caches everything aggressively
CONTENT IS KING
Indexing delay is acceptable
Most site views are driven by Solr
1 or 2 minute indexing delay
Some views add in recent changes directly
FEATURE FLAGS
Always be deploying
FEATURE FLAGS
Continuous Deployment
We deploy at least 5 times a day, if not 20
Nearly all code goes into master or short-lived branches
Anything unreleased is feature flagged
FEATURE FLAGS
Feature flags
Simple named boolean toggles
Settable by user, user tag, or conference
Can change templates, view code, URLs, etc.
FEATURE FLAGS
Flag management
User tag management
WHO WROTE THAT? OH, ME
Legacy code & decisions
WHO WROTE THAT? OH, ME
Technical Debt
It's fine to have some - it can speed things up
A good chunk of ours is gone, some remains
Big schema changes get harder and harder
SMALL AND NIMBLE
The power of small teams
SMALL AND NIMBLE
Six people
SMALL AND NIMBLE
Six people
2.5
Back-end
developers
1.75
Front-end
developers
1.5
Designers
0.75
System
administrators
0.75
Business
operations
0.5
Mobile
developers
SMALL AND NIMBLE
Awareness
Everyone knows everything that's happening
Daily stand-ups
Weekly show-and-tell sessions
SMALL AND NIMBLE
Always deployable
Master branch always shippable
Large development behind feature flags
Code review for nastier changes
LESSONS LEARNED
What's important here?
LESSONS LEARNED
Small and nimble
Continuous deployment and development style allows
easy project changing
No long approval processes
Less than ½ hour from report to shipped fix
LESSONS LEARNED
Content is great
Read-only mode allows less painful downtimes
Heavy caching smooths out our load
Learnable load patterns
LESSONS LEARNED
Fix it while you can
The bigger you get, the harder a fix
We moved to PostgreSQL just in time
Big schema changes now take days of coding
LESSONS LEARNED
Six amazing people
You don't need a big team to write a complex product
Communication is absolutely key
Using Open Source well is also crucial
Thank you.
Andrew Godwin
Sponsor or promote your company using events?
Get in touch:
@andrewgodwin
http://aeracode.org
info@lanyrd.com

Inside Lanyrd's Architecture

  • 1.
  • 2.
    InfoQ.com: News &Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations /lanyrd-architecture
  • 3.
    Presented at QConLondon www.qconlondon.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  • 4.
    WHO AM I? AndrewGodwin Web developer Systems administrator Technical architect Django core developer
  • 6.
    LANYRD: THE EARLYYEARS The Origin Story
  • 7.
    LANYRD: THE EARLYYEARS 2010 2011 2012 2013 June 2010
  • 8.
    LANYRD: THE EARLYYEARS 2010 2011 2012 2013 August 2010 Good music on, an orange juice and some CSS fun in front of me, we have an apartment in Casablanca! (for a week or two anyway :) ” ” @natbat 7:19 pm, 18 August 2010
  • 9.
    LANYRD: THE EARLYYEARS 2010 2011 2012 2013 August 2010 We launched lanyrd.com/ ! Go easy on it, the log files are going a bit nuts, who knew Twitter was viral? ” ” @simonw 10:52 am, 31 August 2010
  • 10.
    LANYRD: THE EARLYYEARS 2010 2011 2012 2013 August 2010 Right... this clearly isn't sustainable. Going to have to switch the site in to read only mode for a few hours, sorry everyone! ” ” @simonw 11:35 am, 31 August 2010
  • 11.
    LANYRD: THE EARLYYEARS 2010 2011 2012 2013 January 2011 Natalie and Simon start three months of YCombinator, in California.
  • 12.
    LANYRD: THE EARLYYEARS 2010 2011 2012 2013 September 2011 Lanyrd closes a $1.4 million seed funding round, moves back to London.
  • 13.
    LANYRD TODAY 2010 20112012 2013 March 2013 ∙ Conferences ∙ Profile pages ∙ Emails ∙ Coverage ∙ Topics ∙ Guides ∙ Mobile app∙ Dashboard
  • 14.
    LANYRD TODAY 2010 20112012 2013 March 2013
  • 15.
    LANYRD TODAY 2010 20112012 2013 March 2013
  • 16.
    LANYRD TODAY 2010 20112012 2013 March 2013
  • 17.
    LANYRD TODAY 2010 20112012 2013 March 2013
  • 18.
    LANYRD TODAY 2010 20112012 2013 March 2013
  • 19.
    LANYRD TODAY 2010 20112012 2013 March 2013 Key dynamic parts: Users tracking/attending events Users tracking each other Users tracking topics and guides
  • 20.
  • 21.
    THE STACK TODAY Browser Nginx HAProxy Varnish Gunicorn Mainsite runtime Amazon S3 Celery Task workers Redis PostgreSQL Solr SSL Termination Web Cache Load balancer Static files & uploads Tasks, Set calcs Search and facetingMain data store Memcached Fragment caching
  • 22.
    THE STACK TODAY Lanyrdis almost entirely Django (Python) Background tasks use Celery, a Django task queue Management tasks/cron jobs also run inside the framework The Django application is served by Gunicorn containers
  • 23.
    THE STACK TODAY Maindata store for everything except uploads We run a master and a replicated slave Around 80GB of data in five databases Each server runs on a RAID 1 disk array PostgreSQL
  • 24.
    THE STACK TODAY Taskqueue transport for Celery and tweet listeners Contains user sets for every conference, user and topic Used for efficient narrowing of queries before Solr is hit Redis
  • 25.
    THE STACK TODAY Storesconferences, users, sessions and more Very rich metadata on each item Heavy use of sharding thoroughout the site Solr We run a master and a replicated slave
  • 26.
    THE STACK TODAY Firstpoint of call for all requests Caches most anonymous requests Enforces read-only mode if enabled Varnish One used and one hot spare at all times
  • 27.
    THE STACK TODAY Sitsbehind Varnish Distributes load amongst frontend servers Re-routes requests during deploys HAProxy Two in use at all times, identically configured
  • 28.
    THE STACK TODAY Storesall uploaded files from users Upload forms post directly to S3 Serves all static assets for the site (images, CSS, JS) S3 Static assets are versioned with hash to help cache break
  • 29.
    THE STACK TODAY Browser Nginx HAProxy Varnish Gunicorn Mainsite runtime Amazon S3 Celery Task workers Redis PostgreSQL Solr SSL Termination Web Cache Load balancer Static files & uploads Tasks, Set calcs Search and facetingMain data store Memcached Fragment caching
  • 30.
    THE STACK BEFORE Whatwe've eliminated
  • 31.
    THE STACK BEFORE Storedanalytics, logs and some other data Lack of schema meant some bad data persisted Poor complex query performance MongoDB Useful for quick prototyping
  • 32.
    THE STACK BEFORE Primarydata store for things not in MongoDB Very poor complex query performance No advanced field types MySQL Full database locks during schema changes
  • 33.
    A TALE OFTWO DBS The Great Move of 2012
  • 34.
    A TALE OFTWO DBS Amazon EC2 MySQL Softlayer PostgreSQL
  • 35.
    A TALE OFTWO DBS Why? Predictable loading means EC2 unnecessary Better I/O throughput Both moves required database downtime
  • 36.
    A TALE OFTWO DBS How? Replicate Solr and Redis across to new servers Enter read-only mode Dump MySQL data Convert MySQL dump into PostgreSQL dump Load PostgreSQL dump Re-point DNS, proxy requests from old servers Exit read-only mode
  • 37.
    A TALE OFTWO DBS Time in read-only mode: 1 ½ hours Downtime: 0 hours
  • 38.
    CONTENT IS KING TheAdvantages of Content
  • 39.
    CONTENT IS KING Read-onlymode is entirely viable An hour or two at most Everyone logged out Varnish blocks POSTs, caches everything aggressively
  • 40.
    CONTENT IS KING Indexingdelay is acceptable Most site views are driven by Solr 1 or 2 minute indexing delay Some views add in recent changes directly
  • 41.
  • 42.
    FEATURE FLAGS Continuous Deployment Wedeploy at least 5 times a day, if not 20 Nearly all code goes into master or short-lived branches Anything unreleased is feature flagged
  • 43.
    FEATURE FLAGS Feature flags Simplenamed boolean toggles Settable by user, user tag, or conference Can change templates, view code, URLs, etc.
  • 44.
  • 45.
    WHO WROTE THAT?OH, ME Legacy code & decisions
  • 46.
    WHO WROTE THAT?OH, ME Technical Debt It's fine to have some - it can speed things up A good chunk of ours is gone, some remains Big schema changes get harder and harder
  • 47.
    SMALL AND NIMBLE Thepower of small teams
  • 48.
  • 49.
    SMALL AND NIMBLE Sixpeople 2.5 Back-end developers 1.75 Front-end developers 1.5 Designers 0.75 System administrators 0.75 Business operations 0.5 Mobile developers
  • 50.
    SMALL AND NIMBLE Awareness Everyoneknows everything that's happening Daily stand-ups Weekly show-and-tell sessions
  • 51.
    SMALL AND NIMBLE Alwaysdeployable Master branch always shippable Large development behind feature flags Code review for nastier changes
  • 52.
  • 53.
    LESSONS LEARNED Small andnimble Continuous deployment and development style allows easy project changing No long approval processes Less than ½ hour from report to shipped fix
  • 54.
    LESSONS LEARNED Content isgreat Read-only mode allows less painful downtimes Heavy caching smooths out our load Learnable load patterns
  • 55.
    LESSONS LEARNED Fix itwhile you can The bigger you get, the harder a fix We moved to PostgreSQL just in time Big schema changes now take days of coding
  • 56.
    LESSONS LEARNED Six amazingpeople You don't need a big team to write a complex product Communication is absolutely key Using Open Source well is also crucial
  • 57.
    Thank you. Andrew Godwin Sponsoror promote your company using events? Get in touch: @andrewgodwin http://aeracode.org info@lanyrd.com