Scaling Open edX with Kubernetes
DevOpsDays Boston
9.15.2015
Who we are
Nate Aune
Morgan Robertson
What we’ll cover
● Background -- Open edX
● Introducing Kubernetes
● Kubernetes concepts
● Scaling + resiliency
● Open edX on Kubernetes
Open edX background
● edX: non-profit founded by MIT and Harvard
● 500+ courses, 5M students learning on edX.org
● edX released Open edX in June 2013
● Stanford, MongoDB, Salesforce, Google, Microsoft,
McKinsey, Johnson & Johnson, Smithsonian
Open edX - a catalyst for innovation
212 Contributors
One of the fastest-growing open source projects on GitHub
Technical components
LMS/CMS (Django/Python)
Forum (Sinatra/Ruby)
User DB (MySQL)
Course DB (Mongo)
Tasks (Celery/RabbitMQ)
Caching (Memcached)
Proxy (Nginx)
Search (Elasticsearch)
MapReduce (Hadoop)
Hosting infrastructure
S3 for serving:
● static assets
● grade downloads
● certificate downloads
● videos (for mobile)
● Load balancer
● Application server(s)
● Database server(s)
● Search server
● Utility server (tasks)
● Caching server
● Hadoop cluster
Typical scalable deployment of Open edX on AWS
Introducing Kubernetes
● Scheduling + orchestration layer for containerized applications
● Abstracts your infrastructure
● Open source project by Google
● Production-ready as of July 2015 (version 1.0)
Kubernetes architecture
Kubernetes vs. the Docker triad
Kubernetes Swarm Compose Machine
Scheduling ✔ ✔
Service discovery ✔ ✅
Container scaling ✔ ✔
Machine provisioning ✅ ✔
Health checking ✔
Secret management ✔
Production-ready ✔
Kubernetes core concepts
● Pods
● Services
● Replication controllers
Pods
● Group of containers + volumes scheduled together
● Smallest deployable unit
● Containers share certain resources, including the network stack
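To make the pod concept concrete, here is a minimal sketch that builds a v1 Pod manifest as a Python dictionary and serializes it to JSON. The pod name, labels, container names, images, ports, and mount path are all hypothetical and are not taken from the deployment described in this talk.

```python
import json

# Hypothetical two-container pod: an application container and an Nginx proxy
# that mount the same emptyDir volume. All names and images are illustrative.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "lms", "labels": {"app": "lms"}},
    "spec": {
        "volumes": [{"name": "static", "emptyDir": {}}],
        "containers": [
            {
                "name": "app",
                "image": "example/edxapp",
                "ports": [{"containerPort": 8000}],
                "volumeMounts": [{"name": "static", "mountPath": "/static"}],
            },
            {
                "name": "proxy",
                "image": "nginx",
                "ports": [{"containerPort": 80}],
                "volumeMounts": [{"name": "static", "mountPath": "/static"}],
            },
        ],
    },
}

# Serialize the manifest so it can be saved to a file and submitted to a cluster.
manifest = json.dumps(pod, indent=2)
print(manifest)
```

Because both containers are in the same pod, they share one network namespace, so in this sketch the proxy could reach the app at localhost:8000.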
Services
● Endpoint for a set of pods
● IP address, port, and label selectors
● Use round-robin routing to direct traffic to backend pods
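The two ideas on this slide, label selectors and round-robin routing, can be sketched in a few lines of Python. This is a toy simulation, not how Kubernetes implements services; the pod names and labels are hypothetical.

```python
from itertools import cycle


def matches(selector, labels):
    """True when every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def backends(selector, pods):
    """Return the names of the pods whose labels satisfy the selector."""
    return [pod["name"] for pod in pods if matches(selector, pod["labels"])]


# Hypothetical pods; only the two labelled app=lms should receive traffic.
pods = [
    {"name": "lms-1", "labels": {"app": "lms"}},
    {"name": "lms-2", "labels": {"app": "lms"}},
    {"name": "forum-1", "labels": {"app": "forum"}},
]

targets = backends({"app": "lms"}, pods)

# Round-robin: hand out the matching backends in turn, wrapping around.
rotation = cycle(targets)
routed = [next(rotation) for _ in range(4)]
print(routed)  # ['lms-1', 'lms-2', 'lms-1', 'lms-2']
```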
Services + Pods
Replication Controllers
● Manage pod lifecycles to maintain a specified number of replicas
● Provide scaling + fault tolerance
● Use label selectors
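The core behavior of a replication controller is a reconciliation loop: compare the desired replica count with the pods that are actually running, then create or delete pods to close the gap. The function below is a simplified sketch of one reconciliation step, with hypothetical pod names; the real controller also handles label matching and pod templates.

```python
def reconcile(desired, running):
    """Return (pods_to_create, pods_to_delete) for one reconciliation step.

    desired: target replica count.
    running: list of names of currently running pods.
    """
    if len(running) < desired:
        # Too few pods: report how many new ones are needed.
        return desired - len(running), []
    # Enough or too many pods: delete the surplus (none if counts match).
    return 0, running[desired:]


print(reconcile(3, ["lms-1"]))                    # (2, [])
print(reconcile(1, ["lms-1", "lms-2", "lms-3"]))  # (0, ['lms-2', 'lms-3'])
```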
Pods + Services + Replication Controllers
Scaling with Kubernetes
● Replication controllers scale pods
● Services provide a single endpoint for a group of pods
● The Kubernetes master schedules pods across nodes
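As a rough illustration of spreading pods across nodes, the sketch below places each pod on whichever node currently holds the fewest pods. This least-loaded rule is an assumption chosen for simplicity; the actual Kubernetes scheduler weighs resource requests and other constraints. Node and pod names are hypothetical.

```python
def schedule(pod_names, nodes):
    """Assign each pod to the node currently holding the fewest pods."""
    placement = {node: [] for node in nodes}
    for pod in pod_names:
        # Ties go to the node listed first.
        target = min(nodes, key=lambda node: len(placement[node]))
        placement[target].append(pod)
    return placement


nodes = ["node-a", "node-b", "node-c"]

# Scaling from 2 to 6 replicas spreads the extra pods over all three nodes.
before = schedule(["lms-1", "lms-2"], nodes)
after = schedule(["lms-%d" % i for i in range(1, 7)], nodes)
print(before)
print(after)
```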
Resiliency with Kubernetes
● Replication controllers ensure the desired number of pods is running
● Services provide load balancing
● Health checks allow bad pods to be ignored/removed
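The effect of health checks on routing can be sketched as a filter: only pods that pass their probe stay in the set of backends. The probe results below are simulated and the pod names are hypothetical; a real check would typically be an HTTP request or a command run in the container.

```python
def healthy_backends(pods, probe):
    """Keep only the pods whose health probe succeeds."""
    return [pod for pod in pods if probe(pod)]


# Simulated probe results: lms-2 is failing its check.
status = {"lms-1": True, "lms-2": False, "lms-3": True}

in_rotation = healthy_backends(sorted(status), lambda pod: status[pod])
print(in_rotation)  # ['lms-1', 'lms-3']
```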
Open edX on Kubernetes
● Goals:
○ Multi-tenant
○ Scalable + resilient
The challenge
Architecture
Monitoring with Sysdig
Sysdig drill-down
Lessons learned
● Containers should be stateless
● Put initialization tasks into separate pods that run once
● Services can be used to abstract non-containerized components
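One way to put a service in front of a non-containerized component is to define a Service without a label selector and pair it with a manually created Endpoints object of the same name. The sketch below builds both manifests as Python dictionaries; the service name, IP address, and port are hypothetical stand-ins for something like an externally hosted MySQL database.

```python
import json

# Service with no selector: Kubernetes will not look for pods to back it.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "mysql"},
    "spec": {"ports": [{"port": 3306}]},
}

# Endpoints object with the same name points the service at an address
# outside the cluster (hypothetical IP).
endpoints = {
    "apiVersion": "v1",
    "kind": "Endpoints",
    "metadata": {"name": "mysql"},
    "subsets": [
        {
            "addresses": [{"ip": "10.0.0.5"}],
            "ports": [{"port": 3306}],
        }
    ],
}

for manifest in (service, endpoints):
    print(json.dumps(manifest, indent=2))
```

Pods in the cluster can then connect to the name "mysql" as they would for any other service, and the backing address can change without touching the application configuration.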
Conclusion
● We’re still learning, but...
○ Kubernetes is a promising technology for providing both scalability and resiliency
More info
Open edX - http://open.edx.org
Kubernetes - http://kubernetes.io
Google Container Engine - http://cloud.google.com/container-engine
Thank you for your time!
Questions?
Slides: http://bit.ly/open-edx-kubernetes
nate@appsembler.com
morgan@appsembler.com
@appsembler
