2. OVERVIEW
BRIEF HISTORY OF CLUSTER MANAGEMENT
WHAT IS KUBERNETES?
MESOS AND THE MODERN DATACENTER
CROSSING THE STREAMS
MESOS <> KUBERNETES
3. BRIEF HISTORY OF CLUSTER MANAGEMENT
"The good ideas of today, often mimic the good ideas of the past."
4. STEP BACK TO THE 80'S & 90'S
[Ghostbusters] ~1984
5. BEFORE "CONTAINER ORCHESTRATION"
BEFORE IAAS/"CLOUD"
THERE WAS THE GRID
In the 1990s, inspired by the availability of high-speed wide area networks
and challenged by the computational requirements of new applications,
researchers began to imagine a computing infrastructure that would
“provide access to computing on demand” (COD) and permit “flexible,
secure, coordinated resource sharing among dynamic collections of
individuals, institutions, and resources”
[The History of the Grid] ~Ian Foster, Carl Kesselman
6. GRID DRIVERS
LARGE SCALE SCIENTIFIC COMPUTING (E.G. LHC)
DESIRE TO HAVE FEDERATED COMPUTING AT HUNDREDS OF SITES IN ORDER TO ANALYZE PETABYTES OF DATA. "SOUNDS LIKE: BIG DATA"
GOAL IS THROUGHPUT
PLEASINGLY PARALLEL ALGORITHMS
LOTS OF ENORMOUS WORKFLOWS (DAGS)
HTTP://HOME.WEB.CERN.CH/ABOUT/COMPUTING
7. GRID DRIVERS (CONT)
ANALOGOUS TO UTILITIES OF THE TIME, BUT FOR ON-DEMAND COMPUTE POWER
HETEROGENEOUS DISTRIBUTED RESOURCE MANAGEMENT INFRASTRUCTURES
MULTI-TENANT
INDEPENDENT SECURITY MODELS
** MANY SYSTEMS WORKING TOGETHER (GLOBUS, HADOOP, CONDOR, SGE ...)
SOPHISTICATED MATCHMAKING DUE TO THE HETEROGENEOUS NATURE OF THE GRID
8. GRID OPERATIONS
1. PROVISION RESOURCES
2. PUBLISH, OR ADVERTISE, RESOURCE AVAILABILITY
3. ASSEMBLE RESOURCES INTO AN OPERATIONAL GRID/POOL
4. CONSUME RESOURCES ACROSS A VARIETY OF APPLICATIONS
9. LESSONS LEARNED
NOT EVERYTHING IS A "JOB": HA-MICRO-SERVICES...
** NEEDS MORE COMPOSABILITY **
MANY SYSTEMS DOING SIMILAR THINGS (SGE, LSF, PBS, CONDOR, MESOS, KUBERNETES, SWARM)
PROVISION RESOURCES
PUBLISH, OR ADVERTISE, RESOURCE AVAILABILITY
ASSEMBLE RESOURCES INTO AN OPERATIONAL GRID/POOL
CONSUME RESOURCES ACROSS A VARIETY OF APPLICATIONS
HETEROGENEOUS COMPUTING PLATFORMS ARE HARD
HARDWARE DIVERSITY (SUN, X86, ITANIUM, POWERPC)
ASSORTED HW SPECIALIZATIONS
OS DIVERSITY (SOLARIS, WINDOWS, LINUX, HPUX, AIX ...)
INSTALLED STACK DIVERSITY (LIBRARIES, LANGUAGES)
10. LESSONS LEARNED (CONT)
MATCHING (HW+OS+SW) CAN BE A GRIZZLY BEAR
CONTAINERS SOLVE SOME OF THIS...
Software people often say “we eliminated a whole class of problems”
when they mean “we chose tradeoffs that make you solve them
elsewhere.” ~ William Benton
FLAT L3 NETWORKING IS A PITA (PORT MANGLING)
NAT-ING SHOULD BE CONFIGURABLE
NEEDS MORE FLEXIBILITY (CREATE YOUR OWN SCHEDULER)
EXPRESSIVENESS CAN BE GOOD... WHEN MANAGED, OTHERWISE IT CAN BECOME OBTUSE
/* now modify routed job attributes */
/* remove routed job if it goes on hold or stays idle for over 6 hours */
set_PeriodicRemove = JobStatus == 5 ||
(JobStatus == 1 && (CurrentTime - QDate) > 3600*6);
delete_WantJobRouter = true;
set_requirements = true;
11. FAST FORWARD TO 2015
[Back to the Future Part 3]
12. WHAT IS KUBERNETES?
The Greek word "kubernetes" means "helmsman of a ship" or, more metaphorically, "ruler."
13. WHAT IS KUBERNETES?
"Kubernetes is an open source orchestration system for containers. It handles scheduling onto nodes in a compute cluster and actively manages workloads to ensure that their state matches the user's declared intentions."
14. KUBERNETES?
'KUBERNETES IS AN OPEN SOURCE "ORCHESTRATION SYSTEM" FOR CONTAINERS. IT ...'
Kubernetes is an open source derivative work, based on Google's internal Borg infrastructure.
It manages containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications.
Kubernetes establishes a set of robust declarative primitives for maintaining the desired state requested by the user.
15. KUBERNETES?
KUBERNETES IS DECLARATIVE
apiVersion: v1
kind: ReplicationController
metadata:
  name: redis-slave
  labels:
    name: redis-slave
spec:
  replicas: 2
...
ALSO IMPERATIVE: THE API ALLOWS YOU TO WRITE INTROSPECTIVE SERVICES, OR CONTROLLERS, ON TOP OF IT.
IT'S POSSIBLE TO WRITE ELASTIC CONTROLLERS (THINK YARN)
17. CORE CONCEPTS
PODS
PODS ARE THE ATOM OF SCHEDULING: A GROUP OF CONTAINERS THAT ARE SCHEDULED ONTO THE SAME HOST. "COSCHEDULING"
PODS FACILITATE DATA SHARING AND COMMUNICATION BETWEEN CONTAINERS WITHIN THE POD (SEE THE SKETCH BELOW)
SHARED MOUNT POINT
SHARED NETWORK NAMESPACE/IP AND PORT SPACE
HIGHER-ORDER ABSTRACTION THAN CONTAINERS
COMPOSABLE MICRO-SERVICES
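A minimal sketch of a two-container pod that shares a volume and the pod's network namespace. The names (web-with-sidecar, shared-logs) and images are illustrative, not from the talk:

apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar        # hypothetical name
  labels:
    name: web-with-sidecar
spec:
  volumes:
    - name: shared-logs
      emptyDir: {}              # shared mount point, visible to both containers
  containers:
    - name: web
      image: nginx
      ports:
        - containerPort: 80     # both containers share one IP and port space
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/nginx
    - name: log-tailer          # sidecar reads what the web container writes
      image: busybox
      command: ["sh", "-c", "tail -F /logs/access.log"]
      volumeMounts:
        - name: shared-logs
          mountPath: /logs

Because both containers are coscheduled onto the same host, the sidecar can also reach the web container over localhost.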
18. CORE CONCEPTS (CONT)
CONTROLLERS
EVENTUAL CONSISTENCY IS MAINTAINED BY SEPARATE CONTROLLERS. EACH CONTROLLER'S PURPOSE IS TO RECTIFY ANY DISCREPANCY BETWEEN THE DECLARED STATE OF A PRIMITIVE AND THE CURRENT STATE OF THE SYSTEM (A FULLER EXAMPLE FOLLOWS THE EXCERPT BELOW)
[Diagram: apiserver, scheduler, and controller reconciling declared vs. current state across nodes]
kind: ReplicationController
...
spec:
  replicas: 2
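As a fuller sketch of the excerpt above, a complete ReplicationController under the v1 API; the image and label names follow the old guestbook example and are illustrative:

apiVersion: v1
kind: ReplicationController
metadata:
  name: redis-slave
spec:
  replicas: 2                    # declared state: keep two pods running at all times
  selector:
    name: redis-slave            # pods this controller is responsible for
  template:                      # pod template used whenever a replica is missing
    metadata:
      labels:
        name: redis-slave
    spec:
      containers:
        - name: redis-slave
          image: kubernetes/redis-slave:v2
          ports:
            - containerPort: 6379

If a node dies and the observed pod count drops to one, the replication controller notices the discrepancy and schedules a replacement until the count matches replicas again.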
19. CORE CONCEPTS (CONT)
SERVICES*
SERVICES PROVIDE A SINGLE, STABLE NAME AND ADDRESS FOR A SET OF PODS. THEY TYPICALLY ACT AS A BASIC LOAD-BALANCED PROXY ENDPOINT. (NON-COLLIDING NAT)
CLOUD-BASED IMPLEMENTATIONS HAVE NATIVE SUPPORT FOR CREATING EXTERNAL LOAD BALANCERS.
PROVIDES A CONSTRUCT WHICH IS USED TO LOOK UP, NAME, AND LINK PODS (INJECTION); SEE THE SKETCH BELOW
[Diagram: external/internal service or user -> load balancer -> pods]
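A minimal Service sketch for the redis-slave pods above; the Service name doubles as the stable lookup handle (assuming cluster DNS or environment-variable injection is in place):

apiVersion: v1
kind: Service
metadata:
  name: redis-slave              # stable name and address for the pod set
spec:
  selector:
    name: redis-slave            # traffic is proxied to any pod carrying this label
  ports:
    - protocol: TCP
      port: 6379                 # port on the service's cluster IP
      targetPort: 6379           # port on the backing pods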
21. CORE CONCEPTS (CONT)
LABELS
Labels are key/value pairs associated with pods or nodes.
Labels enable operators to map their own structures onto objects in a
loosely coupled fashion.
Selector operators: =, !=, in, notin (sketched below)
"labels": {
"release" : "stable",
"environment" : "production"
}
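A hedged sketch of how those operators show up as selectors: the equality operators map onto plain key/value selectors, while the set-based in/notin form appears in the label-selector query syntax and in later API objects such as ReplicaSets (label keys and values here are illustrative):

selector:
  matchLabels:
    environment: production                               # environment = production
  matchExpressions:
    - {key: release, operator: In, values: [stable]}      # release in (stable)
    - {key: tier, operator: NotIn, values: [canary]}      # tier notin (canary)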
24. USE CASES
1.0 PRIMARY USE CASE:
CONTAINER ORCHESTRATION FOR CLOUD-NATIVE APPLICATIONS.
AN ENGINE FOR BUILDING FULLY FEATURED PAAS SYSTEMS ATOP.
OpenShift adds developer and operational centric tools on
top of Kubernetes to enable rapid application development,
easy deployment and scaling, and long-term lifecycle
maintenance for small and large teams and applications
25. STATUS
1.0+ EXISTS FOR AVAILABILITY (GCE, ATOMIC, ETC.)
MESOS FRAMEWORK IS IN THE MAIN REPO, AND SUPPORTED!!!
K8S FORMALLY GIVEN TO THE CNCF
GOOGLECLOUD -> KUBERNETES ON GITHUB
26. MESOS AND THE MODERN DATA CENTER
27. NEW DATACENTER (CONT)
CHARACTERISTICS:
SHARED INFRASTRUCTURE VS. SILO(S)
MULTI-TENANT
MULTIPLE ELASTIC WORKLOADS
ANALYTICS + STREAMING + PAAS
PAAS (COMPOSABLE MICRO-SERVICES)
QOS (TIERS OF SERVICE)
FAIRNESS | QUOTA
MANY NETWORKS, SDN
LAYERS AND LAYERS OF SECURITY
28. ONLINE / NEARLINE / OFFLINE
BATCH PROCESSING:
Machine Learning, Modeling, Data Analysis, ETL, etc.
STREAM + PAAS:
Traditional services: Databases, Stream Processing
CLOUD-NATIVE / PAAS:
UI Clients, Web Framework du Jour, Event Dispatching
http://techblog.netflix.com/2013/03/system-architectures-for.html
OPERATIONAL PERSPECTIVE
30. CROSSING THE STREAMS
MESOS <> KUBERNETES
DISCLAIMER: I'M NOT A NETWORKING GURU
31.
32. STEP 1: DEVISE A PLAN
DRAW OUT YOUR CORE SERVICES FOR YOUR DATA CENTER
DETERMINE EXTERNAL VISIBILITY
AIR-GAPPING | RESOLUTION VISIBILITY | INGRESS & EGRESS
NETWORK ACCESSIBILITY TO YOUR OTHER FRAMEWORKS
RESOLUTION (MESOS-DNS)
TRY NOT TO RELY ON DNS; PREFER DISCOVERY SERVICES IF AT ALL POSSIBLE, OR WELL-DEFINED VIPS FOR PRIMARY CORE SERVICES.
VIPS DON'T SCALE
PLAN YOUR OVERLAY NETWORK
TRY TO SEPARATE NETWORKS TO MAINTAIN SOME LEVEL OF QOS
33.
34. EXPOSING K8S SERVICES
{
  ...
  "ports": [
    {
      "protocol": "TCP",
      "port": 80,
      "targetPort": 9376,
      "nodePort": 30061
    }
  ],
  ...
  "type": "LoadBalancer"
},
"status": {
  "loadBalancer": {
    "ingress": [
      {
        "ip": "146.148.47.155"
      }
      ...
    ]
  }
}
nodePort: the Kubernetes master will
allocate a port from a flag-configured
range (default: 30000-32767), and
each Node will proxy that port (the
same port number on every Node) into
your Service.
type: LoadBalancer - On cloud
providers which support external load
balancers, setting the type field
to "LoadBalancer" will provision a load
balancer for your Service.
https://github.com/kubernetes/kubernetes/blob/master/docs/user-guide/services.md
35. PLAN FOR CONSTRAINTS
DEALING WITH LEGACY SYSTEMS
DNS
MANY LEGACY SYSTEMS DEPEND ON DNS, FOR BETTER OR FOR WORSE
NAMESPACING (ENG, PROD) AND MULTI-TENANCY
IN A MULTI-TENANT ENVIRONMENT YOU COULD HAVE 10 COPIES OF THE SAME SERVICE, AND THAT SHOULD BE OK (SKETCHED BELOW)
REVERSE DNS - (NAT FAILURE)
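A hedged sketch of how namespaces keep those ten copies from colliding: the same Service name can live in an eng and a prod namespace, and cluster DNS resolves them as distinct names (namespace, service, and port values are illustrative):

apiVersion: v1
kind: Namespace
metadata:
  name: eng
---
apiVersion: v1
kind: Service
metadata:
  name: db1
  namespace: eng                 # an identically named db1 Service can also exist in "prod"
spec:
  selector:
    name: db1
  ports:
    - port: 5432

Inside the cluster these resolve to distinct names along the lines of db1.eng.svc.cluster.local and db1.prod.svc.cluster.local, which is why reverse DNS and legacy flat-namespace assumptions tend to be the painful part.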
37. STEP 2: CREATE A TEST EXPERIMENT
FIND YOUR HAPPY PLACE AND SAFE PLACE
HAVE A SANDBOX WHERE YOU CAN PLAY WITH SERVICES
TEST SETTING UP SEPARATE NETWORKS FOR DIFFERENT SERVICES
CONSIDER CLUSTERS TO BE EPHEMERAL
IT ACTUALLY MAKES LIFE EASIER
1 PAAS -> MANY PAAS-ES
TRY REACHING ACROSS NETWORKS
SET UP DIFFERENT LOAD-BALANCING SERVICES
DETERMINE IF VIPS MAKE SENSE FOR YOU AT YOUR SCALE
38. STEP 3: BURN YOUR ORIGINAL PLAN
ONLY 1/2 JOKING: YOU WILL LIKELY RUN INTO ISSUES YOU NEVER KNEW EXISTED. CONSULT YOUR LOCAL NETWORK OPERATOR
39. ENJOY THE JOURNEY
[Ghostbusters] ~1984
IT MAY GET A LITTLE MESSY, BUT IT'S WORTH IT