ContainerDays Boston 2015: "CoreOS: Building the Layers of the Scalable Cluster for Containers" (Barak Michener)


Slides from Barak Michener's talk "CoreOS: Building the Layers of the Scalable Cluster for Containers" at ContainerDays Boston 2015: http://dynamicinfradays.org/events/2015-boston/programme.html#layers

Published in: Technology

  1. 1. Building the Layers of the Scalable Cluster for Containers by Barak Michener
  2. 2. About Me distributed systems engineer @barakmich github.com/barakmich EpicHack.com
  3. 3. https://twitter.com/gabrtv/status/539805332432637952 Good artists copy… (or fork?)
  4. 4. the OSI Model
  5. 5. reduced API contracts OS
  6. 6. kernel systemd lvm ssh python java nginx mysql openssl app distro
  7. 7. python java nginx mysql openssl app distro kernel systemd lvm ssh
  8. 8. python openssl-A app1 distro java openssl-B app2 java openssl-B app3 kernel systemd lvm ssh
  9. 9. super-powers OS
  10. 10. Opportunity for automatic updates. Consistent set of software across hosts. Base OS independent from app.
  11. 11. design for host failure clustering
  12. 12. etcd
  13. 13. /etc distributed (or daemon?)
  14. 14. open source software sequentially consistent exposed via HTTP runtime reconfigurable
  15. 15. Available
  16. 16. Available
  17. 17. Available
  18. 18. Unavailable
  19. 19. Available Leader Follower
  20. 20. Leader Follower Available
  21. 21. Leader Follower Temporarily Unavailable
  22. 22. Leader Follower Available
  23. 23. 1 2 3 4 {Log
  24. 24. 1 2 3 4 Entries
  25. 25. 1 2 3 4 Indexes
  26. 26. Sequential Consistency Operations* are atomically executed in the same sequential order on all machines.
  27. 27. 1 1 1 2 Pet=dog Pet=cat Pet=cat 1 2 PUT Pet = cat PUT Pet = dog
  28. 28. 1 1 1 2 2 1 2 PUT Pet = cat PUT Pet = dog Pet=dog Pet=dog Pet=cat
  29. 29. 1 1 1 2 2 2 1 2 PUT Pet = cat PUT Pet = dog Pet=dog Pet=dog Pet=dog
  30. 30. Sequential Consistency Real-time
  31. 31. 1 1 1 2 GET Pet @ 10:00.0 -> 1[cat]!? GET Pet @ 10:00.0 -> 2[dog] 2
  32. 32. 1 1 1 2 2 2 GET Pet @ 10:00.1 -> 1[dog]
  33. 33. Sequential Consistency Index Time
  34. 34. 1 1 1 2 GET Pet @ 2 -> blocking GET Pet @ 2 -> 2[dog] 2
  35. 35. 1 1 1 2 GET Pet @ 2 -> 2[dog] 2 2
  36. 36. etcd guarantees that a get at index X will always return the same result. Avoid thinking in terms of real time because with network latency the result is always out-of-date.
  37. 37. Quorum GETs GET via Raft
  38. 38. 1 1 1 2 2
  39. 39. 1 1 1 2 QGET A 2
  40. 40. 1 1 1 2 QGET A -> 2[dog] 2 2
  41. 41. 1 1 1 2 QGET A -> 2[dog] 2 2 3 3
  42. 42. https://aphyr.com/posts/313-strong-consistency-models
  43. 43. super-powers etcd
  44. 44. Share configuration data across hosts. Resilient to host failures. Designed for consistency across hosts.
  45. 45. what can we build together? cluster resources
  46. 46. We have some important primitives now. What cluster-wide services can we provide? Network, Disk, Compute (kubelets). Personal opinion: ls /dev
  47. 47. super-powers flannel
  48. 48. Partition an IP range -- hand out an IP to each container (consistently). Speed! (Avoiding userspace is good)
  49. 49. getting work to servers orchestration
  50. 50. You Orchestrator API Scheduler, Network, NS Machine(s)
  51. 51. while true { todo = diff(desState, curState); schedule(todo) }
  52. 52. while true { todo = diff(desState, curState); schedule(todo) }
  53. 53. while true { todo = diff(desState, curState); schedule(todo) }
  54. 54. while true { todo = diff(desState, curState); schedule(todo) }
  55. 55. fleet mesos kubernetes swarm job scheduling
  56. 56. super-powers orchestration
  57. 57. Think about app capacity first. Take advantage of compute/memory/disk/network resources. Build on etcd resilience to host failure.
  58. 58. run and isolate apps containers
  59. 59. what is it exactly? containers
  60. 60. Containers Userspace Tarballs
  61. 61. libc python django app.py example.com/myapp "Lifts" myapp from the OS into Layer 6
  62. 62. $ container fetch example.com/myapp $ container run example.com/myapp Pull from layer 6 Invoke in layer 2
  63. 63. pid ns isolated pid 1
  64. 64. user ns isolated uid 0
  65. 65. network ns isolated netdev
  66. 66. mount ns isolated /
  67. 67. cgroups manage resources
  68. 68. cgroups count resources
  69. 69. cgroups limit resources
  70. 70. how are they created? containers
  71. 71. revisiting today’s Model: Layer 3: Application (you, MySQL, ElasticSearch); Layer 2: Linux Distro (Ubuntu, Red Hat, CentOS); Layer 1: Physical (Bare Metal, OpenStack, AWS, DO)
  72. 72. App MySQL ElasticSearch
  73. 73. App MySQL ElasticSearch
  74. 74. a Cluster
  75. 75. a Cluster
  76. 76. a Cluster
  77. 77. a Cluster
  78. 78. @coreoslinux
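
Slides 15 to 22 label a cluster as Available or Unavailable as members fail, without giving numbers. Assuming the usual majority rule for a Raft-based store (slide 37 mentions Raft), the arithmetic can be sketched as below. The function names are illustrative and are not part of etcd.

```python
def quorum(cluster_size: int) -> int:
    """Smallest number of members that forms a majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many members can be lost while a majority still remains."""
    return cluster_size - quorum(cluster_size)


def is_available(cluster_size: int, healthy: int) -> bool:
    """Assumption: the cluster accepts writes only while a majority is healthy."""
    return healthy >= quorum(cluster_size)


for size in (1, 3, 5):
    print(size, quorum(size), tolerated_failures(size))
```

Under this rule a three-member cluster survives one failure and a five-member cluster survives two, which is one reason odd cluster sizes are commonly chosen.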
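
Slides 26 to 36 describe sequential consistency in terms of log indexes: every machine applies the same entries in the same order, and a GET at index X either returns the same answer everywhere or waits until the machine has caught up. The toy model below illustrates that idea only. `Replica`, `apply`, and `get_at` are hypothetical names, not etcd's API, and returning `None` stands in for the blocking behaviour shown on slide 34.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy replica that applies log entries strictly in index order."""
    applied_index: int = 0
    history: dict = field(default_factory=dict)  # index -> state after that entry

    def apply(self, index, key, value):
        # Entries must arrive without gaps, in the same order on every replica.
        if index != self.applied_index + 1:
            raise ValueError(f"expected index {self.applied_index + 1}, got {index}")
        previous = self.history.get(self.applied_index, {})
        self.history[index] = {**previous, key: value}
        self.applied_index = index

    def get_at(self, key, index):
        """Value of `key` as of `index`, or None if this replica is still behind."""
        if index > self.applied_index:
            return None  # a real server would block here instead
        return self.history.get(index, {}).get(key)


log = [(1, "Pet", "cat"), (2, "Pet", "dog")]

fast, slow = Replica(), Replica()
for entry in log:
    fast.apply(*entry)
slow.apply(*log[0])  # the slow replica has only seen index 1 so far

print(fast.get_at("Pet", 2))  # dog
print(slow.get_at("Pet", 2))  # None: behind, would block
slow.apply(*log[1])
print(slow.get_at("Pet", 2))  # dog
```

A read at index 1 keeps returning "cat" on both replicas even after index 2 is applied, which is the guarantee slide 36 states.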
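
Slide 48 summarizes flannel as partitioning an IP range and handing out addresses consistently. The sketch below shows only the partitioning arithmetic, using Python's standard `ipaddress` module. The CIDR, the prefix length, the host names, and the sorted-order assignment are all assumptions made for illustration; this is not flannel's actual allocation logic.

```python
import ipaddress


def partition(cluster_cidr, host_prefix, hosts):
    """Give each host its own subnet carved out of the cluster-wide range."""
    network = ipaddress.ip_network(cluster_cidr)
    subnets = list(network.subnets(new_prefix=host_prefix))
    if len(hosts) > len(subnets):
        raise ValueError("not enough subnets for the number of hosts")
    # Sorting makes the assignment deterministic, so every caller agrees.
    return dict(zip(sorted(hosts), subnets))


def container_ip(subnet, n):
    """Address of the n-th container on a host (n starts at 1)."""
    return subnet[n]


leases = partition("10.1.0.0/16", 24, ["host-b", "host-a", "host-c"])
for host, subnet in leases.items():
    print(host, subnet, container_ip(subnet, 1))
```

Because each host owns a disjoint subnet, containers on different hosts never receive the same address.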
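
Slides 51 to 54 repeat one piece of pseudocode: diff the desired state against the current state, then schedule the difference. A minimal runnable version of that loop is sketched below. It models state as a set of unit names and bounds the loop so the example terminates; both choices are assumptions for illustration and do not reflect how fleet, Mesos, Kubernetes, or Swarm represent state.

```python
def diff(desired, current):
    """Return (to_start, to_stop): units missing from, or extra in, `current`."""
    return desired - current, current - desired


def schedule(todo, current):
    """Pretend to act on the diff by updating the observed state in place."""
    to_start, to_stop = todo
    current |= to_start
    current -= to_stop


def reconcile(desired, current, max_rounds=10):
    """Bounded version of the slides' `while true` loop."""
    for _ in range(max_rounds):
        todo = diff(desired, current)
        if not any(todo):  # nothing to start and nothing to stop
            break
        schedule(todo, current)
    return current


desired_state = {"web", "db"}
current_state = {"db", "old-job"}
print(sorted(reconcile(desired_state, current_state)))  # ['db', 'web']
```

The loop never issues one-off commands. It only compares two states, so re-running it after a host failure converges on the desired state again.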
