1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
YARN and the Docker
Container Runtime
Dublin, April 2016
Sidharta S
Context
LinuxContainerExecutor
Why Docker with YARN?
Container Runtimes in LinuxContainerExecutor
⬢ Support for container runtimes in YARN was added as part of YARN-3611
(umbrella). Multiple container types are supported in the same executor.
⬢ The existing mechanism for handling the container lifecycle is moved into its own
runtime.
⬢ A new Docker container runtime is introduced that manages Docker containers.
⬢ LinuxContainerExecutor can delegate to either runtime on a per-application basis.
⬢ Clients specify which container type they want to use – currently via environment
variables but eventually through well-defined client APIs.
⬢ We could support more container types in the future.
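The per-application delegation described above can be sketched roughly as follows. This is a minimal illustration, not the LinuxContainerExecutor code: the environment variable names `YARN_CONTAINER_RUNTIME_TYPE` and `YARN_CONTAINER_RUNTIME_DOCKER_IMAGE` are assumptions modeled on the variable named later in these slides, and the two launch functions are hypothetical stand-ins that only describe what would be launched.

```python
from typing import Callable, Dict, Mapping

# Assumed variable names; the slides only say "environment variables".
RUNTIME_TYPE_ENV = "YARN_CONTAINER_RUNTIME_TYPE"
DOCKER_IMAGE_ENV = "YARN_CONTAINER_RUNTIME_DOCKER_IMAGE"


def launch_default(env: Mapping[str, str]) -> str:
    """Hypothetical stand-in for the pre-existing (process-based) runtime."""
    return "default: launch process container"


def launch_docker(env: Mapping[str, str]) -> str:
    """Hypothetical stand-in for the Docker container runtime."""
    image = env.get(DOCKER_IMAGE_ENV, "<no image>")
    return f"docker: run image {image}"


# One executor, several runtimes; more types could be registered later.
RUNTIMES: Dict[str, Callable[[Mapping[str, str]], str]] = {
    "default": launch_default,
    "docker": launch_docker,
}


def delegate(env: Mapping[str, str]) -> str:
    """Pick a runtime from the container's environment and delegate to it."""
    runtime_type = env.get(RUNTIME_TYPE_ENV, "default")
    if runtime_type not in RUNTIMES:
        raise ValueError(f"unsupported container runtime: {runtime_type}")
    return RUNTIMES[runtime_type](env)
```

Keeping the runtimes in a lookup table mirrors the last bullet: supporting another container type would mean registering one more entry.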
DockerContainerRuntime in LinuxContainerExecutor
⬢ Exposes a subset of Docker container lifecycle functionality
⬢ Docker v1.10.x is required for some of the work being planned (e.g. user namespaces)
⬢ A recent Linux kernel (3.10+) is required for basic functionality. Some features (e.g.
overlay fs) require an even more recent kernel.
DockerContainerRuntime : Resource Isolation
⬢ Support added in YARN-4553
⬢ LinuxContainerExecutor still manages resource isolation and enforcement.
⬢ Docker uses the cgroup specified by LCE (--cgroup-parent was introduced in Docker v1.6;
net_cls support was recently added to libcontainer and is available as of v1.9)
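As a rough sketch of how the executor-chosen cgroup could reach Docker, the function below assembles a `docker run` argument list that passes the cgroup through `--cgroup-parent`. The function name, the container name, and the cgroup path in the usage note are hypothetical; only the `--cgroup-parent` flag itself comes from the slide.

```python
from typing import List


def docker_run_args(image: str, container_name: str, cgroup_parent: str) -> List[str]:
    """Build a `docker run` command that places the container under the
    cgroup hierarchy chosen by the executor (hypothetical helper)."""
    return [
        "docker", "run",
        "--name", container_name,
        # Docker creates the container's cgroup beneath this parent, so
        # limits enforced on the parent still apply to the container.
        f"--cgroup-parent={cgroup_parent}",
        image,
    ]
```

For example, `docker_run_args("centos", "container_01", "/hadoop-yarn/container_01")` yields a command whose container is nested under `/hadoop-yarn/container_01`, a path chosen here purely for illustration.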
DockerContainerRuntime : Linux Capabilities
⬢ Support added in YARN-4258
⬢ Based on Linux capabilities
⬢ Admin controlled: the cluster administrator controls which capabilities Docker
containers have on the cluster
(yarn.nodemanager.runtime.linux.docker.capabilities)
⬢ The default set is based on what Docker uses by default.
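One way to turn an admin-configured capability list into Docker flags is sketched below. It assumes the property value is a comma-separated list and that the runtime drops every capability before adding back the configured ones; both are assumptions made for illustration, not something the slides state. `--cap-drop` and `--cap-add` are standard `docker run` flags.

```python
from typing import List


def capability_args(configured: str) -> List[str]:
    """Translate a comma-separated capability list (assumed format of
    yarn.nodemanager.runtime.linux.docker.capabilities) into docker flags."""
    caps = [c.strip().upper() for c in configured.split(",") if c.strip()]
    # Assumed approach: start from nothing, then grant only what the
    # administrator listed, so the result does not depend on Docker's defaults.
    return ["--cap-drop=ALL"] + [f"--cap-add={cap}" for cap in caps]
```

With this shape, an empty property value results in a container with no capabilities at all, which makes the administrator's list the single source of truth.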
DockerContainerRuntime : Privileged Containers
⬢ Support added in YARN-4262
⬢ Allows certain applications (e.g. Oracle) to run in Docker containers
⬢ This could be a security hazard, so access needs to be controlled:
–Disabled by default (yarn.nodemanager.runtime.linux.docker.privileged-containers.allowed)
–Admin-controlled whitelist (yarn.nodemanager.runtime.linux.docker.privileged-containers.acl)
–Whitelisted users are allowed to launch privileged containers, but they must request
it explicitly (YARN_CONTAINER_RUNTIME_DOCKER_RUN_PRIVILEGED_CONTAINER)
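The three controls above combine into a single decision, sketched here. The function and parameter names are hypothetical, and treating the environment variable as the string "true" is an assumption; the slides only name the variable and the two configuration properties.

```python
from typing import Mapping, Set

PRIVILEGED_ENV = "YARN_CONTAINER_RUNTIME_DOCKER_RUN_PRIVILEGED_CONTAINER"


def may_run_privileged(env: Mapping[str, str], user: str,
                       cluster_allows: bool, acl: Set[str]) -> bool:
    """Return True only if a privileged container was requested and permitted.

    cluster_allows stands for ...privileged-containers.allowed and acl for
    ...privileged-containers.acl (both hypothetical parameter names).
    """
    # 1. The user must ask explicitly; otherwise run unprivileged.
    if env.get(PRIVILEGED_ENV, "false").lower() != "true":
        return False
    # 2. The feature is disabled unless the administrator enabled it.
    if not cluster_allows:
        raise PermissionError("privileged containers are disabled on this cluster")
    # 3. Only whitelisted users may launch privileged containers.
    if user not in acl:
        raise PermissionError(f"user {user!r} may not launch privileged containers")
    return True
```

Raising on a denied request, rather than silently falling back to an unprivileged container, is a design choice of this sketch: an application that needs privileged mode would otherwise fail later in a less obvious way.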
DockerContainerRuntime : Users
⬢ Docker runs the container as the specified user (-u).
⬢ This user needs to be available in the image being used.
⬢ Depending on capabilities (CAP_SETUID), privilege escalation could occur.
⬢ Docker v1.10 added support for user namespaces, which requires daemon
reconfiguration. Support for this in DockerContainerRuntime needs to be planned.
DockerContainerRuntime : Networking
⬢ The Docker container runtime defaults to --net=host.
–This is not secure, but it is the only way some applications can run.
–We need to switch the default to bridged mode or an admin-specified network
plugin.
⬢ Network plugin support was added in Docker v1.9.
–This should allow for more sophisticated networking scenarios.
–YARN doesn’t have to do anything except delegate to the specified network
plugin when launching the container.
–Support for this is a work in progress (YARN-4007).
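A possible shape for the network selection is sketched below: the container may name a network, the administrator supplies the set of permitted networks, and the runtime only passes the choice on to Docker. The function, its parameters, and the idea of an allowed set are assumptions for illustration; the slides describe this work as still in progress. `--net` is a real `docker run` flag.

```python
from typing import Optional, Set


def network_arg(requested: Optional[str], allowed: Set[str],
                default: str = "host") -> str:
    """Return the docker network flag for a container (hypothetical helper).

    The default mirrors the current --net=host behaviour; switching to a
    safer default would only mean passing default="bridge" here.
    """
    network = requested or default
    if network not in allowed:
        raise ValueError(f"network {network!r} is not permitted on this cluster")
    # YARN does nothing beyond naming the network; Docker (or the network
    # plugin behind that name) does the actual setup.
    return f"--net={network}"
```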
DockerContainerRuntime : Images
⬢ Localization via HDFS? We could localize images from HDFS and load them using
‘docker load’.
⬢ This approach has the advantage of using an existing HDFS instance for
storage/distribution at scale.
⬢ However:
–We lose some of the optimizations/functionality that a full-fledged Docker
registry might provide.
–We have to figure out the security implications. What if users clobber each other's
images when loading?
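The localize-then-load idea amounts to two commands per image, sketched here as argument lists that a caller could hand to a process launcher. The helper name and the paths in the usage note are hypothetical, and the sketch deliberately does not address the clobbering question raised above.

```python
import posixpath
from typing import List


def localize_and_load_commands(hdfs_path: str, local_dir: str) -> List[List[str]]:
    """Build the commands to copy an image tarball out of HDFS and load it
    into the local Docker daemon (hypothetical helper)."""
    local_tar = posixpath.join(local_dir, posixpath.basename(hdfs_path))
    return [
        # Step 1: localize the tarball from HDFS to the node's local disk.
        ["hdfs", "dfs", "-get", hdfs_path, local_tar],
        # Step 2: load the tarball into Docker's local image store.
        ["docker", "load", "--input", local_tar],
    ]
```

For instance, `localize_and_load_commands("/images/spark.tar", "/tmp/nm-local")` produces a copy to `/tmp/nm-local/spark.tar` followed by a load of that file; both paths are illustrative only.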
Demos
Spark on Docker
YARN on YARN
Q&A
