July 2014 HUG : Privilege Isolation in Docker Containers

  • Loss of locality etc. doesn’t make material difference
    Suboptimal scheduling
    No sharing (IA usecase: universities sharing data over a common HDFS)
  • July 2014 HUG : Privilege Isolation in Docker Containers

    1. Containers and Hadoop: Hadoop virtualization, done right! Dinesh Subhraveti (dineshs@altiscale.com), Altiscale Inc.
    2-8. “Brief History of Containers” (timeline 2001 to 2005, built up incrementally across these slides)
       ❏ First implementation of containers based on syscall interposition (Columbia)
       ❏ First research paper on Linux Containers (OSDI’02)
       ❏ First container-based distributed checkpointing (HP Labs)
       ❏ Enterprise Linux Container solution (Meiosys)
       ❏ IBM acquires Meiosys; focus shifted to AIX
       ❏ Most core kernel changes finally made it into Linux mainline
    9. Container Renaissance: “Datacenter is the Computer”
    10. “The new computer needs an OS!” Diagram labels: Computer, OS; Mesos, Kubernetes, YARN.
    11. Containers: Enabler of the Datacenter OS. Diagram labels: Computer, OS; Mesos, Kubernetes, YARN. Processes, Containers: isolated abstractions.
    12. Why not Virtual Machines? Application-hardware misalignment. Applications have round edges (system call interface); hypervisors expose square holes (hardware interface). Containers: lightweight abstraction without IO overhead or startup latency. Diagram labels: Hypervisor, Container, Host, Application.
    13. Why not Virtual Machines? Application-hardware misalignment. Applications have round edges (system call interface); hypervisors expose square holes (hardware interface). Containers: lightweight abstraction without IO overhead or startup latency. Diagram labels: Hypervisor, Container, Host, Application, and "The unwelcome Guest OS".
    14. Why not Virtual Machines? Layers of intermediate software. VM stack (diagram labels): Host; iSCSI, NFS; Image Format Interpreter; Virtual Device; VM Exit (Context Switch); Guest Driver; Guest File System; Application. Container stack: Host; Application. High IO overhead due to many intermediate layers.
    15. Why not Virtual Machines? The Unwelcome Guest OS
       ❏ Slow startup time
       ❏ Guest OS licensing and maintenance burden
       ❏ Poor scalability
       ❏ High resource consumption due to duplication
       ❏ Obfuscated network / storage / compute topologies
       ❏ Application semantic information is lost
    16. Evolution of Hadoop from MapReduce to YARN. Hadoop: Resource Manager, MapReduce. YARN: MapReduce, Spark, HBase, ... Isolation is an immediate challenge.
    17. Containers on YARN. Hadoop: Resource Manager, MapReduce. YARN: MapReduce, Spark, HBase, ... with container virtualization. Containers provide a simple and elegant solution.
    18. Containers on YARN: Node Manager spawned tasks as containers. Diagram labels: Node Manager; Container Virtualization; Customer A Task 1, Customer B Task 1, Customer A Task 2, Customer C Task 1. Tasks representing the same job share the same container.
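
The sharing rule on slide 18 can be sketched as a simple grouping step. This is only an illustration, not the Node Manager's actual logic: the `Task` type, the `group_tasks_by_job` helper, and the choice of the customer name as the job key are all assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Task(NamedTuple):
    job: str   # hypothetical job key; assumed here to be one job per customer
    name: str


def group_tasks_by_job(tasks):
    """Return {job: [task names]}; each key stands for one shared container."""
    containers = defaultdict(list)
    for task in tasks:
        containers[task.job].append(task.name)
    return dict(containers)


# The four tasks shown on the slide.
tasks = [
    Task("Customer A", "Task 1"),
    Task("Customer B", "Task 1"),
    Task("Customer A", "Task 2"),
    Task("Customer C", "Task 1"),
]
containers = group_tasks_by_job(tasks)
print(containers)
```

Under these assumptions the four tasks land in three containers, with both Customer A tasks sharing one.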
    19. Containers on YARN: Advantages
       ❏ Secure multitenancy
       ❏ Performance isolation
       ❏ Utilization via coscheduling IO and CPU tasks
       ❏ Consistent cluster environment
       ❏ Isolation of software dependencies / configuration
       ❏ Reproducible way to define app environment
       ❏ Rapid provisioning
    20. Privilege Isolation through UID namespaces
 ❏ Recent addition to the kernel
 ❏ Superuser in container maps to a regular user on the host
 ❏ Docker support for UID virtualization
 Diagram labels: Container: container root (UID 0). UID Virtualization. Host: regular user (UID 100); host root (UID 0).
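
The mapping on slide 20 can be sketched in a few lines. Linux exposes a user namespace's mapping in `/proc/<pid>/uid_map` as lines of three numbers: first UID inside the namespace, first UID outside it, and the range length. The map string below (`0 100 1`) is an assumption chosen to match the slide's diagram, and `parse_uid_map` and `to_host_uid` are hypothetical helpers, not part of Docker or the kernel.

```python
def parse_uid_map(text):
    """Parse 'inside_start outside_start length' lines into tuples."""
    entries = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, length = (int(field) for field in line.split())
            entries.append((inside, outside, length))
    return entries


def to_host_uid(container_uid, entries):
    """Translate a UID seen inside the container to the host UID.

    Returns None when the UID falls outside every mapped range.
    """
    for inside, outside, length in entries:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


# Assumed map matching the slide: container UID 0 -> host UID 100.
UID_MAP = "0 100 1\n"
entries = parse_uid_map(UID_MAP)
host_uid = to_host_uid(0, entries)
print(host_uid)  # 100
```

With this map, a process that is root inside the container runs as the unprivileged UID 100 on the host, and any other container UID is unmapped.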
    21. References
 ❏ Blog post describing UID virtualization support in Docker: https://www.altiscale.com/making-docker-work-yarn/
 ❏ Apache wiki page tracking work status across Docker and YARN projects: https://wiki.apache.org/hadoop/dineshs/IsolatingYarnAppsInDockerContainers
 ❏ JIRA tracking Docker integration into YARN: https://issues.apache.org/jira/browse/YARN-1964
 ❏ Related Docker tickets: several tickets linked from https://github.com/dotcloud/docker/pull/4572
 Questions? dineshs@altiscale.com
    22. Backup: Containers on Hadoop or Hadoop on Containers?
    23. Hadoop on Separate Physical Clusters: awesomely secure! Everybody gets private hardware running private services (Customer 1, Customer 2, Customer 3).
    24. Hadoop on Separate Physical Clusters (Customer 1, Customer 2, Customer 3): cannot scale the business this way! Poor utilization. Host platform is a huge maintenance burden:
       ❖ Customer 1 needs R
       ❖ Customer 2 needs Matlab
       ❖ Customer 3 needs ß∂ø…
       Per-cluster figures: Utilization: 6, Spare: 0, Unused: 3 | Utilization: 1, Spare: 6, Unused: 2 | Utilization: 4, Spare: 3, Unused: 2
    25. Container Clusters to Decouple Host from Customer (Customer 1, Customer 2, Customer 3). Each customer gets a container image:
       ❖ Encapsulates customer specific software and configuration
       ❖ Host platform remains lean and simple
       Per-cluster figures: Utilization: 6, Spare: 0, Unused: 3 | Utilization: 1, Spare: 6, Unused: 2 | Utilization: 4, Spare: 3, Unused: 2. Poor utilization.
    26. Container Clusters to Drive Utilization. Global Pool of Resources: Global Utilization: 11, Spare: 16, Unused: 0. Each customer gets a container image:
       ❖ Encapsulates customer specific software and configuration
       ❖ Host platform remains lean and simple
       Densely pack containers together.
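
The global figures on slide 26 follow from the per-cluster figures on the two preceding slides. The sketch below is one reading of the slides rather than something they state outright: it assumes that, once resources are pooled, capacity that was stranded as "unused" in each private cluster becomes "spare" capacity in the global pool. The `pool` helper and the dictionary keys are hypothetical names.

```python
# Per-cluster figures as shown on the slides (three separate clusters).
clusters = [
    {"utilization": 6, "spare": 0, "unused": 3},
    {"utilization": 1, "spare": 6, "unused": 2},
    {"utilization": 4, "spare": 3, "unused": 2},
]


def pool(clusters):
    """Combine clusters into one pool, counting unused capacity as spare."""
    utilization = sum(c["utilization"] for c in clusters)
    spare = sum(c["spare"] + c["unused"] for c in clusters)
    return {"utilization": utilization, "spare": spare, "unused": 0}


global_pool = pool(clusters)
print(global_pool)  # {'utilization': 11, 'spare': 16, 'unused': 0}
```

The totals are 6 + 1 + 4 = 11 utilized, and (0 + 6 + 3) + (3 + 2 + 2) = 16 spare, matching the slide's Global Utilization: 11, Spare: 16, Unused: 0.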
    27. Global Pool of Resources: Containers with Fine-grain Resources
       ❖ Container resource levels adjusted dynamically per customer
         ➢ As dictated by business policy
       ❖ Fractional resource allocation
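
Fractional, policy-driven allocation can be illustrated with a proportional-share calculation. The slides do not say how the policy is expressed, so the per-customer weights, the pool size of 10 units, and the `allocate` helper below are all assumptions for the sake of the example.

```python
def allocate(pool_size, weights):
    """Split pool_size among customers in proportion to their policy weights."""
    total = sum(weights.values())
    return {customer: pool_size * w / total for customer, w in weights.items()}


# Hypothetical policy: Customer 2 is entitled to three times Customer 1's share.
shares = allocate(10.0, {"Customer 1": 1, "Customer 2": 3})
print(shares)  # {'Customer 1': 2.5, 'Customer 2': 7.5}

# "Adjusted dynamically": when the policy changes, recompute the shares.
rebalanced = allocate(10.0, {"Customer 1": 1, "Customer 2": 1})
print(rebalanced)  # {'Customer 1': 5.0, 'Customer 2': 5.0}
```

The shares need not be whole units, which is the point of fractional allocation: a customer can hold 2.5 units of a resource rather than rounding up to a whole machine.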
    28. Global Pool of Resources: Disaggregated Compute and Storage (diagram labels: DN, NM)
       ❖ Add more storage to Customer 1 cluster from a storage rich node
         ➢ While a compute intensive job from Customer 2 utilizes the available compute capacity on the same node
       Independently scale compute and storage.