July 2014 HUG : Privilege Isolation in Docker Containers

  • Loss of locality etc. doesn’t make material difference
    Suboptimal scheduling
    No sharing (IA usecase: universities sharing data over a common HDFS)

Transcript

  • 1. Containers and Hadoop: Hadoop virtualization, done right! Dinesh Subhraveti (dineshs@altiscale.com), Altiscale Inc.
  • 2. “Brief History of Containers” (timeline: 2001, 2002, 2003, 2004, 2005) First implementation of containers based on syscall interposition (Columbia)
  • 3. “Brief History of Containers” (timeline: 2001, 2002, 2003, 2004, 2005) First implementation of containers based on syscall interposition (Columbia); First research paper on Linux Containers (OSDI’02)
  • 4. “Brief History of Containers” (timeline: 2001, 2002, 2003, 2004, 2005) First implementation of containers based on syscall interposition (Columbia); First research paper on Linux Containers (OSDI’02); First container-based distributed checkpointing (HP Labs)
  • 5. “Brief History of Containers” (timeline: 2001, 2002, 2003, 2004, 2005) First implementation of containers based on syscall interposition (Columbia); First research paper on Linux Containers (OSDI’02); First container-based distributed checkpointing (HP Labs); Enterprise Linux Container solution (Meiosys)
  • 6. “Brief History of Containers” (timeline: 2001, 2002, 2003, 2004, 2005) First implementation of containers based on syscall interposition (Columbia); First research paper on Linux Containers (OSDI’02); First container-based distributed checkpointing (HP Labs); Enterprise Linux Container solution (Meiosys); IBM acquires Meiosys, focus shifted to AIX
  • 7. “Brief History of Containers” (timeline: 2001, 2002, 2003, 2004, 2005) First implementation of containers based on syscall interposition (Columbia); First research paper on Linux Containers (OSDI’02); First container-based distributed checkpointing (HP Labs); Enterprise Linux Container solution (Meiosys); IBM acquires Meiosys, focus shifted to AIX
  • 8. “Brief History of Containers” (timeline: 2001, 2002, 2003, 2004, 2005) First implementation of containers based on syscall interposition (Columbia); First research paper on Linux Containers (OSDI’02); First container-based distributed checkpointing (HP Labs); Enterprise Linux Container solution (Meiosys); IBM acquires Meiosys, focus shifted to AIX; Most core kernel changes finally made it into Linux mainline
  • 9. Container Renaissance “Datacenter is the Computer”
  • 10. “The new computer needs an OS!” Computer OS; Mesos, Kubernetes, YARN
  • 11. Containers: Enabler of the Datacenter OS. Mesos, Kubernetes, YARN; Computer OS. Processes and Containers: isolated abstractions
  • 12. Why not Virtual Machines? Application/Hardware misalignment. Applications have round edges (system call interface). Hypervisors expose square holes (hardware interface). Lightweight abstraction without IO overhead or startup latency. [Diagram labels: Hypervisor, Container, Host, Application]
  • 13. Why not Virtual Machines? Application/Hardware misalignment. Applications have round edges (system call interface). Hypervisors expose square holes (hardware interface). Lightweight abstraction without IO overhead or startup latency. The unwelcome Guest OS. [Diagram labels: Hypervisor, Container, Host, Application]
  • 14. Why not Virtual Machines? Layers of Intermediate Software. High IO overhead due to many intermediate layers. [Diagram labels, VMs: Host; iSCSI, NFS; Image Format Interpreter; Virtual Device; VM Exit (Context Switch); Guest Driver; Guest File System; Application. Containers: Host; Application]
  • 15. Why not Virtual Machines? The Unwelcome Guest OS: Slow startup time; Guest OS licensing and maintenance burden; Poor scalability; High resource consumption due to duplication; Obfuscated network / storage / compute topologies; Application semantic information is lost
  • 16. Evolution of Hadoop from Map Reduce to YARN. Isolation is an immediate challenge. [Diagram labels: Hadoop: Resource Manager, Map Reduce; YARN: Map Reduce, Spark, Hbase, ...]
  • 17. Containers on YARN. Containers provide a simple and elegant solution. [Diagram labels: Hadoop: Resource Manager, Map Reduce; YARN: Map Reduce, Spark, Hbase, ...; Container Virtualization]
  • 18. Containers on YARN: Node Manager Spawned Tasks as Containers. Tasks representing the same job share the same container. [Diagram labels: Node Manager; Container Virtualization; Customer A Task 1; Customer A Task 2; Customer B Task 1; Customer C Task 1]
  • 19. Containers on YARN: Advantages. Secure multitenancy; Performance Isolation; Utilization via coscheduling IO and CPU tasks; Consistent cluster environment; Isolation of software dependencies / configuration; Reproducible way to define app environment; Rapid provisioning
  • 20. Privilege Isolation through UID namespaces
 ❏ Recent addition to the kernel
 ❏ Superuser in container maps to a regular user on the host
 ❏ Docker support for UID virtualization
 [Diagram labels: UID Virtualization; Container: Container root UID 0; Host: Regular user UID 100, Host root UID 0]
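The mapping on slide 20 can be illustrated with a minimal sketch. It assumes the three-column layout the Linux kernel uses in `/proc/<pid>/uid_map` (first UID inside the namespace, first UID on the host, range length). The helper names `parse_uid_map` and `to_host_uid` and the range length of 1000 are hypothetical and chosen only for illustration; only the UID 0 to UID 100 pairing comes from the slide.

```python
from typing import List, Optional, Tuple

# Each entry mirrors one line of /proc/<pid>/uid_map:
# (first UID inside the namespace, first UID on the host, range length)
UidMap = List[Tuple[int, int, int]]


def parse_uid_map(text: str) -> UidMap:
    """Parse uid_map-style text into (inside, outside, length) tuples."""
    entries = []
    for line in text.strip().splitlines():
        inside, outside, length = (int(field) for field in line.split())
        entries.append((inside, outside, length))
    return entries


def to_host_uid(container_uid: int, uid_map: UidMap) -> Optional[int]:
    """Translate a UID seen inside the container to the UID the host sees.

    Returns None when the container UID is not covered by any mapping.
    """
    for inside, outside, length in uid_map:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


# Assumed mapping for illustration: container UIDs 0..999 -> host UIDs 100..1099.
example_map = parse_uid_map("0 100 1000")

# Root inside the container is an unprivileged user on the host.
print(to_host_uid(0, example_map))  # 100
```

Under this assumed mapping, a process that is root inside the container holds only the privileges of host UID 100 if it reaches host resources, which is the isolation property the slide describes.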
  • 21. References
 ❏ Blog post describing UID virtualization support in Docker: https://www.altiscale.com/making-docker-work-yarn/
 ❏ Apache wiki page tracking work status across Docker and YARN projects: https://wiki.apache.org/hadoop/dineshs/IsolatingYarnAppsInDockerContainers
 ❏ JIRA tracking Docker integration into YARN: https://issues.apache.org/jira/browse/YARN-1964
 ❏ Related Docker tickets: several tickets linked from https://github.com/dotcloud/docker/pull/4572
 Questions? dineshs@altiscale.com
  • 22. Backup: Containers on Hadoop or Hadoop on Containers?
  • 23. Hadoop on Separate Physical Clusters: Awesomely Secure! Everybody gets private hardware running private services. [Diagram labels: Customer 1, Customer 2, Customer 3]
  • 24. Hadoop on Separate Physical Clusters: Cannot scale the business this way! Poor utilization. Host platform is a huge maintenance burden
 ❖ Customer 1 needs R
 ❖ Customer 2 needs Matlab
 ❖ Customer 3 needs ß∂ø…
 [Diagram labels: Customer 1, Customer 2, Customer 3. Cluster resources: (Utilization: 6, Spare: 0, Unused: 3); (Utilization: 1, Spare: 6, Unused: 2); (Utilization: 4, Spare: 3, Unused: 2)]
  • 25. Container Clusters to Decouple Host from Customer. Each customer gets a container image
 ❖ Encapsulates customer specific software and configuration
 ❖ Host platform remains lean and simple
 Poor utilization. [Diagram labels: Customer 1, Customer 2, Customer 3. Cluster resources: (Utilization: 6, Spare: 0, Unused: 3); (Utilization: 1, Spare: 6, Unused: 2); (Utilization: 4, Spare: 3, Unused: 2)]
  • 26. Container Clusters to Drive Utilization. Each customer gets a container image
 ❖ Encapsulates customer specific software and configuration
 ❖ Host platform remains lean and simple
 Densely pack containers together. [Diagram labels: Global Pool of Resources. Global Utilization: 11, Spare: 16, Unused: 0]
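The global figures on slide 26 follow from the per-cluster figures on slides 24 and 25 by simple addition. The sketch below reproduces that arithmetic. It rests on two assumptions not stated in the slides: that capacity counted as "Unused" in a private cluster becomes "Spare" once it joins the shared pool, and that the three triples belong to the clusters in the order listed. The `Cluster` type and `pool` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cluster:
    name: str
    utilized: int
    spare: int
    unused: int


# Figures from slides 24 and 25; the assignment to clusters 1..3 is assumed.
clusters = [
    Cluster("Cluster 1", utilized=6, spare=0, unused=3),
    Cluster("Cluster 2", utilized=1, spare=6, unused=2),
    Cluster("Cluster 3", utilized=4, spare=3, unused=2),
]


def pool(members: List[Cluster]) -> Cluster:
    """Merge private clusters into one global pool.

    Assumption: capacity that sat unused in a private cluster is counted
    as spare capacity once it is available to every tenant.
    """
    utilized = sum(c.utilized for c in members)
    spare = sum(c.spare + c.unused for c in members)
    return Cluster("Global pool", utilized=utilized, spare=spare, unused=0)


pooled = pool(clusters)
print(pooled.utilized, pooled.spare, pooled.unused)  # 11 16 0
```

Total capacity is unchanged at 27 units; what changes is that no unit remains stranded in a cluster where it cannot be scheduled.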
  • 27. Containers with Fine-grain Resources. [Diagram label: Global Pool of Resources]
 ❖ Container resource levels adjusted dynamically per customer
 ➢ As dictated by business policy
 ❖ Fractional resource allocation
  • 28. Disaggregated Compute and Storage. [Diagram labels: Global Pool of Resources; DN, NM]
 ❖ Add more storage to Customer 1 cluster from a storage rich node
 ➢ While a compute intensive job from Customer 2 utilizes the available compute capacity on the same node
 Independently scale compute and storage