Docker is an open source platform that allows developers to package applications with dependencies into standardized units called containers that can run on any system with Docker installed. Containers isolate applications from one another and the underlying infrastructure using resource isolation provided by Linux kernel features like namespaces and cgroups. This allows developers to easily deploy and run distributed applications without having to rebuild them for each platform.
1. Docker
The What, Why and How
Souvik Maji
001310501056
BCSE UG IV
2. Docker
• Docker is an open source platform designed to make it easier to
create, deploy, and run applications by using containers.
• Automates the deployment of applications inside “application containers”.
• Released in 2013.
3. Stats around Docker
• 1,200 Docker contributors
• 100,000 Dockerized applications
• 3 to 4 million developers using Docker
• 300 million downloads
• 32,000 Docker-related projects
• Initial release in 2013
4. Container
• Operating system level virtualization.
• Containers allow developers to package up an application with all of
the parts it needs, such as libraries and other dependencies, and ship
it all out as one package.
• Finally: write once, run everywhere.
6. How Docker handles dependencies
• Docker solves the dependency problem using ‘layers’.
• A build is a sequence of commands, each step inheriting the result of the previous one.
• Dockerfile:
FROM ubuntu:12.04
RUN apt-get update && apt-get install -y python python-pip curl
RUN curl -sSL https://github.com/shykes/helloflask/archive/master.tar.gz | tar -xzv
RUN cd helloflask-master && pip install -r requirements.txt
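As a sketch of the build-and-layers workflow (the image tag and stand-in Dockerfile contents are illustrative, and the script skips silently on machines without a usable Docker daemon or network access):

```shell
# Skip silently when there is no usable Docker daemon or no network.
command -v docker >/dev/null 2>&1 || exit 0
docker info >/dev/null 2>&1 || exit 0
docker pull busybox >/dev/null 2>&1 || exit 0

# A minimal stand-in Dockerfile (the busybox base and the tag are illustrative).
workdir=$(mktemp -d)
cat > "$workdir/Dockerfile" <<'EOF'
FROM busybox
RUN echo hello > /greeting.txt
EOF

# Each instruction becomes one cached, read-only layer.
docker build -t layers-demo "$workdir" >/dev/null

# List the layers that make up the finished image.
docker history layers-demo
```

Re-running the build reuses the cached layers, which is why putting slow-changing steps (like installing dependencies) before fast-changing ones pays off.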
9. Virtualization
Virtualization refers to the act of creating a virtual (rather than actual) version of something, such as virtual computer hardware platforms, storage devices, and computer network resources, so that multiple virtual instances can run simultaneously on the same physical hardware.
14. chroot
• A chroot on Unix operating systems is an operation that changes the
apparent root directory for the current running process and its
children.
• Available since 1982
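A minimal chroot jail can be sketched as follows (this needs root and a statically linked busybox; the script skips silently when either is missing, and the jail directory and file names are illustrative):

```shell
# Skip unless running as root with ldd and a statically linked busybox.
[ "$(id -u)" -eq 0 ] || exit 0
command -v ldd >/dev/null 2>&1 || exit 0
bb=$(command -v busybox) || exit 0
# ldd succeeds on dynamically linked binaries; those cannot run inside the jail.
ldd "$bb" >/dev/null 2>&1 && exit 0

# Build a tiny root filesystem containing only busybox and one file.
jail=$(mktemp -d)
mkdir -p "$jail/bin"
cp "$bb" "$jail/bin/busybox"
echo "inside the jail" > "$jail/msg"

# Inside the jail, / is $jail; files outside it cannot even be named.
chroot "$jail" /bin/busybox cat /msg
```

The jailed process sees `/msg` at the root of its filesystem; the real path `$jail/msg` is invisible to it.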
15. Namespace
• Namespaces give processes their own isolated view of system resources; the user namespace, for example, allows a process root privileges inside a container but not outside.
• Initial release 2002
• Examples of resources that can be virtualized include process IDs,
hostnames, user IDs, network access, interprocess communication,
and filesystems
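The user namespace in particular can be demonstrated with util-linux's `unshare` (a sketch; it skips silently on kernels where unprivileged user namespaces are not permitted):

```shell
# Skip silently where unprivileged user namespaces are not available.
command -v unshare >/dev/null 2>&1 || exit 0
unshare --user --map-root-user true 2>/dev/null || exit 0

# The process's uid as seen outside the new namespace:
id -u
# Inside a fresh user namespace the same user is mapped to root (uid 0).
unshare --user --map-root-user id -u
```

This is exactly the trick that lets a containerized process act as root for its own resources while remaining unprivileged on the host.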
16. cgroup
• Cgroups allow processes to be grouped together and ensure that each group gets its share of memory, CPU, and disk I/O, preventing any one container from monopolizing these resources.
• Initial release 2007
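On a cgroup v2 system this can be sketched directly against the cgroup filesystem (requires root with the memory controller delegated at the root hierarchy; the group name and 64 MB limit are illustrative, and the script skips silently where any precondition is missing):

```shell
# Skip unless root on a cgroup v2 hierarchy with the memory controller enabled.
[ "$(id -u)" -eq 0 ] || exit 0
[ -f /sys/fs/cgroup/cgroup.controllers ] || exit 0
grep -qw memory /sys/fs/cgroup/cgroup.subtree_control 2>/dev/null || exit 0
mkdir /sys/fs/cgroup/cg-demo 2>/dev/null || exit 0

# Cap the group at 64 MB; any process moved into it is bound by the limit.
echo $((64 * 1024 * 1024)) > /sys/fs/cgroup/cg-demo/memory.max
limit=$(cat /sys/fs/cgroup/cg-demo/memory.max)
echo "memory.max = $limit"

# Clean up the demo group.
rmdir /sys/fs/cgroup/cg-demo
```

Docker's `--memory` and `--cpus` flags are thin wrappers over exactly this kind of write into the container's cgroup.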
17. LXC
• Userspace abstraction on top of cgroups and namespaces.
• Available since 2008
• LXC combines the kernel's cgroups and support for
isolated namespaces to provide an isolated environment for
applications. Docker can also use LXC as one of its execution drivers,
enabling image management and providing deployment services.
18. Docker
Another layer of abstraction on
top of LXC with even easier
tooling aimed at developers
looking for simple ways to
package their application.
20. Docker Images
• Docker can build images
automatically by reading the
instructions from a Dockerfile.
• Each Docker image references a
list of read-only layers that
represent filesystem differences.
• Layers are stacked on top of
each other to form a base for a
container’s root filesystem.
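The layer list behind any local image can be read out with `docker image inspect` (a sketch; it skips silently without a Docker daemon or when no image has been pulled yet):

```shell
# Skip silently when there is no usable Docker daemon.
command -v docker >/dev/null 2>&1 || exit 0
docker info >/dev/null 2>&1 || exit 0

# Pick any locally available image; skip if none exists yet.
img=$(docker image ls -q | head -n 1)
[ -n "$img" ] || exit 0

# Each entry is the content-addressed digest of one read-only layer.
docker image inspect --format '{{json .RootFS.Layers}}' "$img"
```

Because layers are addressed by content, two images built from the same base share those layers on disk instead of duplicating them.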
21. Filesystem
• A pluggable storage driver
architecture.
Supported drivers: OverlayFS,
AUFS, Btrfs, Device Mapper, VFS,
ZFS
• Once you decide which driver is best, you set it on the Docker daemon at start time.
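For example, the driver can be pinned in the daemon's configuration file (a sketch; the overlay2 choice is illustrative, and on most Linux installs the file lives at /etc/docker/daemon.json):

```json
{
  "storage-driver": "overlay2"
}
```

Changing the driver later leaves images written by the previous driver invisible until you switch back, so the driver is normally chosen once, before any images are pulled.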
22. Swarm Mode
• A swarm is a cluster of Docker engines, or nodes, where you
deploy services.
• A node is an instance of the Docker engine participating in the swarm.
• Worker nodes and Manager nodes.
• Services and tasks
• Ingress load balancing.
• DNS-based service discovery.
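A single-node swarm is enough to see these pieces in action (a sketch assuming a local Docker daemon; the script skips silently without one):

```shell
# Skip silently when there is no usable Docker daemon.
command -v docker >/dev/null 2>&1 || exit 0
docker info >/dev/null 2>&1 || exit 0

# Turn this engine into a one-node swarm (a no-op if it is already in one).
docker swarm init >/dev/null 2>&1 || true
[ "$(docker info --format '{{.Swarm.ControlAvailable}}')" = "true" ] || exit 0

# The single node acts as both a manager and a worker.
docker node ls
```

`docker swarm leave --force` undoes the init if you only wanted a quick look.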
Docker is an open source project to pack, ship and run any application as a lightweight container.
Docker separates an application from its infrastructure, so the infrastructure can be managed the same way you manage applications. Docker is a container management tool as well as an application deployment tool.
Docker began as an open-source implementation of the deployment engine which powered dotCloud, a popular Platform-as-a-Service. It benefits directly from the experience accumulated over several years of large-scale operation and support of hundreds of thousands of applications and databases.
Operating-system-level virtualization is a server virtualization method in which the kernel of an operating system allows the existence of multiple isolated user-space instances. Such instances, sometimes called containers, virtualization engines (VEs) or jails (FreeBSD jail or chroot jail), may look and feel like a real server from the point of view of their owners and users.
Everything at Google runs in a container. They use around 2 billion containers per week.
A common problem for developers is the difficulty of managing all their application's dependencies in a simple and automated way.
This is usually difficult for several reasons:
Modern large-scale applications depend upon a combination of system libraries and binaries, language-specific packages, framework-specific modules, internal components developed for other projects, etc.
Custom dependencies. A developer may need to prepare a custom version of their application's dependency. Some packaging systems can handle custom versions of a dependency, others can't, and all of them handle it differently.
Docker solves the problem of dependency hell by giving developers a simple way to express all their application's dependencies in one place. It simply orchestrates the use of existing packaging systems in a simple and repeatable way. How does it do that? With layers.
Docker defines a build as running a sequence of Unix commands, one after the other, in the same container. Build commands modify the contents of the container (usually by installing new files on the filesystem), the next command modifies it some more, etc. Since each build command inherits the result of the previous commands, the order in which the commands are executed expresses dependencies.
Docker doesn't care how dependencies are built - as long as they can be built by running a Unix command in a container.
A method of logically dividing mainframes to allow multiple applications to run simultaneously.
The hypervisor handles creating the virtual environment on which the guest virtual machines operate. It supervises the guest systems and makes sure that resources are allocated to the guests as necessary. The hypervisor sits in between the physical machine and virtual machines and provides virtualization services to the virtual machines. To realize it, it intercepts the guest operating system operations on the virtual machines and emulates the operation on the host machine's operating system.
Emulation, also known as a Type 2 hypervisor, is installed on top of the host operating system, which is responsible for translating guest OS kernel code into software instructions. The translation is done entirely in software and requires no hardware involvement. Emulation makes it possible to run any non-modified operating system that supports the environment being emulated. The downside of this type of virtualization is the additional system resource overhead, which decreases performance compared to other types of virtualization.
Paravirtualization, also known as a Type 1 hypervisor, runs directly on the hardware, or “bare metal”, and provides virtualization services directly to the virtual machines running on it. It helps the operating system, the virtualized hardware, and the real hardware collaborate to achieve optimal performance. These hypervisors typically have a rather small footprint and do not themselves require extensive resources.
In theory different virtualization formats should allow every developer to automatically package their application into a "machine" for easy distribution and deployment. In practice, that almost never happens, for a few reasons:
Size: VMs are very large, which makes them impractical to store and transfer.
Performance: running VMs consumes significant CPU and memory, which makes them impractical in many scenarios, for example local development of multi-tier applications, and large-scale deployment of CPU- and memory-intensive applications on large numbers of machines.
Portability: competing VM environments don't play well with each other. Although conversion tools do exist, they are limited and add even more overhead.
Hardware-centric: VMs were designed with machine operators in mind, not software developers. As a result, they offer very limited tooling for what developers need most: building, testing and running their software. For example, VMs offer no facilities for application versioning, monitoring, configuration, logging or service discovery.
Container-based virtualization, also known as operating-system-level virtualization, enables multiple isolated executions within a single operating system kernel. It has the best possible performance and density and features dynamic resource management. The isolated virtual execution environment provided by this type of virtualization is called a container and can be viewed as a traced group of processes.
Uses: testing and development, dependency control, recovery, privilege separation
A program that is run in such a modified environment cannot name (and therefore normally cannot access) files outside the designated directory tree. The modified environment is called a chroot jail.
Resource limiting – groups can be set to not exceed a configured memory limit, which also includes the file system cache
Prioritization – some groups may get a larger share of CPU utilization or disk I/O throughput
Accounting – measures a group's resource usage, which may be used, for example, for billing purposes
Control – freezing groups of processes, their checkpointing and restarting
Docker later replaced LXC with its own implementation, libcontainer, written in Go.
Daemon – builds images, runs and manages containers, exposes a RESTful API
Hub – provides Docker services: a library of public images, image storage (free and paid tiers), automated builds (triggered on commit)
Dockerfile, a text file that contains all the commands, in order, needed to build a given image. Dockerfiles adhere to a specific format and use a specific set of instructions.
Docker hub for sharing images. Sharing promotes smaller images.
Docker has a pluggable storage driver architecture. This gives you the flexibility to “plug in” the storage driver that is best for your environment and use-case. Each Docker storage driver is based on a Linux filesystem or volume manager. Further, each storage driver is free to implement the management of image layers and the container layer in its own unique way. This means some storage drivers perform better than others in different circumstances.
As a result, the Docker daemon can only run one storage driver, and all containers created by that daemon instance use the same storage driver.
Swarm: The cluster management and orchestration features embedded in the Docker Engine are built using SwarmKit. Docker engines participating in a cluster are running in swarm mode. You enable swarm mode for an engine by either initializing a swarm or joining an existing swarm.
The Docker Engine CLI and API include commands to manage swarm nodes (e.g., add or remove nodes), and deploy and orchestrate services across the swarm.
Worker nodes receive and execute tasks dispatched from manager nodes. By default manager nodes also run services as worker nodes, but you can configure them to run manager tasks exclusively and be manager-only nodes. An agent runs on each worker node and reports on the tasks assigned to it. The worker node notifies the manager node of the current state of its assigned tasks so that the manager can maintain the desired state of each worker.
A task carries a Docker container and the commands to run inside the container. It is the atomic scheduling unit of swarm. Manager nodes assign tasks to worker nodes according to the number of replicas set in the service scale. Once a task is assigned to a node, it cannot move to another node. It can only run on the assigned node or fail.
The swarm manager uses ingress load balancing to expose the services you want to make available externally to the swarm. The swarm manager can automatically assign the service a PublishedPort or you can configure a PublishedPort for the service. You can specify any unused port. If you do not specify a port, the swarm manager assigns the service a port in the 30000-32767 range.
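Publishing a service through the ingress routing mesh looks like this (a sketch; the service name, image, replica count and port are all illustrative, and the script skips silently unless it runs on a swarm manager with network access):

```shell
# Skip silently unless this engine is a swarm manager with network access.
command -v docker >/dev/null 2>&1 || exit 0
[ "$(docker info --format '{{.Swarm.ControlAvailable}}' 2>/dev/null)" = "true" ] || exit 0
docker pull nginx >/dev/null 2>&1 || exit 0

# Port 38080 on every node in the swarm now routes to some task of "web".
docker service create --name web --replicas 2 \
  --publish published=38080,target=80 nginx >/dev/null 2>&1 || exit 0

docker service ls
```

Had `--publish` been given no published port, the manager would have picked one from the 30000-32767 range, as described above.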
Manager nodes also perform the orchestration and cluster management functions required to maintain the desired state of the swarm.
In the replicated services model, the swarm manager distributes a specific number of replica tasks among the nodes based upon the scale you set in the desired state. For global services, the swarm runs one task for the service on every available node in the cluster.
Manager nodes elect a single leader to conduct orchestration tasks.
AWS, Ansible, Google Cloud Platform, IBM Bluemix, Kubernetes, Vagrant, Blue Box, VMware vSphere