What is Docker?
A feature that has been part of Unix systems pretty much forever is the concept of the ‘chroot’. This is a special system call that makes the calling process, from then on, see a chosen directory as the root of the file-system. If various files such as libraries are in their expected places beneath that directory, software can still run within this smaller view of the world. Chroot is a very useful feature. First, there are the obvious security applications, such as running potentially exploitable network services inside ‘chroot jails’: deliberately limited environments that contain enough files for a service to run without it having any access to the wider machine. Another useful trick with chroot, which many Linux system owners may need from time to time, is that a system can be recovered and run even if the bootstrap or kernel it is configured to use is corrupt in some way. See my article Xtra-PC in depth for a practical example of how this is used.
This system call is (well, is supposed to be) irreversible for the life of the process and any of its children. The only supposed way out of a ‘chroot jail’ is for all the jailed processes to die.
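As a rough sketch of what goes into such a jail, the following copies a shell and the libraries it needs into an empty directory. The make_jail name and the /tmp/jail path are purely illustrative, and a real jail for a network service would need rather more than this.

```shell
# Sketch: build a minimal chroot tree holding one shell and its libraries.
# make_jail is a hypothetical helper name, not a standard tool.
make_jail() {
    jail=$1
    mkdir -p "$jail/bin"
    cp /bin/sh "$jail/bin/sh"
    # ldd lists the shared libraries the shell needs; copy each one into the
    # same relative location so the shell can start inside the jail.
    for lib in $(ldd /bin/sh 2>/dev/null | grep -o '/[^ ]*'); do
        mkdir -p "$jail$(dirname "$lib")"
        cp "$lib" "$jail$lib"
    done
}

# Usage (the chroot command itself needs root privileges):
#   make_jail /tmp/jail
#   sudo chroot /tmp/jail /bin/sh
```

Inside the resulting shell only the copied files are visible, which is exactly the "smaller view of the world" described above.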
Cgroups is a feature that has been in Linux kernels for nearly a decade now and that, together with the closely related kernel namespaces, extends the old chroot idea in lots of other directions. As well as only seeing part of the file-system, groups of processes can be set up so that they cannot see each other (namespaces), have a limit placed on allowed memory usage or share of CPU time (Cgroups, or Control Groups), and can treat things that can be shared, such as network cards (physical and ‘virtual’), as if they are the only ones that see them. If this seems like virtualisation to you then yes, it is very close. The big difference, however, is that only a single Linux kernel is in place underneath it all. This means that this ‘containerised’ approach to partitioning off resources is much lighter weight than full-blown virtualisation. With full virtualisation a complete kernel (Operating System) has to be set up on top of virtual hardware. This imposes costs in terms of memory and startup time but does mean you can run e.g. Windows programs under a Linux host (or vice versa).
Docker is one of a number of technologies that leverage the Cgroups idea and extend it further, providing the tools to build “do one job well” microservices that scale over multiple physical machines.
The problem that Docker solves is that in today’s Internet-enabled software architecture the sensible way to build big things is out of smaller things, until the individual components get down to a size where they can be easily understood, tested, debugged and generally inspire confidence. What is not needed when building a large and complex piece of software is side effects between those components. Many of these components in a modern system are built using languages like Java, Python, Ruby and Perl that themselves rely on a myriad of other libraries from various sources.
Although programmers try not to break existing programs when improving their software, it often does happen. A bug may need to be fixed for package A to work reliably, but forcing that change on package B may cause it to behave in strange ways. The Docker approach is to start from an appropriate base image (in effect a chroot tree of all the libraries and other support files expected in e.g. a base distribution of Debian Jessie) and then use a build script known as a Dockerfile to install either exact or just ‘latest’ versions of only what the container needs for its specific job.
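To make the idea of a Dockerfile a little more concrete, here is a minimal sketch that writes one out. The write_dockerfile helper, the alpine:3.19 base tag and the python3 package are my own illustrative choices rather than anything the text above prescribes.

```shell
# Sketch: write a small Dockerfile that pins its base image to a known tag.
# write_dockerfile is a hypothetical helper; base tag and package are assumptions.
write_dockerfile() {
    dir=$1
    mkdir -p "$dir"
    cat > "$dir/Dockerfile" <<'EOF'
# Start from a known base image rather than whatever "latest" is today.
FROM alpine:3.19
# Install only what this container's one job needs.
RUN apk add --no-cache python3
# Default command when a container is started from the image.
CMD ["python3", "--version"]
EOF
}

# Usage:
#   write_dockerfile ./demo
#   docker build -t demo ./demo
```

Pinning the base tag means package A's container can move forward without dragging package B's container along with it.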
Docker avoids what used to be called on Windows computers “DLL Hell”, where it was impossible to have two different applications installed on the same machine because their shared library needs were incompatible. Linux machines have traditionally solved this issue by having a custom-compiled version of the Perl or Python or whatever is needed by a fussy application installed in a non-standard location, and overriding the default PATH and LD_LIBRARY_PATH settings for that application. This is a little messy and error prone. Docker improves on it, as within each container everything can live in the expected place.
Splitting individual parts of the system up into microservice chunks forces the whole design to use only communication methods that are visible and explicit. Normally this would be via TCP or UDP on either internal or external network interfaces, but Docker does also allow areas of the host filesystem to be shared with one or more of the containerised services, either read/write or read only.
How does this all not result in huge amounts of storage usage? The answer is Docker’s use of another Unix/Linux technology that has been around a long time: the union filesystem. One of the earliest practical uses of the union filesystem in Linux was the 1995 distribution Linux-FT. This was a ‘Live CD’ that cached any programs actually used to a much smaller file living on the host Windows hard disk. This allowed the system to start fast, get faster as it was used and only use an absolute minimum of the then very expensive hard disk space. At that time a 600MB CD was much bigger than most available hard disks! This trick was all done using a union filesystem of the read-only CD and a writeable filesystem inside a single file on the Windows disk.
Docker takes the use of union filesystems to a whole new level. Each action taken to customise a Docker image to make a new image results in a new filesystem layer. These layers are only as large as they need to be to contain the files that the new operation has changed, e.g. the installation of a specific version of a package. Even Docker images that are downloaded from the Hub may consist of dozens of these layers, as can be seen from the download process.
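The layers of an image that is already on your machine can be listed with docker history. A minimal sketch, where show_layers is just a hypothetical wrapper name of mine:

```shell
# Sketch: list the layers of a local image, one line per layer, newest first.
# show_layers is a hypothetical helper name.
show_layers() {
    docker history "$1"
}

# Usage:
#   show_layers alpine
```

Each line of the output corresponds to one build step and shows how much that step added to the image.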
This extreme stratification gives opportunities for layer caching and reuse and provides an audit trail that should keep the security team happy.
A handy Docker commands crib
Installing Docker depends on which flavour of Linux you are using. For example, for CentOS 7 the following needs to go into /etc/yum.repos.d/docker.repo:
[dockerrepo]
name=Docker Repository
baseurl=https://yum.dockerproject.org/repo/main/centos/7/
enabled=1
gpgcheck=1
gpgkey=https://yum.dockerproject.org/gpg
Then install & start it with
sudo yum install docker-engine
sudo systemctl enable docker.service
sudo systemctl start docker
Note that by default only the root user is permitted to start Docker containers. To enable other users, add them to the docker group (the change takes effect at their next login):
sudo groupadd docker
sudo usermod -aG docker yourusername
Basic Container Management
docker ps
Shows any currently running Docker containers. The extra -l option shows just the most recently created container (two lines of output at the most).
docker ps -a
Shows all docker containers both active and stopped.
docker pause
Suspends execution of a running container. Remember, this is one of the neat tricks that Cgroups gives you.
docker unpause
Resumes a container that was suspended. Think of a laptop being woken from suspend.
docker rm
Removes a stopped container and forgets about it.
docker rm -f
Removes a container even while it is running! – think of the equally powerful rm -f command.
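Putting the commands in this section together, here is a sketch that walks one throwaway container through its life cycle. It assumes that docker pause and docker unpause are the suspend and resume commands meant above; the container name crib-demo and the sleep 300 workload are placeholders of mine.

```shell
# Sketch: take a throwaway container through start, suspend, resume and removal.
# lifecycle_demo and the default name "crib-demo" are hypothetical.
lifecycle_demo() {
    name=${1:-crib-demo}
    docker run -d --name "$name" alpine sleep 300   # start in the background
    docker ps -l                                    # show the latest container only
    docker pause "$name"                            # freeze its processes
    docker unpause "$name"                          # let them continue
    docker rm -f "$name"                            # stop and remove in one go
}

# Usage:
#   lifecycle_demo
```

Running docker ps -a afterwards should no longer list the container.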
docker inspect image or container
This spits out details about the configuration of the image or the container as a block of JSON. This has the advantage that it is both easy for humans to read and immediately usable from any programming language that has a JSON parser.
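When only one value is wanted, docker inspect can apply a Go template to the JSON through its --format option instead of printing the whole block. A small sketch, with inspect_field being a hypothetical helper name:

```shell
# Sketch: print a single field from the inspect output.
# inspect_field is a hypothetical helper.
#   $1 = image or container name
#   $2 = Go template selecting the field, e.g. '{{.Config.Cmd}}'
inspect_field() {
    docker inspect --format "$2" "$1"
}

# Usage:
#   inspect_field alpine '{{.Config.Cmd}}'
```

The field names in the template follow the keys of the JSON that the plain docker inspect prints.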
Okay but how do I get containers onto my computer?
Docker images come from the centralised repository called the Docker Hub. If you have registered for a Docker ID you will be able to contribute your own images to this resource, but without one you can still choose to consume other people’s images (and add more layers to them to make them your own).
docker search term
Looks on the Docker Hub for images matching your search term. For example, Alpine Linux is a very simple and minimal Linux layout ideally suited for use inside containers. Core images officially supported by Docker have names of just one word. Images contributed by third parties are in the form author/image. Puppet uses a very similar convention for its Forge too.
docker search -f is-official=true alpine
NAME     DESCRIPTION                                     STARS   OFFICIAL   AUTOMATED
alpine   A minimal Docker image based on Alpine Lin...   1759    [OK]
The -f is-official=true filter we are using here limits the output to just the base image that Docker officially sanctions. Other matches mentioning ‘alpine’ are most probably based on this, but have the hard work of adding other software expressed as those nice union filesystem layers. An image is a collection of such layers; an image that has other images derived from it cannot be removed unless those images are also removed.
docker run -it alpine /bin/sh
Unable to find image 'alpine:latest' locally
latest: Pulling from library/alpine
0a8490d0dfd3: Pull complete
Digest: sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8
Status: Downloaded newer image for alpine:latest
/ #
What this is doing is running a chosen command (/bin/sh) within the image called ‘alpine’ using an interactive terminal. If the alpine image is not already present locally then it is downloaded; since we did not ask for a specific version, the ‘latest’ tag is assumed. (An image that is already present is reused as it stands; use docker pull to fetch a newer one from the hub.) If there are several union layers that make up the image they all get downloaded simultaneously.
We show this by now repeating the run command for one of the other alpine based images on the hub:
docker run -it mritd/alpine /bin/sh
Unable to find image 'mritd/alpine:latest' locally
latest: Pulling from mritd/alpine
0a8490d0dfd3: Already exists
f22eacae62c9: Pull complete
Digest: sha256:98023aa60f8398432f5d62ccd16cea02279bb9efc109244d93d441dd09100e18
Status: Downloaded newer image for mritd/alpine:latest
/ #
This shows clearly that ‘mritd/alpine’ is based on the alpine files we have already downloaded plus an extra layer of some customisation or other adaptation that the contributor felt worth sharing (not important for the purpose of this discussion).
Chunks of storage called volumes on the host can be added (mounted in) to the container with the -v argument to run. If a path in the container is given on its own then the attached storage will be allocated by Docker, but will still exist outside of the union filesystem ‘layer cake’ of the image. If a colon-separated pair of paths is used, the host path is mounted into the container at the destination path.
docker run -v `pwd`:/tmp/x -it alpine /bin/sh
This will run that super simple Alpine Linux image, but if you cd to /tmp/x in the shell you will find whatever was in the current directory on the host. If you would like the container to be able to look but not touch the data you are sharing, add :ro to the end of the -v argument.
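A sketch of the read-only variant, keeping the same /tmp/x mount point as above. The ro_shell name is a hypothetical helper of mine:

```shell
# Sketch: open a shell in an Alpine container with a host directory mounted
# read-only at /tmp/x. ro_shell is a hypothetical helper name.
ro_shell() {
    docker run --rm -it -v "$1":/tmp/x:ro alpine /bin/sh
}

# Usage:
#   ro_shell "$(pwd)"
```

Inside that shell, attempts to create or change files under /tmp/x should be refused, while reading them works as before.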
Adding volume mounts means that the data lives outside of the life-cycle of that particular Docker container, and is not subject to the size constraints of the original filesystem container, which is usually only a few GB. Volumes can also be created using the docker volume create command and can live on shared storage systems, making them independent of any one host. See this documentation on flocker opening up to Docker ‘swarms’ operating over large numbers of hosts. This allows a software service to be rolled out in the modern fault tolerant “shared nothing” architecture without needing lots of dedicated physical machines, or even dedicated virtual machines.
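To see that a named volume really does outlive the containers using it, something like the following sketch can be tried on a single host. The volume name crib-data and the file written into it are placeholders of mine, and shared storage back ends are not covered here.

```shell
# Sketch: a named volume keeps its data after the container that wrote it is gone.
# volume_demo and the default name "crib-data" are hypothetical.
volume_demo() {
    vol=${1:-crib-data}
    docker volume create "$vol"
    # The first container writes a file into the volume, then exits and is removed.
    docker run --rm -v "$vol":/data alpine sh -c 'echo hello > /data/greeting'
    # A second, brand new container still finds the file there.
    docker run --rm -v "$vol":/data alpine cat /data/greeting
    # Tidy up the volume when finished with it.
    docker volume rm "$vol"
}

# Usage:
#   volume_demo
```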
Going in the other direction it is possible to narrow sharing down to an individual file level too:
docker run --rm -it -v ~/.bash_history:/root/.bash_history ubuntu /bin/bash
This example allows the shell within an Ubuntu container to share your bash history and even add commands to it while running as root within that container. A handy trick while you are still experimenting with what needs to go into that Dockerfile...
Lastly, if a container has mounted volumes you can create a new container that mounts the same volumes, in effect sharing the same data areas, by using the --volumes-from flag to run. This, together with the shared storage paradigm, allows large groups of distributed services all sharing the same underlying data to be created.
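A sketch of reusing another container's volumes. The share_volumes name is hypothetical, and the container passed in is assumed to exist already and to have volumes mounted:

```shell
# Sketch: start a shell in a new container that sees the same volumes as an
# existing one. share_volumes is a hypothetical helper name.
#   $1 = name or ID of the existing container
share_volumes() {
    docker run --rm -it --volumes-from "$1" alpine /bin/sh
}

# Usage:
#   share_volumes some-existing-container
```

The volumes appear in the new container at the same paths they have in the original one.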
docker images
This shows what images we have locally (contrast this with docker ps for running and suspended instances). This consists of both images that have been downloaded from the Hub and those we have made ourselves by applying a Dockerfile to the base of an existing image.
docker rmi
Rmi stands for remove image. Eventually you could end up with lots and lots of images you no longer have any use for. The rmi command will remove any image that is not referenced by another. In the above case, if we want to docker rmi alpine we would need to add a -f (force) flag because we have another image dependent on it:
docker rmi -f alpine
Untagged: alpine:latest
Untagged: alpine@sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8
Deleted: sha256:88e169ea8f46ff0d0df784b1b254a15ecfaf045aee1856dca1ec242fdd231ddd
If dependencies are more complex it is not possible to use -f; some scripting will be needed to find all the dependent images and remove them first.
docker build
If you want to make custom changes to an image, create a directory with a Dockerfile text file plus any custom content you wish to go into the image (e.g. content for a web server). You can give the resulting image any name you like that is not already in use by another image. With the Dockerfile and command line options you can control aspects such as what network ports are accessible within the image, how much memory and CPU time it is permitted to use, and what parts of the host filesystem are exposed to it. Volume mappings as explained above in docker run can also be specified in the Dockerfile. When the built images are run with docker run it is common for ports in the image to be remapped so that they are visible from the outside world. For example, you can have several web services each configured within its container to operate on port 80 but accessible from the outside world on ports 8081, 8082 etc. Whole sets of microservices can be orchestrated using Docker Compose in a similar manner to how Amazon CloudFormation or OpenStack Heat templates work.
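A sketch of the build-then-run sequence, with the port remapping and resource limits given on the command line. The image name crib-web, the 8081 to 80 mapping and the limit values are illustrative assumptions, and I believe the --cpus flag needs Docker 1.13 or later.

```shell
# Sketch: build an image from a directory holding a Dockerfile, then run it
# with a remapped port and resource limits.
# build_and_run and the default image name "crib-web" are hypothetical.
build_and_run() {
    dir=$1
    image=${2:-crib-web}
    docker build -t "$image" "$dir"
    # Host port 8081 is mapped to port 80 inside the container; memory is
    # capped at 256 MB and CPU usage at half of one core.
    docker run -d -p 8081:80 --memory 256m --cpus 0.5 "$image"
}

# Usage:
#   build_and_run ./mysite
```

A second service listening on port 80 inside its own container could be started the same way with -p 8082:80.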
If you have signed up for a hub account you can use those credentials to log into it and then docker push images you have created with docker build to it. This allows containerised infrastructure to be developed and then stored centrally for deployment.
This is not an exhaustive set of what Docker can do; what I have tried to do is give a useful summary and pointers for further study.
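The usual sequence is to log in, tag the local image under your own Docker ID and then push it. A sketch, where publish_image is a hypothetical helper and both arguments are placeholders for your own image name and Docker ID:

```shell
# Sketch: publish a locally built image to the Docker Hub.
# publish_image is a hypothetical helper name.
#   $1 = local image name
#   $2 = your Docker ID
publish_image() {
    local_image=$1
    hub_user=$2
    docker login                                    # prompts for credentials
    docker tag "$local_image" "$hub_user/$local_image"
    docker push "$hub_user/$local_image"
}

# Usage:
#   publish_image crib-web yourdockerid
```

The tag step gives the image the author/image style of name that the Hub uses for third party contributions.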