You have heard it a million times. Docker here, containers there. The hype around Docker has been big – and there is a good reason for it. Before we start, read this stolen definition, twice.
Docker is a set of platform-as-a-service (PaaS) products that use operating-system-level virtualization to deliver software in packages called containers. Containers are isolated from one another and bundle their own software, libraries and configuration files; they can communicate with each other through well-defined channels. All containers are run by a single operating-system kernel and are thus more lightweight than virtual machines. – Wikipedia
If you have ever worked with traditional VMs, you know that each VM runs its own operating system. This is, I would say, the most important fact to mention when somebody asks you how Docker containers differ from VMs: Docker containers are all run by a single operating-system kernel. Compared to VMs, the performance penalty is negligible.
Keywords you should know
Docker daemon
A service that runs directly on the host operating system (Linux). It exposes an API which allows clients to create and manage Docker objects (you can find most of them below).
Docker client
A client is anyone or anything that uses the API exposed by the Docker daemon. Every time you use the docker command, you are a client. A client can connect to any daemon via its URI (by default unix:///var/run/docker.sock).
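To make the client–daemon split concrete, here is a minimal sketch. The socket path is the Linux default; the TCP address is purely hypothetical and only works if a remote daemon has been configured to listen on TCP:

```shell
# Talk to the local daemon explicitly via its Unix socket:
docker -H unix:///var/run/docker.sock version

# Or point the client at a (hypothetical) remote daemon over TCP:
export DOCKER_HOST=tcp://192.168.1.10:2375
docker version
```

Every docker subcommand you run later in this guide goes through this same client-to-daemon API call.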
Dockerfile
A file that specifies the steps to build a Docker image. The steps are executed one by one: the second step runs on top of the result of the first step, the third on top of the result of the second, and so on. We will examine the most frequently used commands and create a Dockerfile in a while.
FROM ubuntu:18.04
COPY . /app
RUN make /app
CMD python /app/app.py
Build context
When you want to build a Docker image from a Dockerfile, you need to specify the path to the build context. The Docker client then sends this build context to the Docker daemon. A Dockerfile must be present in the root of the build context. This effectively means that images are not built on the client side. It also means that if you are connected to the daemon over a network, the entire build context will be sent over that network. If you store huge files anywhere inside the build context, build time and network traffic will suffer. To exclude files or folders from the build context, it is recommended to use a .dockerignore file, which works in the same way as .gitignore does for Git.
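For example, a .dockerignore in the root of the build context might look like this (the patterns are purely illustrative):

```
# Exclude version control metadata and local artifacts from the build context
.git
node_modules
*.log
build/
```

Anything matching these patterns is never sent to the daemon, which keeps builds fast even over a network connection.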
Layer
Each step results in a Docker layer. This is an important part of Docker which many developers tend to ignore. Layers enable the daemon to make use of caching: if you change only the last step in the Dockerfile, the previous layers are reused from the cache and the build is much faster, as only the last step is executed.
Image
A Docker image is the result of a build – effectively a collection of layers, one on top of another. Docker uses a union file system (there are several implementations) to combine the layers together.
Image name and tag
Every Docker image has at least a simple name. On top of that, images can have tags, which usually specify their flavour (base image, version, etc.):
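The general shape of an image reference, with a few examples (the private registry host is hypothetical):

```
[registry-host[:port]/][user-or-org/]name[:tag]

ubuntu                              # official image, implicit tag "latest"
ubuntu:18.04                        # official image, specific version
username/myimage:v1                 # user repository on Docker Hub
registry.example.com/team/app:2.0   # image in a private registry
```

When no registry host is given, the official Docker Hub registry is assumed; when no tag is given, latest is assumed.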
Registry
Images are usually stored in a public or private Docker registry. The same way you would use Git for your code, you can push/pull images to/from a registry. The registry again respects image layers and will upload/download only the layers that changed, which is signalled by a change of the layer hash.
In case you do not use the official docker.io registry, you have to log in to the registry using:
docker login <registry.url>
Container
The most interesting part – the part which actually runs and does the job! A Docker container is “an image that was spun up”. The best practice is to have a single-process container with a strict responsibility. The beauty of containers lies in how easily they can be started and destroyed: if something misbehaves in a container, we can kill it and spawn a new one instead. Another selling point is the ease of scaling out. If one container is not capable of handling all the requests, you can start a second one. A third one. Or you can have hundreds of them and split the load across multiple containers running on multiple hosts on multiple continents (in another starter, we will take a look at a container orchestrator – Kubernetes).
Volume
By default, all the changes / disk writes you make (or the container itself makes) are lost when the container is stopped and removed. This is completely fine for running apps which do not need to write anything to disk. On the other hand, it is hard to imagine that you would lose all your data when a database container dies. Therefore, you can define a volume, either directly in the Dockerfile or when you start a container.
There are 3 types of volumes: host volumes, named volumes and anonymous volumes. They all serve the purpose of persisting specific paths of the file system. A host volume binds a path from the host operating system – the one where the Docker daemon runs. This is frequently used in development, as you can, in other words, share a path from your disk with a container. This gives you direct access to database files, logs or any other resource.
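As a sketch, the three volume types map onto the -v flag of docker run roughly like this (the paths, the volume name and the image name are illustrative):

```shell
# Host volume: bind /home/me/data on the host to /var/lib/data in the container
docker run -v /home/me/data:/var/lib/data myimage

# Named volume: Docker manages the storage location, you reference it by name
docker run -v mydata:/var/lib/data myimage

# Anonymous volume: like a named volume, but with a generated name
docker run -v /var/lib/data myimage
```

In all three cases, whatever the container writes under /var/lib/data survives the container being stopped and removed.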
Workflow and container life cycle
We discussed some basic keywords used in the Docker world. Now is the right time to take a look at how to actually work with the Docker client, images, containers and volumes. This guide assumes you have Docker already installed. If you don’t, please refer to the official tutorial.
Let’s first create a Dockerfile from scratch. Create a working directory with any name, create a file named Dockerfile (without an extension) in it, and paste the following content into it:
FROM busybox
# FROM tells the Docker daemon which image should be taken as a base to build
# ours

RUN echo "Hello world!"
# As we discussed, every step in this file creates a new layer. RUN in this case
# creates a layer on top of busybox. This layer only echoes a text and ends.

ENTRYPOINT ["watch"]
# ENTRYPOINT specifies the command which should run after the container is
# started

CMD ["-n", "1", "date"]
# CMD specifies the command arguments which should be used together with
# ENTRYPOINT
Now, from the working directory, we can build this Dockerfile into a Docker image using
docker build -t myimage . where myimage stands for an image name. If you now use
docker image ls, you will see myimage in the list of available images. Here I suggest you run the build again, just to see layer caching in action. The build will be much faster, as you already have the base busybox image locally and the Dockerfile has not changed. To play around, you can change the CMD step to watch the date not every second, but every 5 seconds using
CMD ["-n", "5", "date"]. If you now rebuild the image, you will see that the cache is used up to ENTRYPOINT and only the last layer is rebuilt.
What also happened automatically is that the busybox image was pulled from the official Docker registry, as you most probably did not have it locally yet. Therefore you will see this image listed in
docker image ls as well. As we have already built our image, we do not need busybox anymore, and to save disk space we can remove it using
docker image rm busybox. You saw that we (automatically) pulled an image from a registry – that means someone or something must have pushed this image to the registry before. We can push our new image to a registry as well, but first you either need to start your own private registry, or you can simply create an account in the official public registry and push it there. The official public registry is called Docker Hub. After creating an account, you have to log in using your credentials:
docker login --username=username. If you do not specify the URL of a Docker registry, the official one will be used.
Before we push the image, we need to prefix the image name with the repository name, which in this case must match the username:
docker tag myimage username/myimage:v1. The v1 tag we used is optional; the default tag is latest. As we are now logged in and the image is tagged, we can push our newly built image using
docker push username/myimage:v1.
If you want, you can now delete all local images and change the working directory, to see that myimage is now available to pull worldwide. Let’s create our first container from this image using
docker run -d --name mycontainer myimage (the -d flag runs the container in the background, so we get our terminal back). To verify that the container is now running, we can execute
docker ps -a to list all running and also stopped containers. To see that it does what we specified in the Dockerfile, let’s print the logs using
docker logs mycontainer. If we want, we can also exec into running container, which is very useful for container debugging:
docker exec -ti mycontainer sh. sh is the command we want to execute in the container (the busybox image does not ship bash) and -ti makes sure we get an interactive tty. Finally, we can stop the container
docker stop mycontainer and remove it
docker rm mycontainer.
With this, we covered the fundamentals of Docker and the container life cycle, which are visualized in the following picture.
Of course, there is a lot more. A LOT more. Containers usually expose a port so they are reachable from the outside world, they define volumes and they communicate with each other over Docker networks. We will cover this in the Docker Compose starter next time!