Andrew M McCall

Docker: Important Topics & Notes

Some basic docker commands for reference. My personal important topics and notes for Docker.

Basic Docker Commands

docker container run first looks for the image in the local image cache. If it doesn’t find it there, it looks in the remote image repository, which defaults to Docker Hub. If no tag is specified, it chooses latest, then creates a new container based on that image and prepares to start it. It gives the container a virtual IP on a private network inside the Docker engine. If you publish a port (for example with --publish 80:80), it opens port 80 on the host and forwards it to port 80 in the container. Finally, the container starts.

What is a container?

A container is not a virtual machine. It is just a process running on your host operating system. It is a restricted process inside the host operating system and nothing like a virtual machine.

See what is going on inside docker container

docker container run -d --name nginx nginx && docker container run -d --name mysql -e MYSQL_RANDOM_ROOT_PASSWORD=true mysql

Getting a shell inside of the container

Docker Networks: Overview

Batteries Included, But Removable

❯ docker container port webhost
80/tcp -> 0.0.0.0:80
80/tcp -> [::]:80
❯ docker container inspect --format '{{ .NetworkSettings.IPAddress  }}' webhost
172.17.0.2

The firewall blocks incoming traffic by default, and Docker container traffic is NAT’d. Docker creates virtual networks with names like bridge/docker0, and each container attaches to one of them. Publishing a port tells the host machine to forward anything arriving on that host port to the matching port that is open in the container.

Note: you can’t have two containers listening on the same port at the host level, i.e., you can’t have two containers both publishing host port 80.

Docker Networks: CLI Management

Docker Networks: DNS

Note: Static IPs and using IPs for communicating between containers is an anti-pattern. Do your best to avoid it.

The Docker daemon has a built-in DNS server that containers use by default.

Docker defaults the hostname to the container’s name, but you can also set aliases.

❯ docker container exec -it my_nginx ping new_nginx
PING new_nginx (172.18.0.2): 56 data bytes
64 bytes from 172.18.0.2: seq=0 ttl=64 time=0.073 ms
64 bytes from 172.18.0.2: seq=1 ttl=64 time=0.028 ms
64 bytes from 172.18.0.2: seq=2 ttl=64 time=0.026 ms

This solves a problem because you can’t predict whether a given container is going to exist, where it is going to be running, or which IP it is going to get.

For the default bridge network, it does not have the DNS server built into it by default. But you can use the --link option to specify manual links for the default bridge network.

Containers should not rely on IPs for inter-communication. DNS for friendly names is a built-in Docker feature via custom networks. Custom networks are the best solution for inter-communication between containers.

We can then run yum update curl for example to update curl.

The advantage of the --rm flag is it removes the container after you are done.

Round Robin (kinda)

Getting started with Docker Images

Images are the building blocks of containers. For example, they can be obtained from the Docker Hub registry.

What is in an image:

There is only one “official” version of each docker image, which is in part maintained by the docker team. If someone creates a custom image similar to an official image such as nginx it will always start with the username prefix, i.e., mydockeruser/nginx-proxy.

In some ways, docker hub is kind of the package manager system for containers.

Image Layers - Image Cache Information

Some important topics include:

A history of the image layers. Every image starts from the very beginning with a blank layer known as scratch

❯ docker history nginx:latest
IMAGE          CREATED       CREATED BY                                      SIZE      COMMENT
b52e0b094bc0   4 weeks ago   CMD ["nginx" "-g" "daemon off;"]                0B        buildkit.dockerfile.v0
<missing>      4 weeks ago   STOPSIGNAL SIGQUIT                              0B        buildkit.dockerfile.v0
<missing>      4 weeks ago   EXPOSE map[80/tcp:{}]                           0B        buildkit.dockerfile.v0
<missing>      4 weeks ago   ENTRYPOINT ["/docker-entrypoint.sh"]            0B        buildkit.dockerfile.v0
<missing>      4 weeks ago   COPY 30-tune-worker-processes.sh /docker-ent…   4.62kB    buildkit.dockerfile.v0
<missing>      4 weeks ago   COPY 20-envsubst-on-templates.sh /docker-ent…   3.02kB    buildkit.dockerfile.v0
<missing>      4 weeks ago   COPY 15-local-resolvers.envsh /docker-entryp…   389B      buildkit.dockerfile.v0
<missing>      4 weeks ago   COPY 10-listen-on-ipv6-by-default.sh /docker…   2.12kB    buildkit.dockerfile.v0
<missing>      4 weeks ago   COPY docker-entrypoint.sh / # buildkit          1.62kB    buildkit.dockerfile.v0
<missing>      4 weeks ago   RUN /bin/sh -c set -x     && groupadd --syst…   117MB     buildkit.dockerfile.v0
<missing>      4 weeks ago   ENV DYNPKG_RELEASE=1~bookworm                   0B        buildkit.dockerfile.v0
<missing>      4 weeks ago   ENV PKG_RELEASE=1~bookworm                      0B        buildkit.dockerfile.v0
<missing>      4 weeks ago   ENV NJS_RELEASE=1~bookworm                      0B        buildkit.dockerfile.v0
<missing>      4 weeks ago   ENV NJS_VERSION=0.8.9                           0B        buildkit.dockerfile.v0
<missing>      4 weeks ago   ENV NGINX_VERSION=1.27.4                        0B        buildkit.dockerfile.v0
<missing>      4 weeks ago   LABEL maintainer=NGINX Docker Maintainers <d…   0B        buildkit.dockerfile.v0
<missing>      4 weeks ago   # debian.sh --arch 'amd64' out/ 'bookworm' '…   74.8MB    debuerreotype 0.15

Note: <missing> indicates that they are layers inside the image, but not actually images themselves.

When creating a new image, we start with one layer, and every layer gets its own unique SHA to help identify whether this layer is the same as another layer on the system.

Because the SHA is derived from the layer’s content, two layers with the same SHA are guaranteed to be the same layer.
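The idea of identifying a layer by a hash of its content can be sketched in plain shell. This is only an illustration and not how Docker stores layers on disk: it assumes sha256sum from GNU coreutils is installed, and the layer file names are made up.

```shell
#!/bin/sh
# Illustration only: identical content always produces an identical SHA-256 digest.
# Assumes `sha256sum` (GNU coreutils); the layer file names are hypothetical.
workdir=$(mktemp -d)

printf 'apt-get install curl\n' > "$workdir/layer_a"
printf 'apt-get install curl\n' > "$workdir/layer_b"   # same content as layer_a
printf 'apt-get install vim\n'  > "$workdir/layer_c"   # different content

# Print only the hex digest of a file.
digest() { sha256sum "$1" | cut -d ' ' -f 1; }

digest_a=$(digest "$workdir/layer_a")
digest_b=$(digest "$workdir/layer_b")
digest_c=$(digest "$workdir/layer_c")

[ "$digest_a" = "$digest_b" ] && echo "layer_a and layer_b share a digest: stored once"
[ "$digest_a" != "$digest_c" ] && echo "layer_c has its own digest: stored separately"

rm -rf "$workdir"
```

Two images that share the same base layers therefore only need those layers stored once in the local cache.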

Container Layer

Let’s say we have an apache image and we want to run a container off of it. Docker creates a new read-write layer on top of the apache image. Underneath, the storage driver used by Docker layers all of these changes on top of each other like a stack of pancakes.

Copy On Write: When a container changes a file that lives in the image, the filesystem copies that file out of the image into the container layer and applies the change there.

From O’Reilly:

Docker uses the copy-on-write technique when dealing with images. Copy-on-write is a strategy of sharing and copying files for maximum efficiency. If a layer uses a file or folder that is available in one of the low-lying layers, then it just uses it. If, on the other hand, a layer wants to modify, say, a file from a low-lying layer, then it first copies this file up to the target layer and then modifies it.

Source: O’Reilly - Learn Docker
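A toy model of copy-on-write can be built with two plain directories. This is a sketch under stated assumptions, not Docker's real storage driver: the "image" directory stands in for the read-only image layers, the "container" directory for the writable container layer, and the function names cow_read and cow_append are made up for the example.

```shell
#!/bin/sh
# Toy model of copy-on-write using two directories (not Docker's real storage driver).
# "image" stands in for the read-only image layers, "container" for the writable layer.
root=$(mktemp -d)
mkdir "$root/image" "$root/container"
printf 'Listen 80\n' > "$root/image/httpd.conf"

# Read: prefer the container layer, fall back to the image layer.
cow_read() {
    if [ -f "$root/container/$1" ]; then
        cat "$root/container/$1"
    else
        cat "$root/image/$1"
    fi
}

# Write: copy the file up into the container layer first, then modify the copy.
cow_append() {
    if [ ! -f "$root/container/$1" ]; then
        cp "$root/image/$1" "$root/container/$1"
    fi
    printf '%s\n' "$2" >> "$root/container/$1"
}

before=$(cow_read httpd.conf)                # served from the image layer
cow_append httpd.conf 'Listen 8080'          # triggers the copy-up
after=$(cow_read httpd.conf)                 # now served from the container layer
image_copy=$(cat "$root/image/httpd.conf")   # the image layer is untouched

printf 'before: %s\n' "$before"
printf 'after:\n%s\n' "$after"
printf 'image still has: %s\n' "$image_copy"

rm -rf "$root"
```

The image file never changes, which is why many containers can share one image safely.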

Docker Image Inspect

Inspect gives you back the metadata. Besides the image ID and its tags, you get all sorts of details about how this image expects to be run. For example, it can tell you what ports you need to open up if you want to accept connections.

You can see Environment variables, and commands that it is going to run when it starts up.

It can also tell us useful information such as "Architecture": "amd64" and "Os": "linux".

Image Tagging & Pushing To Docker Hub

Important Concepts:

❯ docker image tag --help
Usage:  docker image tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]

Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE

Aliases:
  docker image tag, docker tag

Images don’t technically have a name, but we refer to them as if they do.

This can be proved by running docker image ls. Notice the absence of a name column?

❯ docker image ls
REPOSITORY                                TAG          IMAGE ID       CREATED         SIZE
fusion-webpack                            latest       7b1fc89f614a   47 hours ago    860MB
<none>                                    <none>       112d76331e05   47 hours ago    860MB
washpost/fusion-engine                    latest       5dd6e4b3bdac   3 days ago      708MB
washpost/fusion-origin                    latest       91e87355e57e   10 days ago     227MB
washpost/fusion-cache-proxy               latest       3c7a3c706170   10 days ago     24.6MB
memcached                                 latest       36b77029f362   2 weeks ago     84.8MB
fusion_zip-zip                            latest       a30af9a3ae21   2 weeks ago     14.8MB
fusion_verify-verify                      latest       8fbe48cf5c75   2 weeks ago     716MB
alpine                                    latest       aded1e1a5b37   3 weeks ago     7.83MB
nginx                                     alpine       1ff4bb4faebc   4 weeks ago     47.9MB
nginx                                     latest       b52e0b094bc0   4 weeks ago     192MB
ubuntu                                    latest       a04dc4851cbc   5 weeks ago     78.1MB
httpd                                     latest       0de612e99135   6 weeks ago     148MB
mysql                                     latest       5568fddd4f66   6 weeks ago     797MB
washpost/fusion-resolver                  latest       065d0b72c223   2 months ago    623MB
mariadb                                   latest       6722945a6940   3 months ago    407MB
washpost/pb-editor-api                    dev          965551d554a9   9 months ago    751MB
pagebuilderteam/arc-themes-stylebuilder   latest       c689863d8e3f   11 months ago   203MB
washpost/fusion-cli-api                   production   9235387626da   12 months ago   973MB
washpost/mongo-vandelay                   latest       cc73e1bea97e   2 years ago     485MB
centos                                    7            eeb6ee3f44bd   3 years ago     204MB
mailhog/mailhog                           latest       4de68494cd0d   4 years ago     392MB

Instead we refer to them by 3 different pieces of information:

<user>/<repo>:<tag>

The repository is usually made up by the username or organization name / the repository.

Official images are the only images that can live at the root namespace of the registry, so they do not need an account name in front of the repo name.
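To make the three pieces concrete, here is a small shell sketch that splits a reference into user, repo, and tag. It is a simplification: it assumes the reference has no registry host or port in front (so not something like localhost:5000/app), and the function name parse_image_ref and the "(official)" placeholder are made up for the example.

```shell
#!/bin/sh
# Sketch: split "<user>/<repo>:<tag>" into its parts.
# Assumes no registry host or port in the reference (e.g. not localhost:5000/app).
parse_image_ref() {
    ref=$1
    case $ref in
        *:*) tag=${ref##*:}; name=${ref%:*} ;;
        *)   tag=latest;     name=$ref ;;       # no tag given, so "latest" is assumed
    esac
    case $name in
        */*) user=${name%%/*}; repo=${name#*/} ;;
        *)   user="(official)"; repo=$name ;;   # root namespace, no account prefix
    esac
    printf 'user=%s repo=%s tag=%s\n' "$user" "$repo" "$tag"
}

parse_image_ref nginx
parse_image_ref nginx:alpine
parse_image_ref washpost/fusion-engine
parse_image_ref washpost/fusion-cli-api:production
```

The sample references are taken from the docker image ls listing earlier in these notes.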

Tags

The tag is not quite a version or branch, but similar to git tags. It is a pointer to a specific image commit. Tags point to an image id so multiple tags can point to the same image id.

Can be numbers or names. Docker manages namespaces and associates them to image ids.

We can re-tag existing docker images.

docker image tag nginx am/nginx

❯ docker image tag --help
Usage:  docker image tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]
#TARGET_IMAGE is kind of the new image

Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE

Aliases:
  docker image tag, docker tag

:latest doesn’t necessarily mean the latest version. It doesn’t have any special meaning, but it is considered a convention.

docker image push mynewtag/nginx will yield denied: requested access to the resource is denied. It tries to upload the tag, but if you haven’t logged in, then it won’t work.

docker login <server>

Defaults to logging into Docker Hub, but you can override by adding server url.

The config file is located at ~/.docker/config.json which holds your auth tokens.

Add Additional Tag

docker image tag elkcityhazard/nginx elkcityhazard/nginx:my_new_tag
docker image push elkcityhazard/nginx:my_new_tag

Dockerfile: A Recipe For Creating An Image

A Dockerfile at first glance looks like a bash script, but it uses a Docker-specific syntax.

Dockerfile stanzas are executed top down, so the order actually does matter.

RUN ln -sf /dev/stdout /var/log/nginx/access.log \
&& ln -sf /dev/stderr /var/log/nginx/error.log

Docker takes care of logging for us. We just need to make sure that anything we want to log is sent to stdout and stderr.

Building a dockerfile

docker image build -t examplenginx .

If any line inside the Dockerfile changes, the build will not use the cache for that line, and every line after it will also be rebuilt without the cache. Note: it is good practice to keep the things that change the least at the top of your Dockerfile and the things that change the most near the bottom.

Copy a file into a container
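The "first change busts the cache for everything after it" rule can be modeled by comparing two versions of a Dockerfile line by line. This is a simplified sketch, not the real builder: an actual build also checksums the files used by COPY and ADD, while this only compares instruction text. The function name cache_report is made up for the example.

```shell
#!/bin/sh
# Simplified model of the build cache: compare two versions of a Dockerfile line by line.
# Real builds also checksum files used by COPY/ADD; this sketch only compares instruction text.
cache_report() {
    awk '
        NR == FNR { old[FNR] = $0; next }   # first file: remember the old instructions
        {
            if (!busted && $0 == old[FNR]) {
                status = "CACHED"
            } else {
                busted = 1                  # the first change invalidates every later step
                status = "REBUILD"
            }
            printf "%-7s %s\n", status, $0
        }
    ' "$1" "$2"
}

dir=$(mktemp -d)
printf '%s\n' 'FROM node:6-alpine' 'COPY package.json package.json' \
    'RUN npm install' 'COPY . .' > "$dir/old"
printf '%s\n' 'FROM node:6-alpine' 'COPY package.json package.json' \
    'RUN npm install && npm cache clean --force' 'COPY . .' > "$dir/new"

report=$(cache_report "$dir/old" "$dir/new")
printf '%s\n' "$report"
rm -rf "$dir"
```

Only the RUN line changed, yet the unchanged COPY . . after it is rebuilt too, which is the reason for putting frequently changing steps last.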

FROM nginx:latest

WORKDIR /usr/share/nginx/html

COPY index.html index.html

A Simple Dockerfile Example

FROM node:6-alpine
EXPOSE 3000

RUN apk add --no-cache tini

RUN mkdir -p /usr/src/app

WORKDIR /usr/src/app

COPY package.json package.json

RUN npm install \
&& npm cache clean --force

COPY . .

CMD ["/sbin/tini", "--", "node", "./bin/www"]

Using Prune To Keep Docker System Clean

More about docker system prune here: docker system prune.

Container Lifetime & Persistent Data

Containers are meant to be immutable and ephemeral. You can just throw away a container and create a new one from an image. This is a design goal. But what about databases and other unique data? This is a separation of concerns: a container’s data persists only until we remove the container, so unique data (aka persistent data) needs somewhere else to live.

Docker has two solutions for this problem: Volumes and Bind Mounts.

Volumes make a special location outside of the container UFS aka Union File System.

Bind mounts link container path to host path. This is just sharing or mounting a host directory or file into a container.

Persistent Data: Volumes

VOLUME command in Dockerfile.

VOLUME /var/lib/mysql

/var/lib/mysql is the default location of the MySQL database. This tells Docker to create a new volume location when we start a container and assign it to this directory. The volume sticks around until we delete it.

Volumes need manual deletion.

Use docker volume prune to clean up unused volumes and make it easier to see what you have.

docker container run -d --name mysql_sandbox -e MYSQL_ALLOW_EMPTY_PASSWORD=true mysql

docker container inspect mysql_sandbox will show us that there is a volume, but it is also located in “Mounts”. This gives you some interesting metadata such as where the data is actually being stored on the host system.

"Mounts": [
            {
                "Type": "volume",
                "Name": "520765faa029897f9b21b945dd4b818fd6f9b1848165585d32a19b26ebe3cd73",
                "Source": "/var/lib/docker/volumes/520765faa029897f9b21b945dd4b818fd6f9b1848165585d32a19b26ebe3cd73/_data",
                "Destination": "/var/lib/mysql",
                "Driver": "local",
                "Mode": "",
                "RW": true,
                "Propagation": ""
            }
        ],
"Volumes": {
                "/var/lib/mysql": {}
            },

docker volume ls && docker volume inspect {volume_id}.

From the container’s perspective we can see which volume it is using, but from the volume’s perspective we can’t really see which container it is connected to.

Volumes persist after container is destroyed.

Named Volumes

docker container run -d --name mysql -e MYSQL_ALLOW_EMPTY_PASSWORD=true -v mysql-db:/var/lib/mysql mysql

This creates a named volume, which is more user friendly. Named volumes are much easier to work with if the data needs to stick around.

docker volume create

Required to do this before “docker run” to use custom drivers and labels.

This is a case where you might need to specify a custom driver or label.

Bind Mounts

Maps a host file or directory to a container file or directory - “bind outside to inside.”

You can specify a directory or a single file.

This skips the UFS, and host files overwrite any in the container.

Cannot use bind mounts in Dockerfile, must be at container run.

... run -v /Users/me/stuff:/path/container (mac/linux)
... run -v //c/Users/me/stuff:/path/container (windows)

This is useful for development when you need to use or access files in development.

docker container run -d --name nginx -p 80:80 -v $(pwd):/usr/share/nginx/html nginx

After we bind a volume, we can do something like this:

docker container exec -it nginx bash
ls -la # run this inside the container shell

And we should be able to access our bind mount from inside the container.

Postgres Password

docker container run --name postgres -e POSTGRES_PASSWORD=mypasswd postgres

Note: In the real world, I always pin my production apps to the patch version. It’s the only safe way to operate.

Upgrading Postgres with named volume

docker container run --rm -d --name mypg2 \
-e POSTGRES_PASSWORD=mypgpw \
-e PGDATA=/var/lib/postgresql/data/pgdata \
-v data:/var/lib/postgresql/data \
postgres:9.6.1

You can continue to use the same named volume and only change the postgres tag, i.e.,

docker container run --rm -d --name mypg2 \
-e POSTGRES_PASSWORD=mypgpw \
-e PGDATA=/var/lib/postgresql/data/pgdata \
-v data:/var/lib/postgresql/data \
postgres:9.6.2

Troubleshooting File Permissions across multiple containers

ps aux

Look at each container’s /etc/passwd and /etc/group. You’ll likely find a mismatch.

Figure out how to make sure both containers are running with a matching user ID or group ID.

    RUN groupadd --gid 1000 node \
            && useradd --uid 1000 --gid node --shell /bin/bash --create-home node
    USER 1000:1000

Note: When setting a Dockerfile’s USER, use numbers, which work better in Kubernetes than using names.

Note 2: If ps doesn’t work in your container, you may need to install it. In debian-based images with apt, you can add it with apt-get update && apt-get install procps
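The mismatch check can be sketched in shell once you have each container's passwd file on hand (for example via docker container exec <name> cat /etc/passwd). The two files below are made-up samples written to a temp directory, and the helper name ids_for is hypothetical.

```shell
#!/bin/sh
# Sketch: compare the UID:GID a user has in two containers' /etc/passwd files.
# The two passwd files are made-up samples; in practice you would copy them out
# of each container first.
dir=$(mktemp -d)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
    'node:x:1000:1000::/home/node:/bin/bash' > "$dir/passwd_a"
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
    'node:x:1001:1001::/home/node:/bin/bash' > "$dir/passwd_b"

# Print "uid:gid" for a user name from a passwd-format file.
ids_for() { awk -F: -v user="$1" '$1 == user { print $3 ":" $4 }' "$2"; }

ids_a=$(ids_for node "$dir/passwd_a")
ids_b=$(ids_for node "$dir/passwd_b")

if [ "$ids_a" = "$ids_b" ]; then
    result="match"
else
    result="mismatch"
fi
echo "container A: $ids_a, container B: $ids_b -> $result"
rm -rf "$dir"
```

A mismatch like this one is what causes files written by one container to be unreadable in the other when they share a volume.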

Bind Mount Example

docker run -p 9090:4000 -v $(pwd):/container/dir username/image:tag
docker run -p 9090:4000 -v ./data:/some/container/data/path username/image:tag

Dockerfile ENTRYPOINT

Recap Basic Dockerfile Statements

Dockerfile ENTRYPOINT

Two questions to ask with every new instruction you write in a dockerfile:

  1. Will this new statement overwrite its previous use in my Dockerfile or any use in my FROM image? Overwrite vs Additive.
  2. Will this statement be used during my image build or will it be stored in my image metadata and used later when I start a container from this image. Buildtime vs Runtime.

A common confusion for beginners is thinking that the CMD command is run at build time. It is stored in the image metadata and only executed when you start a container from that image.

Only the last CMD in a Dockerfile will ever be used, including any that might come from the FROM image. The final WORKDIR statement declares the file system path where the CMD is executed from.
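As a quick way to see the "last CMD wins" rule, the sketch below scans a Dockerfile and prints the CMD that would take effect. It is only a text scan under stated assumptions: it cannot see a CMD inherited from the FROM image, it does not handle line continuations, and the function name effective_cmd and the sample Dockerfile are made up.

```shell
#!/bin/sh
# Sketch: find the CMD that actually takes effect in a Dockerfile (the last one).
# Ignores any CMD inherited from the FROM image, which a text scan cannot see.
effective_cmd() {
    awk 'toupper($1) == "CMD" { last = $0 } END { if (last != "") print last }' "$1"
}

dir=$(mktemp -d)
printf '%s\n' 'FROM busybox:latest' 'CMD ["hostname"]' \
    'WORKDIR /tmp' 'CMD ["date"]' > "$dir/Dockerfile"

winner=$(effective_cmd "$dir/Dockerfile")
echo "effective: $winner"
rm -rf "$dir"
```

The earlier CMD is simply overwritten in the image metadata, which is the "overwrite" behavior from the first question above.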

Dockerfile Buildtime vs. Runtime Cheatsheet

What Is An ENTRYPOINT?

The purpose of ENTRYPOINT is to execute a command on container start. It acts differently from CMD, and the two can work together.

ENTRYPOINT only runs on container starts. Only the last ENTRYPOINT is used in the container.

docker run busybox
docker inspect busybox
docker run -it busybox
whoami # root in container
ps # check processes
ls /bin # get binaries
hostname # returns the hostname of the os
date # get date
exit # exit container

FROM busybox:latest

# Docker calls this JSON syntax the exec form
CMD ["hostname"]

cd /path-to-directory
docker build -t hostname .
docker run hostname
docker run --help
docker run hostname date # override the CMD statement

Update CMD to ENTRYPOINT

FROM busybox:latest

# Docker calls this JSON syntax the exec form
ENTRYPOINT ["hostname"]

docker build -t entryhostname .

docker run entryhostname # nothing changed
docker run entryhostname date # operation not permitted: this runs "hostname date", which tries to set the hostname

docker run --help

docker run --entrypoint date entryhostname

Docker intends the ENTRYPOINT to complement the CMD and not replace it.

CMD is good for images meant to run long lasting processes in the background.

ENTRYPOINT offers no benefits over CMD by itself, but shines when used together.

ENTRYPOINT & CMD Together

If you set both, then every time you start the container docker takes the ENTRYPOINT and the CMD and combines them with a space between them.

Two main use cases:

  1. You want to treat your containers like a command line tool
  2. You want to run a startup script in the main container before the program starts

This is good for linux utilities:

FROM ubuntu:latest

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    curl \
    && rm -rf /var/lib/apt/lists/*

ENTRYPOINT ["curl"]

CMD ["--help"]
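For the curl image above, the way the final command line is assembled can be modeled in a few lines of shell. This is a sketch of the behavior, not Docker's implementation; the image name mycurl in the comments and the function name final_command are hypothetical.

```shell
#!/bin/sh
# Model of how the final command line is assembled when both are set in exec form.
# The values mirror the curl Dockerfile in this section.
entrypoint="curl"
default_cmd="--help"

# Arguments passed after the image name on `docker run` replace CMD but keep ENTRYPOINT.
final_command() {
    if [ "$#" -gt 0 ]; then
        printf '%s %s\n' "$entrypoint" "$*"
    else
        printf '%s %s\n' "$entrypoint" "$default_cmd"
    fi
}

final_command                          # like: docker run mycurl
final_command -I https://example.com   # like: docker run mycurl -I https://example.com
```

This is what makes the container feel like a command line tool: the user only types the arguments, and CMD supplies a sensible default when there are none.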

Container startup script

#!/bin/sh
# startup.sh: run any setup steps here, then hand off to the CMD
exec "$@"

FROM python:slim
USER www-data
WORKDIR /var/www/html
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . . 
ENTRYPOINT ["./startup.sh"]
CMD ["python", "app.py"]
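To see what exec "$@" does in a startup script, the sketch below writes a tiny stand-in script to a temp directory and calls it the way Docker would: the script path first, then the CMD as arguments. The echo lines are made up, and echo stands in for python app.py so the example runs anywhere.

```shell
#!/bin/sh
# Sketch: a stand-in startup script that does some setup, then hands off to its arguments.
dir=$(mktemp -d)
cat > "$dir/startup.sh" <<'EOF'
#!/bin/sh
echo "startup: preparing environment"
# Replace this shell with the command passed as arguments (the CMD).
exec "$@"
EOF

# ENTRYPOINT ["./startup.sh"] + CMD ["python", "app.py"] becomes: ./startup.sh python app.py
# Here `echo` plays the role of the CMD.
output=$(sh "$dir/startup.sh" echo "app started with args from CMD")
printf '%s\n' "$output"
rm -rf "$dir"
```

Because of exec, the CMD process replaces the script's shell instead of running as its child, so it receives signals such as the one sent by docker stop directly.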

Choosing Between RUN, CMD, & ENTRYPOINT

Shell vs Exec Form


RUN - shell form by default
ENTRYPOINT - always use exec form
CMD - use exec form by default, but in rare cases shell form might be needed
ENTRYPOINT + CMD - always use exec form
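The practical difference between the two forms is whether a shell processes the command first. The sketch below imitates that outside of Docker: sh -c stands in for shell form, and env running the program directly stands in for exec form. The GREETING variable is made up for the example.

```shell
#!/bin/sh
# Shell form is run as `/bin/sh -c "<string>"`, so a shell expands variables.
# Exec form runs the program directly with the listed arguments; nothing expands them.
GREETING=hello
export GREETING

shell_form=$(sh -c 'echo $GREETING')   # like: CMD echo $GREETING
exec_form=$(env echo '$GREETING')      # like: CMD ["echo", "$GREETING"]

printf 'shell form prints: %s\n' "$shell_form"
printf 'exec form prints:  %s\n' "$exec_form"
```

The exec form passes the literal text $GREETING through, which is one of the rare cases where shell form might be needed for CMD.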
