Docker 101

Aug 9, 2020 by Charlie Reese | 15 minute read

I recently started using Docker for personal projects, and I love it. So much so that I wanted to put together a reference document for myself (and anyone else who finds it useful). What follows is that document.

Table of Contents

  1. What is Docker? Why use Docker?
    • Docker Engine
  2. Docker Images
  3. Docker Containers
  4. Docker Networking
  5. Docker Volumes (Persistent Data)
  6. Containerizing an App
    • Dockerfiles
    • Docker Compose
  7. Containerizing Multiple Apps
    • Docker Swarm
    • Docker Stacks
    • Kubernetes

1. What is Docker? Why use Docker?

Docker is infrastructure software that creates, manages, and orchestrates containers. Docker is also the name of the company (Docker, Inc.) that maintains the project.

1.1 What is a Container?

A container is a stand-alone, executable package of software including everything needed to run it.

Reasons to use containers:

  1. Containers always run the same, regardless of their environment

    • No more "Works on my machine" problems
  2. Containers are fast / lightweight

    • Containers running on a single machine can share a machine's operating system kernel (unlike virtual machines)
  3. Containers allow you to easily isolate and run many applications

    • Developers can deploy production-like environments locally in seconds
    • Applications can be isolated with only what they need to survive, and nothing more
    • Deploying and managing microservices becomes simple

1.2 The Docker Engine

The Docker engine is the modular software that runs and manages containers. The following components make up the Docker engine:

  • Docker client
  • Docker daemon
    • Implements the Docker API, image management, authentication, security, networking, and volumes
  • containerd
    • Manages container execution logic and container lifecycle operations (e.g. start, stop, pause)
    • Formerly in the Docker daemon
  • runc
    • Implementation of the OCI container-runtime-spec (creates containers / interface to kernel primitives)
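
To see these components on your own machine, you can ask the client what it is talking to. This is a minimal sketch, assuming Docker is installed and the daemon is running; the function name engine_overview is just an illustrative wrapper, not a Docker command.

```shell
# Illustrative wrapper; the function name is an assumption for this sketch.
engine_overview() {
  # Prints a Client section and a Server section. On Linux the Server
  # section typically lists the Engine (daemon), containerd, and runc.
  docker version
  # Prints daemon-wide details such as the storage driver and runtimes.
  docker info
}
```

Call engine_overview on a host where Docker is running to compare the client and server versions.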

2. Docker Images

A Docker image is to a container what a virtual machine (VM) template is to a VM: an image is like a stopped container. You can pull images from an image registry; Docker Hub is the most popular. You can also build images using Dockerfiles (more on this later).

Images are made up of stacked read-only layers that are presented as a single object. An image contains a minimal OS filesystem (but no kernel, which is shared with the host) and any files / dependencies required to run an application.

Popular Commands

  • docker image pull [OPTIONS] NAME[:TAG|@DIGEST]
    • Pull an image or a repository from a registry
  • docker image push [OPTIONS] NAME[:TAG]
    • Push an image or a repository to a registry
  • docker image ls [OPTIONS] [REPOSITORY[:TAG]]
    • List images
  • docker image inspect [OPTIONS] IMAGE [IMAGE...]
    • Display detailed information on one or more images
  • docker image rm [OPTIONS] IMAGE [IMAGE...]
    • Remove one or more images
  • docker image tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]
    • Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE
  • docker image build [OPTIONS] PATH | URL | -
    • Build an image from a Dockerfile
  • docker system df
    • View Docker disk usage
  • docker system prune
    • Remove unused data:
      • All stopped containers
      • All networks not used by at least one container
      • All dangling images
      • Unused volumes (only when run with the --volumes flag)
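
A short walk-through that strings several of the image commands together might look like the sketch below. It assumes Docker is running and Docker Hub is reachable; the tag my-alpine:demo and the function name image_demo are made up for illustration.

```shell
# Illustrative walk-through; "my-alpine:demo" is a made-up tag.
image_demo() {
  docker image pull alpine:latest                 # download the image from Docker Hub
  docker image ls alpine                          # list local images in the alpine repository
  docker image history alpine:latest              # show the layers that make up the image
  docker image tag alpine:latest my-alpine:demo   # add a second name for the same image
  docker image rm my-alpine:demo                  # remove only the extra tag
  docker system df                                # check Docker disk usage afterwards
}
```

Because alpine:latest still refers to the image, removing my-alpine:demo only removes that name and leaves the layers in place.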

3. Docker Containers

Containers are the runtime instances of images. Much like a running VM is a started VM template, a running Docker container is a started image. However, unlike VMs, containers share the OS / kernel with the host they are running on, and as such are faster and more lightweight.

After a container is started, it runs until the app it is executing exits; for example, docker container run alpine:latest sleep 10 starts a container that exits after 10 seconds. You can also stop a container manually with docker container stop, which sends SIGTERM, waits 10 seconds, and then sends SIGKILL if the process is still running. Containers can also be started with a restart policy (always, unless-stopped, or on-failure).

Popular Commands

  • docker container run [OPTIONS] IMAGE [COMMAND] [ARG...]
    • Start a new container and run a command
  • docker container create [OPTIONS] IMAGE [COMMAND] [ARG...]
    • Like docker container run, but container is never started
  • Ctrl + P, then Ctrl + Q
    • Detach your terminal from an attached container, leaving the container running in the background
  • docker container ls [OPTIONS]
    • List running containers (use -a to also list stopped containers)
  • docker container exec [OPTIONS] CONTAINER COMMAND [ARG...]
    • Run a command in a running container
    • e.g. docker container exec -it [CONTAINER] bash
  • docker container attach [OPTIONS] CONTAINER
    • Attach local standard input, output, and error streams to a running container
  • docker container stop [OPTIONS] CONTAINER [CONTAINER...]
    • Stop one or more running containers
    • Issues SIGTERM to process with PID 1 in container (and SIGKILL 10 seconds later if needed)
  • docker container start [OPTIONS] CONTAINER [CONTAINER...]
    • Start one or more stopped containers
  • docker container rm [OPTIONS] CONTAINER [CONTAINER...]
    • Remove one or more containers
  • docker container inspect [OPTIONS] CONTAINER [CONTAINER...]
    • Display detailed information on one or more containers
  • docker container logs [OPTIONS] CONTAINER
    • Fetch the logs of a container
  • docker container commit [OPTIONS] CONTAINER [REPOSITORY[:TAG]]
    • Create a new image from a container’s changes
  • docker container cp [OPTIONS] CONTAINER:SRC_PATH DEST_PATH|-
  • docker container cp [OPTIONS] SRC_PATH|- CONTAINER:DEST_PATH
    • Copy files / folders between a container and the local filesystem
    • Use '-' as the source to read a tar archive from stdin and extract it to a directory destination in a container
    • Use '-' as the destination to stream a tar archive of a container source to stdout
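
Here is a rough sketch of a full container lifecycle using the commands above. It assumes Docker is running; the container name sleeper and the function name container_demo are assumptions for the example.

```shell
# Illustrative lifecycle; the container name "sleeper" is an assumption.
container_demo() {
  # Start in the background; restart automatically unless explicitly stopped.
  docker container run -d --name sleeper --restart unless-stopped alpine:latest sleep 300
  docker container ls                  # "sleeper" should be listed as running
  docker container exec sleeper ps     # run a command inside the running container
  docker container logs sleeper        # sleep prints nothing, so this is usually empty
  # Send SIGTERM, then SIGKILL after 2 seconds instead of the default 10.
  docker container stop -t 2 sleeper
  docker container ls -a               # stopped containers only show up with -a
  docker container rm sleeper
}
```

The -t 2 on stop is there because a process running as PID 1 may not react to SIGTERM, in which case Docker waits out the timeout before sending SIGKILL.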

4. Docker Networking

Docker networking allows applications inside of containers (and possibly in different locations) to communicate with each other. Docker comes with solutions for container-to-container networks, VLANs (for communicating with external systems like VMs and physical servers), and connecting to existing networks. It also provides native service discovery and container load balancing.

Docker has a set of native drivers that handle the most common networking requirements, such as:

  • Single-host bridge networks (bridge driver)
    • Connects containers on the same host
    • By default, Docker hosts get a single-host bridge network that new containers attach to (called bridge on Linux)
  • Multi-host overlays (overlay driver)
    • Connects containers across hosts
  • VLANs (macvlan driver)
    • Gives containers MAC and IP addresses on existing physical networks
    • Requires the host NIC be in promiscuous mode (not allowed on many public cloud platforms)

Containers on the same user-defined network can reach each other by name, because they are registered with the embedded Docker DNS service. Containers on the default bridge network are the exception: they cannot resolve each other by name.

Containers are also able to map a container port to a port on the Docker host. Traffic that is routed to the Docker host on the configured port will be directed into the container.
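
Name resolution and port mapping can both be tried with a sketch like the following. It assumes Docker is running and host port 8080 is free; the names demo-net, web, and network_demo are made up for this example.

```shell
# Illustrative names: "demo-net" and "web" are assumptions for this sketch.
network_demo() {
  docker network create -d bridge demo-net
  # Publish container port 80 on host port 8080 and join the new network.
  docker container run -d --name web --network demo-net -p 8080:80 nginx:alpine
  # A second container on the same network resolves "web" via Docker's DNS.
  docker container run --rm --network demo-net alpine:latest ping -c 2 web
  # Clean up: force-remove the running container, then the network.
  docker container rm -f web
  docker network rm demo-net
}
```

While the web container is running, a request to port 8080 on the Docker host should be directed to nginx inside the container.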

Popular Commands

  • docker network inspect [NETWORK]
    • Provides configuration information about a Docker network
  • docker network ls [OPTIONS]
    • Lists networks on local Docker host
  • docker network create [OPTIONS] NETWORK
    • Create a Docker network
    • Specify the driver with the -d flag
  • docker network prune [OPTIONS]
    • Delete unused networks on Docker host
  • docker network rm NETWORK [NETWORK...]
    • Remove network(s)
  • docker network [dis]connect [OPTIONS] NETWORK CONTAINER
    • Connect a container to a network, or disconnect it from one

5. Docker Volumes (Persistent Data)

In Docker, data is either persistent (stored in volumes) or non-persistent (created and deleted alongside the container). Many types of data should be persistent: logs, records, etc. A volume's lifecycle is independent of any container, so persistent data lives until the volume is manually destroyed. Non-persistent data is tied to the lifecycle of its container: when you destroy the container, you lose any data that is not in a volume.

Docker creates new volumes with the built-in local driver by default (local scope: the volume is only available to containers on the node it was created on). With the -d flag of docker volume create, you can specify a different driver. Third-party drivers are available as plugins, and some provide advanced storage features.

The mountpoint of a container's volume indicates where on the host the volume exists. Data copied into a host's mountpoint will be immediately available in the container.

Note: it is more trouble than it is worth to use volumes to containerize a production database.

Popular Commands

  • docker volume create [OPTIONS] [VOLUME]
    • Create a volume
    • Use the -d flag to specify a volume driver
    • Pass the volume name as the VOLUME argument (Docker generates a random name if it is omitted)
  • docker volume ls [OPTIONS]
    • List volumes
  • docker volume inspect [OPTIONS] VOLUME [VOLUME...]
    • Display information on volume(s)
  • docker volume prune [OPTIONS]
    • Remove unused local volumes
  • docker volume rm [OPTIONS] VOLUME [VOLUME...]
    • Remove volume(s)
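
To see that volume data outlives a container, something like this sketch can help. It assumes Docker is running; the volume name demo-vol, the /data mount target, and the function name volume_demo are all assumptions for illustration.

```shell
# Illustrative names: "demo-vol" and /data are assumptions for this sketch.
volume_demo() {
  docker volume create demo-vol
  # Write a file into the volume from a throwaway container.
  docker container run --rm --mount source=demo-vol,target=/data alpine:latest \
    sh -c 'echo hello > /data/greeting'
  # A brand-new container sees the same file, because the data lives in the volume.
  docker container run --rm --mount source=demo-vol,target=/data alpine:latest \
    cat /data/greeting
  # Show where the volume lives on the host.
  docker volume inspect --format '{{ .Mountpoint }}' demo-vol
  docker volume rm demo-vol
}
```

The first container is removed as soon as it exits (--rm), yet the second one should still be able to read the file it wrote.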

6. Containerizing an App

Containerizing (or Dockerizing) an app is the process of configuring an application to run as a container. The main advantages of containerizing apps are:

  • Increased build simplicity and standardization
  • Faster shipment
  • Easier deployment

Containerizing an app requires the following steps:

  1. Create your application code
  2. Create a Dockerfile that describes your app, dependencies, and how to run it
  3. Use the Dockerfile with docker image build to create a Docker image (which you can later run)
    • Use a .dockerignore file in the root directory of the build context to exclude files / directories
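
The build step can be sketched as two small shell helpers. This assumes a Dockerfile already exists in the build context; the ignore patterns are common choices for a Node.js project rather than requirements, and demo-app, write_dockerignore, and build_and_run are hypothetical names.

```shell
# Writes a small .dockerignore into the build context directory "$1".
# The patterns are common choices for a Node.js project, not requirements.
write_dockerignore() {
  cat > "$1/.dockerignore" <<'EOF'
.git
node_modules
*.log
EOF
}

# Builds and runs the image; assumes a Dockerfile already exists in "$1".
# "demo-app" is a hypothetical image name.
build_and_run() {
  docker image build -t demo-app:latest "$1"
  docker container run --rm demo-app:latest
}
```

For example, write_dockerignore . followed by build_and_run . would use the current directory as the build context.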

6.1 Dockerfiles

A Dockerfile (always one word with a capital 'D') is typically found in the root directory of an application's build context. It is used to:

  1. Describe an application
  2. Tell Docker how to build an image from the application

Given the above, Dockerfiles are, in a sense, self-documenting. Dockerfiles start with the FROM instruction, which forms the base layer of an image. The rest of the image is built up by the instructions that follow, some of which (such as RUN, COPY, and ADD) add new layers. Previously built layers are cached to speed up future builds, so instructions whose inputs change often should be placed near the end of a Dockerfile where possible; once one layer's cache is invalidated, every later layer is rebuilt too.

Popular Dockerfile instructions:

  • FROM
    • Base layer of image
  • LABEL
    • Adds metadata to the image, such as the maintainer
  • RUN
    • Runs commands (often installing packages / creating files as an additional image layer)
  • COPY
    • Copies files from the build context into the image
  • ADD
    • Like COPY, with some extra functionality (e.g. remote URLs and automatic extraction of local tar archives)
  • WORKDIR
    • Sets the working directory for instructions that follow
    • Relative to the image (does not create a new layer)
  • EXPOSE
    • Documents the port the container listens on (does not create a new layer)
    • Does not actually publish the port (this can be done using the -p flag when running the container)
  • ENTRYPOINT
    • Sets the main application command that the image (container) runs by default (does not create a new layer); configures a container to run as an executable
  • CMD
    • Can be used to define what command is executed when running a container
    • Can be used to specify default arguments for ENTRYPOINT (is appended to the ENTRYPOINT command) if an ENTRYPOINT exists and is written in exec form
    • Overridden when the container is run with alternative arguments
  • ENV
    • Sets environment variables
  • STOPSIGNAL
    • Sets the system call signal sent to the container to exit
  • HEALTHCHECK
    • Sets how to test a container to check if it is still working
  • USER
    • Sets the user name (and optionally user group) used when running the image and for any RUN, CMD, and ENTRYPOINT instructions that follow
  • VOLUME
    • Creates a mount point
  • ONBUILD
    • Adds a trigger instruction to be executed at a later time (when the image is used as a base for another build)
    • The trigger is executed in the context of the downstream build (as if it had been inserted immediately after the FROM instruction in the downstream Dockerfile)
  • ARG
    • Sets a variable that can be passed at build time to the docker build command
Example Dockerfile:

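# Start from the Alpine Linux base image
FROM alpine
LABEL maintainer=""
# Install Node.js and npm (on recent Alpine releases the npm package is named npm rather than nodejs-npm)
RUN apk add --update nodejs nodejs-npm
# Copy the app from the build context into the image and work from there
COPY . /public
WORKDIR /public
# Install the app's dependencies
RUN npm install
# Run the app when a container starts
ENTRYPOINT ["node", "./app.js"]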

Note: make sure you carefully track changes to Dockerfiles in source control
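
The advice about putting frequently changing content near the end of a Dockerfile can be applied to the example above. This is a sketch of one possible reordering, not the only way to do it. It assumes the app has a package.json next to app.js, and it uses the npm package name found on recent Alpine releases (older releases used nodejs-npm, as in the example). The function name write_cached_dockerfile is made up.

```shell
# Writes a cache-friendly variant of the example Dockerfile into directory "$1".
# Assumes the app has a package.json next to app.js.
write_cached_dockerfile() {
  cat > "$1/Dockerfile" <<'EOF'
FROM alpine
RUN apk add --update nodejs npm
WORKDIR /public
# Copy only the dependency manifest first, so the npm install layer
# is reused from cache until package.json changes.
COPY package.json .
RUN npm install
# Source files change most often, so they are copied last.
COPY . .
ENTRYPOINT ["node", "./app.js"]
EOF
}
```

With this ordering, editing app.js only invalidates the final COPY layer, so rebuilds can skip the npm install step.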

6.2 Docker Compose

Docker Compose lets you describe an app (a collection of services / containers) in a single configuration file that can be deployed (and managed) with simple commands. Services are defined in a YAML file (JSON is also supported) typically called docker-compose.yml, which Compose deploys via the Docker Engine API.

Under the hood, Compose is an external Python library that you install on the host running the Docker Engine.

docker-compose.yml files have 4 main top-level keys (other keys exist, e.g. secrets and configs):

  1. version
  2. services
  3. networks
  4. volumes

Example Compose file for a Flask app containing 2 services (web-app and redis):

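version: "3.5"
services:
  web-app:
    build: .
    command: python
    ports:
      - target: 5000
        published: 5000
    networks:
      - counter-net
    volumes:
      - type: volume
        source: counter-vol
        target: /code
  redis:
    image: "redis:alpine"
    networks: [counter-net]
networks:
  counter-net:
volumes:
  counter-vol: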

The Docker docs have great getting started guides, like the Quickstart: Compose and Rails guide.

Popular Commands:

  • General format: docker-compose [-f <arg>...] [options] [COMMAND] [ARGS...]
  • docker-compose up
    • Create and start containers
  • docker-compose down
    • Stop and remove containers and networks (add --rmi all to also remove images, and -v / --volumes to also remove volumes)
  • docker-compose exec
    • Execute a command in a running container
  • docker-compose logs
    • View output from containers
  • docker-compose ps
    • List containers
  • docker-compose restart
    • Restart services
  • docker-compose rm
    • Remove stopped containers
  • docker-compose scale
    • Set number of containers for a service (deprecated in favor of docker-compose up --scale SERVICE=NUM)
  • docker-compose start
    • Start services
  • docker-compose stop
    • Stop services
  • docker-compose top
    • Display the running processes
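
A typical Compose session might look like the sketch below. It assumes the docker-compose command is installed and that you run it from the directory containing docker-compose.yml; the service names web-app and redis come from the example app above, and the function name compose_demo is made up.

```shell
# Run from the directory containing docker-compose.yml.
# Service names "web-app" and "redis" follow the example app in this post.
compose_demo() {
  docker-compose up -d                      # create and start all services in the background
  docker-compose ps                         # list the project's containers
  docker-compose logs web-app               # show output from one service
  docker-compose exec redis redis-cli ping  # run a command inside the running redis container
  docker-compose down --volumes             # remove containers, networks, and named volumes
}
```

Leaving off --volumes on the last line keeps the named volumes around for the next docker-compose up.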

7. Containerizing Multiple Apps

Deploying and managing multi-service apps at scale is hard. The three solutions below simplify application management by providing:

  • State management
  • Rolling updates
  • Easy scaling
  • Health checks

and more.

Having trouble picking one of the below? If you are building a small ecosystem of applications without the need for heavy configuration, try Docker Swarm and Docker Stacks. If you are managing a more complex ecosystem, or simply want more control over your configuration, try Kubernetes.

7.1 Docker Swarm

Docker Swarm is used for grouping containers across multiple hosts, allowing them to be managed as a cluster. It also:

  • Exposes an API for deploying / managing microservices
  • Allows you to deploy apps with native Docker commands
  • Allows you to perform rolling updates (docker service update), rollbacks, and scaling (docker service scale)

Popular Commands:

  • docker swarm init [OPTIONS]
    • Initialize a swarm
  • docker swarm join [OPTIONS] HOST:PORT
    • Join a swarm as a node and/or manager
  • docker swarm leave [OPTIONS]
    • Leave the swarm
  • docker swarm unlock
    • Unlock swarm
  • docker swarm update [OPTIONS]
    • Update the swarm
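
To get a feel for services, rolling updates, and scaling, a single-node swarm is enough. The following is a rough sketch, assuming Docker is running and host port 8080 is free; the service name web and the function name swarm_demo are assumptions. On hosts with several IP addresses, docker swarm init may additionally need --advertise-addr.

```shell
# Illustrative single-node swarm; the service name "web" is an assumption.
swarm_demo() {
  docker swarm init                                       # make this host a manager
  docker service create --name web --replicas 3 -p 8080:80 nginx:alpine
  docker service ls                                       # compare desired and running replicas
  docker service scale web=5                              # scale out
  docker service update --image nginx:stable-alpine web   # rolling update to another image
  docker service rollback web                             # revert to the previous service spec
  docker service rm web
  docker swarm leave --force                              # a manager needs --force to leave
}
```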

7.2 Docker Stacks

Docker Stacks is used (often with Docker Swarm) to define and manage complex multi-container / multi-service applications using a single declarative file (a Compose file). It also provides a simple way to deploy an application and manage its lifecycle (e.g. initial deployment, health checks, scaling, updates, rollbacks).

Popular Commands:

  • docker stack deploy [OPTIONS] STACK
    • Deploy a new stack or update an existing stack
  • docker stack ls
    • List stacks
  • docker stack ps [OPTIONS] STACK
    • List the tasks in the stack
  • docker stack rm STACK [STACK...]
    • Remove one or more stacks
  • docker stack services [OPTIONS] STACK
    • List services in stack
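
Deploying a Compose file as a stack could look roughly like this. The sketch assumes swarm mode is already active and that docker-compose.yml references prebuilt images, since docker stack deploy ignores build instructions; the stack name demo and the function name stack_demo are hypothetical.

```shell
# "demo" is a hypothetical stack name; assumes swarm mode is already active
# and docker-compose.yml references prebuilt images.
stack_demo() {
  docker stack deploy -c docker-compose.yml demo   # create or update the stack
  docker stack ls                                  # list stacks
  docker stack services demo                       # one line per service
  docker stack ps demo                             # one line per task (container)
  docker stack rm demo
}
```

Running the deploy line again after editing the Compose file updates the existing stack in place.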

7.3 Kubernetes

Kubernetes deserves a post of its own. In short, it is an open-source project that originated at Google and is widely considered the leading orchestrator of containerized apps. It is similar to Docker Swarm, but is more sophisticated and popular.