Mar 1, 2020 / KUBERNETES, CONTAINER, INITIALIZATION, CONFIGURATION

The versatility of Kubernetes' initContainer

There are a lot of different ways to configure containers running on Kubernetes:

Environment variables
Config maps
Volumes shared across multiple pods
Arguments passed to scheduled pods
etc.

Each of those alternatives fits a specific context, with specific requirements.
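As a rough illustration, here is a minimal sketch that combines three of the options above in a single pod: a plain environment variable, a config map loaded as environment variables, and arguments passed to the container. All names (`app-config`, `demo`, `GREETING`, `LOG_LEVEL`) are hypothetical and only serve the example.

```yaml
# Hypothetical names throughout: app-config, demo, GREETING, LOG_LEVEL
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  GREETING: hello
---
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
  - name: app
    image: busybox
    env:                      # plain environment variable
    - name: LOG_LEVEL
      value: debug
    envFrom:                  # every key of the config map becomes an environment variable
    - configMapRef:
        name: app-config
    command: ['sh', '-c']
    args: ['echo "$GREETING at level $LOG_LEVEL"']   # arguments, expanded by the shell
```

All three mechanisms only feed values to the container; none of them runs anything before it starts.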

For example, none of them lets you clone a Git repository before the container starts. It would be possible to design that feature into the image itself. Yet that would introduce coupling and violate the Single Responsibility Principle. Even though this principle originally comes from OOP, it makes a lot of sense for containers as well: an image should do one thing only, and do it well. Just as in OOP, the opposite would introduce complexity and fragility, and hurt the maintainability of the image.

Other options to actually run commands include:

Container lifecycle hook: Kubernetes offers lifecycle hooks, callbacks that allow code execution right after a container starts (postStart) and just before it stops (preStop). The main issue with this approach is that the postStart hook and the container's entrypoint start in parallel: the hook might not be finished when the application starts, leaving the latter in an unknown state.
Jobs and CronJobs: Kubernetes can run "batch" containers. Contrary to applications that are meant to be interacted with, such as webapps, batch containers run to completion without manual input. Batches can either run only once (Job) or on a recurring schedule (CronJob).
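To make the first option concrete, here is a minimal sketch of a startup hook, assuming the `postStart` handler with an `exec` command. The pod name `hook-demo` and the marker file `/tmp/ready` are hypothetical.

```yaml
# Hypothetical names: hook-demo, /tmp/ready
apiVersion: v1
kind: Pod
metadata:
  name: hook-demo
spec:
  containers:
  - name: app
    image: busybox
    # The entrypoint lists /tmp right away, then idles
    command: ['sh', '-c', 'ls /tmp; sleep 3600']
    lifecycle:
      postStart:
        exec:
          # The hook runs alongside the entrypoint, not before it
          command: ['sh', '-c', 'sleep 5; touch /tmp/ready']
```

Because the hook is not guaranteed to run before the entrypoint, the `ls` above would most likely not list the `ready` file.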

One could design an architecture with a job pod and a regular pod, interacting through a shared volume. Unfortunately, Kubernetes schedules both pods independently, with no ordering between them. Just as in the previous case, there’s no guarantee that the job will be finished before the application starts.
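Such an architecture could look like the following sketch. It assumes a pre-existing PersistentVolumeClaim named `shared-data` (hypothetical) that both pods are able to mount; the other names are made up as well.

```yaml
# Assumption: a PersistentVolumeClaim named shared-data already exists
# and can be mounted by both pods
apiVersion: batch/v1
kind: Job
metadata:
  name: prepare-data
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: prepare
        image: busybox
        command: ['sh', '-c', 'echo ready > /data/state']
        volumeMounts:
        - name: shared
          mountPath: /data
      volumes:
      - name: shared
        persistentVolumeClaim:
          claimName: shared-data
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: busybox
    # Nothing ensures /data/state exists when this command runs
    command: ['sh', '-c', 'cat /data/state; sleep 3600']
    volumeMounts:
    - name: shared
      mountPath: /data
  volumes:
  - name: shared
    persistentVolumeClaim:
      claimName: shared-data
```

Nothing in these two manifests expresses that `app` depends on `prepare-data`, which is exactly the gap described above.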

However, a feature exists that guarantees that a command will be successfully executed before the pod's regular containers start - initContainer:

Init containers behave like regular containers, except:

They always run to completion.
Each one must complete successfully before the next one is started.

— Kubernetes concepts
https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

By contract, every init container command needs to succeed before the pod's regular containers can start. If the command fails, then by default Kubernetes restarts the failed init container until it succeeds, so that the regular containers can rely on the state left by initialization. Multiple init containers can be defined on a pod. In that case, they are executed in the order in which they are declared.

apiVersion: v1
kind: Pod
metadata:
  name: pod
spec:
  initContainers:
  - name: first
    image: busybox
    command: ['sh', '-c', 'echo init One']
  - name: second
    image: busybox
    command: ['sh', '-c', 'echo init Two']
  containers:
  - name: container
    image: busybox
    command: ['sh', '-c', 'echo container']
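The retry behavior depends on the pod's `restartPolicy`. As a sketch of the other side of the contract, the following pod (hypothetical name `fail-fast`) uses `restartPolicy: Never` together with an init container that always fails. In that case, Kubernetes treats the whole pod as failed instead of retrying, and the regular container never starts.

```yaml
# Hypothetical pod: the init container fails on purpose
apiVersion: v1
kind: Pod
metadata:
  name: fail-fast
spec:
  restartPolicy: Never        # no retry: a failed init container fails the pod
  initContainers:
  - name: check
    image: busybox
    command: ['sh', '-c', 'exit 1']
  containers:
  - name: container
    image: busybox
    command: ['sh', '-c', 'echo never reached']
```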

Sample use-cases of initContainer include:

Registering the external URL of a Service to a third-party server
Initializing a shared volume with a Git repository
Getting data from a third-party provider to create a dynamic configuration file at runtime
Mixing-and-matching the above, so as to get the dynamic configuration from a Git repo
Initializing a shared volume with runtime dependencies (cf. Composition over inheritance applied to Docker)
etc.

In essence, initContainer allows customizing the execution of immutable images at runtime.

For example, here’s the YAML to initialize a shared volume from a Git repo:

apiVersion: v1
kind: ConfigMap                       # (1)
metadata:
  name: cfg
data:
  version: v1.0.0
---
apiVersion: v1
kind: Pod
metadata:
  name: pod
spec:
  volumes:
  - name: config-volume               # (2)
    emptyDir: {}
  initContainers:
  - name: clone
    image: alpine/git:1.0.4           # (3)
    volumeMounts:
    - name: config-volume
      mountPath: /config              # (4)
    envFrom:
      - configMapRef:
          name: cfg                   # (5)
    command: ['/bin/sh', '-c']        # (6)
    args: ['git clone --branch $(version) https://github.com/ajavageek/foo-config && mv foo-config/* /config']
  containers:
  - name: foo
    image: ajavageek/foo:1.0          # (7)
    volumeMounts:
    - name: config-volume
      mountPath: /config              # (8)
      readOnly: true

1	Create a config map named `cfg` with key `version` and value `v1.0.0`
2	Create an empty volume named `config-volume`
3	Reference the `alpine/git:1.0.4` image as an init container
4	Mount the shared volume at `/config` in this init container
5	Reference the config map to get the `version` value
6	Clone the `ajavageek/foo-config` Git repo (don’t bother looking, it doesn’t exist) and move the repo content to the shared volume, mounted at `/config`. This command will repeat until it has been executed successfully.
7	Reference the `ajavageek/foo:1.0` image as a standard container
8	Mount the shared volume at `/config` in this standard container. By contract, it can rely on the `/config` folder to contain the required data.
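Another use case from the list above, getting data from a third-party provider to create a configuration file at runtime, could follow the same pattern. The sketch below is only an illustration: the endpoint `http://config.example.com/app.properties`, the pod name, and the file name are all hypothetical.

```yaml
# Hypothetical names and URL throughout
apiVersion: v1
kind: Pod
metadata:
  name: dynamic-config
spec:
  volumes:
  - name: config-volume
    emptyDir: {}
  initContainers:
  - name: fetch
    image: busybox
    command: ['sh', '-c']
    # Download the file into the shared volume; a failed download fails the init container
    args: ['wget -O /config/app.properties http://config.example.com/app.properties']
    volumeMounts:
    - name: config-volume
      mountPath: /config
  containers:
  - name: app
    image: busybox
    # Stand-in for a real application that reads its configuration from /config
    command: ['sh', '-c', 'cat /config/app.properties; sleep 3600']
    volumeMounts:
    - name: config-volume
      mountPath: /config
      readOnly: true
```

As in the Git example, the application image stays generic: only the init container knows where the configuration comes from.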

Init containers are a very useful feature for running initialization code in a Kubernetes pod: they should be part of everybody’s book of knowledge.
