Keeping base and CI/CD Docker images up-to-date in AWS
If you’re building containerised services, or using a CI/CD system, you’ll likely end up building base images that contain the customisations that fit your organisation’s needs. For example, you might update OS packages, install a newer version of a package manager, or install the CLI tool(s) of your chosen cloud provider. Keeping these images up-to-date can become a maintenance burden, especially if you want to keep several versions available for different programming languages or runtimes.
Let’s explore how we can automate this process.
In the shersoft-ltd/evergreen-ci-and-base-images repository, we’ll start by creating some Amazon Elastic Container Registry (ECR) repositories for our images (modules/ecr-repository/main.tf):
resource "aws_ecr_repository" "main" {
name = var.name
}
We’re going to be building new images every day, so let’s also add a lifecycle policy that cleans up draft images and any that are left untagged:
resource "aws_ecr_lifecycle_policy" "main" {
repository = aws_ecr_repository.main.id
policy = jsonencode({
"rules" : [
{
"rulePriority" : 1,
"description" : "Retire draft images.",
"selection" : {
"tagStatus" : "tagged",
"tagPrefixList" : [
"draft"
],
"countType" : "sinceImagePushed",
"countUnit" : "days",
"countNumber" : 1
},
"action" : {
"type" : "expire"
}
},
{
"rulePriority" : 2,
"description" : "Retire untagged images.",
"selection" : {
"tagStatus" : "untagged",
"countType" : "sinceImagePushed",
"countUnit" : "days",
"countNumber" : 1
},
"action" : {
"type" : "expire"
}
}
]
})
}
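With the repository and lifecycle policy wrapped into a module, we can instantiate it once per runtime. Exactly how you wire this up depends on your Terraform layout; a minimal sketch, assuming the module lives at modules/ecr-repository and that we want one repository per runtime directory, might look like this:

# One ECR repository per runtime we build images for (sketch)
module "ecr_repository" {
  source   = "./modules/ecr-repository"
  for_each = toset(["node", "node-ci-cd", "python"])

  name = each.key
}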
With that in place, we’ll set up the following file structure:
runtimes/node/Dockerfile
runtimes/node/self-test.js
runtimes/node/self-test.sh
runtimes/node-ci-cd/Dockerfile
runtimes/node-ci-cd/self-test.js
runtimes/node-ci-cd/self-test.sh
runtimes/python/Dockerfile
runtimes/python/self-test.py
runtimes/python/self-test.sh
Each runtime we’re building images for will have a Dockerfile that defines how to build the image, and a self-test.sh file that will be used to verify that the built image works as we expect. Let’s have a look at the Dockerfile for the node runtime:
ARG RUNTIME
ARG VERSION
FROM node:${VERSION}
# Try and minimise active vulnerabilities by updating all OS packages
RUN apt-get update && \
apt-get dist-upgrade --yes && \
rm -rf /var/lib/apt/lists/*
COPY self-test.js /usr/local/bin/self-test.js
COPY self-test.sh /usr/local/bin/self-test
We’ll support teams developing against multiple runtime versions by varying the base image with the VERSION build argument. For example, this could be 18 or 20 for the Node LTS versions. If we had to vary the setup significantly by version, we could use shell conditionals in the Dockerfile’s RUN instructions, call different scripts, or maintain separate Dockerfiles per version.
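If we ever did need version-specific steps, a build argument can drive a shell conditional inside a RUN instruction. Here’s a minimal sketch; the Node 18 branch is purely hypothetical:

ARG VERSION
FROM node:${VERSION}

# Build arguments declared before FROM must be re-declared to be visible in later instructions
ARG VERSION

# Hypothetical version-specific step
RUN if [ "${VERSION}" = "18" ]; then \
      echo "Running Node 18 specific setup"; \
    fi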
The repo linked above (shersoft-ltd/evergreen-ci-and-base-images) contains full GitHub Workflow and CodePipeline / CodeBuild examples. Let’s look at the GitHub version first. We’ll define a workflow that runs for pull requests and for merges to the default branch, and that also refreshes the images each day:
name: 'Build, verify and publish'
on:
push:
branches:
- main
pull_request:
schedule:
- cron: '25 6 * * *'
The first job will build the images and publish them with a draft-* tag. This lets us try them out in later stages, and also pull them down locally if required; we’ll see an example of that after the job definition.
jobs:
build-images:
runs-on: ubuntu-latest
timeout-minutes: 5
# We're going to authenticate with AWS, so we'll need an OIDC token
permissions:
contents: read
id-token: write
# Here's where we can define which runtime(s) and version(s) we're using
strategy:
matrix:
image:
- runtime: node
version: 18
- runtime: node
version: 20
- runtime: node-ci-cd
version: 18
- runtime: node-ci-cd
version: 20
- runtime: python
version: 3.11
- runtime: python
version: 3.12
fail-fast: false
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Authenticate with AWS
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::$:role/github-actions
aws-region: eu-west-1
- name: Login to Amazon ECR
id: login_to_ecr
uses: aws-actions/amazon-ecr-login@v2
# We build images for multiple platforms, and use QEMU for platforms
# that we're not running on
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build, tag, and push draft image to Amazon ECR
env:
          REGISTRY: ${{ steps.login_to_ecr.outputs.registry }}
          RUNTIME: ${{ matrix.image.runtime }}
          VERSION: ${{ matrix.image.version }}
        working-directory: runtimes/${{ matrix.image.runtime }}
run: |
          docker buildx build \
            --platform linux/amd64,linux/arm64 \
            --tag ${REGISTRY}/${RUNTIME}:draft-${VERSION}-${{ github.sha }} \
            --build-arg RUNTIME=${RUNTIME} \
            --build-arg VERSION=${VERSION} \
            --push \
            .
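Because the draft tags land in ECR, anyone with access to the registry can pull a draft image down to debug a failing build. A rough example, assuming you have AWS credentials configured locally; the account ID, version and commit SHA are placeholders:

# Authenticate the local Docker client with ECR
aws ecr get-login-password --region eu-west-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.eu-west-1.amazonaws.com

# Pull the draft image produced by the build job and poke around inside it
docker pull 123456789012.dkr.ecr.eu-west-1.amazonaws.com/node:draft-20-<commit-sha>
docker run --rm -it 123456789012.dkr.ecr.eu-west-1.amazonaws.com/node:draft-20-<commit-sha> bash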
The next job will pull down the image we just built and run the self-test script that the Dockerfile copies into the image. The self-test script could do things like (a sketch follows the list):
- Ensure certain language or runtime features are available;
- Install a package to verify the package manager is working;
- Check the runtime version that’s installed, and where it’s installed;
- Check the package manager version that’s installed, and where it’s installed.
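For the node runtime, self-test.sh might be as simple as the sketch below: it checks that the runtime and package manager are on the PATH, reports their versions, and then runs the accompanying self-test.js that the Dockerfile copied into the image. Treat it as a starting point rather than a definitive set of checks.

#!/usr/bin/env sh
set -eu

# Report the runtime version and where it's installed
node --version
which node

# Confirm the package manager works
npm --version
which npm

# Exercise the language features we care about via a small script
node /usr/local/bin/self-test.js

With the self-test script in place, here’s the verification job: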
verify-images:
runs-on: ubuntu-latest
timeout-minutes: 15
needs:
- build-images
permissions:
contents: read
id-token: write
strategy:
matrix:
image:
- runtime: node
version: 18
- runtime: node
version: 20
- runtime: node-ci-cd
version: 18
- runtime: node-ci-cd
version: 20
- runtime: python
version: 3.11
- runtime: python
version: 3.12
fail-fast: false
steps:
- name: Authenticate with AWS
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::$:role/github-actions
aws-region: eu-west-1
- name: Login to Amazon ECR
id: login_to_ecr
uses: aws-actions/amazon-ecr-login@v2
      # The 'self-test' script is installed as a command by the `Dockerfile`
- name: Run verification script
env:
          REGISTRY: ${{ steps.login_to_ecr.outputs.registry }}
          RUNTIME: ${{ matrix.image.runtime }}
          VERSION: ${{ matrix.image.version }}
        run: docker run --entrypoint self-test ${REGISTRY}/${RUNTIME}:draft-${VERSION}-${{ github.sha }}
The final job then tags the image as the final version, and cleans up the draft-* tag.
push-images:
runs-on: ubuntu-latest
timeout-minutes: 5
needs:
- verify-images
permissions:
contents: read
id-token: write
strategy:
matrix:
image:
- runtime: node
version: 18
- runtime: node
version: 20
- runtime: node-ci-cd
version: 18
- runtime: node-ci-cd
version: 20
- runtime: python
version: 3.11
- runtime: python
version: 3.12
fail-fast: false
if: (github.event_name == 'push' && github.ref == 'refs/heads/main') || github.event_name == 'schedule'
steps:
- name: Authenticate with AWS
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::$:role/github-actions
aws-region: eu-west-1
- name: Login to Amazon ECR
id: login_to_ecr
uses: aws-actions/amazon-ecr-login@v2
# We can use this command to avoid pulling, tagging, and pushing the image
- name: Publish Docker image without draft prefix
env:
          REGISTRY: ${{ steps.login_to_ecr.outputs.registry }}
          RUNTIME: ${{ matrix.image.runtime }}
          VERSION: ${{ matrix.image.version }}
        run: docker buildx imagetools create --tag ${REGISTRY}/${RUNTIME}:${VERSION} ${REGISTRY}/${RUNTIME}:draft-${VERSION}-${{ github.sha }}
# Now we use ECR's API via the AWS CLI to clean up the tag
- name: Remove draft tag
env:
          REGISTRY: ${{ steps.login_to_ecr.outputs.registry }}
          RUNTIME: ${{ matrix.image.runtime }}
          VERSION: ${{ matrix.image.version }}
run: |
          aws ecr \
            batch-delete-image \
            --repository-name ${RUNTIME} \
            --image-ids imageTag=draft-${VERSION}-${{ github.sha }}
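After the workflow has run on the default branch, it’s worth checking that the published tag really is a multi-architecture image. docker buildx imagetools inspect will list the per-platform manifests; the registry URL below is a placeholder:

# Both linux/amd64 and linux/arm64 should appear in the output
docker buildx imagetools inspect 123456789012.dkr.ecr.eu-west-1.amazonaws.com/node:20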
The CodeBuild version is considerably more involved, as we’ll build each platform (e.g. ARM / X86) as a separate job. This technique is adapted from the AWS blog post Creating multi-architecture Docker images to support Graviton2 using AWS CodeBuild and AWS CodePipeline.
Let’s start with the build job:
version: 0.2
batch:
fast-fail: false
build-graph:
# This is very similar to the matrix we provided to GitHub Actions, even if
# it's a bit more verbose. We define each of the runtime(s) and version(s)
# to build.
- identifier: node_18_arm
env:
type: ARM_CONTAINER
image: aws/codebuild/amazonlinux2-aarch64-standard:3.0
variables:
RUNTIME: node
VERSION: "18"
ARCHITECTURE: arm
ignore-failure: false
- identifier: node_18_x86
env:
type: LINUX_CONTAINER
variables:
RUNTIME: node
VERSION: "18"
ARCHITECTURE: x86
ignore-failure: false
- identifier: node_20_arm
env:
type: ARM_CONTAINER
image: aws/codebuild/amazonlinux2-aarch64-standard:3.0
variables:
RUNTIME: node
VERSION: "20"
ARCHITECTURE: arm
ignore-failure: false
- identifier: node_20_x86
env:
type: LINUX_CONTAINER
variables:
RUNTIME: node
VERSION: "20"
ARCHITECTURE: x86
ignore-failure: false
- identifier: node_ci_cd_18_arm
env:
type: ARM_CONTAINER
image: aws/codebuild/amazonlinux2-aarch64-standard:3.0
variables:
RUNTIME: node-ci-cd
VERSION: "18"
ARCHITECTURE: arm
ignore-failure: false
- identifier: node_ci_cd_18_x86
env:
type: LINUX_CONTAINER
variables:
RUNTIME: node-ci-cd
VERSION: "18"
ARCHITECTURE: x86
ignore-failure: false
- identifier: node_ci_cd_20_arm
env:
type: ARM_CONTAINER
image: aws/codebuild/amazonlinux2-aarch64-standard:3.0
variables:
RUNTIME: node-ci-cd
VERSION: "20"
ARCHITECTURE: arm
ignore-failure: false
- identifier: node_ci_cd_20_x86
env:
type: LINUX_CONTAINER
variables:
RUNTIME: node-ci-cd
VERSION: "20"
ARCHITECTURE: x86
ignore-failure: false
- identifier: python_3_11_arm
env:
type: ARM_CONTAINER
image: aws/codebuild/amazonlinux2-aarch64-standard:3.0
variables:
RUNTIME: python
VERSION: "3.11"
ARCHITECTURE: arm
ignore-failure: false
- identifier: python_3_11_x86
env:
type: LINUX_CONTAINER
variables:
RUNTIME: python
VERSION: "3.11"
ARCHITECTURE: x86
ignore-failure: false
- identifier: python_3_12_arm
env:
type: ARM_CONTAINER
image: aws/codebuild/amazonlinux2-aarch64-standard:3.0
variables:
RUNTIME: python
VERSION: "3.12"
ARCHITECTURE: arm
ignore-failure: false
- identifier: python_3_12_x86
env:
type: LINUX_CONTAINER
variables:
RUNTIME: python
VERSION: "3.12"
ARCHITECTURE: x86
ignore-failure: false
phases:
pre_build:
commands:
      - echo Log in to Amazon ECR
- aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin $REGISTRY
build:
commands:
- echo Check Docker version
- docker version
- echo Move into correct directory
- cd runtimes/$RUNTIME
- echo Build, tag, and push draft image to Amazon ECR
- docker buildx build --tag ${REGISTRY}/${RUNTIME}:draft-${VERSION}-${ARCHITECTURE}-${CODEBUILD_RESOLVED_SOURCE_VERSION} --build-arg RUNTIME=${RUNTIME} --build-arg VERSION=${VERSION} --push .
Once built, we can verify each image. I’ve removed the batch configuration to keep this example shorter:
version: 0.2
batch:
fast-fail: false
# ... snipped - see GitHub repo ...
phases:
pre_build:
commands:
      - echo Log in to Amazon ECR
- aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin $REGISTRY
build:
commands:
- echo Test built image
- docker run --entrypoint self-test ${REGISTRY}/${RUNTIME}:draft-${VERSION}-${ARCHITECTURE}-${CODEBUILD_RESOLVED_SOURCE_VERSION}
Finally, we can push the published versions and clean up the draft tags. We don’t have to repeat this for each architecture, so the batch configuration is slightly less involved.
version: 0.2
batch:
fast-fail: false
build-graph:
- identifier: node_18
env:
variables:
RUNTIME: node
VERSION: "18"
ignore-failure: false
- identifier: node_20
env:
variables:
RUNTIME: node
VERSION: "20"
ignore-failure: false
- identifier: node_ci_cd_18
env:
variables:
RUNTIME: node-ci-cd
VERSION: "18"
ignore-failure: false
- identifier: node_ci_cd_20
env:
variables:
RUNTIME: node-ci-cd
VERSION: "20"
ignore-failure: false
- identifier: python_3_11
env:
variables:
RUNTIME: python
VERSION: "3.11"
ignore-failure: false
- identifier: python_3_12
env:
variables:
RUNTIME: python
VERSION: "3.12"
ignore-failure: false
phases:
pre_build:
commands:
      - echo Log in to Amazon ECR
- aws ecr get-login-password --region $AWS_REGION | docker login --username AWS --password-stdin $REGISTRY
build:
commands:
- echo Publish ARM image
- docker pull ${REGISTRY}/${RUNTIME}:draft-${VERSION}-arm-${CODEBUILD_RESOLVED_SOURCE_VERSION}
- docker tag ${REGISTRY}/${RUNTIME}:draft-${VERSION}-arm-${CODEBUILD_RESOLVED_SOURCE_VERSION} ${REGISTRY}/${RUNTIME}:${VERSION}-arm
- docker push ${REGISTRY}/${RUNTIME}:${VERSION}-arm
- echo Publish X86 image
- docker pull ${REGISTRY}/${RUNTIME}:draft-${VERSION}-x86-${CODEBUILD_RESOLVED_SOURCE_VERSION}
- docker tag ${REGISTRY}/${RUNTIME}:draft-${VERSION}-x86-${CODEBUILD_RESOLVED_SOURCE_VERSION} ${REGISTRY}/${RUNTIME}:${VERSION}-x86
- docker push ${REGISTRY}/${RUNTIME}:${VERSION}-x86
- echo Create multi-arch image
- docker manifest create ${REGISTRY}/${RUNTIME}:${VERSION} ${REGISTRY}/${RUNTIME}:${VERSION}-arm ${REGISTRY}/${RUNTIME}:${VERSION}-x86
- echo Publish image
- docker manifest push ${REGISTRY}/${RUNTIME}:${VERSION}
- echo Delete draft tags
- aws ecr batch-delete-image --repository-name ${RUNTIME} --image-ids imageTag=draft-${VERSION}-arm-${CODEBUILD_RESOLVED_SOURCE_VERSION} imageTag=draft-${VERSION}-x86-${CODEBUILD_RESOLVED_SOURCE_VERSION}
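Once the pipeline has finished, each repository should be left with only the per-architecture and multi-architecture tags; the draft tags are gone. You can confirm this with the AWS CLI, for example:

# List the remaining tags in the 'node' repository
aws ecr describe-images --repository-name node --query 'imageDetails[].imageTags'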
Conclusion
With a little help from a CI/CD system, we can keep a set of Docker images that fit our team’s needs up-to-date and minimise the number of active vulnerabilities in them. Our service image Dockerfiles can avoid some repetition, as they’ll know that the OS packages are already up-to-date and ready to go.
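For example, a service’s Dockerfile might build straight from the organisation’s base image instead of the public one, skipping the OS package upgrade step entirely. A sketch, with a placeholder registry URL and application entry point:

# Build from the organisation's pre-patched base image rather than the public node image
FROM 123456789012.dkr.ecr.eu-west-1.amazonaws.com/node:20

WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
CMD ["node", "index.js"]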
Check out the GitHub repo (shersoft-ltd/evergreen-ci-and-base-images) to see the full examples in GitHub Actions and CodePipeline/CodeBuild.