Docker packaging is complicated—and you’re about to ship to production
You’re about to ship your Python application into production using Docker: your images are going to be critical infrastructure.
And that means you need to follow best practices, otherwise you risk:
- Slow builds leaving you twiddling your thumbs.
- Unidentifiable images impeding your attempts to reproduce production bugs.
- Security bugs allowing attackers to take over your servers.
How can you ensure your Docker image packaging is doing the right thing? How can you follow best practices for security, speed, reproducibility, and ease of debugging?
Doing your own research: difficult and time-consuming
Unfortunately, while the information you need is out there on the Internet, researching best practices on your own is slow, tricky work.
- Plenty of advice is from people who don’t really know what they’re doing: Follow the wrong answer from StackOverflow and your private repository’s password will be available to anyone who gets access to your image.
- Plenty of advice is obsolete and out-of-date: Advice about small images from 2017 doesn’t necessarily apply any more.
- Best practices are different for Python than for other languages: You don’t want to use Alpine Linux, even if it’s fine for Go.
Elsewhere on my site you’ll find my guide to production Docker packaging, which solves some of these problems. But if you’ve read it you’ll have noticed there are quite a lot of different things to get right, and the guide doesn’t even cover all the relevant best practices.
Which leads us to the next problem.
Too many details, too many best practices
Applying Docker packaging correctly involves:
- Shell scripting,
- the `Dockerfile` format,
- Docker’s build caching,
- signal handling,
- Python packaging,
- Git,
- CI,
- security updates,
- and on and on.
And all these details interact with each other, sometimes in unexpected ways. For example, build caching can prevent your security updates from happening, because a cached layer that installs system packages gets reused instead of re-run.
So where do you get started? What do you do next? What’s required and what is optional?
From zero to best practices
You want to deploy your Python application to production with confidence, knowing the Docker image follows best practices. You want secure images, fast builds, and operational correctness.
And you want to be guided through the process by an expert who knows what they’re doing.
That’s where the Python on Docker Production Quickstart comes in. Based on real-world experience, more than a year’s worth of research, and students’ questions when teaching live classes, the quickstart:
- Focuses specifically on Python running in production. You’re not writing Go, you’re writing Python–and that makes a big difference.
- Covers over 60 best practices. This includes 25 best practices you won’t find in the free content elsewhere on this site.
- Provides a step-by-step iterative implementation plan. Given that many best practices, it’s easy to get overwhelmed. So instead of trying to figure out where to start on your own, you can follow the included plan.
- Won’t waste your time. It’s a concise summary, ~80 pages of to-the-point information, with examples and references to additional information.
“I’ve been using [it] a lot over the last two weeks and it’s been absolutely fantastic.
Learning about multi-stage builds has meant I can combine a Django/React app into one Dockerfile. This also lead to me discovering compose build sections can target a named stage so now I’ve got local dev using the same Dockerfile, which is great.”
– George Hickman
The included best practices cover the following areas:
- Security.
- Running in CI.
- Easier debugging.
- Operational correctness.
- Reproducible builds.
- Faster builds.
- Smaller images, including multi-stage builds.
- Application-specific best practices.
“We applied the secrets handling and multi-stage builds from the Quickstart. Having said that, everything in it was an eye opener.”
— Gregory Ronin, Software Engineer at Deloitte Digital
“It’s super valuable. I now have a big list of TODOs to update our build scripts and Dockerfiles.”
– Eric Pederson
| Python on Docker Production Quickstart | Ship faster with the Template bundle |
|---|---|

Want the quickstart for your whole team? 10-person team edition for $249.
Note: I prefer not to take money from teams working on projects for the military, prisons, police, fossil fuel extraction, surveillance, national security, or the like. If that applies to you, please don’t purchase this product.
Quickstart changelog
June 17, 2020
More best practices:
- Various BuildKit features that help speed up builds.
- Dropping capabilities.
- Avoiding listening on ports < 1024.
- Running a different command altogether based on command-line arguments.
- Bytecode compilation.
- Additional image size optimization in multi-stage builds.
- Warming up the build cache for per-branch builds.
- Requirements for running on Heroku and Google Cloud Run.
- BuildKit ssh-agent forwarding.
- BuildKit secrets when using Docker Compose.
Other tweaks and improvements throughout the text.
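As a rough illustration of the "ports < 1024" item above, here is a minimal Python sketch. It is not taken from the Quickstart. It assumes the port is supplied through a `PORT` environment variable (the convention Heroku and Google Cloud Run use), and the helper name `choose_port` and the default of 8000 are hypothetical. On Linux, binding to ports below 1024 normally requires extra privileges, so the sketch refuses them.

```python
import os


def choose_port(environ=os.environ, default=8000):
    """Pick the port to listen on, refusing privileged ports.

    PORT is assumed to be set by the hosting platform; the default of
    8000 is an arbitrary choice for local runs.
    """
    port = int(environ.get("PORT", default))
    if port < 1024:
        # Ports below 1024 usually need root or extra capabilities.
        raise ValueError(f"refusing privileged port {port}; use 1024 or higher")
    return port


if __name__ == "__main__":
    print(f"Would listen on port {choose_port()}")
```

Passing the environment in as a parameter keeps the function easy to test without touching the real process environment.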
June 8, 2020
The Checklist has been renamed, and is now known as a Quickstart. To help make that change:
- Added a new introductory chapter with a plan to help you figure out which best practices to implement when.
- Tweaked the chapter structure.
Additionally:
- Added a best practice on finding large layers.
- Split off `init` into its own best practice.
- Explained the goal of responding to health checks quickly more broadly, rather than in specific implementation terms of process/thread pool.
- Added more nuance to the section on updating dependencies once a month.
- Restored the Pipenv instructions, since it’s now being maintained again.
- Made the `DEBIAN_FRONTEND` best practice standalone.
June 2, 2020
Added multiple new best practices:
- Additional tips on improving build caching.
- A better default tag.
- For better reproducibility, you can create a custom base image.
- Size checks for images.
- Security scanners.
Also updated existing best practices:
- You can make a build arg available at runtime by using `ENV`.
- For build secrets another alternative is short-term keys.
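To make the build-arg item concrete, here is a minimal sketch of the runtime side in Python. It assumes the `Dockerfile` declares a build arg and copies it into the environment, as shown in the comment. The name `GIT_COMMIT` and the helper `build_metadata` are hypothetical, chosen only for illustration.

```python
import os


def build_metadata(environ=os.environ):
    """Return build-time values that the image exposes at runtime.

    GIT_COMMIT is only visible here if the Dockerfile copies the build
    arg into the environment, for example:

        ARG GIT_COMMIT
        ENV GIT_COMMIT=$GIT_COMMIT
    """
    return {"git_commit": environ.get("GIT_COMMIT", "unknown")}


if __name__ == "__main__":
    print(build_metadata())
```

Without the `ENV` line the value exists only during the build, so the application would fall back to `"unknown"`.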
April 27, 2020
- Added new best practice on timely security updates via automatic notifications.
- Switched some examples from shell session transcripts to `Dockerfile` or shell script.
- Noted `docker build` support for targeting named stages earlier in the checklist.
April 1, 2020
- Documented two-stage install with Poetry 1.0+.
- Added link to `docker-autoheal`.
February 24, 2020
Added many more examples:
- Configuring `logging`.
- A smoke test.
- Passing in secrets with BuildKit.
- `.dockerignore` file.
- System package upgrade script for CentOS/RHEL.
- `Dockerfile` healthcheck.
- And a few more expanded examples here and there.
About me
Hi, I’m Itamar Turner-Trauring.
I’ve been writing Python since 1999, and I first started using Docker in 2014, when I was part of a team that wrote one of the first distributed storage backends for Docker.
I’ve since built Telepresence, a remote development tool for Kubernetes that was adopted as a Cloud Native Computing Foundation sandbox project and has more than 2000 stars on GitHub. I’ve also deployed a number of production Python applications as Docker images.
Over the past year I’ve been researching Docker packaging for Python in production, resulting in a free guide elsewhere on this site, live training classes, an introductory book, a template implementing these best practices, and of course this quickstart.