Using Python in a Docker image "can" be as simple as:
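For instance, a minimal sketch built on the official image might look like the following. The `python:3.11` tag and the `main.py` script are placeholders chosen for illustration, not necessarily what the original listing used.

```dockerfile
# Official Python image; the 3.11 tag is an assumption for this sketch.
FROM python:3.11

WORKDIR /app

# main.py is a hypothetical single-file script.
COPY main.py .

CMD ["python", "main.py"]
```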
And for small scripts, this simple base image usually gets you up and running!
Unfortunately, using the official Python image from Docker can be limiting as your application grows.
You may reach for a slightly more flexible image such as those based on Debian or Alpine, but this still makes some assumptions about how you will use Python and the tools it provides.
You may need to install a system package to compile a Python library that your new application uses.
A great example of this is the library Pillow. Pillow requires at LEAST libjpeg and zlib, plus up to 8 optional system packages for further functionality.
While Pillow, and many other Python packages, claim to bundle the binaries for their related system packages in their wheel variants, this is not always the case (or they don't always work as described).
Instead of relying on other folks to provide functionality for you, we'll build our own Python image from scratch that handles all of our use cases.
A few key points about this image:
We will be making use of Layers to reduce build times and image size.
We are NOT focusing on making the slimmest image possible however.
We are NOT assuming 100% lockdown. We will include a few items for debugging.
Let's treat the image like a fresh VM. In my world I use the LTS (long term support) release of Ubuntu, so that's what we'll base our image on.
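A minimal sketch of how such a Dockerfile might open is shown here. It assumes the release is pinned by its codename tag (`ubuntu:jammy`) and that the stage is named `base`.

```dockerfile
# syntax=docker/dockerfile:1

# Ubuntu 22.04 LTS, pinned by codename so that the image does not change
# when a newer LTS is released. "base" is the name later stages build on.
FROM ubuntu:jammy AS base
```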
You may have noticed a line you haven't used before! The line # syntax=docker/dockerfile:1 tells our container-building tool which syntax is valid in this Dockerfile. We'll explicitly set the standard one for Docker; you can read more about BuildKit frontends here.
We'll be using Ubuntu 22.04 (jammy). The ubuntu:latest tag refers to the latest LTS release of Ubuntu, and a specific release can also be referenced by its codename.
We'll then be calling this layer base so we can reference it later in the Dockerfile!
This is the layer where we'll install system dependencies required for Python and your application's Python dependencies to be compiled.
You will only need to update this layer if a Python package you install fails and explicitly asks for a system package to be installed.
A common culprit for updating this layer would be changes to PyTorch.
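As a rough sketch, a system-dependency stage could look like this fragment. The stage name `system-deps` is hypothetical, and the package list is an assumption: it covers what pyenv typically needs to compile CPython on Ubuntu, with `libjpeg-dev` added as an example for Pillow. Adjust it to whatever your own packages ask for.

```dockerfile
# Fragment: continues from the stage named "base" defined earlier.
FROM base AS system-deps

# Compilers and headers for building CPython and compiled Python packages.
# DEBIAN_FRONTEND=noninteractive stops tzdata from prompting during the build.
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        build-essential ca-certificates curl git \
        libbz2-dev libffi-dev libjpeg-dev liblzma-dev libncursesw5-dev \
        libreadline-dev libsqlite3-dev libssl-dev libxml2-dev libxmlsec1-dev \
        tk-dev xz-utils zlib1g-dev \
    && rm -rf /var/lib/apt/lists/*
```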
We'll build and install Pyenv from scratch, then use it to install a stable, supported version of Python and set it as the system's global Python.
After a new global Python has been installed, we'll pip install Poetry.
Again, this step is only redone if we want to change versions of Python or Poetry, so this layer stays constant for long periods of time, reducing build times!
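One way this stage might be written is sketched below. The stage names (`system-deps`, `python-base`), the install location `/opt/pyenv`, and the pinned versions are all assumptions made for illustration; swap in the versions you actually need.

```dockerfile
# Fragment: assumes an earlier stage (here called "system-deps") that
# already has git, a compiler, and the CPython build headers installed.
FROM system-deps AS python-base

# Hypothetical version pins; override with --build-arg.
ARG PYTHON_VERSION=3.11.9
ARG POETRY_VERSION=1.8.3

ENV PYENV_ROOT=/opt/pyenv
ENV PATH=${PYENV_ROOT}/shims:${PYENV_ROOT}/bin:${PATH}

# Clone pyenv, compile the requested Python, and make it the global default.
RUN git clone --depth 1 https://github.com/pyenv/pyenv.git "${PYENV_ROOT}" \
    && pyenv install "${PYTHON_VERSION}" \
    && pyenv global "${PYTHON_VERSION}"

# Install Poetry into the global Python, then refresh pyenv's shims so the
# poetry command is on PATH.
RUN pip install --no-cache-dir "poetry==${POETRY_VERSION}" \
    && pyenv rehash
```

Keeping the versions in `ARG` values means the layer is only rebuilt when one of them changes.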
Installing Your Application's Python Dependencies
Now that we have system packages and Python installed, we'll make a new layer that holds YOUR application's Python dependencies.
These will change far less often than the code you write, so making a reusable layer out of this is a great idea.
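A dependency stage along these lines could be sketched as follows. It assumes your project has a `pyproject.toml` and `poetry.lock` at the root of the build context, that the previous stage is named `python-base`, and that the virtual environment should live inside the project at `/app/.venv`.

```dockerfile
# Fragment: assumes an earlier stage (here called "python-base") that
# provides Python and Poetry.
FROM python-base AS deps

WORKDIR /app

# Create the virtual environment at /app/.venv so it is easy to copy later.
ENV POETRY_VIRTUALENVS_IN_PROJECT=true

# Copy only the dependency manifests so that code changes do not
# invalidate this layer.
COPY pyproject.toml poetry.lock ./

# Install runtime dependencies only, without the project itself.
RUN poetry install --only main --no-root --no-interaction
```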
Great! Now we have a layer with your Python packages ready to go. We'll copy the built packages to the final image.
Extra Credit: Installing PyTorch for CPU Only
If PyTorch is specified in your local development settings, or you also build images that use a GPU, we'll remove it here and install the CPU-only variant instead, so you can still use PyTorch functions in environments without a GPU.
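A hedged sketch of that swap, assuming it runs in the dependency stage right after `poetry install` and that Poetry manages the virtual environment, might be:

```dockerfile
# Fragment: place after the `poetry install` step in the dependency stage.
# Remove whichever torch build Poetry resolved, then install the CPU-only
# wheels from PyTorch's CPU package index.
RUN poetry run pip uninstall -y torch \
    && poetry run pip install --no-cache-dir torch \
        --index-url https://download.pytorch.org/whl/cpu
```

Note that this sidesteps the lock file for `torch`, so you may want to pin an explicit version in the `pip install` line.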
Add Your Code Into The Image
Finally, we've got a layer that has a built version of Python and our system dependencies with NO development dependencies.
We'll first install any system dependencies that are required at runtime and cannot be set aside in another layer.
Then we'll copy over our codebase.
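The final stage might then look something like the fragment below. The stage names, the runtime library list, the `/opt/pyenv` and `/app/.venv` paths, and the `main.py` entry point are all assumptions for this sketch; `curl` is included as one of the debugging items.

```dockerfile
# Fragment: assumes earlier stages named "base", "python-base", and "deps".
FROM base AS runtime

# Shared libraries the compiled Python links against, plus curl for debugging.
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        ca-certificates curl libbz2-1.0 libffi8 liblzma5 libreadline8 \
        libsqlite3-0 libssl3 zlib1g \
    && rm -rf /var/lib/apt/lists/*

# Bring over the compiled Python and the installed packages, leaving the
# compilers and -dev headers behind in the build stages.
ENV PYENV_ROOT=/opt/pyenv
ENV PATH=/app/.venv/bin:${PYENV_ROOT}/shims:${PYENV_ROOT}/bin:${PATH}
COPY --from=python-base /opt/pyenv /opt/pyenv
COPY --from=deps /app/.venv /app/.venv

# Copy the application code last, since it changes most often.
WORKDIR /app
COPY . .

# main.py is a hypothetical entry point.
CMD ["python", "main.py"]
```

If a package was compiled against an extra system library (for example libjpeg for a source build of Pillow), the matching runtime library needs to be added to the `apt-get install` list here.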
Now you've got it! A Python Docker image that can be used for any Python application.
Putting It All Together
Here's the full Dockerfile for you to use! You can find it and an example of how to use it in the GitHub repository.
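The author's own listing lives in the repository. As a stand-in, the sketch below shows how the stages described in this post could be assembled into one file. Every stage name, version pin, path, and package list in it is an assumption, and the optional PyTorch step is left out.

```dockerfile
# syntax=docker/dockerfile:1

# --- Base: Ubuntu 22.04 LTS -------------------------------------------------
FROM ubuntu:jammy AS base

# --- System dependencies needed to compile Python and Python packages -------
FROM base AS system-deps
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        build-essential ca-certificates curl git \
        libbz2-dev libffi-dev libjpeg-dev liblzma-dev libncursesw5-dev \
        libreadline-dev libsqlite3-dev libssl-dev libxml2-dev libxmlsec1-dev \
        tk-dev xz-utils zlib1g-dev \
    && rm -rf /var/lib/apt/lists/*

# --- Pyenv, Python, and Poetry ----------------------------------------------
FROM system-deps AS python-base
ARG PYTHON_VERSION=3.11.9
ARG POETRY_VERSION=1.8.3
ENV PYENV_ROOT=/opt/pyenv
ENV PATH=${PYENV_ROOT}/shims:${PYENV_ROOT}/bin:${PATH}
RUN git clone --depth 1 https://github.com/pyenv/pyenv.git "${PYENV_ROOT}" \
    && pyenv install "${PYTHON_VERSION}" \
    && pyenv global "${PYTHON_VERSION}"
RUN pip install --no-cache-dir "poetry==${POETRY_VERSION}" \
    && pyenv rehash

# --- Application dependencies -----------------------------------------------
FROM python-base AS deps
WORKDIR /app
ENV POETRY_VIRTUALENVS_IN_PROJECT=true
COPY pyproject.toml poetry.lock ./
RUN poetry install --only main --no-root --no-interaction

# --- Final image: runtime libraries, Python, packages, and code -------------
FROM base AS runtime
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        ca-certificates curl libbz2-1.0 libffi8 liblzma5 libreadline8 \
        libsqlite3-0 libssl3 zlib1g \
    && rm -rf /var/lib/apt/lists/*
ENV PYENV_ROOT=/opt/pyenv
ENV PATH=/app/.venv/bin:${PYENV_ROOT}/shims:${PYENV_ROOT}/bin:${PATH}
COPY --from=python-base /opt/pyenv /opt/pyenv
COPY --from=deps /app/.venv /app/.venv
WORKDIR /app
COPY . .
CMD ["python", "main.py"]
```

With this layout, `docker build -t my-app .` builds the final `runtime` stage by default, and `docker build --target deps .` stops at the dependency stage if you want to inspect it.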