Container images

It is my impression that, when working in Kubernetes, we sometimes take for granted that we need container images for everything, that how those images are built is well understood and part of the development process, and that we don't need to worry about it any further. I have recently been involved in a number of discussions that made me think images deserve another look, not just in terms of how you build them, but also how they are scanned and vetted, the local storage they occupy, the means to download them into a cluster, and so forth. So I'll try to describe those aspects in a bit more detail here and provide pointers to where you can dig deeper.

First of all, when building a new container image, you start by declaring the "base" for that image, in other words, another container image that yours is going to be based on. Often, you will pick a base image that is derived from an operating system, in other words, one that contains operating system files and executables that make the container appear as if it were actually running on that operating system. In the early days of building containerized software at IBM, we used a wide variety of base images, the majority probably based on Ubuntu and Alpine. We later switched to Red Hat's Universal Base Image (UBI), which I think of as a redistributable version of RHEL for containers. Doing so has served us very well, I think: for example, UBI appears to introduce far fewer vulnerabilities than other base images, and those that are there are resolved more quickly. Fairly recently, Red Hat added a 'micro' version of that image, which is very useful, of course, if you're looking to keep your image size to a minimum.
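To make that concrete, the base is simply what the first line of your Dockerfile refers to. Here is a minimal sketch using the UBI micro image (the repository path and tag are examples only, and the application binary is made up; check Red Hat's registry for current versions):

```dockerfile
# Minimal sketch: use Red Hat's UBI micro image as the base.
# Repository path and tag are examples only.
FROM registry.access.redhat.com/ubi9/ubi-micro:9.4

# Everything added below this line ends up in layers on top of the base.
COPY my-app /usr/local/bin/my-app
ENTRYPOINT ["/usr/local/bin/my-app"]
```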

A container image consists of multiple layers; in other words, it is not a monolithic artifact. That is important because layers can be shared and reused across multiple separate images. The base I mentioned above is one of those layers, and probably the best candidate to be shared. But with some clever structuring of your Dockerfiles (each separate command in the Dockerfile leads to a new layer, so concatenating commands will reduce the number of layers), or by creating a base image that adds more content to the base OS image (for example, adding a JRE, or Python, or ...), you can increase the level of reuse further. Another consideration is the use of multi-stage builds, even though those do not benefit all images. There are plenty of blogs and articles about best practices for container image size optimization that will also help you.
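As a hedged sketch of both of those techniques (package names, paths, and registry references are made up for illustration), concatenating commands and using a multi-stage build might look like this:

```dockerfile
# Stage 1: build tools live only in this stage and never reach the final image.
FROM registry.access.redhat.com/ubi9/ubi:9.4 AS build
COPY . /src
# Concatenating commands with '&&' produces a single layer instead of three.
RUN dnf install -y golang && \
    cd /src && go build -o /src/my-app . && \
    dnf clean all

# Stage 2: the final image contains only the minimal base plus the compiled binary.
FROM registry.access.redhat.com/ubi9/ubi-micro:9.4
COPY --from=build /src/my-app /usr/local/bin/my-app
ENTRYPOINT ["/usr/local/bin/my-app"]
```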

I found "dive" to be a great tool for analyzing and inspecting container images. You can use it to take a look at some of the internals of not only images you may have built, but also any other image you use. It shows you the layers within your image, the Dockerfile commands that were used to build them, as well as the content of that layer. it also gives an "efficiency score" that lets you determine if you have a lot of not needed content in the image. Below is a sample screenshot showing a dive report for the Bitnami redis image. The efficiency score is 93%.


You can also (ab)use container images as a way to store data in a portable way. For example, you can pre-load an entire database into your image, so that every container that is started from that image will have the database readily available and won't have to load it from somewhere else. The downside of that approach is that the size of the image will increase accordingly.
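As a hedged sketch of that idea (the file names and paths are made up), it can be as simple as copying pre-built data files into the image:

```dockerfile
# Sketch only: bake pre-loaded data into an image.
FROM registry.access.redhat.com/ubi9/ubi-micro:9.4
# The database file was created and populated at build time, outside the image;
# every container started from this image has the data available immediately,
# at the cost of a correspondingly larger image.
COPY ./prebuilt/catalog.sqlite /data/catalog.sqlite
```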

So why do I obsess about image size so much? Because, for starters, you have to download images into your cluster! This can happen on demand, where images are pulled onto a worker node when needed (and according to the image pull policy), or you can 'pre-pull' images, even though I have to admit that I have never done that.
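For reference, the pull policy is set per container in the pod spec; a minimal, made-up fragment:

```yaml
# With IfNotPresent, the kubelet only pulls the image if it is not already
# cached on the node; registry and image name are made up for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
    - name: web
      image: registry.example.internal/team/web:1.4.2
      imagePullPolicy: IfNotPresent
```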
Most IT organizations I work with will not pull images from a public registry, as a matter of principle. Instead, they will mirror images into a local registry, so that they can scan, vet, and test them before making them available to their organization. That means the download happens outside of the cluster, and the pull of an image from a local registry will hopefully always be faster than pulling from a public registry.
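One way to do that mirroring (just one option among several; the internal registry name is made up) is skopeo, which can copy an image between registries without needing a local container runtime:

```shell
# Copy an image from a public registry into an internal, vetted registry.
skopeo copy \
  docker://docker.io/bitnami/redis:7.2 \
  docker://registry.example.internal/mirror/bitnami/redis:7.2
```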

Secondly, images use up disk space on your worker nodes. I have seen cases where the disks attached to worker nodes were relatively small (100 GB seems to be common) and over 60% of the available space was used by images. Pruning helps, of course, and I have seen (OpenShift) clusters where disk pressure on certain nodes was relieved by pods being moved to other nodes and the images being removed, automatically!
It definitely makes sense to keep an eye out for disk warnings and to log into your worker nodes to see how the available disk space is being used. Since OpenShift uses CRI-O as the container runtime, we typically use crictl commands for that. I'll admit that there are many factors playing into disk space usage, but image size is one that can play a big role.
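A few commands I find useful for that (run on the node itself; exact flags may vary with your crictl version):

```shell
# list the images currently cached on this node, with their sizes
crictl images

# show how much space the image filesystem is using
crictl imagefsinfo

# remove images that are not referenced by any running container
crictl rmi --prune
```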

I mentioned above that many organizations keep images in a local registry. And I think that makes perfect sense, if only because it gives you a chance to run vulnerability scans on those images. I wrote a separate blog post on that topic, so I'll just refer to that here and not repeat myself. It was written before the log4j debacle happened, but that only made this topic more important.

Another security-related aspect is image signing. The process is straightforward, even though there are differing implementations of (a) signing the image, i.e. adding a digital signature to it, and (b) verifying that the image hasn't been tampered with by validating that signature. For software vendors offering container-based software, like us, it is certainly a good idea to sign images. At IBM, we have a central signing service that is run by our own CISO team, and we sign all of our images.
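I won't describe our internal service here, but to illustrate the two steps, this is roughly what they look like with cosign, one popular open source implementation (key file names and the image reference are made up):

```shell
# (a) sign the image: a signature is attached and stored alongside it in the registry
cosign sign --key cosign.key registry.example.internal/team/web:1.4.2

# (b) verify the signature before (or while) deploying the image
cosign verify --key cosign.pub registry.example.internal/team/web:1.4.2
```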

Finally, we use digests for all our images instead of tags, simply to ensure that each version of an image has a unique identity. Tags can easily be reapplied and changed, in other words, there is no guarantee that two images with the same tag are actually the same image. A digest fixes that. And by the way, never use "latest", since you never know what you are going to get when asking for it.
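To make the difference concrete, here is a made-up example of the two styles of image reference in a pod spec; only the digest form is guaranteed to always resolve to the same image:

```yaml
# Registry path, tag, and digest value are made up for illustration.
containers:
  - name: web
    # by tag: can silently change if someone re-pushes the tag
    # image: registry.example.internal/team/web:1.4.2
    # by digest: an immutable reference to exactly one image manifest
    image: registry.example.internal/team/web@sha256:6c3c624b58dbbcd3c0dd82b4c53f04194d1247c6eebdaab7c610cf7d66709b3b
```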

In summary, images need to be built with small size in mind, they need to be signed and referenced by digest as a security measure, and you need an automated mechanism for continuously scanning images for vulnerabilities, paired with the ability to patch your images and redeploy your containers accordingly.

PS: A colleague and I wrote an article about container images almost six years ago; it is a bit dated, but I think it is still accurate, so you may want to check that out, too, if you're interested.

