Shipping Your Computer

The analogy between intermodal containers and software containers sounds like a cliché, but it works well enough that we will keep using it in this article. Let us recap: before the standardization around intermodal containers after the Second World War, freight had to be manually loaded and unloaded from boats to trains to trucks, over and over again. Before the rise of software containers in the 2010s, software had to be manually loaded and unloaded from developer laptops to test environments to cloud systems, over and over again.

Yes, the analogy works. Even the 2021 Suez Canal obstruction can likely be felt whenever Kubernetes complains of not finding a suitable node to schedule your pod. Le sigh.

In the final scene of the 1989 buddy cop comedy “Lethal Weapon 2”, one of the “bad guys” is (finally!) crushed by the fall of an intermodal container, triggered by a heavily wounded Martin Riggs (played by Mel Gibson). Similarly, developers worldwide have been hit in the head by an artifact of similar virtual mass and devastating effects.

Software containers, still colloquially known as “Docker containers” and shyly introduced over a decade ago, ended up being a surprisingly useful tool, and became one of those concepts everyone in the IT community agreed to be a good idea more or less simultaneously. Other examples of such tacit agreements are, in no particular order, the Web, Agile methodologies, Open Source, Git, and the IEEE 754 standard. The jury is still out and debating about Worker Unions, but we can guess what the final verdict will be.

Interestingly, containers fit perfectly well with Web technologies, agile teams, open source, and Git repositories. Think about it. Most web applications nowadays are built in pipelines stored next to the Git repositories hosting their source code, built using open source tooling, by agile teams having their daily standup meeting every morning at 9 AM. The circle is closing.

Becoming a standard meant quite a bit of lobbying and back and forth, but for the past decade or so containers have become the daily bread and butter of countless teams around the world.

Of course not all apps could be containerized; in particular, those that did not follow the “Twelve-Factor App” principles had a very hard time being put into such constrained environments, and this triggered yet another massive rewriting of the world.

And this is where an interesting thing happened: containers provoked a schism among programming languages. On one side, scripting languages; on the other, compiled ones.

The former group yielded gargantuan containers, including gigabytes of runtime libraries and dependencies tightly packed together, albeit with ridiculously fast container build times; Python, Ruby, JavaScript, and PHP are the most remarkable members of this category. On the other side we find languages that provide smaller containers at the cost of longer build times, with Java (most commonly through the use of Quarkus or Spring Boot), .NET, Rust, and even the venerable C++ language representing this category.

(Well, arguably there is a third group, occupied solely by the Go programming language, able to provide smaller containers in absurdly short build times. But regular readers of this magazine know that we consider Go to be a triumph of engineering, anyway.)

This split of functionality in programming languages provided a definitive answer to the question “what programming language should I use for my project?”. If all else fails, choose one that will yield developer productivity and short build times and small containers for production. Finding this trifecta has been the holy grail of web application development for the past decade. It also brought capacity management to the masses: Kubernetes lets you know in no uncertain terms the amount of memory and CPU your nodes have, so you had better not schedule too many containers at once.

Even better, try to make them smaller, and get more container for the buck.
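To put a number on “more container for the buck”, here is a minimal sketch of the arithmetic involved. It assumes that “smaller” means smaller CPU and memory requests (image size on disk is a separate matter), and the `Resources` type, the `max_replicas` function, and all figures are hypothetical. The real Kubernetes scheduler weighs far more than this single division.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu_millicores: int  # 1000 millicores = 1 CPU core
    memory_mib: int


def max_replicas(node: Resources, container: Resources) -> int:
    """Return how many identical containers fit on one node by requests alone."""
    if container.cpu_millicores <= 0 or container.memory_mib <= 0:
        raise ValueError("container requests must be positive")
    by_cpu = node.cpu_millicores // container.cpu_millicores
    by_memory = node.memory_mib // container.memory_mib
    # The scarcer resource decides how many copies fit.
    return min(by_cpu, by_memory)


# Hypothetical node with 4 CPU cores and 16 GiB of allocatable memory.
node = Resources(cpu_millicores=4000, memory_mib=16384)

# Two hypothetical containers: a heavyweight one and a slimmed-down one.
big = Resources(cpu_millicores=500, memory_mib=2048)
small = Resources(cpu_millicores=250, memory_mib=256)

print(max_replicas(node, big))    # 8
print(max_replicas(node, small))  # 16
```

With these made-up numbers, the slimmer container doubles the count per node, and CPU rather than memory becomes the limiting factor.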

Once you reach the sweet spot of cost and size and speed (choose three), containers can be used for a myriad of purposes beyond the mere packaging of web applications: for example, for testing; as a mechanism preventing DLL Hell; to share command-line tools between team members and your CI/CD infrastructure; to quickly deploy and run artificial intelligence models; and even for educational purposes.

But the most important transformation provided by containers was not just the discrimination of programming languages, but rather something even deeper:

Over time it became clear that the benefits of containerization go beyond merely enabling higher levels of utilization. Containerization transforms the data center from being machine-oriented to being application-oriented.

(Burns, Brendan, Brian Grant, David Oppenheimer, Eric Brewer, and John Wilkes. “Borg, Omega, and Kubernetes.” Communications of the ACM 59, no. 5 (2016): 50–57. https://doi.org/10.1145/2890784.)

Let us be crazy: we can argue that containers also provided yet another way for Unix to find revenge. Because honestly, Windows containers are a joke, and now you can even build and manage OCI containers from macOS. Containers, since the times of chroot and FreeBSD jails, have been a Unix affair, one that Linux took to a whole new level.

But there is another contribution of container images: secret leakage. Remember when everyone was urging developers to stop committing secrets into Git repos? Well, turns out they started bundling them inside container images, and even better, pushing them to Docker Hub.

In this paper, we analyze 337,171 images from Docker Hub and 8,076 other private registries unveiling that 8.5 % of images indeed include secrets. Specifically, we find 52,107 private keys and 3,158 leaked API secrets, both opening a large attack surface, i.e., putting authentication and confidentiality of privacy-sensitive data at stake and even allow active attacks.

(Dahlmanns, Markus, Constantin Sander, Robin Decker, and Klaus Wehrle. “Secrets Revealed in Container Images: An Internet-Wide Study on Occurrence and Impact.” Proceedings of the ACM Asia Conference on Computer and Communications Security, July 10, 2023, 797–811. https://doi.org/10.1145/3579856.3590329.)
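For the curious, a check for the most blatant of these leaks can be sketched in a few lines. The snippet below is not the method used in the paper; it is a simplified illustration that assumes the image filesystem has already been unpacked into a local directory, and it only looks for PEM private key headers. The function name `find_private_keys` is ours.

```python
import re
from pathlib import Path

# PEM header that marks the start of a private key, for example
# "-----BEGIN RSA PRIVATE KEY-----" or "-----BEGIN PRIVATE KEY-----".
PRIVATE_KEY_HEADER = re.compile(rb"-----BEGIN (?:[A-Z0-9]+ )*PRIVATE KEY-----")


def find_private_keys(root: Path) -> list[Path]:
    """Return files under root whose contents include a PEM private key header."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            data = path.read_bytes()
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        if PRIVATE_KEY_HEADER.search(data):
            hits.append(path)
    return hits
```

Calling `find_private_keys(Path("rootfs"))`, where `rootfs` is a hypothetical directory holding the unpacked image, returns the offending paths. API tokens and passwords need other patterns, which dedicated secret scanners cover much better than this sketch.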

Insert facepalm emoji here. In my own personal experience, I have to say that I see more and more teams adopting secure container and Kubernetes practices, including the use of ad-hoc secret management databases:

By default, Kubernetes stores secret data as plaintext in ’etcd’. In that case, if a malicious user gets access to ’etcd’, then the malicious user can retrieve sensitive information, such as database user names, passwords, and queries. Although Kubernetes does encrypt ’etcd’, the key for the encryption is stored as plaintext in the config file in the master node. For that reason, practitioners recommend using secret management tools for additional security, such as ‘Vault’ for encryption.

(Islam Shamim, Md. Shazibul, Farzana Ahamed Bhuiyan, and Akond Rahman. “XI Commandments of Kubernetes Security: A Systematization of Knowledge Related to Kubernetes Security Practices.” 2020 IEEE Secure Development (SecDev), IEEE, September 2020, 58–64. https://doi.org/10.1109/SecDev45635.2020.00025.)

Insert sigh of relief emoji here.
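A related trap deserves a short illustration: the values in the `data` field of a Kubernetes Secret manifest are base64-encoded, and base64 is an encoding, not encryption. The sketch below uses made-up credentials and a hypothetical `decode_secret_data` helper to show that anyone who can read the manifest can read the secret.

```python
import base64

# Hypothetical "data" section of a Secret manifest; values are base64-encoded.
secret_data = {
    "username": "YWRtaW4=",
    "password": "czNjcjN0",
}


def decode_secret_data(data: dict[str, str]) -> dict[str, str]:
    """Decode every base64 value; no key or password is needed to do so."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in data.items()
    }


print(decode_secret_data(secret_data))
# {'username': 'admin', 'password': 's3cr3t'}
```

Hence the appeal of the dedicated secret management tools mentioned in the quote above.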

In an article published six years ago in our issue dedicated to DevOps, which came out right before the COVID pandemic broke out, we said that:

Applications became small Docker containers, lightweight virtual machines that can start and stop in a heartbeat. The programming language does not matter anymore. The IDE does not matter anymore. The framework does not matter anymore. The build tool does not matter anymore.

Through containers, the work of software developers reaches as far as possible into the realm of operation teams; the design of their containers becoming a form of non-trivial calisthenics, as we mentioned in another article:

Different software developers have to deal with different constraints. For example, DevOps engineers busy stuffing code inside a Docker container might not realize the amount of design they have to deal with at any step of the way: should they base their work on alpine:latest or scratch? Can they? Should they include all libraries, or can they strip a few to slash some megabytes? What programming languages would be better suited to make this container run faster and smaller?

A very well-known Internet meme, inspired by a scene from the 2004 movie “Finding Neverland”, features a sequence of images in which a very young Freddie Highmore tearfully admits that “it works on my machine”, to which Johnny Depp replies “then we’ll ship your machine”, with the last caption stating “and that is how Docker was born”.

We do not need to say anything else at this point.

Cover photo by David Trinks on Unsplash.
