To continue the series and follow up on Part I, which covered deployment and container orchestration, here we dive deeper into container security. Containers offer significant advantages over traditional virtual machines and bare metal servers. Unlike VMs, containers utilize operating-system-level virtualization and share the host kernel, resulting in smaller sizes, faster deployment, and efficient resource management. However, container security should be a top priority.
Introduction to Container Technology
Containers have become synonymous with application development. But how did we get to this point and what exactly does a container offer that traditional bare metal servers and virtual machines were lacking?
Traditional Virtual Machines versus Containers
The two most prominent mechanisms for virtualization are traditional virtual machines (VMs) and containers. Both of these technologies allow us to maximize the utility of a physical server by sharing compute resources. However, the way in which VMs and containers share these resources is quite different:
- VMs use a hypervisor to share and manage hardware. The hypervisor is a software or firmware layer that exists outside the guest operating system (OS) and is responsible for handling commands sent to the physical hardware. Its purpose is to allow multiple "guest" operating systems to run on the same physical machine.
- VMs contain a complete operating system that can be up to several GB in size.
- Each VM contains its own kernel, creating a very isolated OS with little to no shared resources.
- Containers can be considered operating-system-level virtualization, as they do not utilize a hypervisor.
- Containers are bound to the host operating system and share the kernel with other containers.
- The size of containers can be significantly smaller than a VM if built efficiently, making deployment and management of container images less time-consuming.
- With a shared kernel, security concerns may arise. While containers offer a high level of isolation from other processes on the machine, the risk of a kernel vulnerability or container "escape" is inherently greater than with hypervisor-based virtualization, as the hypervisor has more limited functionality and has been battle-tested for many years.
Anatomy of a container
Containers are not a new concept; much of what is used to build and run a container runtime environment has been in the Linux kernel for many years. Most of the core building blocks of a container are available right out of the box in Linux.
To begin assembling something that looks like a container from scratch in Linux, start with namespaces. Namespaces come in several flavors:
- The PID namespace partitions kernel resources such that one set of processes may be provided with an independent set of process IDs (PIDs).
- Network namespaces create virtual networking interfaces to allow programs to run on any port without conflict.
- Mount namespaces enable the mounting and unmounting of filesystems without affecting the host filesystem.
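These building blocks can be seen directly from the command line with the util-linux `unshare` tool. The following is an illustrative sketch, assuming a Linux host with unprivileged user namespaces enabled; no container runtime is involved:

```shell
# Start a shell in new user, PID, and mount namespaces.
# --map-root-user maps the current user to root inside the new user namespace,
# so no real root privileges are needed on most modern distributions.
unshare --user --map-root-user --pid --fork --mount-proc sh -c '
  echo "PID as seen inside the namespace: $$"   # the shell sees itself as PID 1
  id -u                                         # reports 0 (root), inside the user ns only
'
```

This is essentially what a container runtime does on your behalf, along with cgroups, filesystem setup, and security profiles.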
In Kubernetes, namespaces are a particularly helpful way to create logical separation within a cluster.
Control groups (cgroups) simply limit the amount of resources a collection of processes may use – such as CPU, memory, and disk I/O. Cgroups help ensure containers do not consume more resources than permitted, especially in a Kubernetes environment.
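As a minimal sketch of how cgroups work under the hood (assuming a cgroup v2 host and root privileges; the group name is illustrative), limits are just files in the cgroup filesystem:

```shell
# Create a cgroup and cap its memory and CPU usage (requires root, cgroup v2)
mkdir /sys/fs/cgroup/demo
echo "100M" > /sys/fs/cgroup/demo/memory.max       # at most 100 MiB of memory
echo "50000 100000" > /sys/fs/cgroup/demo/cpu.max  # at most 50% of one CPU
echo $$ > /sys/fs/cgroup/demo/cgroup.procs         # move the current shell into the group
```

Container runtimes manage this for you; for example, `docker run --memory=100m --cpus=0.5` results in equivalent cgroup settings.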
Container and Kubernetes Security Considerations: Isolation
Containers donʼt just run on the same host; they also share the same kernel. Isolating containers means limiting their access to resources to only what they need. It is an extremely important security control, as you wouldnʼt want an attacker who compromises a container to be able to escape it and gain access to the host or other containers.
Container isolation can be achieved through various sandboxing techniques:
- Seccomp is a mechanism for restricting the set of system calls that an application is allowed to make. A filter specifies what to do when a system call matches a given rule. Actions include allowing the call, returning an error, logging the call, killing the offending process, and more. Unfortunately, Kubernetes doesnʼt apply a seccomp filter by default. Since version 1.19, seccomp profiles are supported as a first-class field in the pod securityContext, but you need to create and apply one yourself. Since v1.22 there is also the kubelet feature gate SeccompDefault, which makes the container runtimeʼs default seccomp profile the default for workloads. Seccompʼs role in Kubernetes security is covered in this basics blog as well.
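Because Kubernetes does not apply a seccomp filter by default, you opt in per workload. A minimal pod spec requesting the container runtimeʼs default profile might look like this (the pod and image names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-demo
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault   # use the container runtime's default seccomp filter
  containers:
  - name: app
    image: registry.example.com/app:1.0
```

`RuntimeDefault` is a sensible baseline; custom `Localhost` profiles can tighten it further.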
- AppArmor is a Linux Security Module (LSM) that can be enabled in the Linux kernel and can associate a profile with an executable file to determine what it is allowed to do in terms of capabilities and file access permissions. There is a default Docker AppArmor profile, but Kubernetes does not use it by default. Enabling AppArmor is highly recommended in terms of security.
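AppArmor profiles are likewise attached per container. Historically this is done through a pod annotation keyed by container name (the profile name below is illustrative and must already be loaded on the node):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: apparmor-demo
  annotations:
    # the suffix after the slash selects the container the profile applies to
    container.apparmor.security.beta.kubernetes.io/app: localhost/my-app-profile
spec:
  containers:
  - name: app
    image: registry.example.com/app:1.0
```

Using `runtime/default` as the value applies the container runtimeʼs default AppArmor profile instead of a custom one.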
- SELinux is another LSM, developed by Red Hat; it lets you constrain what a process is allowed to do in terms of its interaction with files and other processes. You can limit an application to have access only to its own files and prevent any other process from being able to access those files. If an application becomes compromised, you will have limited the number of files that can be affected.
- Third-party open source projects such as gVisor, Kata Containers, and Firecracker aim to further restrict dangerous system calls to the Linux kernel. Each of these projects handles isolation differently and should be considered when running containers in particularly hostile environments where workloads are untrusted. These projects are not widely used, however, because they limit the natural flexibility of a containerized environment.
Both SELinux and AppArmor have a log-only mode which is useful for creating and testing new profiles. Creating a SELinux profile requires in-depth knowledge of the files needed by the application. Overall, Seccomp, SELinux and AppArmor work at a very low level. Creating a complete profile from scratch can be rather cumbersome.
Maintaining the profile can be equally hard, as even a small change in an application may require significant changes to the profile in order to work. At the same time, using all three tools is highly recommended from a security perspective. If you donʼt have the resources or the expertise to build custom profiles, it is always better to use the default ones rather than not using them at all.
Container image threats
An image is essentially a snapshot or template of a container in the form of an immutable file. A ‘running’ image can be considered a container.
There are several threats associated with container images:
- Adding malware to the image.
- Accessing secrets during the build.
- Enumerating the network topology accessible from the build infrastructure.
- Attacking the build host.
- Using vulnerable base images.
- Tampering with images stored in a registry, or modifying images during the build.
- Attacking the deployment through the build machine.
An image works great if you are building and launching containers on your laptop. But how do you distribute the image to other individuals or systems in the CI/CD pipeline? This is where a registry comes into the picture.
A registry is simply a location used to store and distribute container images. A registry can be either public or private, and is tightly integrated with container Command Line Interface (CLI) tools, serving as the target for push and pull commands.
You can use trusted public registries or run your own private registry. Only images stored in those registries should be allowed to run. If you choose to run a private registry, make sure you secure it by placing it behind a firewall and controlling who can upload and download images from it. You may be tempted to allow access to your entire team. This may be convenient, but donʼt forget the principle of least privilege: broad access increases your attack surface.
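The push/pull workflow against a private registry looks like this (an illustrative sketch; the registry host and image names are assumptions, and a running Docker daemon plus registry credentials are required):

```shell
docker login registry.example.com                  # authenticate to the private registry
docker tag myapp:1.2.3 registry.example.com/team/myapp:1.2.3
docker push registry.example.com/team/myapp:1.2.3  # distribute the image
docker pull registry.example.com/team/myapp:1.2.3  # fetch it elsewhere, e.g. in CI
```

Restricting who can `push` is especially important, as a pushed image is what every downstream consumer will run.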
Securing Container Images: Vulnerability Scanning
Due to their layered nature and extensive use of third-party packages, container images are inherently dangerous to pull into a trusted environment and run blindly. For example, if a given layer in an image contains a version of OpenSSL that is susceptible to the Heartbleed attack, you can identify every affected image simply by looking for images built with that layer. Images can be patched quickly by replacing the layer containing the vulnerability and rebuilding the container to use up-to-date, fixed packages.
There are a number of ways to handle the scanning of container images. Ideally the task is a pass/fail step in your Continuous Integration and Continuous Delivery (CI/CD) pipeline, where images with known vulnerabilities are rejected before deployment. You can run vulnerability scans even during development so that you can fix issues that arise before pushing the code to a repository. Several image registries have vulnerability scanning built in. It is a good idea to periodically scan all images in the image registry, as new vulnerabilities are continually discovered. Identify and replace running containers based on vulnerable images.
Command line tools – such as Trivy, Clair, or Anchore – allow scanning automation to easily run in a CI/CD build pipeline. During such scans you should analyze the content and composition of images to detect not only vulnerabilities but also security issues and misconfigurations. Scan the operating system, the libraries running within the container, and their dependencies. It is also a good idea to check whether any secrets are stored in your images.
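A sketch of such a CI gate using Trivy might look like the following (assumes Trivy is installed; the image name is illustrative, and exact flags can vary between Trivy versions):

```shell
# Fail the pipeline (non-zero exit) if critical or high-severity vulnerabilities are found
trivy image --exit-code 1 --severity CRITICAL,HIGH registry.example.com/team/myapp:1.2.3

# Separately scan for secrets accidentally baked into the image layers
trivy image --scanners secret registry.example.com/team/myapp:1.2.3
```

Because the first command exits non-zero on findings, most CI systems will mark the build as failed and block deployment automatically.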
Container Security Best Practices
The following recommendations can help you safeguard your container images against threats like the ones we have described here:
- Use a secure base image from a trusted source, such as Docker Official Images. Some organizations mandate the use of pre-approved images.
- Try to use small base images, as you will end up with a smaller attack surface.
- Prefer to use a container-specific operating system. Try to reduce your attack surface as much as possible: avoid adding packages, libraries, executables, and any unnecessary code into an image.
- Specify a non-root USER so that containers based on these images donʼt run as root.
- Control access to your Dockerfiles, and monitor and review changes in them. Create an alert when a RUN command is introduced, as this allows an attacker to execute any command.
- Donʼt mount sensitive directories into a container, such as /bin, /etc, or log directories such as /var/log. Generally, try to limit host volume mounts to as few as possible.
- Donʼt include sensitive data in the Dockerfile (such as credentials or passwords).
- Include everything your container needs from the beginning. Donʼt allow packages to be added at runtime, as you cannot vet them for security.
- Avoid including binaries with the setuid bit, as they allow for escalated privileges. Avoid the use of ADD and use COPY instead. Avoid "curl-bashing" (piping a downloaded script straight into a shell) in RUN directives.
- Follow a multi-stage build approach to eliminate unnecessary contents in the final image.
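Several of these recommendations can be combined in a single Dockerfile sketch (the base images, names, and paths below are illustrative assumptions, not a prescribed setup):

```dockerfile
# --- Build stage: compilers and build tools never reach the final image ---
FROM golang:1.22-alpine AS build
WORKDIR /src
COPY . .
RUN go build -o /app ./cmd/server

# --- Final stage: small base image, only the compiled binary, non-root user ---
FROM alpine:3.19
COPY --from=build /app /usr/local/bin/app
RUN adduser -D -u 10001 appuser
USER appuser                      # containers from this image do not run as root
ENTRYPOINT ["/usr/local/bin/app"]
```

The multi-stage build keeps the attack surface small: the final image contains no compiler, no source code, and no package manager caches, and it runs as an unprivileged user.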
Container Security is not Kubernetes Security
We have not yet gotten to the section about orchestration and Kubernetes security in this series, but it is important to call out early on that container security is not Kubernetes security. You can run containers without using Kubernetes to orchestrate them, and securing containers is a key prerequisite to Kubernetes security; a fuller explainer is linked in this series.
One thing to note is that many third-party tools for Kubernetes are installed via container images. Container security that checks images running in production, not just prior to production, is therefore also relevant for catching CVEs and other dangers in those images, like the recent CVEs for Fluid and Bare Metal Operator.
Container vulnerabilities in the wild
The fact that a Docker image is stored in a public registry does not automatically mean that it is safe to use and free of vulnerabilities. In 2020, Prevasio conducted an analysis of 4 million container images stored on Docker Hub. Of these, 51% contained packages or dependencies with at least one critical vulnerability. There were also over 6,000 images containing malware: crypto-mining software, trojans, and other types of malicious software.
Container Build Process Compromised with Backdoor
In early 2021, Codecov experienced a breach that affected a large number of company CI/CD build pipelines. Codecov gives developers tools to help ensure tests are efficient and CI is streamlined. Their own Docker build process was compromised: an attacker tampered with the script called "Bash Uploader," and the modified version was distributed through normal channels to users. The unknowing developers who ran this version of Codecov were in for a surprise.
The Codecov CI plugin siphoned secrets, environment variables, AWS account tokens, and more straight from usersʼ CI systems (including GitHub Actions and other tools).
Never blindly pull images and run them in highly sensitive environments. Always inspect the image if possible, scan it for vulnerabilities, and watch how it behaves in a sandbox environment in a running state (for example, look for outbound network connections and side-loaded binaries).
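A sketch of that kind of inspection with Docker might look like this (the image name is an illustrative assumption, and a running Docker daemon is required):

```shell
# Inspect the image's metadata and layer history before running it
docker inspect registry.example.com/team/myapp:1.2.3
docker history registry.example.com/team/myapp:1.2.3

# Run it in a constrained sandbox: no network, read-only filesystem, no capabilities
docker run --rm --network none --read-only --cap-drop ALL \
  registry.example.com/team/myapp:1.2.3

# Observe a running instance's processes and live resource usage
docker top <container-id>
docker stats <container-id>
```

An image that misbehaves under these constraints (for example, by failing when it cannot reach the network) deserves a much closer look before it gets anywhere near production.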
Containers offer advantages such as efficient resource utilization, faster deployment, and easy scalability. By sharing the host operating system's kernel, containers enable greater flexibility and faster startup times compared to virtual machines. However, container security is a crucial consideration, as containers share the same kernel and potential vulnerabilities can pose risks to the host and other containers. Implementing security measures like namespace partitioning, control groups, seccomp, AppArmor, SELinux, and vulnerability scanning can help mitigate these risks. In the next post in this series, we will get into container deployment techniques.