Use a User Namespace With a Pod

FEATURE STATE: Kubernetes v1.30 [beta] (enabled by default: false)

This page shows how to configure a user namespace for pods. This allows you to isolate the user running inside the container from the one in the host.

A process running as root in a container can run as a different (non-root) user in the host; in other words, the process has full privileges for operations inside the user namespace, but is unprivileged for operations outside the namespace.

You can use this feature to reduce the damage a compromised container can do to the host or other pods in the same node. There are several security vulnerabilities rated either HIGH or CRITICAL that were not exploitable when user namespaces is active. It is expected user namespace will mitigate some future vulnerabilities too.

Without using a user namespace a container running as root, in the case of a container breakout, has root privileges on the node. And if some capability were granted to the container, the capabilities are valid on the host too. None of this is true when user namespaces are used.

Before you begin

You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts. If you do not already have a cluster, you can create one by using minikube or you can use one of these Kubernetes playgrounds:

Your Kubernetes server must be at or later than version v1.25. To check the version, enter kubectl version.

  • The node OS needs to be Linux
  • You need to exec commands in the host
  • You need to be able to exec into pods
  • You need to enable the UserNamespacesSupport feature gate

The cluster that you're using must include at least one node that meets the requirements for using user namespaces with Pods.

If you have a mixture of nodes and only some of the nodes provide user namespace support for Pods, you also need to ensure that the user namespace Pods are scheduled to suitable nodes.

Run a Pod that uses a user namespace

A user namespace for a pod is enabled setting the hostUsers field of .spec to false. For example:

apiVersion: v1
kind: Pod
metadata:
  name: userns
spec:
  hostUsers: false
  containers:
  - name: shell
    command: ["sleep", "infinity"]
    image: debian
  1. Create the pod on your cluster:

    kubectl apply -f https://k8s.io/examples/pods/user-namespaces-stateless.yaml
    
  2. Attach to the container and run readlink /proc/self/ns/user:

    kubectl attach -it userns bash
    

Run this command:

readlink /proc/self/ns/user

The output is similar to:

user:[4026531837]

Also run:

cat /proc/self/uid_map

The output is similar to:

0  833617920      65536

Then, open a shell in the host and run the same commands.

The readlink command shows the user namespace the process is running in. It should be different when it is run on the host and inside the container.

The last number of the uid_map file inside the container must be 65536, on the host it must be a bigger number.

If you are running the kubelet inside a user namespace, you need to compare the output from running the command in the pod to the output of running in the host:

readlink /proc/$pid/ns/user

replacing $pid with the kubelet PID.

Items on this page refer to third party products or projects that provide functionality required by Kubernetes. The Kubernetes project authors aren't responsible for those third-party products or projects. See the CNCF website guidelines for more details.

You should read the content guide before proposing a change that adds an extra third-party link.

Last modified December 15, 2024 at 6:24 PM PST: Merge pull request #49087 from Arhell/es-link (2c4497f)