One of the first things you need to figure out when dealing with data protection is access control. On the surface, it sounds simple: Decide and enforce who has access to what. However, performing access control properly, so as to navigate the productivity-security trade-off, requires a deep technical understanding of your tech stack. “Too much” access is bad for security. “Too little” access is bad for productivity.
In this post, we dive deep into Kubernetes access control. By the end of this post, you will understand the various mechanisms for restricting access to a Kubernetes cluster and how to leverage those mechanisms for your security needs.
Paraphrasing a popular book’s title, let’s start with why. Why do we need access control?
Why access control?
As our society is becoming increasingly digitized, more trust is needed in IT systems. Imagine handing out your medical records without being able to trust that only your doctor will have access to them.
Access control is a pillar of trustworthy IT systems, and it serves two main purposes: First, it keeps bad actors out. Second, it reduces the impact of human error. All in all, it ensures that data stays confidential, retains its integrity, and remains available. Or, put more “engineeringly”, it avoids downtime, hacks and data loss.
In fact, access control is so important that some regulations specifically mandate it. For example, Art. 25 GDPR states: “[S]uch measures shall ensure that by default personal data are not made accessible without the individual’s intervention to an indefinite number of natural persons.” As you can see, GDPR leaves it to organizations to decide how to perform access control. Other regulations are more prescriptive. For example, the Swedish Patient Data regulations (HSLF-FS 2016:40) dedicate a whole chapter of 12 paragraphs to how to do access control.
Scope of Access Control
Before talking about access control, we need to set a scope. This post is focused on Kubernetes clusters. However, since Kubernetes is really just a building block for a platform – and not the whole platform by itself – we will also discuss components that are common in Kubernetes clusters or around Kubernetes clusters.
Specifically, we will include the Nginx Ingress Controller. An Ingress Controller is a standardized way to bring traffic to your application. In a way, it is a must-have if you want your cluster to be useful to the outside world.
In 2013, common wisdom argued that identity is the new security perimeter. And while identity is still the most important security perimeter, one should not neglect networks as an additional layer of security.
Network- or IP-based access control means that you allow or deny access based on the IP address that a request is coming from, the source IP. An IP address has some correlation with a person, but should not be taken as an identity. It is more useful to think of IP addresses as identifying a group of people. For example, IP-based access control can be used for coarse-grained access control, such as allowing access only from within an organization or only to specific departments. IP-based access control can also be used to enforce access “choke points”, like VPNs. With hybrid work being the norm, this is really important.
When it comes to IP-based access control, think of it like the door code for entering an apartment building. Sure, the same code is shared by all neighbors, as well as by the mail carriers. However, it adds another barrier for a random passer-by.
Fine-grained access control should be done based on identity. Identity – in the security sense – is whatever uniquely and trustably identifies a person. The current standard is to use both something that a person knows (like a password) and something that the person has (such as a second-factor token or a private cryptographic key on a hardware device) to establish identity.
To keep identities separate, and therefore access control based on them trustworthy, resist sharing credentials. Otherwise, the access control system cannot determine who should get to do what, since multiple people use the same identity.
Credential sharing is unfortunately very common with Kubernetes clusters. It is just too easy to hand out a shared SSH key or the cluster-admin credentials to everyone asking for them. Simply say “no!” and point them to relevant regulation. For example, Swedish healthcare regulation leaves no room for interpretation:
The care provider shall be responsible for ensuring that each user is assigned an individual authorization for access to personal data. [unofficial translation]
Always use individual SSH keys and OpenID for day-to-day access. Leave the cluster-admin credentials outside normal reach and use them only in “break glass” situations.
Let us finally dive into the core of this post: Who needs access to a Kubernetes cluster and how to control that access. All people needing access to a Kubernetes cluster can be roughly grouped into three roles: application users, application developers and Kubernetes administrators.
Let’s start with the most important role, the application user. This role is what gives meaning to the application, and the application gives meaning to the underlying platform, i.e., the Kubernetes cluster. Without this role, you wouldn’t need to set up a Kubernetes cluster to begin with. However, somewhat paradoxically, application users should never ever have access to the Kubernetes cluster, specifically the apiserver. Instead, they should be granted access via an Ingress controller to the application.
As discussed above, you should really consider both IP-based and identity-based access control. For IP-based access control, my recommendation is to annotate Ingress resources with nginx.ingress.kubernetes.io/whitelist-source-range. This gives application developers maximum control and allows per-Ingress access control. Note that, for this to work, the load-balancer fronting the Kubernetes Nodes needs to preserve the source IP. Usually, this is achieved by using the PROXY protocol between the load-balancer and the Ingress Controller.
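To illustrate, here is a minimal sketch of such an Ingress. The application name, hostname and CIDR ranges are placeholders; substitute your own corporate ranges:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app  # hypothetical application name
  annotations:
    # Only accept requests whose source IP falls within these CIDRs
    # (203.0.113.0/24 and 198.51.100.0/24 are documentation ranges used as examples)
    nginx.ingress.kubernetes.io/whitelist-source-range: "203.0.113.0/24,198.51.100.0/24"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com  # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80
```

Requests from outside the listed ranges will receive a 403 from the Ingress Controller, assuming the source IP is preserved as discussed above.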
Identity-based access control should best be done in the application. While it would be theoretically possible to perform access control by enforcing who has access to which HTTP verbs and routes, access to the application is too application-specific to be done in the platform. When it comes to security, we not only look for technical possibilities, but also for the best demarcation of responsibilities. And clearly application access control is the responsibility of application developers and not Kubernetes administrators.
However, the platform can help with authentication. Specifically, a few Ingress annotations and a deployment of oauth2-proxy will save you from writing tedious, undifferentiated code for completing the OpenID flow and obtaining the JWT ID token identifying the application users. Copy-paste-able code snippets can be found here and here.
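A sketch of the relevant annotations is shown below, assuming oauth2-proxy is already deployed and reachable at a hypothetical oauth2-proxy.example.com:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app  # hypothetical application name
  annotations:
    # Ask the Ingress Controller to check authentication against oauth2-proxy
    nginx.ingress.kubernetes.io/auth-url: "https://oauth2-proxy.example.com/oauth2/auth"
    # Where to redirect unauthenticated users to start the OpenID flow
    nginx.ingress.kubernetes.io/auth-signin: "https://oauth2-proxy.example.com/oauth2/start?rd=$escaped_request_uri"
    # Pass the Authorization header (JWT ID token) on to the application
    nginx.ingress.kubernetes.io/auth-response-headers: "Authorization"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com  # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80
```

With this in place, the application only needs to validate the forwarded ID token, not implement the whole flow.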
We should also mention that the right NetworkPolicy needs to be in place to allow traffic to flow from the Ingress Controller to application Pods. However, we will leave access control within the Kubernetes cluster to a future post.
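As a teaser, a minimal NetworkPolicy for this could look as follows, assuming the Ingress Controller runs in the ingress-nginx namespace and the application listens on port 8080 (both assumptions; adjust to your setup):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-ingress-controller
  namespace: my-app  # hypothetical application namespace
spec:
  podSelector:
    matchLabels:
      app: my-app  # selects the application Pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Allow traffic only from Pods in the ingress-nginx namespace
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
```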
As a Kubernetes administrator, I see application developers as my direct customer. If they are happy, I’m happy. If their day was productive, then my day was productive. However, I’m still responsible for platform security and enforcing demarcation of responsibilities. Hence, I need to put in place proper access control: Not because I don’t trust them, but because I don’t want them to care about various ways of breaking a Kubernetes cluster.
For IP-based access control, make sure your application developers only get access to the Kubernetes API from within the corporate network. Depending on your Kubernetes setup, this can be done from a few places:
- If you provision a managed Kubernetes control plane, such as EKS, GKE or AKS, check their documentation. For example, on Azure you need to use api-server-authorized-ip-ranges.
- If you provision your Kubernetes control plane yourself in a virtualized environment like OpenStack, you can use SecurityGroups around Kubernetes control plane Nodes.
- If you run Kubernetes on bare-metal, set up firewall rules on the machine directly.
Many application developers work from home these days, at least partially. Make sure they use VPN access into the corporate network. This significantly simplifies compliance, as allowlisting their home IPs – which tend to be dynamic – opens up a can of worms.
For identity-based access control, make sure to integrate the Kubernetes cluster with your corporate identity provider. The Kubernetes API supports OpenID. In case multiple OpenID providers are needed, or non-OpenID providers are needed, Dex can be used as a rather stateless adaptation layer. Further tip: Try to make sure you propagate group claims from your identity provider. This will allow you to reuse corporate roles for Kubernetes Role-Based Access Control (RBAC). Here is copy-paste-able code to achieve that.
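For example, assuming the API server propagates a group claim, a RoleBinding can grant a corporate group the built-in edit ClusterRole in one namespace. The group and namespace names below are illustrative:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developers-edit
  namespace: my-app  # hypothetical application namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit  # built-in ClusterRole: read/write most namespaced resources
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    # Must match a group claim value issued by your identity provider
    name: "platform-developers"
```

This way, adding a developer to the corporate group is all it takes to grant them Kubernetes access, and removing them revokes it.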
Finally, let us talk about you and your team: The all-mighty Kubernetes administrators. With great power comes great responsibility. This also applies to access control.
Administrators tend to have two levels of access: the Kubernetes API and direct SSH access to the Nodes. The former is necessary for day-to-day operations, platform component updates and investigations. The latter is necessary for updates, security patches and investigations at the Kubernetes, container runtime and operating system level.
For IP-based access control, make sure administrators can only reach the Kubernetes API and SSH into the Kubernetes Nodes from the trusted corporate network. Even better, ask for a dedicated sub-network for your team.
Let us now talk about identity-based access control. For SSH, make sure each admin has their own SSH key. I’m a big fan of hardware tokens, such as YubiKeys, to add one more layer of protection around SSH keys. (Sadly, they do not sponsor me.) These tokens store the SSH key on a physical device and additionally require a PIN. This enforces the “something you have and something you know” principle for establishing identity.
Regarding identity-based access control at the Kubernetes level, resist the temptation to share cluster-admin credentials. Instead, log in using OpenID – just as you ask your Application Developers to do – but make sure your group has more permissions. In case of “break glass” scenarios, where the Kubernetes-OpenID integration no longer works, you can always SSH into the control plane Nodes.
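Granting your admin group elevated permissions can be sketched with a ClusterRoleBinding like the one below. The group name is illustrative and must match a group claim from your identity provider:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: platform-admins
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin  # built-in ClusterRole: full access to all resources
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    # Hypothetical admin group claim issued by the identity provider
    name: "platform-admins"
```

Note that this grants cluster-admin to people logging in via OpenID, so the shared cluster-admin credentials themselves can stay locked away for break-glass use only.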
To minimize the risks associated with administrator access, try to minimize day-to-day usage. For example, security patching at the base operating system and container runtime can be automated using kured. Similarly, platform component security patching can be automated using solutions like Tekton, Flux or ArgoCD.
Access should not only be controlled, but also logged and regularly reviewed. In a future post, we will show how this can be done with Kubernetes audit logs, fluentd and OpenSearch.
In this post we exhaustively reviewed how access from outside the Kubernetes cluster is controlled. However, to reduce blast radius and reduce the risk of an attacker moving laterally, access should also be controlled within the Kubernetes cluster. We’ll discuss this in a future post too.