Vcluster allows running a logical Kubernetes cluster within a physical Kubernetes cluster, without needing cluster-admin permissions on the physical cluster. We have often heard people draw an analogy between virtual clusters and virtual machines. Indeed, virtual machines feel like “the real physical machine”, but allow the user to install their own Linux kernel and Linux distribution, have full root access, etc., without stepping on the toes of another user in another virtual machine, or angering the administrator of the physical machine.
So what about vcluster? Is it “like a real Kubernetes cluster”? Does it allow me to emulate cluster-admin privileges in a tightly RBAC-ed Kubernetes cluster? Can I try out running privileged containers or containers running as root without compromising the security of the whole cluster? Can I try out a platform-level technology, such as a new operator or a new service mesh, without waiting for the busy Kubernetes administrator to make time for it?
We wanted to find out what vcluster really is. Is it – as its fans claim – truly a “free but isolated” virtual cluster on top of a physical Kubernetes cluster, similar to how VMs are “free but isolated” on top of a physical machine? Or is it – as its detractors claim – merely glorified Kubernetes namespaces?
Specifically, when running on top of a security-hardened Kubernetes cluster, users expect a virtual cluster to avoid most security restrictions – just like having “root” on a VM. Does vcluster live up to that expectation?
Spoiler alert: Unfortunately, our investigation shows that while you can get better isolation, a virtual cluster cannot break free of restrictions present in the physical cluster. In this article, we share the investigation that led to that conclusion.
What to investigate and why?
We at Elastisys are obsessed with creating a Kubernetes-based platform that meets three conflicting goals: (1) It should maximise developer experience. (2) It should make security and compliance with data protection regulations easy. (3) It should be portable across cloud providers.
In a nutshell, we want application developers to get the speed they want, with the security they need.
We generally encounter two types of application developers; let’s call them the “Heroku developer” and the “mystical DevOps wizards”, in reference to a previous article we wrote.
The “Heroku developer” is used to spending a lot of time in their IDE with hot-reloaded code. When they are happy, all they need to do is click on “deploy” and their magic can be used by the application users. Such developers appreciate an opinionated platform and a Golden Path. They want to be 100% focused on features and business code, and do not want to worry about platform concerns. We call them “Heroku developers” to praise the cloud application platform which raised the bar on what a platform obsessed with developer experience should be.
The “mystical DevOps wizards” seem to know everything from low-level C programming on embedded devices to ReactJS frontend development to Kubernetes administration. In contrast to Heroku developers, these developers are eager to find new ways to improve the platform. They are eager to try out the latest service mesh, the latest operator, find out how to tweak log forwarding, etc., which are usually privileged platform operations. And in regulated environments, responsibility needs to be separated: Just as infrastructure administrators cannot hand out “root” credentials on physical servers, platform administrators cannot simply hand out cluster-admin credentials to Kubernetes clusters.
We are often asked whether vcluster can resolve this tension. So … can it? Can the mystical DevOps wizards simply use vcluster to try things out in a security-hardened Kubernetes cluster?
The investigation
We went ahead and ran the vcluster getting started guide against a Welkin v0.19.0 demo environment. Welkin includes the following safeguards by default:
No root, no privileged containers, no hostPath, etc.
Each Pod must have resource requests and limits.
Each Pod must be selected by a NetworkPolicy.
Application developers must negotiate a small set of trusted registries with the administrator.
Application developers have rather limited Kubernetes API access: They can only create namespaced resources in a few pre-configured namespaces and list only a handful of cluster-wide resources.
We believe these choices are sane for any security-hardened Kubernetes distribution.
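To make these safeguards concrete, here is a minimal sketch of a Pod that would pass them. The name, namespace, and image are hypothetical, and a NetworkPolicy selecting the Pod would still be needed:
apiVersion: v1
kind: Pod
metadata:
  name: compliant-app        # hypothetical name
  namespace: demo1           # must be one of the pre-configured namespaces
spec:
  containers:
  - name: app
    image: harbor.example.com/myteam/app:1.0   # must come from a trusted registry
    securityContext:
      runAsNonRoot: true
      runAsUser: 1000
      allowPrivilegeEscalation: false
    resources:               # requests and limits are mandatory
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 200m
        memory: 256Mi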
To try out vcluster, we set out to install fluentd and WordPress. Fluentd is a log collector which runs as a DaemonSet and requires quite elevated permissions, so as to collect logs from the underlying container runtime, be it containerd or Docker. It is not uncommon for the “mystical DevOps wizards” to want to make concrete suggestions on how to tweak fluentd, for example, to push application audit logs to long-term cold storage, as required in FinTech.
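To illustrate why fluentd needs elevated permissions, here is a simplified excerpt of the volumes a typical fluentd DaemonSet mounts. The exact paths vary by Chart version, so treat this as a sketch:
# Simplified excerpt of a typical fluentd DaemonSet spec
volumes:
- name: varlog
  hostPath:
    path: /var/log                      # logs written by the container runtime
- name: varlibdockercontainers
  hostPath:
    path: /var/lib/docker/containers    # when Docker is the runtime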
WordPress is a rather ordinary application. However, as is unfortunately the case with too many Helm Charts out there, one of its containers wants to run as root – we looked at MariaDB Helm Chart v11.0.0 – which is a big no-no in security-hardened Kubernetes clusters. Hence, a typical application developer might want to use vcluster to try out an application as-is, before investing effort in adapting its Helm Chart to tighter security.
In what follows, let us walk step by step through our investigation, highlighting the issues we encountered and the workarounds we tried.
Step 1: Install Vcluster
While the vcluster documentation makes getting started easy, we already hit a few bumps when installing it against a security-hardened Kubernetes cluster. First, one of its containers wants to run as root. Second, vcluster does not ship with NetworkPolicies.
Fortunately, both of these issues could easily be solved as shown below:
$ helm template my-vcluster vcluster --repo https://charts.loft.sh --set serviceCIDR=10.233.0.0/18 > my-vcluster.yaml
$ vim my-vcluster.yaml
# Replace:
#   securityContext:
#     allowPrivilegeEscalation: false
# with:
#   securityContext:
#     allowPrivilegeEscalation: false
#     runAsUser: 1000
#     runAsGroup: 1000
$ cat my-vcluster-allow-all.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: vcluster-allow-all
spec:
  egress:
  - {}
  ingress:
  - {}
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
$ kubectl apply -f my-vcluster-allow-all.yaml
$ kubectl apply -f my-vcluster.yaml
Step 2: Connect to vcluster
This went as smoothly as the documentation suggests:
$ vcluster connect my-vcluster -n demo1
# In a new Terminal, same folder as above
$ export KUBECONFIG=./kubeconfig.yaml
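A quick sanity check confirms that kubectl now talks to the virtual cluster: a freshly created vcluster should only list its own default namespaces, not those of the physical cluster. The output below is illustrative:
$ kubectl get namespaces
NAME              STATUS   AGE
default           Active   2m
kube-system       Active   2m
kube-public       Active   2m
kube-node-lease   Active   2m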
Step 3: Install fluentd
We tried to install fluentd based on its official documentation, but this didn’t work, as shown below:
$ helm repo add fluent https://fluent.github.io/helm-charts
$ helm install -n kube-system fluentd fluent/fluentd
$ kubectl describe pod -n kube-system -l app.kubernetes.io/instance=fluentd
[...]
Warning SyncError 9s (x14 over 51s) pod-syncer Error syncing to physical cluster: pods "fluentd-8tqxg-x-kube-system-x-my-vcluster" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used]
What happens is that the fluentd Pods in the virtual cluster correspond to actual Pods on the physical cluster. Since Pods on the physical cluster are not allowed to use hostPath, the Pods in the virtual cluster won’t start. The silver lining is that at least the application developer gets a clear and relevant error message.
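This name translation can also be observed from the other side. Using the kubeconfig of the physical cluster, listing Pods in the namespace hosting the vcluster shows the synced Pods under their <pod>-x-<namespace>-x-<vcluster> names; here, no fluentd Pod appears at all, since the sync itself was rejected:
# In a Terminal using the kubeconfig of the physical cluster
$ kubectl get pods -n demo1
# Only the vcluster's own Pods show up; the fluentd Pods were never admitted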
Step 4: Install WordPress
Next, we installed WordPress using the documentation from Bitnami:
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install my-wordpress bitnami/wordpress
$ kubectl logs -f my-wordpress-mariadb-0
mariadb 15:13:13.83
mariadb 15:13:13.84 Welcome to the Bitnami mariadb container
mariadb 15:13:13.84 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-mariadb
mariadb 15:13:13.84 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-mariadb/issues
mariadb 15:13:13.84
mariadb 15:13:13.84 INFO ==> ** Starting MariaDB setup **
mariadb 15:13:13.87 INFO ==> Validating settings in MYSQL_*/MARIADB_* env vars
mariadb 15:13:13.90 INFO ==> Initializing mariadb database
mariadb 15:13:13.93 WARN ==> The mariadb configuration file '/opt/bitnami/mariadb/conf/my.cnf' is not writable. Configurations based on environment variables will not be applied for this file.
mariadb 15:13:13.93 INFO ==> Installing database
2022-04-11 15:13:14 0 [ERROR] mysqld: Can't create/write to file '/opt/bitnami/mariadb/tmp/ibMiWVWe' (Errcode: 13 "Permission denied")
2022-04-11 15:13:14 0 [ERROR] InnoDB: Unable to create temporary file; errno: 13
2022-04-11 15:13:14 0 [ERROR] mysqld: Can't create/write to file '/opt/bitnami/mariadb/tmp/ibawv7vd' (Errcode: 13 "Permission denied")
2022-04-11 15:13:14 0 [ERROR] InnoDB: Unable to create temporary file; errno: 13
2022-04-11 15:13:14 0 [ERROR] InnoDB: Database creation was aborted with error Generic error. You may need to delete the ibdata1 file before trying to start up again.
2022-04-11 15:13:14 0 [ERROR] Plugin 'InnoDB' init function returned error.
2022-04-11 15:13:14 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2022-04-11 15:13:14 0 [ERROR] Unknown/unsupported storage engine: InnoDB
2022-04-11 15:13:14 0 [ERROR] Aborting
Unfortunately, this didn’t work either. The MariaDB Pod in the Bitnami Helm Chart expects to run as root. While vcluster successfully does some translation – i.e., the virtual Pod looks like it is running as root, while the physical Pod is actually running as non-root – this translation does not fully reproduce the filesystem permissions that MariaDB expects during start-up.
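One way to see this translation at work is to compare the securityContext that each side reports for the same Pod. The commands below are a sketch; the exact output depends on the Chart’s defaults:
# Against the virtual cluster: the Pod as the Helm Chart requested it
$ kubectl get pod my-wordpress-mariadb-0 \
    -o jsonpath='{.spec.containers[0].securityContext}'
# Against the physical cluster: the Pod as it actually runs, after translation
$ kubectl get pod my-wordpress-mariadb-0-x-default-x-my-vcluster -n demo1 \
    -o jsonpath='{.spec.containers[0].securityContext}'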
What if we “relaxed” security?
A Pod `my-wordpress-ff8559cd-smkk6` in namespace `default` in the logical cluster is actually running as Pod `my-wordpress-ff8559cd-smkk6-x-default-x-my-vcluster` in the physical cluster, in the namespace where the vcluster was created. The implication is that all logical Pods must obey the restrictions of the physical cluster, e.g., OPA policies. In fact, to get the above example running, we had to disable all our default OPA policies:
diff --git a/wc-config.yaml b/wc-config.yaml
index 6f43065..6ae7596 100644
--- a/wc-config.yaml
+++ b/wc-config.yaml
@@ -19,8 +19,16 @@ user:
   enabled: true
   opa:
     imageRegistry:
+      enforcement: dryrun
       URL:
         - harbor.ckdemo.a1ck.io
         - quay.io/jetstack/cert-manager-acmesolver
+    networkPolicies:
+      enforcement: dryrun
+    resourceRequests:
+      enforcement: dryrun
Even with many protections disabled, some things still don’t work, such as `hostPath` volumes. Also, the logical Pods seem to be confused about filesystem permissions (see the WordPress error above).
Conclusions
Vcluster is a great tool to create virtual clusters on top of a single physical cluster. Virtual clusters are more isolated than namespaces, but less isolated than several physical clusters. Similarly, virtual clusters bring less overhead than creating separate physical clusters. As a result, vcluster brings a nice mid-point: medium isolation for little overhead.
Vcluster enables an unprivileged Kubernetes user to play around with cluster-wide resources, such as CustomResourceDefinitions and ValidatingWebhooks. Hence, it can be used to develop an operator or an admission policy without stepping on the toes of other users of the same physical cluster, as sketched below.
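For instance, an application developer can register a CustomResourceDefinition inside the virtual cluster – something RBAC would typically forbid on the physical cluster. A minimal sketch, with a hypothetical group and kind:
$ cat crontab-crd.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.stable.example.com
spec:
  group: stable.example.com
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        x-kubernetes-preserve-unknown-fields: true
# Using the vcluster kubeconfig; the CRD exists only inside the virtual cluster
$ kubectl apply -f crontab-crd.yaml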
Unfortunately, vcluster raises expectations that it doesn’t currently fulfil. It cannot free users from the safeguards of a security-hardened Kubernetes cluster: restrictions that apply in the physical cluster are enforced in the virtual clusters too. Sometimes these restrictions are at least clearly exposed to the user – as with hostPath volumes not being allowed – but sometimes they lead to confusion – as with the filesystem permissions above.
What are you using vcluster for?