In Kubernetes, stateful data is stored on PersistentVolumes. These can be provisioned either statically or dynamically, but dynamic provisioning only works if a suitable StorageClass has been defined. In most public clouds this is a no-brainer, because they offer a block storage service such as AWS EBS. But what can you do to support, for example, bare metal deployments of Kubernetes, where no such block storage service is available?
Enter Rook. Rook is a storage orchestrator for Kubernetes that sits as an abstraction layer between Kubernetes and an underlying storage service. In this blog post, we will set up virtual machines that run Kubernetes and also host the Ceph storage service. Ceph is then made available to Kubernetes via Rook.
The result? A fully functioning Kubernetes cluster that can dynamically provision Persistent Volumes. Please note that although we use a cloud environment here to start the virtual machines, everything in this article works just as well on bare metal servers.
Overview
It is assumed that you can use cloud-init to configure your nodes.
Our cluster setup is as follows:
- One control plane node: 2GB RAM, 2 vCPU, 50GB local storage.
- Three worker nodes: 8GB RAM, 4 vCPU, 100GB local storage each.
All nodes are running Ubuntu 20.04 LTS.
The cluster is running Kubernetes v1.18.10 and is installed using Kubespray 2.14.2.
Infrastructure preparation
Before deploying Kubernetes and Rook Ceph, we have to decide how to provide the storage for the Rook Ceph cluster, and prepare the nodes accordingly.
Choosing local storage option
Let’s start by looking at the Rook and Ceph prerequisites.
In our case, what we need to decide on is which local storage option to provide:
- Raw devices (no partitions or formatted filesystems)
- Raw partitions (no formatted filesystem)
- PVs available from a storage class in block mode
On the cloud provider used for this example, it is only possible to use one device per node – the boot disk.
Using a separate disk for Rook Ceph is therefore not an option.
A raw partition could be created during boot using cloud-init. Since Rook Ceph can discover raw partitions by itself, whereas we would have to create block mode PVs (PersistentVolumes) ourselves before Rook could use them, we will go with raw partitions.
Implementing local storage option
For simplicity, we will provide Ceph storage on all worker nodes.
Further configuration to specify which nodes to consider for storage discovery can be done in the Ceph Cluster CRD.
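As a rough sketch, such a restriction could look like the following excerpt of a CephCluster spec (the node names are placeholders, and only the storage section is shown; see the Cluster CRD documentation for all available options):
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  # ... other cluster settings omitted ...
  storage:
    useAllNodes: false    # do not consider every node in the cluster
    useAllDevices: false  # do not consume every raw device/partition found
    nodes:                # only these nodes are scanned for storage
    - name: worker-1
    - name: worker-2
    - name: worker-3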
To create the raw partition, the worker nodes should have the following cloud-init config:
#cloud-config
bootcmd:
# Move the secondary GPT header to the end of the disk, so the full disk size can be used
- [ cloud-init-per, once, move-second-header, sgdisk, --move-second-header, /dev/vda ]
# Create a partition (named '2') starting at 50GB and stretching to the end of the disk
- [ cloud-init-per, once, create-ceph-part, parted, --script, /dev/vda, 'mkpart 2 50GB -1' ]
In case you struggle to understand the commands above, fear not! We explained partitioning via cloud-init in our previous blog post.
Note the mkpart command: it creates a raw partition starting at 50GB and stretching until the next partition or the end of the disk. Change the 50GB start to something larger if you want to reserve more space for other partitions, such as the root partition.
Verify that the nodes have an empty partition:
# On the worker nodes
$ sudo parted -l
Model: Virtio Block Device (virtblk)
Disk /dev/vda: 107GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number  Start   End     Size    File system  Name  Flags
 14     1049kB  5243kB  4194kB                     bios_grub
 15     5243kB  116MB   111MB   fat32              boot, esp
  1     116MB   50.0GB  49.9GB  ext4
  2     50.0GB  107GB   57.4GB               2
See partition 2: it has no file system, and it leaves ~50GB for other partitions, just as intended by our cloud-init configuration.
Now that the nodes are prepared, it is time to deploy the Kubernetes cluster.
The Kubernetes installation will not be covered in detail.
We will install a vanilla Kubernetes cluster using Kubespray, but most installers should work just as well.
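For reference, a Kubespray deployment boils down to roughly the following, assuming you have already prepared an inventory for your nodes as described in the Kubespray documentation (the inventory path below is a placeholder):
# Clone Kubespray at the release used in this post and install its requirements
git clone --branch v2.14.2 https://github.com/kubernetes-sigs/kubespray.git
cd kubespray
pip install -r requirements.txt
# Run the cluster playbook against your own inventory
ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml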
Deploying Rook
Now that the worker nodes are prepared with a raw partition and Kubernetes is deployed, it is time to deploy the Rook Operator.
At this point it is totally fine to follow the Ceph Quickstart, but we will use the Ceph Operator Helm chart instead.
The examples are based on using Helm v3.
Create the namespace for Rook and deploy the Helm chart:
helm repo add rook-release https://charts.rook.io/release
kubectl create namespace rook-ceph
helm install rook-ceph rook-release/rook-ceph \
--namespace rook-ceph \
--version v1.5.3
See that the operator pod starts successfully:
$ kubectl --namespace rook-ceph get pods
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-664d8997f-lttxz 1/1 Running 0 33s
The Rook repository provides some example manifests for Ceph clusters and StorageClasses.
In this case, we will deploy the sample production Ceph cluster cluster.yaml. Note that this requires at least three worker nodes – if you have fewer nodes in your cluster, use cluster-test.yaml (NOT RECOMMENDED FOR PRODUCTION). For the storage class, we will go with the sample RBD storageclass.yaml. There is also the less demanding, but less reliable, storageclass-test.yaml if you are only testing this out.
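The sample manifests live in the Rook repository; one way to get hold of them is to clone the repository at the same version as the deployed operator (paths as of Rook v1.5.x, they may differ in other releases):
# Fetch the Rook repository at the same version as the operator
git clone --single-branch --branch v1.5.3 https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
# cluster.yaml lives in this directory; the RBD storage class example is under csi/rbd/
cp csi/rbd/storageclass.yaml .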
Deploy the Ceph cluster and the storage class:
kubectl --namespace rook-ceph apply -f cluster.yaml
kubectl --namespace rook-ceph apply -f storageclass.yaml
Give the Ceph cluster a few minutes to get ready.
You can check the status of the cluster by running:
$ kubectl --namespace rook-ceph get cephclusters.ceph.rook.io
NAME ... PHASE MESSAGE HEALTH
rook-ceph ... Ready Cluster created successfully HEALTH_OK
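If you want a closer look at Ceph itself, the Rook toolbox (toolbox.yaml in the same examples directory) gives access to the ceph CLI. A sketch of how that can be used, following the Rook documentation:
# Deploy the toolbox pod
kubectl --namespace rook-ceph apply -f toolbox.yaml
# Query Ceph's own view of the cluster health from inside the toolbox pod
kubectl --namespace rook-ceph exec -it \
  $(kubectl --namespace rook-ceph get pod -l "app=rook-ceph-tools" \
    -o jsonpath='{.items[0].metadata.name}') -- ceph status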
At this stage, we ran into an issue where the Ceph cluster failed to reach a healthy state.
After some debugging, we concluded that the issue was caused by an incomplete deployment of cert-manager, which left the Kubernetes API server unable to respond to requests from the Rook Operator.
Make sure your Kubernetes clusters are in a healthy state before deploying Rook!
Creating and consuming a PersistentVolumeClaim
Once the Ceph cluster is ready, we can create the sample PersistentVolumeClaim (PVC) and see that Rook Ceph creates a PersistentVolume (PV) for it to bind to:
$ kubectl create -f pvc.yaml
persistentvolumeclaim/rbd-pvc created
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rbd-pvc Bound pvc-c3af7bd1-d277-4475-8e03-d87beb719e75 1Gi RWO rook-ceph-block 113s
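For reference, the sample pvc.yaml corresponds roughly to the manifest below, matching the capacity, access mode and storage class seen in the output above:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block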
Finally, we consume the PVC with a Pod:
$ kubectl create -f pod.yaml
pod/csirbd-demo-pod created
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
csirbd-demo-pod 1/1 Running 0 21s
By describing the pod, we see that it is using the PVC:
$ kubectl describe pod csirbd-demo-pod
Containers:
web-server:
...
Mounts:
/var/lib/www/html from mypvc (rw)
...
Volumes:
mypvc:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: rbd-pvc
ReadOnly: false
...
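For completeness, the relevant parts of the sample pod.yaml roughly amount to the following; the container image is an assumption on our part, so check the sample manifest for the exact contents:
apiVersion: v1
kind: Pod
metadata:
  name: csirbd-demo-pod
spec:
  containers:
  - name: web-server
    image: nginx  # assumed image; see the sample manifest
    volumeMounts:
    - name: mypvc
      mountPath: /var/lib/www/html
  volumes:
  - name: mypvc
    persistentVolumeClaim:
      claimName: rbd-pvc
      readOnly: false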
Further tweaking
These are only sample manifests, and they are not guaranteed to fit your infrastructure perfectly.
Look through the different storage types, the Ceph cluster configuration and the storage class configuration to see what would fit your use case best.
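As one example of such a tweak, the replication factor of the block pool backing the storage class is set in the CephBlockPool part of storageclass.yaml. A sketch, with field names as in the Rook v1.5 examples and values that are yours to choose:
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host  # spread replicas across hosts
  replicated:
    size: 3            # number of data copies; 3 is a common production choice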
Monitoring
Rook Ceph comes with Prometheus support.
This requires Prometheus Operator to be deployed, which we will not cover here.
Once Prometheus Operator is installed, monitoring can be enabled per Rook Ceph cluster by setting spec.monitoring.enabled=true in the CephCluster CR (cluster.yaml in our example).
The manifest can be safely reapplied after changing this value, and the Rook Operator will create the corresponding ServiceMonitor.
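In the manifest, the change amounts to roughly the following excerpt of the CephCluster spec (the rulesNamespace field is optional and defaults to the cluster namespace):
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  # ... other cluster settings omitted ...
  monitoring:
    enabled: true
    rulesNamespace: rook-ceph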
The Rook documentation also refers to a set of Grafana dashboards, created by @galexrt, that visualize the metrics exposed by Rook Ceph clusters.
Cleanup
Since Rook Ceph expects raw devices or partitions on the nodes it runs on, redeploying a cluster is not entirely straightforward (unless you can simply throw away and recreate the worker nodes).
For more detail, see the Rook Ceph Cleanup documentation.
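Roughly speaking, and glossing over the details covered there, cleaning up a node after the CephCluster resource has been deleted involves removing Rook's state directory and wiping the Ceph partition. A minimal sketch, assuming the partition layout used in this post:
# On each worker node, after the CephCluster resource has been deleted:
# remove the Rook state directory (monitor data and cluster configuration)
sudo rm -rf /var/lib/rook
# wipe the Ceph partition so that it is seen as a raw partition again
sudo wipefs --all /dev/vda2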
Further reading
Read more of our engineering blog posts
This blog post is part of our engineering blog post series. Experience and expertise, straight from our engineering team. Always with a focus on technical, hands-on HOWTO content with copy-pasteable code or CLI commands.