
Kubernetes Persistent Volume with Rook

Damian Igbe, PhD
March 24, 2022, 5:31 p.m.


Rook is a file, block, and object storage service for cloud-native environments, hosted by the Cloud Native Computing Foundation (CNCF). It is the front end that orchestrates storage services in a Kubernetes cluster.
Rook can connect to backend storage engines such as Ceph, CockroachDB, Cassandra, EdgeFS, Minio, and NFS. As of now, the Ceph backend is stable while the other backends are still in beta.
Rook turns storage software into self-managing, self-scaling, and self-healing storage services. It does this by automating deployment, bootstrapping, configuration, provisioning, scaling, upgrading, migration, disaster recovery, monitoring, and resource management. Rook uses the facilities provided by the underlying cloud-native container management, scheduling, and orchestration platform to perform its duties.

In this tutorial, you will learn how to deploy Rook onto a Kubernetes cluster. To follow along, you should have a running Kubernetes cluster and be familiar with the basics of Kubernetes. If you don't have a cluster running, you can either use Minikube or deploy a multi-node Kubernetes cluster using kubeadm or kOps.

Before we get started deploying Rook, let us understand Kubernetes storage primitives:

Kubernetes Storage

Kubernetes storage can be broadly divided into local storage and network-attached storage. Local storage lives on the cluster node where the Kubernetes Pod is running, while network-attached storage is centralised on the network (see the Kubernetes architecture here). Local storage has the advantage of being fast and can be used for things like scratch space, but it is ephemeral: at best it exists only for as long as the Pod is alive. Once the Pod is deleted, the local storage is deleted as well. For most applications, such as databases, this is not desirable, since preserving the state of the database is important. In Kubernetes, databases are typically deployed as StatefulSets.
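
To make the ephemeral nature of local storage concrete, here is a minimal, hypothetical Pod (not part of the Rook setup) that mounts an emptyDir volume as scratch space; anything written to /scratch is lost as soon as the Pod is deleted:

apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "echo hello > /scratch/hello.txt && sleep 3600"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch   # node-local scratch space, lives only as long as the Pod
  volumes:
  - name: scratch
    emptyDir: {}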

Applications such as StatefulSets depend on persistent storage and cannot work without it. An open-source storage backend that works well and is widely deployed is Ceph. Ceph is a unified, distributed storage system designed for excellent performance, reliability, and scalability. However, setting up a storage infrastructure like Ceph is often complex, and this is where Rook comes in. You deploy Rook, and Rook then acts as the operator managing the Ceph cluster on your behalf. Rook integrates deeply into cloud-native environments, leveraging extension points and providing a seamless experience for scheduling, lifecycle management, resource management, security, monitoring, and user experience.

StorageClasses, Persistent Volumes (PV) and Persistent Volume Claims (PVC)

To use a volume in Kubernetes, you first create a StorageClass that connects to a backend storage. For example, to use a Ceph cluster created through Rook, you would create a StorageClass pointing to Ceph, referred to as the provisioner in the YAML below. The manifest to create the StorageClass would look like the following:

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 3

---

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: ceph.rook.io/block
parameters:
  blockPool: replicapool
  clusterNamespace: rook-ceph
  # Specify the filesystem type of the volume. If not specified, it will use `ext4`.
  fstype: xfs

Once the storage class has been created, here is a summary of the steps required to connect a Pod to a persistent volume.

  1. A cluster administrator creates a PersistentVolume that is backed by physical storage.
  2. A cluster user creates a PersistentVolumeClaim, which gets automatically bound to a suitable PersistentVolume.
  3. The user creates a Pod that uses the PersistentVolumeClaim as storage.

Step 1: Create a Persistent Volume (PV)

After the storage class has been created, the cluster administrator can provision persistent volumes either statically or dynamically. Static provisioning is a manual process in which the cluster administrator creates volumes of different sizes and makes them available for cluster users (e.g. developers) to use in their applications. One problem with this method is that if a user requests a volume size that the administrator has not provisioned, the claim is bound to the next available volume that satisfies it, which can lead to wasted capacity.

With dynamic volume provisioning, when none of the static PVs created by the administrator matches a user's PersistentVolumeClaim, the cluster may try to dynamically provision a volume to satisfy the PVC, provided the administrator has created a StorageClass. A PVC must request a specific storage class unless a default one is available; without a storage class, dynamic provisioning cannot occur.
To enable dynamic storage provisioning based on a storage class, the cluster administrator needs to enable the DefaultStorageClass admission controller on the API server by including DefaultStorageClass in the comma-delimited, ordered list of values for the --enable-admission-plugins flag of the API server component.
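
For illustration, on a kubeadm-managed control plane this flag lives in the static Pod manifest of the API server. The sketch below assumes a typical file path and plugin list, which will vary per cluster; recent Kubernetes versions already enable DefaultStorageClass by default:

# /etc/kubernetes/manifests/kube-apiserver.yaml (kubeadm clusters)
spec:
  containers:
  - command:
    - kube-apiserver
    - --enable-admission-plugins=NodeRestriction,DefaultStorageClass
    # ...other flags left unchanged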

The manifest for a persistent volume looks like this:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"

Step 2: Create a Persistent Volume Claim (PVC)

A PVC is required to bind to a PV. Pods use PersistentVolumeClaims to request physical storage. A PVC can specify a label selector to further filter the set of volumes. Only the volumes whose labels match the selector can be bound to the claim. Here is an example of a PVC:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
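
Assuming the claim is saved as pvc.yaml, create it and watch it bind to the matching volume (both use the manual storage class, and the 3Gi request fits inside the 10Gi PV created above):

kubectl apply -f pvc.yaml
kubectl get pvc task-pv-claim    # STATUS moves from Pending to Bound
kubectl get pv task-pv-volume    # the CLAIM column now shows default/task-pv-claim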

Step 3: Create a Pod to use PVC

The next step is to create a Pod that uses your PersistentVolumeClaim to mount a volume at a specified mount point.

kind: Pod
apiVersion: v1
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
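
Assuming the Pod manifest is saved as pv-pod.yaml, create the Pod and verify that the claim-backed volume is mounted where nginx serves its content:

kubectl apply -f pv-pod.yaml
kubectl get pod task-pv-pod
kubectl exec -it task-pv-pod -- df -h /usr/share/nginx/html
kubectl exec -it task-pv-pod -- sh -c 'echo "Hello from a persistent volume" > /usr/share/nginx/html/index.html'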

Deploy Rook with a Ceph Cluster Backend

Now that we know what Rook is all about and how to connect a Pod to a persistent volume, let us create a Rook cluster with a Ceph backend. We will test the Rook deployment with two sample applications that create PVs, PVCs, and some Pods. We will follow this plan:

  1. Deploy Rook Operator
  2. Create a Rook Cluster
  3. Add Block Storage to Ceph
  4. Verify Block Storage Operation

Step 1: Deploy Rook Operator

Remember that you need a working Kubernetes cluster to follow this tutorial. This tutorial uses a Kubernetes cluster of one master node and two worker nodes, making a three-node cluster.

On your master node:

$ git clone https://github.com/rook/rook.git
$ cd rook/cluster/examples/kubernetes/ceph

Deploy the Rook Operator

$ kubectl create -f operator.yaml

Verify Rook Operator

Check that the following pods are in Running state before proceeding:

  • rook-ceph-operator
  • rook-ceph-agent
  • rook-discover
$ kubectl get pods -n rook-ceph-system | grep Running
NAME READY STATUS RESTARTS AGE

rook-ceph-agent-btzrg 1/1 Running 11 9d
rook-ceph-agent-nhs6l 1/1 Running 11 9d
rook-ceph-operator-b996864dd-2rptc 1/1 Running 9 8d
rook-discover-7j78x 1/1 Running 7 9d
rook-discover-zfvkx 1/1 Running 8 9d
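
You can also confirm that the Rook custom resource definitions were registered; the exact list of CRDs varies with the Rook version, and some releases ship them in a separate common.yaml:

kubectl get crd | grep rook.io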

Step 2: Create a Rook Cluster

Open cluster.yaml and edit the file to use filestore instead of bluestore:

$ vi cluster.yaml
storage: # cluster level storage configuration and selection
    useAllNodes: true
    useAllDevices: false
    deviceFilter:
    location:
    config:
      # The default and recommended storeType is dynamically set to bluestore for devices and filestore for directories.
      # Set the storeType explicitly only if it is required not to use the default.
      # storeType: bluestore
      storeType: filestore

Create the storage cluster:

$ kubectl create -f cluster.yaml

Verify Rook Cluster

To see the list of running pods, use kubectl in the rook-ceph namespace. The number of object storage daemon (OSD) pods (refer to the Ceph block architecture) will depend on the number of nodes in the cluster and the number of devices and directories configured. To fully understand the roles performed by these pods, you need an understanding of Ceph and how it works, which is not covered in this blog.

$ kubectl get pods -n rook-ceph | grep Running

NAME READY STATUS RESTARTS AGE
rook-ceph rook-ceph-mgr-a-7fcd6b6449-fljkt 1/1 Running 2 8d
rook-ceph rook-ceph-mon-g-565ff8cdf7-8fh4d 1/1 Running 2 8d
rook-ceph rook-ceph-mon-h-86f66b897-7fgjf 1/1 Running 2 8d
rook-ceph rook-ceph-mon-i-867d64c849-xp272 1/1 Running 2 8d
rook-ceph rook-ceph-osd-0-6495594f8d-2pf2z 1/1 Running 2 7d23h
rook-ceph rook-ceph-osd-1-7d694d4f5-thptb 1/1 Running 2 8d
rook-ceph rook-ceph-tools-76c7d559b6-gzpkz 1/1 Running 2 8d

Step 3: Add Block Storage

Now that Rook is up and running and connected to a Ceph cluster backend, let us configure a StorageClass that uses Ceph block storage. Block storage provides a traditional block storage device, like a hard drive, over the network. With the storage class created, one can provision a block storage device of any size and attach it to a Pod. Once attached, it is treated like a normal hard disk: to make it ready for use, you format it with a filesystem of your choice, such as ext4, XFS, or Btrfs. If desired, you can combine multiple block devices into a RAID array, or configure a database to write directly to the block device, avoiding filesystem overhead entirely.

Provision Storage

Open storageclass.yaml and set size: 3 as shown below:

vi storageclass.yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: ceph.rook.io/block
parameters:
  blockPool: replicapool
  # Specify the namespace of the rook cluster from which to create volumes.
  # If not specified, it will use `rook` as the default namespace of the cluster.
  # This is also the namespace where the cluster will be
  clusterNamespace: rook-ceph
  # Specify the filesystem type of the volume. If not specified, it will use `ext4`.
  fstype: xfs

Create the storage class:

cd rook/cluster/examples/kubernetes/ceph
kubectl create -f storageclass.yaml
pool.ceph.rook.io/replicapool created
storageclass.storage.k8s.io/rook-ceph-block created
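
Before moving on, confirm that both objects exist (the pool resource is exposed as pool or cephblockpool depending on the Rook version):

kubectl get storageclass rook-ceph-block
kubectl -n rook-ceph get cephblockpool   # on older Rook releases: kubectl -n rook-ceph get pool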

Step 4: Verify Block Storage Operation

Example 1: StatefulSets

Let us verify the configuration by creating a StatefulSet. The manifest for the StatefulSet is below:

apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www 
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "rook-ceph-block"
      resources:
        requests:
          storage: 1Gi

kubectl create -f statefulsets.yaml
kubectl get pods
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 1 3d
web-1 1/1 Running 1 3d
web-2 1/1 Running 1 3d
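
To confirm that the data really persists independently of any single Pod, write a file into the volume of web-0, delete the Pod, and read the file back once the StatefulSet controller has recreated it; the stable Pod name means web-0 reattaches to the same PVC (www-web-0):

kubectl exec web-0 -- sh -c 'echo "written before restart" > /usr/share/nginx/html/index.html'
kubectl delete pod web-0
kubectl get pods -w              # wait until web-0 is Running again, then Ctrl-C
kubectl exec web-0 -- cat /usr/share/nginx/html/index.html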

Example 2: WordPress and MySQL

In the rook folder you will find two YAML files for testing: mysql.yaml and wordpress.yaml. We want to ensure that the WordPress service is exposed via a NodePort, so let's edit wordpress.yaml and change the service type from
type: LoadBalancer to type: NodePort

apiVersion: v1
kind: Service
metadata:
  name: wordpress
  labels:
    app: wordpress
spec:
  ports:
  - port: 80
  selector:
    app: wordpress
    tier: frontend
  type: NodePort

Let us create them

cd /root/rook/cluster/examples/kubernetes
kubectl create -f mysql.yaml
kubectl create -f wordpress.yaml

service/wordpress created
persistentvolumeclaim/wp-pv-claim created
deployment.apps/wordpress created

kubectl get pods
wordpress-7b6c4c79bb-wj7l4 1/1 Running 0 6m45s
wordpress-mysql-6887bf844f-rxsbj 1/1 Running 0 7m4s

Based on the results of running the YAML files in examples 1 and 2 above, we now have several persistent volumes (PVs) and persistent volume claims (PVCs). All the YAML files contain their corresponding PVC sections, while the PVs were created automatically. A snippet from wordpress.yaml is shown below:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wp-pv-claim
  labels:
    app: wordpress
spec:
  storageClassName: rook-ceph-block
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi

kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-8d2874c3-4607-11e9-af13-02d2ebb64d8b 20Gi RWO Delete Bound default/mysql-pv-claim  rook-ceph-block 8d 
pvc-d8867ba3-4dad-11e9-b844-02d2ebb64d8b 20Gi RWO Delete Bound default/wp-pv-claim rook-ceph-block 8d 
pvc-3922fb62-4754-11e9-a461-02d2ebb64d8b 1Gi RWO Delete Bound default/www-web-1 rook-ceph-block 8d
pvc-42f62b50-4754-11e9-a461-02d2ebb64d8b 1Gi RWO Delete Bound default/www-web-2 rook-ceph-block 8d
pvc-63779225-4752-11e9-a461-02d2ebb64d8b 1Gi RWO Delete Bound default/www-web-0 rook-ceph-block 8d

kubectl get pvc

NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
mysql-pv-claim Bound pvc-8d2874c3-4607-11e9-af13-02d2ebb64d8b 20Gi RWO rook-ceph-block 9d
wp-pv-claim Bound pvc-d8867ba3-4dad-11e9-b844-02d2ebb64d8b 20Gi RWO rook-ceph-block 66s
www-web-0 Bound pvc-63779225-4752-11e9-a461-02d2ebb64d8b 1Gi RWO rook-ceph-block 8d
www-web-1 Bound pvc-3922fb62-4754-11e9-a461-02d2ebb64d8b 1Gi RWO rook-ceph-block 8d
www-web-2 Bound pvc-42f62b50-4754-11e9-a461-02d2ebb64d8b 1Gi RWO rook-ceph-block 8d

Now that WordPress and MySQL have been deployed, let us configure WordPress. Since we exposed the service as a NodePort, we can access WordPress at 192.168.205.12:30704, where 192.168.205.12 is the IP address of one of the cluster nodes (any worker node will do) and 30704 is the NodePort assigned to the WordPress service.

To obtain the NodePort, let us view the WordPress service.

kubectl get svc wordpress
wordpress NodePort 10.99.137.116 80:30704/TCP 10m
wordpress-mysql ClusterIP None 3306/TCP 9d

Opening http://192.168.205.12:30704 in the browser will start the WordPress installation/configuration.
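
If you prefer the command line, you can check the endpoint from any machine that can reach the node (your node IP and NodePort will differ); a fresh WordPress install typically responds with a redirect to its setup page:

curl -I http://192.168.205.12:30704
# expect an HTTP redirect to wp-admin/install.php until WordPress has been configured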

The Rook Toolbox Pod

The Rook toolbox is a container with common tools used for Rook debugging and testing.

cd /root/rook/cluster/examples/kubernetes/ceph
kubectl apply -f toolbox.yaml

kubectl get pods -n rook-ceph | grep tool
rook-ceph-tools-76c7d559b6-gzpkz         1/1     Running     2          8d


Once the rook-ceph-tools pod is running, you can connect to it with:

kubectl exec -it rook-ceph-tools-76c7d559b6-gzpkz -n rook-ceph -- /bin/bash

The usual Ceph tools are available inside the toolbox for troubleshooting, for example:

  • ceph status
  • ceph osd status
  • ceph df
  • rados df
[root@k8s-node-1 /]# ceph status
  cluster:
    id:     1b6005ac-ff2f-49b3-86eb-0b986a0d5b40
    health: HEALTH_WARN
            Degraded data redundancy: 147/441 objects degraded (33.333%), 70 pgs degraded, 100 pgs undersized
            mons m,n,o are low on available space
 
  services:
    mon: 3 daemons, quorum n,m,o
    mgr: a(active)
    osd: 2 osds: 2 up, 2 in
 
  data:
    pools:   1 pools, 100 pgs
    objects: 147  objects, 271 MiB
    usage:   15 GiB used, 4.2 GiB / 19 GiB avail
    pgs:     147/441 objects degraded (33.333%)
             70 active+undersized+degraded
             30 active+undersized
[root@k8s-node-1 /]# ceph osd status
+----+------------+-------+-------+--------+---------+--------+---------+-----------+
| id |    host    |  used | avail | wr ops | wr data | rd ops | rd data |   state   |
+----+------------+-------+-------+--------+---------+--------+---------+-----------+
| 0  | k8s-node-1 | 8170M | 1691M |    0   |     0   |    0   |     0   | exists,up |
| 1  | k8s-node-2 | 7218M | 2643M |    0   |     0   |    0   |     0   | exists,up |
+----+------------+-------+-------+--------+---------+--------+---------+-----------+
[root@k8s-node-1 /]# ceph df
GLOBAL:
    SIZE       AVAIL       RAW USED     %RAW USED 
    19 GiB     4.2 GiB       15 GiB         78.01 
POOLS:
    NAME            ID     USED        %USED     MAX AVAIL     OBJECTS 
    replicapool     1      271 MiB     18.39       800 MiB         147 

[root@k8s-node-1 /]# rados df
POOL_NAME      USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS     RD WR_OPS      WR 
replicapool 271 MiB     147      0    441                  0       0      147   4251 40 MiB   4169 379 MiB 

total_objects    147
total_used       15 GiB
total_avail      4.2 GiB
total_space      19 GiB
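
When you are done troubleshooting, exit the toolbox shell and, if you no longer need it, remove the toolbox deployment:

exit
kubectl delete -f toolbox.yaml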

Conclusion

This tutorial covers the basics of Kubernetes storage, but the main focus is the deployment of Rook as a persistent block volume backend. This is especially useful for applications like databases (StatefulSets) that require persistent volumes. Thanks for reading. I hope you enjoyed the blog and, if you did, don't forget to like and share it.
