Rook, Ceph, CSI, Kubernetes, and K10: An All-in-One Stateful Experience on your Laptop
Dec 27: Updated to use the Rook v1.2.0 release.
With the rise of stateful applications in Kubernetes and the rapid adoption of the Container Storage Interface (CSI) by both Kubernetes distributions and storage vendors, it is now rare to find a public or private Kubernetes offering that does not support stateful applications out of the box.
However, we still haven’t seen the same out-of-the-box experience with Kubernetes installed on developer machines and local test environments. While one often finds hostpath storage provisioning (from the host’s local file system) in these local environments, this is a very poor substitute. Local provisioners such as hostpath have enough restrictions that they don’t mirror the experience of a production system. Local provisioning also omits a number of powerful lifecycle management features (e.g., snapshots) offered by CSI.
These restrictions hurt both agility, as some storage actions cannot be reproduced locally, and portability, as applications and management systems cannot depend on a uniform abstraction. To remove these restrictions, this article walks through how one can easily create a local Kubernetes installation and convert it to a system with CSI-provisioned storage. While the experiments below were performed with Docker Desktop on OS X, we believe they should work with most local Kubernetes distributions. We further demonstrate the power of having a uniform abstraction by showing how one can move full application stacks between a local developer environment and a production cluster (on-prem or in-cloud).
There are very few prerequisites to get to this all-in-one experience. All you need is:
Kubernetes 1.14+ (needs the VolumeSnapshotDataSource feature gate turned on; see the note on Docker Desktop near the end of this article)
The Helm package manager for Kubernetes
git (to check out the Rook code)
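Before going further, it can help to check these prerequisites from a terminal. The following is a minimal sketch, not part of the original walkthrough: the helper names `version_at_least` and `missing_tools` are hypothetical, and the version comparison only looks at the major and minor components.

```shell
#!/usr/bin/env bash

# Succeeds when version $1 (e.g. "v1.16.3") is at least version $2 (e.g. "1.14").
# Hypothetical helper: only the major and minor components are compared.
version_at_least() {
  local have="${1#v}"
  local want="${2#v}"
  local have_major="${have%%.*}"
  local want_major="${want%%.*}"
  local have_minor="${have#*.}"
  local want_minor="${want#*.}"
  # Keep only the leading digits of the minor part ("14.8" -> "14", "16+" -> "16").
  have_minor="${have_minor%%[!0-9]*}"
  want_minor="${want_minor%%[!0-9]*}"
  if [ "$have_major" -ne "$want_major" ]; then
    [ "$have_major" -gt "$want_major" ]
  else
    [ "$have_minor" -ge "$want_minor" ]
  fi
}

# Prints the name of every listed tool that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

For example, `missing_tools kubectl helm git` prints nothing when all three tools are installed, and `version_at_least "$server_version" 1.14` can be fed the server version reported by `kubectl version`.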
I. Installing Rook+Ceph, CSI-based Storage for Kubernetes
We are going to use Rook to provision Ceph-based block storage in Kubernetes. While more details on the installation process are available elsewhere, we will walk through the “happy-path” steps here.
First, we need to obtain the Rook source code to set Ceph up.
$ git clone https://github.com/rook/rook.git
Cloning into 'rook'...
$ cd rook
# Switch to the latest Rook release at time of writing
$ git checkout v1.2.1
Note: checking out 'v1.2.1'.
HEAD is now at ccc10604 Merge pull request #4618 from travisn/manifests-1.2.1
Now, we will install Rook and then Ceph into the rook-ceph namespace.
$ cd cluster/examples/kubernetes/ceph
$ kubectl create -f common.yaml
$ kubectl create -f operator.yaml
At this point, we need to verify that the Rook Ceph operator is in the Running state. You can check this with a command like:
$ kubectl --namespace=rook-ceph --watch=true get pods
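If you would rather block until the operator is ready than watch the pod list, `kubectl wait` is one option. This is a sketch under assumptions: it assumes the operator pod carries the label `app=rook-ceph-operator`, and the function name and the `KUBECTL` override are hypothetical conveniences, not part of Rook.

```shell
#!/usr/bin/env bash

# KUBECTL may be set to another command (for example "echo") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Blocks until the operator pod reports Ready, or fails once the timeout expires.
# Assumption: the operator pod is labeled app=rook-ceph-operator.
wait_for_rook_operator() {
  "$KUBECTL" --namespace=rook-ceph wait pod \
    --selector=app=rook-ceph-operator \
    --for=condition=Ready \
    --timeout="${1:-300s}"
}
```

Calling `wait_for_rook_operator` after creating `operator.yaml` returns once the pod is Ready; an optional argument such as `60s` shortens the timeout.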
Once you have verified the operator is running, you can install a test Ceph cluster. Note that this cluster should never be used in production.
$ kubectl create -f cluster-test.yaml
Setting Up Storage and Snapshot Provisioning
Once you have verified that all the above steps completed without errors and that all pods in the rook-ceph namespace are running without problems, we can set up dynamic storage provisioning as well as the volume snapshot definitions.
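The command for the StorageClass step is not shown in this text. As a hedged sketch, assuming the Rook v1.2.x checkout ships a test StorageClass manifest at `csi/rbd/storageclass-test.yaml` next to the snapshot class used below, it could look like the following. The function name and the `KUBECTL` override are hypothetical; verify the manifest path in your own checkout before running it.

```shell
#!/usr/bin/env bash

# KUBECTL may be set to another command (for example "echo") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Run from cluster/examples/kubernetes/ceph in the Rook checkout.
create_rbd_storage_class() {
  # Assumed manifest path; pass a different path as the first argument if needed.
  "$KUBECTL" create -f "${1:-csi/rbd/storageclass-test.yaml}"
  # List StorageClasses to confirm that the new class is registered.
  "$KUBECTL" get storageclass
}
```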
Similar to the StorageClass, we also need to install Ceph’s VolumeSnapshotClass to enable the use of Ceph’s snapshot functionality.
$ kubectl create -f csi/rbd/snapshotclass.yaml
II. Installing K10, Data Management For Kubernetes
While the above instructions get you started with a CSI-compliant storage system on Kubernetes, that is only the beginning. To really leverage the power of your application’s data, we will now add data management to the mix. In particular, we will use the free and fully-featured edition of K10, a data management platform that is purpose-built for Kubernetes.
K10 gives us, on our laptop, an easy-to-use and secure system for backup/restore and mobility of entire Kubernetes applications. This includes use cases such as:
Simple backup/restore for your entire application stack to make it easy to “reset” your system to a known good state
Cloning within namespaces within your development cluster for debugging
Moving data from production into your test environment for realistic testing
Installing K10 and Integrating with CSI
Installing K10 is quite simple and you should be up and running in 5 minutes or less! You can view the complete documentation if needed but just follow these instructions to get started.
$ helm repo add kasten https://charts.kasten.io/
$ helm install --name=k10 --namespace=kasten-io kasten/k10
LAST DEPLOYED: Sat Nov 2 23:27:10 2019
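The install command above uses Helm 2 syntax. Helm 3 removed the `--name` flag (the release name is a positional argument) and, in its early releases, does not create the target namespace for you. A rough Helm 3 equivalent, with a hypothetical function name and hypothetical `KUBECTL`/`HELM` overrides for dry runs, might be:

```shell
#!/usr/bin/env bash

# KUBECTL and HELM may be set to another command (for example "echo") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"
HELM="${HELM:-helm}"

# Helm 3 variant of the install step; assumes the kasten repo was already added.
install_k10_helm3() {
  # Create the namespace first, since Helm 3 expects it to exist.
  "$KUBECTL" create namespace kasten-io
  # The release name (k10) is positional in Helm 3.
  "$HELM" install k10 kasten/k10 --namespace=kasten-io
}
```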
We will next annotate the VolumeSnapshotClass we created as part of the Rook+Ceph install to indicate that K10 should use it. Note that, just like a default StorageClass, there should be only one VolumeSnapshotClass with this annotation. We will also set the deletion policy to retain snapshots here.
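The commands for this step are not shown in this text. A minimal sketch follows, under two assumptions: that the snapshot class created from Rook's `snapshotclass.yaml` is named `csi-rbdplugin-snapclass`, and that K10 looks for the annotation `k10.kasten.io/is-snapshot-class`. Check both against your cluster (`kubectl get volumesnapshotclass`) and the K10 documentation. The function name and the `KUBECTL` override are hypothetical.

```shell
#!/usr/bin/env bash

# KUBECTL may be set to another command (for example "echo") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"
# Assumed name of the VolumeSnapshotClass installed by Rook's snapshotclass.yaml.
SNAPSHOT_CLASS="${SNAPSHOT_CLASS:-csi-rbdplugin-snapclass}"

configure_snapshot_class_for_k10() {
  # Mark this class as the one K10 should use (assumed annotation key).
  "$KUBECTL" annotate volumesnapshotclass "$SNAPSHOT_CLASS" \
    k10.kasten.io/is-snapshot-class=true
  # Keep the underlying snapshot when the VolumeSnapshot object is deleted.
  "$KUBECTL" patch volumesnapshotclass "$SNAPSHOT_CLASS" \
    --type=merge --patch='{"deletionPolicy":"Retain"}'
}
```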
Finally, set up port forwarding from local port 8080 to K10 and then access the dashboard at http://127.0.0.1:8080/k10/#/.
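The port forwarding behind that address can be sketched as below. It assumes the K10 chart exposes the dashboard through a service named `gateway` listening on port 8000 in the kasten-io namespace; confirm this with `kubectl --namespace kasten-io get services`. The function name and the `KUBECTL` override are hypothetical.

```shell
#!/usr/bin/env bash

# KUBECTL may be set to another command (for example "echo") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Forwards a local port (default 8080) to the assumed K10 gateway service.
# The command keeps running until interrupted with Ctrl-C.
forward_k10_dashboard() {
  "$KUBECTL" --namespace kasten-io port-forward service/gateway "${1:-8080}:8000"
}
```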
As seen in the images below, you will then be presented with a dashboard that shows a card with all automatically discovered applications. Clicking on this card brings up a description of these applications, where you can either create a policy or, for experimentation, just run a full manual backup. The main dashboard will then show in-progress and completed activity as this action is invoked and completes. Finally, if you want to, you can also restore this application (or clone it into a new namespace) after making changes, via the same applications page.
Advanced Use Cases: Multi-Cloud Mobility
The above walkthrough is an example of a backup, restore, and clone workflow in a single cluster. However, this only scratches the surface of the power of K10. You can use the same system to export an entire application stack and its data from production clusters, potentially mask it, push it to an object storage system, and then read it in your local development cluster.
IV. BTW, A Snapshot is Not a Backup!
The above workflows, while extremely useful for local development, only invoke snapshots that are stored on Ceph alongside the primary data. This lack of fault isolation means that these snapshots are not meant to protect data against loss. For that, you truly want to create backups by exporting these snapshots to an independent object store. This is, obviously, supported in K10 today but we will cover this in a separate post soon.
This article has shown that, even in a resource-constrained developer environment, a few simple commands can deliver an all-in-one experience for stateful applications running in Kubernetes. Reducing differences between environments not only increases productivity and agility but also enables more powerful use cases such as cross-environment migration and cloning.
We would love to hear from you and see if this was useful! Find us on Twitter, drop us an email, or swing by our website.
While unsupported, it is possible to enable VolumeSnapshotDataSource in Docker Desktop (latest versions ship with Kubernetes 1.14.x or higher). Once you have Kubernetes enabled and running, disable it (but do not disable the Docker Engine) from Docker Desktop Preferences, run docker run -it --privileged --pid=host justincormack/nsenter1, and then edit kube-apiserver.yaml in /var/lib/kubeadm/manifests to add - --feature-gates=VolumeSnapshotDataSource=true as the last argument to the kube-apiserver command.
Niraj Tolia is the General Manager and President of Kasten (acquired by Veeam), which he founded in order to solve the problem of Kubernetes backup and disaster recovery. With a strong technical background in distributed systems, storage, and data management, he has held multiple leadership roles in the past, including Senior Director of Engineering for Dell EMC's CloudBoost group and VP of Engineering and Chief Architect at Maginatics (acquired by EMC). Dr. Tolia received his PhD, MS, and BS in Computer Engineering from Carnegie Mellon University.