OpenShift Disaster Recovery with K10

As we continue to make deep inroads into the Global 2000 list of enterprises, we more often than not run into Red Hat OpenShift running in production within these companies. OpenShift, based on Kubernetes, has been designed to securely “build, deploy and manage your container-based applications consistently across cloud and on-premises infrastructure” and we see it being deployed today not just within private data centers but also in the public cloud.

Recently, Red Hat announced their release of OpenShift 4, a complete re-architecture of their entire Kubernetes stack. While you can find more detailed coverage of this action-packed release elsewhere, we do believe it is a big step toward further increasing the production footprint of Kubernetes in the enterprise.

More importantly, OpenShift 4 now comes with OpenShift Container Storage (OCS). OCS has moved away from the use of GlusterFS (found in Red Hat’s Container-Native Storage offering for older OpenShift versions) and has shifted to Ceph and Rook (based on CSI). We believe this decision to be the right one given that Ceph was a far more popular storage choice even with OpenShift 3.x.

Given Red Hat’s push behind OCS and the continuing growth of stateful cloud-native applications in production, we have been working hard to ensure that K10, our enterprise data management platform, works as well with OpenShift 4 as it did with OpenShift 3. If you are not familiar with K10, it has been purpose-built for Kubernetes and OpenShift and provides enterprise operations teams an easy-to-use, scalable, and secure system for backup/restore, disaster recovery, and mobility of Kubernetes applications. K10’s unique application-centric approach and deep integrations with relational and NoSQL databases, Kubernetes distributions, and all clouds provide teams the freedom of infrastructure choice without sacrificing operational simplicity.

To demonstrate the power of K10 with OpenShift 4, this blog will run through all the steps needed to get the combination of K10, OpenShift, Rook, Ceph, and your favorite applications up and running. Thanks to Red Hat, it was very easy to get OpenShift up and running via try.openshift.com on AWS, but you can use any other installation method or cloud too! Apart from OpenShift, all you need to get up and running is the Helm package manager and git (to check out the Rook code).

The rest of this article is broken up into four parts:

  1. Enabling Snapshot Restore in OpenShift
  2. Installing Rook+Ceph in OpenShift
  3. Installing K10, An Enterprise-Grade Data Management Platform
  4. Backup and Restore Workflows With a Test Application

I. Enabling Snapshot Restore in OpenShift

To enable the restore of CSI-based Ceph volume snapshots, we need to enable the VolumeSnapshotDataSource feature gate. This is very easy to do with OpenShift and can be done either via the command-line oc tool or via the OpenShift console (follow the “Procedure” docs).

  • Via the OpenShift Console:

    Switch to Administration → Custom Resource Definitions → Find and click FeatureGate → Click Instances → Click cluster → Click YAML to edit.

  • Via the command line:

    $ oc edit featuregate cluster

Whichever option you use, replace the empty spec section (it should just show {}) with the following. The final YAML should look like this:

apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  name: cluster

...

spec:
  customNoUpgrade:
    enabled:
      - VolumeSnapshotDataSource
  featureSet: CustomNoUpgrade

Once you make the changes, it will take a couple of minutes for the change to be applied to the cluster. To verify that the changes have been applied successfully, you can check that all OpenShift pods are running without errors (e.g., via oc get pods --all-namespaces --watch) and, in particular, ensure that all pods in the openshift-kube-apiserver namespace are in the Running state.
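
For example, a quick check of that namespace (pod names will differ per cluster) should show every pod in the Running state once the new configuration has rolled out:

$ oc --namespace=openshift-kube-apiserver get pods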

To ensure that the right feature gate was enabled, simply look for "feature-gates":["VolumeSnapshotDataSource=true"] in the output of the following command:

$ oc --namespace=openshift-kube-apiserver rsh <pod-name> cat /etc/kubernetes/static-pod-resources/configmaps/config/config.yaml

Note: Enabling the above feature gate is not officially supported by OpenShift yet.


II. Installing Rook+Ceph, CSI-based Storage for OpenShift


We are going to use Rook to provision Ceph-based block storage in OpenShift. While more details on the installation process are available elsewhere, we will walk through the “happy-path” steps here.

Setting up a Ceph RBD (Block) Cluster

First, we need to obtain the Rook source code to set Ceph up.

$ git clone https://github.com/rook/rook.git
Cloning into 'rook'...
...

$ cd rook

# Switch to the latest Rook release at time of writing
$ git checkout v1.2.1
Note: checking out 'v1.2.1'.
...

HEAD is now at ccc10604 Merge pull request #4618 from travisn/manifests-1.2.1

Now, we will install Rook and then Ceph into the rook-ceph namespace.

$ cd cluster/examples/kubernetes/ceph

$ oc create -f common.yaml
namespace/rook-ceph created
customresourcedefinition.apiextensions.k8s.io/cephclusters.ceph.rook.io created
...

$ oc create -f operator-openshift.yaml
deployment.apps/rook-ceph-operator created

At this point, we need to verify that the Rook Ceph operator is in the Running state. This can be done with a command like:

$ oc --namespace=rook-ceph --watch=true get pods

Once you have verified the operator is running, you can install a test Ceph cluster.

$ oc create -f cluster-test.yaml
cephcluster.ceph.rook.io/rook-ceph created

If you would like to verify that Ceph is running correctly, simply run the following and ensure the cluster health is shown as HEALTH_OK.

$ oc create -f toolbox.yaml
$ oc --namespace=rook-ceph exec -it $(oc --namespace=rook-ceph \
    get pod -l "app=rook-ceph-tools" \
    -o jsonpath='{.items[0].metadata.name}') ceph status
  cluster:
    id:     <cluster id>
    health: HEALTH_OK
...

Setting Up Storage and Snapshot Provisioning

Once you have verified that all of the above steps completed without errors and that all pods in the rook-ceph namespace are running without problems, we can enable dynamic storage provisioning and install the volume snapshot definitions.

First, install the StorageClass using

$ oc create -f csi/rbd/storageclass.yaml
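
If you would like to confirm that the class was registered (the example manifest names it rook-ceph-block, which we will reference again below), you can list it directly:

$ oc get storageclass rook-ceph-block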

Similar to the StorageClass, we also need to install Ceph’s VolumeSnapshotClass to enable the use of Ceph’s snapshot functionality.

$ oc create -f csi/rbd/snapshotclass.yaml
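
As a quick sanity check, the new snapshot class (named csi-rbdplugin-snapclass in the example manifest, and used again in the K10 section below) should now show up when you list volume snapshot classes:

$ oc get volumesnapshotclass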

Finally, we need to switch the cluster’s default storage class to the new Ceph storage class. Assuming the old default was gp2 (if running in AWS), we can execute the following commands to make the switch.

$ oc patch storageclass gp2 \
    -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
$ oc patch storageclass rook-ceph-block \
    -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

III. Installing K10, An Enterprise-Grade Data Management Platform


While the above instructions get you started with a CSI-compliant storage system on OpenShift, that is only the beginning. To really leverage the power of your application’s data, we will now add data management to the mix. In particular, we will use the free and fully-featured edition of K10, a data management platform that is deeply integrated into OpenShift. For example, K10 transparently works with OpenShift’s Security Context Constraints and also understands and captures OpenShift extensions such as DeploymentConfigs.

K10 will provide you an easy-to-use and secure system for backup/restore and mobility of your entire OpenShift application. This includes use cases such as:

  • Simple backup/restore for your entire application stack to make it easy to “reset” your system to a known good state
  • Cloning applications across namespaces within your development cluster for debugging
  • Moving data from production into your test environment for realistic testing

Installing K10 and Integrating with CSI-based Rook/Ceph

Installing K10 is quite simple and you should be up and running in 5 minutes or less! You can view the complete documentation if needed but just follow these instructions to get started.

$ helm repo add kasten https://charts.kasten.io/
$ oc create namespace kasten-io
# Remove --name= for Helm 3
$ helm install --name=k10 --namespace=kasten-io kasten/k10
NAME:   k10
LAST DEPLOYED: Fri Nov 22 22:54:09 2019
NAMESPACE: kasten-io
STATUS: DEPLOYED
...
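
Before moving on, it is worth waiting for all K10 pods to reach the Running state (the exact set of pods varies by K10 release):

$ oc --namespace=kasten-io get pods --watch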

We will next annotate the VolumeSnapshotClass we created as a part of the Rook+Ceph install to indicate that K10 should use it. Note that, just like a default StorageClass, there should be only one VolumeSnapshotClass with this annotation. We will also set the deletion policy to Retain so that snapshots are preserved.

$ oc annotate volumesnapshotclass csi-rbdplugin-snapclass \
    k10.kasten.io/is-snapshot-class=true
$ oc patch volumesnapshotclass csi-rbdplugin-snapclass \
    -p '{"deletionPolicy":"Retain"}' --type=merge

IV. Backup and Restore Workflows With a Test Application


Installing Redis as a Test Application

To demonstrate the power of this end-to-end storage and data management system, let’s also install Redis on your test cluster.

# The below instructions are for Helm 2.x and must be tweaked for Helm 3.x
$ helm install --name=redis stable/redis --namespace redis \
    --set securityContext.enabled=false

Once installed, you should verify that this provisioned new Ceph storage by running oc get pv and checking the associated StorageClass of the Redis Persistent Volumes (PVs).
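
For example, assuming the release was installed into the redis namespace as above, the following should show Redis PVCs bound to volumes backed by the rook-ceph-block StorageClass:

$ oc --namespace=redis get pvc
$ oc get pv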

Backup and Recovery Workflows

First, to access the K10 dashboard (API and CLI access is also available if needed), run the following command:

$ oc --namespace kasten-io port-forward service/gateway 8080:8000
and then access the dashboard at http://127.0.0.1:8080/k10/#/.

As seen in the below images, you will then be presented with the dashboard that shows a card with all automatically-discovered applications. By clicking on this card, you will be presented with an informational description of these applications and you can either create a policy, or for experimentation, just do a full manual backup. The main dashboard will then show in-progress and completed activity as this action gets invoked and completed. Finally, if you want to, you can also restore (or clone in a new namespace) this application again after making changes via the same applications page.

Converting Snapshots into Backups

The above snapshot workflow, while extremely useful for fast recovery, only creates snapshots, and those snapshots are stored on Ceph alongside the primary data. This lack of fault isolation means that snapshots alone are not meant to protect data against loss. For that, you want to create backups by exporting these snapshots to an independent object store.

The easiest way to do this is to create a policy for periodic backups and snapshots and to associate the policy with a mobility profile that uses object storage to store deduplicated backups. K10 supports a wide variety of object storage systems including all public cloud options as well as Minio, Red Hat’s NooBaa, and Ceph’s S3-compatible Object Gateway.

To support various workflows, and as the below screenshots show, the snapshot and backup retention schedule can be independent. This allows you to save a few snapshots for quick undelete with a larger number of backups stored in object storage to deliver cost, performance, and scalability benefits.

Advanced Use Cases: Multi-Cloud Mobility

The above walkthrough is an example of a snapshot, backup, and restore workflow in a single cluster. However, this only scratches the surface of the power of K10. You can use the same system to export an entire application stack and its data from production clusters and bring it up in a geographically separate DR cluster. Alternatively, you could mask data, push it to an object storage system, and then read it in your local development cluster. Documentation for these use cases is available here but you can look at this video to get a quick overview.


V. Summary

This article has shown how K10 can be easily integrated with OpenShift 4.x and Rook-based Ceph storage. However, K10 also integrates with a variety of storage systems that can be used with OpenShift, including storage provided by all the major public cloud providers (AWS, Azure, Google, IBM), NetApp (via their Trident provider), and more. K10 also includes application-level integrations with a range of relational databases (e.g., MySQL or PostgreSQL) and NoSQL systems (e.g., MongoDB or Elastic). Check out our datasheet for more information.

Finally, we would love to hear from you to see how K10 could be useful in your OpenShift environment. 
