Kanister Simplifies Application-level Data Operations on Kubernetes
IT departments have many choices for infrastructure and application deployment. Organizations choose containers for their portability, scalability and deployment speed, among other benefits. Adopting a cloud-native approach offers additional benefits such as increased agility, lower capital expenditure (CAPEX), and improved scalability and reliability.
For teams working with Kubernetes, ensuring that all data, particularly application data, is protected can be a tricky proposition. With each Kubernetes release, users have seen improvements in running stateful workloads. However, on its own, Kubernetes lacks robust application data management capabilities.
The lifecycles and workflows behind cloud-native applications can be complex. Kubernetes currently allows admins to manage data in several ways: storage-centric snapshots, storage-centric snapshots with hooks or APIs into the application, and data services.
Admins may take a storage-centric approach or a data-centric approach using tools such as mysqldump or pg_dump. Briefly, each has its own pros and cons:
Storage-centric snapshots: Rely on the underlying storage provider’s snapshot capabilities to snapshot the underlying volumes. They don’t interact with the application itself.
Storage-centric snapshots with hooks or APIs: Freeze the application before the snapshot and unfreeze it afterward.
Data service-centric approach: Leverages tools provided by the databases. Features such as encryption are provided by the tools, but the recovery process is a bit complex.
Application-centric approach: Use all of the above in a coordinated fashion.
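The hook-based pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not Kanister code: the function names, the "mysql" and "pvc-data" values and the event log are hypothetical stand-ins for real freeze, snapshot and unfreeze calls.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(app, log):
    """Freeze the application for the duration of the block, then unfreeze it.

    The log entries stand in for real hook calls (hypothetical).
    """
    log.append(f"freeze {app}")
    try:
        yield
    finally:
        # Runs even if the snapshot fails, so the app is never left frozen.
        log.append(f"unfreeze {app}")


def snapshot_with_hooks(app, volume, log):
    """Take a storage snapshot while the application is frozen."""
    with quiesced(app, log):
        log.append(f"snapshot {volume}")
    return log


events = snapshot_with_hooks("mysql", "pvc-data", [])
print(events)
```

The point of the context manager is ordering: the unfreeze step always runs after the snapshot step, including when the snapshot raises an error.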
Let’s talk a little about the application-centric approach to data operations in Kubernetes.
An Application-centric Approach with Kanister
For DevOps teams using Kubernetes, Kanister is an open-source project that allows domain experts to capture application-specific data management tasks in blueprints that can be easily shared and extended. First posted on GitHub almost three years ago, Kanister takes care of the tedious details around application data management on Kubernetes and presents a homogeneous operational experience across applications at scale.
Kanister comprises four primary components:
The Kanister controller: An operator, built on the Kubernetes operator pattern, that helps manage Blueprints, ActionSets and Profiles.
Blueprints: Custom resources used to define workflows for operations such as backup, restore or delete. Essentially, they provide the ability to hook into the data service(s).
ActionSets: Custom resources used to execute a specific action from a specific Blueprint.
Profiles: These determine the destination for backups or the source for restores (for example, AWS, Azure Blob Storage or another target).
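To make the relationship between these resources concrete, here is a rough Python sketch that builds a Blueprint and an ActionSet as plain dictionaries and prints the ActionSet as JSON. The field layout follows the author's understanding of the `cr.kanister.io/v1alpha1` resources and should be checked against the Kanister documentation. The resource names, the image and the placeholder command are hypothetical.

```python
import json


def make_blueprint(name, namespace="kanister"):
    """Blueprint with a single 'backup' action made of one phase."""
    return {
        "apiVersion": "cr.kanister.io/v1alpha1",
        "kind": "Blueprint",
        "metadata": {"name": name, "namespace": namespace},
        "actions": {
            "backup": {
                "phases": [{
                    "func": "KubeTask",          # runs a command in a new pod
                    "name": "dumpDatabase",      # hypothetical phase name
                    "args": {
                        "namespace": "{{ .Deployment.Namespace }}",
                        "image": "example.com/db-tools:latest",  # hypothetical
                        "command": ["sh", "-c", "echo 'run the dump here'"],
                    },
                }]
            }
        },
    }


def make_actionset(action, blueprint, workload, profile, namespace="kanister"):
    """ActionSet that asks the controller to run one action from a Blueprint."""
    return {
        "apiVersion": "cr.kanister.io/v1alpha1",
        "kind": "ActionSet",
        "metadata": {"generateName": f"{action}-", "namespace": namespace},
        "spec": {"actions": [{
            "name": action,
            "blueprint": blueprint["metadata"]["name"],
            "object": workload,   # the application the action targets
            "profile": profile,   # where the backup data goes
        }]},
    }


blueprint = make_blueprint("example-db-blueprint")
actionset = make_actionset(
    "backup",
    blueprint,
    workload={"kind": "Deployment", "name": "example-db", "namespace": "default"},
    profile={"name": "example-profile", "namespace": "kanister"},
)
print(json.dumps(actionset, indent=2))
```

The ActionSet carries no workflow logic of its own. It only names the action, the Blueprint that defines it, the target workload and the Profile, which is what lets one Blueprint be reused across many applications.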
[Figure: flow between the Kanister components, shown at 10:05 in the video]
Kanister offers two additional command-line tools, kanctl and kando. The kanctl tool can be used to create ActionSets and Profiles, while kando runs inside a container and helps move data to and from the object store.
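As a rough illustration of how the two tools are typically invoked, the Python sketch below assembles their command lines as argument lists without running them. The subcommands and flags reflect the author's reading of the Kanister tutorial and should be verified against the installed version. The blueprint, deployment, profile and path values are hypothetical.

```python
import shlex


def kanctl_create_actionset(action, blueprint, deployment, profile,
                            namespace="kanister"):
    """Build the argv for creating an ActionSet with kanctl."""
    return [
        "kanctl", "create", "actionset",
        "--action", action,
        "--namespace", namespace,      # namespace of the Kanister controller
        "--blueprint", blueprint,
        "--deployment", deployment,    # given as namespace/name
        "--profile", profile,          # given as namespace/name
    ]


def kando_location_push(path):
    """Build the argv for streaming stdin to the object store with kando.

    The profile argument is a Kanister template that is rendered when the
    command runs inside a Blueprint phase.
    """
    return [
        "kando", "location", "push",
        "--profile", "{{ toJson .Profile }}",
        "--path", path,
        "-",                           # read the data from standard input
    ]


create_cmd = kanctl_create_actionset(
    "backup", "example-db-blueprint",
    "default/example-db", "kanister/example-profile",
)
push_cmd = kando_location_push("backups/example-db/dump.sql.gz")
print(shlex.join(create_cmd))
print(shlex.join(push_cmd))
```

In a Blueprint, a command like the second one usually sits at the end of a shell pipeline, so that the output of a dump tool such as pg_dump is streamed straight to the object store.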
In this informative video, Kasten by Veeam’s Pavan Navarathna discusses data management challenges in Kubernetes and provides a demonstration of Kanister with real data. Users familiar with YAML can easily jump in and try it themselves. The demo was staged in conjunction with the Data on Kubernetes Community (DoKC), “an openly governed group of curious and experienced practitioners, taking inspiration from the CNCF and Apache Software Foundation.” DoKC’s goal is to “assist in the emergence and development of techniques for the use of Kubernetes for data.”
Beyond Kanister’s current capabilities, the team plans to add a guide for writing Blueprints. Blueprints already exist for a number of popular databases, and the guide should help users whose database doesn’t yet have one to write their own. Also in the works are plans to add file storage as a destination for backups, and to add encryption, compression and deduplication for the data being moved.
As Kasten’s engineering manager, Pavan designs and develops features for the Kasten K10 platform and actively contributes to Kanister, an open-source project for application-level data management on Kubernetes. Prior to joining Kasten, he worked at NetApp designing and developing filesystem features on AltaVault, a cloud-integrated storage appliance. Pavan began his career as a software developer at Aryaka, developing SMB protocol proxy features for the SD-WAN as-a-service product.