This blog post was co-authored with Julio Lopez.
Kasten uses and helps develop kopia, a fast and secure open-source tool to manage backups. Today, kopia can work standalone on machines ranging from small laptops to large servers and comes with a variety of out-of-the-box options for encryption, deduplication, and compression. It is also a lock-free system that, unlike similar tools, allows for concurrent multi-client operations, including garbage collection.
Kopia's true value really starts to emerge when we pair it with Kubernetes and K10, Kasten’s Kubernetes backup platform:
- Faster, Smaller, Safer: With Kubernetes, kopia can focus on its core data transformation while Kubernetes takes care of orchestration, high availability, resource allocation, and scaling.
- Client-Side Architecture: Kopia requires no server-side resources (e.g., an always-on dedicated backup appliance or VM). It runs completely client-side and is co-located with the application. Combined with concurrent multi-client support, this ensures that there is no centralized bottleneck when data is streamed to secondary storage as new containers are launched across all Kubernetes nodes during peak demand.
- “Serverless”: Unlike legacy deduplication engines that are sized for peak demand and scale vertically, K10 allows kopia to dynamically start and scale up in response to workloads and scale back down when not in use. This horizontal scaling model ensures that there is no overhead when not in use and that the system can grow with applications and the cluster.
- Application Storage Choice: Given kopia’s pluggable backend engine and K10’s application-centric architecture, every Kubernetes application can individually decide where its backups should go (on-prem storage, public cloud object storage, etc.) vs. being limited by what a centralized media repository supports or is configured with.
- Fault and Security Isolation: Kopia’s architecture allows K10 to offer each application its own deduplication store. This allows each application to have personalized encryption keys, improves fault tolerance across applications as the failure of one deduplication domain has no impact on another, and reduces risk during upgrades.
Kopia’s development has accelerated in 2020 and is quickly approaching 1.0. While a number of new features have shown up within the tool, this post will concentrate on the performance improvements made over the last few months. To do that, we will compare v0.4.0 (January, 2020), v0.5.2 (March, 2020), and v0.6.0-rc1 (July, 2020). We will also compare kopia to restic, another popular open-source backup tool. All binaries were downloaded from GitHub. With the exception of the s2-standard compression scheme being enabled with kopia, the default options were used for all tools.
We measured how long it took to back up data using the above systems on an AWS EC2 m5dn.12xlarge VM (48 vCPUs, 192GB RAM, 2x900 GB NVMe SSD, 12 Gbps network bandwidth) running Ubuntu 18.04. On the VM, we generated two datasets that were stored on one of the local NVMe devices:
- Large Files: 200GiB of data spread across 1000 files, generated with fio (dedupe_percentage=40, buffer_compress_percentage=60)
- Small Files: 2M small files with 1KiB of random data
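A dataset like the small-files one can be approximated with a few lines of Python. The sketch below is not the script used for these experiments: the function name `generate_small_files`, the subdirectory layout, and the file naming are all assumptions made for illustration.

```python
import os
from pathlib import Path


def generate_small_files(root, count, size=1024, per_dir=1000):
    """Write `count` files of `size` random bytes each under `root`.

    Returns the total number of payload bytes written.
    """
    root = Path(root)
    for i in range(count):
        # Assumption: spread files over subdirectories so that no single
        # directory holds millions of entries.
        subdir = root / f"dir{i // per_dir:05d}"
        if i % per_dir == 0:
            subdir.mkdir(parents=True, exist_ok=True)
        # os.urandom yields incompressible, non-deduplicable data.
        (subdir / f"file{i:07d}.bin").write_bytes(os.urandom(size))
    return count * size
```

Calling `generate_small_files("/mnt/nvme/small", 2_000_000)` (the path is hypothetical) would produce 2M files of 1KiB each.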
All systems pushed data to an AWS S3 bucket in the same region as the EC2 VM. We repeated each experiment three times, and the results presented below are the average of the three runs. For each experiment, we captured a number of metrics and report both the total experiment runtime and the final size of the data stored in S3 in the graphs below.
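The repeat-and-average methodology can be sketched as a small timing harness. This is an illustrative assumption rather than the harness used for this post: `time_command` is a hypothetical helper, and a real benchmark would presumably also reset the repository and caches between runs, which the sketch omits.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=3):
    """Run `cmd` `runs` times and return (mean_seconds, per_run_seconds)."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails,
        # so failed runs never contribute to the average.
        subprocess.run(cmd, check=True)
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), durations
```

For example, `time_command(["kopia", "snapshot", "create", "/mnt/nvme/large"])` would time three snapshots of a hypothetical dataset path against an already-connected repository.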
Large File Experiments
As the results show, kopia’s performance has improved significantly over the last few releases. The time taken to back up 200GiB of data has been reduced from ~840 seconds to ~200! For just a single process, this translates to an effective processing bandwidth of 1 GiB/second and an upload bandwidth utilization of 3.5 Gbps. The majority of these performance gains can be attributed to a more efficient memory allocation algorithm, improved parallelization, and other data pipeline improvements.
Compared to the most recent kopia version, restic not only takes 7x longer, it also doesn’t reduce the backup size because it doesn’t support compression. This results in object storage utilization of ~200GiB with restic vs. 90GiB for kopia.
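The bandwidth figures follow from simple arithmetic on the rounded numbers in this post. The sketch below works through it; the helper names are hypothetical, and it assumes the uploaded volume equals the ~90GiB stored size reported for kopia in this post.

```python
GIB = 2**30  # bytes in one GiB


def gib_per_second(size_gib, seconds):
    """Processing rate in GiB/s for `size_gib` handled in `seconds`."""
    return size_gib / seconds


def gbps(size_gib, seconds):
    """Network rate in decimal gigabits per second."""
    return size_gib * GIB * 8 / seconds / 1e9


processing = gib_per_second(200, 200)  # 200GiB of source data in ~200 s
upload = gbps(90, 200)                 # ~90GiB actually sent to S3
speedup = 840 / 200                    # oldest vs. newest kopia runtime

print(f"processing: {processing:.1f} GiB/s")
print(f"upload:     {upload:.1f} Gbps")
print(f"speedup:    {speedup:.1f}x")
```

With these rounded inputs the upload rate comes out slightly under 4 Gbps, in the same range as the ~3.5 Gbps quoted; the difference is what one would expect from rounding the runtime and stored size.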
Small Files Experiments
The small files workload is a pathological one as we have a large number (2M) of files with very little data (1KiB each, 1.9 GiB total) that won’t benefit from either compression or deduplication. The amount of metadata, relative to data, will also be high here.
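As a quick check on the dataset size, 2M files of 1KiB each do come to roughly 1.9 GiB of payload. The variable names below are illustrative only.

```python
FILE_COUNT = 2_000_000   # "2M" small files
FILE_SIZE = 1024         # 1KiB of random data per file

total_bytes = FILE_COUNT * FILE_SIZE
total_gib = total_bytes / 2**30

print(f"{total_gib:.2f} GiB")  # prints "1.91 GiB"
```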
As the graphs show, kopia’s performance has improved significantly during the last development cycle, with a ~70% reduction in total run time. The changes responsible for this improvement include better memory management, reduced contention on key data structures, and aggressive parallelization. Restic, while faster than previous kopia releases, takes 2.3x longer than the upcoming kopia release for this benchmark.
Kopia’s backup footprint is the same across releases, and it is 20% smaller than restic’s. This can be attributed to better object packing. Kopia uses efficient on-disk index structures that allow quick writes and lookups while remaining very compact.
Get Started With Kopia and K10
Kopia today has a lot more to offer than what this blog post could even start to cover. We highly encourage you to download it and take it for a spin, join the Slack channel (don’t forget to say hi to Jarek, the lead developer behind the project), and become a part of the community. If you are looking for Kubernetes backup in particular, don’t forget to download and deploy our forever-free K10 starter edition today.