.tech Podcast - Backing up Kubernetes

Michael Cade from Veeam talks about the importance of data backups according to the 3-2-1 backup rule. Then, he explains the importance of backing up Kubernetes persistent volumes. Kanister is an open source tool that can help with protecting Kubernetes clusters, while the commercial product Kasten K10 provides a management and orchestration layer on top of Kanister.

Michael is an expert in data management and backups. Here are some of the key highlights of his discussion about the importance of backups and how to manage them in Kubernetes.

Data backups and their importance

There is never a one-button solution to data backups, as moving large amounts of data is always a challenge. If the data is important to you, then you need to back it up somewhere else.

After first starting out in virtualisation, Veeam now focus on the protection of data. They strongly advocate the 3-2-1 backup rule, regardless of which solution you choose. The 3-2-1 rule states that there should be 3 copies of your data, on 2 different media, with 1 copy offsite.

As a concrete example, let’s say a company is running a MySQL database on AWS.

  • The database running in one location is considered the first copy of the data. This data is vulnerable to corruption or deletion.
  • You can make Amazon EBS snapshots of the data at set intervals, for example every hour. These copies are durable and can be moved across different regions.
  • While you could push a copy of the data to another region, customers usually choose to move another copy of the data to a different cloud. Kubernetes is portable and allows us to spin up our application in a different cloud.
  • The three copies of your data are then: the production database; the cloud snapshot, preferably stored on both EBS and S3 to cover two media; and one offsite copy, either in a different region, with another cloud provider or even on premises.
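The rule itself is simple enough to express as a check. The following is a minimal sketch, not something from the podcast: the `Copy` class, the `satisfies_3_2_1` function and the media labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str      # human-readable label for this copy (hypothetical)
    medium: str    # storage medium or service holding the copy
    offsite: bool  # True if stored away from the production location


def satisfies_3_2_1(copies):
    """Return True if the copies meet the 3-2-1 rule."""
    enough_copies = len(copies) >= 3                     # 3 copies of the data
    enough_media = len({c.medium for c in copies}) >= 2  # on 2 different media
    has_offsite = any(c.offsite for c in copies)         # 1 copy offsite
    return enough_copies and enough_media and has_offsite


# Assumed layout, loosely following the MySQL on AWS example.
plan = [
    Copy("production MySQL database", medium="ebs", offsite=False),
    Copy("hourly snapshot copy", medium="s3", offsite=False),
    Copy("copy in a different cloud", medium="other-cloud-object-store", offsite=True),
]

print(satisfies_3_2_1(plan))
```

Dropping the offsite copy from `plan` makes the check fail, which is the situation the rule is meant to catch.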

In general, most companies aim to keep serving data in the case of a region failure, but you don’t have to aim for a recovery time objective (RTO) of 0. Region-wide outages are so rare that you might decide not to optimise for immediate failover when they happen.

These principles also apply to personal data. You don’t want to lose all your important files in the case of disk failure, so make sure to back up your data to the cloud!

Kubernetes

Much of the tech industry is moving to Kubernetes as its container platform.

Until recently, there was a long-running debate about whether Kubernetes was designed for stateful applications.

  • There is some fault tolerance built into Kubernetes with the pod lifecycle, but persistent storage still needs to be backed up.
  • Fully stateless products are very rare, so Kubernetes workloads typically need to be backed up.

In Kubernetes, we have StatefulSets, which allow us to create persistent volumes backed by a storage system. Pods connect to this storage, and the data remains as the containers come and go. The storage that your pods rely on now needs to be backed up.
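To make the relationship between a StatefulSet and its storage concrete, here is a minimal sketch that builds a StatefulSet manifest as a Python dictionary and prints it as JSON. The function name, the `mongodb` resource name, the `mongo:6.0` image and the `10Gi` size are assumptions made for illustration; only the manifest structure (`apps/v1`, `volumeClaimTemplates`, `volumeMounts`) comes from the Kubernetes API.

```python
import json


def mongodb_statefulset(name="mongodb", storage="10Gi"):
    """Build a single-replica StatefulSet with one persistent volume claim."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": "mongo:6.0",  # assumed image tag
                            # The container sees the persistent volume here.
                            "volumeMounts": [
                                {"name": "data", "mountPath": "/data/db"}
                            ],
                        }
                    ]
                },
            },
            # One claim per pod; the volume outlives container restarts.
            "volumeClaimTemplates": [
                {
                    "metadata": {"name": "data"},
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "resources": {"requests": {"storage": storage}},
                    },
                }
            ],
        },
    }


manifest = mongodb_statefulset()
print(json.dumps(manifest, indent=2))
```

The volume created from the `data` claim is the part that survives pod restarts, and it is therefore the part that a backup has to cover.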

One example could be a MongoDB database inside Kubernetes. One possible approach to backing up this database, along with its drawbacks, is:

  • We can create a script which copies the file system to an EBS snapshot. We can then write another script to restore this back to MongoDB.
  • However, this method is clunky if your application uses multiple types of database, for example a MySQL database alongside MongoDB. The problem of maintaining these scripts explodes as we add more databases.
  • Furthermore, these snapshots also create higher loads on the database that we are snapshotting.
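The maintenance problem can be seen in a small sketch. Note that it swaps in each database's own dump tool (`mongodump`, `mysqldump`) rather than the file system copy described above, and that the host names, output paths and function names are hypothetical. The code only builds the command lines; it does not run them.

```python
import shlex


def mongodb_backup_cmd(host, out_file):
    """Command line for a compressed MongoDB archive dump."""
    return ["mongodump", f"--host={host}", f"--archive={out_file}", "--gzip"]


def mysql_backup_cmd(host, out_file):
    """Command line for a dump of all MySQL databases."""
    return [
        "mysqldump",
        f"--host={host}",
        "--single-transaction",
        "--all-databases",
        f"--result-file={out_file}",
    ]


# Every new database type needs another entry here, plus a matching
# restore command that is not shown in this sketch.
BACKUP_COMMANDS = {
    "mongodb": mongodb_backup_cmd,
    "mysql": mysql_backup_cmd,
}


def backup_command(db_type, host, out_file):
    """Look up and build the backup command for one database type."""
    try:
        build = BACKUP_COMMANDS[db_type]
    except KeyError:
        raise ValueError(f"no backup script for database type {db_type!r}") from None
    return build(host, out_file)


for db_type, host in [
    ("mongodb", "mongo.example.internal"),
    ("mysql", "mysql.example.internal"),
]:
    print(shlex.join(backup_command(db_type, host, f"/backups/{db_type}.dump")))
```

Each database brings its own tool, flags and restore procedure, so the table of hand-written scripts grows with every data service the application adopts.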

Kanister & Kasten K10

Kanister is an open source project that was created by the engineers at Kasten to solve exactly the problem of complicated backup processes.

Kanister is deployed within the Kubernetes cluster that you’re protecting. It works on a simple basis: you choose a blueprint which is aligned to the data service that you’re using.

  • There are ready-made blueprints on GitHub for most common use cases.
  • A blueprint defines the commands used to create backups of your data. Kanister then uses the database’s native tools to create the backups in a consistent way.
  • You can also write your own blueprint.
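As a rough illustration of what a blueprint contains, the sketch below builds one as a Python dictionary and prints it as JSON. It is an assumption-laden sketch rather than an official example: the field layout (`cr.kanister.io/v1alpha1`, `actions`, `phases`, the `KubeExec` function and its template parameters) follows the examples in the Kanister repository as understood here and should be checked against the current Kanister documentation, while the blueprint name, phase name, container name and dump path are hypothetical.

```python
import json


def mongodb_blueprint(name="mongodb-blueprint"):
    """Build a Kanister-style blueprint with a single backup action."""
    # Native MongoDB tool; writes the dump inside the pod only.
    dump = "mongodump --archive=/tmp/backup.gz --gzip"
    return {
        "apiVersion": "cr.kanister.io/v1alpha1",
        "kind": "Blueprint",
        "metadata": {"name": name},
        "actions": {
            "backup": {
                "phases": [
                    {
                        # Runs a command in an existing pod of the workload.
                        "func": "KubeExec",
                        "name": "dumpDatabase",  # hypothetical phase name
                        "args": {
                            "namespace": "{{ .StatefulSet.Namespace }}",
                            "pod": "{{ index .StatefulSet.Pods 0 }}",
                            "container": "mongodb",  # assumed container name
                            "command": ["sh", "-c", dump],
                        },
                    }
                ]
            }
        },
    }


blueprint = mongodb_blueprint()
print(json.dumps(blueprint, indent=2))
```

A real blueprint would also send the dump to external storage and define a matching restore action; the ready-made blueprints on GitHub are the better starting point for that.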

The commercial product that builds on Kanister is Kasten K10, which is aimed at those who have outgrown Kanister. It provides snapshotting across clouds, as well as a management and orchestration layer on top of Kanister.

  • K10 provides 10 free nodes forever for those who want to experiment with it.

Kasten K10 covers three main areas: application-consistent backup using Kanister, disaster recovery/failover to another location, and application migration/data transformation.

Kasten also provide a hands-on learning platform at learning.kasten.io, where you can walk through the steps of setting up a Kubernetes cluster and experiment with K10.

by Adelina Simion, Technology Evangelist
