
The State of Stateful apps on Kubernetes

Stateless applications carry no state of their own that has to follow them when they move to a different node. They can scale out or scale in based only on inbound traffic. Nginx is a great example - each pod needs only read-only access to its config and data.

Stateless applications do not need to write data to local storage. The moment an application starts writing data to local storage, it becomes stateful, and it becomes difficult to move that application to another node. But one of the most important promises of the container world is that if a node goes down, applications can move to another node with no downtime.
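To make that concrete, here is a toy Python model of the problem. It is only a sketch: `Node`, `local_disk` and `data_available_on` are made-up names for illustration, not Kubernetes APIs, and real clusters have more options (persistent volumes, replication) than this model shows.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy stand-in for a cluster node with its own local disk (hypothetical)."""
    name: str
    local_disk: dict = field(default_factory=dict)


def data_available_on(node: Node, key: str) -> bool:
    """Return True if a pod scheduled on `node` can read `key` from local disk."""
    return key in node.local_disk


node_a = Node("node-a")
node_b = Node("node-b")

# A stateful pod writes to the disk of whichever node it happens to run on.
node_a.local_disk["orders.db"] = b"rows..."

# node-a goes down and the pod is rescheduled onto node-b.
# Nothing copied the local disk across, so the data is not there.
before_move = data_available_on(node_a, "orders.db")
after_move = data_available_on(node_b, "orders.db")
print(before_move, after_move)
```

A stateless pod never hits this, because it wrote nothing to `local_disk` in the first place.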

So stateful applications, by their very nature, have an inherent friction with platforms like Kubernetes. But this doesn't mean such applications can't or shouldn't be run on Kubernetes. It is still possible to get the benefits of the cloud - you just need to think deeply about the application and its needs.

Enter Kubernetes Operators

If Kubernetes were an OS, operators could be thought of as its apps (like Android or iOS apps). Operators get direct access to the Kubernetes API, which lets the developer do exactly what is needed to make sure the underlying application runs and scales smoothly.

Operators encode an application's deployment and scaling logic as another piece of software. So everything a human operator would have to do to deploy and run the software is now available in a software operator.
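The idea of encoding a human operator's runbook in software can be sketched as a small reconcile function: compare what the user asked for with what is actually running, and work out the steps to close the gap. The Python below is a simplified illustration, not how any particular operator is written. `ClusterSpec`, `ClusterStatus` and the action strings are hypothetical, and a real operator would read and write these through the Kubernetes API rather than return strings.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClusterSpec:
    """Desired state, as a user might declare it (hypothetical fields)."""
    replicas: int
    version: str


@dataclass(frozen=True)
class ClusterStatus:
    """Observed state, as an operator might read it from the cluster."""
    replicas: int
    version: str


def reconcile(spec: ClusterSpec, status: ClusterStatus) -> List[str]:
    """Return the actions needed to move observed state toward desired state."""
    actions = []
    if status.version != spec.version:
        # Upgrade first so any replicas added later start on the target version.
        actions.append(f"rolling-upgrade {status.version} -> {spec.version}")
    if status.replicas < spec.replicas:
        actions.append(f"add {spec.replicas - status.replicas} replica(s)")
    elif status.replicas > spec.replicas:
        # For stateful software, scale-in usually means moving data off first;
        # here that whole procedure is reduced to a single label.
        actions.append(f"decommission {status.replicas - spec.replicas} replica(s)")
    return actions


plan = reconcile(ClusterSpec(replicas=5, version="2.0"),
                 ClusterStatus(replicas=3, version="1.9"))
no_op = reconcile(ClusterSpec(replicas=3, version="1.9"),
                  ClusterStatus(replicas=3, version="1.9"))
print(plan)
print(no_op)
```

Running the function repeatedly until it returns an empty list is the essence of the loop: once observed state matches desired state, there is nothing left to do.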

Operators essentially make it worthwhile to deploy and run stateful applications on Kubernetes at scale. You can see this in action with almost all the major software product vendors - they have developed k8s operators for their software products.

The common pattern we're seeing today is that data-related software vendors have gone all in on the operator pattern. However, end-user software teams, IT teams and others are still apprehensive.

The purpose of this post is not to show you how to write an operator. There are several great posts on that topic.

💡 My goal here is to convince apprehensive teams and stakeholders that operators are here to stay. They really do take away operational overhead, and they allow applications to scale better.

Let's take a look at some of the popular operators from software vendors.

CockroachDB Operator

CockroachDB is one of the most used distributed SQL databases out there. Developers love the platform, and thousands of developers use the product every day. As of its last funding round, Cockroach Labs, the company behind CockroachDB, was valued at 5B USD.

Cockroach Labs went all in on a Kubernetes operator for its Kubernetes deployments. The operator repo is very active and has quite a list of new features and issues. Note that among the various stateful applications, SQL databases are the trickiest to develop and run at scale.

GitHub - cockroachdb/cockroach-operator: k8s operator for CRDB

MinIO Operator

Another interesting class of stateful applications is object storage. MinIO is the leader in the FOSS object storage category, and the MinIO Operator has been around for several years now.

The MinIO Operator allows seamless deployment and scaling of a MinIO cluster with minimal effort and configuration. The operator repo is pretty active and has a high number of stargazers as well.

Disclosure: I am one of the authors of MinIO Operator.

GitHub - minio/operator: MinIO Operator creates/configures/manages MinIO clusters on Kubernetes

Druid Operator

OLAP applications are another class of stateful applications that need local persistence and a high level of support. Apache Druid ingests high volumes of data and lets users query that data with subsecond latency.

Druid Operator helps users deploy various Druid components and scale them as needed.

GitHub - druid-io/druid-operator: Druid Kubernetes Operator

Summary

I have been in the Kubernetes and cloud space for a while. The pattern that stands out is that Kubernetes adoption is real and growing - end-user application and IT teams are leading this and moving production workloads to Kubernetes swiftly.

However, these teams still sometimes struggle when it comes to deploying and managing stateful applications on their clusters, while software vendors have swiftly gone all in on the k8s operator pattern.

The purpose of this post was to possibly nudge stakeholders towards the operator pattern via some of the most popular Kubernetes operators. Hope the message gets through and we see a wider adoption of operators - and teams see their power IRL!