Background

Years before the 1.0 releases of Docker and Kubernetes, the 12 factor manifesto proposed patterns for architecting a modern cloud-hosted application. The 12 factors advised composing an application from stateless processes, relegating stateful components to a category managed outside the cloud orchestration platform. The 12 factor guideline called such a component a stateful backing service.

The 12 factor ban on stateful processes was largely an artifact of technology limitations in the pre-Docker, pre-container-orchestrator era.

Kubernetes has been making steady progress on the mission to deliver support for hosting both stateless and stateful containerized services and service components. Single Pod stateful apps using persistent volume mounts have been in the stable feature category for some time. Support for stateful apps in a multi-Pod context is under active development.

The early versions of Kubernetes started out with the ReplicaSet. ReplicaSets are designed around a weak guarantee – that N replicas of a particular Pod template should be running at any given time. ReplicaSets are best leveraged for stateless components like web servers, proxies, and application code that handles data but doesn't store it.
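As a minimal sketch of that weak guarantee, the ReplicaSet below asks only that three interchangeable replicas exist; the names, image, and API version are illustrative (in the 1.7 timeframe ReplicaSets lived under a beta API group):

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web
spec:
  replicas: 3              # converge toward N replicas; no per-Pod identity
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.13
```

Any replica can be killed and replaced by an identical one, which is exactly the property that makes ReplicaSets a poor fit for stateful components.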

Later, the DaemonSet was added, which allows leveraging Nodes as stable, long-lived entities. DaemonSets are suitable for various stateless use cases, but can also be a building block for some types of stateful workloads. A DaemonSet is like a "global" policy that makes sure all Nodes, or some labeled subset of them, are running an instance of a Pod. Suppose users want to implement a sharded datastore in their cluster: a few Nodes, labeled 'app=datastore', might be responsible for storing data shards, and the Pods running on those Nodes might serve the data. This architecture requires a way to bind Pods to specific designated Nodes, which is not easily achieved using a ReplicaSet. The DaemonSet can be a building block for certain types of software-based storage.
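The sharded-datastore pattern described above can be sketched roughly as follows; the image name and mount paths are hypothetical, and the nodeSelector is what binds Pods to the labeled Nodes:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: datastore
spec:
  selector:
    matchLabels:
      app: datastore
  template:
    metadata:
      labels:
        app: datastore
    spec:
      nodeSelector:
        app: datastore          # only Nodes labeled app=datastore run this Pod
      containers:
      - name: shard
        image: example/datastore-shard:1.0   # hypothetical image
        volumeMounts:
        - name: data
          mountPath: /var/lib/datastore
      volumes:
      - name: data
        hostPath:
          path: /var/lib/datastore           # Node-local data directory
```

Because the Pod follows the Node, data written to the hostPath survives Pod restarts on that Node – but notice the Pods still have no individual identity, which motivates the next controller.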

If a stateful app requires that each clustered component instance retain a unique, per-instance identity, the DaemonSet is not equipped to deliver all the support required. ZooKeeper and a sharded PostgreSQL database are examples that fall into this category.

The StatefulSet followed, bringing support for identity management and for graceful deployment and scaling. StatefulSet is designated as a beta feature at this point.
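A minimal StatefulSet sketch, using ZooKeeper as the example from above (names, image, and sizes are illustrative; in the 1.7 timeframe the object lived under a beta API group). The headless Service gives each replica a stable DNS name, and the volumeClaimTemplates give each replica its own storage:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: zk
spec:
  clusterIP: None            # headless: each Pod gets a stable DNS identity
  selector:
    app: zk
  ports:
  - port: 2181
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zk
spec:
  serviceName: zk
  replicas: 3                # Pods are named zk-0, zk-1, zk-2, created in order
  selector:
    matchLabels:
      app: zk
  template:
    metadata:
      labels:
        app: zk
    spec:
      containers:
      - name: zk
        image: zookeeper:3.4
        volumeMounts:
        - name: data
          mountPath: /var/lib/zookeeper
  volumeClaimTemplates:      # one PersistentVolumeClaim per replica
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
```

Unlike a ReplicaSet, a replaced Pod comes back with the same name and the same volume, which is what clustered databases and coordination services require.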

Enhancements in the 1.7 Release

Kubernetes has maintained a steady cadence of improvements in support for stateful workloads. The 1.7 release delivers these enhancements for stateful services:

StatefulSets

StatefulSet Updates is a new feature in 1.7 that allows automated updates of stateful applications such as Kafka, ZooKeeper, and etcd. The feature can be used to upgrade the container images, resource requests and/or limits, labels, and annotations of the Pods. The update strategy is configurable, enabling rolling updates, canary deployments, and similar patterns.
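As a hedged sketch of the canary pattern, the update strategy lives in the StatefulSet spec; the partition value below is illustrative:

```yaml
# Fragment of a StatefulSet spec. With a partition set, only Pods whose
# ordinal is >= the partition are moved to the new revision; the rest
# stay on the old one, giving a canary. Lowering the partition to 0
# completes the rollout.
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2
```

Triggering the update is then just a matter of editing the Pod template (for example, the container image) and letting the controller roll the qualifying Pods in reverse-ordinal order.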

If a StatefulSet does not require ordered Pod creation, the Pod Management Policy can be used to achieve faster, parallel scaling and startup. This can be a major performance improvement.
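The setting is a single field on the StatefulSet spec; a minimal sketch:

```yaml
# Fragment of a StatefulSet spec. The default policy, OrderedReady,
# creates and deletes Pods one at a time in ordinal order; Parallel
# launches and terminates them all at once.
spec:
  podManagementPolicy: Parallel
```

This trades the ordering guarantee for speed, so it suits applications whose members can bootstrap independently.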

DaemonSets

DaemonSets already had automated update support; 1.7 adds rollback and revision history capabilities.
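A sketch of the relevant DaemonSet fields (values are illustrative):

```yaml
# Fragment of a DaemonSet spec.
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1      # replace Pods one Node at a time
  revisionHistoryLimit: 10   # revisions retained for rollback
```

Past revisions can then be inspected with kubectl rollout history daemonset/&lt;name&gt;, and a bad rollout reverted with kubectl rollout undo.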

Local Storage

Local Storage has been one of the most frequently requested features for stateful applications. Users can now access local storage volumes through the standard PVC/PV interface and via StorageClasses in StatefulSets.

There are caveats to take into consideration – Local Storage is designated an alpha feature, subject to change, and not recommended for production use.

Today, Kubernetes schedules Pods based solely on CPU and memory availability, but to be successful with local storage it must also schedule based on the storage a host provides. As of 1.7, storage is not considered a resource for scheduling, so an external provisioner cannot determine capacity requirements. This release lays the foundational work needed to support those future use cases.
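For a feel of the PVC/PV interface mentioned above, here is a hedged sketch of a local PersistentVolume. The alpha API in 1.7 expressed the Node binding differently (via an annotation); the form shown is the one that later stabilized, and all names, paths, and sizes are assumptions:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-local-pv     # illustrative name
spec:
  capacity:
    storage: 100Gi
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1    # a disk or mount point on the Node
  nodeAffinity:              # pins the volume to the Node that owns the disk
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values: ["node-1"]
```

A Pod claiming this volume through a PVC can then only be scheduled onto node-1, which is precisely why local storage and scheduling must eventually be solved together.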

Local Storage is likely to prove useful for on-premises deployments using converged, software-based storage providers. It may also be useful for data-driven apps that want to consume on-Node block storage and local raw block devices instead of just file systems.

Check out these slides from Michelle Au for more details on the new local storage feature.

Lastly, a new volume plugin for StorageOS was added.

The {code} Labs will feature the new 1.7 binaries in a short while, but in the meantime you can easily create a sandbox Kubernetes environment using the ScaleIO environment to test out StatefulSets in their beta state as well as DaemonSets.