Tim Toady, or TMTOWTDI, is better known as “There’s more than one way to do it.” It’s common among programming languages and in everyday life. However, when there are multiple ways to skin a cat in a new container technology, it creates confusion, and deeper insight is needed. Kubernetes has multiple ways of using PersistentVolumes, from single Pods to ReplicaSets and more, but two overarching objects stand out as the preferred methods for running applications. Persistent applications gain tremendous benefit from orchestration, which can equip them with out-of-the-box high availability (HA) or make an existing HA application fully automated.

The Difference on Paper

The StatefulSet acts as a controller in Kubernetes to deploy applications according to a specified rule set and is tailored toward persistent and stateful applications. However, a Deployment can accomplish much the same thing. So what’s stopping you, other than the Kubernetes documentation explicitly calling out: “If an application doesn’t require any stable identifiers or ordered deployment, deletion, or scaling, you should deploy your application with a controller that provides a set of stateless replicas. Controllers such as Deployment or Replica Set may be better suited to your stateless needs.”? Let’s examine.


| Deployment | StatefulSet |
| --- | --- |
| Will roll out a ReplicaSet and can scale to facilitate more load. | Ordered, graceful deployment and scaling. |
| Rolling updates are done by updating the PodTemplateSpec of the Deployment. A new ReplicaSet is created and the Deployment manages moving the Pods from the old ReplicaSet to the new one at a controlled rate. | Ordered, automated rolling updates. |
| Rollback to an earlier Deployment is supported. | No rollback; only deletion and scaling down. |
| The networking Service is separate. | StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods. |
| The documentation doesn’t say it’s in beta, but the API requires the v1beta1 tag. | Currently in beta. |

From the table above, the biggest differentiators between the two appear to be the requirement of a networking Service and the documented examples showing persistent storage with StatefulSets. However, there is more to it than meets the eye.

StatefulSets in Action

A requirement of a StatefulSet is satisfying the network identity of its Pods. To do this, a standard Kubernetes Service must be deployed, using a selector on the app label to create the logical mapping to the StatefulSet’s Pods. For the examples in this post, Postgres is the application used.

The following YAML defines a Kubernetes Service named pgnet. The Service exposes port 5432, suitably named pgport. The clusterIP is set to None, which makes this a Headless Service: no cluster IP is allocated, and the Postgres Service is only reachable from within the cluster. View the Publishing Services documentation to read how a Service can be exposed outside of Kubernetes onto an external network. The most crucial part is the selector, which must match the app label of the Pods deployed by the StatefulSet.


apiVersion: v1
kind: Service
metadata:
  name: pgnet
  labels:
    app: pgnet
spec:
  ports:
  - port: 5432
    name: pgport
  clusterIP: None
  selector:
    app: postgres


The StatefulSet syntax is very similar to that of a Deployment. The logical tie is the serviceName, which must match the name of the Service created earlier. The parameters necessary for the containers (ports, environment variables, volumeMounts, etc.) are identical.

The unique differences between a StatefulSet and a Deployment show up in a few places. The name plays a role in how kubectl displays the Pods, which will be seen later. The volume portion is similar to a Deployment in that there is a logical mapping from the container’s volumeMounts name to the name specified in volumeClaimTemplates, but that’s where the similarity ends. A StatefulSet is required to use dynamic provisioning of volumes based on a predefined StorageClass, whereas a Deployment can use either a StorageClass or a pre-defined PersistentVolume. Look at the StorageClass for this example at sc.yaml.
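For reference, a StorageClass backing the sio-small name used below could look like this sketch. It assumes the in-tree ScaleIO provisioner (kubernetes.io/scaleio); the gateway URL, system name, protection domain, storage pool, and secret name are placeholder values rather than the contents of the actual sc.yaml.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: sio-small
provisioner: kubernetes.io/scaleio
parameters:
  # Placeholder connection details for the ScaleIO gateway
  gateway: https://192.168.1.100:443/api
  system: scaleio
  protectionDomain: pd01
  storagePool: sp01
  storageMode: ThinProvisioned
  # Name of a Secret holding the ScaleIO credentials (placeholder)
  secretRef: sio-secret
  fsType: xfs

With a StorageClass in place, the StatefulSet itself is defined as follows.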


apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: pgdatabase
spec:
  serviceName: "pgnet"
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres01
        image: postgres
        ports:
        - containerPort: 5432
          name: pgport
        env:
        - name: POSTGRES_PASSWORD
          value: "Password123!"
        volumeMounts:
        - name: pgvolume
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: pgvolume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: sio-small
      resources:
        requests:
          storage: 16Gi


As mentioned earlier, once a StatefulSet is deployed, its name plays a role in how the Pods are displayed. The Pods in this example take on the name pgdatabase with a sequential ordinal suffix based on the number of replicas (pgdatabase-0, pgdatabase-1, and so on), whereas each Pod in a Deployment gets a unique identifier made of random characters. The other notable difference appears when a Pod is deleted. With a Deployment, a new Pod is created and a new unique identifier is attached to the name. In a StatefulSet, the Pod is recreated with the same name and properties to maintain a consistent identity.
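The stable identity extends to storage as well. For each Pod, the StatefulSet controller creates a PersistentVolumeClaim named after the volumeClaimTemplates entry plus the Pod name, so the claim generated for pgdatabase-0 is equivalent to the following sketch (system-managed fields omitted):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Generated as <volumeClaimTemplate name>-<pod name>
  name: pgvolume-pgdatabase-0
spec:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: sio-small
  resources:
    requests:
      storage: 16Gi

Because the claim name is deterministic, a recreated pgdatabase-0 reattaches to the same volume rather than provisioning a new one.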

Another interesting difference is where a Pod ends up after it’s recreated. In a Deployment, the Pod will be rescheduled on a different node in the Kubernetes cluster. With a StatefulSet, it’s recreated on the exact node it was running on previously. This is the default behavior of a StatefulSet because there shouldn’t be a reason to forcefully delete a Pod from a StatefulSet: Kubernetes assumes that applications running in a StatefulSet require stable network identity and stable storage, and that multiple instances of the same Pod could lead to data corruption. If a node becomes unreachable, because it’s down for scheduled maintenance or has become partitioned from the network, its Pods go into an “Unknown” or “Terminating” state. Such a Pod will not be rescheduled to a different node in the cluster unless one of three things happens: the Node object is deleted (manually or through the Node Controller), the kubelet on the unresponsive node starts responding again and kills the Pod, removing its entry from the apiserver, or the Pod is forcefully deleted. Forceful deletion is risky because the volume may not be properly unmounted, and the rescheduled Pod can fail as a result.

Why is one preferred over the other?

After reading about the high-availability behavior of a StatefulSet, why would you want it as the recommended method for persistent applications when a Deployment seems to do the job much better? Kubernetes doesn’t want a Pod immediately restarted on another node because that invites a possible split-brain scenario. However, if the storage provider is capable, pod.Spec.TerminationGracePeriodSeconds can be added to the StatefulSet, and the controller will automatically restart the Pod on a new host after it has gone into an “Unknown” state. The volumeClaimTemplates for persistent volumes ensure that state is kept across those restarts. It’s also fair to assume this cautious default exists because Kubernetes now supports local storage: if storage isn’t centralized or replicated between nodes, there is no way to safely restore your data unless the node comes online again.
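As a sketch of where that field lives, here is the earlier StatefulSet with a termination grace period added to the Pod template; the 10-second value is an arbitrary example, not a recommendation:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: pgdatabase
spec:
  serviceName: "pgnet"
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
    spec:
      # Time allowed for a graceful shutdown before the Pod's
      # containers are killed (example value)
      terminationGracePeriodSeconds: 10
      containers:
      - name: postgres01
        image: postgres
        ports:
        - containerPort: 5432
          name: pgport
        env:
        - name: POSTGRES_PASSWORD
          value: "Password123!"
        volumeMounts:
        - name: pgvolume
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: pgvolume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: sio-small
      resources:
        requests:
          storage: 16Gi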

Watch the video below to see Deployments and StatefulSets in action using ScaleIO. You can spin up your own Kubernetes environment to test out persistent applications in the {code} Labs. If you want to read more about StatefulSets, check out Stateful containerized applications with Kubernetes by Josh Berkus.