You’re ready to take Tier 1 apps into production on Kubernetes, but storage is holding you back. Here’s how to configure Datera to optimize for production and scale.
You’ve containerized apps in Kubernetes and are ready to move to production, but provisioning storage dynamically has been a roadblock. This is where your choice of underlying storage fabric can make or break your ability to scale.
Your infrastructure is a wide assortment of server and storage classes, generations, and media—each one with a different personality. With the right storage fabric, neither the app developer nor the storage admin needs to be prescriptive about which resources are the best fit—that should be determined in human language terms (e.g. super performant, highly secure) and based on Storage Classes defined in Kubernetes.
Defining Storage Classes
Optimizing storage in Kubernetes means managing a class of storage against application intent. Just as a Terraform script defines application needs, the storage platform should meet storage needs through templates, whether you are looking for IOPS, latency, scale, efficiency (compression and dedupe), or security (encryption).
Dynamic storage provisioning in Kubernetes is based on Storage Classes. A persistent volume belongs to the Storage Class named in its definition file. A claim can request a particular class by specifying the name of a Storage Class in its definition file. Only volumes of the requested class can be bound to that claim.
Multiple Storage Classes can be defined to match the diverse requirements on storage resources across multiple applications. This allows the cluster administrator to define multiple types of storage within a cluster, each with a custom set of parameters.
Your storage fabric should then have an intelligent optimizer that analyzes the user request, matches to resources, and places them appropriately onto a server that matches the personality of the data. It should also be policy-driven and use telemetry to continuously inventory and optimize for application and microservice needs without human involvement.
Let’s say you want to create a Storage Class to access your MySQL data, add extra protection by making 3 replicas, and place it in a higher performance tier to serve reads/writes better. You can set this up in Datera with just a few steps.
Create a Storage Class for the Datera storage backend, as in the following example datera-storage-class.yaml file:
$ cat datera-storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  labels:
  name: datera-storage-class
  namespace:
provisioner: io.datera.csi.dsp
reclaimPolicy: Delete
parameters:
  template: "basic"
  secretName: "mysecret"
  secretNamespace: "default"
  fsType: "xfs"
$ kubectl create -f datera-storage-class.yaml
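Once the class exists, a claim can request it by name. The following is a minimal sketch of such a claim, not part of the original walkthrough: the claim name mysql-data and the 10Gi size are illustrative assumptions, and only the storageClassName value comes from the example above.

```yaml
# Hypothetical PersistentVolumeClaim; name and size are assumptions for illustration.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data                          # assumed name
spec:
  storageClassName: datera-storage-class    # must match metadata.name of the StorageClass
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi                         # assumed size
```

Because the claim names a class rather than a specific volume, the provisioner behind that class is expected to create and bind a matching volume on demand.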
Datera allows the Kubernetes cluster administrator to configure a wide range of parameters in the Storage Class definition. Another way to define a Storage Class is with labels (policies), e.g. gold, silver, bronze. These options convey the storage personality and define the necessary policies at the Datera storage system level.
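As a rough sketch of the label-style approach, a "gold" class might simply point at a different template. The class name datera-gold and the template name "gold" are hypothetical; the sketch assumes a template of that name has already been defined on the Datera system to carry the desired policies (for example, three replicas on a higher performance tier). All other fields are copied from the example above.

```yaml
# Hypothetical "gold" StorageClass; only the class name and template differ
# from the datera-storage-class example.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: datera-gold            # assumed class name
provisioner: io.datera.csi.dsp
reclaimPolicy: Delete
parameters:
  template: "gold"             # assumed template name, must exist on the storage system
  secretName: "mysecret"
  secretNamespace: "default"
  fsType: "xfs"
```

Keeping the policy details in the template means the Storage Class definition itself stays short, and silver or bronze variants would differ only in the name and template fields.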
It’s very easy to compose three or 300 Storage Classes on the fly in Datera. To manage data services such as dedupe, compression and encryption for high velocity apps and microservices, you can attach volumes to a storage pool that can be further configured alongside policies for data protection and efficiency. Where typically this is done in a silo, Datera can achieve this level of velocity and scale for Tier 1 workloads.
If you have no idea what the app requirements will be, that’s OK: Datera uses AI/ML to find the best placement and resources, and automatically adjusts based on inputs. For mature apps, you can graduate those to policies and templates.
No Scale Without Automation
Kubernetes lets applications scale along with a Storage Class without deviating from the base definition and resource requirements. Datera keeps that promise intact by binding Storage Classes to templates and homogeneously extending resources (volumes) to match the intent of applications (consumers).
As application needs change, the storage should adapt with them and not rely on human intervention. This is done by defining policies and templates. The storage fabric should also recognize new nodes or hardware, automatically adjust to enhance the performance, capacity, and resilience of the cluster, and make the new resources available to all the underlying PVs.
Create the StatefulSet:
$ kubectl create -f consul-sts.yaml
List the pods:
$ kubectl get pods -o wide
NAME       READY   STATUS            RESTARTS   AGE   IP            NODE
consul-0   1/1     Running           0          32s   10.38.3.152   kubew03
consul-1   0/1     PodInitializing   0          15s   10.38.4.4     kubew04
The first pod has been created. The second pod is created only after the first one is up and running, and so on. StatefulSets behave this way because some stateful applications can fail if two or more cluster members come up at the same time. For a StatefulSet with N replicas, pods are created sequentially, in order from {0…N-1}, each with a sticky, unique identity in the form $(statefulset name)-$(ordinal). The (i)th pod is not created until the (i-1)th is running. This ensures a predictable order of pod creation. However, if the order of pod creation is not strictly required, pods can be created in parallel by setting podManagementPolicy: Parallel in the StatefulSet spec.
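The consul-sts.yaml file itself is not reproduced in this post. The following is a minimal sketch of what such a StatefulSet could look like, not the actual file: the volume claim template name (data), size (1Gi), access mode, and Storage Class are inferred from the PVC listing later in the post, while the Service name, labels, image tag, and mount path are assumptions. The Consul server arguments and the headless Service are omitted for brevity.

```yaml
# Hypothetical sketch of a Consul StatefulSet backed by the Datera Storage Class.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: consul
spec:
  serviceName: consul               # assumed headless Service name
  replicas: 3
  # podManagementPolicy: Parallel   # uncomment only if ordered startup is not required
  selector:
    matchLabels:
      app: consul                   # assumed label
  template:
    metadata:
      labels:
        app: consul
    spec:
      containers:
        - name: consul
          image: consul:1.0.2       # assumed image tag, chosen to match the build shown by consul members
          volumeMounts:
            - name: data
              mountPath: /consul/data   # assumed data directory
  volumeClaimTemplates:
    - metadata:
        name: data                  # yields PVCs named data-consul-0, data-consul-1, ...
      spec:
        storageClassName: datera-storage-class
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
```

The volumeClaimTemplates section is what ties each pod to its own dynamically provisioned volume: every replica gets a separate claim against the named Storage Class.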
List the pods again to see how the pod creation is progressing:
$ kubectl get pods -o wide
NAME       READY   STATUS    RESTARTS   AGE   IP            NODE
consul-0   1/1     Running   0          29m   10.38.3.152   kubew03
consul-1   1/1     Running   0          28m   10.38.4.4     kubew04
consul-2   1/1     Running   0          27m   10.38.5.128   kubew05
Now all pods are running and forming the initial cluster.
Scaling the Cluster
Scaling down a StatefulSet and then scaling it up is similar to deleting a pod and waiting for the StatefulSet to recreate it. Remember that scaling down a StatefulSet only deletes the pods and leaves the Persistent Volume Claims in place. Scaling also follows the same ordering rules as initial creation. When scaling down, the pod with the highest index is deleted first; only after that pod is gone is the pod with the second-highest index deleted, and so on.
What is the expected behavior when scaling up the Consul cluster? Consul is based on the Raft algorithm, which needs a majority (quorum) of servers to agree. An even number of servers adds no fault tolerance over the next lower odd number: four servers tolerate one failure, the same as three, while five tolerate two. For that reason we scale our 3-node cluster up by two nodes, to five, rather than by one. We also expect a new Persistent Volume Claim to be created for each new pod.
Scale the StatefulSet:
$ kubectl scale sts consul --replicas=5
By listing the pods, we see our Consul cluster gets scaled up:
$ kubectl get pods -o wide
NAME       READY   STATUS    RESTARTS   AGE   IP            NODE
consul-0   1/1     Running   0          5m    10.38.3.160   kubew03
consul-1   1/1     Running   0          5m    10.38.4.10    kubew04
consul-2   1/1     Running   0          4m    10.38.5.132   kubew05
consul-3   1/1     Running   0          1m    10.38.3.161   kubew03
consul-4   1/1     Running   0          1m    10.38.4.11    kubew04
Check the membership of the scaled cluster:
$ kubectl exec -it consul-0 -- consul members
NAME       ADDRESS            STATUS   TYPE     BUILD   PROTOCOL   DC           SEGMENT
consul-0   10.38.3.160:8301   alive    server   1.0.2   2          kubernetes
consul-1   10.38.4.10:8301    alive    server   1.0.2   2          kubernetes
consul-2   10.38.5.132:8301   alive    server   1.0.2   2          kubernetes
consul-3   10.38.3.161:8301   alive    server   1.0.2   2          kubernetes
consul-4   10.38.4.11:8301    alive    server   1.0.2   2          kubernetes
Also check that the dynamic Datera storage provisioner created the additional volumes:
$ kubectl get pvc -o wide
NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS           AGE
data-consul-0   Bound    pvc-91df35af-0123-11e8-86d2-000c29f8a512   1Gi        RWO            datera-storage-class   7m
data-consul-1   Bound    pvc-a010257c-0123-11e8-86d2-000c29f8a512   1Gi        RWO            datera-storage-class   6m
data-consul-2   Bound    pvc-adaa4d2d-0123-11e8-86d2-000c29f8a512   1Gi        RWO            datera-storage-class   6m
data-consul-3   Bound    pvc-1b1b9bd6-0124-11e8-86d2-000c29f8a512   1Gi        RWO            datera-storage-class   3m
data-consul-4   Bound    pvc-28feff1c-0124-11e8-86d2-000c29f8a512   1Gi        RWO            datera-storage-class   2m
Today, companies may be deploying apps by Storage Class. But according to IDC, 90% of new applications will be built with a microservices architecture this year. Data velocity must match microservices with the right Storage Class, and the storage environment must gain performance and capabilities continually and granularly: not through legacy arrays or monolithic, siloed migrations, but through discrete racks. In other words, autonomous storage is a necessity for any business that needs to scale.