Hint: Container Storage Interface (CSI) Plug-In doesn’t mean Container Storage-Optimized.
Straight out of DevOps land comes this missive: 90% of the Fortune 1000 are using containers in production, restructuring their architectures to meet the challenges of the next decade, and migrating their applications from virtual and traditional infrastructure to container solutions and Kubernetes (known industry-wide as K8s). It's going to be an interesting ride, fraught with misinformation, as systems vendors slap a "K8s Ready" label on their preexisting products up and down the stack. While the introduction of K8s may not be too challenging at the compute layer, it adds new complexity to the networking and storage layers, which demands a new level of scrutiny of how containers are supported.
To help separate the signal from the noise, we’ve compiled ten key principles for evaluating on-premises, persistent storage platforms to support cloud native applications as you and your organization head down the inevitable path toward a container-centric future.
The Benefits and Challenges of Containers and Kubernetes
We hear a lot about containers and K8s today in conversations with our customers and partners, who want to achieve across the enterprise the automation, composability, velocity, scalability and efficiency benefits they've seen in initial deployments.
Given these potential benefits, it's obvious why large enterprises, laden with hundreds of applications, are moving aggressively to containers. But selecting the right storage, often the last layer of the stack to move, is essential to realizing those benefits, because hyperscale cloud native applications require persistent storage platforms with distinct characteristics.
As you embark on your journey, you will find systems providers touting their Container Storage Interface (CSI) plug-ins, which mark the most basic form of interoperability. But the storage layer needs more than interoperability; it should match the dynamism of new applications built on containers and Kubernetes. Here we offer a framework for evaluating storage for cloud native applications that goes beyond the buzzwords, designed to help you get the right storage capabilities for container and cloud native success.
10 Principles for Evaluating Enterprise Cloud Native Storage
1. Application Centric
The storage should be presented to and consumed by the application, not by hypervisors, operating systems or other proxies that complicate and compromise system operation. Your storage system is likely based on older abstractions such as LUNs, volumes or pools. These were workable for monolithic applications such as Oracle and SQL databases running on physical servers, but modern storage systems use an "appinstance" or "storage instance" construct that lets you apply templates programmatically, enabling your container workloads to rapidly consume and release resources as needs ebb and flow. For example, your DevOps team may spin up 100 Kubernetes or Docker containers a day, each requiring a stateful, persistent appinstance for just that day and releasing it after an hour or two. This ensures your application gets what it needs, only when it needs it.
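To make the pattern concrete, here is a minimal sketch of an application claiming and releasing persistent storage on its own behalf, using the official Kubernetes Python client. The claim name, namespace and storage class name below are hypothetical placeholders, not prescriptions:

```python
# A minimal sketch of the "appinstance" pattern with the Kubernetes
# Python client (pip install kubernetes). Names are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() in a pod
core = client.CoreV1Api()

def provision_claim(name: str, namespace: str = "default") -> None:
    """Request a stateful, persistent volume for a short-lived workload."""
    pvc = client.V1PersistentVolumeClaim(
        metadata=client.V1ObjectMeta(name=name),
        spec=client.V1PersistentVolumeClaimSpec(
            access_modes=["ReadWriteOnce"],
            storage_class_name="app-tier-gold",  # hypothetical class
            resources=client.V1ResourceRequirements(
                requests={"storage": "10Gi"}
            ),
        ),
    )
    core.create_namespaced_persistent_volume_claim(namespace, pvc)

def release_claim(name: str, namespace: str = "default") -> None:
    """Release the storage once the container workload winds down."""
    core.delete_namespaced_persistent_volume_claim(name, namespace)
```

The point is that the application, or its pipeline, drives the storage lifecycle directly: no ticket to a storage administrator, no proxy in the middle.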
2. Platform Agnostic
The storage should be able to run anywhere, in any location, with non-disruptive updates and expansions. While legacy arrays can deliver blinding speed for monolithic applications, tuning them introduces many restrictions, compromises your usage models and often requires a fleet of highly specialized administrators. Modern, cloud native workloads require a composable platform that can run in racks, aisles and multiple datacenters as your needs grow, without rebuilds and migrations to slow you and your users down. More importantly, in multi-node, scale-out systems, all upgrades must be rolling, non-disruptive and minimally impactful to performance. Look for systems that use standard iSCSI and Ethernet for maximum flexibility as your needs grow to include multiple datacenters, stretch clusters, and other disaster recovery implementations.
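As a rough illustration of what "rolling, non-disruptive" should mean in practice, here is a hedged sketch of the upgrade discipline to look for. The StorageCluster handle and its methods are hypothetical stand-ins for whatever management API your vendor exposes:

```python
# Hedged sketch of a rolling, one-node-at-a-time upgrade loop.
# StorageCluster and its methods are hypothetical placeholders.
import time

class StorageCluster:
    """Hypothetical management handle for a multi-node, scale-out cluster."""
    def nodes(self): ...
    def drain(self, node): ...            # migrate replicas off the node
    def upgrade(self, node, version): ...
    def rejoin(self, node): ...
    def healthy(self) -> bool: ...

def rolling_upgrade(cluster: StorageCluster, version: str) -> None:
    # Upgrade one node at a time so the cluster stays online throughout.
    for node in cluster.nodes():
        cluster.drain(node)               # shift I/O to surviving nodes
        cluster.upgrade(node, version)
        cluster.rejoin(node)
        while not cluster.healthy():      # wait for the rebalance to
            time.sleep(5)                 # settle before the next node
```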
3. Declarative and Composable
Storage resources should be declared and composed as required by the applications and services themselves, matching the compute and network layers. Policy-driven systems let you change the underlying resources seamlessly from the container's perspective. For example, set a policy that includes dedupe, performance and encryption; then, as you add or remove nodes, the system should autonomously move and re-balance workloads across the heterogeneous nodes that comprise the cluster. The system should automatically inventory its resources and make split-second decisions about the most efficient way to run your containers.
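A storage class is the natural place where such a policy surfaces to Kubernetes. Below is a minimal sketch using the Kubernetes Python client; the provisioner name and parameter keys are hypothetical, since the real keys are defined by each CSI driver:

```python
# Minimal sketch of a declarative, policy-driven storage class.
# Provisioner and parameter names are hypothetical examples.
from kubernetes import client, config

config.load_kube_config()
storage = client.StorageV1Api()

gold_policy = client.V1StorageClass(
    metadata=client.V1ObjectMeta(name="gold"),
    provisioner="csi.example.com",        # hypothetical CSI driver
    parameters={                          # hypothetical policy knobs
        "dedupe": "true",
        "encryption": "true",
        "performanceTier": "high",
    },
    allow_volume_expansion=True,          # keeps the policy dynamic
)
storage.create_storage_class(gold_policy)
```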
One additional tip: ensure that these policies can be changed over time. As basic as it may sound, many policy-based systems give the illusion of being dynamic but are static in practice. Test the ability to change policies and have those changes ripple through the data, so that your storage is as dynamic as possible.
4. Programmable & API Driven
Storage resources must be provisioned, consumed, moved, and managed by API. Better still, these actions should be performed autonomously by the system in response to application instantiation, which is the heart of an infrastructure designed for self-service. Without this capability, developers cannot generate their own storage when they want it, which creates a bottleneck in the development process and requires the very thing containers are designed to eliminate: manual intervention. In practice, programmability allows you to query the storage system to assign and reclaim persistent volumes on an as-needed basis.
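As a simple illustration of that query-and-reclaim loop, here is a hedged sketch using the Kubernetes Python client; the reclamation policy shown (deleting claims whose binding has been lost) is just one possibility:

```python
# Hedged sketch of API-driven reclamation: query the cluster for
# released claims and delete them, the kind of loop an autonomous
# system would run on your behalf.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

def reclaim_lost_claims(namespace: str = "default") -> None:
    # List every persistent volume claim in the namespace...
    claims = core.list_namespaced_persistent_volume_claim(namespace)
    for pvc in claims.items:
        # ...and reclaim any whose binding is gone (policy is up to you).
        if pvc.status.phase == "Lost":
            core.delete_namespaced_persistent_volume_claim(
                pvc.metadata.name, namespace
            )
```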
5. Natively Secure
The security of the storage should be endemic to the system. Storage must fit into the overarching security framework and provide inline and post-process encryption as well as Role Based Access Control. Bolted-on security should be avoided, since it often adds overhead and impacts storage performance. Your system should be able to encrypt programmatically at the container level, as well as utilize data-at-rest encryption capabilities, to minimize any performance impact.
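For the access-control half of that equation, here is a minimal sketch of a namespaced RBAC role limiting who may create and delete persistent volume claims, via the Kubernetes Python client; the role and namespace names are hypothetical:

```python
# Minimal sketch of Role Based Access Control over storage resources.
# Role and namespace names are hypothetical.
from kubernetes import client, config

config.load_kube_config()
rbac = client.RbacAuthorizationV1Api()

pvc_admin = client.V1Role(
    metadata=client.V1ObjectMeta(name="pvc-admin", namespace="team-a"),
    rules=[
        client.V1PolicyRule(
            api_groups=[""],                      # core API group
            resources=["persistentvolumeclaims"],
            verbs=["create", "delete", "get", "list"],
        )
    ],
)
rbac.create_namespaced_role("team-a", pvc_admin)
```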
6. Designed for Agility
The fundamental goal of a Kubernetes implementation is agility, for DevOps and for the application overall. The storage platform should be dynamic in terms of capacity, location, and all other key storage parameters, including system performance, availability (controlled via the number of replicas desired), and durability. The platform itself should be able to move the location of the data, dynamically resize, and take snapshots of volumes. Further, it should be easily tunable and customizable for each application or tenant via policies that map to storage classes. The most powerful systems react dynamically when system resources are increased, and during datacenter outages, when workloads may need to shift to other failure domains.
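Dynamic resizing is a good litmus test for this kind of agility. The sketch below grows a live claim through the Kubernetes Python client; it assumes a storage class with allowVolumeExpansion enabled, and the claim name is hypothetical:

```python
# Hedged sketch of dynamic resizing: patch a live claim to grow it.
# Works only when the backing storage class allows volume expansion.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

def grow_claim(name: str, namespace: str, new_size: str) -> None:
    patch = {"spec": {"resources": {"requests": {"storage": new_size}}}}
    core.patch_namespaced_persistent_volume_claim(name, namespace, patch)

grow_claim("orders-db-data", "default", "50Gi")  # e.g. 10Gi -> 50Gi, live
```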
7. Designed for Performance
The storage platform should offer deterministic performance by application and by tenant to support a range of requirements across a complex topology of distributed environments. Performance comprises IOPS thresholds, media type (flash, NVMe, Optane), data efficiency settings (compression, dedupe) and the dynamic reaction to changes in workload demand or cluster resources. In less dynamic applications, administrators could set a single service level objective (SLO) and check in on those targets intermittently. In today's environment, the system itself must "check in" on SLO achievement constantly and react in real time, orchestrating the changes needed to meet it.
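The sketch below illustrates that shift from intermittent to continuous SLO checking; the metric source and remediation hook are hypothetical stand-ins for your platform's telemetry and orchestration APIs:

```python
# Hedged sketch of a continuous SLO "check in" loop. The telemetry
# and remediation functions are hypothetical placeholders.
import time

LATENCY_SLO_MS = 2.0  # hypothetical per-tenant latency objective

def observed_p99_latency_ms(tenant: str) -> float:
    """Hypothetical: read p99 latency from the platform's telemetry."""
    raise NotImplementedError

def rebalance(tenant: str) -> None:
    """Hypothetical: shift the tenant's volumes to faster media or nodes."""
    raise NotImplementedError

def slo_watchdog(tenant: str, interval_s: int = 10) -> None:
    # React in real time instead of checking in intermittently.
    while True:
        if observed_p99_latency_ms(tenant) > LATENCY_SLO_MS:
            rebalance(tenant)
        time.sleep(interval_s)
```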
8. Continuous Availability
The storage platform should provide high availability, durability and consistency even as application needs change and the environment scales. For example, modern storage systems are leaving RAID behind and moving to shared-nothing designs in which data replicas are dispersed across storage nodes situated in different failure domains or metro-geographies, all to maximize availability. This is the new way to drive availability levels at lower cost.
Having fine-grained, programmatic control over availability levels by workload is essential, since some of your applications and data will inevitably be more important than others. Most enterprises will require wide variances in data availability: some applications may warrant just three replicas for apps and data of lesser import, housing some fraction on the lowest-cost media, while others use five replicas for the most important instances, stretched across three data centers with an aggressive snapshot schedule. Having options beyond standard, fixed RAID schemes is often essential for cloud native environments, and is consistent with the architecture of most cloud service providers.
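To see why dispersing replicas across failure domains drives availability, consider this minimal placement sketch; node and domain names are hypothetical:

```python
# Minimal sketch of shared-nothing replica dispersal: place each
# replica in a distinct failure domain so no single rack, row or
# site failure can take out more than one copy.
from itertools import groupby

def place_replicas(nodes: list[tuple[str, str]], count: int) -> list[str]:
    """Pick `count` nodes, no two from the same failure domain.

    `nodes` is a list of (node_name, failure_domain) pairs.
    """
    nodes = sorted(nodes, key=lambda n: n[1])
    domains = [list(g) for _, g in groupby(nodes, key=lambda n: n[1])]
    if count > len(domains):
        raise ValueError("not enough failure domains for the replica count")
    # One replica per domain maximizes the number of survivable failures.
    return [group[0][0] for group in domains[:count]]

# Three replicas across racks; a five-replica policy would need five domains.
print(place_replicas(
    [("n1", "rack-a"), ("n2", "rack-a"), ("n3", "rack-b"),
     ("n4", "rack-c"), ("n5", "rack-d")],
    count=3,
))  # -> ['n1', 'n3', 'n4']
```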
9. Support More than Cloud Native Applications
This is not a typo. The move to containers and cloud native application design is a generational shift, and as such will take many large organizations a generation to complete. Embracing a storage system that supports more than containers is critical to avoiding yet another data silo. As you evaluate a new storage platform for cloud native workloads, ensure that it also serves virtualized and bare metal applications, preserving freedom of data usage as applications transition. In other words, while your organization may be racing toward the future, your system also needs to serve the past without losing a step. The storage platform should serve Kubernetes, Docker, VMware and bare metal workloads.
10. Operate Like Kubernetes Itself — the Modern Way
Carving out traditional LUNs is a tried-and-true method for provisioning storage for traditional applications. But by embracing storage built on declarative policies, designed with composability in mind, enterprises can mirror the dynamism of a cloud native compute environment. Storage needn't be static: DevOps teams should be able to spin up new instances and create new replicas to serve the application as traffic grows.
Conclusion
These 10 principles will help ensure that you make the right choice for modernizing your on-premises storage infrastructure to accommodate containers.
At Datera, we designed our cloud data management platform to utilize commodity servers, allowing you to build an enterprise-grade storage environment with these principles in mind, yielding a system that is:
- Built for Hyperscale: Scale up, scale out, and scale across the data center on commodity servers. Legacy systems often force organizations to predict storage needs far in advance, which restricts growth. With our approach, organizations start with a set of nodes; as new nodes are added, performance, capacity and availability grow. At these moments, the system re-inventories itself and autonomously applies the extra performance against workload policies already in place, yielding a horizontally scaling environment built for hyperscale.
- Built for Autonomous Operations: Using policies to drive the storage requirements minimizes operator overhead and ensures that those requirements can systematically and programmatically adapt as application or tenant needs change and environments scale. These policies are built for change, so that all changes apply retroactively to the data written earlier to the system.
- Built for Constant Change: Datera serves bare metal, hypervisors, and container deployments on a single platform, helping enterprises avoid another island of data in their data operations as they migrate workloads over time.
- Built for Continuous Availability: Datera’s shared nothing architecture eliminates single point of failure challenges, and overcomes multiple component failures and multiple node failures without loss of data or performance. The platform uses telemetry from all nodes to constantly monitor the system, ensuring availability and other parameters in the SLO manifest.
- Built for Tier 1 Performance With Services: Datera delivers sub-200-microsecond latency and scales to millions of IOPS, growing as nodes are added to the cluster, and it sustains this class of performance with essential data services such as deduplication, compression and encryption enabled, at minimal performance cost.
Lastly, Datera operates like Kubernetes itself, letting you rapidly deploy persistent volume claims to support an application’s distinct requirements and take the appropriate action (e.g., add capacity, add high performance media, make a replica) to meet them. Datera autonomously supports container changes over time, spinning up new services, new tenants, or even recapturing resources automatically as you spin down Kubernetes instances.
Modern workloads may not strictly require a modern approach, but enterprises stand to gain from the new thinking and new capabilities that Datera's software-driven approach delivers.
About the Author
Brett Schechter is the Sr. Director of Technical Marketing at Datera. He has spent time with many of the most innovative storage vendors of the past decade, as well as running product management for Rackspace storage, where he set the requirements for the first managed public/hybrid cloud storage products on the market. In his current role, he is responsible for collaborating with enterprises on their system requirements and test plans.