This article explores how to evaluate enterprise storage choices for hybrid and private cloud environments that use a mix of media from SATA to NVMe, and a mix of application deployment options such as VMs, containers, and bare metal.
I recently upsized to a big ol’ pickup truck, a Ford F-350 with four-wheel drive, a 460 cubic inch engine, more than 500 lb-ft. of torque, and heavy-duty axles. You see, we moved onto a new property out in the country a few months ago, and it turns out we need lots of dirt, rocks, gravel… 6,000 tons so far and counting. A rock quarry nearby gives us a great deal. Unfortunately, the quarry is located up a steep hill with sharp curves and several drop-offs (what people who don’t live in the mountains call ‘cliffs’).
With 15,000 lbs trying to push you over a cliff, you need the right tools for the job. My truck needed new tires. The previous owner had installed “light-duty” truck tires from a smaller truck he owned. It was simpler and cheaper to use what he already had than to install the correct tires for the truck.
When I meet with enterprise architects, I frequently discover that they are knowingly using the wrong tool for the job. They may be using an expensive storage system when more cost-effective approaches are appropriate, or they may be deploying their large-scale, mission-critical workloads on infrastructure designed for modest scale. These customers are some of the largest corporations in the world with very bright architects, so why would they do this? The answer is simple but problematic: the path of least resistance is to stretch something already deployed rather than deploy something new. Devil-you-know sort of logic.
They may be entertaining the idea of using a given type of product, say hyper-converged or hardware-defined storage, significantly outside of its designed purpose. I am often asked: where does one type of product leave off and related types pick up? Disaggregated, converged – is it an either/or choice and how does one evaluate?
Let’s start with Converged Infrastructure. I am a big fan of hyper-converged infrastructure (HCI). It has the great benefit of administrative simplicity for a modest number of application instances (say, 10s) or 10s of terabytes of data. This maps well to small-medium business, departmental, and edge-compute needs where skill sets may be limited. It begins to face challenges when the scale of the applications or data, or their rate of growth and change, exceeds the administrative design center. Will it work? Probably. Will it deliver the simplicity you bought it for initially? Less and less. At some point, a proper at-scale deployment methodology is required.
I often get questions about using a hardware-defined legacy enterprise array for dynamic environments, such as spinning cloud-native apps up and down in containers. Will it work? Again, the answer is “sort of”, but alternatives would likely serve better given the administrative burden and the lag in responding to DevOps requests. In some organizations, using a familiar but overtaxed approach may be the right answer – until it isn’t.
Having the right tires on my truck matters only periodically, say once every couple of weeks when I am hauling 15,000 lbs of gravel toward a hairpin turn with a cliff on the side. The rest of the time light-duty tires work just fine – if you’re not at the bottom of a cliff.
When we do proofs-of-concept (POCs) with large enterprises, they inevitably want to run some nasty combination of failures during a high workload to see how the Datera system responds. They do this not because they are trying to make us fail, but because they have gone off the proverbial cliff while hauling 15,000 lbs of gravel and don’t want to do it again. Invariably, the back story they share is that they used a well-known company’s market-leading product and were told it would work just fine for their use case, but it failed at the worst possible time. Sometimes the person running the POC has the battle scars; sometimes it’s their replacement.
Now let’s talk about Disaggregated Infrastructure. Software-defined storage (SDS) has gone through several dramatic changes during its decades-long evolution to emerge triumphantly, at last, in its current form. Enterprises are looking to SDS to solve a number of problems, chief among them the desire to better observe, control, and understand business data. With SDS, your data gains more independence from the underlying deployment model. Whether it’s on VMs, containers, or bare metal doesn’t matter, as the data is accessible and portable across each of these. Performance and metadata are no longer tied to hardware but are exposed through software, and thus easier to automate as well. As a bonus, you are future-proofing your storage so that it can support a wide range of commodity hardware with programmability via open APIs.
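To make “programmability via open APIs” concrete, here is a minimal sketch of provisioning a volume through a generic REST-style SDS management API. The endpoint, payload fields, and token are hypothetical placeholders for illustration, not any specific vendor’s API.

```python
# Minimal sketch: provisioning a volume through a generic SDS REST API.
# The endpoint, payload fields, and auth token are hypothetical placeholders.
import requests

SDS_API = "https://sds.example.com/v1"          # hypothetical management endpoint
HEADERS = {"Authorization": "Bearer <token>"}   # hypothetical auth token

def provision_volume(name: str, size_gb: int, policy: str) -> str:
    """Request a volume with a named policy (e.g. replica count, QoS tier)."""
    resp = requests.post(
        f"{SDS_API}/volumes",
        json={"name": name, "size_gb": size_gb, "policy": policy},
        headers=HEADERS,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["id"]  # hypothetical response field

# The same call works whether the consumer is a VM, a container CSI driver,
# or a bare-metal host, which is the portability point made above.
volume_id = provision_volume("dev-postgres-01", 500, "gold")
print(f"Provisioned volume {volume_id}")
```

Because the control path is an API rather than a device-specific management console, the same provisioning step can be dropped into whatever automation tooling the DevOps team already uses.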
Whether you decide to use Datera SDS, vSAN, VxFlex OS, VxBlock, HPE Primera, Nutanix, or others all depends on your use cases. The marketing of these products would have you believe that they are universally good, regardless of how they are actually designed. Instead, customers must figure out the limits for their environment themselves, which brings us back to the key question: how?
Having been in the industry for more than 30 years now, I have talked to thousands of customers and been involved in the design of more than 25 storage architectures across nine different companies. Every design was a balance of value and tradeoffs. At every one of those nine companies, I have been asked by sales to provide the universal answer to the question: what is the right storage for my customer? The answer, stated with confidence, is always 11. Why 11? The answer makes as much sense as the question. Every enterprise is on a journey and every journey is different.
My automotive needs are diverse. I am a rural homeowner, college tennis coach, and sports car enthusiast. I need to haul rock, haul tennis balls, and like to haul ass occasionally.
My choice in transportation at any given moment is often dictated by past events. I will spend $10 in gas to go to the grocery store when the more efficient car is being used by my wife. I can’t get rid of the truck – when I need it, I really need it – and since I already have it, it is easy to keep. But the more often I have to spend $10 in gas to go into town, the more likely I am to get another vehicle. What I may do is pick up something better suited to the other needs when the opportunity arises.
The same is true for enterprises. There is no single storage solution that will fulfill all current and future use cases – though vendors have tried. But when enterprises can map out their desired state by looking ahead to the future as well as back at the past, architects can modernize based on anticipated needs. I created a chart to help.
The chart below distills many factors that should be considered in making deployment decisions for a new project. First, though, I’d like to dispense with several “red herrings” that often get added to the discussion but are actually a distraction from an informed approach:
- SAN/NAS Versus SDS. Nobody will argue against the historical value of SAN and NAS, but the demand for agility and velocity clearly points to converged or disaggregated software-defined solutions going forward.
- Virtual Machines, Containers, and Bare Metal. All of these are relevant and will be present in large enterprises. For a given workload, that decision will already have been made when the supporting infrastructure was architected.
- Lock-in Concerns. Vendor lock-in (what the vendors call loyalty) is a non-technical factor in whatever relationship, good or bad, already exists. It is an important factor, just not one grounded in technical merit.
- Business Model. On-premises solutions can be paid for as a service or purchased as a product from all leading vendors, so this is a non-technical, financial choice. Cloud solutions are paid for as a service, and utilizing cloud is typically a decision made prior to architecting the infrastructure.
- Obfuscated Terminology. To expand into new markets, companies will invent new terminology and apply it to existing designs. The list is too long to go into here, but a recent example is dHCI (disaggregated hyper-converged infrastructure). It attempts to bridge alternative approaches and, as a result, drains meaning from both (disaggregated ≠ converged). This confusion is often the originator’s intended purpose, under the belief that they can sell you what they already have by calling it something new – the common phrase for this is “lipstick on a pig.”
The answer is 11 – for the 11 factors to take into consideration in planning new deployments.
Along the X-axis are five factors that the IT architect can influence through implementation choices. Along the Y-axis are six factors that constrain those choices – the challenges faced by the IT architect. To simplify the chart I have removed legacy approaches (i.e., SAN and NAS), but as you will notice I have left DAS, as it remains an important choice for many workloads.
A further distillation reveals that the choice between converged solutions (HCI such as vSAN, Nutanix, Portworx) and disaggregated solutions (e.g. Datera, Qumulo, WekaIO, Scality, VxFlex OS) is most greatly influenced by scale, diversity, and predictability. The greater these factors, the more weight should be given to disaggregation, as these designs are built for such purposes. If a deployment is <100 VMware instances in a single site and growing at <10% annual capacity, a VMware HCI solution is the simpler approach. If a project is 100s of application instances incorporating multiple deployment types (containers, bare metal, virtual machines) with fluctuating capacity and performance demands, a disaggregated approach is appropriate.
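As a rough illustration only, that guidance can be expressed as a simple heuristic. The thresholds (100 instances, 10% annual growth, a single site) come straight from the paragraph above and are starting points for discussion, not hard rules; the function and its parameter names are my own sketch.

```python
# Rough heuristic for converged vs. disaggregated, per the discussion above.
# Thresholds are illustrative, not vendor guidance.
def suggest_architecture(app_instances: int,
                         deployment_types: set[str],
                         annual_capacity_growth_pct: float,
                         sites: int) -> str:
    """Weigh scale, diversity, and predictability, as described in the text."""
    small_scale = app_instances < 100 and sites == 1
    homogeneous = deployment_types <= {"vm"}          # e.g. VMware-only estate
    predictable = annual_capacity_growth_pct < 10.0
    if small_scale and homogeneous and predictable:
        return "converged (HCI)"
    return "disaggregated (SDS)"

# Example: hundreds of instances across containers, bare metal, and VMs,
# with rapidly fluctuating capacity demands.
print(suggest_architecture(300, {"vm", "container", "bare_metal"}, 40.0, 2))
# -> "disaggregated (SDS)"
```

The point of writing it down this way is to show how few inputs dominate the decision: push any one of scale, diversity, or predictability past the design center and the answer flips.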
Separation of concerns in disaggregated solutions enables location, resiliency, performance, etc. to be optimized independently (agility) to deal with scale, diversity, and change, at the expense of the complexity that is inherent at scale. Integration of concerns in converged solutions enables velocity and ease of administration, but must be constrained in scope to deliver that value.
As you will note from the chart, there is significant overlap between disaggregated and converged solutions. This is a good thing for enterprises, as it allows these approaches to be used more extensively – leveraging skill sets and purchasing relationships – but it also sets up the situation of using the wrong tool by extending a given approach too far: heading for the proverbial cliff at exactly the wrong time.
It may seem odd to declare that my company, Datera, is not the right answer for many storage problems, but it is honest. If you have diverse, large-scale, performance-demanding, mission-critical workloads and need velocity and agility to deal with uncertainty and opportunity, give us a call.