Luck Shouldn’t Be Part of Your Storage Management Strategy

I always find it interesting when customers ask practical questions and get theoretical answers. This happens often, but nowhere more than around the performance and availability of storage systems.

Figure 1 – Calculated Availability? Huh?

Having attended the school of hard knocks, customers are really asking whether the system will be there when they need it.
The answers they often get are both theoretical and full of caveats – almost as if contrived to limit liability. That can’t be good for customer satisfaction.

When I talk to customers about this, I talk about operational availability. It’s the “measurement of how long a system has been available to use when compared with how long it should have been available to be used,” or so says Wikipedia. But while some storage systems might stay operationally available for up to 10 years, the reality is that advances in technology, and changes in what people need or want to do with their data, bring on the scourges known as the forklift upgrade and data migration.
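To make that definition concrete, here is a minimal sketch of the arithmetic in Python. The function name and the numbers (one year of required service, an eight-hour migration window) are assumptions chosen for illustration, not measurements from any real system.

```python
def operational_availability(hours_available: float, hours_required: float) -> float:
    """Fraction of the required time during which the system was actually usable."""
    if hours_required <= 0:
        raise ValueError("hours_required must be positive")
    return hours_available / hours_required


# Hypothetical year: the system should be usable around the clock,
# but a data migration takes it offline for eight hours.
year_hours = 365 * 24
migration_downtime_hours = 8.0

availability = operational_availability(
    year_hours - migration_downtime_hours, year_hours
)
print(f"Operational availability: {availability:.4%}")
```

Notice that nothing failed in this example. The downtime comes entirely from a planned change, which is exactly the kind of interruption a theoretical availability figure tends to leave out.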

Datera has gone above and beyond operational availability.

Recognizing this, we’ve designed our system for continuous operational availability (COA). COA means that the Datera system can be extended and adapted, minimizing the need to buy and deploy new systems for operations that are likely throughout the lifecycle of extracting value from data stored on the system. Historically, data lives much longer than the lifecycle of the infrastructure where it is stored. At Datera, we’ve built for this longevity.

I’m going to make a profoundly obvious statement – customers’ business needs change over time.

Interruptions to data accessibility and availability are far more likely to be caused by changes required by the business than by highly improbable events such as double or triple failures of a specific type in a specific sequence (see theoretical availability). Businesses need to deploy new applications, retire old applications, use data in new ways, manage fluctuations in application demand, exploit new technology to stay competitive and keep up with data that doubles every 12 to 18 months.

Until Datera, many of these seemingly normal business operations required deployment of entirely new systems as the old systems couldn’t adapt – they simply weren’t built to change.

The problem is that existing systems treat these operations, if they can do them at all, as what I call fingers-crossed events. These events tread upon seldom-executed code paths that are extremely difficult to test, and hence seldom tested. Fingers crossed – this is where luck becomes part of your storage management strategy. Let’s hope it’s good luck and not the other kind!

Datera embraces change and wants the customer to change the system.

Datera comes at it from a different angle. The presumption is that the system will undergo constant change. And embracing change is liberating in both a practical and an emotional sense.

In a practical sense, Datera is built on a common mechanism – a control plane – that takes part in the execution of every I/O and assumes the system is changing, and it does this without imposing a performance penalty. Furthermore, the control plane is constantly evaluating the state of the system to manage:

  • Programmatically initiated change (allocating/deallocating application storage)

  • Administratively initiated change (changes to policy)

  • Planned or unplanned removal or addition of resources (failures/scaling)

  • Quality of Service (internally initiated change to optimize QoS)

In an emotional sense, a system built to be changed, where change is part of the expected and normal behavior, emboldens a customer to deploy both sooner and with greater confidence. Equally important, it lets them seek out change to derive a business advantage, because they know the system encourages change and that changes can be made without involving luck.

How does this work?

Every transaction in the system considers both the present state and a desired future state, and is completed with both in mind.

This is very powerful!

Let’s walk through an example…

Imagine you have 100 instances of an application, each of which has four volumes of varying types.

Some volumes for journals, some for data, some for indexes, some for logs. Some fast, some big…

As the application owner, you just got notified that a new version of the application is available that contains a much-awaited feature. Finally! The release notes indicate the benefits of the new feature are dependent on journal performance that is twice your current configuration, but that your bulk data can now be stored on less expensive media.

Unless you are using Datera, it’s safe to assume you are either using traditional storage or first-generation scale-out storage. Now, I’d like you to imagine the conversation with your storage administrator:

You: Hi Chris, how’s it going?

Storage Admin (Chris): Kind of crazy, everyone has a new application that needs storage, and nobody realizes it takes 27 steps to provision a single volume!

You: Well, I’m glad that I have my storage already! I wanted to mention that a new version of my application is coming out next week and it has a new feature I have been waiting more than a year to see.

Storage Admin: Cool, how can I help?

You: The feature depends on doubling the performance of the journal volumes, but the bulk data volumes can now be stored on cheaper media that should be half the cost. I have 100 application instances and each one uses four volumes. I just need the journal volumes and data volumes to be adjusted for the new parameters. Can you do that by next week, so I can test the new feature?

Storage Admin: Muttering… makes hand gesture, walks away, more indiscernible words…

You (thinking to yourself): I’m not sure what Chris said, but I do know what that gesture means. I don’t think I’m getting my storage any time soon…

Now, let’s imagine that you are using Datera, which was built for this opportunity.

When you deployed the application, Datera provided you a template for the application policy that specified best practices for the storage. Datera also saw the new release of the application and provided you with an updated set of best practices for the application.

The application policy was updated to reflect the new settings and the Datera system reallocated resources to all 200 affected volumes (the journal and data volumes of each of the 100 instances) – without disruption – to meet the new objectives.
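To see where the 200 comes from, here is a small Python sketch of a policy template being revised. Everything in it is hypothetical: the `VolumePolicy` fields, the IOPS figures and the media names are made up for illustration and are not Datera's actual policy format. It also assumes each instance has exactly one journal, one data, one index and one log volume.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VolumePolicy:
    """Hypothetical per-volume objectives; not a real Datera schema."""
    iops: int
    media: str


# Assumed best-practice template for the current application release.
TEMPLATE_V1 = {
    "journal": VolumePolicy(iops=5000, media="flash"),
    "data": VolumePolicy(iops=2000, media="flash"),
    "index": VolumePolicy(iops=3000, media="flash"),
    "log": VolumePolicy(iops=1000, media="hybrid"),
}

# New release: journals need twice the performance,
# and bulk data may move to less expensive media.
TEMPLATE_V2 = {
    **TEMPLATE_V1,
    "journal": replace(TEMPLATE_V1["journal"], iops=2 * TEMPLATE_V1["journal"].iops),
    "data": replace(TEMPLATE_V1["data"], media="hybrid"),
}


def changed_roles(old: dict, new: dict) -> list:
    """Return the volume roles whose policy differs between two templates."""
    return sorted(role for role in new if new[role] != old.get(role))


INSTANCES = 100
roles = changed_roles(TEMPLATE_V1, TEMPLATE_V2)
affected_volumes = INSTANCES * len(roles)
print(roles, affected_volumes)  # ['data', 'journal'] 200
```

The point of the sketch is that the application owner edits one template, not 200 volumes. Working out which volumes are affected is left to the system.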

The Datera system did this for you automatically because, well, this is the way the Datera system works – all the time, no special cases, no luck required.

Inside the system, the policy change caused the control plane to evaluate and reallocate resources. It created a new placement policy for the affected volumes. It gave this new placement policy, i.e. the desired future state, to the data plane, and all I/Os after that point were executed against both the present and the desired future state. After a short period of time the system converged on the desired future state, which then became the new present state.
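If you want a feel for the present-state/desired-state idea, here is a toy model in Python. It is only a sketch under my own assumptions, not Datera's implementation: the `Volume` class, the node names and the block-by-block copy are all invented for illustration. Writes go to every node in the present or desired placement, a background step copies older blocks across, and once the desired nodes hold everything the desired state is promoted to the present state.

```python
class Volume:
    """Toy volume whose blocks live on a set of named nodes (hypothetical model)."""

    def __init__(self, present_nodes):
        self.present = set(present_nodes)   # where the data lives now
        self.desired = None                 # where policy wants it to live
        self.blocks = {node: {} for node in self.present}

    def set_desired(self, nodes):
        """Control plane hands the data plane a new placement."""
        self.desired = set(nodes)
        for node in self.desired:
            self.blocks.setdefault(node, {})

    def write(self, block_id, data):
        """Every write is applied to both the present and the desired placement."""
        for node in self.present | (self.desired or set()):
            self.blocks[node][block_id] = data

    def read(self, block_id):
        """Reads are always served from the present placement."""
        return self.blocks[min(self.present)][block_id]

    def converge_step(self):
        """Copy one missing block to a new node; return True once converged."""
        if self.desired is None:
            return True
        source = self.blocks[min(self.present)]
        for node in sorted(self.desired - self.present):
            for block_id, data in source.items():
                if block_id not in self.blocks[node]:
                    self.blocks[node][block_id] = data
                    return False
        # Every desired node holds every block: promote desired to present.
        for node in self.present - self.desired:
            del self.blocks[node]
        self.present, self.desired = self.desired, None
        return True


vol = Volume({"node-a"})
vol.write(0, b"old journal entry")
vol.write(1, b"first version")

vol.set_desired({"node-b"})          # policy change arrives
vol.write(1, b"second version")      # lands on node-a and node-b
vol.write(2, b"written mid-change")  # no interruption to I/O

while not vol.converge_step():
    pass

print(sorted(vol.present), vol.read(0), vol.read(1), vol.read(2))
```

Reads and writes keep working at every point in the loop, and there is no separate "migration mode" in the code. The same write path runs before, during and after the change.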

Equally important, the process used to change the application policy is the exact same approach used to scale the system up or down, add new technologies, manage tenants and quality of service. The same code, executed all the time. That’s smart, not lucky!

Datera: Continuous operational availability, built without compromise and built for change – empowering!