Taking a Pulse on Red Hat: Ceph and OpenShift in 2020

It’s been three years since I worked up a software-defined comparison between Red Hat Ceph and Datera, which you can see here for reference. That’s 30 years in technology time (one year of human life equals 10 in technology evolution), so it’s more than time for an update. And, as you’d expect, a sea change has occurred during that period not only for each storage offering, but also in the preeminence of containers and Kubernetes as a foundation of future applications.

Setting the Scene

The use of software-defined technologies in storage and other layers of the IT stack has gone mainstream, whereas only a few years ago it was still a niche, early-adopter market. Recently, we profiled just how far software-defined technologies in the data plane have come, outgrowing classical hardware-defined arrays by 5X.

Red Hat is also a different entity altogether, now part of IBM with a heightened ability to reach Big Blue customers and beyond. Both Red Hat and Datera continue to see significantly more customer adoption on the back of new feature development that improves performance and usability. Datera has experienced increased adoption in the Global 1000 enterprise space, supporting transactional IO use cases like MySQL databases, while Ceph is mainly deployed at service providers and in developer-heavy environments.

In the meantime, containers have emerged as the central technology of the future, with half of the Fortune 100 now reporting they have rolled out Kubernetes in production. Red Hat has made a big investment in Kubernetes with its OpenShift platform, and similarly Datera has cemented its support for a variety of container orchestrators including OpenShift.

Technology Strides

Ceph has made strides and has IBM’s support, which is good news for the software-defined storage market, particularly in the following areas:

Ease of Rollout, Use and Reporting

Ceph has always been viewed as a powerful engine for the right kind of use cases, but considered to be a bit of a mixed bag on the ease of rollout and ease of use fronts. But our friends at Red Hat have taken many steps to emulate other SDS technologies like VMware vSAN and Datera that excel here. A case in point is Ceph’s Admin dashboard, which provides a graphic view of the overall cluster.

Ceph’s Cluster Dashboard

Datera’s Analytics Cloud Portal

Hyperscale

While making our platform easier to implement and improving our real-time analytics using telemetry data from every node remain fixtures on our engineering agenda, our focus in 2019 was squarely on scaling our deployments and adding a slate of new media and server vendor options to reduce latency even further. On the hardware side, we have improved reporting across all our major supported server platforms including HPE, Fujitsu, Dell, Supermicro, Quanta, and Intel. We also added more predictive features around latency reporting and capacity projections. We are also tracking our customers’ production deployments and are happy to see that 70%+ of all write IOs are serviced in under 131 µs.

Our Fortune 1000 customers have challenged us to scale to entirely new levels, since they typically start with a petabyte of capacity and add from there. To this end, we put ease of rollout on display in multiple forms, which are better seen than described:

Demo: 20 Datera Nodes Up and Initialized in 4 Minutes

But getting nodes up and running is useless if the volumes aren’t set to deliver the right capabilities to individual applications or tenants. To this end, our engineering team has made this step equally easy. We refer to it as developing a storage class: gold, platinum, silver, or whatever precious metal or naming convention you prefer.

Datera Demo: 5 Storage Classes in 5 Minutes
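For readers who want a feel for what a tiered set of storage classes looks like on the Kubernetes side, here is a minimal sketch in Python that builds StorageClass manifests as plain dictionaries and prints them as JSON, which kubectl accepts. The provisioner name, the tier parameters (replicas, iopsLimit) and their values are hypothetical placeholders chosen for illustration; they are not Datera's actual driver name or parameter keys.

```python
import json

# Hypothetical provisioner name, for illustration only. Use the name
# documented by your storage vendor's CSI driver.
PROVISIONER = "csi.example.com"

# Hypothetical tiers and parameter keys. StorageClass parameter values
# must be strings, so the numbers are quoted.
TIERS = {
    "platinum": {"replicas": "3", "iopsLimit": "50000"},
    "gold": {"replicas": "3", "iopsLimit": "20000"},
    "silver": {"replicas": "2", "iopsLimit": "5000"},
}


def storage_class(name, parameters):
    """Build one Kubernetes StorageClass manifest as a plain dict."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": PROVISIONER,
        "parameters": dict(parameters),
        "reclaimPolicy": "Delete",
        "allowVolumeExpansion": True,
    }


def build_manifests(tiers=TIERS):
    """Return one StorageClass manifest per tier, in definition order."""
    return [storage_class(name, params) for name, params in tiers.items()]


if __name__ == "__main__":
    # Wrap the manifests in a List so they can be applied in one step.
    bundle = {"apiVersion": "v1", "kind": "List", "items": build_manifests()}
    print(json.dumps(bundle, indent=2))
```

Saved to a file, the output could be applied with kubectl apply -f. Which parameters a class actually accepts depends entirely on the provisioner in use.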

We have continued to refine and extend our policy-based approach to management, adding node labels and giving users granular control over volume placement throughout the system:

Datera Enterprise Storage System Policies
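To make the idea of label-driven placement concrete, the following is a toy sketch of how a policy might select nodes for a volume's replicas by matching node labels. It is not Datera's placement algorithm; the function names, node names, and label keys (media, rack) are all hypothetical.

```python
def eligible_nodes(nodes, required_labels):
    """Return the names of nodes whose labels include every required pair."""
    return [
        name
        for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in required_labels.items())
    ]


def place_replicas(nodes, required_labels, replica_count):
    """Pick replica_count eligible nodes, or fail if the policy cannot be met."""
    candidates = sorted(eligible_nodes(nodes, required_labels))
    if len(candidates) < replica_count:
        raise ValueError(
            f"policy needs {replica_count} nodes, only {len(candidates)} match"
        )
    return candidates[:replica_count]


if __name__ == "__main__":
    # Hypothetical cluster inventory: node name -> labels.
    cluster = {
        "node-a": {"media": "nvme", "rack": "r1"},
        "node-b": {"media": "nvme", "rack": "r2"},
        "node-c": {"media": "hdd", "rack": "r1"},
        "node-d": {"media": "nvme", "rack": "r3"},
    }
    print(place_replicas(cluster, {"media": "nvme"}, replica_count=3))
```

A real placement engine would also weigh capacity, load, and failure domains; the sketch only shows the label-matching step.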

Kubernetes, OpenShift and Container Acceleration

Big Blue and Datera also share a commitment to supporting containers and container orchestrators. Red Hat has made a big bet on OpenShift and, similarly, Datera has made optimizing for Kubernetes and OpenShift core to its technology strategy, again better seen than yapped about:

Demo: Persistent Volume Claims and Datera PV Integration
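As a rough companion to the demo, here is a minimal sketch of what an application's storage request looks like: a PersistentVolumeClaim that names a storage class and a size. The claim name, the class name "gold", and the size are illustrative assumptions; the manifest fields themselves are standard Kubernetes.

```python
import json


def persistent_volume_claim(name, storage_class_name, size="10Gi"):
    """Build a Kubernetes PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce is the usual access mode for block-backed volumes.
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical claim for a MySQL data volume on a class named "gold".
    claim = persistent_volume_claim("mysql-data", "gold", size="100Gi")
    print(json.dumps(claim, indent=2))
```

Once the claim is applied, the provisioner behind the named storage class is responsible for creating a matching persistent volume and binding it to the claim.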

On The Block: Ceph Bluestore and Datera’s Extent Store

Bluestore was released in 2017 as an alternative to using traditional POSIX file systems (Filestore) to manage data on each disk. Using existing file systems provides robust data integrity and block management, but comes at a great cost to performance and latency as a block storage backend. Bluestore instead writes data directly to the raw block device and adds its own method of managing the block metadata. To improve performance, metadata can be placed on separate media, which is a common technique for traditional file systems as well. In Bluestore’s case, it can be placed on an NVDIMM-type device or Intel’s new Optane DIMM technology.

The diagram below shows how data and metadata flow with the Filestore backend and with the Bluestore backend:

Filestore vs. Bluestore data and metadata flow

In our case, our founding architects recognized from day 1 that to achieve ultra-low latency, they needed to build a system that did not rely on existing POSIX file system technologies. To this end, the Datera Extent Store is built using log-structured techniques to increase performance and reduce wear on flash media. A log-structured design commits large blocks of data to media at a time. We refer to these blocks as buckets, and each can house either the data itself or simply the metadata. Buckets behave differently based on their type so that the storage can be further optimized.
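The general log-structured idea can be illustrated with a toy in-memory model: writes are never applied in place but appended to an open bucket, full buckets are sealed and committed as a unit, and an index tracks where the latest version of each key lives. This is a teaching sketch only, not the Extent Store's implementation; the class name, the tiny bucket size, and the single bucket type are simplifying assumptions.

```python
class ToyLogStore:
    """Toy log-structured store that commits writes in fixed-size buckets."""

    def __init__(self, bucket_size=4):
        self.bucket_size = bucket_size  # records per bucket (tiny, for illustration)
        self.buckets = []               # sealed, immutable buckets ("on media")
        self.open_bucket = []           # bucket currently being filled
        self.index = {}                 # key -> (bucket number, slot)

    def put(self, key, value):
        """Append a record; an overwrite appends a new version instead."""
        bucket_no = len(self.buckets)   # number the open bucket gets when sealed
        self.open_bucket.append((key, value))
        self.index[key] = (bucket_no, len(self.open_bucket) - 1)
        if len(self.open_bucket) == self.bucket_size:
            self._seal()

    def _seal(self):
        """Commit the open bucket as one large sequential write."""
        self.buckets.append(tuple(self.open_bucket))
        self.open_bucket = []

    def get(self, key):
        """Return the latest value written for key."""
        bucket_no, slot = self.index[key]
        if bucket_no < len(self.buckets):
            bucket = self.buckets[bucket_no]
        else:
            bucket = self.open_bucket
        return bucket[slot][1]

    def stale_records(self):
        """Count superseded records that a cleaner could later reclaim."""
        written = sum(len(b) for b in self.buckets) + len(self.open_bucket)
        return written - len(self.index)


if __name__ == "__main__":
    store = ToyLogStore(bucket_size=2)
    store.put("a", 1)
    store.put("b", 2)   # fills and seals the first bucket
    store.put("a", 3)   # new version of "a" lands in the next bucket
    print(store.get("a"), store.get("b"), store.stale_records())
```

Because media only ever sees whole buckets written sequentially, small random overwrites turn into large sequential writes, which is the property that helps both latency and flash wear. The cost is that stale records accumulate and must eventually be cleaned up.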

Final Thoughts

Ceph has come quite some way in the last 30 technology years, offering a massive number of features and capabilities which regrettably come at a cost exacted in system complexity. Administrators need to thoroughly understand the deployment and operation of these features and their impact on the rest of the system, and keep watch for those which may not yet be production-ready.

As for Datera, we remain focused first and foremost on enterprise-class deployments of software-defined block storage, built and fully tested with industry software and hardware partners. Our goal is to help organizations make the inevitable move to a software-centric approach and remove reliance on aging legacy SAN and FC environments where they see fit.

Red Hat and Datera share a commitment to a software-centric vision for the enterprise data center built on containers. While we offer two different paths, the destination is the same. I like to think of Ceph as an Orange and Datera as an Apple: if you are famished, you can bite into an unpeeled Orange and get nourishment, but you will not enjoy the taste unless you take the time to carefully peel and prep it. An Apple is ready to go as soon as you pick it up.