At the OpenStack Summit in Boston last month, Datera presented the Elastic Data Fabric. It was a great event and a great community. One of the most frequently asked questions at our booth was how Datera compares to Ceph storage. In part 1 we will compare the designs; in part 2 we will look at performance.
Read the Teuto Success Story and learn how they found Datera to be 5 to 10 times faster than Ceph.
First, both solutions are “software defined storage”. Consider the components that make up a commodity server: processor, network, and media. With these, a commodity server can be competitive with custom-designed storage platforms. The ability to leverage the benefits of commodity servers to build a large-scale storage system is undeniably the future.
Next, both systems are based on the principle of “scale-out”. The idea is that capacity AND performance are added to the system as additional nodes are joined. This principle lets a system grow very large in terms of both capacity and performance. Compare this to a traditional “scale-up” array built around a controller or a pair of controllers, where typically all the performance is purchased up front and only capacity is added over time.
When we look under the covers, the differences between Datera and Ceph storage become clear. Datera built a block storage file system from the ground up. We did this because delivering high performance and low latency demands care across the entire IO stack.
Our work is designed around the log-structured principle, such that all data writes are sequential, for high performance on HDDs and to mitigate wear on SSDs. This Block Store also has the benefit of low latency. Ceph storage is designed around a Reliable Autonomic Distributed Object Store (RADOS). RADOS uses existing POSIX file systems like XFS and EXT4. These file systems are designed for a very different use case than serving as a block storage backend. This may change in time for Ceph, but this type of development takes years of hard work to complete.
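To make the log-structured idea concrete, here is a minimal sketch in Python. It is not Datera's implementation; the class name `LogStructuredStore`, its methods, and the in-memory list standing in for the media are all assumptions made for illustration. It shows only the core property described above: every write, including an overwrite of an existing block address, is appended at the tail of the log, and a small index remembers where the newest copy lives.

```python
class LogStructuredStore:
    """Toy append-only block store (hypothetical, for illustration only)."""

    def __init__(self):
        self.log = []     # append-only list of (lba, data) records
        self.index = {}   # lba -> position of the newest record in the log

    def write(self, lba, data):
        # Overwrites never modify an old record in place; they append a new one,
        # so the media always sees sequential writes.
        self.log.append((lba, data))
        self.index[lba] = len(self.log) - 1
        return self.index[lba]

    def read(self, lba):
        # The index points at the most recent record for this block address.
        return self.log[self.index[lba]][1]


store = LogStructuredStore()
# Block 7 is written twice; the writes arrive in "random" address order.
positions = [store.write(lba, data) for lba, data in [(7, b"a"), (3, b"b"), (7, b"c")]]
latest = store.read(7)
```

Whatever block addresses the workload touches, the log positions come out strictly increasing, and a read returns the latest copy. A real system would also need garbage collection to reclaim superseded records, which this sketch leaves out.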
One of the main features of Ceph storage is the CRUSH map. The idea is to replace a table or metadata store that records where blocks are placed in the storage system with a computation: the CPU calculates the placement. The CRUSH algorithm is extremely powerful but also brings complexity to manage. Datera operates on the concept of “targeted placement” (see graphic below): data is routed to the correct node, and the node manages the local metadata placement. This keeps the amount of metadata generated at the cluster level very low, with the heavy lifting of managing block-to-media placement done on each node independently. The result is that Datera can handle large-scale deployments with ease, and workloads can be targeted to Flash or Hybrid as needed.
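The general idea of computing placement instead of looking it up in a table can be sketched in a few lines of Python. This is not the CRUSH algorithm, which also accounts for device weights and failure-domain hierarchies; it uses plain rendezvous (highest-random-weight) hashing as a simpler stand-in. The function name `place`, the node names, and the object identifier are hypothetical.

```python
import hashlib


def place(object_id, nodes, replicas=2):
    """Pick `replicas` nodes for an object by hashing, with no lookup table."""
    def score(node):
        # Each (object, node) pair gets a deterministic pseudo-random score.
        digest = hashlib.sha256(f"{object_id}:{node}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The highest-scoring nodes hold the object.
    return sorted(nodes, key=score, reverse=True)[:replicas]


nodes = ["node-a", "node-b", "node-c", "node-d"]
chosen = place("volume-1/block-42", nodes)
```

Any client that knows the node list computes the same answer, so no central table has to be consulted or kept consistent. The cost is that the node list and the placement rules themselves must be managed carefully, which is the kind of complexity noted above.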
Both Ceph storage and Datera support enterprise features such as snapshots and clones. Datera has QoS capabilities, and Ceph storage supports block, object, and file. For block, Datera uses industry-standard iSCSI, while Ceph storage uses its own native protocol (RBD) rather than an industry standard.
In part 2 we’ll examine the performance differences between Datera and Ceph storage, which can be over 10X on the same hardware.