Disk Survey

Surveying the disks of the world

How Storage Layers Hurt


There has always been a war between the layer makers and the monolith lovers. It is ever present in the networking model, with its multitude of precisely defined layers, and it exists in the storage world too, whether or not the storage is networked. The upper layers become much simpler when the lower layers take control and do something behind the scenes. The problem comes when a lower layer acts without knowing what a higher layer could have done, and there is no way to pass that information between the layers.

Take, for example, a presentation by Micron about their ClearNAND, titled “Why ECC-Free NAND Is the Best Solution for High-Performance Applications”. It describes a small NAND controller that sits below the SSD controller and handles the ECC itself, which reduces the need for ever larger ECC logic in the FPGA/ASIC of the SSD’s top-level controller. The first thought that comes to mind: how can slightly deteriorated NAND be handled smartly if the lower level hides the information? How would something like the Anobit (RIP) or DensBits smarts come to life if all the upper layer ever gets is either good data or forever-corrupt data?

The same happens at the higher layers. A typical RAID device has multiple disks with redundancy and can recover from an error in less time than it would take a disk to retry. Yet there is no method for the disk (SSD or HDD, it doesn’t matter) to report the problem quickly and let the RAID layer above handle it, returning to the disk with a request to do more work only if the higher-level recovery failed. Storage performance would improve even in the face of media problems, and the world would be better for it. But the laziness of some software developers at the higher layers and the inflexibility of the developers at the lower layers have prevented it so far. The defined interfaces between a host and the disk do not help much either.
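To make the idea concrete, here is a minimal sketch of what such a conversation between the layers could look like. Everything in it is an assumption made for illustration: the `Disk` class, the `fast_fail` flag, the `MediaError` exception and the single-parity XOR layout are hypothetical and do not describe any real drive interface or RAID product. The disk is asked to give up quickly, the RAID layer rebuilds the block from its peers, and only if that fails does it go back to the disk and ask for the slow, deep retry.

```python
from dataclasses import dataclass, field
from functools import reduce


class MediaError(Exception):
    """Hypothetical signal: the disk gave up within the effort it was allowed."""


@dataclass
class Disk:
    """Toy disk model, not a real device interface."""
    blocks: dict                              # lba -> bytes
    weak: set = field(default_factory=set)    # LBAs readable only with slow deep retries
    dead: set = field(default_factory=set)    # LBAs that can never be read
    deep_retries: int = 0                     # how often the slow path was taken

    def read(self, lba, fast_fail=True):
        if lba in self.dead:
            raise MediaError(lba)
        if lba in self.weak:
            if fast_fail:
                # Report the problem right away instead of retrying internally.
                raise MediaError(lba)
            self.deep_retries += 1            # stands in for a long retry loop
        return self.blocks[lba]


def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid_read(disks, target, lba):
    """Read one block from disks[target], preferring redundancy over deep retries."""
    try:
        return disks[target].read(lba, fast_fail=True)
    except MediaError:
        pass
    try:
        # Single-parity stripe: the missing block is the XOR of all the others.
        peers = [d.read(lba, fast_fail=True)
                 for i, d in enumerate(disks) if i != target]
        return xor_blocks(peers)
    except MediaError:
        # Higher-level recovery failed, so now ask the disk to work harder.
        return disks[target].read(lba, fast_fail=False)
```

With two data disks and one parity disk, a weak block on the first disk is served from the other two without the slow path ever being taken. The deep retry only happens when a second disk in the same stripe is also in trouble.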

There is a lot to be said for the ultimate control that comes from the kind of integration Fusion-IO does with their products, working their way from the NAND up to the top application. And yet there is a world of difference between a caching product and a full-blown SAN storage device that makes the life of administrators much easier.

There should also be a middle ground. I wonder if it will ever come to life?