Tearless Tiering
Regardless of how “low” disk hardware pricing may be trending, there appears to be a renewed emphasis on driving costs even lower by paying closer attention to how data is tiered within the storage infrastructure.
I am old enough to remember the days when there were relatively few choices. The disks were physically large and, by today’s standards, the capacities were relatively small. If you really needed it, a full string of disks meant “…That wall has to go…”
Once storage moved on from the Henry Ford model, as it were, companies started trying to squeeze every penny out of their disk storage spending. They moved to two tiers, then three, and now some companies are creating as many as five different tiers using Solid State, Fibre Channel, SATA, FATA and SAS disks. Some are even including tape as an active tier in their infrastructures.
I believe the concept of tiering is in some part simply the result of Data Center Managers looking at their once pristine (and empty) raised floors and asking “Where did all these fibre channel disks come from?” Years of not thoroughly understanding the demands of the business led them to play it safe by buying too many disks that were too small and too expensive, simply because no one wanted to be on the receiving end of accusations that the system was underperforming.
The solution was to buy a mix of disks, assign names like Gold, Silver, and Bronze, and then take a guess at what data should go where. If you were lucky, you got it right, but the odds were probably against you from the start.
There is a point at which the savings generated by tiering are eaten up by the cost of managing too many tiers. When it comes to tiering, keep it simple: Two tiers of disks should be plenty.
If you don’t believe me, then ask yourself if you’re truly stressing the performance of your subsystem. Chances are, you could be buying larger capacity, lower cost disks – and perform just fine for the majority of application needs. If you’re not pushing the upper limits of the subsystem performance, then trying to squeeze out a few dozen more IOPS per disk doesn’t seem to make a whole lot of sense.
Take IBM’s XIV Storage: the design of XIV effectively says “if I can give you Tier 1 performance using Tier 3 disks, then why do you need to worry about tiering at all?”
Good point.
The parallelization of SATA disks in the XIV storage seems to be working, as XIV is a hot seller and a rock solid platform, in spite of the efforts of the naysayers who claim SATA disks can’t be trusted for Tier 1, critical enterprise applications. Users simply store the data on the XIV, and the system spreads it out across all of the available spindles, much like water poured into a vessel spreads across the entire width of the vessel before it fills up.
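To make the water-in-a-vessel idea concrete, here is a toy sketch in Python. It is not XIV's actual placement algorithm, which I am not describing here; the function name, the chunk count and the disk count are all made up for illustration. It simply deals chunks of data out to every disk in turn and shows that no disk ends up carrying noticeably more than any other.

```python
from collections import Counter


def place_chunks(num_chunks, num_disks):
    """Assign each chunk to a disk in simple round-robin order.

    Hypothetical helper: real systems use more elaborate placement,
    this only illustrates spreading data over every spindle.
    """
    return [chunk % num_disks for chunk in range(num_chunks)]


# Made-up sizes: 1000 chunks of data spread over 12 disks.
placement = place_chunks(num_chunks=1000, num_disks=12)
per_disk = Counter(placement)

# Every disk holds either 83 or 84 chunks, so the load is even.
print(min(per_disk.values()), max(per_disk.values()))
```

Because every disk takes a share of every volume, no single slow spindle has to carry a hot workload on its own, which is the point of the parallel design.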
Another strategy that seems to be working is the one used by Compellent. Rather than assigning individual volumes to tiers, Compellent’s Fluid Data™ architecture monitors the activity of the individual data blocks, and moves the inactive data to lower cost, larger capacity drives.
In a given database, there are bound to be data blocks that are accessed more frequently than others. If that is the case, then why put the entire database onto a more expensive disk tier if the majority of it is simply going to “sit there”?
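The idea of tiering by block rather than by volume can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not Compellent's implementation: the function name, the tier names, the threshold and the access counts are all hypothetical.

```python
def assign_tiers(access_counts, hot_threshold):
    """Map each block id to a tier based on how often it was accessed.

    Hypothetical policy: blocks at or above the threshold stay on the
    fast tier, everything else goes to the cheaper capacity tier.
    """
    return {
        block: ("fast" if count >= hot_threshold else "capacity")
        for block, count in access_counts.items()
    }


# Made-up access counts for six blocks of a single database volume.
access_counts = {0: 950, 1: 3, 2: 0, 3: 410, 4: 1, 5: 0}

tiers = assign_tiers(access_counts, hot_threshold=100)
fast_blocks = sorted(b for b, t in tiers.items() if t == "fast")

# Only the two busy blocks earn a place on the expensive disks.
print(fast_blocks)
```

In this made-up example, four of the six blocks of the same database land on the cheaper drives, while a volume-level policy would have kept all six on the expensive tier.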
Compellent offers a hands-off approach to storage management, allowing the subsystem to manage itself using predefined policies. What’s more, it works even if the system has only one disk size, since data blocks can be stored using different parity levels on the same set of disks.
So while many technologies seem to be holding fast to the idea of segregating data to a specific disk capacity and speed in a dedicated enclosure, XIV and Compellent are taking on the tiering challenge by allowing the technology to do the work – removing the human element – and by doing so, they’re showing their customers the real way to save money through tiering. …Without tears.
