Decoupling storage performance from storage capacity is an idea that has gained extra attention in recent times. It rests on a desire to scale performance when you need performance and to scale capacity when you need capacity, rather than relying on traditional spindle-based scaling, where adding disks delivers both together.
Also relevant is the idea that today’s legacy disk systems are holding back app performance. For example, VMware apparently claimed that 70% of all app performance support calls were caused by external disk systems.
The Business Value of Storage Performance
IT operations have spent the last 10 years trying to keep up with capacity growth, with less focus on performance growth. The advent of flash has shown, however, that even if you do not have a pressing storage performance problem, adding flash will generally make your whole app environment run faster, and that can mean business advantages ranging from better customer experiences to more accurate business decision making.
A Better Customer Experience
My favorite example of performance affecting customer experience is from my past dealings with an ISP where I was a residential customer. I was talking to a call centre operator who explained to me that ‘the computer was slow’ and that it would take a while to pull up the information I was seeking. We chatted as he slowly navigated the system, and as we waited, one of the things he was keen to chat about was how much he disliked working for that ISP :o
I have previously referenced a mobile phone company in the US that replaced all of its call centre storage with flash, specifically to deliver a better customer experience. The challenge with that is cost. The CIO was quoted as saying that the cost to go all flash was not much more per TB than he had paid for tier1 storage in the previous buying cycle (i.e. 3 or maybe 5 years earlier). So effectively he was conceding that he was paying more per TB for tier1 storage now than he was some years ago. Because the environment deployed did not decouple performance from capacity, however, that company has almost certainly over-provisioned storage performance significantly, hence the cost per TB being higher than in the last buying cycle.
More Accurate Business Decision Making
There are many examples of storage performance improvements leading to better business decisions, most typically in the area of data warehousing. When business intelligence reports have more up to date data in them, and they run more quickly, they are used more often and decisions are more likely to be evidence-based rather than based on intuition. I recall one CIO telling me about a meeting of the executive leadership team of his company some years ago where each exec was asked to write down the name of the company’s largest supplier – and each wrote a different name – illustrating the risk of making decisions based on intuition rather than on evidence/business intelligence.
Decoupling Old School Style
Of course we have always been able to decouple performance and capacity to some extent, and it was traditionally called tiering. You could run your databases on small, fast drives in RAID10 and your less demanding storage on larger drives in RAID5 or RAID6. What that didn’t necessarily give you was a lot of flexibility.
Products like IBM’s SAN Volume Controller introduced flexibility to move volumes around between tiers in real-time, and more recently VMware’s Storage vMotion has provided a sub-set of the same functionality.
And then sub-LUN tiering (Automatic Data Relocation, Easy Tier, FAST, etc.) reduced the need for volume migration as a means of managing performance, by automatically promoting hot chunks to flash and demoting cooler chunks to slower disks. You could decouple performance from capacity somewhat by choosing your flash-to-disk ratio appropriately, but you still typically had to be careful with these solutions, since the performance of, for example, random writes that do not go to flash would be heavily dependent on the disk spindle count and speed.
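To make the promote/demote idea concrete, here is a minimal Python sketch of heat-based placement. It is not how any of the named products actually work; the function name `plan_tiers`, the chunk ids and the I/O trace are all made up for illustration, and real implementations track heat over time windows rather than simple counts.

```python
from collections import Counter

def plan_tiers(access_counts, flash_slots):
    """Split chunks into (flash, disk) sets by access count.

    Hypothetical helper: the hottest `flash_slots` chunks go to flash,
    everything else stays on (or is demoted to) disk.
    """
    # Rank hottest first; ties are broken by chunk id so the result is stable
    ranked = sorted(access_counts, key=lambda c: (-access_counts[c], c))
    flash = set(ranked[:flash_slots])
    disk = set(ranked[flash_slots:])
    return flash, disk

# Made-up I/O trace: each entry is the id of the chunk touched by one I/O
trace = [7, 7, 3, 7, 1, 3, 9, 7, 3, 2]
counts = Counter(trace)

flash, disk = plan_tiers(counts, flash_slots=2)
print(sorted(flash))  # [3, 7]
print(sorted(disk))   # [1, 2, 9]
```

The `flash_slots` parameter is the flash-to-disk ratio in disguise: make it bigger and more of the working set lands on flash, make it smaller and more I/O falls through to the spindles.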
So for the most part, decoupling storage performance and capacity in an existing disk system has been about adding flash and trying not to hit internal bottlenecks.
Traditional random I/O performance is therefore a function of:
- the amount/percent of flash relative to the data block working set size
- the number and speed of disk spindles
- bus and cache (and sometimes CPU) limitations
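The three factors above can be turned into a rough back-of-envelope model. The Python sketch below is a toy under stated assumptions: flash itself is never the bottleneck, every flash miss costs exactly one disk I/O, and the bus/cache/CPU limits collapse into a single `controller_limit`. The function name and every number are hypothetical, not measurements from any product.

```python
def random_iops(flash_hit_ratio, spindles, iops_per_spindle, controller_limit):
    """Toy estimate of sustainable random IOPS for a hybrid flash/disk system.

    Assumptions (all simplifications): flash is never the bottleneck, each
    flash miss costs one disk I/O, and bus/cache/CPU limits are represented
    by a single controller_limit figure.
    """
    disk_iops = spindles * iops_per_spindle
    miss_ratio = 1.0 - flash_hit_ratio
    if miss_ratio <= 0:
        # Everything is served from flash, so only the controller limits us
        return controller_limit
    # The disks only see the misses, so total IOPS = disk IOPS / miss ratio
    return min(controller_limit, disk_iops / miss_ratio)

# Hypothetical system: 24 spindles at 150 IOPS each, controller good for 20,000
print(random_iops(0.50, 24, 150, 20000))  # 7200.0
print(random_iops(0.75, 24, 150, 20000))  # 14400.0
# Doubling the spindles at the same hit ratio now hits the controller limit
print(random_iops(0.75, 48, 150, 20000))  # 20000
```

Even in this crude form you can see why adding flash only helps until you hit an internal bottleneck, and why the spindle count still matters for whatever misses the flash.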
Two products that bring their own twists to the game:
Nimble Storage
Nimble Storage uses flash to accelerate random reads, and accelerates writes through compression into sequential 4.5MB stripes (compare this to IBM’s Storwize RtC which compresses into 32K chunks and you can see that what Nimble is doing is a little different).
Nimble performance is therefore primarily a function of:
- the amount of flash (read cache)
- the CPU available to do the compression/write coalescing
The number of spindles is not quite so important when you’re writing 4.5MB stripes. Nimble systems generally support at least 190 TB nett from 57 disks (if I assume 1.5x compression average, or 254 TB if you expect 2x). They claim that performance is pretty much decoupled from disk space, since you will generally hit the wall on flash and CPU before you hit the wall on sequential writes to disk. This kind of decoupling also allows you to get good performance and capacity in a very small amount of rack space. Nimble also offers CPU scaling in the form of a scale-out four-way cluster.
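The two capacity figures are just the same pre-compression capacity multiplied by different compression ratios. Working backwards, 190 TB at 1.5x and 254 TB at 2x both imply roughly 127 TB before compression; that 127 TB is my own back-calculation, not a vendor-quoted figure. A quick Python check:

```python
def effective_capacity_tb(usable_tb, compression_ratio):
    """Effective (post-compression) capacity for a given compression ratio."""
    return usable_tb * compression_ratio

# Assumption: ~127 TB before compression, inferred from the figures above
usable_tb = 127

for ratio in (1.5, 2.0):
    print(ratio, effective_capacity_tb(usable_tb, ratio))
# prints:
# 1.5 190.5
# 2.0 254.0
```

The actual ratio depends entirely on how compressible your data is, so treat both numbers as planning estimates.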
Nimble have come closer to decoupling performance and capacity than any other external storage vendor I have seen.
PernixData FVP
PernixData Flash Virtualization Platform (FVP) is a software solution designed to build a flash read/write cache inside a VMware ESXi cluster, thereby accelerating I/Os without needing to add anything to your external disk system. PernixData argue that it is more cost effective and efficient to add flash to the ESXi hosts than to add it to external storage systems. This has something in common with the current trend for converged scale-out server/storage solutions, but PernixData also works with existing external SAN environments.
There is criticism that flash technologies deployed in external storage are too far away from the app to be efficient. I recall Amit Dave (IBM Distinguished Engineer) recounting an analogy of I/O to eating, for which I have created my own version below:
- Data in the CPU cache is like food in your spoon
- Data in the server RAM is like food on your plate
- Data in the shared Disk System cache is like food in the serving bowl in the kitchen
- Data on the shared Disk System SSDs is like food you can get from your garden
- Data on hard disks is like food in the supermarket down the road
PernixData works by keeping your data closer to the CPU, decoupling performance and capacity by focusing on a server-side caching layer that scales alongside your ESXi compute cluster. So this is analogous to getting food from your table rather than food from your garden. With PernixData you tend to scale performance as you add more compute nodes, rather than when you add more back-end capacity.
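A simple way to picture the "closer to the CPU" argument is as a weighted average of two latencies. The Python sketch below is only an illustration: the function name is made up, and the latency figures (100 µs for server-side flash, 1,000 µs for a round trip to the external array) are placeholder assumptions, not measurements of FVP or any particular SAN.

```python
def avg_read_latency_us(hit_pct, cache_latency_us, array_latency_us):
    """Average read latency when hit_pct percent of reads hit the host cache.

    hit_pct is a whole-number percentage (0-100). The model ignores queuing
    and write behaviour; it is a weighted average and nothing more.
    """
    miss_pct = 100 - hit_pct
    return (hit_pct * cache_latency_us + miss_pct * array_latency_us) / 100

# Placeholder latencies: 100 us from server-side flash, 1000 us from the array
for hit_pct in (0, 50, 80):
    print(hit_pct, avg_read_latency_us(hit_pct, 100, 1000))
# prints:
# 0 1000.0
# 50 550.0
# 80 280.0
```

Nothing in this model depends on how many TB sit behind the cache, which is the decoupling argument in a nutshell: the hit rate and the host-side flash set the performance, while the back end sets the capacity.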
To Decouple or not to Decouple?
Decoupling as a theoretical concept is surely a good thing – independent scaling in two dimensions – and it is especially nice if it can be done without introducing significant extra cost, complexity or management overhead.
It is however probably also fair to say that many other systems can approximate the effect, albeit with a little more complexity.
———————————————————————————————————-
Disclosures:
Jim Kelly holds PernixPrime accreditation from PernixData and is a certified Nimble Storage Sales Professional. ViFX is a reseller of both Nimble Storage and PernixData.