Big Data can take a variety of forms, but what better way to get a feel for the performance of a big data storage system than a standard audited benchmark that measures large file processing, large database queries, and video streaming?
From the www.storageperformance.org website:
“SPC-2 consists of three distinct workloads designed to demonstrate the performance of a storage subsystem during… large-scale, sequential movement of data…
- Large File Processing: Applications… which require simple sequential process of one or more large files such as scientific computing and large-scale financial processing.
- Large Database Queries: Applications that involve scans or joins of large relational tables, such as those performed for data mining or business intelligence.
- Video on Demand: Applications that provide individualized video entertainment to a community of subscribers by drawing from a digital film library.”
The Storage Performance Council also recently published its first SPC-2/E benchmark result. “The SPC-2/E benchmark extension consists of the complete set of SPC-2 performance measurement and reporting plus the measurement and reporting of energy use.”
It uses the same performance test as SPC-2, so the results can be compared. It does look as though only IBM and Oracle are publishing SPC-2 numbers these days, however, and the IBM DS5300 and DS5020 are the same LSI OEM boxes as the Oracle 6780 and 6180, so that doesn’t really add a lot to the mix. HP and HDS seem to have fled some time ago, and although Fujitsu and Texas Memory do publish, I have never encountered either of those systems out in the market. So right now SPC-2 is mainly a way to compare sequential performance among IBM systems.
XIV is certainly interesting, because in its Generation 2 format it was never marketed as a box for sequential or single-threaded workloads. XIV Gen2 was a box for random workloads, and the more random and mixed the workload, the better it seemed to be. With XIV Generation 3, however, we have a system that is seen to be great with sequential workloads, especially Large File Processing, although not quite so strong for Video on Demand.
The distinguishing characteristic of LFP is that it is a read/write workload, while the others appear to be read-only. XIV’s strong write performance comes through on the LFP benchmark.
Drilling down one layer, we can look at the components that make up Large File Processing. Sub-results are reported for reads, writes, and mixed read/write, and for 256 KiB and 1,024 KiB I/O sizes in each category.
So what we see is that XIV is actually slightly faster than the DS8800 on the write workloads, but falls off a little as the read percentage of the I/O mix rises.
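For anyone who wants to line up the Large File Processing sub-results from two published reports, here is a minimal Python sketch. It enumerates the six combinations described above (three I/O mixes at two transfer sizes) and computes the relative difference between two throughput figures. The label format and the function names `lfp_sub_results` and `relative_difference` are my own assumptions for illustration, not anything defined by the SPC. No benchmark numbers are included; you would fill those in from the audited reports.

```python
from itertools import product

# I/O mixes and transfer sizes as described in the text above.
# The label wording is an assumption made for this sketch.
IO_MIXES = ("write", "read/write", "read")
TRANSFER_SIZES_KIB = (1024, 256)


def lfp_sub_results():
    """Return labels for the six LFP sub-results (3 mixes x 2 transfer sizes)."""
    return [
        f"{mix} @ {size} KiB"
        for mix, size in product(IO_MIXES, TRANSFER_SIZES_KIB)
    ]


def relative_difference(a_mbps, b_mbps):
    """Fraction by which throughput A exceeds throughput B (negative if A is slower)."""
    if b_mbps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (a_mbps - b_mbps) / b_mbps
```

Calling `relative_difference` once per label from `lfp_sub_results()`, with the MB/s figures from each system's report, gives a per-sub-result view of where one system leads and where it trails.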