In 1997 IBM launched what it called the Seascape architecture, which essentially was about building storage systems out of snap-together technology rather than building individually hand-crafted monoliths. This architecture has been attributed to Michael Hartung, who was appointed an IBM Fellow in 2002.
As an aside, I realize I have oft quoted Michael Hartung without knowing it. According to Grady Booch (a pioneer of object-oriented design and another IBM Fellow), Michael is responsible for the famous saying about the difference between hardware and software: “Hardware eventually fails; Software eventually works”.
But back to Seascape. This is one of those rare times where the future turns out to be just like you said it was going to be, with products like SVC and DS8000 conforming to the Seascape architecture. As we go on, it seems IBM is driving this architectural concept even harder with XIV and now with SONAS.
The essential components of SONAS (backwards) are:
- Storage tiers (disk, tape)
- Storage owning nodes (Lintel servers)
- Infiniband network plumbing
- File serving nodes (Lintel servers)
- General Parallel File System (gpfs) scale-out and ILM engine
- Clustered SAMBA and NFS
- Integrated Management
While one could argue that the biggest piece of value in SONAS is the integrated management (which sets it apart from SoFS, for example), the two most technologically interesting pieces are gpfs and Clustered SAMBA. gpfs delivers two main points of value to SONAS. It’s the scale-out engine, and it’s the policy-based ILM engine, moving files between disk tiers without upsetting their place in the file tree. For example, a file placement rule might say that if a file has a .dbf extension it goes on the SAS tier, but if it has a .mp4 extension it goes on the SATA tier. Another rule might say that no matter what the extension, if the file hasn’t been touched for a month it goes to SATA. That part is easy-peasy for gpfs and has been for years.
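To make the rule logic concrete, here is a minimal Python sketch of the two example rules. It is not gpfs policy syntax and not how SONAS implements anything; the names (`FileRecord`, `choose_tier`, `apply_policy`), the 30-day reading of "a month", and the default tier for other extensions are all my own assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from pathlib import PurePosixPath

# Assumptions for illustration only: "a month" is taken as 30 days,
# and files with any other extension default to the SAS tier.
IDLE_LIMIT = timedelta(days=30)
TIER_BY_EXTENSION = {".dbf": "sas", ".mp4": "sata"}
DEFAULT_TIER = "sas"


@dataclass
class FileRecord:
    """Hypothetical stand-in for a file: its path, last access time and tier."""
    path: str
    last_access: datetime
    tier: str = DEFAULT_TIER


def choose_tier(record: FileRecord, now: datetime) -> str:
    """Pick a disk tier for one file using the two example rules."""
    # Age rule: untouched for a month goes to SATA, whatever the extension.
    if now - record.last_access >= IDLE_LIMIT:
        return "sata"
    # Placement rule: otherwise the extension decides.
    extension = PurePosixPath(record.path).suffix.lower()
    return TIER_BY_EXTENSION.get(extension, DEFAULT_TIER)


def apply_policy(records: list[FileRecord], now: datetime) -> None:
    """Re-tier every file in place. Only the tier changes, never the path."""
    for record in records:
        record.tier = choose_tier(record, now)
```

Note that `apply_policy` only ever touches `tier`. The path stays put, which is the point about files keeping their place in the file tree.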
This ILM function is not HSM and it doesn’t leave stubs and complex trails of pointers. I haven’t mentioned HSM until now because this is something folks get confused about. If you want ILM to tape (e.g. a rule that says if a file hasn’t been touched for 13 months, move it to tape and leave only a stub on disk) SONAS uses a TSM server and HSM to do that. That doesn’t mean your backup server has to be TSM by the way.
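Since this is where the confusion tends to creep in, here is a toy Python sketch of what the stub idea means. It is purely illustrative and is not how TSM or HSM work internally; `DiskEntry`, `migrate_to_tape`, `read_file`, the dict standing in for tape, and the 395-day reading of "13 months" are all assumptions of mine.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional

# Assumption for illustration: "13 months" is approximated as 395 days.
TAPE_AFTER = timedelta(days=395)


@dataclass
class DiskEntry:
    """Hypothetical file entry; data is None when only a stub is left on disk."""
    path: str
    last_access: datetime
    data: Optional[bytes]


def migrate_to_tape(entry: DiskEntry, tape: Dict[str, bytes], now: datetime) -> bool:
    """Move a cold file's data to 'tape' and leave a stub. True if migrated."""
    if entry.data is None or now - entry.last_access < TAPE_AFTER:
        return False
    tape[entry.path] = entry.data  # the dict stands in for the tape tier
    entry.data = None              # only the stub remains on disk
    return True


def read_file(entry: DiskEntry, tape: Dict[str, bytes], now: datetime) -> bytes:
    """Read a file, recalling its data from 'tape' first if it is a stub."""
    if entry.data is None:
        entry.data = tape.pop(entry.path)
    entry.last_access = now
    return entry.data
```

The contrast with the disk-tier rules is that here the data really does leave the disk and has to be recalled on access, whereas a move between SAS and SATA leaves the whole file in place with no stub.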
The other interesting part is Clustered SAMBA. Clustered SAMBA is open source. I’m told that IBM contributed a lot of code to C/S but that the version used in SONAS is an IBM in-house enhanced version that has been kiln-fired. It doesn’t really matter. The point is that back in the day when NetApp added CIFS to their filers, they probably had to write it from the ground up; these days IBM can work with the open source community to deliver feature-rich software. The IBM value-add is the integrated product, the whole of SONAS.
This is turning into a long post, as is my wont, and we haven’t even got to stuff like remote replication yet, so I will pause here for now except for a quick personal view of the NAS market. The filer market seems to me to be made up of opportunities of increasing size in the following categories. I am sure IDC will have some figures that say differently, and I also know which view I would rather trust : )
- Folks who want filers because they think it’s cheaper than doing block I/O (it usually isn’t)
- Folks looking to replace their MSCS with a decent robust clustered filer environment
- Folks who have a specific technology dependency e.g. Commvault disk to disk backup
- Folks who have 100s of TBs of documents, some active, some not, which they need to manage
SONAS is initially targeted squarely at the fourth and largest category: folks with 100s of TBs of documents to manage.
Image is the hi-density SONAS disk drawer…