A Quick IBM ProtecTIER (Dedup VTL) Update

This is a very brief update designed to help clarify a few things about IBM’s ProtecTIER dedup VTL solutions. I will leave the details of the software functions to the Redbooks (see links below).

What is ProtecTIER?

The dedup algorithm in ProtecTIER is HyperFactor, which detects recurring data in multiple backups. HyperFactor is unique in that it avoids the risk of data corruption due to hash collisions, a risk that is inherent in products based on hashing algorithms. HyperFactor uses a memory-resident index rather than disk-resident hash tables, and one consequence of this is that ProtecTIER’s restore times are shorter than backup times, in contrast to other products where restore times are generally much longer.

The amount of space saved is mainly a function of the backup policies and retention periods, and the variance of the data between backups, but in general HyperFactor can deliver slightly better dedup ratios than hash-based systems. The more full backups retained on ProtecTIER, and the more intervening incremental backups, the more space will be saved overall.

One of the key advantages of ProtecTIER is the ability to replicate deduped data in a many-to-many grid. ProtecTIER also supports SMB/CIFS and NFS access.

While Tivoli Storage Manager (TSM) also includes many of the same capabilities as ProtecTIER, the latter will generally deliver higher-performance dedup by offloading the process to a dedicated system, leaving TSM or other backup software to concentrate on selecting and copying files.

For more information on the software functionality etc., please refer to these links:

 

ProtecTIER Systems

In the past IBM has offered three models of ProtecTIER systems, two of which are now withdrawn, and a new one has since appeared.

  • TS7610 (withdrawn) – entry level appliance up to 6 TB and 80 MB/sec.
  • TS7620 – new entry level system. Up to 35 TB of deduped capacity. Backup speed of 300 MB/sec was originally quoted, but with recent capacity increases I am still trying to confirm if the rated throughput has changed.
  • TS7650A (withdrawn) – the midrange appliance, which was rated at up to 36 TB and 500 MB/sec. This appliance was based on a back-end IBM (LSI/NetApp) DS4700 disk system with 450 GB drives in a RAID5 configuration.
  • TS7650G – the enterprise gateway, which is currently rated at 9 TB per hour backup and up to 11.2 TB per hour restore. Each TS7650G has support for multiple Storwize V7000 or XIV disk systems, both of which offer non-disruptive drive firmware update capability.

Sizing

There are a couple of rules of thumb I try to use when doing an initial quick glance sizing with the TS7650G with V7000 disk.

  • Every V7000 disk will give you another 20 GB per hour of ProtecTIER backup throughput. The I/O profile for files is approx 80/20 random R/W with a 60 KB block size, and we generally use RAID6 for that. Metadata is generally placed on separate RAID10 drives and is more like 20/80 R/W.
  • Backup storage (traditionally on tape) can be five to ten times the production storage capacity, so assuming a 10:1 dedup ratio, you might need a dedup disk repository between half and the same size as your production disk. However, if you know you are already storing x TB of backups on tape, don’t plan on buying less than x/10 dedup capacity. The dedup ratio can sometimes be as high as 25:1 but more typically it will be closer to 10:1.
  • It’s probably not a good idea to buy a dedup system that can’t easily grow to double the initially sized capacity. Dedup capacity is notoriously hard to predict, and you can turn out to need more than you expected.

Those rules of thumb are not robust enough to be called a formal sizing, but they do give you a place to start in your thinking.
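As a rough illustration, the rules of thumb above can be written down as a few lines of Python. This is only a sketch of the arithmetic in this post, not an IBM sizing tool; the function and constant names are my own invention, and it does not account for the separate RAID10 metadata drives.

```python
import math

# Rules of thumb quoted in this post (starting points, not a formal sizing).
GB_PER_HOUR_PER_DISK = 20          # backup throughput added per V7000 disk
TYPICAL_DEDUP_RATIO = 10           # "more typically it will be closer to 10:1"
BACKUP_MULTIPLIER_RANGE = (5, 10)  # backup data as a multiple of production capacity


def disks_for_throughput(target_gb_per_hour):
    """V7000 disks needed to reach a backup throughput target (hypothetical helper)."""
    return math.ceil(target_gb_per_hour / GB_PER_HOUR_PER_DISK)


def repository_range_tb(production_tb, dedup_ratio=TYPICAL_DEDUP_RATIO):
    """Low and high estimates of dedup repository size from production capacity."""
    low, high = BACKUP_MULTIPLIER_RANGE
    return (production_tb * low / dedup_ratio, production_tb * high / dedup_ratio)


def minimum_from_tape_tb(tape_backup_tb, dedup_ratio=TYPICAL_DEDUP_RATIO):
    """Floor on repository size if you already know how much backup sits on tape."""
    return tape_backup_tb / dedup_ratio


def growth_target_tb(initial_tb):
    """Capacity the system should be able to grow to: double the initial sizing."""
    return 2 * initial_tb


if __name__ == "__main__":
    low, high = repository_range_tb(100)   # 100 TB of production disk
    print(f"Repository estimate: {low} to {high} TB")
    print(f"Floor from 300 TB on tape: {minimum_from_tape_tb(300)} TB")
    print(f"Disks for 2000 GB/hour: {disks_for_throughput(2000)}")
    print(f"Should be able to grow to: {growth_target_tb(high)} TB")
```

With 100 TB of production disk and a 10:1 ratio, the repository estimate lands between half and the same size as production, which matches the second rule of thumb. Changing `dedup_ratio` is an easy way to see how sensitive the result is to that one hard-to-predict number.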


What do you get at an IBM Systems Technical Symposium?

What do you get at an IBM Systems Technical Symposium? Well for the event in Auckland, New Zealand November 13-15 I’ve tried to make the storage content as interesting as possible. If you’re interested in attending, send me an email at jkelly@nz.ibm.com and I will put you in contact with Jacell who can help you get registered. There is of course content from our server teams as well, but my focus has been on the storage content, planned as follows:

Erik Eyberg, who has just joined IBM in Houston from Texas Memory Systems following IBM’s recent acquisition of TMS, will be presenting “RAMSAN – The World’s Fastest Storage”. Where does IBM see RAMSAN fitting in and what is the future of flash? Check out RAMSAN on the web, on Twitter, on Facebook and on YouTube.

Fresh from IBM Portugal and recently transferred to IBM Auckland, we also welcome Joao Almeida, who will deliver a topic that is sure to be one of the highlights, but unfortunately I can’t tell you what it is since the product hasn’t been announced yet (although if you click here you might get a clue).

Zivan Ori, head of XIV software development in Israel, knows XIV at a very detailed level – possibly better than anyone, so come along and bring all your hardest questions! He will be here and presenting on:

  • XIV Performance – What you need to know
  • Looking Beyond the XIV GUI

John Sing will be flying in from IBM San Jose to demonstrate his versatility and expertise in all things to do with Business Continuance, presenting on:

  • Big Data – Get IBM’s take on where Big Data is heading and the challenges it presents and also how some of IBM’s products are designed to meet that challenge.
  • ProtecTIER Dedup VTL options, sizing and replication
  • Active/Active datacentres with SAN Volume Controller Stretched Cluster
  • Storwize V7000U/SONAS Global Active Cloud Engine multi-site file caching and replication

Andrew Martin will come in from IBM’s Hursley development labs to give you the inside details you need on three very topical areas:

  • Storwize V7000 performance
  • Storwize V7000 & SVC 6.4 Real-time Compression
  • Storwize V7000 & SVC Thin Provisioning

Senaka Meegama will be arriving from Sydney with three hot topics around VMware and FCoE:

  • Implementing SVC & Storwize V7000 in a VMware Environment
  • Implementing XIV in a VMware Environment
  • FCoE Network Design with IBM System Storage

Jacques Butcher is also coming over from Australia to provide the technical details you all crave on Tivoli storage management:

  • Tivoli FlashCopy Manager 3.2 including VMware Integration
  • TSM for Virtual Environments 6.4
  • TSM 6.4 Introduction and Update plus TSM Roadmap for 2013

Maurice McCullough will join us from Atlanta, Georgia to speak on:

  • The new high-end DS8870 Disk System
  • XIV Gen3 overview and tour

Sandy Leadbeater will be joining us from Wellington to cover:

  • Storwize V7000 overview
  • Scale-Out NAS and V7000U overview

I will be reprising my Sydney presentations with updates:

  • Designing Scale Out NAS & Storwize V7000 Unified Solutions
  • Replication with SVC and Storwize V7000

And finally, Mike McKenzie will be joining us from Brocade in Australia to give us the skinny on IBM/Brocade FCIP Router Implementation.

You can’t always get what you want

There has been a raft of new storage efficiency elements brought to market in the last few years, but what has become obvious is that you can’t yet get it all in one product.

ALL YOUR BASE ARE BELONG TO US

There are four reasons I can think of why a company wants to buy another:

  1. To take a position in a market you didn’t expect to be in but that has suddenly become important to you (e.g. EMC buying VMware)
  2. To take a position in a market you did expect to be in, but the internal projects to get you where you wanted have failed (e.g. HP buying 3PAR)
  3. To gain mass in a market in which you already play successfully (e.g. Oracle buying JDE and PeopleSoft)
  4. To prevent your competitor gaining an asset that they could use to attack your market (e.g. Oracle buying Sun/MySQL)

When Space, Time & Vendor Charges Collide…

Well, the whole snapshot and replication thing got me thinking about vendor licensing. Licensing is a way to get a return on one’s R&D. It doesn’t really matter whether customers pay x for hardware and y for software, or x+y for the hardware ‘solution’ and zero for software functions etc., as long as the vendor gets the return it needs to keep its investors happy.

Vendor charges are like taxes: most of us appreciate that they are needed, but there are many different ways to levy the tax, e.g. flat tax rate, progressive, regressive, goods and services (GST/VAT/sales tax).

I suspect that charging large licence fees for snapshot and replication functions has held IT back and IMHO the time has now come to set these functions free.

Less is More – it’s a dedup vendor party

Dedup is happening fast all around us and the vendors are lining up, but it’s not always easy to compare what’s going on.
