FCIP Routers – A Best Practice Design Tip

Many years ago a Glaswegian friend of mine quoted someone as saying that the 1981 anti-apartheid protests in New Zealand (South African rugby tour) showed that New Zealand was not just a floating Surrey as some had previously suspected. While the Surrey reference might be lost on those not from England, I can tell you there are some distinct cultural and language differences between NZ and England.

For example, there was a (not very good) punk band called ‘Rooter’ back in the late 1970s in New Zealand. They ended up having to change their name to The Terrorways because ‘Rooter’ was considered too offensive by the managers of many pubs and clubs.

I guess that’s why in NZ we always pronounce ‘router’ to rhyme with ‘shouter’ even though we pronounce ‘route’ to rhyme with ‘shoot’. We’re kind of stuck in the middle between British and American English.

Pronunciation issues aside, however, FCIP routers are a highly reliable way to connect fabrics and allow replication over the WAN between Fibre Channel disk systems. The price of FCIP routers seems to have halved over the last year or so, which is handy, and live replicated DR sites have become much more commonplace in the midrange space in the last couple of years.

Apart from the WAN itself (which is the source of most replication problems) there are a couple of other things that it’s good to be aware of when assembling a design and bill of materials for FCIP routers.

  1. When you’re using the IBM SAN06B-R (Brocade 7800), we always recommend including the licence for ‘Integrated Routing’ if you’re going out over the WAN. This prevents the fabrics at either end of an FCIP link from merging. If a WAN link bounces occasionally, as many do, you want to protect your fabrics from repeatedly having to work out who’s in charge and stalling traffic on the SAN while they do so. Without IR your WAN FCIP environment might not really even be supportable.
  2. Similarly, I usually recommend the ‘Advanced Performance Monitoring’ feature. If you run into replication performance problems, APM will tell you what the FC application is actually seeing, rather than you having to make assumptions based on IP network tools.
  3. The third point is new to me and was the real trigger for this blog post (thanks to Alexis Giral for his expertise in this area): if you have only one router per site (as most do), then best practice is to connect it to only one fabric at each site, as per the diagram below.

The reason for this is that the routers and the switches all run the same Fabric OS and there is a small potential for an error to be propagated across fabrics, even though Integrated Routing supposedly isolates them. This is something that Alexis tells me he has explored in detail with Brocade, and they too recommend this as a point of best practice. If you already have single routers connected to dual fabrics then I’m not sure the risk is high enough to warrant a reconfiguration, but if you’re starting from scratch you should not connect a single router to both fabrics. This also applies if you are using the Cisco MDS 9222i and MDS 91xx, for example, as all switches and routers would be running NX-OS and the same potential for error propagation exists.

15 Responses

  1. I wanted to make a few comments about Extension infrastructures. I have been consulting, designing, building and troubleshooting FCIP architectures for over a decade now for the largest of companies, working with Nishan & CNT -> McDATA -> Brocade, Brocade 7500/FR, and now Brocade 7800/FX. Oh yeah, and Cisco 9222i 18/4 & SSN-16 (however, the Cisco products are very low-performance products, so not so much). The Brocade 7800/FX are the “Who’s your Daddy!” of Extension devices.

    First, IBM is a very competent organization for Extension solutions. Kudos!

    Second, I am going to use the terminology “Channel Extender” for the 7800/FX because they were designed to be high performance and high efficiency extenders, unlike the 7500/FR, which are FC Routers from before the Condor2 ASIC. Although the 7800/FX have FCR capability, as do all modern Brocade switches/directors, this is now just a function of the Condor2 ASIC and not specific to any platform. The 7800/FX are channel extenders for FC and FICON and very popular in mainframe environments. There is a LOT of technology in these boxes taken from both CNT and Nishan Systems.

    Third, what does IR do? It enables FCR (FC Routing), which provides a demarcation point (EX_Port) for fabric services. Fabric services are terminated at this demarcation point and do not extend across the WAN; therefore, the edge fabric services are not susceptible to WAN faults. Devices communicate through their local edge fabric to a local proxy device. FCR transports the traffic across a Backbone Fabric, which may or may not be FCIP. This fabric should be designed without end devices attached, so that if it suffers from WAN faults there will be no disruption. This is best practice.
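    To make the proxy-device idea a little more concrete, here is a minimal Python sketch of it. The addresses, names and data structures below are invented purely for illustration; this is not how FOS actually implements FCR.

    ```python
    # Illustrative sketch only: each edge fabric keeps its own fabric services and
    # only ever sees a local "proxy" entry for a device that really lives in the
    # remote edge fabric. All IDs and names here are made up.
    edge_fabric_a = {
        "real_devices":  {"0x010100": "local array RDR port"},
        "proxy_devices": {"0x02f001": "proxy for the remote array RDR port"},  # imported via FCR
    }

    # The backbone (here, FCIP) fabric only routes frames between EX_Ports; edge
    # fabric services (name server, zoning, FSPF) terminate at the EX_Port, so a
    # WAN fault disturbs the backbone, not the edge fabrics.
    backbone_routes = {
        ("edge_a", "0x02f001"): ("edge_b", "0x020200"),  # proxy ID -> real remote port
    }

    def forward(src_fabric, dest_id):
        """A local device talks to its local proxy; FCR forwards across the backbone."""
        return backbone_routes.get((src_fabric, dest_id), "no route (backbone fault)")

    print(forward("edge_a", "0x02f001"))   # ('edge_b', '0x020200')
    ```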

    Now on with it…
    I’m not so sure that an IR licence is the best recommendation whenever going over the WAN. This falls firmly into the category of “It depends”. Here are the situations in which an IR license is really needed:

    Extending Production Fabrics
    When connecting production fabrics on each side, which is not a foregone conclusion. Let’s say you have a DS8800 or XIV; it is best practice to connect the channel extenders directly to the array and not go through the production fabric.

    Tape
    When you need ubiquitous connectivity for tape, it is best to go through a local fabric and the IR license is of value.

    IBM SVC
    When using IBM SVC, ubiquitous connectivity is required; therefore, the channel extenders will connect via the production fabric, and the production fabric should be isolated using FCR.

    No Dedicated RDR Array Ports
    When you have an array that does not have dedicated RDR (Remote Data Replication) ports, you need to connect the shared port to a fabric so that it sees not only the hosts but also the channel extender; IR would be useful here. If a port can be dedicated to RDR then that is best practice.

    If there is no need to connect to the production fabrics, then IR is not really required and only incurs additional expense. BTW, RDR and tape are not typically considered production, although they may be connected via a production fabric.

    The license that is really needed almost every time is the AEX (Advanced Extension License). Almost never is this not needed. AEX gets you FCIP Trunking and ARL (Adaptive Rate Limiting). If one Ethernet interface from a 2 box Brocade 7800/FX solution is going into a single WAN link that is not shared (dedicated BW), then it is possible to not use the AEX license.

    Let’s look at the example diagram above. ARL is needed when more than one connection from the 7800/FX goes into the IP network and shares one or more WAN links with limited available bandwidth; ARL moderates the use of that limited bandwidth (BW). It also permits the use of that BW when a connection from the 7800/FX is offline (bad optic, broken cable, maintenance, configuration change…). In a 4 box solution it is even more important, because there are usually 4 connections for box redundancy plus link redundancy. With static rate limiting, the rate limiter would not be able to adapt to failure conditions and there would be no resiliency in the FCIP infrastructure.
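    To illustrate the ARL idea, here is a rough Python sketch of a floor/ceiling rate model that redistributes a shared WAN budget when a circuit drops. The numbers and logic are purely illustrative; this is not the 7800’s actual algorithm.

    ```python
    # Rough sketch of adaptive rate limiting (ARL): each FCIP circuit has a configured
    # floor and ceiling, and the surviving circuits adapt toward the shared WAN budget
    # when a peer circuit goes offline. Figures are illustrative only.
    WAN_BUDGET_MBPS = 1000   # assumed dedicated WAN bandwidth

    circuits = {
        "circuit0": {"floor": 250, "ceiling": 1000, "online": True},
        "circuit1": {"floor": 250, "ceiling": 1000, "online": True},
    }

    def arl_rates(circuits, wan_budget):
        """Split the WAN budget across online circuits, clamped to each floor/ceiling."""
        online = [name for name, c in circuits.items() if c["online"]]
        if not online:
            return {}
        share = wan_budget / len(online)
        return {name: max(circuits[name]["floor"], min(share, circuits[name]["ceiling"]))
                for name in online}

    print(arl_rates(circuits, WAN_BUDGET_MBPS))   # both up: 500 Mbps each
    circuits["circuit1"]["online"] = False        # bad optic, broken cable, maintenance...
    print(arl_rates(circuits, WAN_BUDGET_MBPS))   # the survivor ramps up to 1000 Mbps
    ```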

    FCIP Trunking is a mechanism that allows multiple FCIP circuits to form a single ISL from the perspective of the fabrics. A single VE_Port is used on each side of the FCIP Trunk. This gets you the following: aggregate BW; Lossless Link Loss (LLL, the ability to recover data that was in flight when a backhoe takes out a fiber path); automated fail-over and fail-back; IOD (In-Order Delivery); and no FSPF routing changes. In the case above, the two red circuits from a 7800 would be FCIP Trunked into one FCIP ISL. Since each circuit can be assigned to a different GE interface, there can be an aggregate of 2 Gbps of FCIP BW. If one of those links were lost, the data lost in flight would be resent over the remaining link and placed back in order before being sent to the ULP (Upper Layer Protocol).
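    As a toy illustration of the Lossless Link Loss and In-Order Delivery behaviour described above, here is a simplified Python sketch. The round-robin spraying, retransmission and re-ordering logic is deliberately minimal and is not the real implementation.

    ```python
    # Toy model: sequenced frames are sprayed across two circuits of one FCIP trunk.
    # Anything in flight on a failed circuit is re-sent over the surviving circuit and
    # everything is released to the upper layer protocol strictly in sequence order.
    def send_over_trunk(frames, failed_circuit=None):
        delivered = {}
        lost = []
        for seq, frame in enumerate(frames):
            circuit = seq % 2                  # round-robin across the two circuits
            if circuit == failed_circuit:
                lost.append((seq, frame))      # was in flight when the link dropped
            else:
                delivered[seq] = frame
        for seq, frame in lost:                # retransmit over the remaining circuit
            delivered[seq] = frame
        return [delivered[seq] for seq in sorted(delivered)]   # in-order delivery

    frames = ["frame%d" % i for i in range(6)]
    print(send_over_trunk(frames, failed_circuit=1))   # all six frames, still in order
    ```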

    Lastly, 2 box vs. 4 box solutions. Obviously, best practice is always a 4 box solution; however, this is not always practical for enterprises of every size. Nonetheless, it is never considered good practice to connect a single 7800 to both the A & B production fabrics. This is not a matter of the fabrics joining together and needing FCR to prevent that. That’s not it at all. It is a matter of a single Linux kernel running fabric services for both A & B up to that demarcation point. It is conceivable that a problem (Linux, HW, FOS, or human error) could cause both the A & B fabrics to fail at the same time. Yes, this means an entire SAN-wide outage and a very bad day for all those involved. This is not a problem if the FCIP fabric is separate from the production fabrics, only if connecting to both production fabrics. Best practice for a 2 box solution is not to connect to the production fabrics; BTW, that is what I believe is shown above: a completely separate RDR/Tape network. Or, as a last resort, only extend the “A” fabric and not the “B” until you get budget to buy 2 more boxes for “B”.

    Best Regards,
    Mark Detrick
    BCFP, BCAF, BCNE, CISSP, CCIE


    • Hi, I realise this post is about 15 months old, but great post and great comments nevertheless, and I’m hoping you may be able to assist with an issue I have.

      I have a dual-controller (canister is, I think, the terminology? Apologies, I’m new to IBM storage) V7000 at Site A, which has standard FC connectivity to 9222i switches (Fabric A & B). Each 9222i has an FCIP tunnel to a partner 9222i in Site B.

      The problem I am having is getting consistent synchronisation of my VMware VMFS LUNs. They just keep timing out! We are looking not only at the V7000, but also at VMware and the 9222i’s, but struggling to find a root cause.

      We also have one bare-metal Linux server with LUNs provisioned from the V7000, but this doesn’t seem to have any issues.

      Would you have any thoughts around this? Is this something you have seen before? We are wondering whether it is a specific OS-related issue.

      Any assistance greatly appreciated.


      • Common causes are lack of IP bandwidth or QoS, and not enough performance in the DR drives (e.g. SAS at production and NL-SAS/SATA at DR). It’s not so common for it to be host-specific. You could try switching to Global Mirror with Change Volumes. See the redbook for details. http://www.redbooks.ibm.com/redpieces/abstracts/sg247574.html
        Rgds, Jim


        • Thanks Jim, much appreciated. We are looking into various IP configuration and analysis options now.
          Rgds
          Dominic


          • Jim and Dominic, I’m using a pure FC link (DWDM) for DR, but with synchronous replication. I’m planning a link consolidation and want to use FCIP for that. Could you share your experience?


            • Ricardo,

              If you are using FCIP for synchronous replication, you will need to use Brocade 7800s. They are the only FCIP extension products designed for the low added latency that synchronous applications require, including compression and IPsec implemented in HW for speed and performance.

              You will need to provide more information about the number of links, the BW of each link, and the aggregate BW you are planning on. Will the Ethernet from the FCIP be going over the DWDM? What’s the distance?

              Will you be connecting directly to the extension devices or via a fabric?

              Best Regards,
              Mark


              • Mark, thanks for your reply.

                Being transparent: what do you think about MDS? And the new feature of the v7000 for IP replication? Are there comparison docs?

                I have two links for FC (4G each) using about 2G, and two links for Ethernet (1G each) using about 700 Mbps. I’d like to eliminate the two 1G links and use only the 4G links for FC and Ethernet.

                All links are DWDM and the distance is about 25km.

                Thanks again.

                Ricardo


                • Ricardo,

                  The MDS and the native IP in the v7000 (SANslide) are low performance solutions, which may work for you if you don’t need performance. For the money, the Brocade is clearly the best value for the performance and packs the most Extension technology.

                  The Cisco solution is very inefficient in its process of converting FC into FCIP and is not purpose-built HW like the Brocade, which is why the Cisco MDS has so much more added latency and is not really appropriate for synchronous applications. MDS does not have FCIP Trunking/Adaptive Rate Limiting (ARL) with Storage Optimized TCP like Brocade does. Brocade includes IPsec in the base unit (no added license cost).

                  The IBM v7000 native IP is included but has the least functionality, with limited architectures (compared to FC) and no compression or IPsec. IBM shows a throughput graph over various latencies for SANslide compared to traditional TCP, which is very misleading because it does not apply to the Brocade SO-TCP stack. Brocade SO-TCP outperforms SANslide at any latency. The Artificial Intelligence (AI) that SANslide claims is just marketing jibber jabber, because it is not doing anything different from what Brocade has been doing.

                  As for your consolidation… You have two 4G links that are being utilized at about 50%, and you want to consolidate the GE links into that free BW. Is the problem that you have to pay your service provider for each DWDM link? Is the DWDM not managed by your company?

                  If I’m understanding this correctly, to combine traffic everything would have to be either FC or Ethernet, and in this case it is only practical if everything is Ethernet. After GE, the next step up in BW is OC-48 (2.5 Gbps) or 10GE. Two OC-48s (5 Gbps) with FC compression may work well here. There may be other alternatives for cost-effective circuit sizes; you’ll have to ask your service provider about that. If your DWDM can accommodate OC-48 lambdas on the WAN side and GE interfaces on the LAN side, you can connect your current GE connections plus 2-4 GE connections from each Brocade 7800 (A & B fabrics/controllers). Connecting more than 2x GE interfaces would be for the purpose of using ARL, with or without metrics, to facilitate failover of BW if one of the Extension legs were to fail.
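                  As a back-of-the-envelope check of that consolidation, here is a small Python sketch. The link figures are the ones from this thread; the 2:1 compression ratio is only an assumption, since real ratios depend entirely on the data.

                  ```python
                  # Rough consolidation check. Link figures are from the thread above;
                  # the compression ratio is an assumption for illustration only.
                  fc_utilised_gbps = 2.0      # replication traffic on the two 4G FC links today
                  ge_utilised_gbps = 0.7      # traffic on the two 1G Ethernet links today
                  oc48_gbps = 2.5             # nominal bandwidth of one OC-48 lambda
                  assumed_compression = 2.0   # purely illustrative 2:1 ratio

                  fcip_after_compression = fc_utilised_gbps / assumed_compression
                  total_needed = fcip_after_compression + ge_utilised_gbps

                  print(f"Estimated WAN demand: {total_needed:.2f} Gbps")
                  print(f"Fits in one OC-48?    {total_needed <= oc48_gbps}")
                  print(f"Two OC-48s provide:   {2 * oc48_gbps:.2f} Gbps")
                  ```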

                  Use compression and IPsec on the 7800s. Put all 4 FCIP GE connections into an FCIP Trunk and you’ll get perfect load balancing, true BW aggregation, failover/failback, In-Order Delivery, and Lossless Link Loss. All benefits to the proper operation of IBM v7000. This will work very well in the environment you describe at 25 km.


            • In general, a move from direct to FCIP is likely to add latency. The issue with sync is that every drop of latency you introduce into the network is added to every production write I/O that is replicated, thereby affecting application performance and also creating back-pressure on the Storwize internal queues for other workloads. I expect some things in Storwize 7.3 to improve replication performance, but going from direct to FCIP with sync still risks slowing down your apps, so you might be best to pilot that rather than take a design-and-hope approach. The easy escape route from this problem is to look at moving to Global Mirror or GM with Change Volumes as part of your move to FCIP, if either of those approaches is acceptable.


              • I agree with Jim. I guess the question becomes: is an additional 0.28 milliseconds of RTT too much? That is the RTT added by passing through four 7800s, at about 70 microseconds per Brocade 7800 to go from FC to FCIP. Again, this is with HW compression and IPsec enabled.
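                For anyone who wants to sanity-check that figure, here is the arithmetic as a small Python sketch. The 70 microseconds per box is the figure quoted above; the ~5 microseconds per km of fibre (one way) is a common rule of thumb rather than a measured value.

                ```python
                # Latency budget sketch. The per-7800 figure is from the comment above;
                # the fibre propagation figure is a rule of thumb, not a measurement.
                per_7800_us = 70             # added per 7800 pass (FC -> FCIP)
                traversals_per_rtt = 4       # out through a 7800 at each site, back through the same pair
                distance_km = 25
                fibre_us_per_km_one_way = 5  # rule-of-thumb propagation delay in fibre

                added_rtt_us = per_7800_us * traversals_per_rtt            # 280 us ~ 0.28 ms
                fibre_rtt_us = distance_km * fibre_us_per_km_one_way * 2   # ~250 us, already there today

                print(f"RTT added by the 7800s: {added_rtt_us} us")
                print(f"Existing fibre RTT over {distance_km} km: {fibre_rtt_us} us")
                print(f"Each synchronous write sees roughly {added_rtt_us + fibre_rtt_us} us on the replication leg")
                ```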


  2. Hi, I want to implement site-to-site GM with two V7000s and Fabrics A & B on Brocade 300s in a meshed topology at each site (two ports from each canister to each fabric); this is a production SAN.
    Because of budget limitations we only got two 7800s, one for each site.
    I wonder if I can disconnect one port of each canister from Fabrics A & B and connect it directly to the 7800. This would give redundancy on the connections from the 7800 to both canisters, eliminating the need to connect the 7800 to both fabrics (as Mark suggested, creating a completely separate RDR network).

    Do you think this would work for the V7000?
    Any issue with reducing the number of paths to the host servers from 8 to 6?
    Do you think it could be a problem to have unbalanced connections to the fabrics from each canister (2 and 1 to A, and 1 and 2 to B)?

    Thanks in advance for the help.


    • With 6.4 or later you can now dedicate ports to GM like this. You still have a SPOF at each site when you have one router, but it means you don’t really need to buy the ‘Integrated Routing’ licences, because merging across FCIP doesn’t directly affect your host fabrics. So it should work out noticeably cheaper.
      I don’t really see an issue with the balance, but sharing all 4 ports across the fabrics can be technically better, because spreading the I/O across more ports can reduce queuing latency; lighter loads should not notice a difference.


  3. Hi. One month ago, for the first time in my life, I plugged an FC hard drive via a T-Card into a QLogic PCI card on my Linux PC to see the speed and fiber cable reach with my own eyes. I have been reading FC-related info and experimenting with RAID. This thread has good insights.

    My next topic to look up will be whether the SCSI protocol has target-to-target direct read and direct copy. My disk enclosures allow disk duplication without a PC. I need to look up whether such direct communication is standard SCSI.

    I will come back to this thread later.


  4. I agree that if you are connecting to the production SAN you need Integrated Routing, but since we are talking about replication traffic, which is very different from regular host/array FC traffic, I always prefer dedicated FE array ports connected directly to the extension switch, providing infrastructure for replication traffic only… Dedicated replication fabrics are the best, and even better if the IP link is also dedicated to replication only…

