This post in our series on storage for Hyper-V is devoted to the ways that a Hyper-V host can connect to storage. This will not be a how-to guide, but an inspection of the technologies.
Part 1 – Hyper-V storage fundamentals
Part 2 – Drive Combinations
Part 3 – Connectivity
Part 4 – Formatting and file systems
Part 5 – Practical Storage Designs
Part 6 – How To Connect Storage
Part 7 – Actual Storage Performance
Don’t Overdo It
The biggest piece of advice I hope all readers take away from this article is: do not over-architect storage connectivity. As you saw in part one, hard drives are slow. Even when we get a bunch of them working in parallel, they’re still slow. As you saw in my anecdote in part two, even people who should know better will expend a lot of effort trying to create ultra-fast connections to ultra-slow hard drives. Unless you have racks and cabinets brimming with hard drive enclosures and SANs, installing 16Gb/s Fibre Channel switches and 10 GbE switches to connect to storage is a waste of money.
Your primary architectural goals for storage connectivity are stability and redundancy. As for speed, figure out how much data your host systems need to move and how much your drive systems are capable of moving before you even start to worry about how fast your connectivity is. I guarantee that most people will be surprised to see just how little they actually need.
As a point of comparison, I managed a Hyper-V Server system with about 35 active virtual machines that included Exchange, multiple SQL Servers, a corporate file server, and numerous application servers for about 300 users. There was a single backing SAN with 15 disks and a pair of 1GbE connections. The usage monitor indicated that the iSCSI link was generally near flat and almost never exceeded 1Gb/s in production hours. The only time it got any real utilization was during backup windows. Even then, it was only as full as the backup target could handle. My storage network almost never reached its bandwidth capacity.
Fibre Channel (FC) is a fast and dependable way to connect to external storage. Such external storage will almost exclusively be mid- to high-end SAN equipment. The FC protocol is designed to be completely lossless, which means there is very little overhead. This allows FC communications to run at very nearly the maximum rated speed of the equipment (4, 8, and 16 Gbps are the most common speeds).
Historically, FC was the premier method to connect to storage. External storage was considered a top-shelf technology under any circumstances, and nothing else had anything near FC’s speed and reliability. These are still true today, but Ethernet is starting to close the gap rapidly. The issue is that FC has always been very expensive. It gets even more expensive when FC switches are required. Many FC SANs allow multiple systems to connect directly without going through a switch, but this limits the number of connected hosts and removes a few options. For instance, you might want your hosts to connect to multiple SANs or you might want to employ some form of SAN-level replication. These are going to need switching hardware.
Another issue with FC is that it’s really its own world. It requires understanding and implementation of World Wide Names and masking and other terms and technologies. They aren’t terribly difficult, but you can’t really bring in other knowledge and apply it to FC’s idiosyncracies. A serious side effect of this is that you can’t really find a lot of generic information on connecting a host to an FC device.
You pretty much need to work off of manufacturer’s documentation. For a new system, that’s not such a big deal; it probably (hopefully!) comes with the equipment. If you inherit a device, especially an older one that the manufacturer has no interest or incentive in helping you with, you might be facing an uphill battle. As the data center becomes flatter and administrators are expected to know more about a wider range of technologies, having portability of knowledge is more important.
When configuring FC, the best advice I can give you is to follow the manufacturer’s directions.
You do want to avoid using any equipment that converts the FC signal. You can find FCoE (Fibre Channel over Ethernet) devices, but please don’t use them for storage. The FC equipment believes that its signal will be received and doesn’t deal with loss very well, so converting FC into Ethernet and back can cause serious headaches — read that as “data loss”.
iSCSI is, in some ways, a successor to FC (although it’s a bit of a stretch to think that it will replace FC). It’s not as fast or as reliable, but it’s substantially cheaper. Instead of requiring expensive FC switches and archaic configurations, it works on plain, standard Ethernet switches. If you understand IP at all, you already know enough to begin deploying and administering iSCSI.
What hurts iSCSI is that it relies on the TCP/IP protocol. This protocol was designed specifically to operate across unreliable networks, and as such has a great deal of overhead to enable detection and retransmission of lost packets and resequencing of packets that arrive out of order. Even if all packets show up and are in the correct sequence, that overhead is still present. This overhead exists in the form of control information added on to each packet and extra processing time in packaging and unpackaging transmitted data. It’s very easy to overstate the true impact of this overhead, but it’s foolish to pretend that it’s not there.
A pitfall that some administrators fall into is treating iSCSI traffic like all other TCP/IP traffic. Their configurations usually work, just not as well as they could. The most common mistake is allowing iSCSI traffic to cross a router. A very common piece of advice given is to segregate iSCSI devices and hosts into separate subnets. This makes sense. This is often extended into a recommendation that if the hosts and hardware have multiple NICs, to also place those into unique subnets. This quality of this advice is questionable, but if properly implemented, not harmful. The following image represents what this might look like.
Here, all switching is layer-2. What that means is that source and destination systems are separated solely by MAC address, which is in the header portion of the Ethernet frame where it’s easy for the switch to access. As long as the switch isn’t overloaded, layer-2 connectivity happens very rapidly. It’s so fast that it’s difficult to measure the impact of a single switch. The only real dangers here are overloading the switch, having too much broadcast traffic on the same network as the iSCSI traffic, or forcing iSCSI traffic to go through numerous switches.
It’s a lot harder to overload switches than most people think, even cheap switches, but an easy fix is to just use switching hardware that is dedicated to iSCSI traffic. Broadcast traffic can be reduced by putting iSCSI traffic into its own subnet. If you like, you can further isolate it by placing it into its own VLAN. Finally, try to use no more than one switch between your hosts and your storage devices.
A problem arises when iSCSI initiators (clients) connect to iSCSI targets (hosts) that are on different subnets. This connection can only be made through a router. This often happens when the SAN’s adapters are teamed, won’t allow their adapters to participate on separate subnets, or the configuring administrator inadvertently connects the initiator to the wrong target. One possible mis-configuration is illustrated below:
The problem here is a matter of both latency and loss. The second adapter in the host cannot connect directly to the SAN’s team IP address. People often look at this and think, well, if the source adapter knows the MAC address of the destination adapter, why doesn’t it just send it there using layer-2? The answer is: that’s not how TCP/IP works. The source system looks at the destination IP address, looks at its own address and subnet mask, and realizes that it needs a router to communicate.
The destination MAC address it’s going to use when it creates the Ethernet frame is that of its gateway, not the SAN’s teamed adapter. The layer-2 switch at the top will then send those frames to the gateway because, being only layer-2, it has no way to know that the ultimate destination is the SAN. When the gateway receives the data, it will have to dig below the Ethernet frame into the header of the TCP packet to determine the destination IP address. It then has to repackage the packet into a new frame with the source and destination MAC addresses replaced and send it back to the layer-2 switch.
Hopefully, it’s obvious that layer-3 communications (routing) are a more involved operation than layer-2 switching. Layer-3 delays are much easier to detect than layer-2, especially if the router is performing any form of packet inspection such as firewalling. Also, routers are much more inclined to drop packets than simple layer-2 switches are. Remember that in TCP/IP design, data loss is accepted as a given. As far as the router is concerned, dropping the occasional packet is not a big deal. For your storage connections, the extra delays and misplaced packets can accumulate to be a very big deal indeed.
1GbE vs. 10GbE iSCSI
10GbE is all the rage… for people who make money on equipment sales. 10GbE cards are not terribly expensive, but 10GbE switches are. Unless you’re absolutely certain that your storage can make decent use of 10GbE, you should favor multi-path 1GbE in your storage connectivity purchase decisions. Again: don’t overarchitect your storage connectivity.
The size of the standard Ethernet frame is 1538 or 1546 bytes, with 1500 bytes as the payload and the rest as Ethernet information, resulting in about 3% overhead on each frame. That’s not a great deal, but superior ratios are available by employing jumbo frames. These allow an Ethernet frame to be as large as 9216 bytes. The amount of overhead bytes remains the same, however, reducing it to a trivial fraction of the frame’s size.
In aggregate, this means that far fewer frames are required to transmit the same amount of data. Less total data crosses the wire, increasing throughput. Another benefit is that packet-processing is decreased, since fewer headers need to be packaged, processed, and unpacked. With modern adapters and switching hardware, the effects of the reduced processing aren’t as significant as in the past, but they are still present.
Configuring jumbo frames for a standard network adapter is just a matter of changing its properties. Configuring in Server Core or for Hyper-V can be somewhat more complicated.