Hyper-V and Networking Part 8: Load-Balancing Algorithms

We’ve had a long run of articles in this series that mostly looked at general networking technologies. Now we’re going to look at a technology that gets us closer to Hyper-V. Load-balancing algorithms are a feature of the network team, which can be used with any Windows Server installation, but is especially useful for balancing the traffic of several operating systems sharing a single network team.

The selected load-balancing method determines how the team utilizes its members for sending traffic. Before we go through the options, it’s important to reinforce that this is load-balancing. There isn’t a way to simply aggregate all the team members into a single unified pipe.

I will periodically remind you of this point, but keep in mind that the load-balancing algorithms apply only to outbound traffic. The connected physical switch decides how to send traffic to the Windows Server team. Some of the algorithms have a way to exert some influence over the options available to the physical switch, but the Windows Server team is only responsible for balancing what it sends out to the switch.

Hyper-V Port Load-Balancing Algorithm

This method is commonly chosen and recommended for all Hyper-V installations based solely on its name. This is a poor reason. The name wasn’t picked because it’s the automatic best choice for Hyper-V, but because of how it operates.

The operation is based on the virtual network adapters. In versions 2012 and prior, it was by MAC address. In 2012 R2, and presumably onward, it will be based on the actual virtual switch port. Distribution depends on the teaming mode of the virtual switch.

Switch-independent: Each virtual adapter is assigned to a specific physical member of the team. It sends and receives only on that member. Distribution of the adapters is just round-robin. The impact on VMQ is that each adapter gets a single queue on the physical adapter it is assigned to, assuming there are enough left.

Everything else: Virtual adapters are still assigned to a specific physical adapter, but this will only apply to outbound traffic. The MAC addresses of all these adapters appear on the combined link on the physical switch side, so it will decide how to send traffic to the virtual switch. Since there’s no way for the Hyper-V switch to know where inbound traffic for any given virtual adapter will be, it must register a VMQ for each virtual adapter on each physical adapter. This can quickly lead to queue depletion.

Recommendations for Hyper-V Port Distribution Mode

If you somehow landed here because you’re interested in teaming but you’re not interested in Hyper-V, then this is the worst possible distribution mode you can pick. It only distributes virtual adapters. The team adapter will be permanently stuck on the primary physical adapter for sending operations. The physical switch can still distribute traffic if the team is in a switch-dependent mode.
By the same token, you don’t want to use this mode if you’re teaming from within a virtual machine. It will be pointless.

Something else to keep in mind is that outbound traffic from a VM is always limited to a single physical adapter. For 10 Gb connections, that’s probably not an issue. For 1 Gb, think about your workloads.

For 2012 (not R2), this is a really good distribution method for inbound traffic if you are using the switch-independent mode. This is the only one of the load-balancing modes that doesn’t force all inbound traffic to the primary adapter when the team is switch-independent. If you’re using any of the switch-dependent modes, then the best determinant is usually the ratio of virtual adapters to physical adapters. 

The higher that number is, the better result you’re likely to get from the Hyper-V port mode. However, before just taking that and running off, I suggest that you continue reading about the hash modes and think about how it relates to the loads you use in your organization.

For 2012 R2 and later, the official word is that the new Dynamic mode universally supersedes all applications of Hyper-V port. I have a tendency to agree, and you’d be hard-pressed to find a situation where it would be inappropriate. That said, I recommend that you continue reading so you get all the information needed to compare the reasons for the recommendations against your own system and expectations.

Hash Load-Balancing Algorithms

The umbrella term for the various hash balancing methods is “address hash”. This covers three different possible hashing modes in an order of preference. Of these, the best selection is the “Transport Ports”. The term “4-tuple” is often seen with this mode. All that means is that when deciding how to balance outbound traffic, four criteria are considered. These are: source IP address, source port, destination IP address, destination port.

Each time traffic is presented to the team for outbound transmission, it needs to decide which of the team members it will use. At a very high level, this is just a round-robin distribution. But, it’s inefficient to simply set the next outbound packet onto the next path in the rotation. Depending on contention, there could be a lot of issues with stream sequencing. So, as explained in the earlier linked posts, the way that the general system works is that a single TCP stream stays on a single physical path. In order to stay on top of this, the load-balancing system maintains a hash table. A hash table is nothing more than a list of entries with more than one value, with each entry being unique from all the others based on the values contained in that entry.

To explain this, we’ll work through a complete example. We’ll start with an empty team passing no traffic. A request comes in to the team to send from a VM to the Techsupportpk web site. The team sends that packet out the first adapter in the team and places a record for it in a hash table:

Source IP         Source Port   Destination IP            Destination Port   Physical Adapter
(VM's address)    49152         (Techsupportpk address)   80                 1

Right after that, the same VM requests a web page from the Microsoft web site. The team compares it to the first entry:

Source IP         Source Port   Destination IP            Destination Port   Physical Adapter
(VM's address)    49152         (Techsupportpk address)   80                 1
(VM's address)    49153         (Microsoft address)       80                 ?

The source ports and the destination IPs are different, so it sends the packet out the next available physical adapter in the rotation and saves a record of it in the hash table. This is the pattern that will be followed for subsequent packets; if any of the four fields for an entry make it unique when compared to all current entries in the table, it will be balanced to the next adapter.

As we know, TCP “conversations” are ongoing streams composed of multiple packets. The client’s web browser will continue sending requests to the above systems. The additional packets headed to the Techsupportpk site will continue to match on the first hash entry, so they will continue to use the first physical adapter.
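To make the mechanism concrete, here is a minimal Python sketch of the behavior walked through above. The class, field names, and addresses are hypothetical illustrations for this article, not Windows’ actual implementation:

```python
class AddressHashTeam:
    """Toy model of 4-tuple address hashing: each unique
    (source IP, source port, destination IP, destination port) flow
    is pinned to one team member; new flows rotate round-robin."""

    def __init__(self, member_count):
        self.member_count = member_count
        self.hash_table = {}   # 4-tuple -> physical adapter index
        self.next_member = 0   # rotation pointer for unseen flows

    def pick_adapter(self, src_ip, src_port, dst_ip, dst_port):
        flow = (src_ip, src_port, dst_ip, dst_port)
        if flow not in self.hash_table:
            # New unique tuple: balance to the next adapter in rotation
            self.hash_table[flow] = self.next_member
            self.next_member = (self.next_member + 1) % self.member_count
        # Existing flow: stay on the same adapter to preserve sequencing
        return self.hash_table[flow]

team = AddressHashTeam(member_count=2)
first = team.pick_adapter("10.0.0.5", 49152, "203.0.113.10", 80)   # new flow -> adapter 0
second = team.pick_adapter("10.0.0.5", 49153, "203.0.113.20", 80)  # new tuple -> adapter 1
repeat = team.pick_adapter("10.0.0.5", 49152, "203.0.113.10", 80)  # same stream -> adapter 0
```

Note that the repeated lookup for the first stream lands on the same adapter: that is the whole point of keeping the hash table.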

IP and MAC Address Hashing

Not all communications have the capability of participating in the 4-tuple hash. For instance, ICMP (ping) messages only use IP addresses, not ports. Non-TCP/IP traffic won’t even have that. In those cases, the hash algorithm will fall back from the 4-tuple method to the most suitable of the 2-tuple matches. These aren’t as granular, so the balancing won’t be as even, but it’s better than nothing.
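That fallback can be sketched as a simple selection function, again with hypothetical packet fields (real traffic classification is considerably more involved):

```python
def hash_fields(packet):
    """Return the most granular tuple this packet supports.
    'packet' is a plain dict with hypothetical field names."""
    if "src_port" in packet and "dst_port" in packet:
        # TCP/UDP: full 4-tuple transport-ports hash
        return (packet["src_ip"], packet["src_port"],
                packet["dst_ip"], packet["dst_port"])
    if "src_ip" in packet:
        # IP traffic without ports (e.g. ICMP): 2-tuple of IP addresses
        return (packet["src_ip"], packet["dst_ip"])
    # Non-IP traffic: 2-tuple of MAC addresses
    return (packet["src_mac"], packet["dst_mac"])
```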

Recommendations for Hashing Mode

If you like, you can use PowerShell to limit the hash mode to IP addresses, which will allow it to fall back to MAC address mode. You can also limit it to MAC address mode. I don’t know of a good use case for this, but it’s possible. Just check the options on New- and Set-NetLbfoTeam. In the GUI, you can only pick “Address Hash” unless you’ve already used PowerShell to set a more restrictive option.
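For example, using the built-in NetLbfo cmdlets (the team and adapter names here are placeholders; these commands require Windows Server, so verify the parameter values against the cmdlet documentation for your version):

```powershell
# Create a switch-independent team that hashes on IP addresses only
New-NetLbfoTeam -Name "Team1" -TeamMembers "NIC1","NIC2" `
    -TeamingMode SwitchIndependent -LoadBalancingAlgorithm IPAddresses

# Restrict an existing team further, to MAC address hashing
Set-NetLbfoTeam -Name "Team1" -LoadBalancingAlgorithm MacAddresses
```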
For 2012 (not R2), this is the best solution for non-Hyper-V teaming, including teaming within a virtual machine. For Hyper-V, it’s good when you don’t have very many virtual adapters or when the majority of the traffic coming out of your virtual machines is varied in a way that would produce a high number of balancing hits. Web servers are likely to fit this profile.

In contrast to Hyper-V Port balancing, this mode will always balance outbound traffic regardless of the teaming mode. But, in switch-independent mode, all inbound traffic comes across the primary adapter. This is not a good combination for high quantities of virtual machines whose traffic balance is heavier on the receive side. This is part of the reason that the Hyper-V port mode almost always makes more sense in switch-independent mode, especially as the number of virtual adapters increases.

For 2012 R2, the official recommendation is the same as with the Hyper-V port mode: you’re encouraged to use the new Dynamic mode. Again, this is generally a good recommendation that I’m inclined to agree with. However, I still recommend that you keep reading so you understand all your options.

Dynamic Balancing

This mode is new in 2012 R2, and it’s fairly impressive. For starters, it combines features from the Hyper-V port and Address Hash modes. The virtual adapters are registered separately across physical adapters in switch independent mode so received traffic can be balanced, but sending is balanced using the Address Hash method. In switch independent mode, this gives you an impressive balancing configuration. This is why the recommendations are so strong to stop using the other modes. 

However, if you’ve got an overriding use case, don’t be shy about using one of the other modes. I suppose it’s possible that limiting virtual adapters to a single physical adapter for sending might have merit in some cases.

There’s another feature added by the Dynamic mode that its name is derived from. It makes use of flowlets. I’ve read a whitepaper that explains this technology. To say the least, it’s a dense work that’s not easy for mortals to follow. The simple explanation is that it is a technique that can break an existing TCP stream and move it to another physical adapter. Pay close attention to what that means: the Dynamic mode cannot, and does not, send a single TCP stream across multiple adapters simultaneously. The odds of out-of-sequence packets and encountering interim or destination connections that can’t handle the parallel data is just too high for this to be feasible at this stage of network evolution. What it can do is move a stream from one physical adapter to another.

Let’s say you have two 10 GbE cards in a team using Dynamic load-balancing. A VM starts a massive outbound file transfer and it gets balanced to the first adapter. Another VM starts a small outbound transfer that’s balanced to the second adapter. A third VM begins its own large transfer and is balanced back to the first adapter. The lone transfer on the second adapter finishes quickly, leaving two large transfers to share the same 10 Gb adapter. Using the Hyper-V port or any address hash load-balancing method, there would be nothing that could be done about this short of canceling a transfer and restarting it, hoping that it would be balanced to the second adapter. With the new method, one of the streams can be dynamically moved to the other adapter, hence the name “Dynamic”. Flowlets require the split to be made at particular junctions in the stream. It is possible for Dynamic to work even when a neat flowlet opportunity doesn’t present itself.
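As a rough illustration of the rebalancing idea, here is a toy Python sketch of that scenario. The function, flow names, and rates are all made up for this article; this is not the flowlet algorithm itself:

```python
def rebalance(flows, member_count):
    """Toy rebalance: 'flows' maps a flow name to (adapter index, rate).
    If load is uneven, move one whole flow from the busiest adapter to
    the least-loaded one. A stream is always moved in its entirety --
    it is never split across adapters simultaneously."""
    load = [0] * member_count
    for adapter, rate in flows.values():
        load[adapter] += rate
    busiest = max(range(member_count), key=lambda i: load[i])
    idlest = min(range(member_count), key=lambda i: load[i])
    if load[busiest] > load[idlest]:
        for name, (adapter, rate) in flows.items():
            if adapter == busiest:
                flows[name] = (idlest, rate)   # move the whole stream
                break
    return flows

# Scenario from the text: two large transfers stranded on adapter 0
# after the small transfer on adapter 1 finished (rates are made up)
flows = {"vm1-transfer": (0, 9000), "vm3-transfer": (0, 9000)}
flows = rebalance(flows, member_count=2)   # one flow moves to adapter 1
```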

Recommendations for Dynamic Mode

For the most part, Dynamic is the way to go. The reasons have been pretty well outlined above. For switch independent modes, it solves the dilemma of choosing Hyper-V port for inbound balancing against Address Hash for outbound balancing. For both switch independent and dependent modes, the dynamic rebalancing capability allows it to achieve a higher rate of well-balanced outbound traffic.
It can’t be stressed enough that you should never expect a perfect balancing of network traffic. 

Normal flows are anything but even or predictable, especially when you have multiple virtual machines working through the same connections. The Dynamic method is generally superior to all other load-balancing methods, but you’re not going to see perfectly level network utilization by using it.

Remember that if your networking goal is to enhance throughput, you’ll get the best results by using faster network hardware. No software solution will perform on par with dedicated hardware.
