We’ve had a long run of articles in this series that mostly looked at
general networking technologies. Now we’re going to look at a
technology that gets us closer to Hyper-V. Load-balancing algorithms are
a feature of the network team, which can be used with any Windows
Server installation, but is especially useful for balancing the traffic
of several operating systems sharing a single network team.
The selected load-balancing method is how the team decides to utilize
the team members for sending traffic. Before we go through these, it’s
important to reinforce that this is load-balancing. There isn’t a way to just aggregate all the team members into a single unified pipe.
I will periodically remind you of this point, but keep in mind that
the load-balancing algorithms apply only to outbound traffic. The
connected physical switch decides how to send traffic to the Windows
Server team. Some of the algorithms have a way to exert some influence
over the options available to the physical switch, but the Windows
Server team is only responsible for balancing what it sends out to the
switch.
Hyper-V Port Load-Balancing Algorithm
This method is commonly chosen and recommended for all Hyper-V
installations based solely on its name. This is a poor reason. The name
wasn’t picked because it’s the automatic best choice for Hyper-V, but
because of how it operates.
The operation is based on the virtual network adapters. In versions
2012 and prior, it was by MAC address. In 2012 R2, and presumably
onward, it will be based on the actual virtual switch port. Distribution
depends on the teaming mode of the virtual switch.
Switch-independent: Each virtual adapter is assigned
to a specific physical member of the team. It sends and receives only
on that member. Distribution of the adapters is just round-robin. The
impact on VMQ is that each adapter gets a single queue on the physical
adapter it is assigned to, assuming there are enough left.
Everything else: Virtual adapters are still assigned
to a specific physical adapter, but this will only apply to outbound
traffic. The MAC addresses of all these adapters appear on the combined
link on the physical switch side, so it will decide how to send traffic
to the virtual switch. Since there’s no way for the Hyper-V switch to
know where inbound traffic for any given virtual adapter will be, it
must register a VMQ for each virtual adapter on each physical adapter.
This can quickly lead to queue depletion.
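If you want to see how this mode is put in place, here is a minimal sketch of building a switch-independent team with the Hyper-V Port algorithm and attaching a virtual switch to the team interface. The team, member, and switch names ("VMTeam", "NIC1", "NIC2", "TeamedSwitch") are placeholders; substitute your own.

```powershell
# Sketch: create a switch-independent team that uses the Hyper-V Port algorithm,
# then bind a Hyper-V virtual switch to the team interface. Names are placeholders.
New-NetLbfoTeam -Name 'VMTeam' -TeamMembers 'NIC1','NIC2' `
    -TeamingMode SwitchIndependent -LoadBalancingAlgorithm HyperVPort

New-VMSwitch -Name 'TeamedSwitch' -NetAdapterName 'VMTeam' -AllowManagementOS $false
```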
Recommendations for Hyper-V Port Distribution Mode
If you somehow landed here because you’re interested in teaming but
you’re not interested in Hyper-V, then this is the worst possible
distribution mode you can pick. It only distributes virtual adapters.
The team adapter will be permanently stuck on the primary physical
adapter for sending operations. The physical switch can still distribute
traffic if the team is in a switch-dependent mode.
By the same token, you don’t want to use this mode if you’re teaming from within a virtual machine. It will be pointless.
Something else to keep in mind is that outbound traffic from a VM is
always limited to a single physical adapter. For 10 Gb connections,
that’s probably not an issue. For 1 Gb, think about your workloads.
For 2012 (not R2), this is a really good distribution method for
inbound traffic if you are using the switch-independent mode. This is
the only one of the load-balancing modes that doesn’t force all inbound
traffic to the primary adapter when the team is switch-independent. If
you’re using any of the switch-dependent modes, then the best
determinant is usually the ratio of virtual adapters to physical
adapters.
The higher that number is, the better the result you're likely to
get from the Hyper-V port mode. However, before just taking that and
running off, I suggest that you continue reading about the hash modes
and think about how it relates to the loads you use in your
organization.
For 2012 R2 and later, the official word is that the new Dynamic mode
universally supersedes all applications of Hyper-V port. I have a
tendency to agree, and you’d be hard-pressed to find a situation where
it would be inappropriate. That said, I recommend that you continue
reading so you get all the information needed to compare the reasons for
the recommendations against your own system and expectations.
Hash Load-Balancing Algorithms
The umbrella term for the various hash balancing methods is “address
hash”. This covers three different possible hashing modes in an order of
preference. Of these, the best selection is the “Transport Ports”. The
term “4-tuple” is often seen with this mode. All that means is that when
deciding how to balance outbound traffic, four criteria are considered.
These are: source IP address, source port, destination IP address,
destination port.
Each time traffic is presented to the team for outbound transmission,
it needs to decide which of the team members it will use. At a very
high level, this is just a round-robin distribution. But, it’s
inefficient to simply send the next outbound packet down the next path in
the rotation. Depending on contention, there could be a lot of issues
with stream sequencing. So, as explained in the earlier linked posts,
the way that the general system works is that a single TCP stream stays
on a single physical path. To keep track of this, the
load-balancing system maintains a hash table. A hash
table is nothing more than a list of entries, each holding several values,
with every entry being unique from all the others based on the values
it contains.
To explain this, we’ll work through a complete example. We’ll start
with an empty team passing no traffic. A request comes in to the team to
send from a VM with IP address 192.168.50.20 to the Techsupportpk web address.
The team sends that packet out the first adapter in the team and places
a record for it in a hash table:
Source IP     | Source Port | Destination IP  | Destination Port | Physical Adapter
192.168.50.20 | 49152       | 108.168.254.197 | 80               | 1
Right after that, the same VM requests a web page from the Microsoft web site. The team compares it to the first entry:
Source IP     | Source Port | Destination IP  | Destination Port | Physical Adapter
192.168.50.20 | 49152       | 108.168.254.197 | 80               | 1
192.168.50.20 | 49153       | 65.55.57.27     | 80               | ?
The source ports and the destination IPs are different, so it sends
the packet out the next available physical adapter in the rotation and
saves a record of it in the hash table. This is the pattern that will be
followed for subsequent packets; if any of the four fields for an entry
make it unique when compared to all current entries in the table, it
will be balanced to the next adapter.
As we know, TCP “conversations” are ongoing streams composed of
multiple packets. The client’s web browser will continue sending
requests to the above systems. The additional packets headed to the Techsupportpk site will continue to match on the first hash entry, so they will
continue to use the first physical adapter.
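To make the idea concrete, here is a small PowerShell sketch of the behavior just described. It is not the actual teaming implementation, only an illustration of a flow table keyed on the 4-tuple: a new, unique tuple rotates to the next team member, while a repeated tuple stays on the member it was already assigned. The adapter names are hypothetical.

```powershell
# Illustration only: flow table keyed on the 4-tuple. New unique tuples rotate to
# the next member; repeated tuples reuse their member, so a TCP stream stays on
# one physical path.
$teamMembers = @('NIC1', 'NIC2')   # hypothetical team member names
$flowTable   = @{}
$script:next = 0

function Select-TeamMember {
    param([string]$SrcIP, [int]$SrcPort, [string]$DstIP, [int]$DstPort)

    $key = "$SrcIP|$SrcPort|$DstIP|$DstPort"
    if (-not $flowTable.ContainsKey($key)) {
        # New flow: assign the next adapter in the rotation and remember it.
        $flowTable[$key] = $teamMembers[$script:next % $teamMembers.Count]
        $script:next++
    }
    return $flowTable[$key]
}

Select-TeamMember '192.168.50.20' 49152 '108.168.254.197' 80   # NIC1 (new flow)
Select-TeamMember '192.168.50.20' 49153 '65.55.57.27'     80   # NIC2 (new flow)
Select-TeamMember '192.168.50.20' 49152 '108.168.254.197' 80   # NIC1 (same stream)
```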
IP and MAC Address Hashing
Not all communications have the capability of participating in the
4-tuple hash. For instance, ICMP (ping) messages only use IP addresses,
not ports. Non-TCP/IP traffic won’t even have that. In those cases, the
hash algorithm will fall back from the 4-tuple method to the most
suitable of the 2-tuple matches. These aren’t as granular, so the
balancing won’t be as even, but it’s better than nothing.
Recommendations for Hashing Mode
If you like, you can use PowerShell to limit the hash mode to IP
addresses, which will allow it to fall back to MAC address mode. You can
also limit it to MAC address mode. I don’t know of a good use case for
this, but it’s possible. Just check the options on New- and
Set-NetLbfoTeam. In the GUI, you can only pick “Address Hash” unless
you’ve already used PowerShell to set a more restrictive option.
For 2012 (not R2), this is the best solution in non-Hyper-V teaming,
including teaming within a virtual machine. For Hyper-V, it's good when
you don't have very many virtual adapters, or when the majority of the
traffic coming out of your virtual machines is highly varied in a way
that produces a high number of distinct hash entries. Web servers
are likely to fit this profile.
In contrast to Hyper-V Port balancing, this mode will always
balance outbound traffic regardless of the teaming mode. But, in
switch-independent mode, all inbound traffic comes across the primary
adapter. This is not a good combination for high quantities of virtual
machines whose traffic balance is heavier on the receive side. This is part
of the reason that the Hyper-V port mode almost always makes more sense
in switch-independent mode, especially as the number of virtual
adapters increases.
For 2012 R2, the official recommendation is the same as with the
Hyper-V port mode: you're encouraged to use the new Dynamic mode. Again,
this is generally a good recommendation, and one that I'm inclined to
agree with. However, I still recommend that you keep reading so you
understand all your options.
Dynamic Balancing
This mode is new in 2012 R2, and it’s fairly impressive. For
starters, it combines features from the Hyper-V port and Address Hash
modes. The virtual adapters are registered separately across physical
adapters in switch independent mode so received traffic can be balanced,
but sending is balanced using the Address Hash method. In switch
independent mode, this gives you an impressive balancing configuration.
This is why the recommendations are so strong to stop using the other
modes.
However, if you’ve got an overriding use case, don’t be shy about
using it. I suppose it’s possible that limiting virtual adapters to a
single physical adapter for sending might have some merits in some
cases.
There’s another feature added by the Dynamic mode that its name is derived from. It makes use of flowlets.
I’ve read a whitepaper that explains this technology. To say the least,
it’s a dense work that’s not easy for mortals to follow. The simple
explanation is that it is a technique that can break an existing TCP
stream and move it to another physical adapter. Pay close attention to
what that means: the Dynamic mode cannot, and does not, send a single
TCP stream across multiple adapters simultaneously. The odds of
out-of-sequence packets and of encountering interim or destination
connections that can't handle the parallel data are just too high for
this to be feasible at this stage of network evolution. What it can do
is move a stream from one physical adapter to another.
Let’s say you have two 10 GbE cards in a team using Dynamic
load-balancing. A VM starts a massive outbound file transfer and it gets
balanced to the first adapter. Another VM starts a small outbound
transfer that’s balanced to the second adapter. A third VM begins its
own large transfer and is balanced back to the first adapter. The lone
transfer on the second adapter finishes quickly, leaving two large
transfers to share the same 10 Gb adapter. Using the Hyper-V port or any
address hash load-balancing method, there would be nothing that could
be done about this short of canceling a transfer and restarting it,
hoping that it would be balanced to the second adapter. With the new
method, one of the streams can be dynamically moved to the other
adapter, hence the name “Dynamic”. Flowlets require the split to be made
at particular junctions in the stream. It is possible for Dynamic to
work even when a neat flowlet opportunity doesn’t present itself.
Recommendations for Dynamic Mode
For the most part, Dynamic is the way to go. The reasons have been
pretty well outlined above. For switch independent modes, it solves the
dilemma of choosing Hyper-V port for inbound balancing against Address
Hash for outbound balancing. For both switch independent and dependent
modes, the dynamic rebalancing capability allows it to achieve a higher
rate of well-balanced outbound traffic.
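If you're on 2012 R2 and want to follow that recommendation, moving an existing team over is a one-liner; 'Team1' below is a placeholder name.

```powershell
# Switch an existing team to the Dynamic load-balancing algorithm and verify.
Set-NetLbfoTeam -Name 'Team1' -LoadBalancingAlgorithm Dynamic
Get-NetLbfoTeam -Name 'Team1' | Select-Object Name, TeamingMode, LoadBalancingAlgorithm
```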
It can’t be stressed enough that you should never expect a perfect
balancing of network traffic.
Normal flows are anything but even or
predictable, especially when you have multiple virtual machines working
through the same connections. The Dynamic method is generally superior
to all other load-balancing methods, but you're not going to see perfectly
level network utilization by using it.
Remember that if your networking goal is to enhance throughput,
you’ll get the best results by using faster network hardware. No
software solution will perform on par with dedicated hardware.