Physical Processors are Never Assigned to Specific Virtual Machines
This is the most important note. Assigning 2 vCPUs to a system does
not mean that Hyper-V plucks two cores out of the physical pool and
permanently marries them to your virtual machine. I’ve seen IBM systems
that do something like this, but I don’t believe that any other
hypervisor does. Hyper-V certainly doesn’t. You can’t actually assign a
physical core to a VM at all. So, does this mean that a vendor’s request to dedicate a core just can’t be met? Well, not exactly. More on that toward the end.
Start by Understanding Operating System Processor Scheduling
Let’s kick this off by looking at how CPUs are used in regular Windows. Here’s a shot of my Task Manager screen. Nothing fancy, right? It should look familiar.
Now, back when computers never, or almost never, came in multi-CPU or multi-core boxes, we all knew that computers couldn’t really multitask. They had one CPU with one core, so there was only one possible thread of execution. But aside from the fancy graphical updates, Task Manager then looked pretty much like Task Manager now. You had a long list of running processes, each with a metric indicating what percentage of the CPU’s time it was using.
Then, as now, each line item you see is a process (or, new in recent Task Manager versions, a process group). A process might consist of one or many threads. A thread is nothing more than a sequence of CPU instructions (key word: sequence).
What happens is that the operating system stops a running thread, preserves its state, and then starts another thread (in Windows, this began with Windows 95 and NT). After a bit of time, it repeats those operations for the next thread. Remember that this is pre-emptive multitasking, meaning that it is the operating system that decides when a new thread will run. A thread can beg for more time, and you can set priorities that affect where a process goes in line, but the OS is in charge of thread scheduling.
The only difference today is that you have multiple cores and/or
multiple CPUs in practically every system (as well as hyper-threading in
Intel processors), so Windows can actually multi-task now.
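If you want to see pre-emption in action, here’s a quick sketch in Python (my own toy illustration, nothing Hyper-V-specific): two busy loops that never yield on purpose, yet both make progress because the scheduler keeps swapping them on and off the CPU.

```python
import threading
import time

counters = {"a": 0, "b": 0}

def work(name):
    # No sleep(), no yield: this loop never volunteers to stop. The scheduler
    # (the OS, plus the Python interpreter's own switching) pre-empts it anyway.
    end = time.time() + 1.0
    while time.time() < end:
        counters[name] += 1

threads = [threading.Thread(target=work, args=(n,)) for n in counters]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both counters advance even though neither thread ever yielded on purpose.
print(counters)
```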
Taking These Concepts to the Hypervisor
Because of its role as a thread manager, Windows can be called a
“supervisor” (very old terminology that you really never see anymore): a
system that manages processes that are made up of threads. Hyper-V is a
hypervisor: a system that manages supervisors that manage processes
that are made up of threads. Pretty easy to understand, right?
Task Manager doesn’t work the same way for Hyper-V, but the same
thing is going on. There is a list of partitions, and inside those
partitions are processes and threads. The thread scheduler works pretty
much the same way. What follows is a rethought version of the original
image that was submitted for the book, changed to avoid plagiarism:
Of course, there are always going to be a lot more than just nine
threads going at any given time. They’ll be queued up in the thread
scheduler.
What About Processor Affinity?
You probably know that you can affinitize threads in Windows so that they always run on a particular core or set of cores. As far as I know, there’s no way to do that in Hyper-V with vCPUs. Doing so would be of
questionable value anyway; dedicating a thread to a core is not the same
thing as dedicating a core to a thread, which is what many people
really want to try to do. You can’t prevent a core from running other
threads in the Windows world.
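For comparison, here’s roughly what process-level affinity looks like in plain Windows. This is just a sketch that assumes Python and the psutil package are available, and note what it does and doesn’t do: it pins the process to a core, it does not keep anything else off that core.

```python
import psutil

# Pin the current process to core 0. Affinity restricts where THIS process
# may run; it does not reserve core 0 or keep other threads off of it.
me = psutil.Process()
print("Before:", me.cpu_affinity())   # e.g. [0, 1, 2, 3]
me.cpu_affinity([0])                  # now only core 0 is allowed
print("After:", me.cpu_affinity())    # [0]
```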
How Does Thread Scheduling Work?
The simplest answer is that Hyper-V makes the scheduling decision at the hypervisor level; it doesn’t really give the guests any direct input into it. Guest operating systems decide which of their own threads they wish to run on their vCPUs. The image I presented is necessarily an oversimplification, as
it’s not simple first-in-first-out. NUMA plays a role, for instance.
Really understanding this topic requires a fairly deep dive into some
complex ideas, and that level of depth is not really necessary for most
administrators.
The first thing that matters is that (affinity aside) you never know
where any given thread is going to actually execute. A thread that was
paused to yield CPU time to another thread may very well be assigned to
another core when it is resumed.
Did you ever wonder why an application sits right at 50% on a dual-core system, with each core looking like it’s running at 50% usage? That behavior indicates a single-threaded application. Each time it is scheduled, it consumes 100% of whichever core it lands on. The next time it’s scheduled, it may go to the other core and consume 100% there.
When the performance is aggregated for Task
Manager, that’s an even 50% utilization for the app. Since the cores are
handing the thread off at each scheduling event and are mostly idle
while the other core is running that app, they amount to 50% utilization
for the measured time period. If you could reduce the period of
measurement to capture individual time slices, you’d actually see the
cores spiking to 100% and dropping to 0% (or whatever the other threads
are using) in an alternating pattern.
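If you want to watch this happen, here’s a rough sketch (Python with the psutil package, purely my own illustration) that burns one CPU-bound thread while sampling per-core usage.

```python
import threading
import psutil

def burn():
    # One CPU-bound thread: at any instant it can only saturate one core.
    while True:
        pass

threading.Thread(target=burn, daemon=True).start()

# Sample per-core utilization once per second for ten seconds. Averaged over
# a second, the load smears across the cores; shrink the interval and you
# start to see the alternating spikes instead.
for _ in range(10):
    print(psutil.cpu_percent(interval=1, percpu=True))
```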
What we’re really concerned with is the number of vCPUs assigned to a system and priority.
What Does the Number of vCPUs I Select Actually Mean?
You should first notice that you can’t assign more vCPUs to a virtual machine than you have physical cores in your host.
So, a virtual machine’s vCPU count means the maximum number of threads that it is allowed to run on physical cores at any given time. I can’t set that virtual machine to have more than two vCPUs because the host only has two cores. Therefore, there is nowhere for a third thread to be scheduled.
But, if I had a 24-core system and left this VM at 2
vCPUs, then it would only ever send a maximum of two threads up to the
hypervisor for scheduling. Other threads would be kept in the guest’s
thread scheduler (the supervisor), waiting their turn.
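If it helps, here’s a toy model of that cap (my own illustration, not how Hyper-V’s scheduler is actually written):

```python
from collections import deque

# Toy model only -- not how Hyper-V is actually written. The guest keeps its
# own run queue; per scheduling pass it can hand at most `vcpus` threads up
# to the hypervisor, and everything else waits inside the guest.
def schedule_pass(run_queue, vcpus):
    presented = [run_queue.popleft() for _ in range(min(vcpus, len(run_queue)))]
    run_queue.extend(presented)   # simple round-robin: go to the back of the line
    return presented

guest_queue = deque(["t1", "t2", "t3", "t4", "t5"])
for tick in range(3):
    print(f"pass {tick}: running {schedule_pass(guest_queue, vcpus=2)}")
# pass 0: running ['t1', 't2']
# pass 1: running ['t3', 't4']
# pass 2: running ['t5', 't1']
```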
But Can’t You Assign More Total vCPUs to all VMs than Physical Cores?
Absolutely. Not only can you, but you’re almost definitely going to.
It’s no different than the fact that I’ve got 40+ processes “running” on
my dual core laptop right now. I can’t actually run more than two
threads at a time, but I’m always going to have far more than two
threads scheduled.
Windows has been doing this for a very long time now,
and Windows is so good at it (usually) that most people don’t even
pause to consider just what’s going on. Your VMs (supervisors) will
bubble up threads to run and Hyper-V (hypervisor) will schedule them the
way (mostly) that Windows has been scheduling them ever since it
outgrew cooperative scheduling in Windows 3.x.
What’s The Proper Ratio of vCPU to pCPU/Cores?
This is the question that’s on everyone’s mind. I’ll tell you straight: in the generic sense, this question has no answer.
Sure, way back when, people said 1:1. Some people still say that
today. And you know, you can do it. It’s wasteful, but you can do it. I
could run my current desktop configuration on a quad 16-core server and
I’d never have any contention. But, I probably wouldn’t see much
performance difference. Why? Because almost all my threads sit idle
almost all the time. If something needs 0% CPU time, what does giving it
its own core do? Nothing, that’s what.
Later, the answer was upgraded to 8 vCPUs per 1 physical core. OK, sure, good.
Then it became 12.
And then the recommendations went away.
They went away because they were dumb. I mean, it was probably a good
rule of thumb that was built out of aggregated observations and
testing, but really, think about it. You know that mostly, operating
threads will be evenly distributed across whatever hardware is
available.
So then, the number of physical CPUs needed doesn’t depend on how many virtual CPUs there are. It depends entirely on what the operating threads need. And, even if you’ve got a bunch of heavy threads
going, that doesn’t mean their systems will die as they get pre-empted
by other heavy threads. It really is going to depend on how many other
heavy threads they wait for.
I’m going to let you in on a dirty little secret about CPUs: Every
single time a thread runs, no matter what it is, it drives the CPU at
100% (power-throttling changes the clock speed, not workload
saturation). The CPU is a binary device; it’s either processing or it
isn’t.
The 100% or 20% or 50% or whatever number you see is entirely a time measurement. If you see 100%, it means that the CPU was active across the whole measured span of time. 20% means it was running a thread 1/5th of the time and sitting idle the other 4/5ths.
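Put as arithmetic, the number is nothing more than busy time divided by measured time:

```python
# CPU "percent used" is just busy time divided by the measured interval.
def utilization(busy_ms: float, interval_ms: float) -> float:
    return 100.0 * busy_ms / interval_ms

print(utilization(busy_ms=1000, interval_ms=1000))  # 100.0 -> busy the whole interval
print(utilization(busy_ms=200, interval_ms=1000))   # 20.0  -> busy 1/5th of the interval
```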
What this means is that a single thread can’t actually
consume 100% of the CPU the way people think it can, because
Windows/Hyper-V will pre-empt it when it’s another thread’s turn. You
can actually have multiple “100%” CPU threads running on the same
system.
The problem is that a normally responsive system expects some
idle time, meaning that some threads will simply let their time slice go
by, freeing it up so other threads get CPU access more quickly. When
you have multiple threads always queuing for active CPU time, the
overall system becomes less responsive because the other threads have to
wait longer for their turns. Using additional cores will address this
concern as it spreads the workload out.
What this means is, if you really want to know how many physical
cores you need, then you need to know what your actual workload is going
to be. If you don’t know, then go with the 8:1 or 12:1 ratios, because you’ll probably be fine.
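If you like, those rules of thumb translate into quick sizing arithmetic (nothing Hyper-V-specific here, and the 16-core host is just an example):

```python
# Rule-of-thumb sizing arithmetic only; the 16-core host is just an example.
def max_vcpus(physical_cores: int, ratio: int) -> int:
    return physical_cores * ratio

host_cores = 16
print(max_vcpus(host_cores, ratio=8))    # 128 vCPUs across all VMs at 8:1
print(max_vcpus(host_cores, ratio=12))   # 192 vCPUs across all VMs at 12:1
```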
What About Reserve and Weighting (Priority)?
I don’t recommend you tinker with CPU settings unless you really
understand what’s going on. Let the thread scheduler do its job. Just like setting CPU priorities on threads in Windows can get the uninitiated into trouble in a hurry, fiddling with hypervisor vCPU settings can throw a wrench into the works. In fact, I’ll confess that I haven’t spent a great deal of time testing them, because I trust the hypervisor enough.
Let’s look at the config screen:
The first group of boxes is the reserve. The first box represents the
percentage that I want to set, and its actual meaning depends on how
many vCPUs I’ve given the VM. In this case, I have a 2 vCPU system on a
dual core host, so the two boxes will be the same. If I set 10 percent
reserve, that’s 10 percent of the total physical resources. If I drop
this down to 1 vCPU, then 10 percent reserve becomes 5 percent physical.
The second box, which is grayed out, will be calculated for you as you
adjust the first box.
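The math behind that grayed-out box is simple enough; here’s a sketch (the function name is mine):

```python
# The grayed-out box: the per-VM reserve scaled by the VM's share of the
# host's logical processors.
def host_reserve_percent(vm_reserve_pct: float, vm_vcpus: int, host_lps: int) -> float:
    return vm_reserve_pct * vm_vcpus / host_lps

print(host_reserve_percent(10, vm_vcpus=2, host_lps=2))   # 10.0 -> 2 vCPUs on a 2-LP host
print(host_reserve_percent(10, vm_vcpus=1, host_lps=2))   # 5.0  -> drop the VM to 1 vCPU
```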
The reserve is a hard minimum… sort of. If the total of all reserve
settings of all virtual machines on a given host exceeds 100%, then at
least one virtual machine isn’t going to start. But, if a VM’s reserve
is 0%, then it doesn’t count toward the 100% at all (seems pretty
obvious, but you never know). But, if a VM with a 20% reserve is sitting idle, then other processes are allowed to use up to 100% of the available processor power… until such time as the VM with the reserve starts asking for CPU time. Then, as CPU capacity becomes available, the reserved VM will be able to dominate up to 20% of the total computing power. Because time slices are so short, it’s effectively like
it always has 20% available, but it does have to wait like everyone
else.
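Here’s a toy version of that start-up rule (my own check, not Hyper-V’s actual admission logic):

```python
# Toy check: the total of all reserves (expressed as percent of the whole host)
# has to stay at or under 100%, or at least one VM isn't going to start.
# VMs with a 0% reserve simply add nothing to the total.
def all_vms_can_start(host_reserve_percents):
    return sum(host_reserve_percents) <= 100

print(all_vms_can_start([20, 30, 0, 0, 40]))   # True  -> 90% of the host reserved
print(all_vms_can_start([50, 40, 20]))         # False -> 110%; somebody stays off
```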
So, that vendor that wants a dedicated CPU? If you really want to
honor their wishes, this is how you do it. You enter whatever number in
the top box that makes the second box the equivalent processor power of
however many pCPUs/cores they think they need. If they want one whole
CPU and you have a quad core host, then make the second box show 25%. Do
you really have to? Well, I don’t know. Their software probably doesn’t
need that kind of power, but if they can kick you off support for not
listening to them, well… don’t get me in the middle of that. The real reason virtualization densities never hit what the hypervisor manufacturers say they can is software vendors’ arbitrary rules, but that’s a rant for another day.
The next two boxes are the limit. Now that you understand the
reserve, you can understand the limit. It’s a resource cap. It keeps a
greedy VM’s hands out of the cookie jar.
The final box is the weight. As indicated, this is relative. Every VM set to 100 (the default) has the same pull with the scheduler, but they’re all beneath any VMs set to 200, and so on and so forth. If
you’re going to tinker, this is safer than fiddling with reserves
because you can’t ever prevent a VM from starting by changing relative
weights. What the weight means is that when a bunch of VMs present
threads to the hypervisor thread scheduler at once, the higher weighted
VMs go first. That’s it, that’s all.
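Conceptually, it plays out something like this toy ordering (the VM names and weights are made up):

```python
# Toy ordering: when several VMs present threads at the same time, the
# higher-weight VMs get picked first. Names and weights are made up.
vms = [("web01", 100), ("sql01", 200), ("test01", 50)]

ready = sorted(vms, key=lambda vm: vm[1], reverse=True)
print([name for name, _weight in ready])   # ['sql01', 'web01', 'test01']
```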
But What About Hyper-Threading?
Hyper-Threading is an Intel-specific technology that lets a single
core process two separate instructions in parallel (called pipelines).
Neat, right? One problem: the pipelines run in lockstep.
If the instruction in pipeline one finishes before the one in pipeline two, pipeline one sits and does nothing. But, that second pipeline shows up
as another core. So the question is, do you count it? As far as I know,
the official response is: No, Hyper-Threading should not be counted
toward physical cores when considering hypervisor processing
capabilities. Me, I’m a little more lenient. It’s not quite as good as
another actual core, but it’s not useless either. Your mileage may vary.
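If you want to see the logical-versus-physical distinction on a given box, here’s a quick sketch using the psutil package (an assumption on my part; any similar tool works):

```python
import psutil

# Logical processors include the Hyper-Threading siblings; physical cores don't.
print("Physical cores:    ", psutil.cpu_count(logical=False))
print("Logical processors:", psutil.cpu_count(logical=True))
```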