Imagine you are driving a car, and you are stationary. There could be several reasons for this. You may be waiting to pick someone up, you may have stopped to take a phone call, or you may have stopped at a red light. In the first two cases (pick-up, phone call), you decided to stop the car to perform a task. In the third case, the red light is stopping you from doing something you want to do. You spend the whole time at the red light ready to move away as soon as the light turns green. The time you spend waiting at a red light is ready time. When a VM wants to use the processor but is prevented from doing so, it accumulates Ready Time. This has a direct impact on the performance of the VM.
Ready Time can be accumulated even if there are spare CPU MHz available. For any processing to happen, all the vCPUs assigned to the VM must be running at the same time. This means that if you have a 4 vCPU VM, all 4 vCPUs need available cores or hyperthreads to run. So the fewer vCPUs a VM has, the more likely it is to be able to get onto the processors. You can reduce contention by having as few vCPUs as possible in each VM. If you monitor CPU Threads, vCPUs and Ready Time for the whole Cluster, you’ll be able to see whether there is a correlation between increasing vCPU numbers and Ready Time.
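One way to check for that correlation is sketched below. It is a minimal illustration, not a feature of any monitoring tool: the function name `pearson` and both sample series are hypothetical and invented for the example, and you would substitute the per-interval figures collected for your own Cluster.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)

# Hypothetical samples, invented for illustration only (NOT measured data):
# vCPUs allocated per Thread in the Cluster, and Cluster Ready Time in seconds.
vcpus_per_thread = [1.0, 1.5, 2.0, 2.5, 3.0]
ready_seconds = [100, 180, 300, 520, 900]

r = pearson(vcpus_per_thread, ready_seconds)
print(f"correlation between vCPUs per Thread and Ready Time: {r:.2f}")
```

A value close to 1 would suggest that Ready Time rises as more vCPUs are allocated per Thread. It does not prove cause on its own, so it is best read alongside the charts.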
Here is a chart showing data collected for a VM. In each hour the VM is doing ~500 seconds of processing. The VM has 4 vCPUs. Despite doing just 500 seconds of processing, the Ready Time accumulated is between ~1200 and ~1500 seconds. So anything being processed spends up to 3 times as long waiting to be processed as it does actually being processed, i.e. 1 second of processing could take up to 4 seconds to complete.
Now let’s look at a VM on the same host, doing the same processing on the same day. Again we can see ~500 seconds of processing in each hour interval, but this time we only have 2 vCPUs. The Ready Time is ~150 seconds, i.e. 1 second of processing takes 1.3 seconds.
By reducing the number of vCPUs in the first VM, we could improve transaction times to around a third of their current time (1.3 seconds against 3.4 to 4 seconds for each second of processing).
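The arithmetic behind these figures can be sketched as follows. This is a simplified model that counts only CPU used time and Ready Time, using the approximate values read from the charts. The function name `elapsed_per_cpu_second` is made up for this example.

```python
def elapsed_per_cpu_second(used_s, ready_s):
    """Elapsed seconds needed to complete 1 second of processing,
    counting only CPU used time and Ready Time (a simplification)."""
    if used_s <= 0:
        raise ValueError("used_s must be positive")
    return (used_s + ready_s) / used_s

# Approximate hourly figures quoted in the text.
four_vcpu_low = elapsed_per_cpu_second(500, 1200)   # lower Ready Time figure
four_vcpu_high = elapsed_per_cpu_second(500, 1500)  # upper Ready Time figure
two_vcpu = elapsed_per_cpu_second(500, 150)

print(f"4 vCPU VM: {four_vcpu_low:.1f} to {four_vcpu_high:.1f} s per second of processing")
print(f"2 vCPU VM: {two_vcpu:.1f} s per second of processing")
print("2 vCPU time as a fraction of 4 vCPU time: "
      f"{two_vcpu / four_vcpu_high:.2f} to {two_vcpu / four_vcpu_low:.2f}")
```

The same function can be applied to any interval, as long as the used and ready figures cover the same period.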
Here’s a short video showing what happens inside the host as it schedules the vCPUs of the VMs onto the physical CPUs/cores. Clearly most hosts have more than 4 Threads that can be processed concurrently, but let’s keep this simple to follow.
Imagine the bottom of a “ready” VM in the display sliding across onto a Thread, and the top sliding across afterwards as other VMs move off the Threads. The VM is no longer rigid; it’s more of an elastic band.
1. VMs that are “ready” are moved onto the Threads.
2. There is not enough space for all the vCPUs in all the VMs, so some are left behind. (CPU utilization = 75%, capacity used = 100%)
3. If a single vCPU VM finishes processing, the spare Threads can now be used to process a 2 vCPU VM. (CPU utilization = 100%)
4. A 4 vCPU VM needs to process.
5. Even if the 2 single vCPU VMs finish processing, the 4 vCPU VM cannot use the CPU available.
6. And while it’s accumulating Ready Time, other single vCPU VMs are able to take advantage of the available Threads.
7. Even if we end up in a situation where only a single vCPU is being used, the 4 vCPU VM cannot do any processing. (CPU utilization = 25%)
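The steps above can be imitated with a small model. This is only a sketch of the simplified "all vCPUs at once" rule described in this post, on a 4 Thread host. It is not how the actual hypervisor scheduler is implemented, and the function `schedule` and the VM names are hypothetical.

```python
def schedule(running, waiting, threads=4):
    """Place waiting VMs onto free Threads under the strict rule that a VM
    runs only if ALL of its vCPUs fit on free Threads at the same moment.

    `running` and `waiting` are lists of (name, vcpus) tuples.
    Returns (running, still_waiting, utilization_percent).
    """
    running = list(running)
    free = threads - sum(vcpus for _, vcpus in running)
    if free < 0:
        raise ValueError("running VMs use more vCPUs than there are Threads")
    still_waiting = []
    for name, vcpus in waiting:
        if vcpus <= free:
            running.append((name, vcpus))
            free -= vcpus
        else:
            # Does not fit as a whole, so it keeps accumulating Ready Time.
            still_waiting.append((name, vcpus))
    utilization = 100 * (threads - free) // threads
    return running, still_waiting, utilization

# Steps 5 and 6: two Threads are free, the 4 vCPU VM cannot fit,
# but a single vCPU VM queued behind it can.
step6 = schedule([("pair", 2)], [("big", 4), ("single", 1)])
print(step6)

# Step 7: only one vCPU is in use, yet the 4 vCPU VM still has to wait.
step7 = schedule([("single", 1)], [("big", 4)])
print(step7)
```

In this model the 4 vCPU VM only gets onto the host when all 4 Threads are free at once, which is why it can sit in the ready state while utilization is as low as 25%.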
As mentioned when we discussed time slicing, improvements have been made in the area of co-scheduling with each VMware release. Amongst other things, the time allowed between the individual vCPUs of a VM being scheduled onto the physical CPUs has increased, allowing for greater flexibility in scheduling VMs with large numbers of vCPUs.
As a result, acceptable performance can be seen from larger VMs.
Along with Ready Time, there is also a Co-Stop metric. Ready Time can be accumulated against any VM. Co-Stop is specific to VMs with 2 or more vCPUs and relates to the time “stopped” due to Co-Scheduling contention, e.g. one or more vCPUs have been allocated a physical CPU, but the VM is stopped waiting for its other vCPUs to be scheduled.
Phil Bell
Consultant