Imagine you are driving a car, and you are stationary. There could be several reasons for this. You may be waiting to pick up someone, you may have stopped to take a phone call, or it might be that you have stopped at a red light. The 1st two of these (pick up, phone), you decided to stop the car to perform a task. But in the 3rd case, the red light is stopping you doing something you want to do. You spend the whole time at the red light ready to move away as soon as you get a green light. That time you spend waiting at a red light is ready time. When a VM wants to use the processor, but is stopped from doing so it accumulates ready time. This has a direct impact on the performance of the VM.
Ready Time can be accumulated even if there are spare CPU MHz available. For any processing to happen all the vCPUs assigned to the VM must be running at the same time. This means if you have a 4 vCPU VM, all 4 vCPUs need available cores or hyperthreads to run. So the fewer vCPUs a VM has, the more likely it is to be able to get onto the processors. You can reduce contention by having as few vCPUs as possible in each VM. And if you monitor CPU Threads, vCPUs and Ready Time for the whole Cluster, then you’ll be able to see if there is a correlation between increasing vCPU numbers and Ready Time.
Here is a chart showing data collected for a VM. In each hour the VM is doing ~500 seconds of processing. The VM has 4 vCPUs.
Despite just doing 500 seconds of processing, the ready time accumulated is between
~1200 and ~1500 seconds. So anything being processed spends 3 times as long waiting to be processed, as it does actually being processed. i.e. 1 second of processing could take 4 seconds to complete.
Now lets look at a VM on the same host, doing the same processing on the same day. Again we can see ~500 seconds of processing in each hour interval. But this time we only have 2vCPUs. The ready time is about ~150 seconds. i.e. 1 second of processing takes 1.3 seconds.
By reducing the number of vCPUs in the first VM, we could improve transaction times to somewhere between a quarter and a third of their current time.
Here’s a short video to show the effect of what is happening inside the host to schedule the physical CPUs/cores to the vCPUs of the VMs. Clearly most hosts have more than 4 consecutive threads that can be processed. But let’s keep this simple to follow.
Imagine the bottom of a “ready” VM displayed, sliding across to a thread and the top sliding across as other VMs move off the Threads. So the VM is no longer rigid it’s more of an elastic band.
1. VMs that are “ready” are moved onto the Threads.
2. There is not enough space for all the vCPUs in all the VMs. So some are left behind. (CPU Utilization = 75%, capacity used = 100%)
3. If a single vCPU VM finishes processing, the spare Threads can now be used to process a 2 vCPU vm. (CPU Utilization = 100%)
4. A 4 vCPU VM needs to process.
5. Even if the 2 single vCPU VMs finish processing, the 4 vCPU VM cannot use the CPU available.
6. And while it’s accumulating Ready Time, other single vCPU VMs are able to take advantage of the available Threads
7. Even if we end up in a situation where only a single vCPU is being used, the 4 vCPU VM cannot do any processing. (CPU utilization = 25%)
As mentioned when we discussed time slicing, improvements have been made in the area of co-scheduling with each release of VMware. Amongst other things the time between individual CPUs being scheduled onto the physical CPUs has increased, allowing for greater flexibility in scheduling VMs with large number of vCPUs.
Acceptable performance is seen from larger VMs.
Along with Ready Time, there is also a Co-Stop metric. Ready Time can be accumulated against any VM. Co-Stop is specific to VMs with 2 or more vCPUs and relates to the time “stopped” due to Co-Scheduling contention. E.g. One or more vCPUs has been allocated a physical CPU, but we are stopped waiting on other vCPUs to be scheduled.