
Wednesday, 5 July 2017

Understanding VMware Capacity - Why OS monitoring can be misleading, Time Slicing (2 of 10)

Following on from Monday's blog, the effect we saw between the OS and VMware is caused by time slicing. In a typical VMware host we have more vCPUs assigned to VMs than we have physical cores, a situation known as over-provisioning, and to some extent the original purpose of virtualization.



The processing time of the physical cores has to be shared among the vCPUs in the VMs. The more vCPUs we have, the less time each can spend on a core, and therefore the slower time passes for that VM. To keep the VM's clock in step, the extra timer interrupts are then sent in quick succession. So time passes slowly, and then very fast.
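As a rough illustration of this stall-then-catch-up behaviour, here is a small Python sketch (an illustrative model, not VMware's actual timekeeping mechanism) in which the guest clock only ticks while the vCPU is on a core, and the backlog of timer interrupts is delivered in a burst when it is rescheduled:

# Toy model of guest timekeeping under time slicing.
# All numbers are illustrative: a 100 Hz guest timer, and a vCPU that
# runs for 20 ms and is then off the core for 40 ms.
TICK_MS = 10        # guest expects a timer interrupt every 10 ms
SLICE_MS = 20       # time the vCPU spends on a core
DESCHED_MS = 40     # time the vCPU spends waiting off the core

real_ms = 0
guest_ms = 0

for _ in range(3):  # a few scheduling cycles
    # While running, timer interrupts arrive on time.
    for _ in range(SLICE_MS // TICK_MS):
        real_ms += TICK_MS
        guest_ms += TICK_MS
    print(f"on core : real={real_ms:4d} ms  guest={guest_ms:4d} ms")

    # While descheduled, real time passes but no ticks are delivered,
    # so the guest clock falls behind.
    real_ms += DESCHED_MS
    print(f"off core: real={real_ms:4d} ms  guest={guest_ms:4d} ms")

    # On reschedule the missed ticks are injected in quick succession
    # and the guest clock jumps forward to catch up.
    guest_ms += DESCHED_MS
    print(f"catch-up: real={real_ms:4d} ms  guest={guest_ms:4d} ms")

Any measurement the OS takes while its clock is lagging or catching up is sampled against a distorted idea of elapsed time.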


Time is no longer a constant, but the OS doesn’t know that. So the safest approach is to avoid using anything from the OS that involves an element of time.

Significant improvements have been made in this area over the releases of VMware. VMware Tools has a number of tricks to try and keep the OS metrics as close to reality as possible, and co-scheduling of CPUs has improved, but the basic concept remains in place. Later I will discuss how it can be OK to use averages and estimates when reporting on the future. When we have the choice of accurate data from VMware or less accurate data from the OS, I would suggest that taking accuracy where we can easily do so has to be the better option.

On Friday I'll be looking at the 5 key VMware metrics to monitor. In the meantime, take a look at the great selection of white papers and on-demand webinars on VMware in our Resources section. 


Phil Bell
Consultant



Monday, 8 June 2015

Top 5 Performance and Capacity Concerns for VMware - Time Slicing and Ready Time

The effect we saw between the OS and VMware, in my blog on Friday, is caused by time slicing.  

In a typical VMware host we have more vCPUs assigned to VMs than we have physical cores. The processing time of the cores has to be shared among the vCPUs. Cores are shared between vCPUs in time slices, with 1 vCPU on 1 core at any point in time.



More vCPUs lead to more time slicing. The more vCPUs we have, the less time each can spend on the core, and therefore the slower time passes for that VM. To keep the VM's clock in step, the extra timer interrupts are sent in quick succession. So time passes slowly, and then very fast.


More time slicing equals less accurate data from the OS. 

Anything that doesn’t relate to time, such as disc occupancy, should be OK to use.


Ready Time
Imagine you are driving a car and you are stationary. There could be several reasons for this: you may be waiting to pick someone up, you may have stopped to take a phone call, or it might be that you have stopped at a red light. In the first two cases (pick up, phone) you have decided to stop the car to perform a task. In the third, the red light is stopping you from doing something you want to do. In fact you spend the whole time at the red light ready to move away as soon as the light turns green. That time is ready time.
When a VM wants to use the processor but is stopped from doing so, it accumulates ready time, and this has a direct impact on performance.
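vCenter exposes ready time as a summation counter, in milliseconds accumulated per sample interval. Below is a minimal sketch of the commonly used conversion to a percentage, assuming the 20-second real-time sample interval (historical rollups use longer intervals such as 300 or 1800 seconds):

def ready_percent(ready_summation_ms, interval_s=20):
    # Percentage of the sample interval the VM spent ready but not running.
    # For a multi-vCPU VM the summation covers all vCPUs, so the result
    # can exceed 100; divide by the vCPU count for a per-vCPU figure.
    return ready_summation_ms / (interval_s * 1000) * 100

# Example: 2,000 ms of ready time in a 20 s sample is 10% ready time.
print(ready_percent(2000))   # 10.0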
For any processing to happen, all the vCPUs assigned to the VM must be running at the same time. This means that if you have a 4 vCPU VM, all 4 vCPUs need available cores or hyperthreads to run on. So the fewer vCPUs a VM has, the more likely it is to be able to get onto the processors. 

To avoid Ready Time
You can reduce contention by having as few vCPUs as possible in each VM. If you monitor CPU threads, vCPUs and Ready Time, you'll be able to see whether there is a correlation between increasing vCPU numbers and Ready Time on your systems.
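One simple way to test for that correlation is to export per-VM samples of vCPU count and ready time and compare them. A minimal sketch, assuming a hypothetical CSV export with vm, vcpus and ready_ms columns:

import pandas as pd

# Hypothetical per-VM samples: one row per VM per interval, with the
# VM's vCPU count and the ready time accumulated in that interval.
df = pd.read_csv("vm_ready_samples.csv")

# Average ready time by vCPU count: if bigger VMs queue for threads more
# often, the averages climb as the vCPU count increases.
print(df.groupby("vcpus")["ready_ms"].mean())

# A single correlation figure across all samples.
print(df["vcpus"].corr(df["ready_ms"]))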

Proportion of Time: 4 vCPU VM

Below is an example of a 4 vCPU VM, with each vCPU doing about 500 seconds' worth of real CPU time and accumulating about 1,000 seconds' worth of Ready Time.

For every 1 second of processing, the VM is waiting around 2 seconds to process, so it is spending almost twice as long waiting as it is processing. This is going to impact the performance experienced by any end user who is reliant on this VM.
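Using those approximate figures, the arithmetic is simple (the values here are the rough ones quoted above, not measured data):

cpu_used_s = 500    # approx. real CPU time in the interval
ready_s = 1000      # approx. ready time in the same interval

print(ready_s / cpu_used_s)              # ~2 seconds waiting per second of processing
print(ready_s / (cpu_used_s + ready_s))  # ~0.67: two thirds of its active time spent waiting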
Now let’s compare that to the proportion of time spent processing on a 2 vCPU VM. The graph below shows a 2 vCPU VM doing the same amount of work, around 500 seconds' worth of real CPU time, and as you can see the Ready Time is significantly less.

There are 3 states which the VM can be in:

Threads – being processed, having been allocated to a hardware thread.
Ready – wanting to process but unable to get onto a thread.
Idle – existing, but not needing to do anything at this time.

In the diagram below you can see that work has moved onto the threads to be processed and there is some headroom available. The work that is waiting to be processed requires 2 vCPUs, so it is unable to fit, which creates wasted space that we cannot use at this time.


We need a VM to move off before we can put the 2 vCPU VM onto threads and remain 100% busy.


In the meantime other VMs are arriving, and we now have a 4 vCPU VM accumulating Ready Time.



2 VMs move off, but the waiting 4 vCPU VM cannot move on as there are still not enough free threads available.


It has to wait and other work moves ahead of it to process.




Even when 3 threads are available it is still unable to process, and it will be 'queue jumped' by other VMs that require fewer vCPUs.
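As a toy illustration of that queue jumping, the sketch below applies the strict co-scheduling rule described above (all of a VM's vCPUs must be placed at once) to a queue of VMs on a host with 8 hardware threads. It is a deliberately simplified model, not the real ESXi scheduler, and the VM names and sizes are invented:

THREADS = 8
free = THREADS

# (name, vCPUs) in arrival order.
queue = [("vm-a", 2), ("vm-b", 2), ("vm-c", 2), ("vm-d", 4), ("vm-e", 1), ("vm-f", 2)]

for name, vcpus in queue:
    if vcpus <= free:
        free -= vcpus
        print(f"{name} ({vcpus} vCPU) dispatched, {free} threads free")
    else:
        print(f"{name} ({vcpus} vCPU) waits, accumulating Ready Time")

# Result: vm-d (4 vCPU) waits because only 2 threads are free, while the
# smaller vm-e slips into the remaining gap ahead of it.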


Hopefully that is a clear illustration of why it makes sense to reduce contention by having as few vCPUs as possible in each VM.
Ready Time impacts performance and needs to be monitored.

On Wednesday I'll be looking at Memory performance. In the meantime, don't forget to register for our 'Taking a trip down vSphere Memory Lane' webinar taking place on June 24th
http://www.metron-athene.com/services/webinars/index.html

Phil Bell
Consultant