Metron - Capacity Management: June 2014

Monday, 30 June 2014

Time Slicing - Top 5 performance & capacity concerns for VMware

As discussed in Friday's blog, the large difference between what the OS thinks is happening and what is really happening all comes down to time slicing.

In a typical VMware host we have more vCPUs assigned to virtual machines (VMs) than we do physical cores.

The processing time of the cores has to be shared among the vCPUs. Cores are shared between vCPUs in time slices, 1 vCPU to 1 core at any point in time.

More vCPUs lead to more time slicing. The more vCPUs we have, the less time each can be on a core, and therefore the slower time passes for that VM. To keep the VM's clock in step, extra timer interrupts are sent in quick succession to catch up, so from the VM's point of view time passes slowly and then very fast.
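To put rough numbers on this, here is a minimal sketch in Python. It assumes every vCPU is always ready to run and that the host shares core time evenly, which is a simplification of what a real scheduler does. The function name `core_share_per_vcpu` is mine, purely for illustration.

```python
def core_share_per_vcpu(physical_cores: int, total_vcpus: int) -> float:
    """Upper bound on the fraction of time one vCPU can be on a core.

    Simplifying assumptions (not from VMware): all vCPUs are always
    runnable and core time is divided evenly between them.
    """
    if physical_cores <= 0 or total_vcpus <= 0:
        raise ValueError("core and vCPU counts must be positive")
    # A vCPU can never hold more than one whole core.
    return min(1.0, physical_cores / total_vcpus)


if __name__ == "__main__":
    # Hypothetical host with 8 physical cores.
    for vcpus in (8, 16, 32):
        share = core_share_per_vcpu(8, vcpus)
        print(f"{vcpus} vCPUs on 8 cores: at most {share:.0%} of a core each")
```

Doubling the number of vCPUs on the same cores halves the time each one can spend running, which is the slowdown described above.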

A VM with multiple vCPUs is at a distinct disadvantage when scheduling CPU cycles. A 1 vCPU VM can execute instructions as soon as a single core is available. If a VM has 4 vCPUs, then it cannot execute any instructions until 4 cores are available. I'll be looking at this in more detail on Wednesday, when I'll be taking a look at Ready Time.

In the meantime, don't forget to sign up for our webinar 'Taking a Trip down VMware vSphere Memory Lane', which looks at how memory is used in a VMware vSphere environment: http://www.metron-athene.com/services/training/webinars/index.html
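The rule above can be sketched as a toy dispatcher in Python. This models only the simple "all vCPUs at once" rule as I have described it; the real ESXi scheduler is more sophisticated than this, and the function and VM names are hypothetical.

```python
from typing import List, Tuple


def dispatch(free_cores: int, waiting_vms: List[Tuple[str, int]]) -> List[str]:
    """Return the names of the VMs that can start in this time slice.

    Each waiting VM is a (name, vcpu_count) pair. Toy rule: a VM starts
    only if enough cores are free for all of its vCPUs at the same time.
    """
    started = []
    for name, vcpus in waiting_vms:
        if vcpus <= free_cores:
            started.append(name)
            free_cores -= vcpus  # those cores are now taken for this slice
    return started


if __name__ == "__main__":
    # Hypothetical moment on a 4-core host where 3 cores are free.
    queue = [("quad", 4), ("single-a", 1), ("single-b", 1)]
    print(dispatch(3, queue))  # ['single-a', 'single-b']
```

With three cores free, both 1 vCPU VMs get to run while the 4 vCPU VM keeps waiting, even though it was first in the queue.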

Phil Bell

Consultant

Friday, 27 June 2014

5 Top Performance and Capacity Concerns for VMware

Jamie will be hosting our webinar 'Taking a Trip down VMware vSphere Memory Lane' on July 23rd (http://www.metron-athene.com/services/training/webinars/index.html), so I thought it would be pertinent to take a look at the Top 5 Performance and Capacity Concerns for VMware in my blog series.

I’ll begin with Dangers with OS Metrics.

Almost every time we discuss data capture for VMware, someone asks whether we can capture the utilization of specific VMs by monitoring the OS. The simple answer is no.

In the example below the operating system sees that VM1 is busy 50% of the time, but VMware sees that VM1 was only on the core for half the time, so it was really busy for half of half the time, and accordingly reports that it is 25% busy.

Looking at the second VM running, VM2, both the operating system and VMware agree that it is in full use and report that it is 50% busy.
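The arithmetic behind those two figures can be written out in a few lines of Python. This is only a sketch: the `host_busy` name is made up for illustration, and the assumption that VM2 is never descheduled is one reading of the example rather than a measured value.

```python
def host_busy(guest_busy: float, scheduled_fraction: float) -> float:
    """Busy fraction as seen from the host.

    guest_busy: busy fraction the guest OS reports for itself.
    scheduled_fraction: share of wall-clock time the VM was actually
    on a core (assumed values below, not measurements).
    """
    return guest_busy * scheduled_fraction


if __name__ == "__main__":
    # VM1: the OS says 50% busy, but the VM only ran half the time.
    print(f"VM1: {host_busy(0.5, 0.5):.0%}")  # VM1: 25%
    # VM2: the OS says 50% busy and, by assumption, it was never descheduled.
    print(f"VM2: {host_busy(0.5, 1.0):.0%}")  # VM2: 50%
```

The OS has no way of knowing the second number, which is why its own figure cannot be trusted on a shared host.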

This is a good example of the disparity that can sometimes occur, so let's compare OS with VMware data.

Below is some data from a real VM.

The (top) dark blue line is the data captured from the OS, and the (bottom) light blue line is the data from VMware. While there clearly is some correlation between the two, at the start of the chart there is about 1.5% CPU difference. Given we’re only running at about 4.5% CPU, that is an overestimation by the OS of about 35%. At about 09:00 the difference is only ~0.5%, so even the difference doesn’t remain stable. This is a small system, but if you scaled this up it would not be unusual to see the OS reporting 70% CPU utilization and VMware reporting 30%.
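For anyone who wants to check the sums, here is a small Python sketch of the overestimation calculation. It assumes the ~4.5% figure is the VMware reading and that the OS reads about 1.5 points higher; both are rounded values read off the chart, so the result is approximate, and the function name is just for illustration.

```python
def relative_overestimate(os_pct: float, vmware_pct: float) -> float:
    """How far the OS figure exceeds the VMware figure, relative to VMware."""
    if vmware_pct <= 0:
        raise ValueError("VMware utilization must be positive")
    return (os_pct - vmware_pct) / vmware_pct


if __name__ == "__main__":
    # Start of the chart: assumed readings of 6.0% (OS) and 4.5% (VMware).
    print(f"{relative_overestimate(6.0, 4.5):.0%}")    # 33%
    # The scaled-up case: 70% (OS) against 30% (VMware).
    print(f"{relative_overestimate(70.0, 30.0):.0%}")  # 133%
```

With these rounded readings the overestimate comes out at roughly a third, close to the figure quoted above, and in the scaled-up case the OS would be reporting well over double the real utilization.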

This large difference between what the OS thinks is happening and what is really happening all comes down to time slicing.

Stay with me: on Monday I'll be looking at time slicing and the effect it has.

Phil Bell

Consultant