Showing posts with label vCPU. Show all posts

Friday, 6 January 2017

Virtualization Oversubscription - What’s so scary? Basic Ideas of Queuing (19 of 20)

Queueing theory is pretty simple.

You have a ‘server’.  Think of this as the CPU, or the person sitting at the checkout scanning groceries.  They work at a constant pace and are fed with work from a queue.  The queue is filled by transactions or customers.

The response time of a transaction (from arriving to leaving) is the sum of the time spent queueing and the time spent being served.  Given identical transactions or customers, the service time is a constant; what can change is the arrival rate and the time spent in the queue.

Utilization and Response Time
What we have here is a chart showing response time on the Y-Axis and the utilization of the server on the X-Axis.

The reason the chart starts part way up the Y-Axis is the Service Time.  That’s static.  As the utilization of the server increases, the chance of the server being busy when a new transaction/customer arrives increases, and therefore so does the time the transaction/customer spends in the queue.  As we can see, it’s not a straight line.
All of this can be plotted using the formula R = S / (1 - U), where R is the response time, S is the service time and U is the utilization of the server.
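Plugging numbers into that formula shows how quickly queueing takes over; here’s a minimal Python sketch (the 10 ms service time is just an illustration):

```python
def response_time(service_time, utilization):
    """Single-server queue: R = S / (1 - U)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)

# A 10 ms service time at increasing utilizations: response time
# doubles at 50% busy and grows without bound as U approaches 1.
for u in (0.0, 0.5, 0.8, 0.9, 0.95):
    print(f"U={u:.2f}  R={response_time(10.0, u):.1f} ms")
```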
Benefits of Multiple Servers
When we add in multiple servers, the line stays flatter for longer and then degrades more suddenly.

This change is sometimes known as “the knee of the curve”.  The more servers or CPUs we include the higher the utilization of them before the knee of the curve is observed.  This is because there is more chance that a CPU will be available at the moment a piece of work arrives.
Given that most of the hosts in a virtualized environment are going to have high numbers of CPUs, this means we can run them at pretty high utilizations before queueing takes over.
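For the multi-server case the standard result is the M/M/c queue, where the Erlang C formula gives the probability an arrival has to wait. This sketch (service time normalised to 1; purely illustrative) shows the knee moving right as servers are added, at the same 90% utilization:

```python
from math import factorial

def erlang_c_wait_prob(servers, offered_load):
    """Erlang C: probability an arriving job has to queue (M/M/c)."""
    a, c = offered_load, servers
    rho = a / c
    if rho >= 1:
        raise ValueError("system is unstable (utilization >= 1)")
    top = (a ** c) / (factorial(c) * (1 - rho))
    bottom = sum((a ** k) / factorial(k) for k in range(c)) + top
    return top / bottom

def mmc_response_time(servers, utilization, service_time=1.0):
    """Mean response time R = Wq + S for an M/M/c queue."""
    a = servers * utilization                 # offered load in erlangs
    p_wait = erlang_c_wait_prob(servers, a)
    wq = p_wait * service_time / (servers * (1 - utilization))
    return wq + service_time

# Same 90% utilization, more servers: far less time in the queue.
for c in (1, 4, 16):
    print(f"servers={c:2d}  R={mmc_response_time(c, 0.9):.2f} x service time")
```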
Consider, though, that a multiple vCPU VM needs multiple logical CPUs on the host available before it can do anything.
This has the effect of reducing the number of ‘servers’ or CPUs in the system.  If all your VMs are 4 vCPUs and you have 16 logical CPUs in the host, that’s the equivalent of a 1 vCPU VM on a 4 CPU host.
The moral of the story here is: use as few vCPUs as possible in each VM, and you’ll reduce queueing and improve performance.
Why are we interested in queuing? I'll answer that question in my final blog on Monday.
Phil Bell
Consultant


Friday, 16 December 2016

Virtualization Oversubscription - What’s so scary? VMware vCPU Co-Scheduling & Ready Time (14 of 20)

Today I’ll explain what is happening inside the host to schedule the physical CPUs/cores to the vCPUs of the VMs.  Clearly most hosts can process more than 4 concurrent threads, but let’s keep this simple to follow.


·        VMs that are “ready” are moved onto the Threads.
·        There is not enough space for all the vCPUs in all the VMs so some are left behind.  (CPU Utilization = 75%, capacity used = 100%)
·        If a single vCPU VM finishes processing, the spare Threads can now be used to process a 2 vCPU VM. (CPU Utilization = 100%)
·        A 4 vCPU VM needs to process.
·        Even if the 2 single vCPU VMs finish processing, the 4 vCPU VM cannot use the available CPU.  While it accumulates Ready Time, other single vCPU VMs are able to take advantage of the available Threads.
·        Even if we end up in a situation where only a single Thread is in use, the 4 vCPU VM still cannot do any processing, because only 3 Threads are free. (CPU Utilization = 25%)
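The steps above can be mimicked with a toy model of strict co-scheduling. This is only an illustration of the idea, not VMware’s actual scheduler; the 4-Thread host, VM names and sizes are made up:

```python
# Toy model of strict co-scheduling: a VM only runs when ALL of its
# vCPUs can be placed on free Threads at once; otherwise it accrues
# Ready Time. The 4-Thread host and VM sizes are illustrative only.
def schedule_tick(free_threads, waiting_vms, ready_time):
    """One scheduling pass; returns the VMs placed this tick."""
    placed = []
    for name, vcpus in waiting_vms:
        if vcpus <= free_threads:      # all-or-nothing placement
            free_threads -= vcpus
            placed.append(name)
        else:                          # VM is ready but cannot run
            ready_time[name] = ready_time.get(name, 0) + 1
    return placed

ready = {}
waiting = [("vm-a", 1), ("vm-b", 1), ("big-vm", 4)]
placed = schedule_tick(4, waiting, ready)
print(placed)   # the single vCPU VMs fit; big-vm is left waiting
print(ready)    # big-vm has accumulated a tick of Ready Time
```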
As mentioned when we discussed time slicing, improvements have been made in the area of co-scheduling with each release of VMware.  Among other things, the permitted skew between individual vCPUs being scheduled onto the physical CPUs has increased, allowing for greater flexibility in scheduling VMs with large numbers of vCPUs.  Acceptable performance is now seen from larger VMs.

Along with Ready Time, there is also a Co-Stop metric.  Ready Time can be accumulated against any VM.  Co-Stop is specific to VMs with 2 or more vCPUs and relates to the time “stopped” due to co-scheduling contention, e.g. one or more vCPUs have been allocated a physical CPU, but the VM is stopped waiting for its other vCPUs to be scheduled.
Imagine the bottom of a “ready” VM sliding across to a Thread, and the top sliding across later as other VMs move off the Threads; the VM is no longer rigid, it’s more of an elastic band.
VMs and Resource Pools can be allocated Reservations, Shares and Limits and I'll be taking a look at these on Monday.
If you haven't already done so don't forget to sign up to get free access to our Resources, there are some great VMware white papers and on-demand webinars on there.
Phil Bell
Consultant

Wednesday, 14 December 2016

Virtualization Oversubscription - What’s so scary? CPU Oversubscription ( 13 of 20)

CPU Oversubscription

Memory is fairly easy to describe but there are a lot of things going on.  CPU oversubscription and the technologies involved can be a little more complex to visualize, but there are fewer tools for the hypervisor to work with.

·        Time slicing

·        Co-Scheduling

·        Reservations

·        Shares

·        Limits

For a start, time is no longer a constant.  The hypervisor has the ability to run time at whatever speed it likes, just so long as it averages out in the end.

Co-Scheduling is where all the vCPUs for a single VM must be mapped to logical CPUs from the hardware at the same time.
Reservations and Shares apply here also and we’ll have more of a look at how they work later.

Limits (which also exist for memory) can be applied to restrict some VMs to a smaller amount of CPU than their vCPU allocation would otherwise allow them to have.
Let’s start with Time Slicing.

Time Slicing

In a typical VMware host we have more vCPUs assigned to VMs than we do physical cores. The processing time of the physical cores (or logical CPUs if hyper-threading is in play) has to be shared among the vCPUs in the VMs.  The more vCPUs we have, the less time each can be on the core, and therefore the slower time passes for that VM.  To keep the VM in time, extra time interrupts are sent in quick succession when the VM is processing, so time passes slowly and then very fast.
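A quick bit of arithmetic shows why: when every vCPU wants to run at once, each can get only a fraction of a core. The 16 core / 64 vCPU figures below are purely illustrative:

```python
def core_share(physical_cores, total_vcpus):
    """Best-case fraction of a core each vCPU gets when every
    vCPU wants to run at once (capped at a whole core)."""
    return min(1.0, physical_cores / total_vcpus)

# A host with 16 cores and 64 configured vCPUs (illustrative):
# each fully busy vCPU can get at most a quarter of a core, so
# guest time falls behind between catch-up timer interrupts.
print(core_share(16, 64))   # 0.25
```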

Significant improvements have been made in this area over the releases of VMware. vCPUs can now be scheduled onto the hardware a few milliseconds apart, but the basic concept remains in place.
Join me again on Friday when I'll look at VMware vCPU Co-Scheduling & Ready Time.
Phil Bell
Consultant

Friday, 25 November 2016

Virtualization Oversubscription - What’s so scary? VMware CPU and Memory Maximums (5 of 20)

CPU and Memory are the main items people consider for Virtualized systems, so let’s lay down the maximums.

Virtual machine maximums

·        128 vCPUs per VM

Host CPU maximums

·        Logical CPUs per host: 480 (logical CPUs being simultaneous threads, so that might be 240 hyper-threaded cores)

·        Virtual machines per host: 1,024

·        Virtual CPUs per host: 4,096

·        Virtual CPUs per core: 32

Maximum with a caveat: the achievable number of vCPUs per core depends on the workload and the specifics of the hardware. For more information, see the latest version of Performance Best Practices for VMware vSphere.
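A quick sanity check on what those maximums imply (numbers taken from the list above):

```python
# Ceiling ratios implied by the published maximums listed above.
logical_cpus_per_host = 480
virtual_cpus_per_host = 4096
virtual_cpus_per_core = 32

# The host-wide limit works out at roughly 8.5 vCPUs per logical CPU,
# well below the 32-per-core cap, so the host-wide limit bites first.
print(virtual_cpus_per_host / logical_cpus_per_host)
```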


This raises two points:

·        Clearly it’s OK to oversubscribe CPUs.

·        There is no set number to tell you how much oversubscription is OK.

Memory VMware Maximums

Memory is a lot simpler.

·        6TB per host (12TB on specific hardware)

·        4TB per VM

Having set out those few ground rules we can now look at memory oversubscription and I'll be doing just that on Monday.
There are still a few places left on our VMware vSphere Capacity & Performance Essentials online workshop so don't forget to book your place.
http://www.metron-athene.com/services/online-workshops/capacity-management-workshops.html
Phil Bell
Consultant

Wednesday, 23 November 2016

Virtualization Oversubscription (What’s so scary?) - What can be oversubscribed? (4 of 20)


Today I'll deal with what can be oversubscribed.
What can be oversubscribed?

·        CPUs

·        Memory

·        Disk

·        NICs

In our virtual world, now that we have broken the link between the OS and the hardware, we can over provision all sorts of things.

CPU, Memory, Disk (as we mentioned) and NICs are all “Oversubscribed”.

Disk we have already looked at, and Memory and CPU I’ll go into in more detail later, but I thought it was worth mentioning NICs here.

Typically people seem to be running 10 - 15 VMs on a single host, which will have significantly fewer NICs installed. A server typically wouldn’t use all the bandwidth of its NIC, so that unused bandwidth is like the unused space on disk.
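To put a number on that, you can compare the bandwidth the VMs think they have against the physical uplinks. The VM counts and NIC speeds below are purely illustrative:

```python
def nic_oversubscription(vm_count, vm_nic_gbps, uplinks, uplink_gbps):
    """Ratio of bandwidth promised to VMs vs physical uplink capacity."""
    return (vm_count * vm_nic_gbps) / (uplinks * uplink_gbps)

# 12 VMs each seeing a 10 Gb vNIC, on a host with two 10 Gb uplinks:
# 6x oversubscribed on paper, harmless while actual traffic stays low.
print(nic_oversubscription(12, 10, 2, 10))   # 6.0
```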

When VMs talk to other VMs on the same host, that traffic doesn’t pass through the physical NICs, so we might consider that the equivalent of de-duplication.
In the next few blogs I'll be looking at CPU and Memory.
Phil Bell
Consultant

Friday, 11 November 2016

Idle VMs - Why should we care? (2 of 3)

In my previous blog I mentioned the term VM Sprawl and this is where Idle VMs are likely to factor. 

Often VMs are provisioned to support short term projects, for development/test processes, or for applications which have since been decommissioned.  Now idle, they’re left alone, not bothering anyone, and therefore not on the Capacity and Performance team’s radar.

Which brings us back to the question.  Idle VMs - Why should we care? 
We should care, for a number of reasons but let's start with the impact on CPU utilization.

When VMs are powered on and running, timer interrupts have to be delivered from the host CPU to the VM.  The total number of timer interrupts being delivered depends on the following factors:

·       VMs running symmetric multiprocessing (SMP) hardware abstraction layers (HALs)/kernels require more timer interrupts than those running uniprocessor HALs/kernels.

·       How many virtual CPUs (vCPUs) the VM has.

Delivering many virtual timer interrupts can negatively impact the performance of the VM and can also increase host CPU consumption.  This can be mitigated, however, by reducing the number of vCPUs, which reduces the timer interrupts and also the amount of co-scheduling overhead (check CPU Ready Time).
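As a rough illustration of the cost, the total timer interrupt load scales with the number of vCPUs times each VM's tick rate. The 250 Hz rate below is illustrative only (guest kernels have historically ticked at anywhere from 100 Hz to 1,000 Hz):

```python
def host_timer_interrupts_per_sec(vms):
    """Total virtual timer interrupts the host must deliver per
    second; vms is a list of (vcpu_count, tick_rate_hz) pairs."""
    return sum(vcpus * hz for vcpus, hz in vms)

# Twenty idle 2 vCPU VMs, each guest ticking at 250 Hz, still cost
# the host 10,000 interrupt deliveries per second doing no real work.
idle_vms = [(2, 250)] * 20
print(host_timer_interrupts_per_sec(idle_vms))   # 10000
```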

Then there's the Memory management of Idle VMs.  Each powered on VM incurs Memory Overhead.   The Memory Overhead includes space reserved for the VM frame buffer and various virtualization data structures, such as Shadow Page Tables (using Software Virtualization) or Nested Page Tables (using Hardware Virtualization).  This also depends on the number of vCPUs and the configured memory granted to the VM.

We’ll have a look at a few more reasons to care on Monday, in the meantime why not complete our Capacity Management Maturity Survey and find out where you fall on the maturity scale. http://www.metron-athene.com/_capacity-management-maturity-survey/survey.asp
Jamie Baker
Principal Consultant

Friday, 14 October 2016

5 Top Performance and Capacity Concerns for VMware - Ready Time

As I mentioned on Wednesday there are 3 states which the VM can be in:



Threads – allocated to a thread and being processed.

Ready – in a ready state where they wish to process but aren’t able to.

Idle – where they exist but don’t need to be doing anything at this time.
In the diagram below you can see that work has moved over to the threads to be processed and there is some available headroom. Work that is waiting to be processed requires 2 vCPUs, so it is unable to fit, creating wasted space that we are unable to use at this time.



We need to remove a VM before we can put a 2 vCPU VM onto a thread and remain 100% busy.

In the meantime other VMs are coming along, and we now have a 4 vCPU VM accumulating Ready Time.

2 VMs move off, but the waiting 4 vCPU VM cannot move on, as there are not enough threads available.


It has to wait and other work moves ahead of it to process.


Even when 3 threads are available it is still unable to process and will be ‘queue jumped’ by other VMs which require fewer vCPUs.


Hopefully that is a clear illustration of why it makes sense to reduce contention by having as few vCPUs as possible in each VM.
Ready Time impacts on performance and needs to be monitored. On Monday I'll be dealing with Monitoring Memory.
Phil Bell
Consultant

Wednesday, 12 October 2016

5 Top Performance and Capacity Concerns for VMware - Ready Time

Imagine you are driving a car, and you are stationary, there could be several reasons for this.  You may be waiting to pick someone up, you may have stopped to take a phone call, or it might be that you have stopped at a red light.  The first two of these (pick up, phone) you have decided to stop the car to perform a task.  In the third instance the red light is stopping you doing something you want to do.  In fact you spend the whole time at the red light ready to move away as soon as the light turns to green.  That time is ready time.

When a VM wants to use the processor but is stopped from doing so, it accumulates Ready Time, and this has a direct impact on performance.
For any processing to happen, all the vCPUs assigned to the VM must be running at the same time.  This means if you have a 4 vCPU VM, all 4 need available cores or hyper-threads to run.  So the fewer vCPUs a VM has, the more likely it is to be able to get onto the processors.

To avoid Ready Time
You can reduce contention by having as few vCPUs as possible in each VM.  If you monitor CPU Threads, vCPUs and Ready Time you’ll be able to see if there is a correlation between increasing vCPU numbers and Ready Time in your systems.
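If you have captured those metrics, a simple correlation check is enough to confirm the pattern. The sample values below are made up for illustration; `pearson` is just a plain-Python correlation helper:

```python
def pearson(xs, ys):
    """Pearson correlation coefficient in plain Python."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Made-up monitoring samples: vCPUs configured per VM vs Ready Time
# observed (ms per sample interval). A coefficient near 1 says the
# two climb together on this host.
vcpu_counts = [1, 1, 2, 2, 4, 4, 8]
ready_ms = [40, 55, 120, 150, 600, 720, 2100]
print(round(pearson(vcpu_counts, ready_ms), 2))
```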

Proportion of Time: 4 vCPU VM
Below is an example of a 4 vCPU VM, each vCPU doing about 500 seconds’ worth of real CPU time and about 1,000 seconds’ worth of Ready Time.



For every 1 second of processing the VM is waiting around 2 seconds, so it’s spending almost twice as long waiting as it is processing. This is going to impact the performance experienced by the end user who is reliant on this VM.

Now let’s compare that to the proportion of time spent processing on a 2 vCPU VM. The graph below shows a 2 vCPU VM doing the same amount of work, around 500 seconds’ worth of real CPU time, and as you can see the Ready Time is significantly less.



There are 3 states which the VM can be in and we'll take a look at these on Friday.
Don't forget to book on to our VMware vSphere Capacity & Performance Essentials workshop starting on Dec 6 http://www.metron-athene.com/services/online-workshops/index.html
Phil Bell
Consultant

Monday, 10 October 2016

5 Top Performance and Capacity Concerns for VMware - Time Slicing

As I mentioned on Friday the large difference between what the OS thinks is happening and what is really happening all comes down to time slicing.

In a typical VMware host we have more vCPUs assigned to VMs than we do physical cores. 
The processing time of the cores has to be shared among the vCPUs. Cores are shared between vCPUs in time slices, 1 vCPU to 1 core at any point in time.
More vCPUs lead to more time slicing. The more vCPUs we have, the less time each can be on the core, and therefore the slower time passes for that VM.  To keep the VM in time, extra time interrupts are sent in quick succession.  So time passes slowly and then very fast.


More time slicing equals less accurate data from the OS.
Anything that doesn’t relate to time, such as disk occupancy, should be OK to use.
On Wednesday I'll be dealing with Ready Time. You've still got time to register for my webinar 'VMware and Hyper-V Virtualization Oversubscription (What's so scary?)' taking place on October 12. http://www.metron-athene.com/services/webinars/index.html
Phil Bell
Consultant