Metron - Capacity Management: Virtualization Oversubscription - What’s so scary? Basic Ideas of Queuing (19 of 20)

Friday, 6 January 2017

Virtualization Oversubscription - What’s so scary? Basic Ideas of Queuing (19 of 20)

Queueing theory is pretty simple.

You have a ‘server’. Think of this as the CPU, or the person sitting at the checkout scanning groceries. The server works at a constant pace and is fed with work from a queue. The queue is filled by transactions or customers.

The response time of a transaction (from arriving to leaving) is the sum of the time spent queueing and the time spent being served. Given identical transactions, or customers, the service time is a constant. What can change is the arrival rate and, with it, the time spent in the queue.

Utilization and Response Time

What we have here is a chart showing response time on the Y-Axis and the utilization of the server on the X-Axis.

The reason the chart starts part way up the Y-Axis is the service time, which is static. As the utilization of the server rises, the chance of the server being busy when a new transaction/customer arrives increases, and so the transaction/customer spends longer in the queue. As we can see, it’s not a straight line.

All of this can be plotted using the formula R = S / (1-U), where R is the response time, S is the service time and U is the utilization of the server (as a fraction between 0 and 1).
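To get a feel for how quickly the curve bends, here is a minimal Python sketch of that formula. The function name and the sample utilizations are my own choices for illustration. Note that R = S / (1-U) is the standard single-server result that assumes random arrivals and variable service times, so treat the numbers as indicative rather than exact for any real system.

```python
def response_time(service_time: float, utilization: float) -> float:
    """Single-server response time, R = S / (1 - U).

    utilization is a fraction in [0, 1); at 1.0 the queue grows without bound.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time / (1.0 - utilization)


if __name__ == "__main__":
    service_time = 1.0  # hypothetical service time, in any unit you like
    for u in (0.0, 0.5, 0.8, 0.9, 0.95):
        r = response_time(service_time, u)
        print(f"U={u:.2f}  R={r:.1f} x service time")
```

At 50% utilization the response time is twice the service time, and by 90% it is ten times the service time, which is why the line is anything but straight.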

Benefits of Multiple Servers

When we add in multiple servers, the curve stays flatter for longer and then degrades more suddenly.

This change is sometimes known as “the knee of the curve”. The more servers or CPUs we include, the higher their utilization can go before the knee of the curve is observed. This is because there is more chance that a CPU will be available at the moment a piece of work arrives.
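One way to see the knee move is to compute the multi-server curve directly. The sketch below uses the textbook Erlang C formula for a single queue feeding several identical servers. That model choice, along with the function names and the sample server counts, is my assumption for illustration and is not taken from the chart in this post.

```python
from math import factorial


def erlang_c(servers: int, utilization: float) -> float:
    """Probability that arriving work finds every server busy and must queue."""
    if servers < 1 or not 0.0 <= utilization < 1.0:
        raise ValueError("need servers >= 1 and utilization in [0, 1)")
    offered_load = servers * utilization  # average number of busy servers
    busy_term = offered_load ** servers / factorial(servers) / (1.0 - utilization)
    idle_terms = sum(offered_load ** k / factorial(k) for k in range(servers))
    return busy_term / (idle_terms + busy_term)


def response_time_multi(service_time: float, servers: int, utilization: float) -> float:
    """Mean response time: service time plus the average time spent queueing."""
    wait = erlang_c(servers, utilization) * service_time / (servers * (1.0 - utilization))
    return service_time + wait


if __name__ == "__main__":
    for servers in (1, 4, 16):
        row = "  ".join(
            f"U={u:.2f}: {response_time_multi(1.0, servers, u):6.2f}"
            for u in (0.5, 0.8, 0.9, 0.95)
        )
        print(f"{servers:2d} servers  {row}")
```

With one server this reduces to R = S / (1-U). With more servers the response time stays close to the service time until much higher utilizations, then climbs steeply.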

Given that most of the hosts in a virtualized environment are going to have high numbers of CPUs, this means we can run them at pretty high utilizations before queueing takes over.

Consider, though, that a VM with multiple vCPUs needs multiple logical CPUs on the host to be available at the same time before it can do anything.

This has the effect of reducing the number of ‘servers’ or CPUs in the system. If all your VMs have 4 vCPUs and you have 16 logical CPUs in the host, that’s the equivalent of running 1 vCPU VMs on a 4 CPU host.
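The arithmetic can be sketched in a few lines of Python. Two assumptions are mine here. First, every VM must get all of its vCPUs at once, as described above; real hypervisor schedulers are usually more relaxed than that. Second, response time is estimated with the rough rule of thumb R ≈ S / (1 - U^m) for m servers, which matches the single-server formula when m is 1 and is only an approximation for larger m. The function names are hypothetical.

```python
def effective_servers(logical_cpus: int, vcpus_per_vm: int) -> int:
    """How many VMs can run at once if each needs all of its vCPUs together."""
    if logical_cpus < 1 or vcpus_per_vm < 1:
        raise ValueError("CPU counts must be positive")
    return logical_cpus // vcpus_per_vm


def approx_response_time(service_time: float, utilization: float, servers: int) -> float:
    """Rule-of-thumb multi-server response time, R ~= S / (1 - U**m)."""
    if servers < 1 or not 0.0 <= utilization < 1.0:
        raise ValueError("need servers >= 1 and utilization in [0, 1)")
    return service_time / (1.0 - utilization ** servers)


if __name__ == "__main__":
    host_cpus = 16
    utilization = 0.8  # hypothetical host utilization
    for vcpus in (1, 2, 4, 8):
        slots = effective_servers(host_cpus, vcpus)
        r = approx_response_time(1.0, utilization, slots)
        print(f"{vcpus} vCPU VMs -> {slots:2d} effective servers, R ~ {r:.2f} x service time")
```

At the same utilization, the host full of 4 vCPU VMs behaves like a 4 CPU box and sits noticeably closer to the knee than the same host running 1 vCPU VMs.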

The moral of the story here is “use as few vCPUs as possible in each VM, and you’ll reduce queueing and improve performance”.

Why are we interested in queuing? I'll answer that question in my final blog on Monday.

Phil Bell

Consultant

www.metron-athene.com
