Queueing theory is pretty simple. The response time of a transaction (from arriving to leaving) is the sum of the time spent queueing and the time spent being served. Given identical transactions, or customers, the service time is constant; what can change is the arrival rate and the time spent in the queue.
You have a ‘server’. Think of this as the CPU, or the person sitting at the checkout scanning groceries. The server works at a constant pace and is fed with work from a queue. The queue is filled by transactions or customers.
Utilization and Response Time
What we have here is a chart showing response time on the Y-axis and the utilization of the server on the X-axis.
The reason the chart starts part way up the Y-axis is the service time, which is static: even with an empty queue, a transaction still takes that long to be served. As the utilization of the server rises, the chance of the server being busy when a new transaction or customer arrives increases, and so the longer that transaction or customer will spend in the queue. As we can see, it’s not a straight line.
All of this can be plotted using the formula R = S / (1 - U), where R is the response time, S is the service time and U is the utilization of the server, expressed as a fraction between 0 and 1.
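To get a feel for how quickly the curve bends, here is a minimal Python sketch of that formula. The function name `response_time` and the sample utilizations are only illustrative choices, and the service time is taken as 1 unit so the result reads as a multiple of the service time.

```python
def response_time(service_time, utilization):
    """Single-server response time, R = S / (1 - U).

    utilization is a fraction in [0, 1); at 1 the queue grows without bound.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time / (1.0 - utilization)


if __name__ == "__main__":
    # Illustrative utilizations; service time is 1 unit.
    for u in (0.0, 0.5, 0.8, 0.9, 0.95):
        print(f"U = {u:.2f}  ->  R = {response_time(1.0, u):.1f} x service time")
```

At 50% utilization the response time is already double the service time, and each further step towards 100% costs more than the last.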
Benefits of Multiple Servers
When we add in multiple servers, the line ends up degrading more suddenly. This change is sometimes known as “the knee of the curve”. The more servers or CPUs we include, the higher their utilization can go before the knee of the curve is observed. This is because there is more chance that a CPU will be available at the moment a piece of work arrives.
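One way to reproduce this effect numerically is the Erlang C formula for a multi-server queue. This is a sketch under an assumption the post does not state: random (Poisson) arrivals and exponentially distributed service times, the textbook M/M/c model. The function names are illustrative. With one server it reduces to R = S / (1 - U).

```python
from math import factorial


def erlang_c(servers, utilization):
    """Probability that arriving work has to queue (Erlang C, M/M/c assumption)."""
    offered_load = servers * utilization  # average number of busy servers
    top = offered_load ** servers / factorial(servers) / (1.0 - utilization)
    bottom = sum(offered_load ** k / factorial(k) for k in range(servers)) + top
    return top / bottom


def response_time_multi(service_time, servers, utilization):
    """Mean response time with several identical servers sharing one queue."""
    if servers < 1 or not 0 <= utilization < 1:
        raise ValueError("need servers >= 1 and utilization in [0, 1)")
    queue_time = (erlang_c(servers, utilization) * service_time
                  / (servers * (1.0 - utilization)))
    return service_time + queue_time


if __name__ == "__main__":
    # Compare 1, 4 and 16 servers at the same per-server utilization.
    for u in (0.5, 0.8, 0.9, 0.95):
        row = "  ".join(f"{n:>2} servers: {response_time_multi(1.0, n, u):6.2f}"
                        for n in (1, 4, 16))
        print(f"U = {u:.2f}  {row}")
```

Under these assumptions, the rows with more servers stay close to the service time for longer, which is the later, sharper knee described above.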
Given that most of the hosts in a virtualized environment are going to have high numbers of CPUs, this means we can run them at pretty high utilizations before queueing takes over.
Consider, though, that a multiple-vCPU VM needs multiple logical CPUs on the host to be available at the same time before it can do anything. This has the effect of reducing the number of ‘servers’ or CPUs in the system. If all your VMs have 4 vCPUs and you have 16 logical CPUs in the host, that’s the equivalent of a 1 vCPU VM on a 4 CPU host.
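The arithmetic behind that equivalence can be written down in a couple of lines. This is a rough sketch that assumes, as the example does, that every VM is the same size and that a VM only runs when all of its vCPUs can be placed at once; `effective_servers` is a hypothetical helper name.

```python
def effective_servers(logical_cpus, vcpus_per_vm):
    """Number of identical VMs that can be running at the same instant.

    Assumes all VMs are the same size and each needs all of its vCPUs
    scheduled together (a simplification of real hypervisor scheduling).
    """
    if logical_cpus < 1 or vcpus_per_vm < 1:
        raise ValueError("both counts must be at least 1")
    return logical_cpus // vcpus_per_vm


if __name__ == "__main__":
    for vcpus in (1, 2, 4, 8):
        print(f"16 logical CPUs, {vcpus} vCPU VMs -> "
              f"{effective_servers(16, vcpus)} effective servers")
```

Halving the vCPU count of every VM doubles the number of effective servers, which pushes the knee of the curve further out.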
The moral of the story here is “use as few vCPUs as possible in each VM, and you’ll reduce queueing and improve performance”.
Why are we interested in queueing? I'll answer that question in my final blog on Monday.
Phil Bell
Consultant