Friday, 24 January 2014

Top 10 VMware Metrics to help pinpoint bottlenecks

Here is a list of the top 10 VMware metrics to help you pinpoint performance bottlenecks within your VMware vSphere virtual infrastructure.  I hope you find them useful.
CPU
1. Ave CPU Usage in MHz - this metric should be reported at both host and guest levels.  A guest (VM) has to run on an ESX host, and that host has a finite amount of resource, so high CPU usage at the host level could indicate a bottleneck.  Break the usage down by hosted guest to get a clear indication of which guests are using the most.  If you have enabled DRS on your cluster, you may see a rise in the number of vMotions as DRS attempts to load balance.
2. CPU Ready Time - this important metric gives a clear indication of CPU overcommitment within your VMware virtual infrastructure.  CPU overcommitment can lead to significant CPU performance problems because of the way the ESX CPU scheduler places Virtual CPU (vCPU) work onto Physical CPUs (pCPUs).  It is reported at the guest level.  Values that run into seconds can indicate that you have provisioned too many vCPUs for the guest.  Compare the total number of vCPUs assigned to all hosted guests with the number of Physical CPUs available on the host(s) to see whether you have overcommitted the CPU.
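As a rough illustration of the two CPU checks above, the sketch below converts a CPU Ready summation into a percentage of the sample interval and works out a vCPU to pCPU ratio. It assumes the value comes from a vCenter real-time chart, where CPU Ready is reported as a summation in milliseconds over a 20-second sample. The function names and sample figures are hypothetical, and any threshold you compare the results against is a judgment call for your own environment.

```python
from typing import Iterable


def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0) -> float:
    """Express a CPU Ready summation (ms) as a percentage of the sample interval.

    Assumes the 20-second interval of vCenter real-time charts by default.
    """
    return ready_ms * 100.0 / (interval_s * 1000.0)


def vcpu_to_pcpu_ratio(vcpus_per_guest: Iterable[int], physical_cpus: int) -> float:
    """Total vCPUs assigned to all hosted guests divided by the host's physical CPUs."""
    return sum(vcpus_per_guest) / physical_cpus


# Hypothetical sample values, not measurements from a real host.
print(cpu_ready_percent(1000))              # 1000 ms ready in a 20 s sample -> 5.0
print(vcpu_to_pcpu_ratio([4, 2, 2, 8], 8))  # 16 vCPUs on 8 physical CPUs -> 2.0
```

A ratio above 1.0 simply means more vCPUs are provisioned than physical CPUs exist; whether that hurts depends on how busy the guests are, which is what the ready percentage helps to show.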
Memory
3. Ave Memory Usage in KB - similar to Average CPU Usage, this should be reported at both Host and Guest levels.  It shows which guests are using the most memory, but high usage does not necessarily indicate a bottleneck.  If Memory Usage is high, look at the values reported for Memory Ballooning/Swapping.
4. Balloon KB - values reported for the balloon indicate that the Host cannot meet its Memory requirements and are an early warning sign of memory pressure on the Host.  The Balloon driver is installed via VMware Tools onto Windows and Linux guests.  Its job is to force the operating systems of lightly used guests to page out unused memory so that ESX can reclaim it and satisfy the demand from hungrier guests.
5. Swap Used KB - if you see values being reported at the Host for Swap, this indicates that memory demands cannot be satisfied and guest memory pages are being swapped out to the .vswp file.  This is ‘Bad’.  Guests may have to be migrated to other hosts, or more memory added to this host, to satisfy the memory demands of the guests.
6. Consumed - Consumed memory is the amount of Memory Granted on a Host to its guests minus the amount of Memory Shared across them.  Memory can be over-allocated because common memory pages, such as Operating System pages, are shared between guests.  This metric displays how much Host Physical Memory is actually being used (or consumed) and includes usage values for the Service Console and VMkernel.
7. Active - this metric reports the amount of physical memory recently used by the guests on the Host and is displayed as “Guest Memory Usage” in vCenter at Guest level.
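The memory checks above can be sketched in a few lines. This is a minimal illustration under stated assumptions: `consumed_kb` applies the "granted minus shared" relationship described in item 6 and ignores the Service Console and VMkernel overhead, and `memory_pressure_signs` treats any non-zero balloon or swap value as worth flagging. Both function names and the sample figures are hypothetical.

```python
from typing import List


def consumed_kb(granted_kb: int, shared_kb: int) -> int:
    """Consumed memory as Memory Granted minus Memory Shared (overheads ignored)."""
    return granted_kb - shared_kb


def memory_pressure_signs(balloon_kb: int, swap_used_kb: int) -> List[str]:
    """Return the warning signs suggested by host-level balloon and swap values."""
    signs = []
    if balloon_kb > 0:
        signs.append("ballooning: early warning of memory pressure on the host")
    if swap_used_kb > 0:
        signs.append("swapping: memory demand cannot be satisfied")
    return signs


# Hypothetical sample values, not measurements from a real host.
print(consumed_kb(granted_kb=8_000_000, shared_kb=1_500_000))
for sign in memory_pressure_signs(balloon_kb=1024, swap_used_kb=0):
    print(sign)
```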

Disk I/O
8. Queue Latency - this metric measures the average amount of time each SCSI command spends in the VMkernel queue. This value should always be zero. If it is not, the workload is too high and the storage array cannot process the data fast enough.
9. Kernel Latency - this metric measures the average amount of time, in milliseconds, that the VMkernel spends processing each SCSI command. For best performance, the value should be between 0 and 1 milliseconds. If the value is greater than 4ms, the virtual machines on the Host are trying to send more throughput to the storage system than the configuration supports. If this is the case, check the CPU usage, and increase the queue depth or storage.
10. Device Latency - this metric measures the average amount of time, in milliseconds, to complete a SCSI command from the physical device. Depending on your hardware, a number greater than 15ms indicates there are probably problems with the storage array. If this is the case, move the active VMDK to a volume with more spindles or add more disks to the LUN.
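The three disk latency rules of thumb above lend themselves to a simple check. The sketch below is only an illustration: the thresholds are the ones quoted in items 8 to 10 (queue latency above zero, kernel latency above 4ms, device latency above 15ms), while the function name and the sample readings are hypothetical and the right limits will depend on your hardware.

```python
from typing import List


def disk_latency_findings(queue_ms: float, kernel_ms: float, device_ms: float) -> List[str]:
    """Compare average latencies (ms) against the rule-of-thumb thresholds above."""
    findings = []
    if queue_ms > 0:
        findings.append("queue latency above zero: workload may be too high for the array")
    if kernel_ms > 4:
        findings.append("kernel latency above 4ms: check CPU usage and queue depth")
    if device_ms > 15:
        findings.append("device latency above 15ms: possible storage array problem")
    return findings


# Hypothetical sample readings, not measurements from a real host.
for finding in disk_latency_findings(queue_ms=2.0, kernel_ms=5.0, device_ms=20.0):
    print(finding)
```

An empty list means none of the three thresholds was exceeded; it does not by itself prove the storage is healthy.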
Note:  When reporting usage values, take into consideration any child resource pools that have CPU/Memory limits specified, and report accordingly.
I'm running a two-part webinar series, starting this Thursday on VMware vSphere Performance Management Challenges and Best Practices. Register and come along http://www.metron-athene.com/services/training/webinars/index.html
Jamie Baker
Principal Consultant
Metron-Athene Inc.

Wednesday, 15 January 2014

Top 5 Capacity and Performance Concerns for 2014

Most people will agree that business services and the IT that underpins those services are more complex than ever. Therefore, it’s crucial that these services be designed and then monitored and managed with performance and capacity in mind.

Metron has been in the business of Capacity and Performance Management for almost 30 years. From the early days when most services ran on mainframes with dumb terminals through to today’s heavy use of virtualization technologies, cloud computing, and an increased reliance on Big Data, Metron has been there providing software, services, and expertise to many organizations, large and small.

The train isn’t stopping in 2014. Technologies will continue to grow more complex and everyone is looking to see what the Top Technology Trends are going to be. Here are a few predictions from Gartner http://www.gartner.com/newsroom/id/2603623 

The need for Capacity and Performance Management will be more important than ever in 2014, and Metron will be here to offer expert guidance to those that need it.

I'll be running a webinar that will look at five of the top Capacity and Performance Concerns for 2014 - concerns that exist today, with an eye on concerns that will likely face Capacity Managers and other IT professionals in the next 12 months. 

Join me for a session that will help you plan your team’s 2014.  

I'll be covering:
  • Five top concerns for 2014
  • Why those concerns need to be considered
  • How to integrate these into your existing Capacity Management process OR
  • How to build a Capacity Management process that puts a focus on these concerns
Register now and I'll look forward to speaking to you all today http://www.metron-athene.com/services/training/webinars/index.html

Rob Ford 
Principal Consultant

Thursday, 2 January 2014

Happy New Year! May 2014 see you healthy, happy and successful.

I’m not a great one for New Year resolutions and predictions.  I know people who are however – and I clearly work with them!  Our first free webinar of the New Year is Top 5 Capacity and Performance Concerns for 2014.  It’s on Wednesday January 15, and is the first in our new format of monthly webinars. http://www.metron-athene.com/services/training/webinars/index.html

If I were to make a more general prediction, it would be that everything will become more and more mainstream, part of our everyday working life.  The IT industry likes to become excited over each new technological change.  History tends to suggest that while each change does make a radical difference to our world, certain aspects of the change are never as revolutionary as previously expected. 

I think good examples of this are Cloud and Big Data.  Both will become increasingly major factors in our working lives – but not to the exclusion of all else.  Rather I see them as blending in to how we currently manage infrastructure, sitting alongside traditional activities, used where they are best suited.  Too often we talk of new technologies sweeping away the old.  3D printing is a good example.  Early talk was of it sweeping away traditional manufacturing in many areas, completely replacing certain activities.  Experience is already showing this not to be the case.  While it is certainly revolutionary as an approach, it is being adopted alongside traditional, established manufacturing practices, improving productivity in areas where traditional techniques fall short.

Cloud and Big Data will be the same.  Cloud based applications will replace many traditional applications, but their management will slip seamlessly into existing infrastructure management.  To prevent waste of resources, a central view is needed no matter how those resources are provisioned.  Similarly with Big Data, it will lend itself to certain areas of analysis.  In some it will revolutionize life by enabling us to see patterns in data that we previously could not, but where proven techniques meet our needs, it will not need to be deployed.  My excitement from a capacity management perspective is in seeing what those new insights into the data we handle will be.

Finally, I will offer one firm prediction, but perhaps for further ahead than 2014.  In the not too distant future, all the concerns we keep hearing about security of data in the Cloud will disappear.  It is an important problem to overcome and, given human ingenuity, I believe that ways will emerge of making us comfortable with having our important data stored out there.

Enough – back to the day to day job of managing the here and now for me.  I hope Metron gets the chance to work with you in 2014 and make our own contribution to your success. 

Andrew Smith
CEO