Metron - Capacity Management: May 2012

Thursday, 24 May 2012

VMware Metrics - guidelines

Steve Roberts commented on my ‘Top 10 VMware metrics to help pinpoint bottlenecks’ blog and asked the question ‘Do you have any numbers around this monitoring?’

So, Steve, I’ve put together the following as a guideline:

CPU – The key here is to monitor the overall Host CPU Usage and the associated VM CPU Usage. When using vSMP VMs, compare the total number of vCPUs on the host with the number of pCPUs the host has; if the vCPUs are overcommitted, monitor your VM CPU Ready Time. Values > 15% indicate that the VM is waiting a significant period of time for CPU. However, if the overall CPU Usage on the host is low, the impact of CPU Ready Time is likely to be mitigated. It also depends on the types of workload running on your VM.
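To make the CPU guideline concrete, here is a minimal sketch in Python. It assumes you have already collected the vCPU count and CPU Ready percentage per VM from your monitoring tool. The names `VmCpuSample`, `vcpu_overcommitted` and `cpu_ready_findings` are purely illustrative, and the 50% cut-off for "low" host CPU usage is my own assumption, not a VMware figure.

```python
from dataclasses import dataclass

READY_PCT_THRESHOLD = 15.0   # from the guideline above
LOW_HOST_USAGE_PCT = 50.0    # assumption: what counts as "low" host CPU usage


@dataclass
class VmCpuSample:
    name: str
    vcpus: int
    ready_pct: float         # CPU Ready Time as a percentage of the interval


def vcpu_overcommitted(vms, host_pcpus):
    """True when the total vCPUs allocated exceed the host's physical CPUs."""
    return sum(vm.vcpus for vm in vms) > host_pcpus


def cpu_ready_findings(vms, host_pcpus, host_cpu_usage_pct):
    """Return (vm name, severity) pairs for VMs above the ready threshold.

    Ready time is only examined when the host's vCPUs are overcommitted.
    Severity is downgraded to "watch" when overall host CPU usage is low.
    """
    if not vcpu_overcommitted(vms, host_pcpus):
        return []
    severity = "watch" if host_cpu_usage_pct < LOW_HOST_USAGE_PCT else "investigate"
    return [(vm.name, severity) for vm in vms if vm.ready_pct > READY_PCT_THRESHOLD]
```

For example, with a 4-vCPU VM at 20% ready and an 8-vCPU VM at 3% ready on an 8-pCPU host running at 80% CPU, only the first VM is flagged for investigation. Treat the result as a prompt to look closer, since the workload type still matters.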

Memory – Beware of overcommitment when deciding how much memory to grant to a VM, and monitor the VM Active Memory against the Host Memory Consumed. You will want to see the VM Active Memory lower than the Host Memory Consumed; otherwise it could indicate that the VM memory does not completely reside in physical memory and may be in swap. Also be aware of any limits set on the VM or Resource Pool, and ask whether each limit is necessary. Limits are enforced by memory reclamation and will increase the CPU overhead of the ESX host. A performance tip is to monitor the Active Memory and set the Granted Memory just above the average value (allowing for longer peaks). You can also guarantee the memory by using reservations; however, this reduces the ability to overcommit virtual memory on an ESX host.
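The sizing tip can be sketched in a few lines of Python. This assumes you have a series of Active Memory samples in MB for the VM. The function names and the 25% headroom figure are hypothetical, chosen only to illustrate "just above the average"; pick a headroom that suits how long your peaks last.

```python
import math
from statistics import mean


def suggested_granted_mb(active_mb_samples, headroom=0.25):
    """Average Active Memory plus headroom, rounded up to a whole MB.

    headroom=0.25 is an illustrative assumption, not a VMware recommendation.
    """
    return math.ceil(mean(active_mb_samples) * (1 + headroom))


def active_exceeds_consumed(active_mb, consumed_mb):
    """True when Active Memory is not below Consumed, the warning sign above."""
    return active_mb >= consumed_mb
```

For active samples of 1000, 1200 and 800 MB, the sketch suggests granting 1250 MB.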

Disk I/O Latency – A couple of key metrics to monitor here are:

Physical Device Command Latency, where you want to see values < 15ms; anything over indicates a slow array.

Kernel Command Latency, where values should be consistently 0-1ms; any values > 4ms indicate that the storage system cannot support the amount of data being sent to it.
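The two latency guidelines translate into a simple classification, sketched below in Python. The function names and labels are illustrative. The "watch" band for kernel latency between 1ms and 4ms is my own assumption to cover the gap between the two figures above, and a reading of exactly 15ms is treated as acceptable.

```python
DEVICE_LATENCY_MS = 15.0   # guideline above: anything over suggests a slow array
KERNEL_OK_MS = 1.0         # guideline above: should be consistently 0-1ms
KERNEL_BAD_MS = 4.0        # guideline above: over this, storage cannot keep up


def classify_device_latency(ms):
    """Classify a Physical Device Command Latency reading in milliseconds."""
    return "slow array" if ms > DEVICE_LATENCY_MS else "ok"


def classify_kernel_latency(ms):
    """Classify a Kernel Command Latency reading in milliseconds."""
    if ms > KERNEL_BAD_MS:
        return "storage overloaded"
    if ms > KERNEL_OK_MS:
        return "watch"     # assumption: the source does not name this band
    return "ok"
```

Single readings can be noisy, so it is worth applying the checks to values sustained over several intervals instead of one-off spikes.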

If you need more information, I’ve done a video, ‘5 things you’ve always wanted to know about capacity managing VMware vSphere’, which you can watch on our Capacity Management Channel: http://www.youtube.com/watch?v=WZJctffS8kM&feature=channel&list=UL

For those of you in the UK, I’m running a one-day VMware Capacity Management Awareness Workshop in London on June 20th (I believe there are a few places left, but don’t quote me on that): http://www.metron-athene.com/training/workshops/vmware_capacity_and_performance_essentials.html

If all else fails, you can always hire an expert to help through our professional services division.

I hope this helps and please keep those comments coming.

Jamie Baker

Principal Consultant

Friday, 11 May 2012

Mainframe mayhem

In this era of concentration on virtualization and distributed technologies, the mainframe still has to be looked after. While retaining its traditional central role in IT organizations, the mainframe has also evolved to become the primary hub in large distributed networks, making the management of its performance as critical as ever.

It seems, however, that staff expertise on the mainframe is becoming an issue. Many experts are coming up to retirement age, and the worry is often about who will carry the torch. (Writing this in the UK in 2012, I just had to get the Olympics in there somewhere.) Less experienced staff need hand-holding, so more experienced staff are having to spend more time with them. Companies are faced with a dilemma: in the current climate, they have barely enough staff to cover everything as it is.

In a perfect world the vagaries of the mainframe would be easy to understand, and an automated expert would offer systems analysis of all key z/OS performance metrics, highlight areas for concern and even offer advice on the actions required to ensure better ongoing performance: a kind of mainframe robot.

How often have companies wished for that: the automation of mainframe systems performance analysis? Of course, the range of variables on a mainframe is such that the robot might need mainframe processing capability to deliver this. There is a lot of experience and advice out there because we have been handling mainframes for so long. Knowing where it is and having it available whenever you need it might be a different matter. The breadth of the mainframe can also mean there are specialists in individual areas such as DB2 and CICS, but an independent view across all areas is not so easy to find.

At Metron we know what a pivotal role the mainframe plays in many large companies, and we’re launching our new ES/1 NEO, expert systems performance analysis for z/OS.

Sure, we’re biased, but as far as we’re concerned ES/1 NEO is ‘The One’ when it comes to mainframe performance analytics, and its resounding success in Japan is testament to this.

Whilst it may not make the world perfect, it certainly takes a step in the right direction towards mainframe nirvana. It pulls together analysis and expertise across all major mainframe areas such as DB2, CICS, WAS, IMS and z/VM. It automatically highlights potential problem areas. More to the point, it has a vast library of potential recommended solutions to any problem identified, meaning it can provide independent advice at any time: for hard-pressed teams, to resolve disputes, or when skilled staff are in short supply.

We’re running a series of items introducing ES/1 NEO to the world over the coming months: webinars, conference papers, demos, press releases and more. We’d be happy to get some feedback on how you are planning to handle skills shortages in the mainframe space over the next few years, and on whether you feel there is a place for ‘expert’ advice like that provided by ES/1 NEO.

Andrew Smith
Chief Sales & Marketing Officer

Tuesday, 8 May 2012

Using Capacity Management to help prevent VM Stall

When running a virtualization/consolidation program, it’s usual to concentrate on the low-risk servers, the “low hanging fruit” if you will. Once past this point, the difficulties or politics in virtualizing more complex and business-critical systems can lead to VM stall. This issue is widespread across the industry, with Gartner estimating that companies have still only virtualized 20% - 30% of their physical estate. VM stall has been evident during a number of our consultancy engagements, with companies taking an overly cautious approach to virtualization due to a lack of understanding. This lack of understanding tends to be focused on the following areas:

· Difficulties in sizing VMs and, by extension, the supporting infrastructure

· Understanding the performance aspects of virtualization

· Managing the Capacity of virtualization

Based on experience working on these sorts of issues with our global client base, we have developed a number of service-driven solutions that can provide as much or as little assistance as required.

This assistance could be in the form of a one-day workshop that educates your support teams on what to monitor and how to interpret the data these metrics provide. We can also deliver a more process-based solution, where we would assess the effectiveness of Capacity Management within your business in relation to virtualization and make recommendations.

To fully exploit the benefits of virtualization, i.e. fewer but more highly utilized servers, the principles and practices of Capacity Management should be employed. Take a look at our services...

http://www.metron-athene.com/consulting/consulting.html

I’d be interested to hear your views on VM stall.

Rob Ford

Consultant