Friday, 28 January 2011

Too many servers, not enough eyes, or where did all these servers come from?! (7 of 9)

The main purpose of including trending in the reporting and analysis structure is to alert the analysts and the business unit that, based on prior trends, the potential for performance problems exists. Analysts must choose metrics for which linear trending is appropriate. Roughly linear metrics such as utilizations are great candidates for trending, but non-linear metrics such as response times and throughputs, where contention and queuing have a much bigger impact, are better candidates for analytic modeling.
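As a rough illustration of what linear trending of a utilization metric involves, here is a minimal sketch of an ordinary least-squares fit in plain Python. The function name `linear_trend` and the weekly CPU utilization figures are made up for the example; a real trending tool would likely add confidence intervals and outlier handling.

```python
from statistics import mean


def linear_trend(values):
    """Least-squares fit of values against their index (0, 1, 2, ...).

    Returns (slope, intercept), where slope is the change per sample interval.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(values)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical weekly average CPU utilization (percent) for one server.
weekly_cpu_util = [41.0, 43.0, 45.0, 47.0, 49.0, 51.0]
slope, intercept = linear_trend(weekly_cpu_util)
print(f"utilization is growing about {slope:.1f} points per week")
```

The slope is the number that matters for planning: it says how fast headroom is being consumed, assuming the growth stays roughly linear.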
In general, analytic models of an overall system or application are far more accurate than trending, but a staff of a few people cannot adequately build, maintain, and update models for hundreds or thousands of systems and applications. Building, editing, calibrating, and evaluating models is a very interactive process that does not lend itself well to automation. Because today’s data centers are large and analyst staffs are relatively small, a level of automation is necessary to reduce the number of servers and applications to a size that is reasonable for detailed analysis and modeling.
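To see why response time is a poor candidate for a straight-line trend, consider the simplest textbook queueing model, a single M/M/1 queue, where response time is R = S / (1 - U) for service time S and utilization U. This is only a sketch under that single-queue assumption, not what a commercial modeling tool does; the function name and the 0.1-second service time are illustrative.

```python
def mm1_response_time(service_time, utilization):
    """Mean response time of an M/M/1 queue: R = S / (1 - U).

    service_time is in seconds; utilization is a fraction in [0, 1).
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


# Hypothetical service time of 0.1 seconds per transaction.
for u in (0.50, 0.80, 0.90, 0.95):
    r = mm1_response_time(0.1, u)
    print(f"utilization {u:.0%}: response time {r:.2f} s")
```

Utilization rises steadily while response time roughly doubles with each of the last few steps, which is the behavior a linear trend line cannot capture.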
Alerts should be sent far enough in advance that the analysts can analyze data, refine workload characterizations, and build models that can be used to plan needed upgrades or workload moves.
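One way to turn "far enough in advance" into something a script can check is to project the fitted trend line forward to a threshold and compare the time remaining against a lead time. The sketch below assumes a linear trend already described by a slope and intercept; the 75% threshold, the lead times, and the function names are assumptions chosen for illustration.

```python
import math


def weeks_until_threshold(slope, intercept, last_index, threshold):
    """Weeks from the last sample until the trend line reaches threshold.

    Returns 0.0 if the trend is already at or above the threshold and
    math.inf if the trend is flat or falling.
    """
    current = slope * last_index + intercept
    if current >= threshold:
        return 0.0
    if slope <= 0:
        return math.inf
    return (threshold - current) / slope


def should_alert(weeks_left, lead_time_weeks):
    """Alert once the projected crossing falls inside the lead time."""
    return weeks_left <= lead_time_weeks


# Hypothetical trend: 41% at week 0, growing 2 points per week, 6 samples.
weeks_left = weeks_until_threshold(slope=2.0, intercept=41.0,
                                   last_index=5, threshold=75.0)
print(weeks_left, should_alert(weeks_left, lead_time_weeks=16))
```

The lead time should reflect how long analysis, modeling, and procurement actually take in a given shop, so it is better treated as a per-application setting than as a constant.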
Workloads and hardware configurations can change quite rapidly for both business and technological reasons, leaving long-term trend reports largely meaningless. Once trend reports are built, an effort should be made to normalize the trended metrics so that changes to architecture or the addition of large blocks of end users won’t skew the trended data.
This could be accomplished by reporting performance numbers that incorporate the number of users (CPU seconds per user, for example), or by using a common hardware benchmark to normalize transactions across systems of different sizes.
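Both kinds of normalization can be sketched in a few lines. The example assumes that CPU consumption scales linearly with a benchmark score, which is a simplification, and every name and number in it is hypothetical.

```python
def cpu_seconds_per_user(cpu_seconds, active_users):
    """Per-user CPU cost, so that adding users does not look like a trend."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return cpu_seconds / active_users


def normalize_to_reference(cpu_seconds, host_score, reference_score):
    """Convert CPU seconds measured on one host into the equivalent CPU
    seconds on a reference host, assuming work scales linearly with the
    benchmark score (a simplifying assumption)."""
    if host_score <= 0 or reference_score <= 0:
        raise ValueError("benchmark scores must be positive")
    return cpu_seconds * host_score / reference_score


# The user population doubles, but the per-user cost is unchanged.
before = cpu_seconds_per_user(3600.0, 1200)
after = cpu_seconds_per_user(7200.0, 2400)
print(before, after)

# 100 CPU seconds on a host that scores twice the reference machine.
print(normalize_to_reference(100.0, host_score=200.0, reference_score=100.0))
```

Trending the per-user or reference-normalized number rather than the raw one keeps a step change in users or hardware from showing up as a sudden break in the trend.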
Typically, the best way to normalize workloads from one environment to another is to use an analytic modeling tool and the “what-if” analysis it provides. One place to improve trend reporting and alerting would be a vendor tool that provides some “intelligent” trending: a tool that applies these “what-if” changes to trend sets so that many of the existing trend reports remain valid even when configurations and workload levels change in the system or application.
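A very crude version of that idea is to re-express the samples collected before a hardware change as the utilization the same work would have produced on the new configuration. The sketch below assumes utilization scales inversely with a single capacity figure, which a real modeling tool would not take for granted; the function name and the data are invented for the example.

```python
def rescale_history(samples, change_index, old_capacity, new_capacity):
    """Rescale utilization samples taken before change_index so the whole
    series is expressed in terms of the new configuration.

    Assumes utilization is inversely proportional to capacity.
    """
    if old_capacity <= 0 or new_capacity <= 0:
        raise ValueError("capacities must be positive")
    factor = old_capacity / new_capacity
    return [u * factor if i < change_index else u
            for i, u in enumerate(samples)]


# Hypothetical weekly utilization; capacity doubled before the fourth sample.
raw = [60.0, 64.0, 68.0, 36.0, 38.0, 40.0]
adjusted = rescale_history(raw, change_index=3,
                           old_capacity=100.0, new_capacity=200.0)
print(adjusted)
```

A trend fitted to the raw series would show utilization falling, while the adjusted series shows the steady growth that was there all along.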

Rich Fronheiser
Consultant
