Monday, 10 November 2014

The Finance House that was doing effective capacity management but wanted to improve governance (Mind the Gap series, 6 of 10)

The Finance House, Triplex Finance House (TFH), was known to have a large, highly qualified team of IT professionals: nearly 2000 staff, about half of them involved in development, plus up to another 1000 sub-contractors called in on demand for major projects.

The service provided is so critical that downtime has to be minimized and performance optimized.

Although there is a mix of domains, applications were becoming more and more multi-tier, so it was felt that the capacity plan needed to be enterprise-wide. However, the degree of metrics and planning in each domain is somewhat variable. Also, aggregating a number of separate plans from different authors into a single document takes a lot of time and many editorial passes before it is acceptable to all. Such a large document tends to develop a life of its own.

Although the capacity management processes were in place, the coverage was not complete and a lot of reports were out of date. Changes to the Configuration Management Database (CMDB) were not advised to the Capacity Management Team (CMT), so a significant number of essentially defunct machines were still being reported on by the various hand-crafted reporting regimes that had built up over the years.
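The defunct-machine problem described above is essentially a reconciliation between what the CMDB says and what the reports cover. A minimal sketch of that check follows. The host names, the status values and the `reconcile` helper are all hypothetical, and it assumes a CMDB export and the list of reported hosts are available as simple Python collections.

```python
# Hypothetical data: in practice these would come from a CMDB export
# and from the host lists used by each capacity report.
cmdb_status = {
    "app01": "live",
    "app02": "live",
    "db01": "decommissioned",
    "web07": "decommissioned",
}
reported_hosts = {"app01", "db01", "web07", "batch03"}


def reconcile(cmdb_status, reported_hosts):
    """Compare the hosts being reported on with the CMDB view of the estate."""
    live = {host for host, status in cmdb_status.items() if status == "live"}
    # Reported on, but the CMDB says the machine has gone.
    defunct = sorted(h for h in reported_hosts if cmdb_status.get(h) == "decommissioned")
    # Reported on, but not in the CMDB at all.
    unknown = sorted(reported_hosts - set(cmdb_status))
    # Live in the CMDB, but no report covers it.
    unreported = sorted(live - reported_hosts)
    return {"defunct": defunct, "unknown": unknown, "unreported": unreported}


result = reconcile(cmdb_status, reported_hosts)
for category, hosts in result.items():
    print(f"{category}: {', '.join(hosts) or '-'}")
```

Run regularly, a check of this kind would flag both the defunct machines still being reported on and the live machines that no report covers.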

The main areas needing enhancement were in communication with other teams, such as development and testing.

Service Level Agreements (SLAs) had some performance criteria, essentially on throughput and often related to what were effectively batch jobs updating the data warehouse.

The initial gap analysis found that:

Services had been effectively categorized but the service catalogue was still emerging.

Resource/component capacity management processes were well established but service capacity management was just being introduced for some category 1 services.

Business capacity management was identified as the next stage and would require more work on business drivers, KPIs and QoS.

An initial dashboard for management on the capacity management process itself was well structured but was completed to show everything as “all green”. This had the unsurprising but unanticipated effect that the next request for expenditure was rejected, as there were no current problems.

The reporting process was then designed to incorporate an accurate reflection of some key metrics, such as coverage of services, servers, deliverables and underpinning metrics.
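To illustrate how coverage figures can replace a blanket “all green”, here is a small sketch that turns covered and in-scope counts into a percentage and a red/amber/green status. The counts, the 90% and 70% thresholds and the function names are assumptions made for the example, not figures from the TFH study.

```python
def coverage(covered, total):
    """Return coverage as a percentage, or 0.0 when nothing is in scope."""
    return 100.0 * covered / total if total else 0.0


def rag(percent, green_at=90.0, amber_at=70.0):
    """Map a coverage percentage to red/amber/green (thresholds are assumed)."""
    if percent >= green_at:
        return "green"
    if percent >= amber_at:
        return "amber"
    return "red"


# Illustrative counts only: (items covered by capacity reporting, items in scope).
counts = {
    "services": (18, 40),
    "servers": (850, 1000),
    "deliverables": (19, 20),
}

dashboard = {}
for name, (covered, total) in counts.items():
    percent = coverage(covered, total)
    dashboard[name] = (percent, rag(percent))
    print(f"{name:<13}{percent:>6.1f}%  {dashboard[name][1]}")
```

A dashboard built from measured coverage like this shows where the gaps are, which is what a request for expenditure needs to point to.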

The main conclusions from the study were the need to enhance:

Quality of information and data flows between the CMT and:

- service management

- change management

- configuration management

- production monitoring

Measures of business drivers and KPIs

Reporting to track and predict SLA violations

Service-server mapping and configuration changes

Ability to clone reports with automatic trending, etc.

Ability to analyse reports and identify bottlenecks

Workload characterization and forecasting

Modeling scenario handling

Essentially it revealed a need for more coverage, better information flow and more automation.
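One of the needs listed above, reporting to track and predict SLA violations, can be sketched with a simple linear trend. The example below assumes a hypothetical weekly series of elapsed minutes for a batch job and a hypothetical SLA limit of 140 minutes. It fits a least-squares line and estimates how many weeks remain before the trend reaches the limit. Real workloads are rarely this linear, so treat it as a starting point rather than a forecasting method.

```python
def linear_fit(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; needs two or more samples."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def weeks_until_breach(ys, limit):
    """Weeks after the last sample until the trend line reaches the limit.

    Returns None when the trend is flat or falling, and 0.0 when the
    trend line is already at or above the limit.
    """
    slope, intercept = linear_fit(ys)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - (len(ys) - 1))


# Hypothetical weekly elapsed times (minutes) for a data warehouse batch job.
batch_minutes = [100, 104, 108, 112, 116, 120]
sla_limit = 140  # assumed SLA limit in minutes

remaining = weeks_until_breach(batch_minutes, sla_limit)
print(f"Projected weeks until SLA breach: {remaining}")
```

With these sample figures the job grows by four minutes a week, so the trend line reaches the limit five weeks after the last measurement.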

On Wednesday I’ll be looking at a Retailer. In the meantime, don't forget to register for our next webinar, 'Demystifying z/OS Capacity Management for distributed planners'.

Adam Grummitt
Distinguished Engineer
