Monday, 24 October 2011

Cloud Computing - Complexity, Cost, and Capacity Management

Computer systems have always been complex.  The range of useful work computers can do is extended by layering complexity on top of the basic machine.  Current “Cloud Computing” capabilities are no exception.  Regardless of complexity, the costs of systems must be understood and reported to enable business planning and management.

So what are the perspectives that need to be understood, reported on and normalized to enable comparison and calculation of unit costs?  The business likes to look at total costs, irrespective of how its service is provided.  It is right to do so – what happens under the covers to deliver that service shouldn’t be its concern.  It just wants to know that the required service level is being provided and what that costs per business transaction or process.

On the systems side, it used to be relatively simple.  Internal systems were the norm.  We had to account for the costs of hardware, software, floor space, power, air conditioning, ancillary costs such as insurance and, of course, staff costs.  As applications and services became more interlinked and disparate in implementation, it became ever harder to compare and calculate costs for a particular service delivered to users.

Outsourcing and now the Cloud have added yet more levels of complexity.  On one level it seems simple: we pay a price for an outsourced provision (an application, hardware, a complete data center or whatever).  In practice it becomes ever more difficult to isolate costs.  Service provision from outside our organization is often offered at different tiers of quality (Gold, Silver, Bronze, etc.).  Each tier comes with different service levels and different levels of provision; for example, base units of provision and overage rates vary from tier to tier, which makes direct comparison awkward.
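One way to make tiered offerings comparable is to work out the effective monthly cost of each tier at your expected usage level.  The sketch below shows the shape of that calculation; the tier names, base charges, included units and overage rates are invented for illustration, not taken from any real provider's price list.

```python
# Illustrative only: tier definitions are invented, not a real price list.
TIERS = {
    #          base charge, included units, overage rate per extra unit
    "Gold":   {"base": 5000.0, "included": 10000, "overage": 0.40},
    "Silver": {"base": 3000.0, "included": 5000,  "overage": 0.60},
    "Bronze": {"base": 1500.0, "included": 2000,  "overage": 0.80},
}

def monthly_cost(tier, units_used):
    """Base charge plus overage for units consumed beyond the included allowance."""
    t = TIERS[tier]
    overage_units = max(0, units_used - t["included"])
    return t["base"] + overage_units * t["overage"]

# Compare the tiers at an expected usage level to see which works out cheapest.
expected_usage = 8000
for name in TIERS:
    print(f"{name}: {monthly_cost(name, expected_usage):,.2f} per month")
```

Even this simple comparison only holds at one usage level; as usage grows or shrinks, the cheapest tier can change, which is exactly why direct comparison is awkward.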

Increasingly, the model is to mix all of these modes of service provision: for example, hybrid Cloud implementations in which internal and external Cloud provision and traditional internal services are combined to deliver what the user needs.

Each facet of systems use can be monitored and accounted for in terms of resource utilization and, ultimately, dollar costs.  However, overly detailed data quickly adds volume and cost, becomes unwieldy, and delays analysis and reporting, while overly simplified data weakens analysis and reduces the quality of decision support.  The monitoring points and the level of detail collected are driven by trade-offs between cost, utility, and performance, and those trade-offs are both detailed and dynamic.  Frequently, though, data collection is minimized and aggregated to a level that obscures the detail needed for some decisions.  For example, CPU metrics aggregated to 5-minute periods are suitable for capacity planning but are not very useful for understanding CPU consumption by individual transactions, which is a performance engineering concern.
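To make the granularity point concrete, here is a small sketch using made-up one-second CPU samples: a 5-minute average is perfectly adequate for capacity planning, yet it completely hides the short transaction-driven burst that a performance engineer cares about.

```python
import random

# Made-up data: one CPU-busy percentage sample per second, for 5 minutes (300 samples).
random.seed(1)
samples = [random.uniform(5, 15) for _ in range(300)]

# A ten-second transaction-driven burst that the 5-minute average will smooth away.
for i in range(150, 160):
    samples[i] = 95.0

five_minute_average = sum(samples) / len(samples)
peak_second = max(samples)

print(f"5-minute average CPU: {five_minute_average:.1f}%  (fine for capacity planning)")
print(f"Peak 1-second CPU:    {peak_second:.1f}%  (what the performance engineer needs)")
```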

A distinction perhaps needs to be made between different types of costs.  We might need to move towards calculating regular ongoing fixed costs for users, supplemented by variable costs based on changing circumstances.  To my mind this is a little like having the running costs of your car covered by a general agreement (free servicing for 3 years), with the qualifying criteria any insurance business likes to slip in (no more than 20,000 miles per year of standard personal motoring, consumables such as tires and wiper blades not included).  If we go outside the qualifying criteria, we have to pay for individual issues to be fixed.
Cloud in particular lends itself to costing IT services based on a fixed charge plus variable costs dependent on usage.
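A minimal sketch of that charging model, using invented rates and usage figures, might look like this: a fixed monthly charge covers the agreed baseline service, and anything outside the baseline is billed per unit, much like the car analogy above.

```python
# Illustrative charge-back sketch: all rates and usage figures are invented.
FIXED_MONTHLY_CHARGE = 2000.0          # covers the agreed baseline service

VARIABLE_RATES = {                     # per-unit rates for usage outside the baseline
    "extra_compute_hours": 0.12,
    "extra_storage_gb":    0.05,
    "support_incidents":   75.00,
}

def monthly_charge(usage):
    """Fixed charge plus variable costs for each metered item above the baseline."""
    variable = sum(VARIABLE_RATES[item] * qty for item, qty in usage.items())
    return FIXED_MONTHLY_CHARGE + variable

# One user's month: baseline service plus some chargeable extras.
print(monthly_charge({"extra_compute_hours": 340,
                      "extra_storage_gb": 500,
                      "support_incidents": 2}))
```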

Going back to those complex systems: we need to ensure we normalize our view of services across this modern, multi-mode IT model, a fancy way of saying we must compare apples with apples.  The key is being able to define a transaction from a business perspective and relate it to the IT processing that underlies that definition.
Application Transaction Management tools such as the Sharepath software distributed by Metron enable you to get this transaction visibility across diverse infrastructures, internal and external. 
Capacity Management data capture and integration tools like Metron’s Athene then allow you to relate underpinning resource metrics to those transactions, at least for systems where you are able or allowed to measure those resources. 
This brings us to a last key point about external suppliers, whether outsourcers or Cloud providers.  You need to ensure that they either provide you with resource-level data for their systems or give you the ability to access your own systems and gather that data yourself.  Once you have the numbers, calculating cost per transaction is just math.  Without the numbers, you can’t calculate the cost.
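Once the resource-level numbers are available, the "just math" step is essentially an apportionment: spread each resource cost pool across transaction types in proportion to measured consumption, then divide by transaction volumes.  The cost pools, consumption shares and volumes below are invented purely to show the shape of the calculation.

```python
# Illustrative only: cost pools, consumption shares and volumes are invented.
COST_POOLS = {"cpu": 12000.0, "storage": 4000.0, "network": 2000.0}  # monthly cost per resource

# Fraction of each resource consumed by each business transaction type (measured, ideally).
CONSUMPTION = {
    "place_order":   {"cpu": 0.50, "storage": 0.30, "network": 0.40},
    "check_balance": {"cpu": 0.30, "storage": 0.20, "network": 0.40},
    "run_report":    {"cpu": 0.20, "storage": 0.50, "network": 0.20},
}

VOLUMES = {"place_order": 200000, "check_balance": 500000, "run_report": 5000}  # monthly counts

for txn, shares in CONSUMPTION.items():
    total_cost = sum(COST_POOLS[res] * share for res, share in shares.items())
    print(f"{txn}: {total_cost / VOLUMES[txn]:.4f} per transaction")
```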
The business increasingly wants to see the costs of its services, so make sure you demand access to the information you need from your suppliers, as well as from your own organization.  Then you can put together a comprehensive and realistic view of the cost of services in today’s multi-tiered internal and external application world.

GE Guentzel
Consultant
http://www.metron-athene.com/
