Monday 29 December 2014

Cloud Concerns - Top 5 Key Capacity Management Concerns for UNIX/Linux (8 of 12)


To understand our Capacity and Performance concerns around cloud, we need to fully understand what the cloud is and how it relates to UNIX/Linux systems.  As mentioned previously, cloud is underpinned by virtualization, because virtualization makes it possible to provision systems rapidly, offer services such as Capacity on Demand, and charge accordingly (a simple chargeback sketch follows the list below).

Cloud (Internal, External, Hybrid)

      Rapid Provisioning

      Metering (Chargeback)

      Capacity on Demand

      Resource Pooling and Elasticity
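
As a simple illustration of metering and chargeback, the sketch below derives a monthly charge from metered resource usage.  The unit rates and usage figures are hypothetical, not any provider's actual pricing; it is just a minimal Python example of the idea.

    # Minimal chargeback sketch: bill metered usage at assumed unit rates.
    # Rates and usage figures below are hypothetical.
    rates = {"vcpu_hours": 0.04, "ram_gb_hours": 0.01, "storage_gb_months": 0.05}

    usage = {"vcpu_hours": 2 * 720,        # 2 vCPUs for a 720-hour month
             "ram_gb_hours": 8 * 720,      # 8 GB of RAM for the month
             "storage_gb_months": 100}     # 100 GB allocated

    bill = sum(rates[item] * usage[item] for item in rates)
    print(f"Monthly charge: {bill:.2f}")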

Hybrid popularity is growing.

At present less security-sensitive tasks are being handled in external clouds - The initial use of Hybrid clouds (a mixture of Public and Private) was to run tasks that were not classed as data sensitive.  The use of Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) in a Public cloud was very much favored by Test and Development departments to get quick and easy access to resources.

Increasing confidence in security - As security has improved and individuals' and businesses' confidence in Public clouds has increased, the popularity of the Hybrid cloud is growing, providing an extension to business systems as and when required.

How easy is access to data? - How can we apply effective Capacity Management to a Hybrid cloud?  We can and should have access to data from within an Internal cloud, and we should also know which applications are running on which systems and sharing which resources.  Getting the same information from an External cloud provider, however, is not easy - if you can get it at all.

Would we want to and do we care? - Does the cloud provider give guarantees for availability, capacity and performance? 

SLAs and appropriate penalties if breaches occur - Do you have, or need to have, any Service Level Agreements in place?  If so, what penalties apply in the event of a breach or a number of breaches?
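
For instance, a basic check of measured monthly availability against an agreed SLA target might look like the sketch below.  The target, penalty and outage figures are hypothetical, purely to illustrate the arithmetic.

    # Sketch: compare measured availability to an SLA target and flag a breach.
    # Target, penalty and outage minutes are hypothetical.
    sla_target = 99.9              # percent availability per month
    penalty_per_breach = 5000      # assumed penalty, in whatever currency applies

    minutes_in_month = 30 * 24 * 60
    outage_minutes = 55

    availability = (minutes_in_month - outage_minutes) / minutes_in_month * 100
    print(f"Availability: {availability:.3f}% (target {sla_target}%)")
    if availability < sla_target:
        print(f"SLA breached - penalty due: {penalty_per_breach}")
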
Hopefully that gives you food for thought.
I'll be looking at another hot topic next time, Big Data - what it is and what the concerns are.  We've got some great webinars available to download on these and other capacity management topics, so sign up to access them for free.
Jamie Baker
Principal Consultant

Friday 26 December 2014

2014 Holiday Reflections...

Another holiday season is upon us -- many of us have celebrated Christmas Day with our families and are now (depending on where we are) celebrating Boxing Day or, perhaps, are back to work.

Today, I'm going to sprinkle some of my personal reflections into this blog entry -- between plates of ham, trips to the gym (to work off the ham), and an indoor softball tournament played by my soon-to-be 10-year-old daughter.

This is my 12th Christmas with Metron and I know there are quite a few people with many more than this -- a real testament to the company and the people I work with on a daily basis.

Every year the Taunton (UK) office has a company lunch in mid-December.  This year, I happened to be spending a week in Taunton and was able to join the staff in the festivities.

The food was wonderful, the company was fantastic, and I extended the experience by spending the next few hours with a much smaller group at a neighboring pub.  I got to know people I'd never had the pleasure of spending time with, and vice-versa -- to them I was someone who was only a voice on a telephone.  I became real to them, as well.  It was a real highlight of the year for me.

At Metron, one of the things we pride ourselves on is personalized service to our clients.  I can tell many stories over the last decade-plus of how we've had customers put in requests for our software that would help them serve their clients better -- and the internal conversation that followed was wholly centered on how we could make that happen, if possible.  Indulge me as I tell but one of these stories.

I was new to Metron -- this was early in 2004.  A new client was taking a week's worth of athene® training -- on-site at their offices.  During the training, one of the people suggested that there was a better way for their company to display data in a particular chart.  I took a note, and after the class I sent an email off to the development manager.  The next morning, as I checked my emails, I found one from a developer that included a patch that did exactly what the new client wanted.  The 6-hour time zone difference meant that I was able to go back to the classroom the very next day with a solution I could show.  The client was, notably, impressed.

This was actually an easy patch to create, but as a person who had experience in much larger organizations, I was amazed at how quickly this was turned around -- developed, tested, sent.  Depending on the request, these things aren't always possible, but with Metron they have been a lot more likely than in some other places I've been, mainly because the question "What can we do?" is always asked.

For us, as well as our customers, business is 24x7, 365 days a year.  For our retail clients, this season is the most important time of the year -- having adequate capacity and performance to meet customer expectations is crucial for them.  As it is for all our clients during the time periods that matter most to them.  That's what capacity management is all about -- having adequate capacity to meet SLAs during the periods of peak demand.  That's what good Capacity Managers plan for -- and that's where their investment in athene® and in Metron pays the most dividends.

And as we head into 2015, Metron and its dedicated staff will be there to provide software and support, education, and services to help organizations and individuals improve how they ensure service to their clients.  As we have every year since our founding in 1986.

Stick around and see what Metron will be up to in 2015.  Till then, more ham, more running, more softball, more ham...

Happy Holidays, everyone.

Rich Fronheiser
Chief Marketing Officer

Linux differences - Top 5 Key Capacity Management Concerns for UNIX/Linux (7 of 12)


Following on from the development and introduction of UNIX, and the creation of the GNU General Public License (GNU GPL) to allow software to be shared freely, many of the programs required in an OS (such as libraries, compilers, text editors, a UNIX shell, and a windowing system) had been completed by the early 1990s, but a few elements, such as device drivers, daemons, and the kernel, remained incomplete.

In 1991, Linus Torvalds, working with MINIX (a Unix-like OS originally developed for academic use), began writing his own kernel.  The first Linux kernel was released on 17 September 1991 for Intel x86 PC systems and was subsequently licensed under the GNU GPL.  Combined with various system utilities and libraries from the GNU project, it created a usable operating system whose underlying source code can be freely modified and used.

Linux is great for small- to medium-sized operations, and today it is also used in large enterprises where UNIX was previously considered the only option.  A few years ago, most big enterprises, where networking and multi-user computing are the main concerns, didn't consider Linux an option.  But today, with major software vendors porting their applications to Linux, and with the OS freely distributed and installed on pretty much anything from mobile phones to supercomputers, it has entered the mainstream as a viable option for web serving and office applications.

The main commercial virtualization offerings, VMware's vSphere and Microsoft's Hyper-V, both support Linux as a guest operating system, and alongside Windows it is the most popular virtualized operating system within organizations.

Its popularity has grown to a user base estimated to be in excess of 25 million, thanks to its use in embedded technologies and to the fact that it is free and easily available.
I'll be covering Cloud Concerns next time: to understand our Capacity and Performance concerns around cloud, we need to fully understand what the cloud is and how it relates to UNIX/Linux systems.

Jamie Baker
Principal Consultant

Monday 22 December 2014

Data - Top 5 Key Capacity Management Concerns for UNIX/Linux (6 of 12)


To produce and analyze the recommended performance reports we need to capture and store performance data.  This process is normally performed by installing an agent responsible for running UNIX/Linux system tools such as sar, vmstat and iostat to capture information, or for running (potentially intrusive) kernel commands.  As with any data capture, whether local or remote, you'd expect to incur some overhead.  Typically an agent should incur no more than 1% CPU usage when capturing data; however, as mentioned, some agents may incur more.
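
To make the idea concrete, here is a minimal sketch of periodic data capture built on the standard vmstat tool.  This is not how any particular agent (including Metron's) works; it simply shows the pattern of sampling at an interval and storing the results.  Column positions follow the common procps vmstat layout and may differ between versions.

    # Minimal capture sketch: sample vmstat periodically and append to a CSV.
    # Assumes a Linux system with procps vmstat on the PATH.
    import csv
    import subprocess
    import time

    def capture_vmstat(samples=5, interval=60, outfile="vmstat_samples.csv"):
        with open(outfile, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["timestamp", "run_queue", "free_kb",
                             "cpu_user", "cpu_sys", "cpu_idle", "cpu_wait"])
            for _ in range(samples):
                # "vmstat 1 2": the last output line is a 1-second interval sample
                out = subprocess.run(["vmstat", "1", "2"],
                                     capture_output=True, text=True, check=True)
                fields = out.stdout.strip().splitlines()[-1].split()
                # Field positions for procps vmstat: r=0, free=3, us=12, sy=13, id=14, wa=15
                writer.writerow([time.strftime("%Y-%m-%d %H:%M:%S"),
                                 fields[0], fields[3], fields[12],
                                 fields[13], fields[14], fields[15]])
                time.sleep(interval)

    if __name__ == "__main__":
        capture_vmstat()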

In addition, when capturing data, can you rely on what it is reporting?  Remember, this is software and software can contain bugs.  But, you say, "we have to rely on what the operating system gives us", and this is true to some extent.  In my experience there are several tools that provide this information within the UNIX operating system - some are accurate and some are not.
For example: Does your Linux system have the sysstat package installed, and is it an accurate and reliable version?  Or, in Solaris Containers, are the Resident Set Size (RSS) values being incorrectly reported due to double counting of memory pages?  An example of this is shown below.

Zone Memory Reporting



This report is an interesting one.  It shows the amount of RSS memory per zone against the total memory capacity of the underlying server (shown in red).
Because RSS values are double counted, the sum of RSS across the zones far exceeds the actual physical memory capacity.
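
The effect is easy to demonstrate with some illustrative numbers.  The per-zone RSS figures below are hypothetical (the sort of values you might read from prstat -Z or zonestat); the point is that shared pages are counted once per zone, so the sum can exceed the physical memory installed.

    # Illustrative only: hypothetical per-zone RSS figures (GB) showing how
    # double-counted shared pages make the per-zone sum exceed physical memory.
    physical_memory_gb = 64

    zone_rss_gb = {
        "zone-web": 22,
        "zone-app": 28,
        "zone-db":  35,
    }

    total_rss = sum(zone_rss_gb.values())
    print(f"Sum of per-zone RSS: {total_rss} GB vs physical capacity: {physical_memory_gb} GB")
    if total_rss > physical_memory_gb:
        print("RSS sum exceeds physical memory - shared pages are counted in every zone.")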

I'll be looking at Linux differences next.

Jamie Baker
Principal Consultant

Friday 19 December 2014

Performance concerns - Top 5 Key Capacity Management Concerns for Unix/Linux (5 of 12)


Just think about this for a moment – it’s just you and your workstation.  No contention.  Introduce a server which hosts one or many applications and can be accessed by one or many users simultaneously, and you are likely to get contention.  Where contention exists you get queuing, and inevitably response times will suffer.
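
As a rough illustration of why queuing hurts response times, the simple M/M/1 approximation R = S / (1 - U), where S is service time and U is utilization, shows response times climbing sharply as utilization approaches 100%.  The figures below are hypothetical and not tied to any particular workload.

    # M/M/1 illustration: response time R = S / (1 - U),
    # where S is the service time and U is the utilization.
    service_time_ms = 10.0   # hypothetical service time

    for utilization in (0.2, 0.5, 0.8, 0.9, 0.95):
        response_ms = service_time_ms / (1.0 - utilization)
        print(f"Utilization {utilization:.0%}: response time ~{response_ms:.0f} ms")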

With UNIX/Linux virtualization we have two main concerns: 

1) Will virtual systems hosted on shared hardware (via a hypervisor) impact one another?

2) What additional overhead and performance impact does requesting hardware resources through a hypervisor incur?

To answer question 1, we have to understand what resources each virtual machine is requesting at the same time.  So it is prudent to produce performance reports for each hosted VM. 

Do they conflict with each other or complement each other?  There are virtualization tools that have been developed to keep VMs apart if they conflict, and together if they complement each other, to improve performance.
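
One simple way to judge conflict versus complement is to compare the CPU demand profiles of co-hosted VMs over time.  The sketch below uses Pearson correlation on hypothetical hourly utilization figures; it is not any vendor's placement algorithm, just an illustration of the idea.

    # Sketch: positively correlated CPU profiles suggest the VMs peak together
    # (conflict); negatively correlated profiles suggest their peaks interleave
    # (complement).  Hourly utilization figures are hypothetical.
    from statistics import correlation  # requires Python 3.10+

    vm_a = [20, 25, 60, 80, 75, 30, 20, 15]   # % CPU per hour
    vm_b = [18, 22, 55, 78, 70, 28, 18, 12]   # peaks with vm_a -> likely conflict
    vm_c = [70, 65, 25, 15, 20, 60, 72, 80]   # peaks opposite vm_a -> complements it

    print("A vs B:", round(correlation(vm_a, vm_b), 2))
    print("A vs C:", round(correlation(vm_a, vm_c), 2))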

To answer question 2: initial software virtualization techniques, such as binary translation, allowed existing x86 operating systems to be virtualized.  Hardware call requests had to go via the hypervisor, adding an additional overhead of somewhere between 10% and 20% on average.  As time has progressed, a combination of hardware-assisted virtualization and paravirtualized operating systems has reduced this overhead and improved virtualized application response times.

Paravirtualization is a virtualization technique that presents a software interface to VMs that is similar, but not identical to that of the underlying hardware. The intention is to reduce the guest's execution time spent performing operations which are substantially more difficult to run in a virtual environment compared to a non-virtualized environment. It provides specially defined 'hooks' to allow the VMs and underlying host to request and acknowledge these tasks, which would otherwise be executed in the virtual domain (where execution performance is worse). A successful paravirtualized platform may allow the virtual machine monitor (VMM) to be simpler (by relocating execution of critical tasks from the virtual domain to the host domain), and/or reduce the overall performance degradation of machine-execution inside the virtual-guest.
To produce and analyze the recommended performance reports we need to capture and store performance data, and I'll be discussing this on Monday.
Sign up to our Community, where you'll gain access to some great white papers and downloads of our webinars.
Jamie Baker
Principal Consultant 

Wednesday 17 December 2014

VM Principles - Top 5 Key Capacity Management Concerns for Unix/Linux (4 of 12)


VM Principles
The following principles of virtual machines hold true across all virtualized environments provided within the IT industry (the examples below are drawn from Solaris Zones).

      Isolation

    Own root (/),  processes, files, security

      Virtualization

    Instance of Solaris O/S

      Granularity

    Resource allocation, Pools

      Transparency

    Standard Solaris interfaces

      Security

    No global reboots, isolated within each zone
 
On Friday I'll look at performance concerns. 
Take a look at our online workshops for 2015 in the meantime http://metron-athene.com/services/training/online-workshops/index.html
Jamie Baker
Principal Consultant

Monday 15 December 2014

Use of Internal/External/Hybrid clouds - Top 5 Key Capacity Management Concerns for Unix/Linux (3 of 12)

Cloud technology, whether the resources are internally or externally provided, is underpinned by virtualization.  Virtualization provides the following three minimum characteristics for cloud services:

      On demand self-service

      Resource Pooling

      Rapid Elasticity

 Virtualization

      Why virtualize? It’s an important question and one that shouldn’t be taken lightly.  There are many benefits to virtualization, as I have already mentioned, but it shouldn’t be applied as a broad-brush approach.  In my experience it is better to perform some kind of pre-virtualization performance analysis before actually committing to virtualizing an application, even if the vendor tells you it is OK (a sketch of such a check follows this list)!  I am a big advocate of virtualization, as it provides great flexibility, reduces hardware and associated costs, and allows you to manage your estate from a single pane of glass.

      Commercial Vendors

    IBM, HP, Oracle, VMware, Microsoft

Looking specifically from a Capacity and Performance viewpoint, virtualizing applications can in some instances prove detrimental to performance.  Even though virtualization allows multiple virtual systems to be hosted, those hosts are still physical servers with finite capacity, and I’ll cover this later on.

      Open source (Linux)

    KVM
    Xen

Underpins cloud technology - As mentioned previously, virtualization underpins Cloud technology.  Cloud services can be provided through on-demand resourcing, flexibility and quick provisioning.  There are many more cloud services now available than the traditional Infrastructure, Platform and Software.

    IaaS, PaaS, SaaS
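
Returning to the pre-virtualization analysis mentioned above, the sketch below shows the kind of simple headroom check I mean: do the candidate workloads' combined peak demands, plus an assumed hypervisor overhead, fit on the target host with some headroom to spare?  All figures, including the 10% overhead and the 80% headroom target, are hypothetical assumptions, and summing peaks is deliberately conservative, since the peaks may not coincide.

    # Naive consolidation check with hypothetical figures.
    HOST_CPU_GHZ = 2.6 * 16      # e.g. 16 cores at 2.6 GHz
    HOST_MEM_GB = 128
    OVERHEAD = 1.10              # assume ~10% hypervisor overhead
    HEADROOM = 0.80              # plan to use no more than 80% of the host

    candidates = {               # name: (peak CPU GHz, peak memory GB)
        "web01": (4.0, 12),
        "app01": (9.5, 24),
        "db01":  (14.0, 48),
    }

    cpu_needed = sum(cpu for cpu, _ in candidates.values()) * OVERHEAD
    mem_needed = sum(mem for _, mem in candidates.values()) * OVERHEAD

    cpu_limit = HOST_CPU_GHZ * HEADROOM
    mem_limit = HOST_MEM_GB * HEADROOM
    print(f"CPU needed {cpu_needed:.1f} GHz of {cpu_limit:.1f} GHz available")
    print(f"Memory needed {mem_needed:.1f} GB of {mem_limit:.1f} GB available")
    print("Fits with headroom" if cpu_needed <= cpu_limit and mem_needed <= mem_limit
          else "Does not fit - revisit placement or host size")
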
Oracle Enterprise Linux - Oracle's offerings in the UNIX/Linux space are Oracle Enterprise Linux, which is a clone of Red Hat Enterprise Linux, and Solaris, which came with its acquisition of Sun.  Both operating systems provide virtualization offerings; however, Linux coverage is growing quite substantially with its support for tablets and phones and its database coverage.
The topic of cost differences for Linux over Solaris or AIX is a whole paper in itself.  Both RHEL and Solaris are free and open source and are supported on x86 architecture; however, entry-level x86 hardware is cheaper than UltraSPARC hardware, unless you have the capacity requirements for the UltraSPARC entry offerings.
On Wednesday I'll be looking at virtual machine principles. In the meantime don't forget to sign up for our 'Capacity Management 101' webinar http://metron-athene.com/services/training/webinars/index.html
 
Jamie Baker
Principal Consultant

Friday 12 December 2014

Licensing concerns - Top 5 Key Capacity Management Concerns for Unix/Linux (2 of 12)


As most applications are licensed by the number of host CPUs, there is a concern about the cost of hosting applications on systems with a fixed number of physical cores.  Virtual systems now provide the flexibility to change the number of vCPUs on a running system or, in most cases, after a reboot.

KPIs
Due to this additional complexity, effective Capacity Management is even more important, specifically where services are sharing both physical and virtualized resources.  With the number of systems being virtualized increasing year on year, there are a number of Key Performance Indicators (KPIs) we should be closely monitoring (a simple tracking sketch follows this list):

    Reduction in physical estate (servers)

    Reduction in physical space

    Reduction of power usage, cooling, associated costs
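
A trivial way to keep these KPIs visible is to track them against a baseline.  The sketch below uses hypothetical figures purely to show the calculation.

    # Sketch: tracking consolidation KPIs against a baseline (hypothetical figures).
    baseline = {"physical servers": 400, "rack space (U)": 1600, "power (kW)": 220}
    current  = {"physical servers": 260, "rack space (U)": 980,  "power (kW)": 150}

    for kpi, before in baseline.items():
        after = current[kpi]
        reduction = (before - after) / before * 100
        print(f"{kpi}: {before} -> {after} ({reduction:.0f}% reduction)")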

The 1990s and the early part of the millennium witnessed a rise in the number of physical distributed systems (UNIX/Windows) to a point where physical data center space and energy demand were becoming an issue.  The introduction of virtualization has helped address these issues by allowing multiple low-usage systems to be virtualized onto a single piece of hardware.  This, in turn, initially reduced the number of physical systems hosted within a data center, which removed the need for additional space and/or allowed that space to be reclaimed.

The reduction in physical components and servers also allowed for a reduction in power usage, and if reclaiming space and/or moving systems to a smaller data center was feasible, then the amount of power dedicated to cooling could be reduced.
On Monday I'll be taking a look at Cloud technology.
Jamie Baker
Principal Consultant

Wednesday 10 December 2014

Top 5 Key Capacity Management Concerns for Unix/Linux (1 of 12)


Since UNIX systems were developed back in the 1970s, things, as you’d expect, have moved on a long way.  Single-CPU systems the size of washing machines have been replaced with many racks of multi-core (hyper-threaded) blade servers.

In more recent years, the re-introduction of virtualization allowed for multiple virtual systems to be hosted on a single piece of hardware via a hypervisor.  Modern data centers will typically host many thousands of both physical and virtual servers.

Physical and Virtual hosts

    Web Servers

    Database hosts
The mix of physical and virtual tends to depend on what applications are being hosted.  It is likely that physical servers will be hosting large RDBMS instances and virtual servers hosting web applications.
On Friday I'll be taking a look at licensing concerns.

Jamie Baker
Principal Consultant
 

Wednesday 3 December 2014

I’m an old mainframer, I admit it - Highlights from Guide Share Europe Conference

I was in Systems Programming for the first 20 years of my working life and I can’t quite let go of that even though I’ve been working in Capacity Management for almost as long.  I was delighted to be able to attend the recent Guide Share Europe conference at the beginning of November, at Whittlebury Hall in Northamptonshire.  Metron tries to get there as often as possible to keep our company’s name and offerings in the consciousness of the mainframe community.  For several years now there’s been an air of almost sadness about the place.  This year, what a difference.  The exhibition hall was humming with vendor stalls well laid out around two adjacent rooms that encouraged people to move around, mingle, chat, eat and drink without feeling on top of each other.


Many streams of sessions were available, covering a wide range of topics.  There were “101” classes for people new to the mainframe world, “what’s new” sessions for the features and facilities of recent announcements, there were round-table discussions, technical workshops and personal development sessions.


Always of great interest is the session with Mark Anzani of IBM, despite being at 8am the morning after the night before – if you’ve ever been to a GSE conference dinner you’ll know what I mean.  Mark has the marvellous title of “VP, System z Strategy, Resilience and Ecosystem”.  His sessions are always full to the doors, and it’s not just the bacon baps and strong coffee that draw people in.  This year didn’t disappoint.  The direction the System z hardware and software is taking is (as it has been for so long that people forget) quietly revolutionary.  Ideas about quantum computing, neuro-synaptic chips and nano-photonics on the hardware side were complemented by software developments – tamper-proof processing, self-healing, self-optimizing and an accelerated push to CAMS – Cloud, Analytics, Mobile and Security.


On the capacity management side, as previously signalled, the introduction of “hyperthreading” for System z processors came up – and this time it was more than just aspiration.  IBM say they now have machines running in labs that can do this, but they aren’t in a hurry to release them until they fully understand the implications and benefits.  It’ll probably be a year or two before they come to market.  Why is this happening?  For the simple reason that Intel and other chip manufacturers have gone that way – the speed you can make chips run at tops out as you try to balance the power needed to drive them against the heat they generate, and the speed at which electrons can move around circuits.  Top-end System z processors already run at a screaming 5.5 GHz and that isn’t going to get much faster, if at all.  The alternative to going faster is to go wider – that is, do more in each clock tick, or allow other work to run where a thread of work is stalled.  The ability to interleave threads of work on chips means that throughput will be improved, and it’s vital to grasp that concept correctly.  Multiple threads of work on a single core let you get more work done in a given time – it does not make the core faster.  Initially there will be two threads per core, but this will rise with successive newer machines.
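
To make that distinction concrete, here is a small illustrative calculation.  The 1.4x throughput factor is purely a hypothetical assumption, not an IBM figure: with two threads per core, the core as a whole might get through roughly 40% more work per tick, while each individual thread sees less capacity than it would on a dedicated core.

    # Illustrative SMT arithmetic (the throughput factor is a hypothetical assumption).
    single_thread_capacity = 1.00   # work per tick with one thread on the core
    smt_throughput_factor = 1.40    # assumed total throughput with two threads

    per_thread_capacity = smt_throughput_factor / 2
    print(f"Core throughput: {smt_throughput_factor:.2f}x of a single-threaded core")
    print(f"Each thread:     {per_thread_capacity:.2f}x of a dedicated core")
    # More total work per tick, but no single piece of work runs on a faster core.
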
Work like lightweight Java programs, maybe Linux systems running on an IFL could be happy in this environment, but I started to wonder just how traditional heavy batch jobs or CICS systems that love big engines will react.  Perhaps there will be an option to pool processors together that are “hyperthreaded” and some that are not, and work will be directed to each pool as appropriate.
IBM will need to include decent instrumentation so performance and capacity people can keep an eye on the physical usage as well as any logical or virtual usage.  Almost no other operating systems provide this easily – not Windows, not Linux, not Solaris, not HP-UX.  All of them report the utilization of processors as the operating system sees them, not as the underlying hardware is actually being used.  That’s why athene® has its “Core Estimated Utilization” metrics alongside the “Reported Utilization” ones, to provide a view into that mostly invisible information. http://www.metron-athene.com/products/athene_ES1/index.html
IBM does a great job with AIX on System p and OS/400 on iSeries machines by giving you the logical, virtual and physical information about processor activity - let’s hope they continue that into System z and RMF.
Whatever is done, sites or businesses that rely on measures like CPU seconds for billing will need to review and possibly change their accounting software to make sure they continue to provide consistent metering of services. Customers of such facilities will need to check their bills carefully to make sure they aren’t being charged a “logical CPU” cost. 
System z is still recognizably the child of the System/360 machines that came into the world in 1964.  Here’s to ever more amazing changes in the next half a century.
Nick Varley
Chief Services Officer