Metron - Capacity Management: Windows Capacity Management


Monday, 12 September 2016

How to monitor and manage disk - Windows Server Capacity Management 101 (12 of 12)

There are two main things to monitor on disk: occupancy and performance.

Occupancy – use Free Space Ratio %, which shows how much space you have left on the disk.

Performance – to measure performance, use the average response time of reads and the average response time of writes.

How to monitor

Thresholds - the right threshold for disk occupancy depends on how quickly you can get additional disk space and how quickly the disk is filling up, but a good rule of thumb is 70% warning and 80% alarm.

Trends - very important when it comes to disk occupancy, as a trend can show well in advance when you are going to run out of disk space.
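As a rough illustration of the rule of thumb above, here is a minimal Python sketch that turns an occupancy figure into a status. The function names are made up for this example, and it assumes that Free Space Ratio % is the percentage of the disk that is still free, so occupancy is simply 100 minus that value. The 70% and 80% defaults come from the rule of thumb and should be adjusted to your own environment.

```python
def used_pct_from_free_ratio(free_space_ratio_pct):
    """Convert a free-space percentage into an occupancy percentage.

    Assumption: the metric is 'percent of the disk that is free'.
    """
    return 100.0 - free_space_ratio_pct


def classify_disk_occupancy(used_pct, warning_pct=70.0, alarm_pct=80.0):
    """Return 'ok', 'warning' or 'alarm' for a disk occupancy percentage."""
    if not 0.0 <= used_pct <= 100.0:
        raise ValueError("used_pct must be between 0 and 100")
    if used_pct >= alarm_pct:
        return "alarm"
    if used_pct >= warning_pct:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # A disk reporting 25% free space is 75% occupied.
    occupancy = used_pct_from_free_ratio(25.0)
    print(occupancy, classify_disk_occupancy(occupancy))
```

If you can get extra disk quickly and the disk fills slowly, you might pass higher values for warning_pct and alarm_pct; if procurement is slow, lower ones.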

Reports – automate reports

This is a good example of a disk slowly filling up. I could trend on this and easily get the date when I am going to run out of disk space. Having this information is very important in ensuring that there is no downtime for any of my important applications.

I hope my series has been informative. To summarize:
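To show what "trending on this" can look like, here is a minimal sketch that fits a straight line to some occupancy samples and extrapolates to the date the disk would be full. It assumes growth is linear, which is the simplest possible trend and not necessarily what your capacity tool uses. The sample dates and percentages are invented for illustration.

```python
import math
from datetime import date, timedelta


def fit_line(xs, ys):
    """Ordinary least-squares straight-line fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples must cover more than one day")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def projected_date(samples, limit_pct=100.0):
    """Date on which the linear trend reaches limit_pct.

    samples: list of (date, used_pct) pairs, oldest first.
    Returns None if occupancy is flat or falling.
    """
    start = samples[0][0]
    xs = [(d - start).days for d, _ in samples]
    ys = [pct for _, pct in samples]
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None
    days = (limit_pct - intercept) / slope
    # Round away floating-point noise before taking the ceiling.
    return start + timedelta(days=math.ceil(round(days, 6)))


if __name__ == "__main__":
    # Invented weekly samples: the disk grows by 2% a week.
    samples = [
        (date(2016, 9, 1), 60.0),
        (date(2016, 9, 8), 62.0),
        (date(2016, 9, 15), 64.0),
        (date(2016, 9, 22), 66.0),
    ]
    print("Hits 80% on:", projected_date(samples, limit_pct=80.0))
    print("Full on:", projected_date(samples))
```

Projecting to the alarm threshold as well as to 100% tells you both when you need to act and when you run out of road.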

· Capacity management is about ensuring that there is enough IT resource at all times.

· Windows systems are under-utilized because of mistrust in their reliability.

· Virtualization has helped make Windows systems better utilized but has not completely solved the problem.

· It's important to balance the cost of the service against the benefit.

· When managing Windows systems, look at CPU, memory and disk.

Find more white papers in our Resources section; sign up to our Community for access.

http://www.metron-athene.com/_resources/login.asp

Josh Worth

Consultant

Friday, 9 September 2016

Monitoring and managing memory - Windows Server Capacity Management 101 (11 of 12)

When monitoring and managing memory in Windows there are a couple of questions that people nearly always ask, so I thought I'd deal with these.

How full can I run memory?

I find that a good rule of thumb is 90%.

How do I know when we're starting to run out of memory?

There are a number of indicators that can alert you to memory problems:

· Page faults increase

· Soft faults occur first (things are moved around in memory)

· Then hard faults (pages have to be fetched from disk)

· Reads from and writes to the page file increase (hard faults)

· Reads from image files can increase

· Eventually the system will stop responding – or just stop

In my final blog on Monday I'll be looking at how to monitor and manage disk and summarizing.
Don't miss our next webinar, which takes place next Wednesday: 'We’ve got to improve the bottom line! How Business Capacity Management can help to bring profitability to the business.' http://www.metron-athene.com/services/webinars/index.html

Josh Worth
Consultant

Wednesday, 7 September 2016

Memory, what to monitor - Windows Server Capacity Management 101 (10 of 12)

Today we’ll look at memory and what you should be monitoring.

Memory utilization of the whole system - if need be, look at process working set sizes to see who’s the “culprit”. This will show you which process is using the most memory and is a good way to detect memory leaks. A good rule of thumb for memory utilization is to keep at least 10% free, to prevent excess paging, which massively hurts performance.

Page file usage % - if this is high, you are regularly running out of memory and Windows is having to use the page file.

Memory leaks - when an application dynamically allocates memory, and does not free that memory when it is finished using it, that program has a memory leak. The memory is not being used by the application anymore, but it cannot be used by the system or any other program either. Memory leaks add up over time, and if they are not cleaned up, the system eventually runs out of memory.

How to monitor

Thresholds – when setting a threshold, a good place to start is 80% warning and 90% alarm. Remember that if you are seeing performance issues before hitting the threshold, the threshold should be lowered. If the threshold is constantly breached, reset the value or look for a memory leak.

Memory Utilization report, example

Above is a good example of a memory leak. You can see that memory utilization slowly creeps up; when I restart the machine it drops down, and then it starts to creep up again.

I'll share some best practice recommendations for monitoring and managing memory on Friday.

Josh Worth

Consultant

Thursday, 1 September 2016

How to monitor and manage CPU - Windows Server Capacity Management 101 (8 of 12)

Hyper-threading splits a single CPU core into two logical processors, each of which can execute a separate piece of work. You will see one thread being the dominant thread and the other processing when the first is stalled. There is some trade-off with hyper-threading, as it takes time for the CPU to switch between threads. Some work fits well with this, such as multiple threads of light-weight work, while heavier work that needs the whole power of a core to get through could run slower with hyper-threading.

Depending on the type of work, hyper-threading is not always beneficial. Sometimes it is better not to split cores into multiple threads, as jumping between threads can lower throughput.

On Monday I'll take a closer look at Thresholds and Trends. In the meantime, why not take our Capacity Management Maturity Survey and get your free 20-page report?

http://www.metron-athene.com/_capacity-management-maturity-survey/survey.asp

Josh Worth

Consultant

Friday, 19 August 2016

Best practice recommendations - Windows Server Capacity Management 101 (6 of 12)

Now that we have gone over what we need to properly manage a Windows environment, here are some best practice recommendations.

There are 3 main components to monitor in your Windows systems:

· CPU – physical utilization

· Memory - usage

· Disk – occupancy and performance

These are all components that will severely affect performance if they fill up or are over-utilized.

Best practice recommendations - CPU

What to monitor

The first component to look at is CPU. When monitoring CPU you need to understand the difference between logical CPU and physical CPU. If your system is virtualized then it will be reporting logical CPU, as the Windows environment does not know about the physical CPU it is being hosted on.

· If physical, total CPU utilization of the machine - a physical system is much simpler, as you are directly monitoring the physical components.

· If virtualized, CPU usage by the guest system - you will need to know the physical CPU usage, which is held by the hypervisor. If you only look at CPU busy and it says 80%, it could be 80% busy of the 5% that has been allocated to it by VMware. You also need to look at process-level CPU busy.

· Process-level CPU busy - if virtualized, this gives a view of relative usage of the physical CPU busy from the host. It shows you how much CPU time each process is using, which is useful for seeing where all your CPU time is going.
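The "80% busy of 5%" point is easy to check with a little arithmetic. The sketch below is a simplification: it assumes the guest has a fixed share of the host's physical CPU, whereas a real hypervisor schedules dynamically, so treat the result as an estimate. The function names and the process names in the example are hypothetical.

```python
def physical_cpu_pct(guest_busy_pct, allocated_share_pct):
    """Estimate how much of the host's physical CPU a guest is really using.

    guest_busy_pct: CPU busy as reported inside the Windows guest.
    allocated_share_pct: share of the physical CPU given to the guest
    (assumed fixed here, which is a simplification).
    """
    return guest_busy_pct * allocated_share_pct / 100.0


def process_physical_cpu_pct(process_busy_pct, allocated_share_pct):
    """Apply the same scaling to each process's CPU busy figure."""
    return {
        name: physical_cpu_pct(busy, allocated_share_pct)
        for name, busy in process_busy_pct.items()
    }


if __name__ == "__main__":
    # The guest says 80% busy, but it only has 5% of the host.
    print(physical_cpu_pct(80.0, 5.0))  # 4.0 (% of the physical CPU)

    # Hypothetical process-level figures from inside the guest.
    print(process_physical_cpu_pct({"app_a.exe": 60.0, "app_b.exe": 20.0}, 5.0))
```

So a guest that looks very busy from the inside may be using only a few percent of the physical machine, which is why you need the host's view as well.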

On Monday I'll be looking at how hard you can work a CPU.

Josh Worth

Consultant

Wednesday, 17 August 2016

Balance service against cost - Windows Server Capacity Management 101 (5 of 12)

The better the service, the greater the cost - When it comes to balancing the cost of a service, it is important to know what the impact of spending too little or too much will be.

Align your IT spend to your business needs – It’s not about spending more and more, it’s about spending smarter. That means understanding what the business needs are and how to meet them in a cost-effective way.

It’s important to know the wider business. If you are expanding by 50%, you need to know what is needed to meet the new demand. Without forward-looking activities, you could be in for any number of unpleasant surprises, such as:

· Performance crises

· Unnecessary hardware expenditure

· User dissatisfaction.

Understanding how to meet demand - Capacity Management is responsible for ensuring adequate capacity is available at all times to meet the requirements of the business. It is directly related to the business requirements and is not simply about the performance of the system’s components, individually or collectively.

On Friday I'll be discussing best practice. Don't forget to sign up for our IT Capacity Planning workshop running in September.

http://www.metron-athene.com/services/online-workshops/index.html

Josh Worth

Consultant

Monday, 15 August 2016

Trends - Windows Server Capacity Management 101 (4 of 12)

The purpose of trending is to predict what will happen from what has already happened. The accuracy of a trend relies on the assumption that what is happening now will carry on happening in the future.

Importance of trends – A trend warns you if demand is going to outstrip supply and gives you a chance to act.

How long to trend forward - As with most things there is no one size fits all. When deciding on the length of trends it is important to take a few things into consideration such as how long it takes you to buy and install new hardware. There is no point in trending forward on disk space for a week if it takes you 2 months to get additional space.

So a good length for a trend is how long it takes you to procure, physically install and configure new hardware. If this takes 3 months, then that is how far forward your trends should look.

Trending is good at predicting when something will hit a threshold, but not at telling you what will happen when it does. This is where modeling comes in.

· Importance of modeling - it allows you to see how a system will react under different workloads. If the business has an event coming up that means its servers are going to be under higher than normal load, you want to be able to reassure people that the system can handle it.

· What modeling shows - Modeling will show you how your components will perform under different workloads, and which component will fail and when.

Modeling is used frequently for ‘what if’ scenarios such as “What if my workload increases by 30%? Will my system handle the extra load or will it fail? If it fails, where will it fail?”
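To give a flavour of a ‘what if’ calculation, here is a deliberately simple sketch. It is not how a full modeling tool works: it assumes utilization scales linearly with workload and uses the textbook single-server queueing approximation, response time = service time / (1 - utilization), to show how response time degrades as a component gets busier. The function name and the input figures are invented for the example.

```python
def what_if(utilization_pct, service_time_ms, growth_pct):
    """Project utilization and response time after a workload increase.

    Assumptions (both simplifications):
      - utilization grows in proportion to workload;
      - response time follows R = S / (1 - U) for a single-server queue.
    Returns (new_utilization_pct, response_time_ms); response time is
    None when the projected utilization reaches or exceeds 100%.
    """
    new_util = utilization_pct * (1 + growth_pct / 100.0)
    if new_util >= 100.0:
        return new_util, None  # saturated: the component cannot keep up
    response_ms = service_time_ms / (1 - new_util / 100.0)
    return new_util, response_ms


if __name__ == "__main__":
    # A component at 60% busy with a 10 ms service time, workload +30%.
    util, resp = what_if(60.0, 10.0, 30.0)
    print(f"utilization {util:.0f}%, response time {resp:.1f} ms")

    # The same growth on a component already at 80% busy.
    util, resp = what_if(80.0, 10.0, 30.0)
    print(f"utilization {util:.0f}%, saturated: {resp is None}")
```

Running the same calculation for CPU, memory and disk gives a first idea of which component is likely to give out first under the extra load.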

Knowing this lets you be proactive instead of reactive.

On Wednesday I'll be looking at Service versus Cost. Talking of Wednesday, we've got a great webinar lined up for you, 'Capacity Management from the ground-up'. Don't miss it! http://www.metron-athene.com/services/webinars/index.html

Josh Worth

Consultant

Wednesday, 10 August 2016

How do we properly capacity manage a Windows environment? - Windows Server Capacity Management 101 (3 of 12)

In order to effectively capacity manage we need to:

· Capture/monitor appropriate metrics

· Trend

· Model

· Balance service against cost

The first step in capacity managing a Windows environment is to implement capacity management properly, so it’s important to plan out how you are going to do it.

You'll start by collecting performance data on the Windows environment you want to manage and, if the systems are running under a hypervisor like VMware or Hyper-V, on the host machine as well. You then use this data to create charts and trends.

It’s important to:

· Capture the right metrics.

· Pick the right capture interval.

· Select when to capture data.

· Remember that some metrics don’t give the complete picture.

To be able to properly capacity manage, you need data but it's important to capture the right data. This means planning what metrics to capture and at what interval.

15-minute intervals are a good starting point, but the correct interval length is highly dependent on the type of workload and how you are reporting.

It's also important to consider when you want to capture data. Capturing data 24 hours a day will make daily averages much lower than if you only report on your peak hours.

The same applies to days. If you do most of your work Monday to Friday, then Saturday and Sunday will make the averages lower. These are considerations you need to take into account when you are collecting data.

It’s also important to understand that the more frequently you collect data, the more data you will produce. This may sound obvious, but capturing data every 2 minutes, 24/7, will produce a large amount of data very quickly.
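Here is a small sketch of how much the reporting window can change an average. The hourly figures are invented: a server that sits at 10% busy out of hours and 70% busy from 09:00 to 17:00. The function names and the default peak window are assumptions for the example.

```python
def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def peak_hours_average(hourly, start_hour=9, end_hour=17):
    """Average of the hours in [start_hour, end_hour).

    hourly: 24 values indexed by hour of day (0 to 23).
    """
    return average(hourly[start_hour:end_hour])


if __name__ == "__main__":
    # Invented CPU busy % per hour: quiet overnight, busy 09:00 to 17:00.
    hourly_busy = [10] * 9 + [70] * 8 + [10] * 7

    print("24-hour average:", average(hourly_busy))               # 30.0
    print("Peak-hours average:", peak_hours_average(hourly_busy))  # 70.0
```

The 24-hour figure makes the server look lightly loaded, while the peak-hours figure shows it is already at the level where a warning threshold might fire.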
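To put a number on that, the sketch below counts samples for a given capture interval. The metric and server counts are hypothetical and only there to show the scale; the function names are made up for the example.

```python
def samples_per_day(interval_minutes, hours_per_day=24):
    """Number of samples one metric produces per day at a given interval."""
    return (hours_per_day * 60) // interval_minutes


def samples_per_year(interval_minutes, metrics, servers, days=365):
    """Total samples per year across all metrics and servers."""
    return samples_per_day(interval_minutes) * metrics * servers * days


if __name__ == "__main__":
    print(samples_per_day(2))   # 720 samples per metric per day
    print(samples_per_day(15))  # 96 samples per metric per day

    # Hypothetical estate: 50 metrics on each of 100 servers.
    print(samples_per_year(2, metrics=50, servers=100))   # 1314000000
    print(samples_per_year(15, metrics=50, servers=100))  # 175200000
```

Going from 15-minute to 2-minute capture multiplies the data volume by 7.5, so it is worth reserving short intervals for the systems and periods where you really need the detail.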

Once you are collecting data at the right time and interval, you want to start reporting and trending that data.

Not all metrics give a true picture of the hardware. An example is reported CPU busy: Windows can report higher utilization than is really the case because it is not aware that VMware swaps the logical CPU in and out, so it simply reports that the CPU was busy the whole time.

On Friday I'll take a look at Trending.

Josh Worth

Consultant