
Monday, 12 September 2016

How to monitor and manage disk - Windows Server Capacity Management 101 (12 of 12)


There are two main things to monitor on disk: occupancy and performance.

Occupancy – use Free Space Ratio %, which shows you how much space you have left on the disk.

Performance – to measure performance, use the average response time of reads and the average response time of writes.

How to monitor

Thresholds - setting a threshold for disk occupancy depends on how quickly you can get additional disk space and how quickly disk space is filling up, but a good rule of thumb is 70% warning and 80% alarm.

Trends - very important when it comes to disk occupancy, as a trend can show you far in advance when you are going to run out of disk space.

Reports - automate reports.


This is a good example of a disk slowly filling up. I could trend on this and easily get the date when I am going to run out of disk space, as in the sketch below. Having this information is very important in ensuring that there is no downtime for any of my important applications.
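
As a rough illustration, here is a minimal Python sketch of that calculation, assuming you have already exported daily occupancy samples from your monitoring tool (the numbers below are made up for the example):

    import numpy as np

    # Hypothetical daily samples of disk occupancy (day number, used %)
    days = np.array([0, 7, 14, 21, 28])
    used_pct = np.array([61.0, 63.5, 66.2, 68.8, 71.5])

    slope, intercept = np.polyfit(days, used_pct, 1)  # fit a straight line
    if slope > 0:
        print(f"Disk fills at {slope:.2f}% per day")
        # Project when the 70% warning, 80% alarm and 100% full points are hit
        for name, level in (("warning", 70.0), ("alarm", 80.0), ("full", 100.0)):
            print(f"  reaches {name} ({level:.0f}%) around day {(level - intercept) / slope:.0f}")
    else:
        print("No upward trend - the disk is not filling up")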

I hope my series has been informative. To summarize:

  • Capacity management is about ensuring that there is enough IT resource at all times.
  • Windows systems are under-utilized because of mistrust in their reliability.
  • Virtualization has helped make Windows systems better utilized, but has not completely solved the problem.
  • It's important to balance the cost of the service against the benefit.
  • When managing Windows systems, look at CPU, memory and disk.

Find more white papers in our Resources section; sign up to our Community for access:
http://www.metron-athene.com/_resources/login.asp

Josh Worth
Consultant

Friday, 9 September 2016

Monitoring and managing memory - Windows Server Capacity Management 101(11 of 12)

When monitoring and managing memory in Windows there are a couple of questions that people always ask, so I thought I'd deal with these.

How full can I run memory?

I find that a good rule of thumb is 90%.


How do I know when we're starting to run out of memory?

There are a number of indicators that can alert you to memory problems:

·       Page faults increase

·       Soft faults occur first (move things around in memory)

·       Then hard faults (write pages to disk)

·       Increased reads from and writes to the page file (hard faults)

·       Reads from image files can increase

·       Eventually the system will stop responding – or just stop
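
Before those symptoms bite, a quick snapshot of overall memory and page-file usage can warn you. Here is a minimal sketch using the third-party psutil library (an assumption on my part; any monitoring agent exposing the same counters will do):

    import psutil

    mem = psutil.virtual_memory()
    swap = psutil.swap_memory()  # backed by the page file on Windows

    print(f"Memory used: {mem.percent:.1f}%  (rule of thumb: stay below 90%)")
    print(f"Page file used: {swap.percent:.1f}%")

    if mem.percent >= 90:
        print("WARNING: less than 10% memory free - expect paging to increase")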


In my final blog on Monday I'll be looking at how to monitor and manage disk and summarizing.
Don't miss out on our next webinar, which takes place next Wednesday: 'We’ve got to improve the bottom line! How Business Capacity Management can help to bring profitability to the business.' http://www.metron-athene.com/services/webinars/index.html

Josh Worth
Consultant
 

Wednesday, 7 September 2016

Memory, what to monitor - Windows Server Capacity Management 101(10 of 12)


Today we’ll look at memory and what you should be monitoring.

Memory utilization of the whole system - if need be, look at process working set sizes to see who’s the “culprit”. This will show you which process is using the most memory and is a good way to detect memory leaks. A good rule of thumb for memory utilization is to keep at least 10% free, to prevent excess paging, which massively hurts performance.

Page file usage % - if this is high it means that you are regularly running out of memory and Windows is having to use the page file.

Memory leaks - when an application dynamically allocates memory, and does not free that memory when it is finished using it, that program has a memory leak. The memory is not being used by the application anymore, but it cannot be used by the system or any other program either. Memory leaks add up over time, and if they are not cleaned up, the system eventually runs out of memory.
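
To see which processes hold the most memory - and to spot a leak candidate by re-running the snapshot over time and watching for a working set that only ever grows - a minimal sketch with the third-party psutil library (an assumption on my part) might look like this:

    import psutil

    # Snapshot per-process memory so the biggest "culprit" stands out.
    # On Windows, rss corresponds to the process working set.
    procs = psutil.process_iter(['name', 'memory_info'])
    top = sorted(
        (p for p in procs if p.info['memory_info'] is not None),
        key=lambda p: p.info['memory_info'].rss,
        reverse=True,
    )[:10]

    for p in top:
        rss_mb = p.info['memory_info'].rss / (1024 * 1024)
        print(f"{rss_mb:8.1f} MB  {p.info['name']}")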

How to monitor
 
Thresholds – when setting a threshold, a good place to start is 80% warning and 90% alarm. Remember, if you are seeing performance issues before hitting the threshold then the threshold should be adjusted; if a threshold is constantly breached, reset the value or look for a memory leak.

                    Memory Utilization report, example


Above is a good example of a memory leak: you can see that memory utilization slowly creeps up; when I restart the machine it drops down, and then it starts to creep up again.
I'll share some best practice recommendations for monitoring and managing memory on Friday.

Josh Worth
Consultant


Monday, 5 September 2016

How to monitor CPU - Windows Server Capacity Management 101(9 of 12)

As promised today we'll be looking at how to monitor CPU.

Thresholds

When dealing with thresholds there is no one size fits all, but a good rule of thumb is 70% for a warning and 85% for an alarm. These can and should be tweaked when you have a better idea of the performance thresholds for your CPU.

Additionally, it is good to have thresholds in place for when a CPU is being under-utilized, perhaps at 20% and 10%. This lets you know which machines could be pushed harder, as in the sketch below.
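
Here is a minimal sketch of those rules of thumb as a simple classifier (the exact numbers are the ones suggested above and should be tuned for your own estate):

    def classify_cpu(util_pct: float) -> str:
        """Map a CPU utilization % to a status using the rules of thumb above."""
        if util_pct >= 85:
            return "ALARM"
        if util_pct >= 70:
            return "WARNING"
        if util_pct < 10:
            return "IDLE - strong consolidation candidate"
        if util_pct < 20:
            return "UNDER-UTILIZED - could be pushed harder"
        return "OK"

    for u in (5, 15, 50, 75, 90):
        print(f"{u:3d}% -> {classify_cpu(u)}")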

Trends

When setting up a trend, you have to remember that the longer the trend, the less reliable it is. A good rule of thumb for a trend is 3 months, as this gives a reasonably reliable prediction and still leaves you time to make a hardware change.

Reports

CPU Total Utilization Estd% - Report Example



Above is an example of estimated CPU core busy over a month for my computer, with a trend going forward 1 month; you can quickly see that the trend line is going down. This kind of chart is very simple to create with a capacity management tool like athene®.

On Wednesday I'll be dealing with Memory and how to monitor this. Don't forget to take a look at our workshops, there are some great ones coming up soon
http://www.metron-athene.com/services/online-workshops/index.html

Josh Worth
Consultant




Thursday, 1 September 2016

How to monitor and manage CPU - Windows Server Capacity Management 101(8 of 12)


Hyper-threading splits a single CPU core into two logical processors, each of which can execute a separate piece of work. One thread is the dominant thread; the other processes while the first is stalled. There is some trade-off with hyper-threading, as it takes time for the CPU to switch between threads. Some work fits well with this, such as multiple threads of light-weight work, while heavier work that needs the whole power of a core to get through can run slower with hyper-threading.



Depending on the type of work, hyper-threading is not always beneficial; sometimes it is better not to split cores into multiple threads, as the jumping between threads can lower throughput.
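
If you are not sure whether hyper-threading is enabled on a box, comparing logical and physical core counts is a quick check. A minimal sketch using the third-party psutil library (an assumption on my part; any inventory tool reports the same):

    import psutil

    logical = psutil.cpu_count(logical=True)    # processors the OS schedules on
    physical = psutil.cpu_count(logical=False)  # actual hardware cores

    print(f"{physical} physical cores, {logical} logical processors")
    if logical and physical and logical > physical:
        print("Hyper-threading (or SMT generally) appears to be enabled")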

On Monday I'll take a closer look at Thresholds and Trends. In the meantime why not take our Capacity Management Maturity Survey and get your free 20 page report.
http://www.metron-athene.com/_capacity-management-maturity-survey/survey.asp

Josh Worth
Consultant

Tuesday, 30 August 2016

How busy can I run the CPU? - Windows Server Capacity Management 101(7 of 12)




How hard you can work a CPU is highly dependent on the type of CPU and the type of work it is doing; there is no one-size-fits-all number for how hard to work them.
Newer = more capable - newer CPUs have larger on-chip cache memory, allowing more instructions to be kept nearer the cores. Cache memory is quicker to access than main memory, but there is much less of it: megabytes of very fast cache, gigabytes of fast RAM and terabytes of slow disk.

It’s also not just about the speed of the clock – a 3.4 GHz Pentium IV is nowhere near as capable (as in “can get through work”) as a brand new 2 GHz Xeon processor, because the newer CPU can do more in one clock cycle than an older CPU can do in many.

More cores = can be pushed harder - It’s all about THROUGHPUT, not just speed.

The more cores a CPU has the harder you can run it without performance problems.

It is also dependent on the type of work, and on whether hyper-threading is in use.

Some best practice recommendations are:


1 core

·      At 50% busy, work will take twice as long as it would at 0% busy.

·      At 80% busy it will take 5 times as long.

·      At 90% it will take 10 times as long. 

You can see that with one core it does not take much load to slow throughput down.

2 cores

·      At 50% it will take 1.3 times as long.

·      At 80% it will take 2.7 times as long.

·      At 90% it will take 5.2 times as long. 

16 cores

·      At 50% it will take 1x (essentially no slowdown).

·      At 80% it will take 1.02 times as long.

·      At 90% it will take 1.22 times as long.

For a 16-core CPU it does not make much difference running at 80% compared to 50%. But keep in mind that all configurations max out at 100%; the number of cores just flattens out the curve. You can see that with more cores it takes longer to hit the knee of the curve.
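
These multipliers are what classic M/M/c queueing theory (the Erlang C formula) predicts - an assumption on my part about the underlying model, but a minimal Python sketch reproduces the 1- and 2-core figures above exactly and shows the same flattening at 16 cores (the exact 16-core values depend on the model used):

    from math import factorial

    def response_multiplier(cores: int, util: float) -> float:
        """Response time relative to an unloaded system for a given
        number of cores and per-core utilization (0 < util < 1)."""
        a = cores * util  # offered load in Erlangs
        # Erlang B, then convert to Erlang C (probability a request queues)
        erlang_b = (a ** cores / factorial(cores)) / sum(
            a ** k / factorial(k) for k in range(cores + 1))
        p_queue = erlang_b / (1 - util * (1 - erlang_b))
        return 1 + p_queue / (cores * (1 - util))

    for c in (1, 2, 16):
        row = ", ".join(f"{u:.0%}: {response_multiplier(c, u):.2f}x"
                        for u in (0.5, 0.8, 0.9))
        print(f"{c:2d} core(s) -> {row}")

Running it gives 2x / 5x / 10x for one core and roughly 1.33x / 2.78x / 5.26x for two, in line with the figures above.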

Benefits of Multiple Cores

This chart illustrates that as you add more and more cores, the response time curve becomes flatter and flatter.
On Thursday I'll be looking at how to monitor and manage CPU. Don't forget to sign up for our next workshop 'IT Capacity Planning' http://www.metron-athene.com/services/online-workshops/index.html

Josh Worth
Consultant

 





Friday, 19 August 2016

Best practice recommendations - Windows Server Capacity Management 101(6 of 12)


So now that we have gone over what we need to properly manage a Windows environment, here are some best practice recommendations.

There are 3 main components to monitor in your Windows systems:


·      CPU – physical utilization

·      Memory - usage

·      Disk – occupancy and performance



These are all components that, if they fill up or are over-utilized, will severely affect performance.


Best practice recommendations - CPU 
What to monitor 
The first component to look at is CPU. When monitoring CPU you need to understand the difference between logical CPU and physical CPU: if your system is virtualized then you are seeing logical CPU, as the Windows environment does not know about the physical CPU it is being hosted on.

·      If physical, CPU Total utilization of the machine - a physical system is much simpler as you are directly monitoring the physical components. 
·      If virtualized, CPU usage by the guest system - you will need to know the physical CPU usage, which sits under the hypervisor. If you only look at CPU busy and it says 80%, that could be 80% of the 5% that has been allocated to the guest by VMware. You need to look at process-level CPU busy.
·      Process-level CPU busy - if virtualized, this gives a view of relative usage of the physical CPU from the host. It shows you how much CPU time each process is using, which is useful for seeing where all your CPU time goes; see the sketch after this list.
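
Here is a minimal sketch of that process-level view using the third-party psutil library (an assumption on my part; note that per-process figures are relative to a single logical CPU, so they can exceed 100% on multi-core machines):

    import time
    import psutil

    # Prime the per-process CPU counters; the first cpu_percent() call
    # always returns 0.0, so sample over a one-second window instead.
    procs = list(psutil.process_iter(['name']))
    for p in procs:
        try:
            p.cpu_percent(interval=None)
        except psutil.Error:
            pass

    time.sleep(1)

    samples = []
    for p in procs:
        try:
            samples.append((p.cpu_percent(interval=None), p.info['name'], p.pid))
        except psutil.Error:
            pass  # process exited during the sample

    for pct, name, pid in sorted(samples, key=lambda s: s[0], reverse=True)[:10]:
        print(f"{pct:6.1f}%  {name} (pid {pid})")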

On Monday I'll be looking at how hard you can work a CPU.

Josh Worth
Consultant






Wednesday, 17 August 2016

Balance service against cost - Windows Server Capacity Management 101(5 of 12)


The better the service, the higher the cost - when it comes to balancing the cost of a service, it is important to know what the impact of spending too little or too much will be.

Align your IT spend to your business needs - it’s not about spending more and more, it’s about spending smarter: understanding what the business needs are and understanding how to meet them in a cost-effective way.

It’s important to know the wider business: if it is expanding by 50%, you need to know what is needed to meet the new demand. Without forward-looking activities, you could be in for any number of unpleasant surprises, such as:

·      Performance crises

·      Unnecessary hardware expenditure

·      User dissatisfaction.




Understanding how to meet demand - Capacity Management is responsible for ensuring adequate capacity is available at all times to meet the requirements of the business.  It is directly related to the business requirements and is not simply about the performance of the system’s components, individually or collectively.

On Friday I'll be discussing best practice. Don't forget to sign up for our IT Capacity Planning workshop running in September.
http://www.metron-athene.com/services/online-workshops/index.html

Josh Worth
Consultant

Monday, 15 August 2016

Trends - Windows Server Capacity Management 101 (4 of 12)


The purpose of trending is to predict what will happen from what has happened; the accuracy of a trend relies on the assumption that what is happening now will carry on happening into the future.



Importance of trends – a trend gives you warning if your demand is going to outstrip your supply, and gives you a chance to act.



How long to trend forward - as with most things, there is no one size fits all. When deciding on the length of trends it is important to take a few things into consideration, such as how long it takes you to buy and install new hardware. There is no point in trending forward on disk space for a week if it takes you 2 months to get additional space.

So a good length for a trend is how long it takes you to procure, physically install and configure new hardware; if this takes 3 months then that is how long your trends should be.




Trending is good at predicting when something will hit a threshold, but not at telling you what will happen when it does; this is where modeling comes in.


·      Importance of modeling - it allows you to see how a system will react under different workloads. If the business has an event coming up that means its servers are going to be under higher than normal load, you want to be able to reassure people that the system can handle it.


·      What modeling shows - modeling will show you how your components will perform under different workloads, and which component will fail, and when.


Modeling is used frequently for ‘what if’ scenarios such as: “What if my workload increases 30%, will my system handle the extra load or will it fail? If it does, where will it fail?”
Knowing this lets you be proactive instead of reactive.
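As a toy illustration of the idea (a real modeling tool accounts for queueing, contention and workload mix; this sketch only scales a utilization figure, and the numbers are hypothetical):

    def what_if(current_util_pct: float, growth: float, alarm_pct: float = 85.0) -> None:
        """Scale today's utilization by a workload increase and compare
        the projection against the alarm threshold."""
        projected = current_util_pct * (1 + growth)
        verdict = ("will breach the alarm threshold"
                   if projected >= alarm_pct else "should cope")
        print(f"{current_util_pct:.0f}% -> {projected:.0f}% "
              f"after {growth:.0%} growth: {verdict}")

    what_if(60.0, 0.30)  # "what if my workload increases 30%?"
    what_if(70.0, 0.30)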
On Wednesday I'll be looking at Service versus Cost. Talking of Wednesday, we've got a great webinar lined up for you, 'Capacity Management from the ground-up' - don't miss it! http://www.metron-athene.com/services/webinars/index.html
Josh Worth
Consultant