Wednesday 25 September 2013

Capacity Management for Big Data - Managing and Monitoring a Hadoop Cluster (3/3)

Now that we have a better idea of what Hadoop is and how it’s used to manage Big Data, the most important thing to put in place is a mechanism to monitor the performance of the cluster and the nodes within the cluster.

Hadoop is supported on Linux and Windows (it can also run on other operating systems, such as BSD, Mac OS X, and OpenSolaris).  Those operating systems already provide utilities and performance counters, which means that athene® can be implemented to capture performance and capacity data from those nodes and that data can be stored in the organization's Capacity Management Information Store (CMIS).
The Capacity Manager, as in all other environments, is interested in a number of things:

(1)  How are the cluster and the nodes within it performing now?  Performance data showing how much CPU, memory, network bandwidth, and disk I/O are being used is readily available and can be stored and reported upon within athene®.  Web reporting and web dashboards can easily show the Capacity Manager the health of the cluster nodes, and automatic alerting can quickly point the Capacity Manager to exceptions that may be the source of performance problems in the environment.
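The exception alerting described above boils down to comparing each node's latest utilization samples against agreed thresholds. As a minimal, hypothetical sketch (the metric names and threshold values here are made up for illustration, and this is not athene®'s actual logic):

```python
# Hypothetical per-node health check: flag any metric whose latest
# utilization sample meets or exceeds its alert threshold.
def node_exceptions(samples, thresholds):
    """samples/thresholds: dicts keyed by metric name, values in percent."""
    return {metric: value
            for metric, value in samples.items()
            if value >= thresholds.get(metric, 100.0)}

# Illustrative readings from one Hadoop data node:
node = {"cpu_pct": 92.0, "mem_pct": 71.5, "disk_io_pct": 40.0, "net_pct": 88.0}
limits = {"cpu_pct": 85.0, "mem_pct": 90.0, "disk_io_pct": 80.0, "net_pct": 85.0}
print(node_exceptions(node, limits))  # CPU and network breach their limits
```

In practice the samples would come from the operating system's own performance counters, as the post notes, rather than hard-coded values.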

(2)  What are the trends for the important metrics?  Big Data environments typically deal with datasets that are increasing in size, and as those datasets grow, the amount of processing performed on them tends to grow as well.  The Capacity Manager must keep a close eye out: a healthy cluster today could be one with severe performance bottlenecks tomorrow.  Trend alerting is built into athene® and can warn the Capacity Manager that performance thresholds will be hit in the future, allowing ample time to plan changes to the environment to handle the predicted increase in load.
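One simple way to picture trend alerting is a straight-line fit over historical utilization, projected forward to the alert threshold. The sketch below is purely illustrative (a least-squares linear trend, not athene®'s actual forecasting algorithm, and the sample figures are invented):

```python
# Illustrative trend projection: fit a straight line to daily utilization
# samples and estimate how many days until the trend crosses a threshold.
def days_until_threshold(history, threshold):
    """history: one utilization sample per day (percent). Returns days from
    the last sample until the fitted line reaches the threshold, or None
    if the trend is flat or falling."""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
             / sum((x - mean_x) ** 2 for x in xs))
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    return (threshold - intercept) / slope - (n - 1)

cpu_history = [40, 43, 47, 49, 53, 55, 59]   # % busy, daily averages
print(round(days_until_threshold(cpu_history, 80)))  # about 7 days to 80%
```

A real implementation would, of course, use more history, smooth out seasonality, and alert well before the projected breach date.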

(3)  Storage space is certainly something that cannot be forgotten.  With direct-attached storage (DAS), data is distributed and replicated across many nodes.  It's important to be able to take the storage available across a Hadoop cluster and represent it in a way that quickly shows how much headroom exists at a given time and how disk space usage trends over time.  athene® can easily aggregate the directly attached disks to give a big-picture view of the disk space available as well as the amount of headroom, and these reports can also show how disk space is consumed over time.  Trend reports and alerting can quickly warn the Capacity Manager when free storage is running low.
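The key wrinkle when aggregating Hadoop storage is replication: HDFS stores each block multiple times (three copies by default), so raw capacity overstates usable space. A hedged sketch of the headroom arithmetic, with made-up node capacities:

```python
# Sketch of the big-picture storage view: sum raw DAS capacity across
# nodes, divide by the HDFS replication factor to get usable capacity,
# and subtract the logical data already stored to get headroom.
def cluster_headroom(node_capacities_tb, used_logical_tb, replication=3):
    """node_capacities_tb: raw disk capacity per node (TB).
    used_logical_tb: data stored, counted before replication.
    Returns (usable_tb, headroom_tb)."""
    usable = sum(node_capacities_tb) / replication
    return usable, usable - used_logical_tb

# Five nodes with 48 TB of raw DAS each, 50 TB of data already loaded:
usable, headroom = cluster_headroom([48, 48, 48, 48, 48], used_logical_tb=50)
print(f"usable: {usable:.0f} TB, headroom: {headroom:.0f} TB")
```

Here 240 TB of raw disk yields only 80 TB of usable space at the default replication factor, which is exactly why a big-picture, replication-aware view matters more than per-node free-space figures.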

(4)  Finally, the Capacity Manager needs the ability to size and predict necessary changes to the environment as time goes on.  As with any other environment, a shortage in one key subsystem can affect the entire environment.  The ability to model future system requirements based on business needs and other organic growth is vital for the Capacity Manager.  With athene®, it's easy to see how trends are affecting future needs, and it's equally easy to model expected requirements based on predicted changes to the workload.
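At its simplest, this kind of sizing exercise projects today's workload forward under an assumed growth rate and converts the result into nodes. The figures and growth model below are hypothetical stand-ins for proper capacity modelling, not athene®'s method:

```python
# Hedged sizing sketch: project a workload forward under an assumed
# compound monthly growth rate, then compute how many nodes of a given
# capacity would be needed at the planning horizon.
def nodes_needed(current_load, monthly_growth, months, node_capacity):
    projected = current_load * (1 + monthly_growth) ** months
    return -(-projected // node_capacity)  # ceiling division

# e.g. 800 units of work today, 5% monthly growth, a 12-month horizon,
# and 100 units of capacity per node:
print(int(nodes_needed(800, 0.05, 12, 100)))  # 15 nodes at the horizon
```

A real model would also account for headroom targets and the interaction between subsystems, since, as noted above, a shortage in one key subsystem can affect the entire environment.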

As the price of data storage continues to decrease and the amount of data continues to increase, it becomes even more vital that organizations with Big Data implementations closely manage and monitor their environments to ensure that service levels are met and adequate capacity is always available. 
We'll be taking a look at some of the typical metrics you should be monitoring in our Capacity Management and Big Data webinar (Part 2) tomorrow.  Register now, and don't worry if you missed Part 1: join our Community and catch it on demand.
 
Rich Fronheiser
Chief Marketing Officer
 
