Following on from my previous blog on Big Data: this is relatively new technology, and therefore knowledge around performance tuning is immature. Our instinct tells us to monitor the systems as a cluster, tracking how much CPU and memory is being used, with the local storage monitored both individually and as one aggregated pool. Metrics such as I/O response times, file system capacity and usage are important, to name a few.
What are the challenges?
Big Data Capacity Challenges
So with Big Data technology being relatively new and knowledge of it limited, our challenges are:
• Working with the business to predict usage - so we can produce accurate representations of future system and storage usage. This is normally quite a challenge even for more established systems and applications, so we have to bear in mind that getting this information and validating it will not be easy.
• New technology - limited knowledge around performance tuning
• Very dynamic environment - which makes it challenging to configure, monitor, and track service changes in order to provide effective Capacity Management for Big Data.
• Multiple tuning options - that can greatly affect the utilization/performance of systems
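To make the per-node side of this concrete, here is a minimal sketch of sampling the kind of metrics mentioned above (CPU load, core count, file system capacity and usage) on a single node. It uses only the Python standard library and assumes a Unix-like host; the function name and returned fields are illustrative, not part of any particular monitoring product.

```python
# Minimal sketch: sample basic capacity metrics on one cluster node.
# Standard library only; assumes a Unix-like OS (os.getloadavg is Unix-only).
import os
import shutil

def node_capacity_snapshot(path="/"):
    """Return a dict of basic capacity metrics for one node."""
    total, used, free = shutil.disk_usage(path)      # bytes
    load1, load5, load15 = os.getloadavg()           # 1/5/15-minute load averages
    return {
        "fs_total_gb": round(total / 1e9, 1),
        "fs_used_pct": round(100 * used / total, 1),
        "cpu_load_1m": load1,
        "cpu_cores": os.cpu_count(),
    }

print(node_capacity_snapshot())
```

In a real cluster you would ship snapshots like this from every node to a central store, so that storage can be reported both per node and as one aggregated pool.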
What key capacity metrics should you be monitoring for your Big Data environment?
Find out in my next blog and ask us about our Unix Capacity & Performance Essentials Workshop.