HPE Storage Users Group

A Storage Administrator Community




 Post subject: Storage Admins: What do you monitor to ensure a healthy env?
PostPosted: Mon Jul 09, 2018 9:40 am 

Joined: Wed Nov 08, 2017 8:57 am
Posts: 42
Howdy. I was hoping I could get some advice on what everyone monitors, how often they monitor it, what alerts they have set up with what thresholds, etc.

Basically, what do you do as a storage admin to ensure a healthy environment? I am very green to 3PAR, so I was hoping to get some general "must haves".

E.g.: Setup Threshold Alerts - This threshold with these metrics

Setup Reporting - These reports help with X and I run them X..

Anything else that would be beneficial!

Thanks for your time!!


 Post subject: Re: Storage Admins: What do you monitor to ensure a healthy
PostPosted: Thu Jul 12, 2018 6:07 am 

Joined: Wed Nov 09, 2011 12:01 pm
Posts: 392
Space is probably the first thing to watch, especially if you're not used to thin-provisioned arrays (you will typically have allocated more space than exists in the array, so you need to track actual usage and be aware of potential spikes).
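To make the thin-provisioning point concrete, here is a minimal sketch of the two numbers worth tracking: how far the array is overcommitted, and how much of the real capacity has actually been written. The function names and the capacity figures are hypothetical and only for illustration; they do not come from SSMC or the 3PAR CLI.

```python
def overcommit_ratio(exported_tib, usable_tib):
    """Virtual capacity exported to hosts divided by usable capacity in the array."""
    if usable_tib <= 0:
        raise ValueError("usable_tib must be positive")
    return exported_tib / usable_tib


def used_fraction(written_tib, usable_tib):
    """Fraction of usable capacity that hosts have actually written."""
    if usable_tib <= 0:
        raise ValueError("usable_tib must be positive")
    return written_tib / usable_tib


def shortfall_if_fully_written(exported_tib, usable_tib):
    """Capacity that would be missing if every thin volume filled up completely."""
    return max(0.0, exported_tib - usable_tib)


if __name__ == "__main__":
    # Hypothetical array: 100 TiB usable, 150 TiB exported, 62 TiB written.
    print("overcommit:", overcommit_ratio(150, 100))
    print("used:", used_fraction(62, 100))
    print("shortfall if full (TiB):", shortfall_if_fully_written(150, 100))
```

A ratio above 1.0 is normal on a thin array; the thing to alert on is the used fraction and how quickly it moves.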

Raw space in each tier can be alerted on from the array system properties. Depending on your scale, growth and the time it takes to turn around additional disk purchases, I'd set this around 75-85%.
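One way to pick a value inside that 75-85% range is to work backwards from the growth rate and the purchase lead time. The sketch below assumes linear growth, which real workloads may not follow, and the function names and example figures are hypothetical.

```python
def days_until_threshold(capacity_tib, used_tib, growth_tib_per_day, threshold=0.80):
    """Days until used space reaches threshold * capacity, assuming linear growth."""
    limit = capacity_tib * threshold
    if used_tib >= limit:
        return 0.0
    if growth_tib_per_day <= 0:
        return float("inf")  # not growing, so the threshold is never reached
    return (limit - used_tib) / growth_tib_per_day


def threshold_for_lead_time(capacity_tib, growth_tib_per_day, lead_time_days):
    """Highest alert fraction that still leaves lead_time_days of growth before full."""
    growth_during_lead = growth_tib_per_day * lead_time_days
    return max(0.0, 1.0 - growth_during_lead / capacity_tib)


if __name__ == "__main__":
    # Hypothetical tier: 100 TiB raw, 70 TiB used, growing 0.25 TiB/day,
    # with 60 days needed to order and install more disks.
    print(days_until_threshold(100, 70, 0.25, threshold=0.80))
    print(round(threshold_for_lead_time(100, 0.25, 60), 2))
```

If the second number comes out well below 75%, the growth rate or lead time probably calls for an earlier alert than the usual range.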

SSMC has some built-in dashboards for showing growth rate graphs, worth keeping an eye on.

Performance is probably the next thing. I tend to run daily reports on busy systems for 'Exported Volumes Compare Hosts' so I can keep an eye on who is pushing the most IO to/from the arrays (I have these set to email so I can dig back easily; they take a while to generate, so looking at the daily PDFs is normally quicker than running a new report).

I also have alerts based on port bandwidth (around 85-90%), physical disk IOPS (around the stress point for the type of disk, e.g. ~220 IOPS for 15k FC/SAS) and disk port service time (around 35ms, to spot potential hotspots and silent faults).
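The three thresholds above can be written down as a small lookup and checked against whatever samples you export from the array. This is only a sketch: the metric names and the sample dictionary are made up for illustration and are not SSMC or System Reporter field names, and the limits use the lower end of the ranges quoted above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Threshold:
    metric: str   # hypothetical metric name, not an SSMC field
    limit: float  # alert when the sample is at or above this value
    unit: str


THRESHOLDS = [
    Threshold("port_bandwidth_pct", 85.0, "%"),
    Threshold("pd_iops", 220.0, " IOPS"),      # 15k FC/SAS stress point
    Threshold("disk_port_service_ms", 35.0, "ms"),
]


def breaches(sample: Dict[str, float]) -> List[str]:
    """Return one message per metric in the sample that meets or exceeds its limit."""
    messages = []
    for t in THRESHOLDS:
        value = sample.get(t.metric)
        if value is not None and value >= t.limit:
            messages.append(f"{t.metric}: {value}{t.unit} (limit {t.limit}{t.unit})")
    return messages


if __name__ == "__main__":
    sample = {"port_bandwidth_pct": 91.0, "pd_iops": 150.0, "disk_port_service_ms": 12.0}
    for line in breaches(sample):
        print(line)
```

The per-disk IOPS limit depends on the drive type, so a mixed array would need one entry per tier rather than the single value shown here.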

Most of the fault monitoring is automatic via the SP/SSMC, with some alerts on patches etc. thrown in. It's also worth setting up the InfoSight link to get more general info like balanced host ports, system load, recommended patches and trending.

You can also run the built-in healthcheck commands before doing anything to the system (HPE support will typically do this before/after they do things).


 Post subject: Re: Storage Admins: What do you monitor to ensure a healthy
PostPosted: Mon Jul 16, 2018 11:49 pm 

Joined: Sat May 03, 2014 2:01 pm
Posts: 71
Location: Dallas, TX
VLUN and PD thresholds for latency in SystemReporter/SSMC are a good place to start, as well as some sort of host/guest based latency reporting (Windows PerfMon, ESXi vRO, etc.). Latency is typically the main thing we watch out for other than capacity.

The 3PAR reporting tends to show lower latency than we actually see at the hosts, so we use a combination of it and host or DB reporting to verify all is well.

If you run FC, implement something like Brocade MAPS.
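A simple way to combine the two views is to line up array-side and host-side latency samples for the same interval and look at the gap. The sketch below assumes you already have paired samples in milliseconds from both sources; the function names and the 5 ms tolerance are hypothetical placeholders, not recommended values.

```python
from statistics import median


def latency_gaps(array_ms, host_ms):
    """Per-interval difference between host-observed and array-reported latency (ms)."""
    if len(array_ms) != len(host_ms):
        raise ValueError("sample lists must cover the same intervals")
    return [h - a for a, h in zip(array_ms, host_ms)]


def needs_investigation(array_ms, host_ms, max_gap_ms=5.0):
    """True when the median host-minus-array gap exceeds max_gap_ms (assumed tolerance)."""
    gaps = latency_gaps(array_ms, host_ms)
    return bool(gaps) and median(gaps) > max_gap_ms


if __name__ == "__main__":
    # Hypothetical paired samples for the same five intervals.
    array_side = [1.2, 1.4, 1.1, 1.3, 1.5]
    host_side = [2.0, 9.5, 8.7, 9.1, 2.2]
    print(needs_investigation(array_side, host_side))
```

A consistently large gap points at something between the host and the array (fabric, HBA queueing, multipathing) rather than at the array itself.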

_________________
Bryan W
Senior Architect/Manager of System Infrastructure, Dallas TX
https://www.linkedin.com/in/bryanlwhite


 Post subject: Re: Storage Admins: What do you monitor to ensure a healthy
PostPosted: Fri Jul 20, 2018 10:06 am 

Joined: Wed Nov 08, 2017 8:57 am
Posts: 42
Thanks a lot for the reply! As far as setting up the threshold alerts via SSMC: if NinjaStar says around 3,000 IO is the total I can expect, should I set up the alert to be "Physical Drives > Total IOps > 3,000"?

If the sampling is set to Hi-Res, which is every 5 minutes, does that put a performance impact on the SAN itself?

Also, I do not see a report called 'Exported Volumes Compare Hosts'. Do you just schedule your reports to run nightly, at midnight or something?

Thanks again!


 Post subject: Re: Storage Admins: What do you monitor to ensure a healthy
PostPosted: Wed Sep 05, 2018 8:01 am 

Joined: Wed Nov 08, 2017 8:57 am
Posts: 42
Ailean, I don't see where I can schedule the system reports. Are you able to assist? I would like the Exported Volumes Compare by Performance report emailed to me daily, if possible.

Thanks!

