HPE Storage Users Group

A Storage Administrator Community




 Post subject: Latency on Primera with 256KB io and 512KB io
Posted: Wed May 12, 2021 9:54 am

Joined: Wed Jun 13, 2018 2:31 am
Posts: 33
We have a few hosts (MSSQL servers) that, off and on, do reads/writes at 256KB and 512KB IO sizes.

Somehow that is causing write latency spikes of up to 20ms on other hosts.

Those servers are not even using the same FE host ports. Any idea how to address this?

Support's reply asks us not to do such large IO operations, as though we have control over that.

Any experience in dealing with this is highly appreciated.


 Post subject: Re: Latency on Primera with 256KB io and 512KB io
Posted: Fri May 14, 2021 12:22 am
Site Admin

Joined: Tue Aug 18, 2009 10:35 pm
Posts: 1328
Location: Dallas, Texas
On the hosts that are latency 'victims', how many IOPS are they averaging when seeing the 20ms write latency? I ask because very low IO/idle hosts can report high latency when other systems are getting priority; however, when those idle systems actually need to do some real IO, the latency corrects itself and they get a share of the pie too. I am not sure if this is part of the intelligent optimization, or if it's just a math anomaly from calculating an "average" with too few data points.
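
To illustrate only the "too few data points" explanation (not the priority one), here is a minimal sketch. The per-IO service times are made-up numbers, not measurements from any array: the same single slow IO is averaged over a handful of IOs on a near-idle host and over thousands on a busy one.

```python
from statistics import mean


def average_latency_ms(samples_ms):
    """Plain arithmetic mean of per-IO service times, in milliseconds."""
    return mean(samples_ms)


# Hypothetical near-idle host: 4 IOs in the sample interval, one of them slow.
idle_host = [0.5, 0.5, 0.5, 80.0]

# Hypothetical busy host: 4000 IOs in the same interval, with the same one slow IO.
busy_host = [0.5] * 3999 + [80.0]

idle_avg = average_latency_ms(idle_host)
busy_avg = average_latency_ms(busy_host)

print(f"idle host: {idle_avg:.2f} ms average over {len(idle_host)} IOs")
print(f"busy host: {busy_avg:.2f} ms average over {len(busy_host)} IOs")
```

With these invented samples the idle host reports an average of roughly 20 ms while the busy host stays near 0.5 ms, even though both saw exactly one slow IO.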

Do you have an alert emailing you when this happens, or are you observing it in SSMC/Infosight? Is the reported high latency manifesting as application issues, or only observed in alerts and reports?

You may have more control over that SQL IO than you think! Have conversations with the DBAs to explain what you see, and understand what is going on when that unusually high block size is observed. I suspect they may be backing up the database from the Primera, TO the Primera! Possibly to the same CPG. It could also be batch imports/exports, or something else like refreshing a DEV box with a copy of PROD. There could be an opportunity for you to flex your storage skills and save the company time and money by leveraging snapshots instead!

Worst case scenario... if this host is an absolute 'bully' hogging resources, and after you draw your DBAs' attention to it they still cannot, or will not, help... you can look into applying a QoS policy to it to keep it from stepping on everyone else's toes.

_________________
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.


 Post subject: Re: Latency on Primera with 256KB io and 512KB io
Posted: Sun May 16, 2021 11:43 pm

Joined: Wed Jun 13, 2018 2:31 am
Posts: 33
I'm seeing the latency in the reports and also from the DB servers: we have a few critical DB servers that are monitored by Dynatrace, and the bosses are getting edgy about the high latency.

I've cross-checked node CPU, total SSD IOPS, service time latency, and node cache performance (where I can see delayack up to 3k).

I'm trying to correlate where the bottleneck is. Our Primera runs a mixed workload, but we do have a few highly latency-sensitive DBs on it.

Leveraging snapshots isn't possible: it would take a lot of work from the server and DB admins, as some of the hosts reside on Solaris LDoms.

We also have systems that do a schema backup before their batch processing.

I do agree with you on QoS, but the documentation is vague and I'm not really sure how to implement it. Let's say I have an MSSQL server that is latency sensitive: if I create a QoS rule for it with latency set to 1ms and IOPS and MB/s set as per its average workload, then a noisy neighbor won't impact it, right?


 Post subject: Re: Latency on Primera with 256KB io and 512KB io
Posted: Sun May 30, 2021 12:04 pm
Site Admin

Joined: Tue Aug 18, 2009 10:35 pm
Posts: 1328
Location: Dallas, Texas
Any luck pinpointing the bottleneck? Have you opened a support case for a deeper dive?

How does VLUN latency compare to VV latency, as reported by the Primera/SSMC?


Regarding QoS, or Priority Optimization:

Page 185 of the SSMC User Guide covers some of it. Note that in addition to the PO policy, there are also PO reports in System Reporter and PO alert settings you can manage.
https://support.hpe.com/hpesc/public/do ... cale=en_US

Page 295 of the CLI guide covers "setqos" with additional details.
https://support.hpe.com/hpesc/public/do ... 88929en_us

_________________
Richard Siemers
The views and opinions expressed are my own and do not necessarily reflect those of my employer.


 Post subject: Re: Latency on Primera with 256KB io and 512KB io
Posted: Tue Jun 01, 2021 6:40 am

Joined: Wed Jun 13, 2018 2:31 am
Posts: 33
Any luck pinpointing the bottleneck? Have you opened a support case for a deeper dive?

Not really. We logged a case, and backline just says our BE SSDs are overloaded because we are doing 15k to 20k IOPS...
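
As a rough aid for reading an IOPS figure like that one, the arithmetic below shows how much throughput the same IOPS rate implies at different IO sizes. It is only a sketch: the thread does not say what IO-size mix made up the 15k to 20k IOPS, the 8 KiB and 64 KiB rows are arbitrary comparison points, and nothing here models back-end effects such as RAID or data reduction.

```python
def throughput_mib_s(iops, io_size_kib):
    """Throughput in MiB/s for a given IOPS rate and IO size in KiB."""
    return iops * io_size_kib / 1024


# IO sizes in KiB: 256 and 512 come from the thread, 8 and 64 are for comparison.
for io_size_kib in (8, 64, 256, 512):
    for iops in (15_000, 20_000):
        mib_s = throughput_mib_s(iops, io_size_kib)
        print(f"{iops:>6} IOPS x {io_size_kib:>3} KiB = {mib_s:>8,.0f} MiB/s")
```

The point of the table is that an IOPS count alone says little about load: 20,000 IOPS at 256 KiB moves about 5,000 MiB/s, while the same rate at 8 KiB moves about 156 MiB/s.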

I've asked whether converting to thin LUNs (removing the deco) would lower the load on the BE SSDs, but my country's HPE team is still trying to get in touch with the performance team for a deep dive.

I've also asked whether the Primera SSDs (3TB/7TB/15TB) all have the same IOPS watermark, but they didn't answer that question. I'm not sure whether the Ninja performance sizing that was used to size our Primera from our 8440 calculated the load accurately.


 Post subject: Re: Latency on Primera with 256KB io and 512KB io
Posted: Sun Jun 05, 2022 11:30 am

Joined: Thu Apr 28, 2016 1:29 pm
Posts: 34
Richard Siemers wrote:
On the hosts that are latency 'victims', how many IOPS are they averaging when seeing the 20ms write latency? I ask because very low IO/idle hosts can report high latency when other systems are getting priority; however, when those idle systems actually need to do some real IO, the latency corrects itself and they get a share of the pie too. I am not sure if this is part of the intelligent optimization, or if it's just a math anomaly from calculating an "average" with too few data points.


So, if I may piggyback on this: when I look at the performance charts in SSMC, I see 13 hosts showing high latency, ranging from 25ms to 80ms, for a period of time every night around 11PM, but 12 of them are barely doing any IOPS during that period. So, can the high latency on those 12 hosts be ignored?

The one remaining host, a SQL server, is doing close to 8000 IOPS with read latency between 30ms and 40ms, which is still high for an all-flash Primera array. That lasts about an hour.

What do you suggest?


 Post subject: Re: Latency on Primera with 256KB io and 512KB io
Posted: Sun Jun 05, 2022 11:28 pm

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
Ignore any host with fewer than 10 IOPS. Comparing array stats with host stats usually shows that the host isn't actually seeing this latency.

For the remaining hosts, compare statvlun to statvv to see if the array is struggling. If VLUN latency is high and VV latency is low, the problem is outside the array. Also look at the queue: one can easily generate latency by increasing the queue depth on the host, because the "timer" starts once the OS sends the IO to the HBA.
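
The triage above can be sketched roughly as follows. This does not parse statvlun or statvv output: the latencies are typed in by hand, the host names and VV values are invented, and the 10 ms cutoff for "high" is an assumption (only the 10 IOPS floor comes from the post). The queue estimate uses Little's Law (outstanding IOs = IOPS x latency), which is my addition as one way to look at the queue, not something the array reports this way.

```python
from dataclasses import dataclass

MIN_IOPS = 10    # from the post: ignore hosts below this rate
HIGH_MS = 10.0   # hypothetical cutoff for "high" latency


@dataclass
class HostSample:
    name: str
    iops: float
    vlun_ms: float  # latency at the VLUN (host-facing) level
    vv_ms: float    # latency at the VV (inside the array) level


def triage(sample):
    """Classify one host sample using the rules of thumb from the post."""
    if sample.iops < MIN_IOPS:
        return "ignore: too few IOs for a meaningful average"
    if sample.vlun_ms < HIGH_MS:
        return "ok"
    if sample.vv_ms < HIGH_MS:
        return "look outside the array (host queue, HBA, fabric)"
    return "array is struggling"


def outstanding_ios(iops, latency_ms):
    """Little's Law: average number of IOs in flight at this rate and latency."""
    return iops * latency_ms / 1000.0


# Invented samples; only 8000 IOPS and 30-40 ms echo numbers from the thread.
samples = [
    HostSample("idle-01", iops=3, vlun_ms=60.0, vv_ms=0.4),
    HostSample("sql-01", iops=8000, vlun_ms=35.0, vv_ms=1.2),
    HostSample("sql-02", iops=8000, vlun_ms=35.0, vv_ms=30.0),
]

for s in samples:
    print(f"{s.name}: {triage(s)}")

low, high = outstanding_ios(8000, 30), outstanding_ios(8000, 40)
print(f"8000 IOPS at 30-40 ms implies about {low:.0f} to {high:.0f} IOs in flight")
```

Read that way, 8000 IOPS at 30 to 40 ms corresponds to roughly 240 to 320 IOs outstanding on average, which is the kind of number to compare against the host's configured queue depth.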

_________________
The views and opinions expressed are my own and do not necessarily reflect those of my current or previous employers.

