HPE Storage Users Group

A Storage Administrator Community




 Post subject: Performance degraded after upgrade from 3PAR 3.2.2 to 3.3.1
PostPosted: Fri Aug 09, 2019 2:46 pm 

Joined: Fri Aug 09, 2019 2:18 pm
Posts: 3
Hello,

We have a 3PAR 8200 that we upgraded to 3.3.1 MU2. Right after the upgrade we noticed lower performance (higher latency) on a database, specifically on Solaris servers.

I noticed that IOPS on the SSD tier decreased after the upgrade, which makes me think AO (Adaptive Optimization) is not running the same way it did before, and that this is why we are experiencing more latency on our platform.

I have made some changes to AO (extending the measurement interval), but I still don't get the same performance as before the upgrade :(
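For reference, this is roughly how I've been checking and re-running AO from the CLI. It's only a sketch from memory; the AO config name is a placeholder and the startao options should be verified with "help startao" on your InForm OS level:

# list the AO configurations (tiers, mode, CPGs)
showaocfg
# re-run AO by hand over a wider measurement window; a negative -btsecs is
# seconds back from now, so -86400 would mean "analyze the last 24 hours"
startao -btsecs -86400 <AO_config_name>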

Do you know of any other possible causes, or have you seen similar behavior after this upgrade?

This array is also shared with a VMware platform. I opened a case, but I only received comments about VMware best practices; we already applied them, with no change :(

Thanks!! Regards! :)


Attachments:
VLUNs.png (VLUN database performance, last month)
SSDs.png (SSD performance, last month)
 Post subject: Re: Performance degraded after upgrade from 3PAR 3.2.2 to 3.3.1
PostPosted: Fri Aug 09, 2019 4:09 pm 

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
Getting anything out of a visual graph is like selling sand in the Sahara.

What it looks like to me is that you are getting more writes on the VLUNs and fewer writes on the PDs, and that doesn't make sense unless you're getting a lot more cache hits. How do the graphs for bandwidth and IO size look?

And just a hunch: maybe look for the issue rather than what has changed... At least one important thing has happened: both nodes have rebooted. That might cause some hosts to lose a path or two, which could impact balance. Are all VLUNs following the same trend, or is it just some VLUNs or VVs?

Edit: from the looks of it, all PDs look better after the upgrade and the VLUNs look worse. What about the VVs? If VV performance is good and VLUN performance is bad, the problem is usually outside the 3PAR.
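If it helps, this is the kind of layer-by-layer comparison I mean. Just a sketch; exact columns and options can differ slightly between InForm OS versions, so check "help" for each command:

# host-facing view: service time and queue length per exported VLUN
statvlun -ni -rw
# volume view: if the VVs look fine while the VLUNs look bad, look outside the array
statvv -rw
# back-end view per physical disk, to compare against the VLUN numbers
statpd -rw
# or pull history from on-node System Reporter instead of live samples
srstatvlun -hourly
srstatpd -hourly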

_________________
The views and opinions expressed are my own and do not necessarily reflect those of my current or previous employers.


 Post subject: Re: Performance degraded after upgrade from 3PAR 3.2.2 to 3.3.1
PostPosted: Fri Aug 09, 2019 5:12 pm 

Joined: Fri Aug 09, 2019 2:18 pm
Posts: 3
MammaGutt wrote:
Getting anything out of a visual graph is like selling sand in the Sahara.

What it looks like to me is that you are getting more writes on the VLUNs and fewer writes on the PDs, and that doesn't make sense unless you're getting a lot more cache hits. How do the graphs for bandwidth and IO size look?

And just a hunch: maybe look for the issue rather than what has changed... At least one important thing has happened: both nodes have rebooted. That might cause some hosts to lose a path or two, which could impact balance. Are all VLUNs following the same trend, or is it just some VLUNs or VVs?

Edit: from the looks of it, all PDs look better after the upgrade and the VLUNs look worse. What about the VVs? If VV performance is good and VLUN performance is bad, the problem is usually outside the 3PAR.



Hello! Thanks for your answer!

The SSDs are OK, but what I can't explain is why they show lower usage after the upgrade. I think that's why the VLUNs look worse.

Regarding what you said, HP reviewed the case and told us to change some LUNs from MRU to Round Robin. You can see that in the host ports graph (it was unbalanced both before and after the upgrade, but since we made the change it looks good and balanced). Apparently, though, this is not the cause of the behavior on the SSDs.
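For anyone following along, the change on the ESXi side was along these lines. This is a sketch: the naa ID below is a placeholder, and the SATP/PSP combination should be checked against the HPE 3PAR/VMware implementation guide for your ESXi version:

# show the current path selection policy per device
esxcli storage nmp device list
# switch a single LUN from MRU to Round Robin (device ID is a placeholder)
esxcli storage nmp device set --device naa.60002ac0000000000000000000012345 --psp VMW_PSP_RR
# optionally make Round Robin the default for new ALUA-claimed devices
esxcli storage nmp satp set --satp VMW_SATP_ALUA --default-psp VMW_PSP_RR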

All VLUNs are following the same trend, but the ones where we are noticing degradation belong to the DBA team, and this is a production environment.

I also attached what you asked for. I really don't see any change in those charts as big as the one I see in SSD IOPS; the small improvement I got came from changing the AO measurement interval.


Attachments:
hostports.png
SSDs_BW_IO.PNG
VLUNs_BW_Size.PNG
 Post subject: Re: Performance degraded after upgrade from 3PAR 3.2.2 to 3.3.1
PostPosted: Sat Aug 10, 2019 12:19 am 

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
Okay...

The queue length (Qlen) worries me... both before and after the upgrade.

Without taking a very deep look, I would say you probably already had some issues before the upgrade, but you might not have noticed them.

Did you use flashcache or anything prior to the upgrade?
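To be concrete about what I'd compare (a rough sketch; the exact options depend on your OS level, and the time window below is just an example):

# hourly history of the exported LUNs: compare Qlen and service time for the
# two weeks before vs. after the upgrade (negative -btsecs is seconds back from now)
srstatvlun -hourly -btsecs -1209600
# the same comparison per physical disk shows whether the queueing is front-end or back-end
srstatpd -hourly -btsecs -1209600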

_________________
The views and opinions expressed are my own and do not necessarily reflect those of my current or previous employers.


 Post subject: Re: Performance degraded after upgrade from 3PAR 3.2.2 to 3.3.1
PostPosted: Sat Aug 10, 2019 8:25 am 

Joined: Thu Feb 04, 2016 4:12 pm
Posts: 28
Lower SSD use could be down to AFC or AO, possibly.
During the upgrade, if it was assisted by HPE, they would disable AFC and cancel or pause the AO schedule. I'd make sure the AO config is still scheduled and the task is enabled, and also check that AFC is in use.
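Something like this, as a rough post-upgrade checklist (a sketch only; verify the exact options with "help" on the array):

# is the AO policy still defined and pointing at the right CPGs?
showaocfg
# is the AO task still scheduled and not paused?
showsched
# is anything AO/AFC related currently running?
showtask -active
# is Adaptive Flash Cache still created and enabled where you expect it?
showflashcache
showflashcache -vvset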


 Post subject: Re: Performance degraded after upgrade from 3PAR 3.2.2 to 3.3.1
PostPosted: Sat Aug 10, 2019 11:20 am 

Joined: Fri Aug 09, 2019 2:18 pm
Posts: 3
Flash Cache is enabled.

I also thought the reason could be the AO schedule, but it's enabled. I have made some changes, as I said before, but SSD usage is still not the same as it was before.

I changed the AO schedule to run 3 times per day, every 8 hours (before it ran just once a day), and I saw some improvement, but not a full recovery.

I know the FC disks are oversubscribed (HP told me that, and I also noticed it in checkhealth), but I think there is something else. We wanted to downgrade to 3.2.2, but L2 support said it was not possible.

Now we are planning to migrate this database to another 3PAR where all VVs will be on an SSD CPG, but I wanted to be sure there wasn't anything else I had missed.
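For reference, this is roughly how I've been looking at the FC tier (a sketch; option names may differ on your OS level):

# overall health, including alerts about overloaded PDs
checkhealth
# list the FC spindles and their state/capacity
showpd -p -devtype FC
# live back-end load; compare service times on the FC rows against the SSD rows
statpd -rw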


3par8200 cli% showflashcache -vv
Flash Cache enabled for all Virtual Volumes

3par8200 cli% showflashcache -vvset
Flash Cache enabled for all Virtual Volumes

3par8200 cli% showflashcache
                 -(MB)-
Node Mode  State   Size Used%
   0 SSD  normal  65536   100
   1 SSD  normal  65536   100
-----------------------------
   2 total       131072

3par8200 cli% statcache
12:00:38 08/10/2019 ------- Current --------  -------- Total ---------
                        CMP  FMP Total            CMP  FMP Total
Node Type  Accesses    Hit% Hit%  Hit%  Accesses  Hit% Hit%  Hit%
   0 Read      8090      84    4    88      8090    84    4    88
   0 Write     3833      11    0    11      3833    11    0    11
   1 Read      4634      72    5    77      4634    72    5    77
   1 Write     4160      12    0    12      4160    12    0    12

Internal Flashcache Activity
                     ----- Current ------  ------- Total --------
Node Type            Accesses IO/s MB/s    Accesses IO/s MB/s
   0 Read Back            294  147    1         294  147    1
   0 Destaged Write         0    0    0           0    0    0
   1 Read Back            241  121    1         241  121    1
   1 Destaged Write         0    0    0           0    0    0

------------------- FMP Queue Statistics --------------------
Node Dormant Cold   Norm   Warm     Hot Destage Read Flush WrtBack
   0       0    0 364264 103486 3726554       0    0     0       0
   1       0    0 166658 110641 3917005       0    0     0       0

------------------- CMP Queue Statistics --------------------
Node  Free  Clean Write1 WriteN WrtSched Writing DcowPend DcowProc
   0 93891 824236   1186    595      368     132        0        0
   1 95703 824442   1470    432      600      48        0        0

Press the enter key to stop...

12:00:40 08/10/2019 ------- Current --------  -------- Total ---------
                        CMP  FMP Total            CMP  FMP Total
Node Type  Accesses    Hit% Hit%  Hit%  Accesses  Hit% Hit%  Hit%
   0 Read     13963      91    1    93     22053    89    2    91
   0 Write     6833      31    0    31     10666    24    0    24
   1 Read      8825      87    2    89     13459    82    3    85
   1 Write     6094      32    0    32     10254    24    0    24

Internal Flashcache Activity
                     ----- Current ------  ------- Total --------
Node Type            Accesses IO/s MB/s    Accesses IO/s MB/s
   0 Read Back            189   95    1         483  121    1
   0 Destaged Write         0    0    0           0    0    0
   1 Read Back            150   75    1         391   98    1
   1 Destaged Write         0    0    0           0    0    0

------------------- FMP Queue Statistics --------------------
Node Dormant Cold   Norm   Warm     Hot Destage Read Flush WrtBack
   0       0    0 364264 103486 3726554       0    0     0       0
   1       0    0 166658 110641 3917005       0    0     0       0

------------------- CMP Queue Statistics --------------------
Node  Free  Clean Write1 WriteN WrtSched Writing DcowPend DcowProc
   0 97796 825637    433   1058      402      96        0        0
   1 95413 825694    298    692      310       7        0        0

Press the enter key to stop...

12:00:42 08/10/2019 ------- Current --------  -------- Total ---------
                        CMP  FMP Total            CMP  FMP Total
Node Type  Accesses    Hit% Hit%  Hit%  Accesses  Hit% Hit%  Hit%
   0 Read     16875      93    1    95     38928    91    2    93
   0 Write    11141      25    0    25     21807    25    0    25
   1 Read     14732      92    1    93     28191    87    2    89
   1 Write     8991      18    0    18     19245    21    0    21

Internal Flashcache Activity
                     ----- Current ------  ------- Total --------
Node Type            Accesses IO/s MB/s    Accesses IO/s MB/s
   0 Read Back            245  122    1         728  121    1
   0 Destaged Write         0    0    0           0    0    0
   1 Read Back            176   88    1         567   94    1
   1 Destaged Write         1    0    0           1    0    0

------------------- FMP Queue Statistics --------------------
Node Dormant Cold   Norm   Warm     Hot Destage Read Flush WrtBack
   0       0    0 364264 103486 3726554       0    0     0       0
   1       0    0 166658 110641 3917005       0    0     0       0

------------------- CMP Queue Statistics --------------------
Node  Free  Clean Write1 WriteN WrtSched Writing DcowPend DcowProc
   0 95251 822204   2063    863      762       0        0        0
   1 95284 822242   1714    760      627       0        0        0

Press the enter key to stop...


 Post subject: Re: Performance degraded after upgrade from 3PAR 3.2.2 to 3.3.1
PostPosted: Sat Aug 10, 2019 12:37 pm 

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
The only supported downgrade requires putting the system through OOTB again (basically wiping the array and starting over).

Running AO 3 times per day usually sounds like a bad idea unless you have a very unusual environment. Is it likely that blocks that were not active at 1 AM but were active at 9 AM will still be active at 5 PM?

How full is your SSD tier? 100% full?
If so, I would ditch Flash Cache and let AO use those 128GB. Look at your statcache output. Your 8200 has 64GB of cache; if my memory serves me right, that is 32GB per node, with 16GB as control cache and 16GB as data cache. Flash Cache is a read-only cache.

Of that on-node cache you are getting ~91% hit on node 0 and ~87% hit on node 1 currently, and 89% / 82% as a historical average. That is 32GB of data cache in total across both nodes (and it is shared with write cache, so effectively far less) providing those numbers. Your 128GB of Flash Cache is hitting 1% and 2% currently, and 2% and 3% historically... My guess is that those 128GB would do _a lot_ more as part of your AO config.

And I really have to say, you're getting in the neighbourhood of a 90% read cache hit rate. That is really high. If your system is already having issues on the FC layer, think about what happens if your workload changes just a little and your cache hit rate drops to something like 50%. You'll be in big trouble.

If you look at the System Reporter IO Region Density report, you will see whether you have a lot of hot blocks on FC (srrgiodensity in the CLI if you want the details in readable form).
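If you want to take that further, a rough sketch of what I'd run (options vary by OS level, so check "help srrgiodensity" first; removing Flash Cache is disruptive enough that it belongs in a change window):

# region IO density per CPG; a lot of busy regions on the FC CPG means AO has
# more hot data than the SSD tier can currently hold
srrgiodensity
# if you decide to give the 128GB of Flash Cache back to AO:
removeflashcache
# then confirm the AO config and let the next scheduled run rebalance with the extra SSD capacity
showaocfg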

_________________
The views and opinions expressed are my own and do not necessarily reflect those of my current or previous employers.

