HPE Storage Users Group

A Storage Administrator Community




 Post subject: Poor performance
PostPosted: Wed Nov 20, 2013 3:44 pm 

Joined: Tue Nov 19, 2013 4:38 pm
Posts: 119
Hi,

We have extremely poor performance on our 4-node 3PAR 7400. It is not in production yet, so I am pretty much alone on the system when testing.
I'm testing from a VM on vSphere ESXi 5.1 installed on an HP BL460c Gen8 blade server. I've zoned 4 paths for the host to one node pair and set multipathing to Round Robin in ESXi.
Write performance is extremely poor and unreliable. It "freezes up" under sustained writes and ESXi records latency in seconds. Read performance is better but not what I would expect. For example, my 48-disk HP EVA can manage around 80 MB/sec in 8 KB reads, while the 96 FC-disk 3PAR only manages around 17 MB/sec, and the EVA is fully loaded (the 3PAR is going to replace it :D )

I’ve attached ATTO benchmarks for both systems.

What could be the cause and where should I look?


Attachments:
File comment: EVA ATTO
EVA.pdf [15.21 KiB]
Downloaded 1261 times
File comment: 3PAR ATTO
3PAR.pdf [14.38 KiB]
Downloaded 1461 times
 Post subject: Re: Poor performance
PostPosted: Wed Nov 20, 2013 4:02 pm 

Joined: Thu Dec 06, 2012 1:25 pm
Posts: 138
Any errors on the links to the 3PAR? (In the IMC, browse to the host and look at the Link Errors tab, lower right.)
What is the speed of the links?
If they are 8 Gbit switches, did you set the fillword on the ports to 3, on both the host-facing and 3PAR-facing switch ports? (not on any ISLs)

How is the 3PAR configured? What does the CPG configuration look like?

To better understand the config, could you post the output of:
- showcpg -sdg
- showversion
- checkhealth -svc
- showhost (for the relevant hosts used)
- showvv (for the relevant vv's used)

While running the load test, could you run the following and post the output?

- statvlun -ni -hostsum -iter 1
- statcmp -iter 1

Let's work from there :)

_________________
The goal is to achieve the best results by following the client's wishes. If they want a house built upside down, standing on its chimney, it's up to you to figure out how to do it while still making it usable.


 Post subject: Re: Poor performance
PostPosted: Wed Nov 20, 2013 4:25 pm 

Joined: Tue Nov 19, 2013 4:38 pm
Posts: 119
Architect wrote:
Any errors on the links to the 3PAR? (In the IMC, browse to the host and look at the Link Errors tab, lower right.)
What is the speed of the links?
If they are 8 Gbit switches, did you set the fillword on the ports to 3, on both the host-facing and 3PAR-facing switch ports? (not on any ISLs)


No link errors.
The link speed is 8 Gbit and the switch is 8 Gbit.

I didn't set the fillword (what is that? :oops: )

Architect wrote:
How is the 3PAR configured? What does the CPG configuration look like?

To better understand the config, could you post the output of:
- showcpg -sdg
- showversion
- checkhealth -svc
- showhost (for the relevant hosts used)
- showvv (for the relevant vv's used)

While running the load test, could you run the following and post the output?

- statvlun -ni -hostsum -iter 1
- statcmp -iter 1

Let's work from there :)


:)

3PAR1 cli% showcpg -sdg
               ------(MB)------
Id Name        Warn Limit  Grow Args
 0 FC_r1          -     - 65536 -ssz 2 -ha cage -t r1 -p -devtype FC
 1 FC_r5          -     - 65536 -ssz 4 -ha cage -t r5 -p -devtype FC
 2 FC_r6          -     - 65536 -ssz 8 -ha cage -t r6 -p -devtype FC
 3 SSD_r1         -     - 16384 -ssz 2 -ha cage -t r1 -p -devtype SSD
 4 SSD_r5         -     - 16384 -ssz 4 -ha cage -t r5 -p -devtype SSD
 5 NL_r1          -     - 65536 -ssz 2 -ha cage -t r1 -p -devtype NL
 6 NL_r6          -     - 65536 -ssz 6 -ha cage -t r6 -p -devtype NL
 7 FC_R5_nonAT    -     - 65536 -t r5 -ha cage -ssz 4 -ss 128 -ch first -p -devtype FC

3PAR1 cli% showversion
Release version 3.1.2 (MU2)
Patches: P10

Component Name Version
CLI Server 3.1.2 (MU2)
CLI Client 3.1.2 (MU2)
System Manager 3.1.2 (MU2)
Kernel 3.1.2 (MU2)
TPD Kernel Code 3.1.2 (MU2)
3PAR1 cli%

3PAR1 cli% checkhealth -svc
Checking .... (I've removed the rest)
System is healthy

3PAR1 cli% showhost
Id Name  Persona -WWN/iSCSI_Name- Port
 0 ESX11 VMware  5001438026EB1EC6 0:1:2
                 5001438026EB1EC4 0:1:1
                 5001438026EB1EC4 1:1:1
                 5001438026EB1EC6 1:1:2

3PAR1 cli% showvv
                                          ---Rsvd(MB)---  -(MB)--
Id Name        Prov Type CopyOf BsId Rd -Detailed_State- Adm Snp    Usr   VSize
 3 Test_Volume tpvv base ---       3 RW normal           512   0  93184 2097152
-------------------------------------------------------------------------------
 3 total                                                 512   0 185344 2189312

3PAR1 cli% statvlun -ni -hostsum -iter 1
22:18:50 11/20/2013 r/w I/O per second       KBytes per sec    Svt ms     IOSz KB
           Hostname      Cur  Avg  Max    Cur    Avg    Max  Cur  Avg   Cur   Avg Qlen
              ESX11   t  820  820  820 212605 212605 212605 0.94 0.94 259.2 259.2    1
--------------------------------------------------------------------------------------
                  1   t  820  820      212605 212605        0.94 0.94 259.2 259.2    1

3PAR1 cli% statcmp -iter 1
22:20:11 11/20/2013 ----- Current ----- ---------- Total ----------
    Node Type       Accesses  Hits Hit% Accesses  Hits Hit% LockBlk
       0 Read          25139 25140  100    25139 25140  100       0
       0 Write            45    15   33       45    15   33       3
       1 Read          25137 25137  100    25137 25137  100       0
       1 Write            14     1    7       14     1    7       2
       2 Read          24002 24001  100    24002 24001  100       0
       2 Write            19     8   42       19     8   42       3
       3 Read          24815 24815  100    24815 24815  100       0
       3 Write             2     1   50        2     1   50       0

        Queue Statistics
Node  Free  Clean Write1 WriteN WrtSched Writing DcowPend DcowProc
   0 34338 381983      0      0        8       0        0        0
   1 35996 380859      1      0        2       0        0        0
   2 34471 385390      0      0        0       0        0        0
   3 33088 383499      0      0        0       1        0        0

        Temporary and Page Credits
Node Node0 Node1 Node2 Node3 Node4 Node5 Node6 Node7
   0   589 17231 16142 16089   ---   ---   ---   ---
   1 16352   314 17228 15722   ---   ---   ---   ---
   2 14958 16533     0 15122   ---   ---   ---   ---
   3 16789 16195 16901     0   ---   ---   ---   ---

        Page Statistics
     ---------CfcDirty--------- ------------CfcMax------------ ----------DelAck----------
Node FC_10KRPM FC_15KRPM NL SSD FC_10KRPM FC_15KRPM   NL   SSD FC_10KRPM FC_15KRPM NL SSD
   0         8         0  0   0     38400         0 7200 38400     30125         0  0   0
   1         3         0  0   0     38400         0 7200 38400       448         0  0   0
   2         0         0  0   0     38400         0 7200 38400         0         0  0   0
   3         1         0  0   0     38400         0 7200 38400     37365         0  0   0


 Post subject: Re: Poor performance
PostPosted: Wed Nov 20, 2013 5:19 pm 

Joined: Thu Dec 06, 2012 1:25 pm
Posts: 138
Quote:

No link errors.
The link speed is 8 Gbit and the switch is 8 Gbit.

I didn't set the fillword (what is that? :oops: )


It's the fill word a switch transmits on the link between frames of data. Different systems need a different "dialect"; if it is wrong you'll get loads of enc out errors, and that can lead to serious performance impacts. See this blog for a bit of background info: http://loopbackconnector.com/2013/02/14 ... gfillword/

If you are using (rebranded) Brocade switches, do a portcfgshow to see the current fillword settings, and a portstatsshow to see the error counters; the er_bad_os counter should be around zero, or at least quite low (in the hundreds at most). The blog will also tell you how to set it, so please do if it is wrong! It should be set to 3 for all 3PAR ports connected to the switch and for all host ports connected to the switch. ISLs (Inter-Switch Links) should be using mode 0.
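For reference, a minimal sketch of those checks on a Brocade (B-series) switch, assuming port 4 is one of your 8 Gbit host or 3PAR ports (the port number is just a placeholder):

Code:
switch:admin> portcfgshow 4          # current port config, including the Fill Word mode
switch:admin> portstatsshow 4        # er_bad_os should be at or near zero
switch:admin> portstatsclear 4       # clear the counters before re-testing
switch:admin> portcfgfillword 4, 3   # set fill word mode 3 (host/array F-ports only, not ISLs)

Note that changing the fill word may briefly disturb the link, so do it outside production traffic.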

Quote:
Code:
3PAR1 cli% showcpg -sdg
               ------(MB)------
Id Name        Warn Limit  Grow Args
 0 FC_r1          -     - 65536 -ssz 2 -ha cage -t r1 -p -devtype FC
 1 FC_r5          -     - 65536 -ssz 4 -ha cage -t r5 -p -devtype FC
 2 FC_r6          -     - 65536 -ssz 8 -ha cage -t r6 -p -devtype FC
 3 SSD_r1         -     - 16384 -ssz 2 -ha cage -t r1 -p -devtype SSD
 4 SSD_r5         -     - 16384 -ssz 4 -ha cage -t r5 -p -devtype SSD
 5 NL_r1          -     - 65536 -ssz 2 -ha cage -t r1 -p -devtype NL
 6 NL_r6          -     - 65536 -ssz 6 -ha cage -t r6 -p -devtype NL
 7 FC_R5_nonAT    -     - 65536 -t r5 -ha cage -ssz 4 -ss 128 -ch first -p -devtype FC



That looks OK. In which CPG is the test LUN located? CPG 7, I guess?

Quote:
Code:
3PAR1 cli% showhost
Id Name  Persona -WWN/iSCSI_Name- Port
 0 ESX11 VMware  5001438026EB1EC6 0:1:2
                 5001438026EB1EC4 0:1:1
                 5001438026EB1EC4 1:1:1
                 5001438026EB1EC6 1:1:2



Looks OK as well.

Quote:
Code:
3PAR1 cli% statvlun -ni -hostsum -iter 1
22:18:50 11/20/2013 r/w I/O per second       KBytes per sec    Svt ms     IOSz KB
           Hostname      Cur  Avg  Max    Cur    Avg    Max  Cur  Avg   Cur   Avg Qlen
              ESX11   t  820  820  820 212605 212605 212605 0.94 0.94 259.2 259.2    1
--------------------------------------------------------------------------------------
                  1   t  820  820      212605 212605        0.94 0.94 259.2 259.2    1


OK, so the 3PAR is doing 820 IOPS for this host, at 212 MB/sec with 0.94 ms response times.
So the 3PAR is doing this work with six fingers up its nose and can do much, much more. The queue length of 1 is already directing my attention to the host and not the 3PAR.

Quote:
Code:
3PAR1 cli% statcmp -iter 1
22:20:11 11/20/2013 ----- Current ----- ---------- Total ----------
    Node Type       Accesses  Hits Hit% Accesses  Hits Hit% LockBlk
       0 Read          25139 25140  100    25139 25140  100       0
       0 Write            45    15   33       45    15   33       3
       1 Read          25137 25137  100    25137 25137  100       0
       1 Write            14     1    7       14     1    7       2
       2 Read          24002 24001  100    24002 24001  100       0
       2 Write            19     8   42       19     8   42       3
       3 Read          24815 24815  100    24815 24815  100       0
       3 Write             2     1   50        2     1   50       0

This is interesting: all the reads are coming 100% from cache (100% cache hit ratio),
which partially explains the excellent 0.94 ms response times. The low write figures show you were probably running a read test.


Quote:
Code:
        Queue Statistics
Node  Free  Clean Write1 WriteN WrtSched Writing DcowPend DcowProc
   0 34338 381983      0      0        8       0        0        0
   1 35996 380859      1      0        2       0        0        0
   2 34471 385390      0      0        0       0        0        0
   3 33088 383499      0      0        0       1        0        0


OK, very low write hits seen, and some writes are scheduled to disk. Very low figures though, nothing to worry about.

Quote:
Code:
        Page Statistics
     ---------CfcDirty--------- ------------CfcMax------------ ----------DelAck----------
Node FC_10KRPM FC_15KRPM NL SSD FC_10KRPM FC_15KRPM   NL   SSD FC_10KRPM FC_15KRPM NL SSD
   0         8         0  0   0     38400         0 7200 38400     30125         0  0   0
   1         3         0  0   0     38400         0 7200 38400       448         0  0   0
   2         0         0  0   0     38400         0 7200 38400         0         0  0   0
   3         1         0  0   0     38400         0 7200 38400     37365         0  0   0


Writes go to the FC 10k rpm drives; SSD and NL are not being used.
I see some DelAcks though, so somewhere in its life it has been overloading the FC 10K tier.
The current writes are so low that I don't even have to look further down the back end. The 3PAR looks fine and is happily idling away.

My guess is that the ATTO test software is not using multiple parallel IO streams to really load the 3PAR, and that you are hitting the boundaries of using a queue depth of one. A 3PAR is designed from the ground up to do massively parallel IO processing; single-stream IOs are handled a little slower as a consequence.

I'd advise switching to IOmeter and doing a new test with 16 worker threads. If you want to test the bandwidth, do a 32 KB block size, 100% sequential read or write test. It should drive each of your HBAs to 800 MB/sec without even breaking a sweat.
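If a CLI alternative to IOmeter is easier from a Linux test VM, a roughly equivalent sketch with fio (the file path and size are placeholders); 16 jobs of 32 KB sequential writes approximate the 16-worker IOmeter run:

Code:
fio --name=seqwrite --filename=/mnt/testvol/fio.dat --size=8g \
    --rw=write --bs=32k --numjobs=16 --iodepth=8 \
    --ioengine=libaio --direct=1 --group_reporting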

_________________
The goal is to achieve the best results by following the client's wishes. If they want a house built upside down, standing on its chimney, it's up to you to figure out how to do it while still making it usable.


 Post subject: Re: Poor performance
PostPosted: Wed Nov 20, 2013 5:23 pm 

Joined: Thu Oct 24, 2013 6:50 pm
Posts: 185
skumflum wrote:
Hi,
I'm testing from a VM on vSphere ESXi 5.1 installed on an HP BL460c Gen8 blade server. I've zoned 4 paths for the host to one node pair and set multipathing to Round Robin in ESXi.


What SATP is VMware set up with?
Have you applied all relevant VMware updates?
Is the firmware on your HBA up to date?


 Post subject: Re: Poor performance
PostPosted: Wed Nov 20, 2013 5:34 pm 

Joined: Thu Dec 06, 2012 1:25 pm
Posts: 138
Hmm, agreed. I was assuming he configured VMware as he should and that his firmware is within 3PAR-supported levels.

Bad firmware does not give the performance figures above though, and wrong multipath settings would e.g. force all data over just one path, limiting performance to a single 8 Gbit path. As that is not the case, it's not the resolution to his issue.

_________________
The goal is to achieve the best results by following the client's wishes. If they want a house built upside down, standing on its chimney, it's up to you to figure out how to do it while still making it usable.


 Post subject: Re: Poor performance
PostPosted: Wed Nov 20, 2013 5:57 pm 

Joined: Tue Nov 19, 2013 4:38 pm
Posts: 119
Josh26 wrote:
skumflum wrote:
Hi,
I'm testing from a VM on vSphere ESXi 5.1 installed on an HP BL460c Gen8 blade server. I've zoned 4 paths for the host to one node pair and set multipathing to Round Robin in ESXi.


What SATP is VMware setup with?
Have you applied all relevant VMware updates?
Is the firmware on your HBA up to date?


SATP: SATP_ALUA
Yes, and the HP VIBs
Yes, the HBA firmware is up to date
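For completeness, a minimal sketch of how that can be verified from the ESXi 5.1 shell (the grep filters are just examples):

Code:
~ # esxcli storage nmp satp rule list | grep -i 3PAR     # claim rules that apply to 3PARdata devices
~ # esxcli storage nmp device list | grep -A5 "naa."     # SATP and PSP actually in use per device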


 Post subject: Re: Poor performance
PostPosted: Wed Nov 20, 2013 6:28 pm 

Joined: Tue Nov 19, 2013 4:38 pm
Posts: 119
I really appreciate this :)

I have checked the fillword and it is set to 3 on all ports. The ISLs are also set to 3 though, but the performance of our EVA is fine. Should I set this to 0?

I then reset the port stats on the switch: 0 errors.

I ran IOmeter with the suggested test (write only). The result is as follows:
Total I/O: 414
Total MB/sec: 12
Avg IO response (ms): 2.4041
Max IO response (ms): 20366.0583
New stats:


3PAR1 cli% statvlun -ni -hostsum -iter 1
00:06:28 11/21/2013 r/w I/O per second       KBytes per sec    Svt ms     IOSz KB
           Hostname      Cur  Avg  Max    Cur    Avg    Max  Cur  Avg   Cur   Avg Qlen
              ESX11   t  818  818  818 212921 212921 212921 0.96 0.96 260.4 260.4    1
--------------------------------------------------------------------------------------
                  1   t  818  818      212921 212921        0.96 0.96 260.4 260.4    1

3PAR1 cli% statcmp -iter 1
00:07:06 11/21/2013 ----- Current ----- ---------- Total ----------
    Node Type       Accesses  Hits Hit% Accesses  Hits Hit% LockBlk
       0 Read              0     0    0        0     0    0       0
       0 Write          2835   942   33     2835   942   33       3
       1 Read              0     0    0        0     0    0       0
       1 Write          3239  1080   33     3239  1080   33       3
       2 Read              0     0    0        0     0    0       0
       2 Write          3082  1024   33     3082  1024   33       3
       3 Read          25840 25763  100    25840 25763  100       0
       3 Write          1056   349   33     1056   349   33       0

        Queue Statistics
Node  Free  Clean Write1 WriteN WrtSched Writing DcowPend DcowProc
   0 46587 362592      1    762      256       0        0        0
   1 50950 359367      0   1183      311       0        0        0
   2 31376 384419      0   1699      415       0        0        0
   3 30410 384957      0   1232      259       0        0        0

        Temporary and Page Credits
Node Node0 Node1 Node2 Node3 Node4 Node5 Node6 Node7
   0    16 18817 18183 19166   ---   ---   ---   ---
   1 17617     0 19130 17916   ---   ---   ---   ---
   2 15829 15289     0 17447   ---   ---   ---   ---
   3 16471 16704 16441     0   ---   ---   ---   ---

        Page Statistics
     ---------CfcDirty--------- ------------CfcMax------------ ----------DelAck----------
Node FC_10KRPM FC_15KRPM NL SSD FC_10KRPM FC_15KRPM   NL   SSD FC_10KRPM FC_15KRPM NL SSD
   0      1019         0  0   0     38400         0 7200 38400     30125         0  0   0
   1      1494         0  0   0     38400         0 7200 38400       448         0  0   0
   2      2114         0  0   0     38400         0 7200 38400         0         0  0   0
   3      1491         0  0   0     38400         0 7200 38400     37365         0  0   0


 Post subject: Re: Poor performance
PostPosted: Thu Nov 21, 2013 4:19 am 

Joined: Thu Dec 06, 2012 1:25 pm
Posts: 138
[edit]
I've been reading up on ISL fillword settings again: if VCinit is disabled it should be 0, if it is enabled it should be 1, but 2 and 3 are allowed as well. So my initial advice on the ISLs was incomplete and could have been wrong. Sorry.
[/edit]

Hmm, there is something really strange in that case. It seems like you can't get past 2 Gbit per second, and while the 3PAR is quick, the host sees bad response times. That really points the issue to your SAN.

Is the host connected to the 3PAR directly via the same switch, or are there ISLs in between?
Are the ISLs configured and running at 8 Gbit fixed? (Ports for the 3PAR and the ISLs should always be hardcoded to 8 Gbit so that they cannot autonegotiate to a lower value.)
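A minimal sketch of how to check and pin that on a Brocade switch (port 0 is a placeholder; verify your own port numbering first):

Code:
switch:admin> switchshow            # negotiated speed per port (N8 = 8 Gbit, N4 = 4 Gbit)
switch:admin> portcfgspeed 0, 8     # lock a 3PAR or ISL port to 8 Gbit instead of auto
switch:admin> portcfgshow 0         # confirm the Speed setting is no longer AUTO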

What type of test was this? A read or a write test?

I get the impression something is wrong in your SAN setup: bad ISL config, bad cable, bad SFP, something like that. If possible, PM me a drawing of how the SAN looks and what the path is from the host to the 3PAR.

_________________
The goal is to achieve the best results by following the client's wishes. If they want a house built upside down, standing on its chimney, it's up to you to figure out how to do it while still making it usable.


Last edited by Architect on Thu Nov 21, 2013 5:21 pm, edited 2 times in total.

 Post subject: Re: Poor performance
PostPosted: Thu Nov 21, 2013 5:24 am 

Joined: Tue Nov 19, 2013 4:38 pm
Posts: 119
Architect wrote:
Yes, ISLs between Brocade switches (with a recent FOS version) should be 0. But of course verify with some other expert before implementing. (Wouldn't want to cause outages.)


Okay I will investigate this.


Architect wrote:
Hmm, there is something really strange in that case. It seems like you can't get past 2 Gbit per second, and while the 3PAR is quick, the host sees bad response times. That really points the issue to your SAN.

Is the host connected to the 3PAR directly via the same switch, or are there ISLs in between?
Are the ISLs configured and running at 8 Gbit fixed? (Ports for the 3PAR and the ISLs should always be hardcoded to 8 Gbit so that they cannot autonegotiate to a lower value.)



The host is connected to two embedded Brocade 4 Gbit switches in a C7000 enclosure. The embedded switches are connected by 2x4 ISLs to two external Brocade 8 Gbit switches. I've attached a drawing.
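Given that topology, it may be worth watching the embedded-switch ISLs while a test runs; a minimal sketch on the Brocade CLI of the external and embedded switches:

Code:
switch:admin> portperfshow          # live throughput per port; watch whether the ISL ports flatline
switch:admin> porterrshow           # per-port error summary (enc out, crc, disc c3) across the switch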

Architect wrote:
What type of test was this? A read or a write test?


32 KB, 100% write, 0% random

Architect wrote:
I get the impression something is wrong in your SAN setup: bad ISL config, bad cable, bad SFP, something like that. If possible, PM me a drawing of how the SAN looks and what the path is from the host to the 3PAR.


I've attached the drawing.

I've also zoned a LUN directly to a Windows host. The performance is equally poor, ruling out vSphere IMO.


Attachments:
SAN drawing.jpg
SAN drawing.jpg [ 27.51 KiB | Viewed 25474 times ]


Last edited by skumflum on Thu Nov 21, 2013 5:29 am, edited 3 times in total.