HPE Storage Users Group

A Storage Administrator Community




 Post subject: 8200 poor performance....(in VM guest)
PostPosted: Fri Jun 12, 2020 9:01 pm 

Joined: Mon Mar 09, 2020 8:34 am
Posts: 67
A quick thing I have noticed is that copying a large file inside a guest VM peaks at around 136MB/s.

We have ESXi hosts with 2x 10GbE iSCSI each, going to the two 8200 controllers, meaning each datastore in ESXi has 8 paths. These are set to round robin with iops=1.

Our 8200s have something like 72 HDD spindles and 24 SSD drives. If I have a 17GB file inside a VM, for instance, and simply copy it on the desktop, I would have expected the above setup to absolutely rip it to shreds and be up at 10GbE transfer speeds in the guest OS, but no, it only tops out at about 136MB/s.

Is it normal for the 3PAR to be this slow with such a config? Of course all the ports are set for 10GbE and appear to be functioning normally. The latencies are very nice, at about 1ms, but I do not understand how such an expensive array can be so slow for VM guest performance.
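
For reference, this is roughly how the round robin / iops=1 policy is applied on our hosts. The SATP rule is the usual HPE-recommended one for 3PAR; the naa ID in the per-device command is just a placeholder for one of our LUNs:

# claim new 3PARdata LUNs with Round Robin and switch paths every IO
esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -P "VMW_PSP_RR" -O "iops=1" -c "tpgs_on" -V "3PARdata" -M "VV" -e "HPE 3PAR custom rule"

# or change an existing device in place
esxcli storage nmp psp roundrobin deviceconfig set -d naa.xxxxxxxx -t iops -I 1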


 Post subject: Re: 8200 poor performance....(in VM guest)
PostPosted: Fri Jun 12, 2020 9:11 pm 

Joined: Mon Mar 09, 2020 8:34 am
Posts: 67
OK, just as a reply to my own post: well, that was weird.

I copied the ~17GB file on the desktop, then pasted it twice so two copies ran at once, and it flew up to about 700MB/s in the VM guest! Happy with that, but I wonder why the performance was so bad with a single file?

Could it be some queue depth setting? Maybe iops=1 over EIGHT paths to the array is rubbish and should be increased to the standard 1000 iops for more bursty performance?

Any tips!? At least we know it's not totally broken, so that's good news :)

OK, spoke too soon, did another double paste and it wouldn't budge past 40MB/s inside the VM. This is utterly bizarre.
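
In case it helps anyone, this is roughly how I am checking what the path policy and queue depth actually are on one of the datastore LUNs (the naa ID is a placeholder):

esxcli storage nmp device list -d naa.xxxxxxxx    # shows the PSP and the iops= value in the device config
esxcli storage core device list -d naa.xxxxxxxx   # shows "No of outstanding IOs with competing worlds" (the per-device queue limit)
esxcli storage core device set -d naa.xxxxxxxx -O 64   # raise that limit per device if it turns out to be the choke point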


 Post subject: Re: 8200 poor performance....(in VM guest)
PostPosted: Sat Jun 13, 2020 12:07 am 

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
My guess is that you hit some cache on the 700MB/s copy.

Are you reading/writing to SSD or HDD? And what type of HDDs do you have?

The 8200 is the least powerful model of the least powerful series. I did some performance testing on an 8200 a couple of years back. I got about 500MB/s with dedupe enabled (with compression both enabled and disabled) and above 1GB/s with plain thin volumes. That was FC rather than iSCSI, and I used a larger test file (50GB) to ensure that cache was out of play. That system had 12 or 14 SSDs.

I would suggest keeping statpd, statvv, statvlun and statcpu open in 4 SSH consoles, monitoring them while you do the copy, and posting the output here so we can see if we can spot the bottleneck.
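
Something along these lines; the -d interval and the read/write split flags are how I usually run them, but check help <command> on the array for the exact options on your OS level:

statvlun -ni -rw -d 5    # per-VLUN IOPS, throughput and latency, skipping idle VLUNs
statvv -ni -rw -d 5      # the same per virtual volume
statpd -rw -d 5          # per physical disk, to see whether the FC spindles or the SSDs are the busy ones
statcpu -d 5             # controller node CPU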

Btw, have you disabled delayed ACK for iSCSI on your hosts? That is a common issue with iSCSI and at least VMware for most (if not all) storage arrays.
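
If you want to check quickly whether it is currently on, something like this should show it for the software iSCSI adapter (vmhba64 is just a placeholder for your adapter name; the setting itself is normally toggled under the adapter's Advanced Options in the vSphere client):

esxcli iscsi adapter param get -A vmhba64 | grep -i DelayedAck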



 Post subject: Re: 8200 poor performance....(in VM guest)
PostPosted: Sat Jun 13, 2020 5:02 am 

Joined: Mon Mar 09, 2020 8:34 am
Posts: 67
Maybe using these crude methods is not OK?

I fired up IOmeter with a couple of worker threads, a 2MB transfer request size and 100% random write IO, and saw 1627MB/s in a Win10 test VM, so it LOOKS like the bandwidth is there in theory. Checking the 10GbE ports on the SAN, each was sitting at ~23%, so it looks like the numbers were real. I remember doing this same simple raw-speed test (2MB, 100% random write) with our Compellent storage and, after mega tweaking and proper RR multipathing, getting about 280MB/s, so 1627MB/s inside a Windows VM seems reasonable so far.

I realise this is super crude, but it appears less crude than a Windows file copy+paste perhaps, even though the latter is a more "real world" thing to do.

I thought delayed ACK was only there for network congestion/cache fills?

When I get back into the office I will set up the SSH consoles and do this properly.


 Post subject: Re: 8200 poor performance....(in VM guest)
PostPosted: Sat Jun 13, 2020 6:28 am 

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
I don't know the reason or logic behind delayed ACK, but it seems to be a general problem with iSCSI and VMware, where in some cases (I believe hardware-accelerated iSCSI) it does some funny things.

Multiple workers in IOmeter always help, as does having a few outstanding IOs (basically a queue). Watch out for the data sample size in IOmeter: if it is too low, everything goes to cache and you get unrealistically high numbers (which some vendors actually use as their official performance numbers). Also remember: big IOs = low IOPS, high throughput; small IOs = high IOPS, low throughput. Know which number you are looking for, or benchmarking against. My experience is that real-life virtualized environments average somewhere between 32kB and 64kB IO size.
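
To put rough numbers on that: the 1627MB/s you saw at a 2MB request size works out to only about 814 IOPS, while those same 814 IOPS at a 32kB IO size would be roughly 25MB/s, so a throughput figure on its own tells you very little about how the array will behave under a small-block workload.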

If you see 23% usage on the SAN ports, that is good. With your old method of a Windows file copy, you could have triggered VAAI/ODX, which can give _very_ funny numbers with cache hits.



 Post subject: Re: 8200 poor performance....(in VM guest)
PostPosted: Sat Jun 13, 2020 10:03 am 

Joined: Mon Mar 09, 2020 8:34 am
Posts: 67
Oh I totally agree with you about the IOmeter stuff and workloads.

This was just the lowest of the low in terms of simplicity, just to ensure that the very basics are all configured the way I was expecting.

I HAVE seen in the past with other arrays that, even though the vendor specified iops=1 for the RR PSP, latency and spikes for fast burst traffic were actually worse with it. I think that no matter what, you introduce latency in the kernel by switching IO paths on every IO.

I just did a 2MB block read and it was getting 2300MB/s to the VM, which I consider case closed for bandwidth at least. Now, all these tests are with NO data in the SSD tier, and I am wondering if AO set to 50 IOPS is actually too high a threshold and it's not moving data. Also, I have only enabled flash cache in simulation mode; I did notice it won't let you enable flash cache with RAID0 as a setting, only RAID1, losing half the capacity.
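
For the record, the simulation was set up and is being watched roughly like this on the array (the size is just what I picked; check help createflashcache and help statcache for the exact syntax on your OS level):

createflashcache -sim 64g     # simulate flash cache without dedicating SSD space
setflashcache enable sys:all  # apply it system-wide
statcache -d 5                # watch CMP vs FMP hit rates to see whether real AFC would actually help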

EDIT: I also noticed that ditching the Win10 SATA disk controller and replacing it with the Paravirtual SCSI controller made things a bit quicker too.

Let me ask one thing though: do you think that, as a matter of course, everyone should disable delayed ACK anyway, regardless?

We have 8200 arrays, 10GbE switchgear and Gen10 HPE hosts running ESXi 6.7U3, as 7.0 had some bugs I was not happy with.


 Post subject: Re: 8200 poor performance....(in VM guest)
PostPosted: Sat Jun 13, 2020 11:54 am 

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
A lot of stuff to attack here :)

Adaptive Flash Cache (AFC) can do RAID0... from 3.3.1 I think. On the other hand, I've never been happy with AFC. In my testing, AO always reduced traffic on FC the most, and if you don't have NL it will only be supporting data on the FC tier. In some scenarios AFC makes a whole lot of sense, but generally you're better off letting AO do the trick.

As for AO and min_iops... with FC and SSD I always set min_iops to 0 and set a t0 minimum in the AO config. That ensures you're always putting the most active blocks on SSD even if the overall usage is low. As long as you don't have NL, nothing will get worse performance, but a lot might get better.
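
On the CLI that translates to something roughly like this. The CPG and config names are made up and the exact flags vary by InForm OS level, so check help setaocfg and help startao on the array first:

setaocfg -t0min 500g my_ao_cfg                 # reserve a minimum footprint in the SSD (tier 0) CPG
startao -btsecs -86400 -min_iops 0 my_ao_cfg   # analyse the last 24h with no minimum-IOPS threshold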

For RR and iops=1: I'm not sure what your reference is, but 3PAR is active/active, so your IO will be processed on the path it is sent down. For active/passive arrays, RR ensures that 50% of the IOs are received on one controller and processed on another. You seem to have a small environment, so it might not matter, but for larger setups iops=1000 can cause hotspots in the SAN and on 3PAR host ports, which gives variable latency across paths.

As for delayed ACK: test it. If you see a difference, disable it. If not, don't. As for 10GbE gear for iSCSI, switch port buffers are often more important than pure speed. It seems everyone forgot that after 1GbE.



 Post subject: Re: 8200 poor performance....(in VM guest)
PostPosted: Sat Jun 13, 2020 12:18 pm 

Joined: Mon Mar 09, 2020 8:34 am
Posts: 67
Thank you for taking the time to educate.

I think I will tweak the AO a little and see what's what. From what I can see above, we would basically fill the SSD tier and AO would then filter data down, rather than what I have set up right now, which is FC_r6 as the default with AO running nightly on the 50 iops (default) setting. I might have to change this; we are pretty low duty in our general usage. Basically we have short bursts where we need rapid IO, but over time, as an average, the IO load is small.

I do have ONE final important question. I see that in 6.7 they have VAAI XCOPY enhancements, and the 3PAR recommendation is to run:

esxcli storage core claimrule add -r 914 -t vendor -V "3PARdata" -P VMW_VAAIP_T10 -c VAAI -a -s


On all the hosts. Have you guys done this and seen results?
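
For reference, I believe the active VAAI claim rules can be checked afterwards with something like:

esxcli storage core claimrule list -c VAAI   # should show rule 914 for vendor 3PARdata once the rule has been loaded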


 Post subject: Re: 8200 poor performance....(in VM guest)
PostPosted: Sat Jun 13, 2020 1:05 pm 

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
I would recommend using FC_r6 as the default and having AO move things up. If you do it the other way around, the SSD tier might fill up, in which case the system will take FC chunklets and use them for the SSD tier. No need to explain what happens to latency when what you think is on SSD is actually on FC.
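
If you want to keep an eye on that, the CPG space is easy to watch from the CLI (the CPG name below is just an example):

showcpg                   # used/free per CPG; watch how the SSD CPG grows after each AO run
showspace -cpg SSD_r1     # estimated space left for a given CPG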

As for the claim rule, I've not done this on any hosts. To my knowledge the primary objective of VAAI is to reduce traffic across the SAN. I'm guessing you have dedicated 2x 10GbE at host level and 4x 10GbE on the 3PAR for iSCSI. I don't see any scenario where SAN bandwidth becomes an issue and the VAAI improvements actually make a difference. I might be wrong, but I don't see that there is anything to gain.


