Yes, it sounds like we're on the same page.
You would have thought they would have added it to the VMware best practices guide by now, but here is the advisory information copied from the last major upgrade I did: 3.2.2 MU2 -> 3.2.2 MU4
For VMware Hosts
For ESXi 5.5 Update 2 or ESXi 6.0 disable VAAI-ATS as per VMware Advisory below:
https://kb.vmware.com/selfservice/micro ... %202538771
As per the latest VMware implementation guide -
http://h20564.www2.hpe.com/hpsc/doc/pub ... =c03290624
Page 51 indicates not to install the 3PAR VAAI Plug-in 2.2.0 on ESXi 5.x if it is connected to a 3PAR StoreServ Storage running 3PAR OS 3.1.1 or later. The VAAI primitives are handled by the default T10 VMware plug-in and do not require the 3PAR VAAI plug-in.
We have several factors in play (periodic remote copy, snapshots and dedupe) that combined have caused us significant problems for a long time, ranging from minor issues similar to yours all the way up to random host outages that require reboots. Online operations such as compactcpg or tunevv exacerbate the problem and almost always cause host outages due to excessively long I/O stalls, i.e. the array will on occasion refuse data for a specific LUN for 19 seconds or more.
We've had the case open for such a long time that it's been escalated almost all the way up to Meg Whitman, and HPE have just loaned us an additional 16x 8TB SSDs to provide temporary buffer space to complete the "fix" that they have in mind.
Essentially they want us to un-dedupe everything, upgrade to 3.3.1 MU1 (will be GA/default very shortly) and then turn on dedupe for selected volumes that benefit.
Since you are having similar but less severe problems than we are, moving to 3.3.1 MU1 might just do the trick. You will likely be able to use tunevv to migrate your VVs to dedupe 3.0.