HPE Storage Users Group

A Storage Administrator Community




 Post subject: 3PAR Persistent Port not working
Posted: Thu Nov 21, 2019 7:22 am

Joined: Thu Nov 21, 2019 6:50 am
Posts: 4
Hi,

I've been reading this site for a little while now, very useful, but this is my first post with an issue I'm having.

We've had two 3PAR 8200s for nearly a couple of years now, and whenever HPE performed online upgrades the vSphere environment was unaware of the node reboots because of the 3PAR Persistent Port feature, which was great. I think the most we saw on the ESX host was a single log entry about a delay or pause.

Now over the past few months we've upgraded 3PAR to 3.3.1 MU2 and we've upgraded vSphere to 6.7 U2.

Since then, whenever a node has rebooted the vSphere environment has been well aware of the paths failing: lots of warnings and errors showing path failures, degradation, failover etc. In most cases MPIO has done its job and there was no disruption, but we did have one case where two ESX hosts froze completely. HPE's investigation said the quiesce process took over 19 seconds, so it timed out. I guess the ESX hosts didn't like that and froze (not even CLI access via iLO), so we had to reboot them.

Anyway, what I want to solve is why 3PAR persistent ports are not working as expected when they previously worked fine. I've gone through all the 3PAR documentation on the subject, double-checking configurations, but I cannot find anything wrong.
Cabling is correct, the 3PAR config appears correct, and I can see the partner ports listed. We have Brocade Fibre Channel switches and NPIV is enabled on the ports. ESX hosts are configured for 3PAR with round robin and I/O=1. All HPE hosts have the latest SPP applied and the recommended driver/firmware for the HBAs as well.

Note: Although we have 2x 3PAR, there is no remote copy or replication between the two. They are independent but near-identical environments, and both appear to have this issue.

I just can't understand what I'm missing, or why it worked before and not now.

Does anyone have any suggestions on what might be wrong?

N:S:P Connmode ConnType CfgRate MaxRate Class2   UniqNodeWwn VCN      IntCoal  TMWO     Smart_SAN
0:0:1 host     point    auto    16Gbps  disabled disabled    disabled disabled disabled unknown
0:0:2 host     point    auto    16Gbps  disabled disabled    disabled disabled disabled unknown
0:1:1 disk     point    12Gbps  12Gbps  n/a      n/a         n/a      enabled  n/a      n/a
0:1:2 disk     point    12Gbps  12Gbps  n/a      n/a         n/a      enabled  n/a      n/a
1:0:1 host     point    auto    16Gbps  disabled disabled    disabled disabled disabled enabled
1:0:2 host     point    auto    16Gbps  disabled disabled    disabled disabled disabled enabled
1:1:1 disk     point    12Gbps  12Gbps  n/a      n/a         n/a      enabled  n/a      n/a
1:1:2 disk     point    12Gbps  12Gbps  n/a      n/a         n/a      enabled  n/a      n/a


N:S:P Mode      State   ----Node_WWN---- -Port_WWN/HW_Addr- Type Protocol Label Partner FailoverState
0:0:1 target    ready   2FF70002AC01DA5B 20010002AC01DA5B   host FC       -     1:0:1   none
0:0:2 target    ready   2FF70002AC01DA5B 20020002AC01DA5B   host FC       -     1:0:2   none
0:1:1 initiator ready   50002ACFF701DA5B 50002AC01101DA5B   disk SAS      DP-1  -       -
0:1:2 initiator ready   50002ACFF701DA5B 50002AC01201DA5B   disk SAS      DP-2  -       -
0:3:1 peer      offline -                9418824549BD       free IP       IP0   -       -
1:0:1 target    ready   2FF70002AC01DA5B 21010002AC01DA5B   host FC       -     0:0:1   none
1:0:2 target    ready   2FF70002AC01DA5B 21020002AC01DA5B   host FC       -     0:0:2   none
1:1:1 initiator ready   50002ACFF701DA5B 50002AC11101DA5B   disk SAS      DP-1  -       -
1:1:2 initiator ready   50002ACFF701DA5B 50002AC11201DA5B   disk SAS      DP-2  -       -
1:3:1 peer      offline -                9418824540E9       free IP       IP1   -       -
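
For anyone wanting to sanity-check a port listing like the one above without eyeballing it, here is a minimal Python sketch. It assumes the listing has been saved as whitespace-separated text in exactly the layout shown; the function names `parse_showport` and `check_partners` are made up for illustration and are not part of any HPE tool. It only checks that every FC host port lists a partner, that the pairing is symmetric, and that FailoverState is `none`.

```python
# Port listing copied from the post above (whitespace-separated columns).
SHOWPORT = """\
N:S:P Mode State ----Node_WWN---- -Port_WWN/HW_Addr- Type Protocol Label Partner FailoverState
0:0:1 target ready 2FF70002AC01DA5B 20010002AC01DA5B host FC - 1:0:1 none
0:0:2 target ready 2FF70002AC01DA5B 20020002AC01DA5B host FC - 1:0:2 none
0:1:1 initiator ready 50002ACFF701DA5B 50002AC01101DA5B disk SAS DP-1 - -
0:1:2 initiator ready 50002ACFF701DA5B 50002AC01201DA5B disk SAS DP-2 - -
0:3:1 peer offline - 9418824549BD free IP IP0 - -
1:0:1 target ready 2FF70002AC01DA5B 21010002AC01DA5B host FC - 0:0:1 none
1:0:2 target ready 2FF70002AC01DA5B 21020002AC01DA5B host FC - 0:0:2 none
1:1:1 initiator ready 50002ACFF701DA5B 50002AC11101DA5B disk SAS DP-1 - -
1:1:2 initiator ready 50002ACFF701DA5B 50002AC11201DA5B disk SAS DP-2 - -
1:3:1 peer offline - 9418824540E9 free IP IP1 - -
"""


def parse_showport(text):
    """Turn the listing into a list of dicts keyed by column name."""
    lines = [line.split() for line in text.strip().splitlines()]
    # Header cells such as "----Node_WWN----" carry decorative dashes.
    header = [name.strip("-") for name in lines[0]]
    return [dict(zip(header, row)) for row in lines[1:]]


def check_partners(ports):
    """Return a list of human-readable problems; empty means all good."""
    by_nsp = {port["N:S:P"]: port for port in ports}
    problems = []
    for port in ports:
        if port["Type"] != "host" or port["Protocol"] != "FC":
            continue  # only FC host ports take part in persistent ports here
        nsp, partner = port["N:S:P"], port["Partner"]
        if partner not in by_nsp:
            problems.append(f"{nsp}: no partner port listed")
        elif by_nsp[partner]["Partner"] != nsp:
            problems.append(f"{nsp}: partner {partner} does not point back")
        if port["FailoverState"] != "none":
            problems.append(f"{nsp}: FailoverState is {port['FailoverState']}")
    return problems


if __name__ == "__main__":
    issues = check_partners(parse_showport(SHOWPORT))
    print(issues if issues else "partner pairs look consistent")
```

On the listing above this reports no problems, which matches what the post says: the array side looks fine, so the cause is more likely on the fabric or host side.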


 Post subject: Re: 3PAR Persistent Port not working
Posted: Thu Nov 21, 2019 10:57 am

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
Things to keep in mind

NPIV must be enabled when the 3PAR port goes online, so if NPIV at some point in time was disabled when the 3PAR nodes last booted, it wouldn't work.

Other than that, you need to ensure that partner ports are in the same fabric (0:0:1 and 1:0:1 to one switch, 0:0:2 and 1:0:2 to the other).
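
As a rough illustration of that cabling rule, here is a small Python sketch. The partner pairs come from the port listing earlier in the thread; the `FABRIC_OF` mapping and the fabric names are hypothetical and would have to be filled in from your own cabling records or switch output.

```python
# Partner pairs as shown in the port listing earlier in the thread.
PARTNERS = {
    "0:0:1": "1:0:1",
    "0:0:2": "1:0:2",
    "1:0:1": "0:0:1",
    "1:0:2": "0:0:2",
}

# Hypothetical cabling record: which fabric each array port is plugged into.
FABRIC_OF = {
    "0:0:1": "fabric_A",
    "1:0:1": "fabric_A",
    "0:0:2": "fabric_B",
    "1:0:2": "fabric_B",
}


def split_pairs(partners, fabric_of):
    """Return partner pairs that are not both on the same known fabric."""
    bad = set()
    for port, partner in partners.items():
        fabric = fabric_of.get(port)
        # An unknown fabric counts as a problem, not as a match.
        if fabric is None or fabric != fabric_of.get(partner):
            bad.add(tuple(sorted((port, partner))))
    return sorted(bad)


if __name__ == "__main__":
    print(split_pairs(PARTNERS, FABRIC_OF) or "all partner pairs share a fabric")
```

A pair that shows up in the result cannot fail over, because the partner port would have to present the failed port's WWN on a fabric where the hosts are not zoned to it.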

You can do a less intrusive test by disabling one port on the SAN switch or simply pull a cable to test persistent port.

I know there was an issue with 3.2.2 MU3, both upgrading to and from it, if my memory serves me right. There is a general note in the pre-upgrade guide about that (not naming 3.2.2 MU3, but a certain setting on Brocade switches).

_________________
The views and opinions expressed are my own and do not necessarily reflect those of my current or previous employers.


 Post subject: Re: 3PAR Persistent Port not working
Posted: Thu Nov 21, 2019 11:35 am

Joined: Wed Nov 09, 2011 12:01 pm
Posts: 392
What patches do you have on the 3.3.1 MU2 installs?

May not be related, but I spotted the following fix in P76, so there have been some bugs in port persistence this year (we jumped to 3.3.1 MU3, so we didn't have MU2 in production for very long and didn't spot this issue):

Quote:
Issue ID: 231779, 264986
Issue summary: The array does not respond to an abort sequence and sends responses to wrong Fibre Channel identifier (FCID).
Platforms affected: All StoreServ
Affected software versions: 3.2.2 MU4, 3.3.1 MU1 - MU2
Issue description: During Persistent port failover, the array sends an incorrect FCID to Windows host. The Windows host waits 40 seconds for that I/O to time out.
Symptoms: Host I/O operations stall.
Conditions of occurrence: Port persistence is enabled.
Impact: High
Customer circumvention: None.
Customer recovery steps: None.


Also think most of our ESX estate is still 6.5.


 Post subject: Re: 3PAR Persistent Port not working
Posted: Thu Nov 21, 2019 12:19 pm

Joined: Thu Nov 21, 2019 6:50 am
Posts: 4
NPIV has not been changed since the initial install. It has always been enabled, which I believe is the default.

Double-checked that each port and its partner port are on the same switch.

We actually moved to patch P103 last night and then tried a node reboot afterwards, so the latest patch has not fixed it, but thanks for the suggestion.

This from the pre-upgrade guide looks promising though:

"Brocade FOS 7 switches have a "F-Port login parameter" named "Enforce FLOGI/FDISC login" set to 0. Which means it will NOT allow a duplicate pWWN to be advertised on multiple ports. And due to that it has a tendency to break the 3PAR persistent port feature at times since at failover or failback time it is not possible to get the pWWN moved. When this is determined by the switch it will place the target port into a persistent disabled state."

I do remember reading this, and I think I dismissed it because it said Brocade FOS 7 and we have FOS v8.

I've checked the setting and it is configured to 0, which can cause this problem, so it looks like we need to change it to 2 to fix this.
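
To make the 0-versus-2 distinction concrete, here is a hedged Python sketch. The value descriptions are my reading of the Brocade documentation for "Enforce FLOGI/FDISC login", so verify them against the admin guide for your FOS release. The key name `switch.login.enforce_login` and the sample line are assumptions about how the setting appears in saved switch configuration output, and the helper names are made up for illustration.

```python
# My understanding of the three values; check the FOS admin guide.
ENFORCE_LOGIN = {
    0: "first login takes precedence (a duplicate pWWN login is rejected)",
    1: "second login overrides the first",
    2: "mixed: first FLOGI takes precedence, second FDISC overrides",
}

# Assumed key name; confirm it against your own switch output.
KEY = "switch.login.enforce_login"

# Hypothetical one-line sample, not copied from a real switch.
SAMPLE = "switch.login.enforce_login:0"


def enforce_login_value(config_text, key=KEY):
    """Return the integer value of the setting, or None if not found."""
    for line in config_text.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name.strip() == key:
            return int(value.strip())
    return None


def matches_recommendation(value):
    """True when the setting is 2, the value the post above plans to use."""
    return value == 2


if __name__ == "__main__":
    value = enforce_login_value(SAMPLE)
    print(value, "->", ENFORCE_LOGIN.get(value, "unknown value"))
    print("matches recommendation:", matches_recommendation(value))
```

The reasoning behind 2: persistent ports move a pWWN to the partner port via NPIV, which logs in with FDISC, so the switch has to let a second FDISC login for the same pWWN win.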

I probably won't get to do that until next week, so I'll update the post with the results then.

Thanks for the help and suggestions guys :)


 Post subject: Re: 3PAR Persistent Port not working
Posted: Fri Nov 22, 2019 5:47 am

Joined: Wed Nov 09, 2011 12:01 pm
Posts: 392
I've seen that multi pWWN warning for a while but I've never had an occurrence here, I think we still had 7.4 when port persistence came out and are now up to 8.1.

It should be easy to spot if, as it warns, it persistently disables the array port on the switch. ;)


 Post subject: Re: 3PAR Persistent Port not working
Posted: Mon Nov 25, 2019 6:13 am

Joined: Thu Nov 21, 2019 6:50 am
Posts: 4
ailean wrote:
I've seen that multi pWWN warning for a while but I've never had an occurrence here, I think we still had 7.4 when port persistence came out and are now up to 8.1.

It should be easy to spot if, as it warns, it persistently disables the array port on the switch. ;)


Yeah, the errdump on the switches is not showing any warnings.
But if our issue is not that setting, I cannot think what else it would be.

Just so I'm not misunderstanding what persistent ports provides, if this is working correctly, what would you expect to see on esx side?

I'm pretty sure before we only saw an information message on the esx host about a pause or delay. At the moment we seem to get multiple warnings and errors from MPIO showing degraded paths etc. My understanding is that I should not have any errors or warnings from MPIO if persistent ports is working correctly.


 Post subject: Re: 3PAR Persistent Port not working
Posted: Mon Nov 25, 2019 8:51 am

Joined: Wed Nov 09, 2011 12:01 pm
Posts: 392
I'd expect to possibly see a blip on those paths but they should remain alive in ESX during the port/node outage with another blip on return.

I'd check the status of the paths. As you're not using peer persistence/mirroring, I'd expect all the paths to stay active throughout the port/node outage, with at most a couple of warnings at the beginning and end (a couple of dropped IOs during the flip).
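
One way to capture path status during a test is to save the output of `esxcli storage core path list` before, during and after the port outage and summarise it. The Python sketch below is only that, a sketch: it assumes the output contains `Device:` and `State:` fields per path (trimmed here to the fields it reads), and the device and adapter names in the sample are placeholders, not real identifiers.

```python
from collections import Counter, defaultdict

# Placeholder sample trimmed to the fields the parser reads.
SAMPLE = """\
   Runtime Name: vmhba1:C0:T0:L1
   Device: naa.example0001
   State: active
   Runtime Name: vmhba1:C0:T1:L1
   Device: naa.example0001
   State: dead
   Runtime Name: vmhba2:C0:T0:L2
   Device: naa.example0002
   State: active
"""


def path_states(text):
    """Count path states per device from 'Key: value' formatted text."""
    states = defaultdict(Counter)
    device = None
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if not sep:
            continue
        if key == "Device":
            device = value.strip()
        elif key == "State" and device is not None:
            states[device][value.strip()] += 1
    return states


def degraded(states):
    """Return devices that have at least one path not in the active state."""
    return sorted(dev for dev, counts in states.items()
                  if any(state != "active" for state in counts))


if __name__ == "__main__":
    summary = path_states(SAMPLE)
    for dev, counts in summary.items():
        print(dev, dict(counts))
    print("degraded:", degraded(summary))
```

If persistent ports are doing their job, a snapshot taken mid-outage should show the same number of active paths per device as the one taken before it.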

I've not had access to the ESX console for a couple of years and as mentioned I don't think we have much 6.7 yet so I'm not sure of the exact messages.

Typically, if paths go away ESX will moan but will continue to work while some paths remain. If all paths go away ESX used to get rather upset, but I'm told it's a lot less messy since 6.5.

What I'd expect to really upset any host is if the paths remain and appear active but are responding badly in some way.

PS: I also spotted a couple of options on controlport to fail over/fail back persistent ports. I've not used these myself, so some caution there, but they might allow more controlled testing than doing a full node reboot.

Maybe check what WWNs are on the port during a test and what zones are active (assuming WWN based zoning here).

PPS: Had a chat with our ESX team and they said it sounds a bit like an all-paths-down event, which should be stated in the ESX logs. They also said to check the ESX HBA drivers, as they've had some funkiness at that level before. Make sure the drivers are on the SPOCK 3PAR support list, and if they are the absolute latest, try rolling them back to a previous supported version (they've been tripped up by issues in HPE's latest drivers before).


 Post subject: Re: 3PAR Persistent Port not working
Posted: Wed Nov 27, 2019 10:49 am

Joined: Thu Nov 21, 2019 6:50 am
Posts: 4
Thanks for the reply ailean, that is useful info.

I'm currently waiting for a window where I can run through these tests to see if I can narrow this down, so once I get that and try them I will report back.

Regarding SPOCK, yes, I've actually been through all of that recently. We are fine except that the Brocade firmware is v8.1.0a and the current recommended target is v8.1.2g (on SPOCK) or v8.1.2h (on Brocade Recommended Target).

So I'm looking to get that updated too, to make sure we are at the right level.
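
For checking several switches against a target level, a small Python sketch like the one below can compare version strings. It only handles the `vMAJOR.MINOR.PATCH` plus optional single letter form seen in this thread (for example v8.1.0a); other suffix styles would need extra handling, and the function names are made up for illustration.

```python
import re

# Matches forms like "v8.1.0a" or "8.1.2"; anything else is rejected.
_FOS_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)([a-z]?)$")


def fos_key(version):
    """Convert a version string into a tuple that sorts in release order."""
    match = _FOS_VERSION.match(version.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised version format: {version!r}")
    major, minor, patch, letter = match.groups()
    # An empty letter sorts before "a", so 8.1.2 < 8.1.2a.
    return (int(major), int(minor), int(patch), letter)


def is_older(running, target):
    """True when the running version sorts before the target version."""
    return fos_key(running) < fos_key(target)


if __name__ == "__main__":
    running = "v8.1.0a"
    for target in ("v8.1.2g", "v8.1.2h"):
        print(running, "older than", target, "->", is_older(running, target))
```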

