Right now we are trying to understand a failed test case we are experiencing where automatic failover didn't happen when it should have. We are using quorum witness.
We have an RCG set up between 2 3par 8440s. Our testing team rebooted the first one (the active system) using the SSMC reboot command. However, instead of failing over gracefully, everything got hung up.
The ideal test case would be having someone yank the power cables, but that's not realistic given our scenario.
When you issue a "reboot" command via SSMC, what actually happens? Do both controllers/nodes in the 3par simultaneously reboot? Does one node gracefully hand off traffic to the other, reboot, come back up, and then the other does the same thing?
It makes a difference because if they both go down simultaneously, because then both of the checks for an automatic failover (losing heartbeat between the 3pars, losing communication between the 3par and the quorum withess) would be triggered. If it reboots each controller gracefully in sequence, then there will always be a controller responding to the quorum witness and providing a heartbeat to the other 3par, which means it won't try to fail over to the second 3par.
Just trying to understand what "reboot" means.
|