HPE Storage Users Group https://3parug.com/ |
Low disk speed reported on VM using the 3Par SAN https://3parug.com/viewtopic.php?f=18&t=3318 |
Page 1 of 1 |
Author: | renard2guerre [ Thu Nov 21, 2019 2:54 am ] |
Post subject: | Low disk speed reported on VM using the 3Par SAN |
Hi,

I have a customer whose VMs use datastores on LUNs hosted on a 3PAR SAN. The VMs report a drop in write speed, from 500 MB/s to less than 100 MB/s. The disk speed issue started on 11/13/2019.

The HP C7000 is connected to the HP 3PAR 7200 SAN via 8 Gbps multi-mode Fibre Channel links. The person in charge of this SAN has left, and I have no documentation or knowledge about HP SANs.

I've run some commands, and I'm not sure whether there is anything I can do myself or whether I should open an HP ticket right now.

checkhealth -detail

Checking alert
Checking cabling
Checking cage
Checking dar
Checking date
Checking ld
Checking license
Checking network
Checking node
Checking pd
Checking port
Checking rc
Checking snmp
Checking task
Checking vlun
Checking vv

Component ----------------------------Description---------------------------- Qty
Alert     New alerts                                                            7
Cage      Cages not on current firmware                                         1
Cage      Cages missing A loop                                                  1
Cage      Degraded or failed cage power supply fans                             2
Cage      Degraded or failed cage power supplies                                2
Cage      Degraded or failed cage AC power                                      2
LD        LDs in write through mode                                            22
LD        LDs without backup                                                   25
LD        Number of logging LDs does not match number of nodes in the cluster   1
License   Licenses which have expired                                           1
Network   Too few working admin network connections                             1
Network   Errors detected on network                                            1
Node      Nodes that are not online                                             1
Node      Power supplies with failed or degraded AC                             2
Node      Power supplies with failed or degraded DC                             2
PD        Cages with unbalanced disks                                           1
PD        PDs that are degraded                                                 8
PD        Disks experiencing a high level of I/O per second                     1
vlun      Hosts not connected to a port                                         2

Component ------Identifier------- -------------------------------------------------------------------------Description--------------------------------------------------------------------------
Alert     sw_sysmgr               Total FC raw space usage at 4942G (above 75% of total 6552G)
Alert     sw_task:4098            Task 4098 (type 'system_task', name 'check_slow_disk') has failed (Marking task 4098 failed due to node down-). Please see task status for details.
Alert     hw_node:1               Node 1 Failed (Node Offline Due to Failure)
Alert     hw_cage:0,hw_cage_ifc:1 Cage 0, Interface Card 1 Failed (CPU Firmware Unknown)
Alert     hw_cage:0,hw_cage_ifc:0 Cage 0, Interface Card 0 Failed (CPU Firmware Unknown)
Alert     hw_cage:0,hw_cage_bat:0 Cage 0, Cage Battery 0 Failed (UNDEFINED UNDEFINED)
Alert     sw_sysmgr               License feature "Peer Motion" has expired. You are in violation of your 3PAR License Agreement. Please contact your 3PAR representative as soon as possible.
Cage      cage:0                  Firmware is not current
Cage      cage:0                  Missing A loop
Cage      cage:0                  Power supply 0's fan is
Cage      cage:0                  Power supply 0 is
Cage      cage:0                  Power supply 0's AC state is
Cage      cage:0                  Power supply 1's fan is
Cage      cage:0                  Power supply 1 is
Cage      cage:0                  Power supply 1's AC state is
LD        ld:admin.usr.0          LD is in write-through mode
LD        ld:admin.usr.0          LD does not have a backup
LD        ld:admin.usr.1          LD is in write-through mode
LD        ld:admin.usr.1          LD does not have a backup
LD        ld:admin.usr.2          LD is in write-through mode
LD        ld:admin.usr.2          LD does not have a backup
LD        ld:admin.usr.3          LD is in write-through mode
LD        ld:admin.usr.3          LD does not have a backup
LD        ld:pdsld0.0             LD does not have a backup
LD        ld:pdsld0.1             LD does not have a backup
LD        ld:pdsld0.2             LD does not have a backup
LD        ld:.srdata.usr.0        LD is in write-through mode
LD        ld:.srdata.usr.0        LD does not have a backup
LD        ld:.srdata.usr.1        LD is in write-through mode
LD        ld:.srdata.usr.1        LD does not have a backup
LD        ld:tp-2-sa-0.0          LD is in write-through mode
LD        ld:tp-2-sa-0.0          LD does not have a backup
LD        ld:tp-2-sa-0.1          LD is in write-through mode
LD        ld:tp-2-sa-0.1          LD does not have a backup
LD        ld:tp-2-sd-0.0          LD is in write-through mode
LD        ld:tp-2-sd-0.0          LD does not have a backup
LD        ld:tp-2-sd-0.1          LD is in write-through mode
LD        ld:tp-2-sd-0.1          LD does not have a backup
LD        ld:tp-2-sd-0.2          LD is in write-through mode
LD        ld:tp-2-sd-0.2          LD does not have a backup
LD        ld:tp-2-sd-0.3          LD is in write-through mode
LD        ld:tp-2-sd-0.3          LD does not have a backup
LD        ld:tp-2-sd-0.4          LD is in write-through mode
LD        ld:tp-2-sd-0.4          LD does not have a backup
LD        ld:tp-2-sd-0.5          LD is in write-through mode
LD        ld:tp-2-sd-0.5          LD does not have a backup
LD        ld:tp-2-sd-0.7          LD is in write-through mode
LD        ld:tp-2-sd-0.7          LD does not have a backup
LD        ld:tp-2-sd-0.6          LD is in write-through mode
LD        ld:tp-2-sd-0.6          LD does not have a backup
LD        ld:tp-2-sd-0.9          LD is in write-through mode
LD        ld:tp-2-sd-0.9          LD does not have a backup
LD        ld:tp-2-sd-0.8          LD is in write-through mode
LD        ld:tp-2-sd-0.8          LD does not have a backup
LD        ld:tp-2-sd-0.10         LD is in write-through mode
LD        ld:tp-2-sd-0.10         LD does not have a backup
LD        ld:tp-2-sd-0.11         LD is in write-through mode
LD        ld:tp-2-sd-0.11         LD does not have a backup
LD        ld:tp-2-sd-0.13         LD is in write-through mode
LD        ld:tp-2-sd-0.13         LD does not have a backup
LD        ld:tp-2-sd-0.12         LD is in write-through mode
LD        ld:tp-2-sd-0.12         LD does not have a backup
LD        --                      Number of logging LDs does not match number of nodes in the cluster
License   Peer Motion             License has expired
Network   --                      Nodes have no admin network link detected
Network   Node0:Admin             Errors detected on network
Node      node:1                  Node is not online
Node      node:0                  Power supply 0 AC state is --
Node      node:0                  Power supply 0 DC state is --
Node      node:0                  Power supply 1 AC state is --
Node      node:0                  Power supply 1 DC state is --
PD        Cage:0                  PDs FC/10K/900GB unbalanced. Primary path: 8 on Node:0, 0 on Node:1
PD        disk:0                  Degraded States: missing_A_port
PD        disk:1                  Degraded States: missing_A_port
PD        disk:1                  Disk is experiencing a high level of I/O per second: 167.0
PD        disk:2                  Degraded States: missing_A_port
PD        disk:3                  Degraded States: missing_A_port
PD        disk:4                  Degraded States: missing_A_port
PD        disk:5                  Degraded States: missing_A_port
PD        disk:6                  Degraded States: missing_A_port
PD        disk:7                  Degraded States: missing_A_port
vlun      host:VPLEX              Host wwn:5000144280581311 is not connected to a port
vlun      host:VPLEX              Host wwn:5000144290581311 is not connected to a port

showversion

Release version 3.1.2 (MU2)
Patches: None

Component Name   Version
CLI Server       3.1.2 (MU2)
CLI Client       3.1.2 (MU2)
System Manager   3.1.2 (MU2)
Kernel           3.1.2 (MU2)
TPD Kernel Code  3.1.2 (MU2)

showalert

Id          : 27
State       : New
Message Code: 0x027000f
Time        : 2018-06-03 23:52:00 EDT
Severity    : Minor
Type        : FC raw space allocation 75% alert
Message     : Total FC raw space usage at 4942G (above 75% of total 6552G)

Id          : 28
State       : New
Message Code: 0x00e000a
Time        : 2019-11-03 01:00:12 EST
Severity    : Minor
Type        : Task failed
Message     : Task 4098 (type 'system_task', name 'check_slow_disk') has failed (Marking task 4098 failed due to node down-). Please see task status for details.

Id          : 29
State       : New
Message Code: 0x01a00fa
Time        : 2019-11-03 01:00:12 EST
Severity    : Major
Type        : Component state change
Message     : Node 1 Failed (Node Offline Due to Failure)

Id          : 22
State       : New
Message Code: 0x02d00fa
Time        : 2019-11-03 01:01:04 EST
Severity    : Major
Type        : Component state change
Message     : Cage 0, Interface Card 1 Failed (CPU Firmware Unknown)

Id          : 23
State       : New
Message Code: 0x02d00fa
Time        : 2019-11-03 01:01:04 EST
Severity    : Major
Type        : Component state change
Message     : Cage 0, Interface Card 0 Failed (CPU Firmware Unknown)

Id          : 16
State       : New
Message Code: 0x05d00fa
Time        : 2019-11-20 12:58:46 EST
Severity    : Major
Type        : Component state change
Message     : Cage 0, Cage Battery 0 Failed (UNDEFINED UNDEFINED)

Id          : 18
State       : New
Message Code: 0x00e0005
Time        : 2019-11-20 16:00:02 EST
Severity    : Major
Type        : License key usage
Message     : License feature "Peer Motion" has expired. You are in violation of your 3PAR License Agreement. Please contact your 3PAR representative as soon as possible.

Last boot: 2014-03-10 22:24:42 EDT

I see that Node 0 PS 0 Battery 0 is degraded and that all the PDs are degraded. From showeventlog I see the message "Cage 0, Cage Battery 0 Failed (UNDEFINED UNDEFINED)" on 11/13/2019, the same day the VMs started reporting low disk speed. Could that explain the slowdown?

Is there a way to run a disk speed check on the SAN itself, to see whether the issue is on the SAN side?
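To compare alert timing against the date the slowdown was noticed, the showalert records can be sorted by time. The following is only a minimal Python sketch, assuming the output has been saved as text with one "Key : value" field per line; `parse_alerts` and `alert_time` are hypothetical helper names, not part of any 3PAR tooling, and the EST/EDT suffix is simply ignored.

```python
import re
from datetime import datetime

# "Key : value" or "Key: value"; the lazy key stops at the first colon.
FIELD = re.compile(r"^([A-Za-z ]+?)\s*:\s*(.*)$")

def parse_alerts(text):
    """Split showalert-style text into one dict per alert record."""
    alerts, current = [], None
    for line in text.splitlines():
        m = FIELD.match(line.strip())
        if not m:
            continue
        key, value = m.group(1), m.group(2)
        if key == "Id":  # each record starts with its Id line
            current = {}
            alerts.append(current)
        if current is not None:
            current[key] = value
    return alerts

def alert_time(alert):
    # Keep only "YYYY-MM-DD HH:MM:SS"; the timezone abbreviation is dropped.
    return datetime.strptime(alert["Time"][:19], "%Y-%m-%d %H:%M:%S")

# Three records copied from the showalert output in this post.
SAMPLE = """\
Id : 16
State : New
Message Code: 0x05d00fa
Time : 2019-11-20 12:58:46 EST
Severity : Major
Type : Component state change
Message : Cage 0, Cage Battery 0 Failed (UNDEFINED UNDEFINED)

Id : 29
State : New
Message Code: 0x01a00fa
Time : 2019-11-03 01:00:12 EST
Severity : Major
Type : Component state change
Message : Node 1 Failed (Node Offline Due to Failure)

Id : 27
State : New
Message Code: 0x027000f
Time : 2018-06-03 23:52:00 EDT
Severity : Minor
Type : FC raw space allocation 75% alert
Message : Total FC raw space usage at 4942G (above 75% of total 6552G)
"""

timeline = sorted(parse_alerts(SAMPLE), key=alert_time)
for a in timeline:
    print(a["Time"], "|", a["Severity"], "|", a["Message"])
```

Sorted this way, the "Node 1 Failed" alert (2019-11-03) appears before the cage battery alert (2019-11-20) that showalert lists.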
Author: | MammaGutt [ Thu Nov 21, 2019 3:11 am ] |
Post subject: | Re: Low disk speed reported on VM using the 3Par SAN |
Long story short: write cache is not enabled because the system has lost redundancy (one node is down, so only one node is up).
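Both symptoms can be read straight out of the checkhealth -detail rows in the first post. A rough Python sketch, assuming the detail section has been saved as text with whitespace-separated component, identifier and description columns; `summarize_health` is a hypothetical helper, not a 3PAR command, and it does not handle identifiers that contain spaces:

```python
def summarize_health(detail_text):
    """Collect LDs in write-through mode and nodes reported as not online."""
    write_through, offline_nodes = [], []
    for line in detail_text.splitlines():
        # Columns: component, identifier, free-text description.
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        component, ident, desc = parts
        desc = desc.strip()
        if component == "LD" and desc == "LD is in write-through mode":
            write_through.append(ident)
        elif component == "Node" and desc == "Node is not online":
            offline_nodes.append(ident)
    return write_through, offline_nodes

# A few rows copied from the checkhealth -detail output above.
SAMPLE = """\
LD      ld:admin.usr.0  LD is in write-through mode
LD      ld:admin.usr.0  LD does not have a backup
LD      ld:pdsld0.0     LD does not have a backup
LD      ld:tp-2-sd-0.0  LD is in write-through mode
Node    node:1          Node is not online
Node    node:0          Power supply 0 AC state is --
"""

lds, nodes = summarize_health(SAMPLE)
print("write-through LDs:", lds)
print("offline nodes:", nodes)
```

Run against the full output, this kind of check should list the same 22 write-through LDs that the summary table counts, and node:1 as the offline node.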
Author: | renard2guerre [ Thu Nov 21, 2019 3:39 am ] |
Post subject: | Re: Low disk speed reported on VM using the 3Par SAN |
Thanks for the quick reply. How can I tell which node is down? Will it require a hardware replacement?
Author: | MammaGutt [ Thu Nov 21, 2019 4:44 am ] |
Post subject: | Re: Low disk speed reported on VM using the 3Par SAN |
renard2guerre wrote:

  Component ----------------------------Description---------------------------- Qty
  Alert     New alerts                                                            7
  Cage      Cages not on current firmware                                         1
  Cage      Cages missing A loop                                                  1
  Cage      Degraded or failed cage power supply fans                             2
  Cage      Degraded or failed cage power supplies                                2
  Cage      Degraded or failed cage AC power                                      2
  Node      Nodes that are not online                                             1
  Node      Power supplies with failed or degraded AC                             2
  Node      Power supplies with failed or degraded DC                             2

  Component ------Identifier------- -------------------------------------------------------------------------Description--------------------------------------------------------------------------
  Alert     hw_node:1               Node 1 Failed (Node Offline Due to Failure)
  Alert     hw_cage:0,hw_cage_ifc:1 Cage 0, Interface Card 1 Failed (CPU Firmware Unknown)
  Alert     hw_cage:0,hw_cage_ifc:0 Cage 0, Interface Card 0 Failed (CPU Firmware Unknown)
  Alert     hw_cage:0,hw_cage_bat:0 Cage 0, Cage Battery 0 Failed (UNDEFINED UNDEFINED)
  Cage      cage:0                  Firmware is not current
  Cage      cage:0                  Missing A loop
  Cage      cage:0                  Power supply 0's fan is
  Cage      cage:0                  Power supply 0 is
  Cage      cage:0                  Power supply 0's AC state is
  Cage      cage:0                  Power supply 1's fan is
  Cage      cage:0                  Power supply 1 is
  Cage      cage:0                  Power supply 1's AC state is
  Node      node:1                  Node is not online
  Node      node:0                  Power supply 0 AC state is --
  Node      node:0                  Power supply 0 DC state is --
  Node      node:0                  Power supply 1 AC state is --
  Node      node:0                  Power supply 1 DC state is --

  showversion

  Release version 3.1.2 (MU2)
  Patches: None

  showalert

  Id          : 29
  State       : New
  Message Code: 0x01a00fa
  Time        : 2019-11-03 01:00:12 EST
  Severity    : Major
  Type        : Component state change
  Message     : Node 1 Failed (Node Offline Due to Failure)

And adding to this: InForm OS 3.1.2 was the initial release for the 3PAR 7000 in 2012. 3.1.2 became 3.1.3 (2014), which became 3.2.1 (2014), which became 3.2.2 (2015)...

Most likely the node has failed and needs to be replaced. After that you should really get a software upgrade on the system, because the node failure has thrown some error messages I've never seen before.
Author: | renard2guerre [ Thu Nov 21, 2019 5:39 am ] |
Post subject: | Re: Low disk speed reported on VM using the 3Par SAN |
Thanks! Do you have any documentation on how to do the upgrade? Is it service-affecting?
Author: | MammaGutt [ Thu Nov 21, 2019 10:38 am ] |
Post subject: | Re: Low disk speed reported on VM using the 3Par SAN |
If you have a service contract, the upgrade is included. It is intended to be done online, but I know HPE needs you to accept all risks for such an old 3PAR OS version.
All times are UTC - 5 hours