HPE Storage Users Group
https://3parug.com/

3PAR 7200 replacement disk marked as Slow Drive and fails
https://3parug.com/viewtopic.php?f=18&t=3748
Page 1 of 1

Author:  sivah [ Thu Feb 17, 2022 3:16 am ]
Post subject:  3PAR 7200 replacement disk marked as Slow Drive and fails

Hi all,

I have a weird issue with a 3PAR 7200. One disk has failed; it is a 900GB FC 10K 6G Encrypted HDD.

Each time I replace it, servicemag resume succeeds.
However, after a couple of hours the disk fails again, and the log shows that servicemag start then succeeds as well. I have already tried 3 disks, each with a different date of manufacture (DOM: 2013, 2014, 2015), and the result is the same.

I then dug further into the logs. After each replacement, once servicemag completes, the replacement disk is always picked as a candidate by the check_slow_disk task.

The IOPS for the replacement disk range from 105 to 135, while the ideal for a 10K HDD should be 140.

This is the last check_slow_disk extract before the disk failed for the 4th time.

2022-02-05 20:07:01 +08 Updated Executing "check_slow_disk" as 0:29843
2022-02-05 20:07:01 +08 Updated RPM 100 -> Good IOPS 2000
2022-02-05 20:07:01 +08 Updated RPM 10 -> Good IOPS 140
2022-02-05 20:07:01 +08 Updated RPM 150 -> Good IOPS 2000
2022-02-05 20:07:01 +08 Updated RPM 15 -> Good IOPS 180
2022-02-05 20:07:01 +08 Updated RPM 7 -> Good IOPS 60
2022-02-05 20:07:01 +08 Updated Running at interval 840 for 3360 seconds
2022-02-05 20:21:01 +08 Updated
2022-02-05 20:21:01 +08 Updated Starting next iteration
2022-02-05 20:21:01 +08 Updated
2022-02-05 20:21:01 +08 Updated Checking speed 7 drives
2022-02-05 20:21:01 +08 Updated Candidate:PDID: 27, adj_svct: 7.0, idle%: 99.7, iops: 0.5, kbps: 15.4, svct: 7.2
2022-02-05 20:21:01 +08 Updated Next:PDID: 19, adj_svct: 6.6, idle%: 99.8, iops: 0.4, kbps: 12.6, svct: 6.8
2022-02-05 20:21:01 +08 Updated Checking speed 10 drives
2022-02-05 20:21:01 +08 Updated Candidate:PDID: 64, adj_svct: 59.4, idle%: 7.6, iops: 109.7, kbps: 3027.2, svct: 98.3
2022-02-05 20:21:01 +08 Updated Next:PDID: 11, adj_svct: 15.3, idle%: 19.6, iops: 122.1, kbps: 3355.2, svct: 58.6
2022-02-05 20:35:01 +08 Updated
2022-02-05 20:35:01 +08 Updated Starting next iteration
2022-02-05 20:35:01 +08 Updated
2022-02-05 20:35:01 +08 Updated Checking speed 7 drives
2022-02-05 20:35:01 +08 Updated Candidate:PDID: 26, adj_svct: 4.2, idle%: 99.7, iops: 0.8, kbps: 33.2, svct: 4.6
2022-02-05 20:35:01 +08 Updated Next:PDID: 19, adj_svct: 3.9, idle%: 99.9, iops: 0.3, kbps: 11.6, svct: 4.1
2022-02-05 20:35:01 +08 Updated Checking speed 10 drives
2022-02-05 20:35:01 +08 Updated Candidate:PDID: 64, adj_svct: 113.1, idle%: 1.8, iops: 129.2, kbps: 3842.5, svct: 159.6
2022-02-05 20:35:01 +08 Updated Next:PDID: 36, adj_svct: 45.9, idle%: 10.5, iops: 143.5, kbps: 3871.3, svct: 96.7
2022-02-05 20:49:02 +08 Updated
2022-02-05 20:49:02 +08 Updated Starting next iteration
2022-02-05 20:49:02 +08 Updated
2022-02-05 20:49:02 +08 Updated Checking speed 7 drives
2022-02-05 20:49:02 +08 Updated Candidate:PDID: 19, adj_svct: 4.9, idle%: 99.8, iops: 0.4, kbps: 13.5, svct: 5.1
2022-02-05 20:49:02 +08 Updated Next:PDID: 27, adj_svct: 4.1, idle%: 99.8, iops: 0.5, kbps: 17.3, svct: 4.4
2022-02-05 20:49:02 +08 Updated Checking speed 10 drives
2022-02-05 20:49:02 +08 Updated Candidate:PDID: 64, adj_svct: 96.6, idle%: 2.0, iops: 128.5, kbps: 3936.1, svct: 143.0
2022-02-05 20:49:02 +08 Updated Next:PDID: 36, adj_svct: 29.2, idle%: 11.7, iops: 136.4, kbps: 3825.2, svct: 77.8
2022-02-05 21:03:02 +08 Updated
2022-02-05 21:03:02 +08 Updated Starting next iteration
2022-02-05 21:03:02 +08 Updated
2022-02-05 21:03:02 +08 Updated Checking speed 7 drives
2022-02-05 21:03:02 +08 Updated Candidate:PDID: 29, adj_svct: 12.5, idle%: 99.2, iops: 1.6, kbps: 289.4, svct: 13.7
2022-02-05 21:03:02 +08 Updated Next:PDID: 21, adj_svct: 12.5, idle%: 99.2, iops: 1.5, kbps: 282.6, svct: 13.6
2022-02-05 21:03:02 +08 Updated Checking speed 10 drives
2022-02-05 21:03:02 +08 Updated Candidate:PDID: 64, adj_svct: 105.8, idle%: 1.6, iops: 130.9, kbps: 4184.9, svct: 153.4
2022-02-05 21:03:02 +08 Updated Next:PDID: 35, adj_svct: 22.6, idle%: 12.0, iops: 142.0, kbps: 4152.3, svct: 73.5
2022-02-05 21:03:02 +08 Updated
2022-02-05 21:03:02 +08 Updated FOUND SLOW DRIVE: PDID: 64, adj_svct: 105.8, idle%: 1.6, iops: 130.9, kbps: 4184.9, svct: 153.4
2022-02-05 21:03:02 +08 Updated Marking slow disk 64 failed
2022-02-05 21:03:02 +08 Updated Failed PDID 64
2022-02-05 21:03:02 +08 Updated
2022-02-05 21:03:02 +08 Completed.


The latest servicemag start

2022-02-06 00:30:36 +08 Updated Executing "sstart_pd_64" as 1:15777
2022-02-06 00:30:36 +08 Updated servicemag start -wait -pdid 64
2022-02-06 00:30:36 +08 Updated ... servicing disks in mag: 3 0
2022-02-06 00:30:36 +08 Updated ... normal disks:
2022-02-06 00:30:36 +08 Updated ... not normal disks: WWN [5000C5007F6EFABC] Id [64] diskpos [0]
2022-02-06 00:30:36 +08 Updated ... relocating chunklets to spare space...
2022-02-06 00:30:47 +08 Updated ... bypassing mag 3 0
2022-02-06 00:31:27 +08 Updated ... bypassed mag 3 0
2022-02-06 00:31:27 +08 Updated servicemag start -wait -pdid 64 -- Succeeded
2022-02-06 00:31:27 +08 Completed scheduled task.


I noticed that the replacement disk is picked as the candidate 10 consecutive times, and then the system marks it as Failed.
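
A minimal Python sketch of how the candidate lines in the extract above could be tallied, to see how many back-to-back iterations each PD was picked as the candidate. The function and variable names are made up for illustration, the log format is assumed from this one extract only, and nothing here reflects how check_slow_disk itself decides to fail a drive.

```python
import re

# Lines copied from the check_slow_disk extract in this post
# (only the "Checking speed" and "Candidate" lines are needed).
LOG = """\
2022-02-05 20:21:01 +08 Updated Checking speed 7 drives
2022-02-05 20:21:01 +08 Updated Candidate:PDID: 27, adj_svct: 7.0, idle%: 99.7, iops: 0.5, kbps: 15.4, svct: 7.2
2022-02-05 20:21:01 +08 Updated Checking speed 10 drives
2022-02-05 20:21:01 +08 Updated Candidate:PDID: 64, adj_svct: 59.4, idle%: 7.6, iops: 109.7, kbps: 3027.2, svct: 98.3
2022-02-05 20:35:01 +08 Updated Checking speed 7 drives
2022-02-05 20:35:01 +08 Updated Candidate:PDID: 26, adj_svct: 4.2, idle%: 99.7, iops: 0.8, kbps: 33.2, svct: 4.6
2022-02-05 20:35:01 +08 Updated Checking speed 10 drives
2022-02-05 20:35:01 +08 Updated Candidate:PDID: 64, adj_svct: 113.1, idle%: 1.8, iops: 129.2, kbps: 3842.5, svct: 159.6
2022-02-05 20:49:02 +08 Updated Checking speed 7 drives
2022-02-05 20:49:02 +08 Updated Candidate:PDID: 19, adj_svct: 4.9, idle%: 99.8, iops: 0.4, kbps: 13.5, svct: 5.1
2022-02-05 20:49:02 +08 Updated Checking speed 10 drives
2022-02-05 20:49:02 +08 Updated Candidate:PDID: 64, adj_svct: 96.6, idle%: 2.0, iops: 128.5, kbps: 3936.1, svct: 143.0
2022-02-05 21:03:02 +08 Updated Checking speed 7 drives
2022-02-05 21:03:02 +08 Updated Candidate:PDID: 29, adj_svct: 12.5, idle%: 99.2, iops: 1.6, kbps: 289.4, svct: 13.7
2022-02-05 21:03:02 +08 Updated Checking speed 10 drives
2022-02-05 21:03:02 +08 Updated Candidate:PDID: 64, adj_svct: 105.8, idle%: 1.6, iops: 130.9, kbps: 4184.9, svct: 153.4
"""

SPEED_RE = re.compile(r"Checking speed (\d+) drives")
CANDIDATE_RE = re.compile(r"Candidate:PDID:\s*(\d+)")


def candidates_by_speed(text):
    """Map each speed class (7, 10, ...) to the ordered list of candidate PDIDs."""
    result = {}
    speed = None
    for line in text.splitlines():
        m = SPEED_RE.search(line)
        if m:
            speed = int(m.group(1))
            continue
        m = CANDIDATE_RE.search(line)
        if m and speed is not None:
            result.setdefault(speed, []).append(int(m.group(1)))
    return result


def longest_runs(pdids):
    """Return {pdid: longest streak of consecutive iterations as candidate}."""
    best = {}
    prev, run = None, 0
    for pdid in pdids:
        run = run + 1 if pdid == prev else 1
        prev = pdid
        best[pdid] = max(best.get(pdid, 0), run)
    return best


by_speed = candidates_by_speed(LOG)
for speed, pdids in sorted(by_speed.items()):
    print(speed, longest_runs(pdids))
```

On this extract, PD 64 is the candidate in all four iterations of the speed 10 class, while the candidate in the speed 7 class changes every time. Running the same tally over the full task logs would show the streak across more than one check_slow_disk run.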

Has anyone experienced the same issue? Is there a way to stop the disk in this specific slot from being marked as slow?

Author:  MammaGutt [ Thu Feb 17, 2022 4:50 pm ]
Post subject:  Re: 3PAR 7200 replacement disk marked as Slow Drive and fail

Just asking, could the issue be the cage slot and not PDs? Are you seeing SAS errors or such on the slot?

From what I see, the drive has a very high svct (service time, or latency in plain English), which is probably why it is always a candidate.
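
To put numbers on that, here is a rough Python sketch that compares PD 64 with the "Next" drive in each of the four speed 10 iterations from the log in the first post. The values are copied from that log; the dictionary keys and the idea of using a simple ratio are assumptions for illustration, not the actual check_slow_disk logic.

```python
# Candidate (PD 64) vs. "Next" drive in the speed 10 class, one entry per
# iteration, copied from the check_slow_disk extract in the first post.
ITERATIONS = [
    {"cand_adj_svct": 59.4, "next_adj_svct": 15.3, "cand_iops": 109.7},
    {"cand_adj_svct": 113.1, "next_adj_svct": 45.9, "cand_iops": 129.2},
    {"cand_adj_svct": 96.6, "next_adj_svct": 29.2, "cand_iops": 128.5},
    {"cand_adj_svct": 105.8, "next_adj_svct": 22.6, "cand_iops": 130.9},
]

# "RPM 10 -> Good IOPS 140" as printed at the start of the task log.
GOOD_IOPS_10K = 140

# How many times slower (by adj_svct) the candidate is than the next drive.
ratios = [it["cand_adj_svct"] / it["next_adj_svct"] for it in ITERATIONS]

for it, ratio in zip(ITERATIONS, ratios):
    below = it["cand_iops"] < GOOD_IOPS_10K
    print(f"adj_svct ratio {ratio:.1f}x, iops {it['cand_iops']} "
          f"({'below' if below else 'at or above'} {GOOD_IOPS_10K})")
```

In every iteration the candidate's adj_svct is more than twice that of the next drive, and its IOPS stay below the 140 figure the task prints for 10K drives.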

Author:  sivah [ Wed Mar 09, 2022 12:35 am ]
Post subject:  Re: 3PAR 7200 replacement disk marked as Slow Drive and fail

Hi,

Just an update to this.
I searched and found that HPE has actually phased out the 900GB Encrypted HDDs that we are currently using and issued an advisory recommending 1.2TB Encrypted HDDs instead.

Advisory: (Revised) HPE 3PAR StoreServ 7000 Storage And HPE 3PAR StoreServ 10000 Storage - Transitioning From HCBRE, HCEP, And Certain SLTN HDD Spare Parts To Alternate Replacement HDD Spare Parts

https://support.hpe.com/hpesc/public/do ... 28695en_us

I finally ordered a 1.2TB disk instead, which has a DOM of 2018. It has now been working for 5 days after replacement with no signs of being a "slow drive".

It seems the 900GB Encrypted HDDs we were using as replacements were just old and bad, even though they were bought from multiple suppliers.

Author:  MammaGutt [ Wed Mar 09, 2022 3:12 am ]
Post subject:  Re: 3PAR 7200 replacement disk marked as Slow Drive and fail

I was told back in the day that the 900 GB drives were discontinued because no vendor kept making them once the new series of drives was released.

All times are UTC - 5 hours