HCHTech
I got an email from one of my iDRAC setups yesterday warning of a failed hard drive - boy that felt good. It's just what you imagine when you set up a warning system: you want to get notice of a problem before there is any downtime.
So I check to see if I've got an 8TB SAS drive in inventory to take with me - oops, guess I must have used the one I had at some point and not re-ordered a replacement - that was a mistake. Well, it's a single disk in a RAID10 array, I guess I'll just get the warranty replacement going with Dell, then go onsite.
Once on the phone with Dell, it turns out that they made a mistake when registering this service tag last year, so we can't proceed until they fix that - ugh, thanks, Dell. 3 hours later, I get a call back that they've fixed their problem and we can proceed with the warranty claim. 45 minutes of nonsense follows; I think it must have been the guy's first day on the job or something. Multiple holds while he checked with his superior. Finally get to the end and instead of ordering the drive, they send an email for me to fill out most of the same information they took over the phone - ugh, bureaucracy at its finest. The email requires the Dell part number, which of course isn't reported anywhere in iDRAC, so I give up and head out to the client's.
I arrive at the client's, get into iDRAC again and confirm which drive is the problem, remove it and take a picture of the label so I have the Dell part number. Then I look through their cold spares to see if there is an 8TB drive there. OS SSD, check. Redundant power supply, check. No 8TB drive. Hmm, I wonder why that is - another mistake. Anyway, I remount the drive so it doesn't get lost while I wait for its replacement to arrive. To my surprise, iDRAC now reports the drive is healthy and the array is rebuilding. I wait for a few minutes, but the rebuild is continuing without error. Weird.
Once back in the office, I decide to wait a bit to send back the warranty claim email - if I end up trying to claim a healthy drive, they'll probably charge me the full Dell extortion rate for it. The rebuild was about 60% done when I finished up yesterday, and this morning I see the array is reporting as healthy again, and so is the problem drive.
So now I don't know how to proceed other than to just wait and see if there are additional errors reported on that drive. Plus, what even was the original problem that caused iDRAC to mark it failed? If it doesn't turn out to be failing, I just wasted a few non-billable hours chasing around, and I'm not quite as fond of the iDRAC warning system as I was a couple of days ago. I should probably run some diagnostics on that drive to sleep better, but that will either take a few hours of downtime to run the Dell diags during a maintenance window, or I'll have to purposefully degrade the array again to remove the drive and test it on the bench. Frustrating.