Crisis - 2 drives failed in 5 drive RAID 5 array

stick1977

and the backups haven't been running... someone please shoot me in the face with a rocket launcher.

Does anyone know of a utility that resets SMART alerts? The BIOS isn't giving us this option, and it's a fully updated BIOS. Basically, what I've been hearing from the onsite tech is that he needs a utility that resets SMART alerts so the BIOS will be able to see all the drives.
 

I know diddly about servers, but having a RAID controller halt on SMART errors alone seems to me to be a bad design.

I would be concerned about the tech that wants to reset SMART. Why not pull the drives and perform a clone/image and replace the faulty drives?
 
Yea phaZed I dunno. I'm having a hard time getting a hold of the onsite tech at this point. I think he's on the phone with Dell.
 
Basically from what I've been hearing from the tech onsite is that he needs a utility that resets SMART alerts so the BIOS will be able to see all drives.

Doesn't sound correct, but you never know. Anyhow, if the drive is bad to the point that SMART is preventing it from working, then disabling that precaution seems like the wrong next step.

The 2 bad drives need to be diagnosed, at least one repaired & cloned, and the RAID reconstructed. You should clone all the good drives now too, before doing anything else. If you try to "rebuild" the RAID on the controller (assuming you get the drives back up), you may inadvertently make things worse, because a rebuild writes to the member drives. So "reconstruction" from the clones is the way to go here (once you get 4/5 drives working and cloned).
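
To show why 4 of the 5 drives is the magic number, here is a minimal sketch of the single-parity idea behind RAID 5. It only illustrates the XOR math on one stripe; the block contents and helper names are made up for the example, and a real controller also rotates parity across members and has its own on-disk layout, which this ignores.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks):
    """Recompute the one missing block of a stripe from all the other members."""
    return xor_blocks(surviving_blocks)

# One stripe across a hypothetical 5-member array: 4 data blocks + 1 parity block.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)
stripe = data + [parity]

# Lose member 2: the other four members are enough to recompute it.
survivors = [blk for i, blk in enumerate(stripe) if i != 2]
recovered = rebuild_missing(survivors)

# Lose members 2 and 3: XOR of the remaining three yields neither lost block.
three_left = [blk for i, blk in enumerate(stripe) if i not in (2, 3)]
not_recoverable = xor_blocks(three_left)
```

With one member gone the stripe comes back exactly; with two gone the XOR of what is left matches neither missing block, which is why at least one of the two failed drives has to be made readable first.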
 
500 GB drives, Seagate Barracuda

Oh...SATA. Ack.
Prolly no support left on that old rig unless she had a few extended warranties? Dell support has walked me through a few tricks on their RAID controllers (at least the better models...dunno if a SATA RAID controller will have it, prolly just Adaptec/Intel fake RAID..but hopefully not) to bring back drives that weren't recognized....one time wasn't long ago, after a 1900 that had been running for a few years straight was shut down for an office relocation. Servers that have been running fine for years straight, 24x7x365..it's when they get shut down that...on the next power up, if something is gonna go wrong..that's when! Anyways...there was some advanced "force online" (something like that) option he had me do once.

The server isn't going to have a data volume anyways...no backups, I'd call Dell and if it's out of extended warranty...just sign up for extended warranty at that time (purchase it)..and get support going.
 
My onsite tech got lucky, the server is out of warranty but he found someone at Dell that talked to him for free, yes very lucky. Did I say or imply SATA? Sorry if I did, they're SAS drives. We're trying to get one straight away so as to expedite this recovery, if possible. But from listening to npinc, we'd better get four right away not just one. Thank you for your reply.
 
Well, there's a reason I asked that. We just ran into it with an HP. BTW, I knew they were SAS drives, it's all good.

Anyways, the hardware was saying the drive was bad, the drive was bad, blah blah blah. Turned out there was nothing wrong with the drive. The backplane was faulty. Take the drive and put it on another port. I wouldn't be surprised if it shows normal. Don't forget, those are drives rated for 1.2M hours (MTBF).

We also saw this happen in a Dell not that long ago.
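
For a rough sense of the numbers, here is a back-of-the-envelope sketch. It assumes the 1.2M hour figure is an MTBF rating, that failures follow a constant rate, and that the five drives fail independently. None of that is guaranteed in a real array (same batch, same age, same enclosure), so treat it as an illustration of why a shared component like the backplane deserves a look, not as a prediction.

```python
import math

HOURS_PER_YEAR = 8760

def annual_failure_probability(mtbf_hours):
    """Chance one drive fails within a year under a constant failure rate."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)

def prob_at_least_k_failures(n, k, p):
    """Binomial chance that k or more of n independent drives fail."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

p_one = annual_failure_probability(1_200_000)          # one drive, one year
p_two_of_five = prob_at_least_k_failures(5, 2, p_one)  # 2 or more of 5, one year

print(f"per drive per year: {p_one:.2%}")
print(f"two or more of five per year: {p_two_of_five:.3%}")
```

Under those assumptions a single drive comes out around 0.7% per year, and two or more out of five in the same year is on the order of 0.05%, so two "failures" at once looks more like one common cause than two unlucky drives.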
 
Just note. There are two reasons why we are unable to recover data from a RAID.

1. The drives are damaged beyond recovery
2. The technician and/or client destroys any chance of recovery

So, before you do anything, be sure to get a full sector-by-sector clone of each drive...including the two that failed. Then recover the data from the clones.
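
As a rough illustration of what a sector-by-sector clone means, here is a minimal sketch. It opens the source read-only, copies it in large chunks, drops down to single sectors when a chunk fails, and zero-fills and logs any sector that still cannot be read. The function names, the 512-byte sector size and the chunk size are assumptions for the example, and `os.pread` limits it to Unix-like systems. A purpose-built imaging tool such as GNU ddrescue adds retries and a persistent map of bad areas, and is the better choice on a drive that is actually failing.

```python
import os

def _read_exact(fd, length, offset):
    """Return exactly `length` bytes from `offset`, or None if that fails."""
    try:
        data = os.pread(fd, length, offset)
    except OSError:
        return None
    return data if len(data) == length else None

def clone_sectors(src_path, dst_path, sector_size=512, sectors_per_read=128):
    """Copy src to dst; return the byte offsets of sectors that were unreadable."""
    bad_offsets = []
    chunk = sector_size * sectors_per_read
    fd = os.open(src_path, os.O_RDONLY)  # never open the source for writing
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            for offset in range(0, size, chunk):
                want = min(chunk, size - offset)
                data = _read_exact(fd, want, offset)
                if data is None:
                    # The chunk failed: retry it one sector at a time.
                    data = b""
                    for pos in range(offset, offset + want, sector_size):
                        n = min(sector_size, offset + want - pos)
                        piece = _read_exact(fd, n, pos)
                        if piece is None:
                            bad_offsets.append(pos)
                            piece = bytes(n)  # zero-fill to keep offsets aligned
                        data += piece
                dst.write(data)
    finally:
        os.close(fd)
    return bad_offsets
```

Usage would look like `bad = clone_sectors("/dev/sdX", "drive1.img")`, where both paths are placeholders. Keeping every offset aligned in the image is the point: the clone can then stand in for the original drive during recovery, and the list of bad offsets says which areas of the image cannot be trusted.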

If you ever have any questions about an issue and are unsure of the safest course of action, you are always welcome to call me and I can advise.

Whatever you do, don't spend a day or two fighting with it first. Yes, you might get lucky, but the odds are, you won't. This is no time to gamble.

Luke
 