Raid 5 Problems / Data Recovery

Hello fellow tech people. I have an issue that I don't quite know the answer to.

Here is what happened:

Server 2008 R2, RAID 5, 4 drives. Two amber drives. (Crap!)
No backups (client's choice). (Even bigger crap!)

We tried everything... drives didn't go green. We took a shot and forced a drive online. It went green and was successful. The raid now said "Degraded". The 2nd amber drive went green and rebuilt.
Raid 5 says online. No issues.

Rebooted the server. The server will not boot. It will only boot into recovery mode. I tried all the fixes: BCD rebuild, fixboot, fixmbr... everything. None worked. (BTW, the BCD rebuild sees the OS and I can tell it to make it bootable, but it simply won't boot.) No chkdsk /r, because that's a no-no on a RAID 5.

When I go into recovery mode and open the command prompt, I check diskpart. I see all the drive partitions. They all say they are healthy. When I cd to the drives I can see the file structure. I see the directories, etc.

So my question is:

If I do a reinstall of the OS, do you think I will be able to access those other partitions and not lose the data? Obviously I wouldn't format the drive or repartition anything. I would just use the same partition that the OS is on now. What do you all think?

Thank you in advance. What a pickle....
 
The issue is it's hard to tell what damage has been done to the data, since you possibly had 2 bad drives and might still have 2 bad drives. So anything you do to them puts the data at further risk. If you had a solid backup or images of all the drives you could get back to square one by re-imaging or re-installing the OS. Right now, however, you have a time bomb that may blow, or you may cut the right wire; hard to tell. But if the data is gone, the client is going to blame you even though they declined to have a backup. So I would play it as safe as possible. At least after this ordeal they will realise they do need backups after all.
 
I hear you. Of course now they want the backup solution and of course they want to know WHY the data hasn't been backing up even though they decided to roll the dice and opt for regular archiving with copy off. Bad move.

Yes, the recovery will certainly be at the expense of the business. Forcing the drives online was a last-ditch effort by the Dell support tech. Since I can browse the data when I select the directories, I feel confident the data is intact... hopefully. I believe you are correct though. The effort would be best put into recovery rather than taking a chance on losing it all. Let's see what the client wants to do.

Do you know of a reliable recovery company? This data is pretty critical so I need an experienced company.

Thanks!
 
As @Slaters Kustum Machines said, there are several members here who do professional DR. Beyond that:

Going forward you really should modify your procedures. Any situation which involves problem drives should have very limited activities if data is important. Start with what kind of monetary value the customer places on the data. Anything that is done with the machine, even rebooting, can reduce the success of data recovery. I always make the risks completely clear up front.

If they are up front about being willing to spend money, then it just gets passed on to a DR business. If not, then the first thing is imaging the drives. Once imaged, apps like R-Studio can rebuild a RAID array from the images.

The resources section has several articles about this.

https://www.technibble.com/forums/resources/
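
For anyone wondering why rebuilding from images is even possible: RAID 5 parity is a plain XOR across the blocks in each stripe, so any one missing block can be recomputed from the others. Here is a minimal Python sketch of that idea. The block contents and the `xor_blocks` helper are made up for illustration, and a real tool also has to work out stripe size, drive order, and parity rotation, which this ignores.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical stripe on a 4-drive RAID 5: three data blocks plus one parity block.
data = [b"ACCT", b"MAIL", b"DOCS"]
parity = xor_blocks(data)

# Pretend the image of the drive that held data[1] is unreadable.
surviving = [data[0], data[2], parity]

# XOR of everything that survived gives back the missing block.
rebuilt = xor_blocks(surviving)
print(rebuilt)
```

Working on images means this kind of reconstruction can be retried with different guesses about the layout without touching the original drives.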
 
If you can see the directories, why not just copy the data out at that point?
 
Problem 1: No backup
Problem 2: Using RAID 5
Problem 3: Most likely RAID was degraded (one drive offline) and was ignored
Problem 4: Potentially failing drive was forced online
Problem 5: RAID rebuild was done without a backup or images of the drives
Problem 6: Messing around with trying to get it booting
 
A single RAID 5 volume that was partitioned into C and D? Ugh!

And no backups..double ugh!
With 2 out of 4 drives down in a RAID 5, you can safely assume it's not coming up. The parity is corrupted or gone, and any attempt to force drives online and rebuild will overwrite data, probably with corrupted data. Don't take this on your shoulders... you stated it was the client's decision not to have backups. Shipping the drives out to a data recovery house will cause downtime, but again, it was the client's decision not to have D/R backups (at minimum, full image backups).
 
You must do what @lcoughey said immediately. System off, drives out, do not touch except to get to someone qualified.

Frankly you may already be screwed because you forced one of the drives online - do you know whether it was the one that had just failed, or was it one that failed months ago and nobody noticed? If you forced an old one back online you may have managed to eliminate any chance of successful recovery of anything substantial.
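
To make the "old drive forced back online" risk concrete, here is a small Python sketch under made-up assumptions: one drive drops out early, the array keeps taking writes in degraded mode, and then a second drive fails. The block contents and the `xor_blocks` helper are hypothetical, and this is not what any particular controller does internally; it only shows why XOR parity computed against stale data gives the wrong answer.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe as originally written (hypothetical contents).
d0_old, d1, d2 = b"v1v1", b"MAIL", b"DOCS"

# Drive 0 drops out here, still holding d0_old. The array runs degraded and the
# block that lived on drive 0 gets updated; only parity records the new value.
d0_new = b"v2v2"
parity = xor_blocks([d0_new, d1, d2])

# Later drive 1 fails. Forcing stale drive 0 online and rebuilding drive 1 from it:
rebuilt_from_stale = xor_blocks([d0_old, d2, parity])
print(rebuilt_from_stale == d1)    # the rebuilt block does not match the real data

# With the up-to-date contents of drive 0 the same math would have worked.
rebuilt_from_current = xor_blocks([d0_new, d2, parity])
print(rebuilt_from_current == d1)
```

The rebuild still "succeeds" from the controller's point of view, which is why the array can show as online while the contents are damaged.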
 
I just used Proven Data Recovery for a client that got hit with Locky that didn't want offsite backup.

They were good.

I wish use Gillette.

Don't mess with it. Send the drives out, get the cost, add 15% plus consulting time. Rebuild the domain and server, restore the data, and sell them on a backup and MSP plan.
 
You got it.

The drives are out, and I was told there was a file structure but it was corrupt. They are confident that all the data will be saved. No physical damage to any drives.

The company decided to scrap the server for a new one. Raid 6 now. They also have seen the light and went with a backup solution. A very expensive lesson to be learned.

As to the others... it was actually C:, D:, and E: partitions. How would you have partitioned it?

Regular copy-off means 4 years of data are saved on external media and only 1 year of data needed to be recovered. The copy-off normally happened at 4-month intervals, but they fell behind.

For those listing the problems, I agree. This has been a learning experience for us as well. The lack of backups gave way to desperation to get things back online. This is the first time we have had a problem of this magnitude without a backup solution in place. I've never had 2 drives fail at the same time. (Yes, they failed at the same time; we monitor the server health.) It seems the data is recoverable, so I guess it's not that bad, but it could have been worse.

New server has Raid 6 and a backup appliance will be put in place. Now I need to work on getting the network rebuilt.

Edit: and thank you all for the help and harsh criticism. I needed it. :)
 
RAID 6 has double parity because, with the size of modern drives (and depending on the quality of the drives), single parity would probably cause rebuilds to fail about 30-50% of the time. The larger the drives you use, the more risk you will have. With RAID 6, in theory you can survive two disk failures, but after the first failure it is basically a RAID 5, and after the second there is no redundancy left, making the rebuild risky.

Not that RAID 6 isn't an acceptable option. If you need to maximize space with a limited budget, that's when you would use it.
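
A rough way to see why drive size matters for single-parity rebuilds is to estimate the chance of hitting an unrecoverable read error (URE) while reading every surviving drive end to end. The Python sketch below assumes a URE rate of one per 10^14 bits, which is a commonly quoted datasheet figure and not something from this thread, and it assumes errors are independent. Real drives often do better than the datasheet, so treat the output as a pessimistic back-of-envelope number. The `rebuild_ure_probability` name is just for illustration.

```python
import math

def rebuild_ure_probability(surviving_drives, drive_tb, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error while reading every
    surviving drive in full, assuming independent errors per bit."""
    bits_read = surviving_drives * drive_tb * 1e12 * 8  # decimal TB -> bits
    # 1 - (1 - p)^n, written with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# A 4-drive RAID 5 rebuild has to read the 3 surviving drives in full.
for drive_tb in (1, 2, 4):
    p = rebuild_ure_probability(3, drive_tb)
    print(f"{drive_tb} TB drives: {p:.0%} chance of hitting a URE during rebuild")
```

Under those assumptions the estimate climbs from roughly 21% with 1 TB drives to roughly 62% with 4 TB drives, which is the trend behind the push toward double parity.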
 
I see you offer managed services. Why didn't you catch this drive failing?

Even break-fix clients should get an agent installed.
 