Just built a new server running a simple software RAID1 setup on Ubuntu. I documented the process for rebuilding the array when replacing one of the drives. I couldn't find very good current documentation on this, so maybe this will help some of you if you ever need to replace a faulty drive on a newer Linux system.

On the setup below I have two RAID arrays set up, one for the main root file-system and one for swap (md0 and md1).

To determine if the RAID is healthy:

    cat /proc/mdstat

It should look like this:

    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md1 : active raid1 sdb5[1] sda5[0]
          12209280 blocks [2/2] [UU]
    md0 : active raid1 sdb1[1] sda1[0]
          231986496 blocks [2/2] [UU]

Each hard drive is split into two RAID partitions: md0 is the main file-system and md1 is swap. If an array shows [UU] it is healthy; if it shows [_U] (or [U_]), one of the drives is not working properly in that array. You will notice md1 uses sdb5 and sda5, and md0 uses sdb1 and sda1; sda and sdb are the two physical drives in the machine. If you do not see any sda or sdb partitions when you run cat /proc/mdstat, that drive is not currently being used in the array.

The partition tables for sda and sdb have been previously saved to /raidinfo/partitions.sda and /raidinfo/partitions.sdb using:

    sfdisk -d /dev/sda > /raidinfo/partitions.sda
    sfdisk -d /dev/sdb > /raidinfo/partitions.sdb

If one of the drives has to be replaced, the steps to rebuild the array are as follows:

1. Determine which drive is not showing up in cat /proc/mdstat (sda or sdb).

2. Repartition the new drive using the backed-up partition table:

    sfdisk /dev/sda < /raidinfo/partitions.sda   (if the sda drive failed)
    sfdisk /dev/sdb < /raidinfo/partitions.sdb   (if the sdb drive failed)

3. Add the new drive's partitions back into the arrays:

    mdadm --add /dev/md0 /dev/sda1   (if sda is the new drive)
    or
    mdadm --add /dev/md0 /dev/sdb1   (if sdb is the new drive)

    then

    mdadm --add /dev/md1 /dev/sda5   (if sda is the new drive)
    or
    mdadm --add /dev/md1 /dev/sdb5   (if sdb is the new drive)

The rebuild may take several hours. You can check its progress by running the same command again:

    cat /proc/mdstat

Once the RAID array has been rebuilt, you will need to rewrite the MBR (master boot record) on both drives. This will allow you to boot off either drive if one of them fails again in the future. Type the following commands:

    grub
    device (hd0) /dev/sda
    root (hd0,0)
    setup (hd0)
    device (hd1) /dev/sdb
    root (hd1,0)
    setup (hd1)
    quit

Finally, reboot and check cat /proc/mdstat to make sure things look healthy again.
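One extra thing I find handy: since noticing a degraded mirror early is what makes the rest of the procedure painless, here is a minimal cron-able sketch that warns when an array is degraded and refreshes the sfdisk partition-table backups while everything is healthy. This is just my own sketch, not part of the procedure above; it assumes the same /dev/sda, /dev/sdb and /raidinfo layout described earlier, so adjust to taste.

    #!/bin/sh
    # Sketch: warn if any md array is degraded, otherwise refresh the
    # sfdisk partition-table backups used for a future rebuild.
    BACKUP_DIR=/raidinfo

    # A degraded RAID1 shows up as [_U] or [U_] in /proc/mdstat.
    if grep -q '\[.*_.*\]' /proc/mdstat; then
        echo "WARNING: an md array is degraded:" >&2
        cat /proc/mdstat >&2
        exit 1
    fi

    # Arrays are healthy; keep the saved partition tables current.
    mkdir -p "$BACKUP_DIR"
    sfdisk -d /dev/sda > "$BACKUP_DIR/partitions.sda"
    sfdisk -d /dev/sdb > "$BACKUP_DIR/partitions.sdb"

Drop it into /etc/cron.daily (or call it from your own cron job) and it will exit non-zero and print the mdstat output whenever a mirror has lost a disk.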
Or you could use TestDisk or scsirastools, but I have found the least-hassle way to recover your RAID 1 is the GUI Webmin tool (www.webmin.com). On my home server I disconnected my primary disk to test that my RAID worked properly. Rebuilding it was as simple as selecting it on the Webmin management page and adding it back in. Monitoring of the rebuild was still done with cat /proc/mdstat.
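If you'd rather not physically unplug a disk to test the mirror, mdadm can simulate the failure for you. This is only a sketch using the device names from the first post (not anything Webmin-specific); repeat it on md1 and the sdX5 partitions if you want to exercise both arrays:

    # Mark one half of the mirror as failed, then remove it from the array
    mdadm --manage /dev/md0 --fail /dev/sdb1
    mdadm --manage /dev/md0 --remove /dev/sdb1

    # Add it back and watch the resync progress
    mdadm --manage /dev/md0 --add /dev/sdb1
    watch cat /proc/mdstat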
That looks very interesting. Do Webmin and/or the other tools automagically reinstall GRUB in the MBR of the new drive?
Yep, that seems about normal. My rebuild, with two typical SATA II Seagate 250GB hard disks on an Nvidia 430 SATA controller, took about 1-1.5 hours.
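For what it's worth, the kernel throttles the resync rate, so on an otherwise idle box you can sometimes shorten that window. These are the standard md speed-limit knobs rather than anything mentioned above, and the 50000 figure is only an example:

    # Current resync throttle, in KB/s per device
    cat /proc/sys/dev/raid/speed_limit_min
    cat /proc/sys/dev/raid/speed_limit_max

    # Temporarily raise the floor to speed up a rebuild (reverts on reboot)
    echo 50000 > /proc/sys/dev/raid/speed_limit_min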