GNU Linux/Software RAID

The following content is a Work In Progress and may contain broken links, incomplete directions or other errors. Once the initial work is complete this notice will be removed. Please contact me via Twitter with any questions and I'll try to help you out.


These are my scratch notes for recovering Software RAID arrays on a GNU/Linux box. The examples here are from a CentOS 5.x box, but presumably any recent GNU/Linux distro with Software RAID support via mdadm would work. In case it's not clear, I'm a newbie when it comes to Software RAID, so some of these steps may be redundant or nonsensical. If so, please feel free to point that out so I can make this easier to read.


The problem report

This started off with me receiving emails from mdadm (which was monitoring the four RAID devices on a 1U server with 4 physical disks) reporting a DegradedArray event on md device /dev/md0.

This is an automatically generated mail message from mdadm running on server.example.org

A DegradedArray event had been detected on md device /dev/md0.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid5 sdd1[3] sdc1[2] sdb2[1]
2917676544 blocks level 5, 256k chunk, algorithm 2 [4/3] [_UUU]

md1 : active raid1 sdd2[1] sdc2[0]
8385856 blocks [2/2] [UU]

md3 : active raid5 sdd3[3] sdc3[2] sdb3[1]
2917700352 blocks level 5, 256k chunk, algorithm 2 [4/3] [_UUU]

md0 : active raid1 sdb1[1]
8385792 blocks [2/1] [_U]

unused devices: <none>
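
From what I understand of the /proc/mdstat notation, each U in the status brackets is a healthy member and each underscore is a missing one, so [4/3] [_UUU] means one of four members is gone. A couple of quick checks I find useful at this point (the mdmonitor service name is the one shipped with the stock CentOS mdadm package, so adjust for your distro):

# list only the arrays whose status field contains an underscore (a missing member)
grep -B1 '\[.*_.*\]' /proc/mdstat

# confirm the monitoring daemon that sends these notification emails is still running
service mdmonitor status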

Determining what disks or partitions the RAID device is composed of

Based on a previous conversation with another tech, I knew that a RAID device could be composed of entire disks or of partitions from multiple disks. The advantage of using partitions instead of entire disks is the ease with which you can satisfy the requirement that all RAID members be the same size. In this case, partitions were used to assemble the RAID devices instead of entire disks.

Assuming that your main root partition is still operational, it's time to collect some information.


mdadm.conf contents

cat /etc/mdadm.conf
DEVICE partitions
MAILADDR root
ARRAY /dev/md0 level=raid1 num-devices=2 uuid=cbae8de5:892d4ac9:c1cb8fb2:5f4ab019
ARRAY /dev/md3 level=raid5 num-devices=4 uuid=a5690093:5c58a8d9:ac966bcf:a00660c2
ARRAY /dev/md2 level=raid5 num-devices=4 uuid=a45e768c:246aca55:1c012e56:58dd3958
ARRAY /dev/md1 level=raid1 num-devices=2 uuid=183e0f5d:2ac92a56:f064a724:9c4cc3a4
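
If /etc/mdadm.conf were ever missing or out of date, mdadm can print equivalent ARRAY lines for whatever arrays are currently assembled; comparing the UUIDs it reports against the file above is a quick sanity check (it can't report arrays the kernel doesn't know about):

# print ARRAY lines, including UUIDs, for the currently assembled arrays
mdadm --detail --scan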


Partitions list

fdisk -l
Disk /dev/sda: 2000.3 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        1044     8385898+  fd  Linux raid autodetect
/dev/sda2            1045      122122   972559035   fd  Linux raid autodetect
/dev/sda3          122123      243201   972567067+  fd  Linux raid autodetect

Disk /dev/sdb: 2000.3 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1        1044     8385898+  fd  Linux raid autodetect
/dev/sdb2            1045      122122   972559035   fd  Linux raid autodetect
/dev/sdb3          122123      243201   972567067+  fd  Linux raid autodetect

Disk /dev/sdc: 2000.3 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1   *           1      121078   972559003+  fd  Linux raid autodetect
/dev/sdc2          121079      122122     8385930   fd  Linux raid autodetect
/dev/sdc3          122123      243201   972567067+  fd  Linux raid autodetect

Disk /dev/sdd: 2000.3 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1   *           1      121078   972559003+  fd  Linux raid autodetect
/dev/sdd2          121079      122122     8385930   fd  Linux raid autodetect
/dev/sdd3          122123      243201   972567067+  fd  Linux raid autodetect
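
Side note: if one of these disks ever has to be physically replaced, the new disk will need a matching partition layout before its partitions can be added back into the arrays. The usual trick I've seen is to copy the partition table from a surviving disk with sfdisk; the device names below are only an example of that sketch, so double-check which disk is the healthy source and which is the blank replacement before running anything like it:

# dump the partition table of the healthy disk to a file (example device names!)
sfdisk -d /dev/sdb > sdb-partition-table.txt
# write that same layout onto the replacement disk
sfdisk /dev/sda < sdb-partition-table.txt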


Determining array members based on block size

Since the members of a RAID device need to be the same size, I realized that to find the other array members I could determine the size in blocks of one member and use that to find matching partitions. So if /dev/sdb1 is the remaining member of the /dev/md0 array and it is 8385898 blocks in size, the other member would also need to be the same size.

fdisk -l | grep 8385898
/dev/sda1   *           1        1044     8385898+  fd  Linux raid autodetect
/dev/sdb1   *           1        1044     8385898+  fd  Linux raid autodetect
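
For a second opinion, /proc/partitions lists each partition's size in 1K blocks, which lets you compare candidate members without re-running fdisk:

grep -E 'sd[ab]1$' /proc/partitions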

However, that approach can be ambiguous when separate arrays use identically sized partitions, so thankfully there is an easier way to find out the array members.


Determining array members based on mdadm output

mdadm allows us to get the list of array members for a specified array with a short command:

mdadm --misc --detail /dev/md1
/dev/md1:
        Version : 0.90
  Creation Time : Wed Jul 13 23:04:19 2011
     Raid Level : raid1
     Array Size : 8385856 (8.00 GiB 8.59 GB)
  Used Dev Size : 8385856 (8.00 GiB 8.59 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Tue Aug  7 14:12:52 2012
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 183e0f5d:2ac92a56:f064a724:9c4cc3a4
         Events : 0.168

    Number   Major   Minor   RaidDevice State
       0       8       34        0      active sync   /dev/sdc2
       1       8       50        1      active sync   /dev/sdd2

In this case we see that /dev/sdc2 and /dev/sdd2 make up the /dev/md1 RAID device and that both are active and in sync. Running the same command against /dev/md0 will show which of its members is still active.

However, this requires the RAID device to be active and all members assembled in order to get the complete listing. Otherwise, you have to examine each disk/partition for the presence of the UUID that identifies that RAID device. This is where having a current mdadm.conf file really comes in handy.
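
From what I understand, each member partition carries the array UUID in its md superblock, so you can examine a partition directly to see which array it belongs to, even when that array isn't assembled:

# show the array UUID recorded in this partition's superblock
mdadm --examine /dev/sda1 | grep -i uuid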


Repairing the root RAID device

To find out which array member is missing, let's first see which members are still present:


cat /proc/mdstat | grep md0
md0 : active raid1 sdb1[1]

So, it appears that /dev/sda1 needs to be added back.

mdadm --add /dev/md0 /dev/sda1

Snippet from /var/log/messages related to the last command:

Aug  7 14:16:00 lockss1 kernel: md: bind<sda1>
Aug  7 14:16:00 lockss1 kernel: RAID1 conf printout:
Aug  7 14:16:00 lockss1 kernel:  --- wd:1 rd:2
Aug  7 14:16:00 lockss1 kernel:  disk 0, wo:1, o:1, dev:sda1
Aug  7 14:16:00 lockss1 kernel:  disk 1, wo:0, o:1, dev:sdb1
Aug  7 14:16:00 lockss1 kernel: md: syncing RAID array md0
Aug  7 14:16:00 lockss1 kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Aug  7 14:16:00 lockss1 kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
Aug  7 14:16:00 lockss1 kernel: md: using 128k window, over a total of 8385792 blocks.
Aug  7 14:22:31 lockss1 kernel: md: md0: sync done.
Aug  7 14:22:31 lockss1 kernel: RAID1 conf printout:
Aug  7 14:22:31 lockss1 kernel:  --- wd:2 rd:2
Aug  7 14:22:31 lockss1 kernel:  disk 0, wo:0, o:1, dev:sda1
Aug  7 14:22:31 lockss1 kernel:  disk 1, wo:0, o:1, dev:sdb1
Aug  7 14:26:10 lockss1 smartd[3387]: Device: /dev/sda, 1 Currently unreadable (pending) sectors
Once the resync finishes, /proc/mdstat shows /dev/md0 with both members active again:

tail /proc/mdstat
md1 : active raid1 sdd2[1] sdc2[0]
      8385856 blocks [2/2] [UU]

md3 : active raid5 sdd3[3] sdc3[2] sdb3[1]
      2917700352 blocks level 5, 256k chunk, algorithm 2 [4/3] [_UUU]

md0 : active raid1 sda1[0] sdb1[1]
      8385792 blocks [2/2] [UU]

unused devices: <none>

Even with the smartd error, it looks like /dev/md0 is holding. We'll have to go back at some point and run fsck on it from a rescue disc so the filesystem isn't mounted while we're trying to verify its consistency.
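
As a follow-up on that smartd warning, smartctl (from the same smartmontools package as smartd) can dump the drive's full SMART report. The eventual fsck pass from a rescue disc would look something like the second command; the exact invocation depends on the filesystem on /dev/md0 (ext3 is the CentOS 5.x default, but verify before forcing a check):

# review the SMART attributes and error logs for the disk reporting pending sectors
smartctl -a /dev/sda

# from rescue media, with /dev/md0 assembled but NOT mounted: force a filesystem check
fsck -f /dev/md0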


mdadm --misc --detail /dev/md0

As we can see, /dev/md0 has been restored to service.

/dev/md0:
        Version : 0.90
  Creation Time : Wed Jul 13 23:04:19 2011
     Raid Level : raid1
     Array Size : 8385792 (8.00 GiB 8.59 GB)
  Used Dev Size : 8385792 (8.00 GiB 8.59 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Aug  7 17:50:16 2012
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : cbae8de5:892d4ac9:c1cb8fb2:5f4ab019
         Events : 0.4493518

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1


Repairing md3

For this particular server the /dev/md2 and /dev/md3 RAID devices store content that is accessed pretty infrequently, so we can take the system out of service while recovering the arrays. RAID does let you keep using an array in its degraded state (with reduced performance) while it rebuilds, but it is important to understand that a rebuild puts a heavy load on all of the remaining members. In my case, taking the server out of service allows for faster restoration of the arrays.
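
Since the server is out of service anyway, the kernel's md rebuild speed limits can be raised for the duration of the resync. The defaults are the ones quoted in the kernel log earlier (1000 and 200000 KB/sec); the values below are only examples, not recommendations:

# per-device floor and ceiling for md resync speed, in KB/sec
echo 50000  > /proc/sys/dev/raid/speed_limit_min
echo 500000 > /proc/sys/dev/raid/speed_limit_max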

(Optional) Stopping services that are configured to use the RAID devices

service SERVICE_NAME stop
umount /dev/md2
umount /dev/md3

Unmounting the file systems should stop services/daemons from hitting the RAID devices we're trying to repair.
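
If umount had complained that a device was busy, fuser can show which processes still have files open on it:

fuser -vm /dev/md2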

mdadm --stop /dev/md2
mdadm: stopped /dev/md2
mdadm --stop /dev/md3
mdadm: stopped /dev/md3

We've now stopped the devices and are ready to begin adding the missing members.
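
As a quick sanity check before moving on, the stopped arrays should no longer appear in /proc/mdstat:

cat /proc/mdstat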

Determining the array members for /dev/md3