Information about starting and operating an ISP or corporate Intranet using Linux servers.

RAID notes

Changing a RAID file-system type, or changing partitions from RAID to non-RAID, is not always as smooth as you might expect. The notes below are works in progress. They certainly leave out many variables that could affect you, and loss of data is always possible. Make a back-up before trying anything.

Fixing a changed device number

Scenario: To fix a problem you boot with a “rescue” CD or USB stick and suddenly you see your RAID devices have started numbering themselves around /dev/md120 instead of single-digit values like /dev/md1. When you reboot back to your normal installation it ignores the values in /etc/mdadm.conf; instead the new /dev/md120+ values remain and the server halts during boot. If you boot back to the rescue CD and manually change the devices back to what they were and reboot, they just go back to the new high-number values.

I do not understand why this happened, and the few other people I found asking about it never got an answer. What I did finally figure out is the mechanism: each partition stores a preferred device minor value (e.g., “3”), and somehow that number was changed (e.g., to something like “120”). To fix the problem you need to reset the preferred minor value using mdadm.

For example, if your /dev/md1 contained drive partitions /dev/sda1 and /dev/sdb1, and it mysteriously changed to /dev/md125, you would see this:

# mdadm --examine /dev/sda1
/dev/sda1:
          Magic : a92b4efc
         [snip...]
  Total Devices : 2
Preferred Minor : 125
(etc.)

To fix the problem you would do this:

# mdadm --stop /dev/md125
# mdadm --assemble /dev/md1 --update=super-minor /dev/sda1 /dev/sdb1

Now if you run the mdadm --examine command again you will see the minor number has changed:

# mdadm --examine /dev/sda1
/dev/sda1:
          Magic : a92b4efc
         [snip...]
  Total Devices : 2
Preferred Minor : 1
(etc.)

Repeat the process for every RAID device, then reboot.
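
The check-and-fix steps above could be scripted roughly as follows. This is only a minimal sketch: it assumes the superblock format that reports a “Preferred Minor” line in the --examine output (as in the example above), the helper names preferred_minor and print_fix are made up for illustration, and the second helper only prints the commands instead of running them so you can review them first.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of mdadm.

# Read `mdadm --examine` output on stdin and print the Preferred Minor value.
preferred_minor() {
    awk -F: '/^[ \t]*Preferred Minor[ \t]*:/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Print (do not run) the commands that would reset the minor number of an array.
# Usage: print_fix CURRENT_MD WANTED_MD MEMBER...
print_fix() {
    current=$1; wanted=$2; shift 2
    echo "mdadm --stop $current"
    echo "mdadm --assemble $wanted --update=super-minor $*"
}
```

For the example above, mdadm --examine /dev/sda1 | preferred_minor would print 125 before the fix, and print_fix /dev/md125 /dev/md1 /dev/sda1 /dev/sdb1 prints the two commands shown earlier.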

Changing a filesystem on RAID

Suppose you want to change a filesystem on a RAID device. For example, you might see some strange corruption now and then with a filesystem such as XFS and decide to switch to EXT4.

It seems simple enough: back up the data to another drive, unmount the RAID device, create the new filesystem, put the data back, and change the entry in /etc/fstab. But I did this and after rebooting saw the O/S trying to mount the device using the former filesystem. Also, I had some UUIDs in fstab and saw messages such as “/dev/sda5 is busy or already mounted” (naming an individual drive partition rather than the RAID device /dev/md#).

First I discovered that the old filesystem type was still listed in /etc/blkid.tab, and that the UUID of the RAID device had changed so that it no longer matched the UUID of the individual drive partitions. So I tried two things to fix this:

  1. Edit /etc/blkid.tab and change the filesystem type for each relevant device.
  2. Use mdadm, much as shown above, to update the superblock of each affected RAID device, e.g.,
    mdadm --stop /dev/md10
    mdadm --assemble /dev/md10 --update=uuid --uuid=(drive's UUID)

    However, using mdadm version 3.0 this did not work even though the man page said it should. After I update to version 3.1.4 I will test this again. See man mdadm for yourself.

Instead of #2, other possible ways to fix the problem are:

  1. Change the UUIDs in /etc/fstab to reference the new value for the RAID device as reported by the blkid command.
  2. Change /etc/fstab to reference the /dev/md# device name instead of the UUID. This is what I did.
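
The fstab change in the second option could be previewed with something like the sketch below. It is an illustration only: fstab_use_device is a made-up helper name, it writes the rewritten table to standard output and leaves the file alone, and because awk rebuilds the edited line, the columns of that line end up separated by single spaces.

```shell
#!/bin/sh
# Sketch only: `fstab_use_device` is a hypothetical helper.
# Print FSTAB with the entry whose first field is "UUID=<uuid>" changed to use
# DEVICE as its first field and FSTYPE as its third field.
# The file itself is not modified.
# Usage: fstab_use_device UUID DEVICE FSTYPE [FSTAB]
fstab_use_device() {
    awk -v old="UUID=$1" -v dev="$2" -v fstype="$3" '
        /^[ \t]*#/ { print; next }            # keep comment lines untouched
        $1 == old  { $1 = dev; $3 = fstype }  # rewrite the matching entry
        { print }
    ' "${4:-/etc/fstab}"
}
```

You could compare the output against the current file (for example with diff) before replacing /etc/fstab by hand.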

Adding drives

If you add more drives to an existing array you need to do these steps to put them to use:

mdadm --add /dev/md# /dev/sd?# etc.
mdadm --grow /dev/md# --raid-devices=(the new total number)

At this point they will be active in the array and the new drives will start synchronising (i.e., being built). When the new drives are done being rebuilt, you can expand the filesystem to actually use the new space. The tool you use depends on the filesystem: for ext2/ext3/ext4 you would use resize2fs, while for XFS you would use xfs_growfs. Use the apropos command to find the command suitable for your filesystem.

Example:

resize2fs -p /dev/md11
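
Since the resize should wait until the new drives have finished building, a small check against /proc/mdstat can help. The sketch below is an assumption-laden illustration: md_busy is a made-up helper, and it simply looks for a reshape, recovery, or resync progress line under the named array.

```shell
#!/bin/sh
# Sketch only: `md_busy` is a hypothetical helper.
# Succeed (exit 0) while the named array shows a reshape, recovery, or resync
# line in mdstat-style text; fail (exit 1) otherwise.
# Usage: md_busy NAME [MDSTAT_FILE]
md_busy() {
    awk -v name="$1" '
        $2 == ":" { in_array = ($1 == name); next }  # header line of an array
        NF == 0   { in_array = 0 }                   # blank line ends the entry
        in_array && /reshape|recovery|resync/ { busy = 1 }
        END { exit !busy }
    ' "${2:-/proc/mdstat}"
}

# Possible use (not run here):
#   while md_busy md11; do sleep 60; done
#   resize2fs -p /dev/md11
```

mdadm also has a --wait option that blocks until such activity finishes, which may be simpler where it is available; check man mdadm for your version.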

Recovering from interrupted "grow"

While adding two drives to a RAID-5 set, one of the two new drives went bad. This caused the completion time shown in /proc/mdstat to keep climbing. When the time had climbed from the initial ~8 hour estimate to where it was reporting a completion time more than a week in the future, I tried to perform a “shutdown”. This failed and the server froze.

After rebooting the server, the array showed itself as “inactive” in /proc/mdstat. Trying to “--run” the array just gave an “input/output error”. Using the “--examine” command (e.g., “mdadm --examine /dev/sda10”) showed that the reshape was interrupted, but mdadm knew exactly where it had left off.

After I had replaced the faulty drive I considered whether to wipe the array clean, but restoring over a terabyte of data from back-up is daunting plus there were a few days of data that were not backed-up. Finally I found this solution:

mdadm --stop /dev/md10
mdadm --assemble /dev/md10 /dev/sda10 /dev/sdb10 /dev/sdc10 /dev/sdd10 /dev/sde10 /dev/sdf10 --force

The “--force” option did the trick. /proc/mdstat then showed that the array was being re-shaped, and I was able to mount it and see that the data was still there. When it is done re-shaping I will proceed to add the replacement sixth drive back to the array.
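
The before-and-after check in this recovery could be sketched as below. Both helpers are made up for illustration: md_state reads the state word that /proc/mdstat shows for an array, and print_force_assemble only prints the two commands so that the member list can be double-checked before anything is run.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical.

# Print the state word ("active" or "inactive") shown for an array in
# mdstat-style text.
# Usage: md_state NAME [MDSTAT_FILE]
md_state() {
    awk -v name="$1" '$1 == name && $2 == ":" { print $3; exit }' "${2:-/proc/mdstat}"
}

# Print (do not run) the forced re-assembly commands for an array.
# Usage: print_force_assemble MD_DEVICE MEMBER...
print_force_assemble() {
    md=$1; shift
    echo "mdadm --stop $md"
    echo "mdadm --assemble $md $* --force"
}
```

In the situation described above, md_state md10 would have printed inactive before the forced assembly.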
