Ubuntu 12.04 - Retrofitting a RAID1 array

Posted in How Did I Do That?

Instructions are based heavily (with a few modifications) on information at http://feeding.cloud.geek.nz/posts/setting-up-raid-on-existing/ and http://www.guyrutenberg.com/2013/12/01/setting-up-raid-using-mdadm-on-existing-drive/

 

1 Preparation

Check what devices are known

root@server2:/home/hwm# df -h

Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/server2-root  1.8T  258G  1.5T  15% /
udev                      3.9G  4.0K  3.9G   1% /dev
tmpfs                     1.6G  3.4M  1.6G   1% /run
none                      5.0M     0  5.0M   0% /run/lock
none                      3.9G  160K  3.9G   1% /run/shm
cgroup                    3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda2                 229M   48M  169M  23% /boot
/dev/sda1                 190M  122K  190M   1% /boot/efi

And check what devices are installed:

root@server2:/home/hwm# fdisk -l

WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

  Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1  3907029167  1953514583+  ee  GPT
Partition 1 does not start on physical sector boundary.

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

 

More info:

root@server2:/home/hwm# parted

GNU Parted 2.3
Using /dev/sda

Welcome to GNU Parted! Type 'help' to view a list of commands.

(parted) print all                                                       

Model: ATA ST2000DL003-9VT1 (scsi)
Disk /dev/sda: 2000GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system  Name                 Flags
 1      1049kB  200MB   199MB   fat32        Ubuntu12.04_LTS_FAT  boot
 2      200MB   456MB   256MB   ext2
 3      456MB   2000GB  2000GB               Ubuntu12.04_LTS_LVM  lvm


Model: ATA ST2000DL003-9VT1 (scsi)
Disk /dev/sdb: 2000GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system  Name                 Flags
 1      1049kB  200MB   199MB   fat16        Ubuntu12.04_LTS_FAT  boot
 2      200MB   456MB   256MB   ext2
 3      456MB   2000GB  2000GB  ext4         Ubuntu12.04_LTS_LVM  lvm


Model: Linux device-mapper (linear) (dm)
Disk /dev/mapper/server2-swap_1: 8477MB
Sector size (logical/physical): 512B/4096B
Partition Table: loop

Number  Start  End     Size    File system     Flags

 1      0.00B  8477MB  8477MB  linux-swap(v1)



Model: Linux device-mapper (linear) (dm)
Disk /dev/mapper/server2-root: 1991GB
Sector size (logical/physical): 512B/4096B
Partition Table: loop

Number  Start  End     Size    File system  Flags
 1      0.00B  1991GB  1991GB  ext4

 

To summarize: sda is the drive currently in use, and sdb is untouched.

sda1 = /boot/efi
sda2 = /boot
sda3 = LVM (server2-root and server2-swap_1)

I should start by making a backup of sda, just in case I really hose things up. I used Clonezilla to back up sda onto a USB drive. Backup created.

2 Last Check

Before you start, make sure the following packages are installed:

apt-get install mdadm rsync initramfs-tools

 

3 Get Started

Copy the drive partitions:

sfdisk -d /dev/sda > partition.txt
sfdisk --force /dev/sdb < partition.txt
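
Note that the fdisk warning above shows these disks use GPT, which the sfdisk shipped with Ubuntu 12.04 does not understand; if the dump/restore above fails, sgdisk from the gdisk package can copy a GPT layout instead. A sketch, assuming the gdisk package is available (the -G step gives the copy new GUIDs so it does not clash with the original disk):

apt-get install gdisk
sgdisk --replicate=/dev/sdb /dev/sda
sgdisk -G /dev/sdb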

 

Set the partition type on sdb to 0xfd (Linux RAID autodetect), e.g. with a disk manager tool. Reportedly this is not necessary, so I am skipping it.

4 Now it is time to create new degraded RAID arrays

I am only going to create an array for the root partition; I am not mirroring swap, as I do not think it is necessary on this server. The root partition on the newly partitioned drive (sdb3) can now be added to a new degraded RAID1 array using mdadm. The array is created on the NEW drive only, with a placeholder ("missing") for the second member that will be added later:

mdadm --create /dev/md/rootpart --level=1 --raid-devices=2 missing /dev/sdb3

 

and formatted like this:

mkfs.ext4 /dev/md/rootpart

 

Record the new array in /etc/mdadm/mdadm.conf:

/usr/share/mdadm/mkconf > /etc/mdadm/mdadm.conf
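
To double-check that the array was recorded, grep the file for ARRAY lines; there should now be an entry for the new array, roughly of this shape (the actual UUID will of course differ):

grep ^ARRAY /etc/mdadm/mdadm.conf
ARRAY /dev/md/rootpart metadata=1.2 name=server2:rootpart UUID=<array uuid>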

 

Check the status of your RAID arrays at any time by running this command:

cat /proc/mdstat

 

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : active raid1 sdb3[1]

      1952937792 blocks super 1.2 [2/1] [_U]


unused devices: <none>
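
For more detail on the new (still degraded) array, mdadm --detail can also be used; it reports the state, the number of active devices, and which slot is still missing:

mdadm --detail /dev/md/rootpart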

5- Prepare the boot

In this step we prepare the system to boot from the newly created RAID array. Of course we won't actually do that before copying our data onto it.

Start by editing /etc/grub.d/40_custom and adding a new entry to boot the raid array. The easiest way is to copy the latest boot stanza from /boot/grub/grub.cfg and modify it. The boot stanza looks something like this:

menuentry 'Ubuntu, with Linux 3.2.0-35-generic' --class ubuntu --class gnu-linux --class gnu --class os {
        recordfail
        gfxmode $linux_gfx_mode
        insmod gzio
        insmod part_gpt
        insmod ext2
        set root='(hd0,gpt2)'
        search --no-floppy --fs-uuid --set=root ae67b87a-53b0-4689-9971-1e8417cb5bfd
        linux   /vmlinuz-3.2.0-35-generic root=/dev/mapper/server2-root ro
        initrd  /initrd.img-3.2.0-35-generic

}

 

First we need to add

insmod raid
insmod mdraid1x

 

just after the rest of the insmod lines. This loads the GRUB modules needed to detect the RAID array during the boot process. (If you created the array with the old 0.9 metadata instead of the default 1.2, load mdraid09 instead of mdraid1x.) Next we need to point the entry at the new root partition. This is done by changing the UUID (those random-looking hex-and-hyphens strings) arguments on the lines starting with search and linux. To find the UUID of your root array, run

blkid /dev/md/rootpart

which will give something like:

/dev/md/rootpart: UUID="455cf8f6-532c-442c-814c-b8c4d280d170" TYPE="ext4"

and confirm by looking up everything:

root@server2:/dev/disk/by-uuid# ll
total 0
drwxr-xr-x 2 root root 140 Apr 30 10:00 ./
drwxr-xr-x 8 root root 160 Apr 29 16:13 ../
lrwxrwxrwx 1 root root  11 Apr 30 10:00 455cf8f6-532c-442c-814c-b8c4d280d170 -> ../../md127
lrwxrwxrwx 1 root root  10 Apr 29 16:13 4B56-F26C -> ../../sda1
lrwxrwxrwx 1 root root  10 Apr 29 22:40 8b6cdcf2-952e-49d7-ad8f-b08f44b3c202 -> ../../dm-1
lrwxrwxrwx 1 root root  10 Apr 29 16:13 ae67b87a-53b0-4689-9971-1e8417cb5bfd -> ../../sda2
lrwxrwxrwx 1 root root  10 Apr 29 16:13 b0c6a16d-18d0-4ebc-8485-b9861b6c4747 -> ../../dm-0

The set root line can be removed as the search line overrides it.

Last but not least add bootdegraded=true to the kernel parameters, which will allow you to boot the degraded array without any hassles. The result should look something like this:

menuentry 'Ubuntu, with Linux 3.2.0-35-generic (raid)' --class ubuntu --class gnu-linux --class gnu --class os {
        recordfail
        gfxmode $linux_gfx_mode
        insmod gzio
        insmod part_gpt
        insmod ext2
        insmod raid
        insmod mdraid1x
        search --no-floppy --fs-uuid --set=root 455cf8f6-532c-442c-814c-b8c4d280d170
        linux   /vmlinuz-3.2.0-35-generic root=UUID=455cf8f6-532c-442c-814c-b8c4d280d170 ro bootdegraded=true
        initrd  /initrd.img-3.2.0-35-generic
}

 

Now run update-grub as root so it actually updates the /boot/grub/grub.cfg file.
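
To confirm that the new entry made it into the generated file, a quick grep works (assuming you kept the "(raid)" suffix in the menuentry title as above):

grep '(raid)' /boot/grub/grub.cfg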

Afterwards, run

update-initramfs -u -k all

 

This makes sure that the updated mdadm.conf is put into the initramfs. If you don't, the names of your new RAID arrays will be a mess after a reboot.
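
To verify that the new mdadm.conf really ended up inside the initramfs, lsinitramfs (from initramfs-tools) can list its contents; kernel version as in the example above:

lsinitramfs /boot/initrd.img-3.2.0-35-generic | grep mdadm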

6- Copy existing data onto the new drive

Copy everything that's on the existing drive onto the new one using rsync:

mkdir /tmp/mntroot 
mount /dev/md/rootpart /tmp/mntroot 
rsync -auxHAX --exclude=/proc/* --exclude=/sys/* --exclude='/*/.gvfs' --exclude=/tmp/* / /tmp/mntroot/
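
If the system stays in use while the copy runs, it may be worth repeating the same rsync command immediately before the reboot in section 7; a second pass only transfers files that changed since the first one and is therefore quick:

rsync -auxHAX --exclude=/proc/* --exclude=/sys/* --exclude='/*/.gvfs' --exclude=/tmp/* / /tmp/mntroot/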

 

7- Get ready to reboot using the RAIDed drive and test system

Before rebooting, update /tmp/mntroot/etc/fstab so that the root filesystem is mounted from the new array: replace the old root device with the UUID of the new partition (which you can get using blkid). If you have encrypted swap, you should also update /tmp/mntroot/etc/crypttab.

Original:

# <file system>                           <mount point>   <type>  <options>            <dump>  <pass>
proc                                       /proc           proc    nodev,noexec,nosuid 0        0
/dev/mapper/server2-root                   /               ext4    errors=remount-ro   0        1
/dev/mapper/server2-swap_1                 none            swap    sw                  0        0
UUID=ae67b87a-53b0-4689-9971-1e8417cb5bfd  /boot           ext2    defaults            0        2
UUID=4B56-F26C                             /boot/efi       vfat    defaults            0        1

 

Change to:

# <file system>                           <mount point>   <type>  <options>            <dump>  <pass>
proc                                       /proc           proc    nodev,noexec,nosuid 0        0
UUID=455cf8f6-532c-442c-814c-b8c4d280d170  /               ext4    errors=remount-ro   0        1
/dev/mapper/server2-swap_1                 none            swap    sw                  0        0
UUID=ae67b87a-53b0-4689-9971-1e8417cb5bfd  /boot           ext2    defaults            0        2
UUID=4B56-F26C                             /boot/efi       vfat    defaults            0        1

 

The last thing before the reboot is to deal with the /boot and /boot/efi mount points, their partitions, and GRUB. I am not certain how to handle this yet and have posted a question on the Ubuntu forums.

 

LASTLY, reboot the computer.

Once the system is up, you can check that the root partition is indeed using the RAID array by running mount and looking for something like:

/dev/md/rootpart on / type ext4 (rw,noatime,errors=remount-ro)

 

8- Wipe the original drive by adding it to the RAID array

Once you have verified that everything is working on /dev/sdb, it's time to add sda to the RAID. This will take some time and will erase the original hard drive. Make certain you are ready before proceeding.

mdadm /dev/md/rootpart -a /dev/sda3
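
Note: on this machine sda3 is still a physical volume of the old server2 LVM volume group, which also contains the swap logical volume listed in fstab. If mdadm refuses to add the partition because the device is busy, the old volume group has to be released first. A sketch, assuming nothing from the old VG is still needed (adding sda3 to the array afterwards overwrites the old LVs):

swapoff -a
vgchange -an server2
mdadm /dev/md/rootpart -a /dev/sda3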

 

You'll have to wait until the two partitions are fully synchronized but you can check the sync status using:

watch -n1 cat /proc/mdstat
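
While the rebuild is running, the mdstat output contains a progress line along these lines (all numbers purely illustrative):

md127 : active raid1 sda3[2] sdb3[1]
      1952937792 blocks super 1.2 [2/1] [_U]
      [==>..................]  recovery = 12.3% (240000000/1952937792) finish=180.5min speed=150000K/sec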

 

9- Test booting off of the original drive

Shut the system down:

shutdown -h now

 

Physically disconnect /dev/sdb and turn the machine back on to test booting with only /dev/sda present.

After a successful boot, shut the machine down and plug the second drive back in before powering it up again.

 

10- Re-establish array, and sync again.

If everything works, you should see something like the following (the array name and block count will reflect your own array) after running

cat /proc/mdstat

Result:

md0 : active raid1 sda3[1]
      280567040 blocks [2/1] [_U]

 

indicating that the RAID array is incomplete and that the second drive is not part of it.

 

To add the second drive back in and start the sync again:

mdadm /dev/md/rootpart -a /dev/sdb3

 

11- Test booting off of the new drive

 

To complete the testing, shut the machine down, pull /dev/sda out and try booting with /dev/sdb only.

 

12- Reboot with the two drives, re-establish the array again, re-sync again.

 

Once you are satisfied that it works, reboot with both drives plugged in and re-add the first drive to the array:

mdadm /dev/md/rootpart -a /dev/sda3

 

13 Prepare for failure

I recommend making sure the two RAIDed drives stay in sync by enabling periodic RAID checks.
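
On Debian/Ubuntu the mdadm package already ships a monthly check (see /etc/cron.d/mdadm); a verification pass can also be started by hand. A sketch using the tools as shipped by the package (array name as reported in /proc/mdstat):

/usr/share/mdadm/checkarray --all
# or poke the kernel md interface directly:
echo check > /sys/block/md127/md/sync_action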

Something else you should seriously consider is to install the smartmontools package and run weekly SMART checks by putting something like this in your /etc/smartd.conf:

 

/dev/sda -a -d ata -o on -S on -s (S/../.././02|L/../../6/03) 
/dev/sdb -a -d ata -o on -S on -s (S/../.././02|L/../../6/03)
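
Note that on Ubuntu 12.04 the smartd daemon is disabled by default; something like the following should enable it (assuming the stock /etc/default/smartmontools shipped with the package):

apt-get install smartmontools
sed -i 's/^#start_smartd=yes/start_smartd=yes/' /etc/default/smartmontools
service smartmontools restart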

 

These checks, performed by the hard disk controllers directly, could warn you of imminent failures ahead of time. Personally, when I start seeing errors in the SMART log (smartctl -a /dev/sda), I order a new drive straight away.