Docu review done: Thu 29 Jun 2023 12:33:11 CEST

Commands and Descriptions

  • cat /proc/mdstat : show status of all raids
  • mdadm --detail /dev/md0 : detailed status of raid md0
  • mdadm --create /dev/md0 -n 2 -l 1 /dev/sda1 /dev/sdb1 : new raid md0 with 2 disks, raid level 1, on sda1 and sdb1
  • mdadm --fail /dev/md0 /dev/sda1 ; mdadm --remove /dev/md0 /dev/sda1 : remove sda1 from md0
  • mdadm --add /dev/md0 /dev/sda1 : add sda1 to md0
  • mdadm --grow /dev/md0 -n 3 : use 3 disks in raid md0 (e.g. add an additional disk, so a damaged drive can be removed later on)
  • mdadm --grow /dev/md0 -n 4 --add /dev/sda3 -l 1 : adds sda3 and grows md0
  • mdadm --grow /dev/md0 -n 6 --add /dev/sda4 /dev/sda5 -l 1 : adds sda4 and sda5 and grows md0
  • mdadm --assemble /dev/md0 : assemble md0 (e.g. when running a live system)
  • mdadm --detail --scan >> /etc/mdadm/mdadm.conf : update the list of arrays in /etc/mdadm/mdadm.conf; remove the old list by hand first!
  • mdadm --examine /dev/sda1 : what is this disk / partition?
  • sysctl -w dev.raid.speed_limit_min=10000 : set minimum raid rebuilding speed to 10000 kiB/s (default 1000)
  • sfdisk -d /dev/sdX | sfdisk /dev/sdY : copy partition table from sdX to sdY (MBR only)
  • sgdisk /dev/sdX -R /dev/sdY ; sgdisk -G /dev/sdY : copy partition table from sdX to sdY (GPT)

-n [0-9]+ is equivalent to --raid-devices=[0-9]+

-l [0-9]+ is equivalent to --level=[0-9]+
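
For example, these two create commands are equivalent (using the same devices as in the table above):

$ mdadm --create /dev/md0 --raid-devices=2 --level=1 /dev/sda1 /dev/sdb1
$ mdadm --create /dev/md0 -n 2 -l 1 /dev/sda1 /dev/sdb1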

To boot a machine even with a degraded array, modify /etc/initramfs-tools/conf.d/mdadm and run update-initramfs -c -k all (use with caution!)
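
A minimal sketch of that change on a Debian/Ubuntu-style system (the BOOT_DEGRADED option is an assumption; check your distribution's mdadm documentation before relying on it):

$ echo "BOOT_DEGRADED=true" > /etc/initramfs-tools/conf.d/mdadm
$ update-initramfs -c -k all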

Installation

$ apt install mdadm
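
To confirm the installation, check the installed version and the (initially empty) RAID status:

$ mdadm --version
$ cat /proc/mdstat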

Raid Levels

To get an overview of RAID, have a look at the RAID documentation.

Create array

$ mdadm --create /dev/md/<label> --level=<RAID level> --raid-devices=<number of physical partitions in array> /dev/<device1> /dev/<device2> /dev/<deviceX>

Parameters

  • --create : sets the optional label of the raid device, e.g. /dev/md/md_test
  • --level= : defines the raid level. Allowed values are: linear, raid0, 0, stripe, raid1, 1, mirror, raid4, 4, raid5, 5, raid6, 6, raid10, 10, multipath, mp, faulty, container
  • --raid-devices= : specifies the physical partitions inside the newly created software raid, e.g. --raid-devices=2 /dev/sdb /dev/sdc3

Sample Create RAID 0

Creating a RAID 0 (block level striping) with two partitions

$ mdadm --create /dev/md/md_test --level=0 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md/md_test started.
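
After creating the array, a filesystem is usually created on the new md device and mounted; a minimal sketch, assuming ext4 and the mount point /mnt/test used later in this document:

$ mkfs.ext4 /dev/md/md_test
$ mount /dev/md/md_test /mnt/test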

Sample Create RAID 1

Creating a RAID 1 (mirroring) with two partitions

$ mdadm --create /dev/md/md_test --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
Continue creating array? yes
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md/md_test started.
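
A freshly created RAID 1 starts an initial resync of the mirror; its progress can be followed with the commands below (the exact lines shown depend on the mdadm version):

$ cat /proc/mdstat
$ mdadm --detail /dev/md/md_test | grep -E 'State|Status'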

Delete array

To be able to remove an array, it has to be unmounted first; then stop it with

$ mdadm --stop /dev/md/<raid name>

To remove the array completely, you need to zero the superblock on each member disk

$ mdadm --zero-superblock /dev/sdX

Sample Delete array

$ umount -l /mnt/test
$ mdadm --stop /dev/md/md_test
mdadm: stopped /dev/md/md_test
$ mdadm --zero-superblock /dev/sdb1
$ mdadm --zero-superblock /dev/sdc1
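
If the array was listed in /etc/mdadm/mdadm.conf (see the scan command above), the corresponding ARRAY line should be removed there as well; a sketch, assuming the array name md_test and a Debian-style initramfs:

$ sed -i '/md_test/d' /etc/mdadm/mdadm.conf
$ update-initramfs -u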

List arrays and partitions

RAID arrays can be listed with two commands:

  • --detail : shows the full details of an active array
  • --examine : shows details about the individual physical devices inside the raid

$ mdadm --examine --brief --scan --config=partitions
ARRAY /dev/md/md_test metadata=1.2 UUID=81c1d8e5:27f6f8b9:9cdc99e6:9d92a1cf name=swraid:md_test

This command can be shortened to -Ebsc partitions
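
For example, the equivalent short form is:

$ mdadm -Ebsc partitions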

$ mdadm --detail /dev/md/md_test
/dev/md/md_test:
        Version : 1.2
  Creation Time : Fri Jul  5 09:14:36 2013
     Raid Level : raid0
     Array Size : 16776192 (16.00 GiB 17.18 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Fri Jul  5 09:14:36 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 512K

           Name : swraid:md_test  (local to host swraid)
           UUID : 81c1d8e5:27f6f8b9:9cdc99e6:9d92a1cf
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1

Hotspare

A hotspare disc/partition is a standby device that is only used in an emergency: it takes over when an active disc/partition of the RAID has issues or is broken. If there is no hotspare disk defined inside the RAID, you need to perform the rebuild manually; if one is defined, the rebuild happens automatically. To add a disc/partition as hotspare, run:

mdadm --add /dev/md/<RAID Name> /dev/sdX

If you want to remove a hotspare disc/partition from an existing raid, use:

mdadm --remove /dev/md/<RAID Name> /dev/sdX
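
A spare can also be defined directly when creating an array, using --spare-devices (short form -x); a sketch with hypothetical device names:

$ mdadm --create /dev/md/md_test --level=1 --raid-devices=2 --spare-devices=1 /dev/sdb1 /dev/sdc1 /dev/sdd1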

Rebuild

If there are issues on a disc/partition inside the RAID, you need to trigger the rebuild. To do that, first remove the broken disc/partition from the RAID with the command

$ mdadm --manage /dev/md/<RAID Name> -r /dev/sdX

It is important that the new disc/partition has the same size as the broken one. There are several tools which can help you partition a drive, e.g. fdisk /dev/sdX, cfdisk /dev/sdX or parted /dev/sdX.

If the size of the new disc/partition matches the size of the broken one, it can be added to the RAID. To add it, run:

$ mdadm --manage /dev/md/<RAID Name> -a /dev/sdX

If there are no issues during the adding of the disc/partition, you can start the rebuild. To achieve that, you need to set the new disk/partition to the state “faulty” by running:

$ mdadm --manage --set-faulty /dev/md/<RAID Name> /dev/sdX

By doing that, the rebuild will be triggered and you can watch its status with the command cat /proc/mdstat.

root@swraid:/dev# watch cat /proc/mdstat

Every 2.0s: cat /proc/mdstat                                                         Fri Jul  5 09:59:16 2013

Personalities : [raid0] [raid1]
md127 : active raid1 sdc1[1] sdb1[0]
      8384448 blocks super 1.2 [2/2] [UU]
      [==============>......]  check = 74.7% (6267520/8384448) finish=0.1min speed=202178K/sec

unused devices: <none>

After the rebuild process has finished, you need to remove and re-add the disk/partition to the RAID to clear the “faulty” state. Run the following commands to remove and add the disk/partition:

$ mdadm --manage /dev/md/<RAID Name> -r /dev/sdX && mdadm --manage /dev/md/<RAID Name> -a /dev/sdX

To verify that the state of the RAID is good again, you can use the command mdadm --detail /dev/md/<RAID Name>; it should now show State : clean.

Checking array

For permanent monitoring, the checkarray tool needs to be available; it can be run regularly via cron.

/usr/share/mdadm/checkarray --cron --all --quiet
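
A sketch of a weekly cron entry (the file name and schedule are assumptions; Debian's mdadm package ships a similar job in /etc/cron.d/mdadm by default):

# /etc/cron.d/mdadm-check (hypothetical): run a RAID check every Sunday at 01:06
6 1 * * 0 root /usr/share/mdadm/checkarray --cron --all --quiet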

md device vanished and mdadm segfault

If cat /proc/mdstat does not return any output and/or hangs, and you see a segfault about mdadm in dmesg like this:

mdadm[42]: segfault at ffffffff0000009c ip 00000000004211fe sp 000050bbaf1211fb error 5 in mdadm[400000+78000]

You have to force the assembly of the md device and re-grow it:

$ mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sda2 /dev/sda3 /dev/sda4 /dev/sda5
$ mdadm --grow /dev/md0 -n 6 -l 1

Now it should be back in a healthy state.