RAID

Quick reference

Create a RAID 1 (mirror) volume from two partitions of the same (or nearly the same) size:

mdadm --create /dev/md0 --level=mirror --raid-devices=2 /dev/sda1 /dev/sdb1

note: you can create a RAID 1 array with more than two devices, which gives you that many copies of the data. While this adds little extra safety by itself, it makes sense when you are creating a RAID 5 array for most of your disk space and using RAID 1 only for a small /boot partition, as in the example below. Using the same partitioning on all member drives keeps things simple.
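
For example, a three-way /boot mirror across all three drives (a sketch, assuming /dev/sda1, /dev/sdb1 and /dev/sdc1 are small, identically sized partitions):

mdadm --create /dev/md0 --level=mirror --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1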

Create a RAID 5 volume from three partitions of the same (or nearly the same) size:

mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sda2 /dev/sdb2 /dev/sdc2

If you receive an error such as "mdadm: RUN_ARRAY failed: Invalid argument", make sure your kernel supports the RAID level you are trying to use, either via a module or by being compiled in directly. Most modern kernels do, but you never know...
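
To check what the running kernel supports, look at the Personalities line of /proc/mdstat; if the level you need is missing, try loading its module (raid456 covers RAID 4, 5 and 6):

cat /proc/mdstat  # e.g. "Personalities : [raid1] [raid6] [raid5] [raid4]"
modprobe raid456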

If the partitions used in your RAID array are not the same size, mdadm will use the smallest.

To add RAID device md0 to /etc/mdadm/mdadm.conf so that it is recognized the next time you boot:

mdadm -Es | grep md0  >>/etc/mdadm/mdadm.conf

View the status of a multi-disk array:

mdadm --detail /dev/md0
cat /proc/mdstat

note: you may want to keep the output from these commands on a separate machine, in case you need to know the drive order when mdadm can't start the array. Provided you have enough working drives, you can manually re-create a RAID array with the mdadm --create command if you specify the member partitions in the correct order, as sketched below.
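
A sketch of that recovery (assuming the array was originally built from /dev/sda2, /dev/sdb2 and /dev/sdc2, in that order): --assume-clean skips the initial resync, but triple-check the device order and array parameters first, because re-creating with the wrong ones will destroy your data.

mdadm --create /dev/md1 --level=5 --raid-devices=3 --assume-clean /dev/sda2 /dev/sdb2 /dev/sdc2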

Migrate an existing Linux partition to a RAID 5 volume of three partitions, where the existing Linux system is on /dev/sda1:

mdadm --create /dev/md1 --level=5 --raid-devices=3 missing /dev/sdb2 /dev/sdc2
mdadm -Es >>/etc/mdadm/mdadm.conf
update-initramfs -u
dd if=/dev/sda1 of=/dev/md1
(reboot into /dev/md1)
mdadm --add /dev/md1 /dev/sda1

note: the /boot directory should be on /dev/md0 (see top). You should also verify your ability to boot into /dev/md1 before adding your original boot partition to the array! Also, make sure your boot loader (LILO, GRUB or whatever) is installed on each physical drive containing a /boot copy, for example as shown below.
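
With GRUB, for instance, that means running grub-install against every physical drive that carries a copy of /boot (drive names here are illustrative):

grub-install /dev/sda
grub-install /dev/sdb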

Add a new drive to a RAID 5 volume:

mdadm --add /dev/md1 /dev/sdd1
mdadm --grow /dev/md1 --raid-devices=4

note: this adds a "spare" drive to the array, then grows the array to use its space.
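
The reshape runs in the background and can take many hours; you can watch its progress as shown here (assuming the array is /dev/md1). Remember that the filesystem on the array still has to be resized afterwards to use the new space.

cat /proc/mdstat
mdadm --detail /dev/md1 | grep -i reshape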

In some configurations you may not be able to grow the array until you have removed the internal bitmap. You can add this again after the array has been grown.

mdadm --grow /dev/md1 -b none

and to add the bitmap back:

mdadm --grow /dev/md1 -b internal

Making the array larger by increasing the size of the component partitions can be done by replacing the partitions in the array one at a time, letting each one resync before replacing the next. This can take time and, if your machine doesn't allow hot-swapping, many reboots.

You may not actually do this all in one sitting. For example, if your drives are getting old, you may replace each one as it fails with a larger drive partitioned to allow for a larger array. When you've replaced all the drives, grow the array to use the full space.

mdadm /dev/md1 --fail /dev/sda1
(replace the first drive with the new, larger one then partition it)
mdadm --add /dev/md1 /dev/sda1
(wait for re-sync to complete then repeat for next drive)
mdadm --grow /dev/md1 --size=max
(now resize the file-system(s) on the RAID array to use the new space)
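
For an ext3/ext4 filesystem sitting directly on the array, that last step can be as simple as the following (a sketch; LVM or other filesystems need their own resize tools):

resize2fs /dev/md1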

Deleting a RAID array: you can halt and delete a RAID array with the following commands

mdadm --stop /dev/md0  # to halt the array
mdadm --remove /dev/md0  # to remove the array
mdadm --zero-superblock /dev/sd[abc]1  # delete the superblock from all drives in the array
gedit /etc/mdadm/mdadm.conf  # delete any lines referring to the removed array

Optimize

RAID 5 (or RAID 6) performance can be boosted by increasing the number in the following file:

/sys/block/md0/md/stripe_cache_size

For example, to raise it from the default of 256 to 8192:

echo 8192 > /sys/block/md0/md/stripe_cache_size

Check performance

  • writeperformance.sh
dd if=/dev/zero of=/var/local/storage/10gb.16384k.stripe.out bs=1M count=10240
  • readperformance.sh
dd if=/var/local/storage/10gb.16384k.stripe.out of=/dev/null bs=1M
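
note: the read test can be flattered by the Linux page cache; dropping caches first (as root) gives a more honest read figure:

sync
echo 3 > /proc/sys/vm/drop_caches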

Example change:

MD1 - RAID5
BEFORE(256):
write = 10737418240 bytes (11 GB) copied, 242.562 s, 44.3 MB/s
read = 10737418240 bytes (11 GB) copied, 57.0372 s, 188 MB/s

AFTER(8192):
write = 10737418240 bytes (11 GB) copied, 75.4038 s, 142 MB/s
read = 10737418240 bytes (11 GB) copied, 58.3019 s, 184 MB/s

Speed up RAID1 resync

echo 200000 > /proc/sys/dev/raid/speed_limit_max
echo 200000 > /proc/sys/dev/raid/speed_limit_min
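
To make these limits survive a reboot, the same knobs can be set in /etc/sysctl.conf:

dev.raid.speed_limit_min = 200000
dev.raid.speed_limit_max = 200000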

SNMP monitor script

/usr/local/sbin/check-md0.sh

#!/bin/sh

# Pull the "State :" line out of mdadm's detail output
MDADM=`mdadm --detail /dev/md0 | grep 'State :' | cut -c19-60`
#SNMP=`snmpwalk -v1 -c YOUR-PUBLIC $1 extOutput | grep mdadm`
TMP1=`echo $MDADM | grep degraded`

if [ "$TMP1" = "" ]; then
  echo "OK"
  exit 0
else
  echo "DEGRADED"
  exit 2
fi

Call it from snmpd by adding this line to snmpd.conf:

exec md0status /usr/local/sbin/check-md0.sh
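
You can then poll the result over SNMP; with Net-SNMP, exec output appears under extOutput (the community string and hostname below are placeholders):

snmpwalk -v1 -c YOUR-PUBLIC yourhost extOutput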

Disk Problems

The Linux command to enter at a shell prompt to surface scan a disk partition is

  fsck -f -y -C -V -c -c /dev/xxan

Those parameters mean:

-f	"force" the complete operation even if the flag says the disk is "clean"
-y	automatically answer "yes" whenever an error is discovered that asks the user what to do
(needed for unattended automatic fix operation)
-C	(note upper case)
present a progress bar showing how much of the operation has been completed
-V	(note upper case)
enable "verbose" mode to keep the user apprised of what's happening by printing informative messages
-c -c	(no, that's not a stutter; it really does need to be entered twice for this particular piece of software. Given only once, it takes the "shortcut" of merely reading the existing data from every sector and relying entirely on the disk itself to detect problems, thus missing many flaws.)
Given twice, it does a comprehensive surface scan: it saves the existing contents of each area, writes a test pattern to the area, reads the area back and verifies the data matches the pattern that was written, then restores the initial contents.
xxan	the disk partition to be operated on:
xx is sd for SCSI and Serial ATA (SATA) disks, or hd for older Parallel ATA (PATA/IDE) disks
a is a, b, c etc., specifying which disk
n is 1, 2, 3 etc., specifying which partition on the disk

(Note: the disk partition should not be mounted [i.e. there should be no possibility of it being in use] during the operation.)
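
For example, a full surface scan of the first partition on a second SATA disk (unmount it first; device names are illustrative):

umount /dev/sdb1
fsck -f -y -C -V -c -c /dev/sdb1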

LSI SAS3008

VMware: install vmware-esx-sas3ircu.vib to get the sas3ircu utility.

Disk Management

Display disks on controller "0"

/opt/lsi/bin/sas3ircu 0 display

Show volume status

/opt/lsi/bin/sas3ircu 0 status

Offline disk (PHY[2] Enclosure#/Slot# : 1:6)

 /opt/lsi/bin/sas3ircu 0 setoffline 1:6
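
Before pulling the disk it can help to blink its slot LED; sas3ircu typically offers a LOCATE command for this (an assumption worth confirming with sas3ircu help on your version):

/opt/lsi/bin/sas3ircu 0 locate 1:6 ON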