
    Coraid Odyssey: Part 3 (performance testing)

    Performance and failure testing are next up in building our kickin’ iSCSI/AoE device.

    The Debian Etch installer supports building and installing onto software RAID arrays. Because of that, during installation I configured the initial RAID1 boot volume with a hot spare, using three WD 160GB SATA 3Gbps disks. mdadm sees / and swap as /dev/md0 and /dev/md1 respectively. There was some remaining space on the drives, which I set up as /dev/md2 for future use. I decided to create the remaining arrays manually with mdadm after getting a usable system up and running.

    First of all we need to figure out which devices we want in the arrays by probing /dev. The ultimate goal here is to build three arrays: 1 x RAID1 with 1 hot spare (8GB root, 1GB swap, 151GB extra), 1 x RAID6 with 1 hot spare (4TB Xen LVMs + 2TB Bacula LVM), 1 x RAID5 (2TB offsite mirror).

    I was forward-thinking enough to label each drive carrier with the serial number of the disk in it, so all we need to do is get the matching name from /dev/disk/by-id/ and then build our test array like so:


    mdadm --create /dev/md3 --level=6 --raid-devices=8 \
    /dev/disk/by-id/scsi-SATA_ST31000340NS_5QJ09HZR \
    /dev/disk/by-id/scsi-SATA_ST31000340NS_5QJ09JV3 \
    /dev/disk/by-id/scsi-SATA_ST31000340NS_5QJ06913 \
    /dev/disk/by-id/scsi-SATA_ST31000340NS_5QJ08XHB \
    /dev/disk/by-id/scsi-SATA_ST31000340NS_5QJ09KNM \
    /dev/disk/by-id/scsi-SATA_ST31000340NS_5QJ09JHQ \
    /dev/disk/by-id/scsi-SATA_ST31000340NS_5QJ0817T \
    /dev/disk/by-id/scsi-SATA_ST31000340NS_5QJ09B9P \
    --spare-devices=1 \
    /dev/disk/by-id/scsi-SATA_ST31000340NS_5QJ07CF3

    This creates our /dev/md3 device with the default options for RAID6 (64k chunk, left-symmetric parity) and starts the initial sync:


    stor01:~# cat /proc/mdstat
    Personalities : [raid1] [raid6] [raid5] [raid4]
    md3 : active raid6 sdl[8](S) sdk[7] sdj[6] sdf[5] sdh[4] sdg[3] sdi[2] sde[1] sdd[0]
    5860574976 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]
    [>....................] resync = 0.8% (8790864/976762496) finish=217.3min speed=74238K/sec
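
    The numbers in that output can be sanity-checked with a little shell arithmetic. This is only a sketch: the variable names are made up for illustration and the values are copied from the mdstat output above. RAID6 spends two members' worth of space on parity, so the usable size should be (members - 2) x per-device size, and the resync estimate is just the remaining blocks divided by the current speed.

```shell
# Values copied from the /proc/mdstat output; variable names are illustrative.
members=8              # active devices in md3 (the hot spare is not counted)
dev_blocks=976762496   # per-device size in 1K blocks
done_blocks=8790864    # blocks already resynced
speed_k=74238          # current resync speed in K/sec

# RAID6 keeps two devices' worth of parity.
usable_blocks=$(( (members - 2) * dev_blocks ))
echo "usable: ${usable_blocks} blocks"

# Remaining time in minutes at the current speed.
eta_min=$(awk -v r="$(( dev_blocks - done_blocks ))" -v s="$speed_k" \
    'BEGIN { printf "%.1f", r / s / 60 }')
echo "resync eta: ${eta_min} min"
```

    Both results agree with what mdstat reports (5860574976 blocks, finish=217.3min).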

    Here you can see that CPU load is moderate doing the first full sync of that array:


    Cpu0 : 0.0%us, 33.3%sy, 0.0%ni, 63.3%id, 0.0%wa, 1.3%hi, 2.0%si, 0.0%st
    Cpu1 : 0.0%us, 7.0%sy, 0.0%ni, 91.0%id, 0.0%wa, 0.3%hi, 1.7%si, 0.0%st

    So, we wait about 4 hours… and it's done :-)

    For formatting the volume, I decided to use the stride= option to mkfs.ext3. It tells the filesystem how many blocks make up one RAID chunk, so it can lay out its data to suit the striping of the array. The secret here is to make $stride = $chunk_size / $block_size. In our case, that's 64k chunks divided by 4096 byte (4k) blocks, so 16 is our optimal stride value.


    root@stor01:~# mkfs.ext3 -E stride=16 /dev/md3
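
    The stride arithmetic can be sketched as a small shell helper. stride_for is a hypothetical name, not part of e2fsprogs; it takes the md chunk size in KiB and the filesystem block size in bytes. The sketch only prints the mkfs.ext3 command rather than running it, and the explicit -b 4096 is an assumption added here to pin the block size the calculation relies on.

```shell
# stride_for CHUNK_KIB BLOCK_BYTES
# Prints the number of filesystem blocks that fit in one RAID chunk.
# (Hypothetical helper for illustration; not part of any package.)
stride_for() {
    chunk_kib=$1
    block_bytes=$2
    echo $(( chunk_kib * 1024 / block_bytes ))
}

# 64k chunks and 4k blocks, as used for /dev/md3.
stride=$(stride_for 64 4096)
echo "mkfs.ext3 -b 4096 -E stride=${stride} /dev/md3"
```

    If the array were created with a different chunk size, only the first argument would change.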

    So how does it perform? Well enough to saturate the dual gigabit NIC ports – which is all that matters :-)


    root@stor01:~# dd bs=4M if=/dev/zero of=/dev/md3
    4469+0 records in
    4469+0 records out
    18744344576 bytes (19 GB) copied, 97.583 seconds, 192 MB/s

    I also ran a more real-world test with bonnie++:


    /usr/sbin/bonnie++ -d /mnt -s 4096Mb -n 100 -x 10 -q

    This test showed approximately 335MB/s read, 170MB/s write. You can download the actual data here if you wish.

    Failure testing the array is as simple as removing and inserting disks while the array is up and running. Both removal and insertion work fine on the onboard SATA II controller but, alas, the sata_mv kernel module does not yet support hotplug, so all we can do on the remaining drives is simulate a drive failure by removing one disk. This does work fine, but I need to see if there is a way to refresh the SATA bus so that a replacement drive shows up and can be added back into the running array. Otherwise, we will need to power down the array to replace a faulty disk, which kind of ruins the whole project, don't you think? ;-)
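
    A failure can also be simulated in software with mdadm's manage mode, without pulling a carrier at all. The sketch below is not what was run for the tests above. simulate_failure and the DRY_RUN variable are names invented here, /dev/sdk is just one member taken from the mdstat output earlier, and by default the commands are only printed.

```shell
# simulate_failure ARRAY MEMBER
# Marks MEMBER faulty, removes it, then adds it back to the array
# (as a spare, if the hot spare has already taken over).
# Commands are only printed unless DRY_RUN=0 is set; names are illustrative.
simulate_failure() {
    array=$1
    member=$2
    run=echo
    if [ "${DRY_RUN:-1}" = "0" ]; then
        run=
    fi
    $run mdadm "$array" --fail "$member"
    $run mdadm "$array" --remove "$member"
    $run mdadm "$array" --add "$member"
}

simulate_failure /dev/md3 /dev/sdk
```

    As for refreshing the bus after a physical swap, Linux exposes a generic rescan trigger at /sys/class/scsi_host/hostN/scan (writing "- - -" to it, with hostN being whichever host the port belongs to). Whether that is enough to make a new disk appear behind sata_mv is an open question.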

    So while we wait for sata_mv to start working (or find a different SATA controller to use in this project) we will move on to the remaining issues. Up next: getting port trunking working with a Cisco switch…
