FreeBSD mfiutil and the Common RAID Format Spec

Recently I have been setting up a few new boxes that have an LSI MegaRAID controller. As of FreeBSD 8.0, there is a much friendlier tool included called mfiutil. It saves you from having to install sysutils/megacli and learn its rather complicated arguments (at first, anyway; after using it for a little while they make sense). Now, I know I set up a RAID10 that spans three 2-disk arrays. However, mfiutil is reporting it as a RAID1 volume.

After setting up a RAID10 array using the WebBIOS tool and installing FreeBSD, I ran mfiutil show volumes to make sure I was referencing the right virtual drive, and to my astonishment I saw:

mfi0 Volumes:
  Id     Size    Level   Stripe  State   Cache   Name
 mfid0 (  408G) RAID-1      64K OPTIMAL Enabled 

Say WHAT?! That volume is NOT supposed to be RAID1. So I ran mfiutil show config to get some more details about my configuration.

mfi0 Configuration: 3 arrays, 1 volumes, 0 spares
    array 0 of 2 drives:
        drive 8 (  137G) ONLINE <SEAGATE ST3146356SS 0005 serial=SERIAL> SAS enclosure 1, slot 0
        drive 9 (  137G) ONLINE <SEAGATE ST3146356SS 0005 serial=SERIAL> SAS enclosure 1, slot 1
    array 1 of 2 drives:
        drive 10 (  137G) ONLINE <SEAGATE ST3146356SS 0005 serial=SERIAL> SAS enclosure 1, slot 4
        drive 11 (  137G) ONLINE <SEAGATE ST3146356SS 0004 serial=SERIAL> SAS enclosure 1, slot 5
    array 2 of 2 drives:
        drive 12 (  137G) ONLINE <SEAGATE ST3146356SS 0004 serial=SERIAL> SAS enclosure 1, slot 6
        drive 13 (  137G) ONLINE <SEAGATE ST3146356SS 0005 serial=SERIAL> SAS enclosure 1, slot 7
    volume mfid0 (408G) RAID-1 64K OPTIMAL spans:
        array 0
        array 1
        array 2

Alright, that looks right, but it still says RAID1 as well. Now this kind of bugs me, since I even followed the MegaRAID SAS User Guide to set up the RAID10 array. Alright, let’s get mfiutil to spit out some debug output, which requires applying a patch and recompiling.

cd /usr/src/usr.sbin/mfiutil
make clean
patch < fix-mfiutil-debug.diff
make -DDEBUG
/usr/obj/usr/src/usr.sbin/mfiutil/mfiutil debug

Ok, now this looks right to me. Primary RAID level is 1 and secondary RAID level is 0.

mfi0 Configuration (Debug): 3 arrays, 1 volumes, 0 spares
  array size: 288
  volume size: 256
  spare size: 40
    array 0 of 2 drives:
      size = 285155328
        drive 8 ONLINE
          raw size: 286749488
          non-coerced size: 285700912
          coerced size: 285155328
        drive 9 ONLINE
          raw size: 286749488
          non-coerced size: 285700912
          coerced size: 285155328
    array 1 of 2 drives:
      size = 285155328
        drive 10 ONLINE
          raw size: 286749488
          non-coerced size: 285700912
          coerced size: 285155328
        drive 11 ONLINE
          raw size: 286749488
          non-coerced size: 285700912
          coerced size: 285155328
    array 2 of 2 drives:
      size = 285155328
        drive 12 ONLINE
          raw size: 286749488
          non-coerced size: 285700912
          coerced size: 285155328
        drive 13 ONLINE
          raw size: 286749488
          non-coerced size: 285700912
          coerced size: 285155328
    volume mfid0 RAID-1 OPTIMAL
      primary raid level: 1
      raid level qualifier: 0
      secondary raid level: 0
      stripe size: 7
      num drives: 2
      init state: 0
      consistent: 1
      no bgi: 0
      spans:
        array 0 @ 0 : 285155328
        array 1 @ 0 : 285155328
        array 2 @ 0 : 285155328
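The sizes in this output also make a quick sanity check possible. The sketch below assumes the sizes are counts of 512-byte sectors (the output does not state the unit) and that a striped volume's capacity is simply the sum of its equal-sized mirrored arrays. The function names are made up for illustration and are not part of mfiutil.

```c
#include <stdint.h>

/* Assumption: the debug output reports sizes in 512-byte sectors. */
#define SKETCH_SECTOR_BYTES 512ULL

/* Usable bytes of a volume striped across `spans` equal-sized arrays. */
uint64_t
sketch_striped_bytes(uint64_t array_sectors, unsigned spans)
{
	return (array_sectors * spans * SKETCH_SECTOR_BYTES);
}

/* Round a byte count to the nearest whole GiB. */
uint64_t
sketch_round_gib(uint64_t bytes)
{
	const uint64_t gib = 1ULL << 30;

	return ((bytes + gib / 2) / gib);
}
```

Under those assumptions, three arrays of 285155328 sectors come to about 408G, which matches the volume size mfiutil reports, while a single mirrored array would only be about 136G. So the capacity is already consistent with data striped across three mirrors, whatever the label says.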

So why is it still reporting RAID1? Well, let’s dive into the source of mfiutil. Ah-ha! The issue stems from mfi_cmd.c:

const char *
mfi_raid_level(uint8_t primary_level, uint8_t secondary_level)
{
        static char buf[16];

        switch (primary_level) {
        case DDF_RAID0:
                return ("RAID-0");
        case DDF_RAID1:
                if (secondary_level != 0)
                        return ("RAID-10");
                else
                        return ("RAID-1");

Well, okay then. That is only partially correct. According to the Common RAID Disk Data Format Specification, assuming that a RAID10 volume always has a non-zero secondary level is incorrect. I dug a little deeper into the issue and found this thread on the freebsd-current mailing list. The post that made it clear why mfiutil makes this assumption was by John Baldwin:

Previous RAID-10 volumes that I've seen MFI BIOSes create used a non-zero 
secondary raid level (they all used '3', which is what mfiutil uses to 
create RAID-10 volumes itself). 

-- 
John Baldwin 

I also found a thread on the Dell Linux-PowerEdge mailing list in which someone I believe to be a Dell rep says that their PERC controllers (which are based on MegaRAID) report RAID levels according to the SNIA DDF standard. So I followed his suggestion and started reading the SNIA DDF specification I linked above.

On Page 84:


4.3 Secondary RAID Level

Table 15 lists values used in the Secondary_RAID_Level field of the Virtual Disk Configuration Record
(Section 5.9.1) and their definitions. The table defines secondary RAID levels such as Striped, Volume
Concatenation, Spanned, and Mirrored for hybrid or multilevel virtual disks. The Secondary_RAID_Level
field in the Virtual Disk Configuration Record MUST use the values defined in Table 15.

Table 15: Secondary RAID Levels

Name           SRL Byte   Description
Striped        0x00       Data is striped across Basic VDs. First strip stored on
                          first BVD and next on next BVD.
                          NOTE: BVD sequence is determined by the
                          Secondary_Element_Seq field in the Virtual Disk
                          Configuration Record (Section 5.9.1).
Mirrored       0x01       Data is mirrored across Basic VDs.
Concatenated   0x02       Basic VDs combined head to tail.
Spanned        0x03       A combination of striping and concatenations involving
                          Basic VDs of different sizes.
                          NOTE: BVD sequence is determined by the
                          Secondary_Element_Seq field in the Virtual Disk
                          Configuration Record (Section 5.9.1).

So now it all makes sense. The WebBIOS tool is using a secondary level of 0 because my VDs are all the same size, so it has no reason to instruct the controller to use secondary level 3. Now I don’t have to worry that I may have configured my array incorrectly.
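To make the ambiguity concrete: a secondary level of 0 means "Striped" in Table 15, but it is also what a plain single-array RAID1 volume would carry, so the two level bytes alone cannot tell RAID1 from RAID10. Here is a minimal sketch of one way a tool could disambiguate, by also looking at how many arrays the volume spans. This is only an illustration under my own assumptions, not mfiutil's actual code or fix; every `sketch_`/`SKETCH_` name is made up, and the extra `span_count` parameter is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Primary RAID levels; RAID-1 == 1 matches the debug output above. */
enum sketch_prl {
	SKETCH_PRL_RAID0 = 0x00,
	SKETCH_PRL_RAID1 = 0x01
};

/* Secondary RAID levels, values taken from DDF Table 15. */
enum sketch_srl {
	SKETCH_SRL_STRIPED      = 0x00,
	SKETCH_SRL_MIRRORED     = 0x01,
	SKETCH_SRL_CONCATENATED = 0x02,
	SKETCH_SRL_SPANNED      = 0x03
};

/*
 * Hypothetical helper: name a volume from its DDF levels plus the
 * number of arrays it spans. Anything it does not recognize is
 * reported as raw level numbers instead of being guessed at.
 */
const char *
sketch_raid_level(uint8_t primary, uint8_t secondary, unsigned span_count)
{
	static char buf[32];

	switch (primary) {
	case SKETCH_PRL_RAID0:
		return ("RAID-0");
	case SKETCH_PRL_RAID1:
		/* One array: a plain mirror, whatever the secondary says. */
		if (span_count <= 1)
			return ("RAID-1");
		/* Several mirrors, striped or spanned together. */
		if (secondary == SKETCH_SRL_STRIPED ||
		    secondary == SKETCH_SRL_SPANNED)
			return ("RAID-10");
		break;
	default:
		break;
	}
	snprintf(buf, sizeof(buf), "PRL %u / SRL %u",
	    (unsigned)primary, (unsigned)secondary);
	return (buf);
}
```

With the values from my debug output (primary 1, secondary 0, three spans), this sketch would print RAID-10, and it would also cover the secondary level 3 volumes John Baldwin mentions.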