Tuesday, September 30, 2008

How To Resolve Veritas Disk Group Cluster Volume Management Problems On Linux or Unix

Hey There,

Today we're going to look at an issue that, while it doesn't happen all that often, happens just enough to make it post-worthy. I've only seen it a few times in my "career," but I don't always have access to the fancy software, so this problem may be more widespread than I've been led to believe ;) The issue we'll deal with today is: What do you do when disk groups, within a cluster, conflict with one another? Or, more correctly, what do you do when disk groups within a cluster conflict with one another even though all the disk is being shared by every node in the cluster? If that still doesn't make sense (and I'm not judging "you," it just doesn't sound right to me, yet ;) what do you do in a situation where every node in a cluster shares a common disk group and, for some bizarre reason, this creates a conflict between nodes, with some of them refusing to use the disk even though it's supposed to be accessible through every single node? Enough questions... ;)

Check out these links for a smattering of other posts we've done on dealing with Veritas Volume Manager and fussing with Veritas Cluster Server. Some of the material covered may be useful if you have problems with any of the concepts glossed over in the problem resolution at the end.

Like I mentioned, this "does" happen from time to time, and not for the reasons you might generally suspect (like one node having a lock on the disk group and refusing to share, etc). In fact, the reason this happens sometimes (in this very particular case) is quite interesting. Even quite disturbing, since you'd expect that this shouldn't be able to happen.

Here's the setup, and another reason this problem seems kind of confusing. A disk group (we'll call it DiskGroupDG1 because we're all about creativity over here ;) is being shared between 2 nodes in a 2-node cluster. Both nodes have Veritas Cluster Server (VCS) set up correctly and no other problems with Veritas exist. If the DiskGroupDG1 disk group is imported on Node1, using the Cluster Volume Manager (CVM), it can be mounted and accessed by Node2 without any issues. However, if DiskGroupDG1 is imported on Node2, using CVM, it cannot be mounted or accessed by Node1.

All things being equal, this doesn't readily make much sense. There are no disparities between the nodes (insofar as the Veritas Cluster and Volume Management setup are concerned) and things should be just peachy going one way or the other. So, what's the deal, then?

The problem, actually, has very little to do with VCS and/or CVM (although they're totally relevant and deserve to be in the title of the post -- standard disclaimer ;). The actual issue has to do, mostly, with minor disk numbering on the Node1 and Node2 servers. What???

Here's what happens:
In the first scenario (where everything's hunky and most everything's dorey), the DiskGroupDG1 disk group is imported by CVM on Node1, and Node1 notices that the "minor numbers" of the disks in the disk group are exactly the same as the "minor numbers" on disks it already has mounted locally. You can always find a disk's (or any other device's) minor number by using the ls command on Linux or Unix, like so:

host # /dev/dsk # ls -ls c0t0d0s0
2 lrwxrwxrwx 1 root root 41 May 11 2001 c0t0d0s0 -> ../../devices/pci@1f,4000/scsi@3/sd@0,0:a
host # /dev/dsk # ls -ls ../../devices/pci@1f,4000/scsi@3/sd@0,0:a
0 brw-r----- 1 root sys 32, 0 May 11 2001 ../../devices/pci@1f,4000/scsi@3/sd@0,0:a
<-- In this instance, the device's "major number" is 32 and the device's "minor number" is 0. Generally, with virtual disks, etc, you won't see numbers that low.
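On Linux, if you'd rather not chase the symlink through /devices by hand, stat can print the major and minor numbers directly (it reports them in hex). A quick sketch, using /dev/null only because it's a device everybody has:

```shell
# Read a device's major/minor with stat on Linux:
#   %t = major number (hex), %T = minor number (hex).
# /dev/null is just a stand-in; point $dev at any block device.
dev=/dev/null
major_hex=$(stat -c %t "$dev")
minor_hex=$(stat -c %T "$dev")

# Convert from hex and print. On most Linux boxes /dev/null is major 1, minor 3.
echo "$dev major=$((0x$major_hex)) minor=$((0x$minor_hex))"
```

Your Veritas volumes (under /dev/vx/dsk) will show numbers well above the low ones in the ls output above, as noted.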

Now, since Node1 recognizes this conflict on import, it does what Veritas VM naturally does to avoid conflict; it renumbers the imported volumes ("minor number" only) so that the imported volumes won't conflict with volumes in another disk group that's already resident on the system it's managing. Therefore, when Node2 attempts to mount with CVM, the command is successful.
In the second scenario (where things are a little bit hunky, but not at all dorey), Node2 imports the DiskGroupDG1 disk group and none of the minor numbers of that disk group's volumes conflict with any of its local (or already mounted) disks. The disk group volumes are imported with no errors, but the "minor numbers" aren't changed, either, even temporarily. You see where this is going. It's a freakin' train wreck waiting to happen ;)

Now, when Node1 attempts to mount, it determines there's a conflict, but can't renumber the "minor numbers" on the disk group's volumes (since they're already imported and mounted on Node2) and, therefore, takes the only other course of action it can think of and bails completely.

So, how do you get around this once and for all? Well, I'm not sure it's entirely possible to anticipate this problem with a variable number of nodes in a cluster, all with independent disk groups and also sharing volume groups between nodes, although you could take simple measures to prevent it most of the time (like running ls against every volume in every disk group in the cluster every now and again and making sure no conflicts exist. The script should be pretty easy to whip up).
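And that script really is easy to whip up. Here's a minimal sketch: the function flags any two devices reporting the same major,minor pair. The demo loop below feeds it /dev/null and /dev/zero as stand-ins; on a real cluster you'd feed it every volume under /dev/vx/dsk on every node instead.

```shell
# Flag devices that share a major,minor pair.
# stdin: one "name major minor" triple per line.
find_minor_conflicts() {
    awk '{
        key = $2 "," $3
        if (key in seen)
            print "CONFLICT: " $1 " and " seen[key] " share " key
        else
            seen[key] = $1
    }'
}

# Hypothetical usage: substitute your real volume paths for these two
# stand-in devices (which, happily, do not conflict with each other).
for dev in /dev/null /dev/zero; do
    printf '%s %d %d\n' "$dev" \
        "$((0x$(stat -c %t "$dev")))" "$((0x$(stat -c %T "$dev")))"
done | find_minor_conflicts
```

Run it from cron every now and again and you'll catch a clash before CVM does ;)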
Basically, in this instance (and any like it), the solution involves doing what Veritas VM did in the first scenario, except doing it all the way. No temporary changing of "minor numbers." For our purposes, we'd like to change them permanently, so that they never conflict again! It can be done in a few simple steps.

1. Stop VCS on the problem node first.

2. Stop any applications using the local disk group whose "minor numbers" conflict with the "minor numbers" of the volumes in DiskGroupDG1.

3. Unmount (umount) the filesystems and deport the affected disk group.

4. Now, pick a new "minor number" that won't conflict with the DiskGroupDG1 "minor numbers." Higher is generally better, but I'd check the minor numbers on all the devices in my device tree just to be sure.

5. Run the following command against your local disk group (named, aptly, LocalDG1 ;) :

host # vxdg reminor LocalDG1 3900 <-- Note that this number is the base, so every volume, past the initial, within the disk group will have a "minor number" one integer higher than the last (3900, 3901, etc)

6. Reimport the LocalDG1 disk group.

7. Remount your filesystems, restart your applications and restart VCS on the affected node.

8. You don't have to, but I'd do the same thing on all the nodes, if I had a window in which to do it.
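Strung together, the whole window looks something like the sketch below. It's a dry run (every command is just echoed, in the order the steps above describe); the disk group name and base minor are the example values from this post, the mount point is a made-up stand-in, and step 2 (stopping your applications) is site-specific, so it's only a comment here.

```shell
# Dry-run sketch of steps 1-8. Swap in your own disk group, base minor,
# and mount point; remove the echo in run() when you mean it for real.
DG=LocalDG1             # local disk group whose minors clash (example name)
BASE=3900               # new base minor -- check the whole device tree first!
MNT=/local/fs           # hypothetical mount point for the group's filesystem

run() { echo "+ $*"; }  # dry run: print each command instead of executing it

run hastop -local               # 1. stop VCS on the problem node
                                # 2. stop apps using the disk group (site-specific)
run umount "$MNT"               # 3. unmount the filesystems...
run vxdg deport "$DG"           #    ...and deport the affected disk group
run vxdg reminor "$DG" "$BASE"  # 4/5. volumes become 3900, 3901, ...
run vxdg import "$DG"           # 6. reimport LocalDG1
run mount "$MNT"                # 7. remount the filesystems...
run hastart                     #    ...restart apps (site-specific) and VCS
```

Repeat on the other nodes, per step 8, if you've got the window for it.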

And, that would be that. Problem solved.

You may never ever see this issue in your lifetime. But, if you do, hopefully, this page (or one like it) will still be cyber-flotsam on the info-sea ;)

Cheers,

Mike




Please note that this blog accepts comments via email only. See our Mission And Policy Statement for further details.