Saturday, July 07, 2007

ASM Disk Group and Securepath

Here is a point of confusion that I've seen during ASM installs on HP-UX 11.23 using Securepath virtual devices. On one install we created the ASM disk groups from the underlying /dev/rdsk/xxxx paths, picking one path per LUN from the autopath display. If you look at an autopath display, you will see the Device Path listed at the end of the Lun WWN. That is what we used, and it took us a while to get it straight.

The best practice is to ignore the individual devices in the path and use the secure virtual device file instead when creating your ASM disk groups, for example /hpap/dsk/hpap??

The secure virtual device takes away the confusion of the underlying devices, which are just separate paths to the same device. Securepath handles the rest. Also, by using the secure virtual device file, the ASM disk group automatically shows up on all instances that use the same virtual device file.
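To make the distinction concrete, here is a minimal sketch in Python that sorts a list of candidate device paths into Securepath virtual devices and underlying per-path devices. The /hpap/dsk/hpap?? pattern and the /dev/rdsk/ prefix come from the install described above; the function name and the sample device names are made up for illustration, so adjust them to whatever your own system shows.

```python
from fnmatch import fnmatch

# Pattern for the Securepath virtual device files, as used in this post.
VIRTUAL_PATTERN = "/hpap/dsk/hpap??"
# Prefix of the underlying per-path devices shown in the autopath display.
UNDERLYING_PREFIX = "/dev/rdsk/"


def split_candidates(paths):
    """Return (virtual, underlying) lists from a list of device paths."""
    virtual = [p for p in paths if fnmatch(p, VIRTUAL_PATTERN)]
    underlying = [p for p in paths if p.startswith(UNDERLYING_PREFIX)]
    return virtual, underlying


# Hypothetical device names, for illustration only.
candidates = ["/hpap/dsk/hpap01", "/dev/rdsk/c4t0d1", "/dev/rdsk/c6t0d1"]
virtual, underlying = split_candidates(candidates)
print("use for ASM disk groups:", virtual)
print("ignore (paths to the same device):", underlying)
```

Only the entries in the first list would be handed to ASM; the second list is there so you can see which paths were deliberately left out.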

Cannot Lock /etc/mnttab

Here is a strange issue. On HP-UX Itanium we couldn't reach some of our directories on the EVA5000 storage. Whenever we tried to list the contents of a couple of specific mount points it would lock up our PuTTY session. Also, we were getting the following error whenever we tried a bdf or df command:

bdf: cannot lock /etc/mnttab; still trying ...
bdf: cannot lock /etc/mnttab; still trying ...
bdf: cannot lock /etc/mnttab; still trying ...
bdf: cannot lock /etc/mnttab; still trying ...

The /etc/mnttab file is the mounted file system table. Our SAN was showing that 3 disks had failed in the same 7-disk group. Oracle was trying to expdp to the mount points and NetBackup was trying to read from the same mount points. We couldn't get either Oracle or NetBackup to shut down.
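Since a plain ls on a bad mount point can lock up your whole session, one option is to probe each mount point from a child process with a timeout. The following is a minimal sketch in Python, not something we ran during this incident; the probe function and its default timeout are my own choices. Note that a process stuck in uninterruptible I/O may ignore the kill, so the sketch deliberately does not wait for the child after the timeout.

```python
import subprocess
import sys
import time


def probe(cmd, timeout=5.0):
    """Run cmd in a child process; return 'ok', 'error', or 'hung'."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return "ok" if proc.returncode == 0 else "error"
        time.sleep(0.05)
    # Best effort only: a child blocked on dead storage may not exit,
    # so we do not wait for it here.
    proc.kill()
    return "hung"


if __name__ == "__main__":
    # Usage: pass the mount points to check as command-line arguments.
    for mount_point in sys.argv[1:]:
        print(mount_point, probe(["ls", mount_point]))
```

Run it with the suspect mount points as arguments. Anything reported as hung is a directory you do not want to touch from an interactive session.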

HP physically went and pulled out one of the bad disks. When the bad disk was physically ejected, Oracle came down and so did NetBackup. It turns out that there was only one bad disk in the Data Replication Group. The bad disk did not eject itself cleanly and hung up the other 2 disks, so it looked like there were 3 bad disks. Once the bad disk was replaced, the disk group began leveling as normal. The /etc/mnttab file became available and all data was present. There was no data loss. We started Oracle and everything looked fine. So what looked like a serious failure turned out to be a single disk that was not ejected cleanly from the disk group, which also locked up the /etc/mnttab file.

Go figure. I'm still not impressed with HP storage, especially the EVA5000. We have had lots of bad issues with EVA5000 storage, everything from strange things like the above to data corruption of the Oracle database to losing mount points.