Sunday, April 19, 2009

Bug 7171446 - NUMA Issues from 10.2.0.4 Patchset

Here is a NUMA bug introduced in 10.2.0.4. If you are running HP-UX IA64 and your system starts locking up, or you hit a number of unexpected problems such as skewed CPU usage and ORA-600 errors, review note 7171446.8 on MetaLink for the details.

Unless the system has been specifically set up and tuned for NUMA at both the OS and database levels, disable the Oracle NUMA optimizations by setting the following in the pfile / spfile / init.ora used to start the instances:

_enable_NUMA_optimization=FALSE
_db_block_numa=1
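
If your instances start from an spfile rather than a plain pfile, here is a minimal sketch of setting these; the connection and restart commands below are placeholders for your environment:

# Sketch: set the workaround parameters in the spfile, then restart the instances.
sqlplus -s / as sysdba <<'EOF'
alter system set "_enable_NUMA_optimization"=FALSE scope=spfile sid='*';
alter system set "_db_block_numa"=1 scope=spfile sid='*';
EOF
# A restart of each instance is required, e.g.:
# srvctl stop database -d <dbname>
# srvctl start database -d <dbname>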

At some point, Oracle should release a patch for bug 7171446. Once it is released, we will install the patch and remove the hidden parameters from the parameter files.

CRS Log Directory Permissions

During this last RAC install, we ran into an issue where the CRS home log directory on one of the nodes had the wrong owner, group, and permissions. So, CRS couldn't log an issue that it was having.

Here is a hint: if you are having CRS issues, always check first that all of the directories the CRS processes need for creating their log files actually exist.

What was totally crazy is that the owner, group, and permissions on the other node were right. We don't know if they got goofed up during the 10.2.0.4 patchset install or during one of the Clusterware merge patches. We never did figure out the why of it.
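
As a rough example of the kind of check we do now, something along these lines works; the CRS home path and node names are placeholders:

# Sketch: compare the CRS log directory ownership and permissions on every node.
CRS_HOME=/app/oracle/product/10.2.0/crs
for node in node1 node2; do
  echo "== $node =="
  ssh $node "ls -ld $CRS_HOME/log $CRS_HOME/log/*/*"
done
# Then, as root on the bad node, set the owner, group, and permissions to
# match the node that is working.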

11.1 CLUVFY HP-UX ia64 Issues

For our latest install of 10.2 RAC on HP-UX 11.31 Itanium 64-bit, we used the 11g version of the Cluster Verification Utility (cluvfy). Here are the issues that we had. First, it complained about the MTU values being different between the private and public interfaces:

Different MTU values used across network interface(s).

These different values are intentional: we use Infiniband for the private interconnect, and you want the larger Maximum Transmission Unit (MTU) on that more robust link, while the public interface is standard gigabit Ethernet, so a lower value makes sense there. We basically ignored the error, because dropping the Infiniband MTU just to get a clean cluvfy run before installing the Clusterware is not practical. For more info on MTU and Oracle RAC, see MetaLink note 341788.1.

This is a known bug discussed in MetaLink note 758102.1; the root cause is bug 7493420, fixed in 11.2. The configuration is valid because each interface has the same MTU across all of the nodes. As long as the interfaces across the nodes have matching MTUs, you are good to go.
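
A quick way to sanity-check this across the cluster (node names are placeholders; compare the MTU column for each interface):

# Sketch: confirm each interface has the same MTU on every node.
for node in node1 node2; do
  echo "== $node =="
  ssh $node "netstat -in"
done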

The next issue had to do with shared storage. On HP-UX ia64 11.31, we created shared raw ASM LUNs and then created aliases to those LUNs for the ASM diskstring. The storage is shared between the two nodes on an EVA8000 array. Cluvfy does not recognize that the shared storage is available and working correctly, so the failure message can be ignored. Here is the message:

Shared storage check failed on nodes "xxxxx"

In the known limitations section of the cluvfy readme, it clearly states the following:

"Sharedness check of SCSI disks is currently not supported."

If these are SCSI disks, then this error is expected, because cluvfy cannot handle the check. As long as the disks can be seen from both nodes and have the correct permissions and ownership, ASM should install and work fine.

As long as your storage is working correctly, you can ignore the shared storage check, because cluvfy is not able to verify multipath / autopath type software like that built into HP-UX 11.31 using virtual disk devices on EVA8000 storage.
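
If you want a manual sanity check before ignoring the message, something like this works as a sketch; the device alias names are placeholders for your ASM disks:

# Sketch: confirm both nodes see the same raw device aliases with the right
# ownership and can actually read from them.
for node in node1 node2; do
  echo "== $node =="
  ssh $node "ls -lL /dev/rdsk/asm_disk* ; dd if=/dev/rdsk/asm_disk01 of=/dev/null bs=8192 count=1"
done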

HP-UX Async IO

After we installed a 10.2.0.4 RAC database on an HP-UX Itanium 64-bit platform, we noticed some errors related to asynchronous IO in the database trace files. Here is the message:

Ioctl ASYNC_CONFIG error, errno = 1

After further analysis on MetaLink, and with assistance from Support, we determined that asynchronous IO was not configured. The following are the steps we ran as root to resolve the issue:

  • Created /etc/privgroup and added the following entries to the file:
  • dba RTPRIO RTSCHED MLOCK
  • oinstall RTPRIO RTSCHED MLOCK
  • /usr/sbin/setprivgrp -f /etc/privgroup
  • getprivgrp dba
  • getprivgrp oinstall
  • cd /dev/async
  • chown oracle:dba async
  • chmod 660 async
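
A quick way to confirm the change took effect (the group and device names are the ones from the steps above):

getprivgrp dba          # should list RTPRIO RTSCHED MLOCK
getprivgrp oinstall
ls -l /dev/async        # should show oracle:dba with mode 660
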
This was an interesting issue because of the Oracle 10gR2 documentation: the Oracle Clusterware and Oracle Real Application Clusters Installation Guide for HP-UX doesn't include these procedures, but the Administrator's Reference for UNIX-Based Operating Systems does, in an appendix for HP-UX.

We just had to say "Isn't that interesting..."

Saturday, July 07, 2007

ASM Disk Group and Securepath

Here is a point of confusion that I've seen during ASM installs on HP-UX 11.23 using Securepath virtual devices. On one install, we used one of the underlying /dev/rdsk/xxxx paths for each LUN from the autopath display to create the ASM disk groups. If you look at an autopath display, you will see the Device Path listed at the end of each LUN WWN. That is what we used, and it took us a while to get it straight.

The best practice is to ignore the individual devices in the path and use the Securepath virtual device file instead when creating your ASM disk groups, for example /hpap/dsk/hpap??.

The virtual device file takes away the confusion of the underlying devices, which are just multiple paths to the same LUN; Securepath handles the rest. Also, by using the virtual device file, the ASM disk group automatically shows up on all instances that use the same virtual device file.
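
As a rough sketch of what that looks like when creating a disk group (run against the ASM instance; the diskstring, disk group name, and device names below are placeholders):

# Sketch: point ASM at the Securepath virtual device files, not the
# underlying /dev/rdsk paths. Names below are placeholders.
export ORACLE_SID=+ASM1
sqlplus -s / as sysdba <<'EOF'
alter system set asm_diskstring='/hpap/dsk/hpap*';
create diskgroup DATA external redundancy
  disk '/hpap/dsk/hpap01', '/hpap/dsk/hpap02';
EOF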

Cannot Lock /etc/mnttab

Here is a strange issue. On HP-UX Itanium, we couldn't reach some of our directories on the EVA5000 storage. Whenever we tried to list the contents of a couple of specific mount points, it would lock up our PuTTY session. We were also getting the following error whenever we tried a bdf or df command:

bdf: cannot lock /etc/mnttab; still trying ...
bdf: cannot lock /etc/mnttab; still trying ...
bdf: cannot lock /etc/mnttab; still trying ...
bdf: cannot lock /etc/mnttab; still trying ...

The /etc/mnttab file is the mounted file system table. Our SAN was showing that 3 disks had failed in the same 7-disk group. Oracle was trying to expdp to the affected mount points while NetBackup was trying to read from them, and we couldn't get either Oracle or NetBackup to shut down.

HP physically went and pulled one of the bad disks. When the bad disk was physically ejected, Oracle came down and so did NetBackup. It turns out there was only one bad disk in the Data Replication Group: it had not ejected itself cleanly and hung up the other two disks, so it looked like three disks had failed. Once the bad disk was replaced, the disk group began leveling as normal, /etc/mnttab became available again, and all data was present. There was no data loss. We started Oracle and everything looked fine. So what looked like a bad issue turned out to be a single disk that was not ejected cleanly from the disk group and that also locked up the /etc/mnttab file.

Go figure; I'm still not impressed with HP storage, especially the EVA5000. We have had lots of bad issues with EVA5000 storage, everything from strange things like the above to data corruption of the Oracle database to losing mount points.

Tuesday, October 03, 2006

Opatch with RAC

I got to use Opatch for RAC today to install some one-off patches. That is a subject for later. Man, what a lot of bugs in 10.2. Opatch is cluster-aware and will propagate the changes to the other nodes and relink Oracle on them. There is an issue, however: you cannot have UNIX banners configured for ssh. This is the same issue we ran into on the Clusterware install, so we had to rename /etc/issue.net in order to get Opatch to propagate the changes to the other nodes. Opatch actually works really well.
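
A minimal sketch of the workaround (the banner file handling needs root; the patch location is a placeholder):

# Sketch: move the ssh banner aside, apply the patch cluster-wide, then
# restore the banner. Opatch will offer to propagate to the other nodes.
mv /etc/issue.net /etc/issue.net.save
cd /stage/patches/<patch_number>
$ORACLE_HOME/OPatch/opatch apply
mv /etc/issue.net.save /etc/issue.net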

Wednesday, September 27, 2006

ORA-29701 Cluster Manager

I had something strange occur today. A user query kept returning ORA-29701: unable to connect to Cluster Manager, so I searched MetaLink for the error but couldn't find anything that applied to my situation (10gR2 RAC on HP-UX Itanium). The error only showed up on one of the nodes; the other nodes and instances were fine.

After stopping everything on the node, the ASM instance would not start and immediately barked at me with the same ORA-29701 error. At this point I asked someone else that has more experience with Oracle Clusterware than I do.

They checked it out and found that somehow the ASM /dev/rdsks and special files for the OCR and voting disks had changed ownership. Someone with root access must have run insf -e to reinstall the special files. Oh, great!

A sys admin had already created a shell script to change the ownership of the /dev/rdsks to oracle:dba and chmod them to 660. So, all we had to do was ask one of our sysadmins to run the script. They also had to manually chown root:oinstall /dev/voting and chown oracle:oinstall /dev/ocr.
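
The script itself is nothing fancy; here is a minimal sketch, with placeholder device names:

#!/bin/sh
# Sketch: run as root to restore ownership and permissions on the ASM raw
# devices after something like insf -e has reset them.
for dev in /dev/rdsk/asm_disk01 /dev/rdsk/asm_disk02; do
  chown oracle:dba $dev
  chmod 660 $dev
done
# The OCR and voting devices have their own ownership; restore those
# separately as noted above.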

So, if you get an ORA-29701 and can't figure it out, check the owner and permissions of your Oracle devices.

Sunday, September 10, 2006

Database Control RMAN Wizard Bug

Here is a bug that I found in our 10gR1 systems. It has to do with Enterprise Manager (EM) Database Control 10.1.0.5 and scheduling backups with the Recovery Manager (RMAN) wizard. Whenever you do a custom RMAN backup and choose to back up only certain tablespaces, the UNDO tablespace is never offered as an option.

Now, without the undo tablespace, database recovery is not possible. However, the EM RMAN wizard gives you no way to add the undo tablespace to the list of tablespaces selected for backup.

So, here is the workaround:

  • Modify the RMAN script that is created and manually include the undo tablespace in your list of tablespaces to be backed up. Then submit your job.
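
For example, the edited backup command ends up looking something like this (the tablespace names, including the undo tablespace name, are placeholders), whether you resubmit it through EM or run it straight from the RMAN command line:

# Sketch: tablespace backup that explicitly includes the undo tablespace.
rman target / <<'EOF'
run {
  backup tablespace USERS, EXAMPLE, UNDOTBS1;
}
EOF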

I will be creating some RMAN backups for a 10.2 database soon and will let you know if this bug is also in 10gR2.

Saturday, September 09, 2006

Old Dog New Tricks

With all of the assistance I've had lately, this old dog has learned some new tricks. Now, everybody I'm sure already knows these; however, in my career I've had to learn things by the seat of my pants and on the job in production.

tnsnames.ora changes do not require a restart of the listener. Kind of cool. And my favorite is "lsnrctl reload", which rereads the listener configuration and resets the listener without stopping and starting it.

Friday, September 08, 2006

HP EVA8000 Autopath Tuning

Here is some information that might be useful to you. Following our RAC cluster setup and install, the EVA8000 performance was very disappointing. First, Oracle performance was poor. Then we ran some more I/O benchmarks using "dd" and found that raw I/O performance was actually worse than our first baseline benchmark, which pointed to a problem with the underlying storage rather than the Oracle setup.
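
The benchmarks themselves were nothing fancy; roughly along these lines (the device name, block size, and count are placeholders):

# Sketch: crude sequential read test against one of the raw LUNs.
timex dd if=/dev/rdsk/c10t0d1 of=/dev/null bs=1024k count=1024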

The UNIX administrators, HP, and my peers began looking into the problem. I wasn't totally engaged, but I want to post the information anyway. Props to my colleagues. This is more system administration territory; however, as DBAs we need to know the impact of EVA tuning.

HP checked the load balancing policy on the nodes. It was set to “No Load Balancing” which greatly impacts performance. Now, how it was changed to "No Load Balancing" after the first I/O benchmarks is still a mystery.

To see the LUNs, issue the command “autopath display” as root. This will list all the LUNs and show the HP Autopath load balancing policy.

root@hostname:/ # autopath display
...
Load Balancing Policy : No Load Balancing
...

So, HP and my peers recommended that the HP Autopath load balancing policy be set to round robin for all LUNs. The autopath set_lbpolicy command sets the load balancing policy for the specified device path.

autopath set_lbpolicy <{policy name} {path}>
description: sets load balancing policy
usage: autopath set_lbpolicy <{policy name} {path}>
Policy name: The load balancing policy to set
Valid policies are
RR : Round Robin.
SST : Shortest Service Time.
SQL : Shortest Queue Length.
NLB/OFF : No load Balancing.
Path : Device Special File e.g./dev/dsk/c#t#d#

Example:
# autopath set_lbpolicy RR /dev/dsk/c0t0d0
The example above sets the policy to Round Robin.

Here is a little more information about what the policies mean:
  • RR : Round Robin - I/O is routed through the paths of the active controller in round-robin order.
  • SST : Shortest Service Time - I/O is routed to the path with the lowest average service time for its outstanding I/Os.
  • SQL : Shortest Queue Length - I/O is routed to the path with the shortest device queue depth.
  • NLB/OFF : No Load Balancing - no consideration is given to service times or queue lengths; this typically hurts performance.
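
To apply the change across the board, a loop like the following works as a sketch (the device list is a placeholder; take it from the "autopath display" output):

# Sketch: set the load balancing policy to Round Robin for each LUN.
for dsf in /dev/dsk/c10t0d1 /dev/dsk/c10t0d2 /dev/dsk/c10t0d3; do
  autopath set_lbpolicy RR $dsf
done
autopath display    # confirm the policy no longer shows "No Load Balancing"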

Friday, August 25, 2006

ASM Instance Changes

Here are some changes that we had to make after using DBCA to create our ASM instances:

First, we had to increase the large pool to 100M. Next, we had to increase the number of processes: the processes parameter should be the default (40) times the number of nodes in the Oracle cluster (40 * number of nodes).
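
If your ASM instances use an spfile, here is a minimal sketch of the change (run against each ASM instance; the value 80 assumes a two-node cluster, and a restart is needed for the new processes setting). Otherwise, add the equivalent lines to each init+ASMn.ora:

# Sketch: bump large_pool_size and processes on the ASM instances.
export ORACLE_SID=+ASM1
sqlplus -s / as sysdba <<'EOF'
alter system set large_pool_size=100M scope=spfile sid='*';
alter system set processes=80 scope=spfile sid='*';
EOF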

Two MetaLink Notes - Bookmark

Here are two more MetaLink notes to bookmark:
  1. 368055.1 Deployment of very large databases (10TB to PB range) with Automatic Storage Management (ASM)
  2. 139272.1 HP-UX: Asynchronous i/o

I found both of these notes useful during our recent VLDB 10g RAC HP-UX Itanium install. A friend of mine at work found the second note and told me about it. Thanks.

Tuesday, August 22, 2006

We Have An Oracle Cluster

Now that we have an Oracle cluster and RAC database, here is a quick high-level overview of the RAC environment from the Oracle documentation:

As a DBA, after installation your tasks are to administer your RAC environment at three levels:
  • Instance Administration
  • Database Administration
  • Cluster Administration


For administering Real Application Clusters, use the following tools to perform administrative tasks in RAC:

  • Cluster Verification Utility (CVU)—Install and use CVU before you install RAC to ensure that your configuration meets the minimum RAC installation requirements. Also use the CVU for on-going administrative tasks, such as node addition and node deletion.
  • Enterprise Manager—Oracle recommends that you use Enterprise Manager to perform administrative tasks whenever feasible.
  • Task-specific GUIs such as the Database Configuration Assistant (DBCA) and the Virtual Internet Protocol Configuration Assistant (VIPCA)
  • Command-line tools such as SQL*Plus, Server Control (SRVCTL), the Oracle Clusterware command-line interface, and the Oracle Interface Configuration tool (OIFCFG)

I've got plenty of posts saved up and will enter them either tonight or soon. It has been crazy at work. Funny, how you learn so much more while in a storm than you do when things are calm. It is like that in life as well as at work. God has a way of tempering you in the storms that you face and bringing you thru them stronger and better than before. Man, I could preach on that especially with things going on in my personal life right now.

Tuesday, August 15, 2006

ORA-00600 [KGHALO4]

We got an ORA-00600 [KGHALO4] error today every time we tried to create the ASM instance with DBCA. We found that this is a bug in 10.2.0.1. See MetaLink Note: 340976.1.

Solutions:

Apply the one-off patch for bug 4414666, or set _enable_NUMA_optimization=FALSE in the init.ora as a workaround. For an ASM instance, two parameters are needed:

_enable_NUMA_optimization=FALSE
_disable_instance_params_check=TRUE

Another solution is to apply the 10.2.0.2 patchset, which fixes this problem.

Because this bug happens whenever you are trying to create the ASM Instance, it would be advisable to just apply the one-off patch for the bug and then install the 10.2.0.2 patchset as normal once everything is created and working.

Monday, August 14, 2006

OUI Trace

Here is how we traced the Oracle Universal Installer. Execute the following and redirect the output to a file:

./runInstaller -J-DTRACING.ENABLED=true -J-DTRACING.LEVEL=2
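
For example, to capture the trace to a file while still watching it scroll by (the file name is arbitrary):

./runInstaller -J-DTRACING.ENABLED=true -J-DTRACING.LEVEL=2 2>&1 | tee /tmp/oui_trace.log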

We used this method to review what the OUI was doing and determine that the OUI saw ServiceGuard (SG) and SG libraries on the cluster nodes. See my previous posts for more information.

HP Autopath and Oracle

With the OCR and voting raw disks, we found that we had to create a common device special filename (an alias) that maps to the multiple paths to each disk.

/dev/rdsk/ocr
/dev/rdsk/voting

It was all because of HP Autopath and the virtual storage platform (EVA8000). The Oracle Installer only allows entry of one path to the OCR and voting disks, so we had to create the special filenames in order to use the OUI. mksf is the command to use; see the HP portion of Note: 293819.1.
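
Note 293819.1 has the actual mksf procedure; conceptually, the alias is just another character device special file carrying the same major and minor numbers as one of the underlying paths. As a rough illustration only (the path and numbers are placeholders):

# Illustration only: find the major/minor numbers of an existing path to the
# LUN, then create an alias special file with the same numbers.
ls -l /dev/rdsk/c10t0d1                  # note the major and minor numbers
mknod /dev/rdsk/ocr c <major> <minor>
# Set the ownership and permissions on the alias per the installation guide
# (the OCR and voting devices have different required owners).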

We are now wondering about the creation of the ASM disk groups. Will HP autopathing cause us a problem? We should know soon.

Saturday, August 12, 2006

Cleanup Failed CRS Install on HP-UX Itanium

Since we had to do this several times, the following is how to clean up after a failed CRS install on HP-UX Itanium (refer to MetaLink Note: 239998.1). There were problems with the rootdelete.sh and rootdeinstall.sh scripts, so we always had to clean up the failed install manually:
  1. srvctl stop nodeapps -n <nodename> (we didn't have to do this because our failed installs never got this far).
  2. As root:
  3. rm /sbin/init.d/init.cssd
  4. rm /sbin/init.d/init.crs
  5. rm /sbin/init.d/init.crsd
  6. rm /sbin/init.d/evmd
  7. rm /sbin/rc2.d/K001init.crs
  8. rm /sbin/rc2.d/K960init.crs
  9. rm /sbin/rc3.d/K001init.crs
  10. rm /sbin/rc3.d/K960init.crs
  11. rm /sbin/rc3.d/S960init.crs
  12. rm -Rf /var/opt/oracle/scls_scr
  13. rm -Rf /var/opt/oracle/oprocd
  14. rm /etc/inittab.crs
  15. cp /etc/inittab.orig /etc/inittab
  16. If they are not already down, kill the EVM, CRS, and CSS processes.
  17. rm -Rf /var/tmp/.oracle
  18. rm -Rf /tmp/.oracle
  19. remove the ocr.loc file
  20. rm -Rf <CRS install location>
  21. De-install the CRS home in the OUI
  22. Clean out the OCR and voting files with dd commands. Example:
  23. dd if=/dev/zero of=/dev/rdsk/voting bs=8192 count=2560
  24. dd if=/dev/zero of=/dev/rdsk/ocr bs=8192 count=12800
  25. rm -Rf /app/oracle/oraInventory

Once those are done, you can restart the OUI install of Clusterware from the very beginning. Oracle has also just released a new cleanup utility for failed CRS installs; here is the link:

http://download-west.oracle.com/otndocs/products/clustering/deinstall/clusterdeconfig.zip

The new script didn't work for us either. Of course, I didn't try it after our system admins removed ServiceGuard so it may work now.

Friday, August 11, 2006

Clusterware Install Tips HP-UX Itanium

OK, we finally have a working Oracle Cluster. Praise God! I wanted to add some of the new things that have been discovered since the last post.

Make sure that if you are not going to use HP ServiceGuard on your RAC cluster, all of ServiceGuard has been stopped and uninstalled. Don't leave any libraries lying around. We found this out the hard way, after spending over a week trying to figure out why the Oracle Installer was acting so crazy and bizarre.

Here are the symptoms. First, if the OUI does not give you the option to add the other nodes and you have to use a configuration file, that is a red flag that the OUI thinks you are using some vendor cluster software (in this case HP ServiceGuard) instead of Oracle's. Second, if you have variables that are not assigned (see previous post) in the rootconfig script, it indicates that the installer is not really trying to install; rather, it is trying to upgrade/update the OCR.

If, for some reason, you get the Clusterware services running on one of the nodes but they don't start on the others and lock up, it probably means that your removal of ServiceGuard was incomplete and left a few SG libraries lying around.

We found all of this out the hard way because HP installed and started ServiceGuard when they installed the HP 9000 Superdome!

Finally, here is an undocumented procedure for HP-UX Itanium Clusterware 10gR2 installation: shut down the VIP interface BEFORE beginning the install. If you don't, an error message will appear saying that the VIP interface is being used by another system, and you then have to shut the VIPs down before continuing the install. This is weird because the VIP must be up in order for cluvfy nodecon to work.
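
In practice that meant taking the logical interface carrying the VIP address down before launching the OUI; a rough sketch (the interface name and logical unit are placeholders for wherever the VIP address is plumbed):

# Sketch: bring the logical interface carrying the VIP address down before
# starting the Clusterware install, then confirm it shows as down.
ifconfig lan1:1 down
ifconfig lan1:1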

Friday, August 04, 2006

CVU Shared Storage Accessibility Check

We are almost ready to reinstall Clusterware. However, our cluvfy comp ssa check is returning the following in the cluvfy trace file:

ERROR>/tmp/9999//bin/lsnodes: cannot get local node number
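
For reference, the check we were running was along these lines (the node names and storage path are placeholders):

# Sketch: shared storage accessibility check across both nodes.
./runcluvfy.sh comp ssa -n node1,node2 -s /dev/rdsk/asm_disk01 -verbose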

Although the Cluster Verification Utility is an excellent tool for checking prerequisites, there is room for improvement. Since we did not run cluvfy as thoroughly on our first attempt, we did not encounter Bug 4714708 - Cvu Cannot See Shared Drives.

It turns out that CVU currently does not work with devices other than SCSI devices.