09-25-2003 03:17 PM
I am having trouble using regions on a NUMA-enabled 2-node GS80 cluster running Tru64 5.1A with Oracle RAC 220.127.116.11. Each node has 10GB RAM and 8 CPUs.
There is no NUMA optimisation in 9i Release 1, so I don't have to rebuild the kernel, which is mandatory in 9i Release 2 if you want to use regions.
The important init.ora settings are,
Each node has approximately 1400 sessions connected to it. Strangely, I only see about 800 processes in the 'top' output. But the main issue is that I want to use regions for performance reasons, and the kernel crashes the moment I mount the database. Here are the kernel settings,
vm_swap_eager=0 (we want to use lazy mode)
The proc parameters are well tuned.
We reduced the ubc parameters after monitoring, and that is working well: I now have some free memory, whereas earlier the system was paging out active pages heavily. We arrived at this ubc value because of the very heavy load.
The kernel boots properly with no errors, but when we start the instance on node 1, the kernel panic message is
Can someone please let us know if there is anything wrong with the RAD settings?
09-25-2003 06:48 PM
Please open a case with your support center and provide the sys_check -escalate output. Also contact Oracle to make sure all required software patches and settings meet the requirements, and that the configuration (software/hardware) is supported with this Oracle version.
09-26-2003 06:28 AM
Oracle RAC with 9.0 was more or less a version 1 product. You really, really, really do NOT want to be using that version in production.
I suggest you try again with V5.1A PK4 or better and Oracle 9.2.0.x. If not, you risk wasting your time and the support system's time, because just too many things have been fixed in the very areas you are discussing.
Don't worry too much about NUMAtizing Oracle, as that benefit is only modest... AND requires multiple NICs and KGPSAs and a controlled SQLnet usage pattern.
And you are discussing quite a few things... rdg, RAC, Numa, UBC, GH all seem to be involved. It is not clear to me whether the rdg_wiring would have any relation to gh_rad_regions from your writeup.
It is clear to me that your gh settings are overly generous based on the init.ora settings: they suggest you use only 500MB of GH while allocating 2000MB. If that is correct, then you are wasting 1500MB of memory, as GH memory can only be used for shmget stuff.
Please verify with ipcs (and vmstat -P).
You may only need, say, 256M of gh rad regions.
Please also consider Tru64 V5.1B. The new big_pages are so much more flexible and usable than GH, and give just about the same performance potential.
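To make the arithmetic concrete, here is a rough sketch of the check in Python. The function name and the figures are only illustrative (the 2000MB and 500MB come from my reading of your writeup, not from measured values); plug in what ipcs actually reports for shared memory in use.

```python
def gh_unused_mb(reserved_mb, shm_in_use_mb):
    """Return the GH memory (in MB) reserved but not backing shared memory.

    Hypothetical helper for illustration: both inputs are plain numbers
    you supply yourself, nothing is read from the system.
    """
    if reserved_mb < 0 or shm_in_use_mb < 0:
        raise ValueError("sizes must be non-negative")
    # If shared memory exceeds the reservation, nothing is left over.
    return max(reserved_mb - shm_in_use_mb, 0)


reserved_mb = 2000    # assumed GH reservation, from the writeup
shm_in_use_mb = 500   # assumed shared memory in use, from the init.ora settings

unused = gh_unused_mb(reserved_mb, shm_in_use_mb)
print(f"GH reserved: {reserved_mb} MB, in use: {shm_in_use_mb} MB, unused: {unused} MB")
```

Whatever comes out as unused is memory that nothing other than shmget-style shared memory can touch, so it is worth keeping that number small.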
[later I'll ask my RDG friend whether this rdg_wiring rings a recent bell]
09-26-2003 06:46 AM
Thanks a lot for the wonderful write up and clarification.
The patchkit level is 4 and we are using 5.1A on GS80.
Unfortunately, we have constraints that limit us to RAC 18.104.22.168, at least for another 3 months, and that is because of the application requirements. I am always a big fan of new releases, but this time I am going to have to accept what I have :(
I set the regions higher so that, if I increase my SGA, I don't need additional downtime to change the rad parameters and reboot the nodes. I only used a small SGA for testing the implementation.
But what you say is very right: the newer versions are better, and using NUMA on 9i R1 is not of much immediate benefit.
Apparently, there is a kernel bug in vmap functions that causes high cpu wait/sleep times (one of my KEY issues) on 5.1A. The patch is T64V51AB21-C0123800-18858.
I would like to test this patch and see if I get rid of the issue.
For the regions, I guess I will rely on the good old ipc mechanism :) ..at least for now!
Thanks again for your help, and I will post here as to what happened.