Troubleshoot Oracle RAC issues
This page provides troubleshooting tips for issues related to Oracle RAC on Bare Metal Solution.
Check if your question or problem has already been addressed on the Known issues and limitations page.
SSH verification fails with OpenSSH error
SSH verification might fail with the following OpenSSH error:
OpenSSH_6.7: ERROR [INS-06003] Failed to setup passwordless SSH connectivity During Grid Infrastructure Install
To resolve this issue, do the following:
In the /etc/ssh/sshd_config file, add the following line:
KexAlgorithms curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1,diffie-hellman-group-exchange-sha1,diffie-hellman-group1-sha1
Restart the sshd service to apply the changes:
/etc/init.d/sshd restart
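Before restarting sshd, it can help to confirm that every algorithm in the new KexAlgorithms line is supported by the installed OpenSSH build, because an unsupported name can stop the daemon from starting. The following is a minimal sketch, not part of the documented procedure. The helper name unsupported_kex is hypothetical, and it assumes the supported list comes from `ssh -Q kex`, which is available in recent OpenSSH versions.

```shell
# unsupported_kex CONFIGURED SUPPORTED   (hypothetical helper name)
#   CONFIGURED: comma-separated list, as written after KexAlgorithms
#   SUPPORTED:  newline-separated list, for example the output of `ssh -Q kex`
# Prints each configured algorithm that is missing from SUPPORTED.
# The body runs in a subshell, so the IFS and set -f changes stay local.
unsupported_kex() (
  configured="$1"
  supported="$2"
  set -f             # no pathname expansion while splitting the list
  IFS=','            # split the configured list on commas
  for alg in $configured; do
    if ! printf '%s\n' "$supported" | grep -qxF -- "$alg"; then
      printf '%s\n' "$alg"
    fi
  done
)
```

One possible way to use it: `unsupported_kex "$(awk '$1 == "KexAlgorithms" {print $2}' /etc/ssh/sshd_config)" "$(ssh -Q kex)"`. Empty output suggests that all listed algorithms are known to the local OpenSSH client. Running `sshd -t` afterward checks the whole configuration file for validity.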
SCP file copy taking too long
The SCP file copy with rekey operation might take too long to complete due to a Bare Metal Solution SSH daemon configuration issue.
To resolve this issue, do the following:
On your Bare Metal Solution server, open the sshd_config file in edit mode:
vi /etc/ssh/sshd_config
In the sshd_config file, add the following line. If the line already exists in the file, modify it as follows:
ClientAliveInterval 420
Restart the sshd service to apply the changes:
/etc/init.d/sshd restart
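The "add the line, or modify it if it already exists" step can also be scripted. The following is a minimal sketch under stated assumptions: the helper name set_sshd_option is hypothetical, the keyword is matched as the first word of an uncommented line, and commented-out lines are left untouched.

```shell
# set_sshd_option FILE KEY VALUE   (hypothetical helper name)
# Replaces the first uncommented "KEY ..." line with "KEY VALUE", drops any
# later duplicates, and appends the line if the key is not set anywhere.
set_sshd_option() (
  file="$1"
  key="$2"
  value="$3"
  tmp="$(mktemp)"
  awk -v k="$key" -v v="$value" '
    # sshd keywords are case-insensitive, so compare in lower case
    tolower($1) == tolower(k) {
      if (!done) { print k " " v; done = 1 }
      next
    }
    { print }
    END { if (!done) print k " " v }
  ' "$file" > "$tmp" &&
    cat "$tmp" > "$file"   # overwrite in place to keep owner and permissions
  rm -f "$tmp"
)
```

For example, `set_sshd_option /etc/ssh/sshd_config ClientAliveInterval 420` followed by the sshd restart. If the file ends with a Match block, an appended line would apply only inside that block, so review the result before restarting the service.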
CRS root.sh or OCSSD fails with No Network HB error
The CRS root.sh script fails with the following error if the node pings the IP address 169.254.169.254:
has a disk HB, but no network HB
The IP address 169.254.169.254 belongs to the Google Cloud metadata service, which registers the instance in Google Cloud. If you block this IP address, the Google Cloud VM can't boot up. This in turn can interrupt the HAIP communication route, causing the Bare Metal Solution RAC servers to experience HAIP communication issues.
To resolve this issue, you need to block the IP address or disable HAIP. The following example shows how to block the IP address with route commands. Changes made with the route command are not persistent, so you also need to modify the system startup scripts.
Do the following:
On all the nodes, run the following command before rerunning the root.sh script:
/sbin/route add -host 169.254.169.254 reject
Make the rc.local script executable:
chmod +x /etc/rc.d/rc.local
In the /etc/rc.d/rc.local file, add the following line:
/sbin/route add -host 169.254.169.254 reject
Enable the rc-local service:
systemctl status rc-local.service
systemctl enable rc-local.service
systemctl start rc-local.service
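Because these steps run on every node and might be repeated, it can be useful to add the route line to rc.local only when it is not already there. The following is a minimal sketch, not part of the documented procedure. The helper name ensure_line is hypothetical.

```shell
# ensure_line FILE LINE   (hypothetical helper name)
# Appends LINE to FILE only if no identical line is already present.
ensure_line() (
  file="$1"
  line="$2"
  touch "$file"
  if ! grep -qxF -- "$line" "$file"; then
    # If the file does not end with a newline, add one first so the
    # new line is not fused onto the last existing line.
    if [ -n "$(tail -c 1 "$file")" ]; then
      printf '\n' >> "$file"
    fi
    printf '%s\n' "$line" >> "$file"
  fi
)
```

For example, `ensure_line /etc/rc.d/rc.local '/sbin/route add -host 169.254.169.254 reject'`. If your rc.local ends with an `exit 0` statement, a line appended after it never runs, so place the route command above that statement by hand.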
Reboot process not responding
If your server is running Red Hat Linux, OVM, or SUSE Linux, and there are many LUNs attached to it, the reboot process might stop responding.
To resolve this issue, increase the default watchdog timeout value:
Under /etc/systemd, create a folder named system.conf.d.
In the folder, create a *.conf file. For example, /etc/systemd/system.conf.d/kernel-reboot-workaround.conf.
In the *.conf file, add the following code:
[Manager]
RuntimeWatchdogSec=5min
ShutdownWatchdogSec=5min
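The three steps above can be sketched as a small script. This is an illustration only. The function name write_watchdog_dropin is hypothetical, and the optional directory argument exists so that the sketch can be tried against a scratch directory first.

```shell
# write_watchdog_dropin [DIR]   (hypothetical helper name)
# Creates DIR (default: /etc/systemd/system.conf.d) and writes the
# drop-in file with the watchdog timeouts from the steps above.
write_watchdog_dropin() (
  dir="${1:-/etc/systemd/system.conf.d}"
  mkdir -p "$dir"
  cat > "$dir/kernel-reboot-workaround.conf" <<'EOF'
[Manager]
RuntimeWatchdogSec=5min
ShutdownWatchdogSec=5min
EOF
)
```

Run as root, `write_watchdog_dropin` with no argument writes to the real location. The new values are typically picked up after a reboot or after `systemctl daemon-reexec`. On systems where the property is exposed, `systemctl show -p RuntimeWatchdogUSec` can be used to check the effective value.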
An alternative workaround is as follows:
Open the /etc/default/grub file in edit mode:
vi /etc/default/grub
Remove the quiet parameter from the settings.
Add the following after the parameter GRUB_CMDLINE_LINUX:
acpi_no_watchdog DefaultTimeoutStartSec=900s DefaultTimeoutStopSec=900s
Rebuild the grub.cfg file:
grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
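The manual edit in the alternative workaround can be approximated with a script. This sketch rests on two assumptions that the steps above do not spell out: that quiet appears inside the double-quoted GRUB_CMDLINE_LINUX value, and that the new parameters are appended to the end of that same value. The function name patch_grub_cmdline is hypothetical. A GRUB_CMDLINE_LINUX_DEFAULT line, if present, is not touched.

```shell
# patch_grub_cmdline FILE   (hypothetical helper name)
# Rewrites the GRUB_CMDLINE_LINUX="..." line in FILE: drops the word
# "quiet" and appends the watchdog and timeout parameters once.
patch_grub_cmdline() (
  file="$1"
  extra='acpi_no_watchdog DefaultTimeoutStartSec=900s DefaultTimeoutStopSec=900s'
  tmp="$(mktemp)"
  awk -v extra="$extra" '
    /^GRUB_CMDLINE_LINUX="/ {
      val = $0
      sub(/^GRUB_CMDLINE_LINUX="/, "", val)   # strip the variable name
      sub(/"[[:space:]]*$/, "", val)          # strip the closing quote
      n = split(val, words, " ")
      out = ""
      for (i = 1; i <= n; i++)
        if (words[i] != "quiet")
          out = out (out == "" ? "" : " ") words[i]
      # Append only if not already present, so reruns change nothing
      if (index(" " out " ", " acpi_no_watchdog ") == 0)
        out = out (out == "" ? "" : " ") extra
      print "GRUB_CMDLINE_LINUX=\"" out "\""
      next
    }
    { print }
  ' "$file" > "$tmp" &&
    cat "$tmp" > "$file"   # overwrite in place to keep owner and permissions
  rm -f "$tmp"
)
```

A cautious way to use it is to copy /etc/default/grub to a scratch location, run `patch_grub_cmdline` on the copy, compare the two files with `diff`, and only then apply the change to the real file and rebuild grub.cfg.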
Oracle Grid Infrastructure 12c fails with Rejecting connection error
Oracle Grid Infrastructure 12c installation might fail with the following error:
Rejecting connection from node 2 as MultiNode RAC is not supported or certified in this Configuration.
This error occurs because the IP address 169.254.169.254 is forwarded to the local metadata service of a Compute Engine VM, making it look like the Bare Metal Solution host is a Compute Engine VM. Such a configuration might also leak the Compute Engine VM's private service account keys.
To resolve this issue, consider the security implications of your NAT configuration and limit external network access as much as possible. Do the following:
Block access to the metadata service on your cloud VM:
firewall-cmd --direct --add-rule ipv4 filter FORWARD 0 -d 169.254.169.254 -j REJECT --reject-with icmp-host-unreachable
firewall-cmd --permanent --direct --add-rule ipv4 filter FORWARD 0 -d 169.254.169.254 -j REJECT --reject-with icmp-host-unreachable
Block access to the metadata service on the Bare Metal Solution host:
firewall-cmd --direct --add-rule ipv4 filter OUTPUT 0 -d 169.254.169.254 -j REJECT --reject-with icmp-host-unreachable
firewall-cmd --permanent --direct --add-rule ipv4 filter OUTPUT 0 -d 169.254.169.254 -j REJECT --reject-with icmp-host-unr
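After adding the rules, you might want to confirm that they are in place. The following is a minimal sketch of such a check. The helper name has_metadata_reject is hypothetical, and it assumes that `firewall-cmd --direct --get-all-rules` prints one rule per line in the form `ipv4 filter CHAIN PRIORITY ARGS`.

```shell
# has_metadata_reject CHAIN   (hypothetical helper name)
# Reads direct rules from standard input and succeeds (exit status 0) if
# CHAIN has an ipv4 filter rule that rejects traffic to 169.254.169.254.
has_metadata_reject() (
  chain="$1"
  grep -Eq "^ipv4 filter ${chain} [0-9]+ .*-d 169\.254\.169\.254(/32)? .*-j REJECT"
)
```

For example, on the Bare Metal Solution host, `firewall-cmd --direct --get-all-rules | has_metadata_reject OUTPUT && echo "runtime rule present"`. Adding `--permanent` to the firewall-cmd call checks the saved configuration instead, and passing FORWARD checks the rule on the cloud VM.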