When running the /u01/app/product/11.2.0/crs/root.sh script for Oracle 11.2.0.4 on RedHat Linux 6.1, the following errors were reported:
/u01/app/product/11.2.0/crs/bin/srvctl start nodeapps -n linuxidc1 ... failed
FirstNode configuration failed at /u01/app/product/11.2.0/crs/crs/install/crsconfig_lib.pm line 9379.
/u01/app/product/11.2.0/crs/perl/bin/perl -I/u01/app/product/11.2.0/crs/perl/lib -I/u01/app/product/11.2.0/crs/crs/install /u01/app/product/11.2.0/crs/crs/install/rootcrs.pl execution failed
The error output shows that the failing step is srvctl start nodeapps -n linuxidc1, so try running that command manually:
[grid@linuxidc1 bin]$ ./srvctl start nodeapps -n linuxidc1
PRCR-1013 : Failed to start resource ora.ons
PRCR-1064 : Failed to start resource ora.ons on node linuxidc1
CRS-5016: Process "/u01/app/product/11.2.0/crs/opmn/bin/onsctli" spawned by agent "/u01/app/product/11.2.0/crs/bin/oraagent.bin" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/product/11.2.0/crs/log/linuxidc1/agent/crsd/oraagent_grid/oraagent_grid.log"
CRS-2674: Start of 'ora.ons' on 'linuxidc1' failed
The key message is Start of 'ora.ons' on 'linuxidc1' failed, so next check the log file $ORACLE_HOME/cfgtoollogs/crsconfig/rootcrs_$HOSTNAME.log:
[grid@linuxidc1 crs]$ cd $ORACLE_HOME/cfgtoollogs/crsconfig/
[grid@linuxidc1 crsconfig]$ ls -lrt
total 332
-rwxrwxr-x 1 grid oinstall 81336 Aug 26 15:36 srvmcfg0.log
-rwxrwxr-x 1 grid oinstall 18719 Aug 26 15:36 srvmcfg1.log
-rwxrwxr-x 1 grid oinstall 23213 Aug 26 15:36 srvmcfg2.log
-rwxrwxr-x 1 grid oinstall 24700 Aug 26 15:36 srvmcfg3.log
-rwxrwxr-x 1 grid oinstall 10705 Aug 26 15:36 srvmcfg4.log
-rwxrwxr-x 1 grid oinstall 25594 Aug 26 15:37 srvmcfg5.log
-rwxrwxr-x 1 grid oinstall 132771 Aug 26 15:37 rootcrs_linuxidc1.log
[grid@linuxidc1 crsconfig]$ cat rootcrs_linuxidc1.log
2015-08-26 15:36:52: J2EE (OC4J) Container Resource Add Wallet ... passed ...
2015-08-26 15:36:52: Running as user grid: /u01/app/product/11.2.0/crs/bin/qosctl -autogenerate
2015-08-26 15:36:52: s_run_as_user2: Running /bin/su grid -c ' /u01/app/product/11.2.0/crs/bin/qosctl -autogenerate '
2015-08-26 15:36:54: Removing file /tmp/fileoriV8Q
2015-08-26 15:36:54: Successfully removed file: /tmp/fileoriV8Q
2015-08-26 15:36:54: /bin/su successfully executed
2015-08-26 15:36:54: qosctl output: User qosadmin added successfully.
User oc4jadmin added successfully.
2015-08-26 15:36:54: Running as user grid: /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user oc4jadmin
2015-08-26 15:36:54: s_run_as_user2: Running /bin/su grid -c ' /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user oc4jadmin '
2015-08-26 15:36:55: Removing file /tmp/fileHsIIY7
2015-08-26 15:36:55: Successfully removed file: /tmp/fileHsIIY7
2015-08-26 15:36:55: /bin/su successfully executed
2015-08-26 15:36:55: Running as user grid: /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user qosadmin
2015-08-26 15:36:55: s_run_as_user2: Running /bin/su grid -c ' /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user qosadmin '
2015-08-26 15:36:55: Removing file /tmp/fileQXtLZo
2015-08-26 15:36:55: Successfully removed file: /tmp/fileQXtLZo
2015-08-26 15:36:55: /bin/su successfully executed
2015-08-26 15:36:55: Invoking "/u01/app/product/11.2.0/crs/bin/srvctl add cvu"
2015-08-26 15:36:55: trace file=/u01/app/product/11.2.0/crs/cfgtoollogs/crsconfig/srvmcfg5.log
2015-08-26 15:36:55: Running as user grid: /u01/app/product/11.2.0/crs/bin/srvctl add cvu
2015-08-26 15:36:55: Invoking "/u01/app/product/11.2.0/crs/bin/srvctl add cvu" as user "grid"
2015-08-26 15:36:55: Executing /bin/su grid -c "/u01/app/product/11.2.0/crs/bin/srvctl add cvu"
2015-08-26 15:36:55: Executing cmd: /bin/su grid -c "/u01/app/product/11.2.0/crs/bin/srvctl add cvu"
2015-08-26 15:36:57: add cvu ... success
2015-08-26 15:36:57: starting nodeapps...
2015-08-26 15:36:57: DHCP_flag=0
2015-08-26 15:36:57: nodes_to_start=linuxidc1
2015-08-26 15:37:18: exit value of start nodeapps/vip is 1
2015-08-26 15:37:18: output for start nodeapps is PRCR-1013 : Failed to start resource ora.ons PRCR-1064 : Failed to start resource ora.ons on node linuxidc1 CRS-5016: Process "/u01/app/product/11.2.0/crs/opmn/bin/onsctli" spawned by agent "/u01/app/product/11.2.0/crs/bin/oraagent.bin" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/product/11.2.0/crs/log/linuxidc1/agent/crsd/oraagent_grid/oraagent_grid.log" CRS-2674: Start of 'ora.ons' on 'linuxidc1' failed
2015-08-26 15:37:18: output of startnodeapp after removing already started mesgs is PRCR-1013 : Failed to start resource ora.ons PRCR-1064 : Failed to start resource ora.ons on node linuxidc1 CRS-5016: Process "/u01/app/product/11.2.0/crs/opmn/bin/onsctli" spawned by agent "/u01/app/product/11.2.0/crs/bin/oraagent.bin" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/product/11.2.0/crs/log/linuxidc1/agent/crsd/oraagent_grid/oraagent_grid.log" CRS-2674: Start of 'ora.ons' on 'linuxidc1' failed
2015-08-26 15:37:18: /u01/app/product/11.2.0/crs/bin/srvctl start nodeapps -n linuxidc1 ... failed
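The rootcrs log here is over 130 KB, so reading it with cat means scrolling past many successful steps. As a minimal sketch (the function name rootcrs_failures is made up for illustration and is not part of any Oracle tooling), the lines that mention a failure or carry a PRCR-/CRS- message code can be pulled out like this:

```shell
#!/bin/sh
# Hypothetical helper: print the last N lines (default 20) of a rootcrs log
# that mention "failed" or contain a PRCR-/CRS- message code.
# grep -n prefixes each match with its line number in the log.
rootcrs_failures() {
    log="$1"
    n="${2:-20}"
    grep -nE 'failed|PRCR-[0-9]+|CRS-[0-9]+' "$log" | tail -n "$n"
}

# Example call (path taken from the listing above):
#   rootcrs_failures "$ORACLE_HOME/cfgtoollogs/crsconfig/rootcrs_linuxidc1.log"
```

The line numbers in the output make it easy to jump back into the full log for the surrounding context.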
Check the $GRID_HOME/opmn/logs/ons.log.* files for any of the following errors:
1.
[grid@linuxidc1 oraagent_grid]$ cd $ORACLE_HOME/opmn/logs/
[grid@linuxidc1 logs]$ ls -lrt
total 8
-rw-r--r-- 1 grid oinstall 576 Aug 26 15:48 ons.log.linuxidc1
-rw-r--r-- 1 grid oinstall 267 Aug 26 15:48 ons.out
[grid@linuxidc1 logs]$ cat ons.log.linuxidc1
[2015-08-26T15:37:02+08:00] [internal] getaddrinfo(::0, 6200, 1) failed (Hostname and service name not provided or found): Connection timed out
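If this error is present, the cause is that localhost does not map to 127.0.0.1 in /etc/hosts. The fix is to make sure localhost is set up correctly in both DNS and /etc/hosts. Whether DNS or /etc/hosts is consulted depends on the name-resolution settings in /etc/nsswitch.conf or /etc/netsvc.conf (depending on the platform). See MOS notes ID 942166.1 and ID 969254.1 for how to handle this.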
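A quick way to check case 1 is to look at what the hosts file assigns to localhost. The following is only a sketch: the function names localhost_ipv4 and check_localhost are invented for this example, and it inspects the hosts file only, not DNS.

```shell
#!/bin/sh
# Hypothetical helper: print every IPv4 address that a hosts-format file
# assigns to the name "localhost" (in any name column).
localhost_ipv4() {
    hosts_file="${1:-/etc/hosts}"
    awk '
        { sub(/#.*/, "") }    # drop comments; awk re-splits the fields
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
            for (i = 2; i <= NF; i++)
                if ($i == "localhost") print $1
        }
    ' "$hosts_file"
}

# Hypothetical helper: succeed only if localhost has at least one IPv4 entry
# and every such entry is 127.0.0.1.
check_localhost() {
    addrs=$(localhost_ipv4 "$1")
    [ -n "$addrs" ] || { echo "localhost has no IPv4 entry"; return 1; }
    for a in $addrs; do
        [ "$a" = "127.0.0.1" ] || { echo "localhost maps to $a"; return 1; }
    done
    echo "localhost maps to 127.0.0.1"
}

# Example call:
#   check_localhost /etc/hosts
```

On a Linux system, getent hosts localhost additionally shows what the resolver order configured in /etc/nsswitch.conf actually returns, which also covers the DNS side.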
2.
[grid@linuxidc1 oraagent_grid]$ cd $ORACLE_HOME/opmn/logs/
[grid@linuxidc1 logs]$ ls -lrt
total 8
-rw-r--r-- 1 grid oinstall 576 Aug 26 15:48 ons.log.linuxidc1
-rw-r--r-- 1 grid oinstall 267 Aug 26 15:48 ons.out
[grid@linuxidc1 logs]$ cat ons.log.linuxidc1
[2015-08-26T15:37:02+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:37:02+08:00] [ons] [ERROR:1] [17] [ons-listener] any: BIND (Address already in use)
[2015-08-26T15:39:42+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:39:42+08:00] [ons] [ERROR:1] [17] [ons-listener] any: BIND (Address already in use)
[2015-08-26T15:48:40+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:48:40+08:00] [ons] [ERROR:1] [17] [ons-listener] any: BIND (Address already in use)
The cause is that another process is already using an ONS port:
[grid@linuxidc1 logs]$ grep port $ORACLE_HOME/opmn/conf/ons.config
localport=6100 # line added by Agent
remoteport=6200 # line added by Agent
[root@linuxidc1 /]# lsof | grep 6200 | grep LISTEN
ons 16413 grid 6u IPv6 162533 TCP *:6200 (LISTEN)
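The output shows that the ons process with PID 16413 is holding port 6200. The fix is to make sure this port is not in use by any other process. If the port is already taken before rootupgrade.sh is run for an upgrade, the likely cause is that the ons process of the old version is still running.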
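Note that lsof | grep 6200 matches any line containing 6200, including PIDs and inode numbers. As a sketch of a narrower check, the ports can be read from ons.config and passed to lsof directly. The function names ons_port and show_ons_listeners are made up for this example; -n, -P, -i and -s are standard lsof options, though older lsof builds may lack the -sTCP:LISTEN form.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric value of a key (localport, remoteport)
# from an ons.config-style file; a trailing "# ..." comment is ignored.
ons_port() {
    _key="$1"
    _conf="$2"
    sed -n "s/^[[:space:]]*${_key}[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$_conf" | head -n 1
}

# Hypothetical helper: for each ONS port, list the process listening on it.
# Run as root so that processes owned by other users are visible.
show_ons_listeners() {
    conf="${1:-$ORACLE_HOME/opmn/conf/ons.config}"
    for key in localport remoteport; do
        port=$(ons_port "$key" "$conf")
        [ -n "$port" ] || continue
        echo "== $key=$port"
        # lsof exits non-zero when nothing matches
        lsof -nP -iTCP:"$port" -sTCP:LISTEN || echo "no listener on $port"
    done
}

# Example call:
#   show_ons_listeners "$ORACLE_HOME/opmn/conf/ons.config"
```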
3.
[grid@linuxidc1 oraagent_grid]$ cd $ORACLE_HOME/opmn/logs/
[grid@linuxidc1 logs]$ ls -lrt
total 8
-rw-r--r-- 1 grid oinstall 576 Aug 26 15:48 ons.log.linuxidc1
-rw-r--r-- 1 grid oinstall 267 Aug 26 15:48 ons.out
[grid@linuxidc1 logs]$ cat ons.log.linuxidc1
[2015-08-26T15:48:40+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:48:40+08:00] [ons] [ERROR:1] [17] [ons-listener] 0000:0000:0000:0000:0000:0000:0000:0001,6100: BIND (Cannot assign requested address)
This can happen when IPv6 is only partially configured; 11gR2 Grid Infrastructure does not support IPv6. The fix is to set the following parameter in the ons.config and ons.config. files under $GRID_HOME/opmn/conf:
interface=ipv4
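The three cases above can be told apart by the error text in ons.log alone. A minimal sketch follows, with a made-up function name classify_ons_log; the match strings are copied from the log excerpts shown above. If a log contains more than one signature, this sketch reports only the first case that matches, in the order 1, 2, 3.

```shell
#!/bin/sh
# Hypothetical helper: map the ons.log error signatures shown above
# to the three cases described in this article.
classify_ons_log() {
    log="$1"
    if grep -q 'getaddrinfo(.*failed' "$log"; then
        echo "case 1: localhost name resolution"
    elif grep -qF 'BIND (Address already in use)' "$log"; then
        echo "case 2: ONS port already in use"
    elif grep -qF 'BIND (Cannot assign requested address)' "$log"; then
        echo "case 3: partial IPv6 configuration"
    else
        echo "no known signature"
    fi
}

# Example call:
#   classify_ons_log "$ORACLE_HOME/opmn/logs/ons.log.linuxidc1"
```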
The error in this case is the second one: the ons process with PID 16413 is holding port 6200, so the fix is to free that port:
[root@linuxidc1 /]# lsof | grep 6200 | grep LISTEN
ons 16413 grid 6u IPv6 162533 TCP *:6200 (LISTEN)
[root@linuxidc1 /]# kill -9 16413
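kill -9 gives the process no chance to shut down cleanly. As an optional, more cautious variant (a sketch only; stop_pid is a made-up name and the 5-second grace period is an arbitrary assumption), SIGTERM can be sent first, with SIGKILL as a fallback:

```shell
#!/bin/sh
# Hypothetical helper: send SIGTERM, wait up to a grace period (default 5 s),
# and send SIGKILL only if the PID is still present afterwards.
# kill -0 sends no signal; it only tests whether the PID can be signalled.
stop_pid() {
    pid="$1"
    grace="${2:-5}"
    kill -TERM "$pid" 2>/dev/null || { echo "cannot signal $pid"; return 1; }
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || { echo "$pid exited after SIGTERM"; return 0; }
        sleep 1
        grace=$((grace - 1))
    done
    echo "$pid still present, sending SIGKILL"
    kill -KILL "$pid" 2>/dev/null
}

# Example call, using the PID from the lsof output above:
#   stop_pid 16413
```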
Then run the root.sh script again:
[root@linuxidc1 /]# ./u01/app/product/11.2.0/crs/root.sh
Performing root user operation for Oracle 11g
The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /u01/app/product/11.2.0/crs
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/product/11.2.0/crs/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
PRKO-2190 : VIP exists for node linuxidc1, VIP name linuxidc1-vip
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
After killing the process that was holding port 6200, the root.sh script completes successfully.