Oracle GI的高度自动化 、智能化功能伴随着系统体系架构的复杂性增加,随之而来的运维管理要求也在逐步攀升。仅从DBA角度看,除了数据库核心技术之外,网络、存储、主机系统等相关技术的要求也在“水涨船高”。在这样的背景下,Oracle对一些复杂性很强的操作,也在逐版本的进行封装简化。
这样的变化的确可以帮助我们解决很多问题,减少很多负担。但是,不完全的文档和谬误的经验秘籍可能会帮上倒忙。这个时候,还是需要我们进行逐层分析,慢慢解决问题。
本篇主要介绍在进行OPatch auto模式给GI打补丁过程中遇到的一个问题,记录下来,待有需要的朋友不时之需。
1、问题说明
笔者使用环境为Oracle 11gR2,小版本为11.2.0.4。
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE 11.2.0.4.0 Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 – Production
底层环境为GI+Single Instance Database,为了进行安全升级,计划打GI的PSU升级包。
首先,根据提示要求,关闭监听器、数据库实例和ASM实例。
[grid@NCR-Standby-Asm 19852360]$ srvctl stop database -d sicsstb
PRCC-1016 : sicsstb was already stopped
[grid@NCR-Standby-Asm 19852360]$ srvctl stop asm
PRCR-1065 : Failed to stop resource ora.asm
CRS-2529: Unable to act on 'ora.asm' because that would require stopping or relocating 'ora.DATA.dg', but the force option was not specified
[grid@NCR-Standby-Asm 19852360]$ srvctl stop asm -f
[grid@NCR-Standby-Asm 19852360]$ srvctl stop listener
[grid@NCR-Standby-Asm 19852360]$ crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE OFFLINE ncr-standby-asm
ora.LISTENER.lsnr
OFFLINE OFFLINE ncr-standby-asm
ora.RECO.dg
ONLINE OFFLINE ncr-standby-asm
ora.asm
OFFLINE OFFLINE ncr-standby-asm Instance Shutdown
ora.ons
OFFLINE OFFLINE ncr-standby-asm
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
1 ONLINE ONLINE ncr-standby-asm
ora.diskmon
1 OFFLINE OFFLINE
ora.evmd
1 ONLINE ONLINE ncr-standby-asm
ora.sicsstb.db
1 OFFLINE OFFLINE Instance Shutdown
安装包解压。
[root@NCR-Standby-Asm gidir]# ls -l
total 1338912
-rw-r--r-- 1 root root 665035674 May 25 10:27 p20485808_112040_Linux-x86-64.zip
[root@NCR-Standby-Asm gidir]# unzip p20485808_112040_Linux-x86-64.zip
inflating: 20485808/20299019/files/bin/acfsrepl_monitor
inflating: 20485808/20299019/files/bin/acfsdriverstate
inflating: 20485808/20299019/files/bin/acfsreplcrs
inflating: 20485808/20299019/files/bin/acfsrepl_initializer
inflating: 20485808/20299019/files/bin/acfsreplcrs.pl
inflating: 20485808/20299019/files/bin/acfsregistrymount
inflating: 20485808/20299019/README.txt
creating: 20485808/20299019/custom/
creating: 20485808/20299019/custom/scripts/
inflating: 20485808/20299019/custom/scripts/prepatchverify.sh
inflating: 20485808/20299019/custom/scripts/pre.txt
inflating: 20485808/20299019/custom/scripts/post.txt
inflating: 20485808/bundle.xml
inflating: PatchSearch.xml
[root@NCR-Standby-Asm gidir]# ls -l
total 1338920
drwxr-xr-x 5 root root 4096 Mar 27 23:33 20485808
-rw-r--r-- 1 root root 665035674 May 25 10:27 p20485808_112040_Linux-x86-64.zip
进入目录,确定文件目录权限对应的都是root所有者。
[root@NCR-Standby-Asm gidir]# cd 20485808/
[root@NCR-Standby-Asm 20485808]# ls -l
total 136
drwxrwxr-x 8 root root 4096 Mar 4 18:30 20299013
drwxr-xr-x 5 root root 4096 Mar 27 23:26 20299019
drwxr-xr-x 5 root root 4096 Mar 27 23:19 20420937
-rw-r--r-- 1 root root 549 Mar 27 23:32 bundle.xml
-rw-rw-r-- 1 root root 61184 Mar 6 13:43 PatchSearch.xml
-rw-rw-r-- 1 root root 60655 Apr 13 15:17 README.html
-rw-r--r-- 1 root root 0 Mar 27 23:33 README.txt
使用OPatch auto模式,要求输入一个OCM(Oracle Configuration Manager)的相应文件response file,用于进行多个Patch安装过程中的交互信息的输入。
生成OCM响应文件,要进入OPatch生成。
[grid@NCR-Standby-Asm gidir]$ cd $ORACLE_HOME
[grid@NCR-Standby-Asm grid]$ cd OPatch
[grid@NCR-Standby-Asm OPatch]$ cd ocm/bin
[grid@NCR-Standby-Asm bin]$ ls -l
total 12
-rwxr-x--- 1 grid oinstall 9063 Nov 27 2009 emocmrsp
调用emocmrsp脚本生成。
[grid@NCR-Standby-Asm bin]$ ./emocmrsp -no_banner -output /u01/unconfig.rsp
Provide your email address to be informed of security issues, install and
initiate Oracle Configuration Manager. Easier for you if you use your My
Oracle Support Email address/User Name.
Visit http://www.oracle.com/support/policies.html for details.
Email address/User Name:
You have not provided an email address for notification of security issues.
Do you wish to remain uninformed of security issues ([Y]es, [N]o) [N]: Y
The OCM configuration response file (/u01/unconfig.rsp) was successfully created.
Validation of Oracle Inventory
GI和Database的Home目录对比。由于GI补丁过程是要进行GI和database两个组件,要进行一下目录对比。
[grid@NCR-Standby-Asm bin]$ env | grep ORACLE_HOME
ORACLE_HOME=/u01/app/grid/product/11.2.0/grid
[oracle@NCR-Standby-Asm ~]$ env | grep ORACLE_HOME
ORACLE_HOME=/u02/app/oracle/product/11.2.0/dbhome_1
[grid@NCR-Standby-Asm bin]$ opatch lsinventory -detail -oh /u02/app/oracle/product/11.2.0/dbhome_1
OPatch could not create/open history file for writing.
Oracle Interim Patch Installer version 11.2.0.3.10
Copyright (c) 2015, Oracle Corporation. All rights reserved.
Oracle Home : /u02/app/oracle/product/11.2.0/dbhome_1
Central Inventory : /u01/app/oraInventory
from : /u02/app/oracle/product/11.2.0/dbhome_1/oraInst.loc
OPatch version : 11.2.0.3.10
OUI version : 11.2.0.4.0
Log file location : /u02/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/opatch2015-05-25_14-51-28PM_1.log
Lsinventory Output file location : /u02/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/lsinv/lsinventory2015-05-25_14-51-28PM.txt
--------------------------------------------------------------------------------
Local Machine Information::
Hostname: localhost
ARU platform id: 226
ARU platform description:: Linux x86-64
Installed Top-level Products (1):
Oracle Database 11g 11.2.0.4.0
There are 1 products installed in this Oracle Home.
(篇幅原因,有省略……)
/oracle/jdbc/driver/OracleSql.class --> ORACLE_HOME/jdbc/lib/ojdbc6.jar
Patch Location in Inventory:
/u02/app/oracle/product/11.2.0/dbhome_1/inventory/oneoffs/19852360
Patch Location in Storage area:
/u02/app/oracle/product/11.2.0/dbhome_1/.patch_storage/19852360_Oct_20_2014_08_17_43
--------------------------------------------------------------------------------
OPatch succeeded.
进行正式的补丁过程。注意:这里面是需要使用root来执行opatch脚本。
[root@NCR-Standby-Asm OPatch]# ./opatch auto /upload/gidir/20485808 -ocmrf /u01/unconfig.rsp
Executing /u01/app/grid/product/11.2.0/grid/perl/bin/perl ./crs/patch11203.pl -patchdir /upload/gidir -patchn 20485808 -ocmrf /u01/unconfig.rsp -paramfile /u01/app/grid/product/11.2.0/grid/crs/install/crsconfig_params
This is the main log file: /u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatchauto2015-05-25_15-03-06.log
This file will show your detected configuration and all the steps that opatchauto attempted to do on your system:
/u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatchauto2015-05-25_15-03-06.report.log
2015-05-25 15:03:06: Starting Oracle Restart Patch Setup
Using configuration parameter file: /u01/app/grid/product/11.2.0/grid/crs/install/crsconfig_params
Stopping RAC /u02/app/oracle/product/11.2.0/dbhome_1 ...
Stopped RAC /u02/app/oracle/product/11.2.0/dbhome_1 successfully
patch /upload/gidir/20485808/20299013 apply successful for home /u02/app/oracle/product/11.2.0/dbhome_1
patch /upload/gidir/20485808/20420937/custom/server/20420937 apply successful for home /u02/app/oracle/product/11.2.0/dbhome_1
Stopping CRS...
Stopped CRS successfully
Error : The opatch Applicable check failed. The patch /upload/gidir/20485808/20420937 is not applicable to /u01/app/grid/product/11.2.0/grid
Error:Patch Applicable check failed for /u01/app/grid/product/11.2.0/grid
Starting CRS...
ERROR: Prereq checkApplicable failed. Refer log file for more details.
opatch auto failed.
在补丁过程中出错,过程日志看是在关闭CRS,进行CRS升级过程中出现的错误。
2、问题分析
对于复杂脚本执行过程中出现的问题,对于初学者是很头疼的。其实,如同学生时代综合题,都是由一系列的基础问题构成。对于执行脚本错误问题,执行日志就是贯穿基础问题的线索。
从脚本中,可以定位到主日志的名称位置。/u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatchauto2015-05-25_15-03-06.log
在其中,逐步进行检查,定位错误发生的位置。
[root@NCR-Standby-Asm ~]# cd /u01/app/grid/product/11.2.0/grid/cfgtoollogs/
[root@NCR-Standby-Asm cfgtoollogs]# ls -l | grep opatchauto
drwxr-xr-x 3 grid oinstall 4096 May 25 15:03 opatchauto
-rwxrwx--- 1 grid oinstall 55630 May 25 15:08 opatchauto2015-05-25_15-03-06.log
-rw-r--r-- 1 root root 5203 May 25 15:06 opatchauto2015-05-25_15-03-06.report.log
[root@NCR-Standby-Asm cfgtoollogs]# tail -n 100 opatchauto2015-05-25_15-03-06.log
(问题片段展示)
2015-05-25 15:06:42: Running as user grid: /u01/app/grid/product/11.2.0/grid/OPatch/opatch prereq CheckApplicable -ph /upload/gidir/20485808/20420937 -oh /u01/app/grid/product/11.2.0/grid -customLogDir /u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatchauto/core
2015-05-25 15:06:42: s_run_as_user2: Running /bin/su grid -c ' /u01/app/grid/product/11.2.0/grid/OPatch/opatch prereq CheckApplicable -ph /upload/gidir/20485808/20420937 -oh /u01/app/grid/product/11.2.0/grid -customLogDir /u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatchauto/core '
2015-05-25 15:06:44: Removing file /tmp/filemKCSmj
2015-05-25 15:06:44: Successfully removed file: /tmp/filemKCSmj
2015-05-25 15:06:44: /bin/su exited with rc=1
2015-05-25 15:06:44: Error : The opatch Applicable check failed. The patch /upload/gidir/20485808/20420937 is not applicable to /u01/app/grid/product/11.2.0/grid
2015-05-25 15:06:44: Running as user grid: /u01/app/grid/product/11.2.0/grid/OPatch/opatch prereq CheckApplicable -ph /upload/gidir/20485808/20299019 -oh /u01/app/grid/product/11.2.0/grid -customLogDir /u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatchauto/core
2015-05-25 15:06:44: s_run_as_user2: Running /bin/su grid -c ' /u01/app/grid/product/11.2.0/grid/OPatch/opatch prereq CheckApplicable -ph /upload/gidir/20485808/20299019 -oh /u01/app/grid/product/11.2.0/grid -customLogDir /u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatchauto/core '
2015-05-25 15:06:46: Removing file /tmp/filepuKZDC
2015-05-25 15:06:46: Successfully removed file: /tmp/filepuKZDC
2015-05-25 15:06:46: /bin/su successfully executed
2015-05-25 15:06:46: Status of Applicable check for /u01/app/grid/product/11.2.0/grid is 1
2015-05-25 15:06:46: Error:Patch Applicable check failed for /u01/app/grid/product/11.2.0/grid
2015-05-25 15:06:46: Executing cmd: /bin/rpm -q sles-release
2015-05-25 15:06:47: Command output:
> package sles-release is not installed
从片段中,我们发现两个线索:首先,虽然是root执行脚本,但是在过程中出现了一个“Running as user gird”提示。说明在执行脚本Check的时候,脚本实现了从root切换到gird的动作。另一个是报错的主因:是进行Check的时候。
那么,从重现问题的角度看。可以手工模拟一下出现问题:
[grid@NCR-Standby-Asm ~]$ /u01/app/grid/product/11.2.0/grid/OPatch/opatch prereq CheckApplicable -ph /upload/gidir/20485808/20420937 -oh /u01/app/grid/product/11.2.0/grid -customLogDir /u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatchauto/core
Oracle Interim Patch Installer version 11.2.0.3.10
Copyright (c) 2015, Oracle Corporation. All rights reserved.
PREREQ session
Oracle Home : /u01/app/grid/product/11.2.0/grid
Central Inventory : /u01/app/oraInventory
from : /u01/app/grid/product/11.2.0/grid/oraInst.loc
OPatch version : 11.2.0.3.10
OUI version : 11.2.0.4.0
Log file location : /u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2015-05-25_15-17-39PM_1.log
Invoking prereq "checkapplicable"
ZOP-46: The patch(es) are not applicable on the Oracle Home because some patch actions are not applicable. All required components, however, are installed.
Prereq "checkApplicable" for patch 20420937 failed.
The details are:
Patch 20420937:
Copy Action: Source File "/upload/gidir/20485808/20420937/files/bin/appvipcfg.pl" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'appvipcfg.pl' to '/u01/app/grid/product/11.2.0/grid/bin/appvipcfg.pl'
Copy Action: Source File "/upload/gidir/20485808/20420937/files/bin/oclumon.bin" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'oclumon.bin' to '/u01/app/grid/product/11.2.0/grid/bin/oclumon.bin'
Copy Action: Source File "/upload/gidir/20485808/20420937/files/bin/ologgerd" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'ologgerd' to '/u01/app/grid/product/11.2.0/grid/bin/ologgerd'
Copy Action: Source File "/upload/gidir/20485808/20420937/files/bin/osysmond.bin" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'osysmond.bin' to '/u01/app/grid/product/11.2.0/grid/bin/osysmond.bin'
Copy Action: Source File "/upload/gidir/20485808/20420937/files/crs/demo/coldfailover/act_db.pl" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'act_db.pl' to '/u01/app/grid/product/11.2.0/grid/crs/demo/coldfailover/act_db.pl'
Copy Action: Source File "/upload/gidir/20485808/20420937/files/crs/demo/coldfailover/act_listener.pl" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'act_listener.pl' to '/u01/app/grid/product/11.2.0/grid/crs/demo/coldfailover/act_listener.pl'
Copy Action: Source File "/upload/gidir/20485808/20420937/files/crs/demo/coldfailover/act_resgroup.pl" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'act_resgroup.pl' to '/u01/app/grid/product/11.2.0/grid/crs/demo/coldfailover/act_resgroup.pl'
Copy Action: Source File "/upload/gidir/20485808/20420937/files/crs/demo/demoActionScript" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'demoActionScript' to '/u01/app/grid/product/11.2.0/grid/crs/demo/demoActionScript'
onewaycopyAction : Source File "/upload/gidir/20485808/20420937/files/crs/install/tfa_setup.sh" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'tfa_setup.sh' to '/u01/app/grid/product/11.2.0/grid/crs/install/tfa_setup.sh'
Copy Action: Directory is not writeable: "/u01/app/grid/product/11.2.0/grid"
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'libsrvm11.so' to '/u01/app/grid/product/11.2.0/grid/oui/lib/linux/libsrvm11.so'
OPatch failed with error code 1
错误原因均为不能拷贝。查看其中一个文件存在情况。
[grid@NCR-Standby-Asm install]$ ls -l | grep tfa
-rwxr-x--- 1 root root 14530337 Mar 27 23:20 tfa_setup.sh
笔者猜测,系列文件的所有者为root,但是执行过程中却自动切换到grid用户。也许是文件权限出现问题。这种情况下,降低权限也是是不错的选择。
3、问题解决
将补丁包文件夹权限设置为grid,并且修改为755权限类型。
[root@NCR-Standby-Asm gidir]# chown -R grid:oinstall 20485808
[root@NCR-Standby-Asm gidir]# chmod -R 755 20485808/
[root@NCR-Standby-Asm gidir]# ls -l
total 1338920
drwxr-xr-x 5 grid oinstall 4096 Mar 27 23:33 20485808
重新尝试auto补丁。
[root@NCR-Standby-Asm OPatch]# ./opatch auto /upload/gidir/20485808 -ocmrf /u01/unconfig.rsp
Executing /u01/app/grid/product/11.2.0/grid/perl/bin/perl ./crs/patch11203.pl -patchdir /upload/gidir -patchn 20485808 -ocmrf /u01/unconfig.rsp -paramfile /u01/app/grid/product/11.2.0/grid/crs/install/crsconfig_params
This is the main log file: /u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatchauto2015-05-25_15-22-24.log
This file will show your detected configuration and all the steps that opatchauto attempted to do on your system:
/u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatchauto2015-05-25_15-22-24.report.log
2015-05-25 15:22:24: Starting Oracle Restart Patch Setup
Using configuration parameter file: /u01/app/grid/product/11.2.0/grid/crs/install/crsconfig_params
Stopping RAC /u02/app/oracle/product/11.2.0/dbhome_1 ...
Stopped RAC /u02/app/oracle/product/11.2.0/dbhome_1 successfully
patch /upload/gidir/20485808/20299013 apply successful for home /u02/app/oracle/product/11.2.0/dbhome_1
patch /upload/gidir/20485808/20420937/custom/server/20420937 apply successful for home /u02/app/oracle/product/11.2.0/dbhome_1
Stopping CRS...
Stopped CRS successfully
patch /upload/gidir/20485808/20299013 apply successful for home /u01/app/grid/product/11.2.0/grid
patch /upload/gidir/20485808/20420937 apply successful for home /u01/app/grid/product/11.2.0/grid
patch /upload/gidir/20485808/20299019 apply successful for home /u01/app/grid/product/11.2.0/grid
Starting CRS...
CRS-4123: Oracle High Availability Services has been started.
Starting RAC /u02/app/oracle/product/11.2.0/dbhome_1 ...
Started RAC /u02/app/oracle/product/11.2.0/dbhome_1 successfully
opatch auto succeeded.
使用lsinventory确认补丁。
[grid@NCR-Standby-Asm ~]$ opatch lsinventory
Oracle Interim Patch Installer version 11.2.0.3.10
Copyright (c) 2015, Oracle Corporation. All rights reserved.
Oracle Home : /u01/app/grid/product/11.2.0/grid
Central Inventory : /u01/app/oraInventory
from : /u01/app/grid/product/11.2.0/grid/oraInst.loc
OPatch version : 11.2.0.3.10
OUI version : 11.2.0.4.0
Log file location : /u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatch/opatch2015-05-25_15-36-53PM_1.log
Lsinventory Output file location : /u01/app/grid/product/11.2.0/grid/cfgtoollogs/opatch/lsinv/lsinventory2015-05-25_15-36-53PM.txt
--------------------------------------------------------------------------------
Local Machine Information::
Hostname: localhost
ARU platform id: 226
ARU platform description:: Linux x86-64
Installed Top-level Products (1):
Oracle Grid Infrastructure 11g 11.2.0.4.0
There are 1 products installed in this Oracle Home.
Interim patches (4) :
Patch 20299019 : applied on Mon May 25 15:30:25 CST 2015
Unique Patch ID: 18573450
Patch description: "ACFS Patch set update : 11.2.0.4.6 (20299019)"
Created on 27 Mar 2015, 15:26:30 hrs
Bugs fixed:
17510275, 17172303, 16318126, 19690653, 17203009, 17376318, 17503605
20140148, 17611362, 17721778, 17164243, 19053182, 17696547, 17699423
(篇幅原因,有省略……)
Patch 19852360 : applied on Mon May 25 10:02:48 CST 2015
Unique Patch ID: 18170553
Created on 20 Oct 2014, 08:17:43 hrs PST8PDT
Bugs fixed:
19852360
--------------------------------------------------------------------------------
OPatch succeeded.
补丁成功!
4、结论
GI问题常常伴随执行脚本的错误。依据日志信息,逐步深入、抽丝剥茧往往可以获得比较好的问题解决。
: