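Starting with Oracle 11gR2, ASM (Automatic Storage Management) was split out of the Database component and, as a standalone component, now falls under Grid Infrastructure management.
This post describes a problem I ran into while a database was starting up and mounting. My handling differs in places from the approach recommended on MOS, so I am writing it down for anyone who may need it later.
1. Problem description
My environment is a single-instance Oracle database with Grid Infrastructure, version 11.2.0.4. For security reasons I downloaded the latest security and upgrade patches from MOS. After patching, the version was 11.2.0.4.6.
However, the final step of the upgrade, running the SQL script, ran into trouble.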
SQL*Plus: Release 11.2.0.4.0 Production on Mon May 25 16:08:57 2015
Copyright (c) 1982, 2013, Oracle. All rights reserved.
SQL> conn / as sysdba
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 2087780352 bytes
Fixed Size 2254824 bytes
Variable Size 553650200 bytes
Database Buffers 1526726656 bytes
Redo Buffers 5148672 bytes
ORA-00205: error in identifying control file, check alert log for more info
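Judging from the messages, Oracle got through the nomount stage of startup and then failed while locating the control file.
Honestly, even though this was only a test environment, I was rather alarmed. So I tried starting the database through srvctl, the clusterware way.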
[grid@NCR-Standby-Asm ~]$ srvctl start database -d sicsstb
PRCC-1014 : sicsstb was already running
PRCR-1004 : Resource ora.sicsstb.db is already running
PRCR-1079 : Failed to start resource ora.sicsstb.db
CRS-5702: Resource 'ora.sicsstb.db' is already running on 'ncr-standby-asm'
2. Problem analysis
First, confirm whether the database can be started with srvctl, and check the state of the resources registered in GI.
[grid@NCR-Standby-Asm ~]$ srvctl stop database -d sicsstb
[grid@NCR-Standby-Asm ~]$ srvctl status asm
ASM is running on ncr-standby-asm
[grid@NCR-Standby-Asm ~]$ crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       ncr-standby-asm
ora.LISTENER.lsnr
               ONLINE  ONLINE       ncr-standby-asm
ora.RECO.dg
               ONLINE  ONLINE       ncr-standby-asm
ora.asm
               ONLINE  ONLINE       ncr-standby-asm          Started
ora.ons
               OFFLINE OFFLINE      ncr-standby-asm
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       ncr-standby-asm
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       ncr-standby-asm
ora.sicsstb.db
      1        OFFLINE OFFLINE                               Instance Shutdown
[grid@NCR-Standby-Asm ~]$ srvctl start database -d sicsstb
[grid@NCR-Standby-Asm ~]$
[oracle@NCR-Standby-Asm ~]$ cd $ORACLE_HOME/rdbms/admin
[oracle@NCR-Standby-Asm admin]$ sqlplus /nolog
SQL*Plus: Release 11.2.0.4.0 Production on Mon May 25 16:14:00 2015
Copyright (c) 1982, 2013, Oracle. All rights reserved.
SQL> conn / as sysdba
Connected.
SQL> select open_mode from v$database;
OPEN_MODE
--------------------
READ WRITE
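My guess was that the failure was related to ASM. Peeling the problem back layer by layer, I started with the database alert log and located the entries for the failed startup.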
Mon May 25 16:09:28 2015
MMON started with pid=17, OS id=4151
Mon May 25 16:09:28 2015
MMNL started with pid=18, OS id=4153
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
NOTE: initiating MARK startup
Starting background process MARK
ORACLE_BASE from environment = /u02/app/oracle
Mon May 25 16:09:28 2015
MARK started with pid=21, OS id=4161
NOTE: MARK has subscribed
Mon May 25 16:09:28 2015
ALTER DATABASE MOUNT
Mon May 25 16:09:28 2015
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=127.0.0.1)(PORT=1521))' SCOPE=MEMORY SID='sicsstb';
Errors in file /u02/app/oracle/diag/rdbms/sicsstb/sicsstb/trace/sicsstb_rbal_4147.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
WARNING: FAILED to load library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
Errors in file /u02/app/oracle/diag/rdbms/sicsstb/sicsstb/trace/sicsstb_rbal_4147.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
SUCCESS: diskgroup DATA was dismounted
ERROR: diskgroup DATA was not mounted
Errors in file /u02/app/oracle/diag/rdbms/sicsstb/sicsstb/trace/sicsstb_rbal_4147.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
WARNING: FAILED to load library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
Errors in file /u02/app/oracle/diag/rdbms/sicsstb/sicsstb/trace/sicsstb_rbal_4147.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
SUCCESS: diskgroup RECO was dismounted
ERROR: diskgroup RECO was not mounted
ORA-00210: cannot open the specified control file
ORA-00202: control file: '+RECO/sicsstb/controlfile/current.256.878897845'
ORA-17503: ksfdopn:2 Failed to open file +RECO/sicsstb/controlfile/current.256.878897845
ORA-15001: diskgroup "RECO" does not exist or is not mounted
ORA-15040: diskgroup is incomplete
ORA-15040: diskgroup is incomplete
ORA-00210: cannot open the specified control file
ORA-00202: control file: '+DATA/sicsstb/controlfile/current.260.878897845'
ORA-17503: ksfdopn:2 Failed to open file +DATA/sicsstb/controlfile/current.260.878897845
ORA-15001: diskgroup "DATA" does not exist or is not mounted
ORA-15040: diskgroup is incomplete
ORA-15040: diskgroup is incomplete
ORA-15040: diskgroup is incomplete
ORA-205 signalled during: ALTER DATABASE MOUNT...
Errors in file /u02/app/oracle/diag/rdbms/sicsstb/sicsstb/trace/sicsstb_rbal_4147.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
WARNING: FAILED to load library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
Errors in file /u02/app/oracle/diag/rdbms/sicsstb/sicsstb/trace/sicsstb_rbal_4147.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
Mon May 25 16:09:31 2015
SUCCESS: diskgroup DATA was dismounted
ERROR: diskgroup DATA was not mounted
Errors in file /u02/app/oracle/diag/rdbms/sicsstb/sicsstb/trace/sicsstb_rbal_4147.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
WARNING: FAILED to load library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
Errors in file /u02/app/oracle/diag/rdbms/sicsstb/sicsstb/trace/sicsstb_rbal_4147.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
SUCCESS: diskgroup RECO was dismounted
ERROR: diskgroup RECO was not mounted
Errors in file /u02/app/oracle/diag/rdbms/sicsstb/sicsstb/trace/sicsstb_rbal_4147.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
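The messages show that during the mount stage Oracle used the control file locations specified in the spfile to access the +DATA and +RECO disk groups, but neither disk group could be mounted, which is what triggered the errors.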
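An alert log excerpt this long is easier to read once condensed. The following is a minimal sketch, assuming the excerpt is piped in on stdin and that the message wording matches the log above; the function names `unmounted_diskgroups` and `ora_code_counts` are hypothetical helpers of my own, not Oracle tools.

```shell
#!/bin/sh
# Hypothetical helpers for condensing an alert log excerpt read from stdin.

# Print each disk group that the log reports as "was not mounted", once.
unmounted_diskgroups() {
    sed -n 's/^ERROR: diskgroup \([^ ]*\) was not mounted.*$/\1/p' | sort -u
}

# Print "<count> <ORA code>" for every distinct ORA- code, most frequent first.
ora_code_counts() {
    grep -o 'ORA-[0-9][0-9]*' | sort | uniq -c | sort -rn
}
```

For example, `unmounted_diskgroups < alert_sicsstb.log` (the file name is an assumption) would list DATA and RECO for the excerpt above, and `ora_code_counts` makes it obvious that ORA-15183 dominates the log.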
According to the parameters, the control file is multiplexed, with one copy in each of the two ASM disk groups.
SQL> show parameter spfile
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
spfile string +DATA/sicsstb/spfilesicsstb.ora
SQL> show parameter control
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
control_file_record_keep_time integer 7
control_files string +DATA/sicsstb/controlfile/curr
ent.260.878897845, +RECO/sicsstb/controlfile/current.256.878
897845
control_management_pack_access string DIAGNOSTIC+TUNING
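To see at a glance which disk groups the mount stage depends on, the control_files value can be reduced to its disk group names. This is only a sketch: `controlfile_diskgroups` is a hypothetical helper, and it assumes the value is passed as a single comma-separated string with the SQL*Plus line wrapping already removed.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct ASM disk groups named in a
# control_files value. Paths that do not start with "+" are ignored.
# $1: comma-separated list, e.g. "+DATA/db/controlfile/a, +RECO/db/controlfile/b"
controlfile_diskgroups() {
    printf '%s\n' "$1" |
        tr ',' '\n' |
        sed -n 's|^[[:space:]]*[+]\([^/]*\)/.*$|\1|p' |
        sort -u
}
```

Called with the value shown above, it prints DATA and RECO, the same two disk groups that the alert log reports as not mounted.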
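Note: the ASM failure here was not caused by my forgetting to start ASM. If ASM had simply been down when the database was started, the errors would have looked like this: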
[oracle@NCR-Standby-Asm ~]$ sqlplus /nolog
SQL*Plus: Release 11.2.0.4.0 Production on Mon Jun 1 08:39:11 2015
Copyright (c) 1982, 2013, Oracle. All rights reserved.
SQL> conn / as sysdba
Connected to an idle instance.
SQL> startup
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA/sicsstb/spfilesicsstb.ora'
ORA-17503: ksfdopn:10 Failed to open file +DATA/sicsstb/spfilesicsstb.ora
ORA-15077: could not locate ASM instance serving a required diskgroup
The nomount stage has to read the spfile, and our spfile lives in +DATA, so if ASM were really unavailable the instance could not even reach nomount.
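The difference between the two failure modes can be captured in a small check. The sketch below is an illustration under my own assumptions: `classify_startup_failure` is a hypothetical function, the labels it prints are made up for this post, and it looks only at the two error codes seen in the sessions above plus the usual "Database opened." success message.

```shell
#!/bin/sh
# Hypothetical helper: read SQL*Plus startup output on stdin and print
# a coarse label for where the startup stopped.
classify_startup_failure() {
    out=$(cat)
    case "$out" in
        # spfile in ASM could not be read: nomount was never reached
        *ORA-15077*) echo "asm-unreachable" ;;
        # instance started (nomount done), control file not identified at mount
        *ORA-00205*) echo "mount-failed" ;;
        *"Database opened."*) echo "opened" ;;
        *) echo "unknown" ;;
    esac
}
```

Fed the first session in this post it prints `mount-failed`; fed the session just above it prints `asm-unreachable`. That matches the reasoning here: ASM itself was reachable, and the fault lies further down.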
The messages suggest a problem with my ASM driver. The operating system is Red Hat Enterprise Linux 6.5, with kmod-oracleasm as the ASM kernel driver.
[root@NCR-Standby-Asm ~]# rpm -qa | grep asm
libatasmart-0.17-4.el6_2.x86_64
oracleasmlib-2.0.4-1.el6.x86_64
oracleasm-support-2.1.8-1.el6.x86_64
kmod-oracleasm-2.0.6.rh1-3.el6_5.x86_64
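The same check can be scripted. This is a rough sketch that assumes the three ASMLib-related packages in the listing above are the ones that matter on this platform; `missing_asmlib_packages` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read an `rpm -qa` style listing on stdin and print
# every ASMLib-related package (as seen in this post) that is absent.
missing_asmlib_packages() {
    listing=$(cat)
    for pkg in oracleasm-support oracleasmlib kmod-oracleasm; do
        # A package line looks like "<name>-<version>-<release>.<arch>".
        printf '%s\n' "$listing" | grep -q "^${pkg}-[0-9]" || echo "$pkg"
    done
}
```

Running `rpm -qa | missing_asmlib_packages` on this host prints nothing, since all three packages are installed. Installed packages do not prove that the driver is loaded or that /dev/oracleasm is mounted; as far as I know, `oracleasm status` from oracleasm-support reports that separately.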
The corresponding trace file gives a more detailed description of the problem.
[root@NCR-Standby-Asm trace]# tail -n 200 sicsstb_rbal_4147.trc
Trace file /u02/app/oracle/diag/rdbms/sicsstb/sicsstb/trace/sicsstb_rbal_4147.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u02/app/oracle/product/11.2.0/dbhome_1
System name: Linux
Node name: NCR-Standby-Asm
Release: 2.6.32-431.el6.x86_64
Version: #1 SMP Sun Nov 10 22:19:54 EST 2013
Machine: x86_64
VM name: VMWare Version: 6
Instance name: sicsstb
Redo thread mounted by this instance: 0
Oracle process number: 15
Unix process pid: 4147, image: oracle@NCR-Standby-Asm (RBAL)
*** 2015-05-25 16:09:31.634
*** SESSION ID:(190.1) 2015-05-25 16:09:31.634
*** CLIENT ID:() 2015-05-25 16:09:31.634
*** SERVICE NAME:() 2015-05-25 16:09:31.634
*** MODULE NAME:() 2015-05-25 16:09:31.634
*** ACTION NAME:() 2015-05-25 16:09:31.634
ERROR: asm_version error. err: driver/agent not installed rc:2
ORA-15183: ASMLIB initialization error [driver/agent not installed]
ORA-15183: ASMLIB initialization error [driver/agent not installed]
ERROR: asm_version error. err: driver/agent not installed rc:2
ORA-15183: ASMLIB initialization error [driver/agent not installed]
ORA-15183: ASMLIB initialization error [driver/agent not installed]
ERROR: asm_version error. err: driver/agent not installed rc:2
ORA-15183: ASMLIB initialization error [driver/agent not installed]
ORA-15183: ASMLIB initialization error [driver/agent not installed]
ERROR: asm_version error. err: driver/agent not installed rc:2
ORA-15183: ASMLIB initialization error [driver/agent not installed]
ORA-15183: ASMLIB initialization error [driver/agent not installed]
ERROR: asm_version error. err: driver/agent not installed rc:2
ORA-15183: ASMLIB initialization error [driver/agent not installed]
ORA-15183: ASMLIB initialization error [driver/agent not installed]
Incident 9721 created, dump file: /u02/app/oracle/diag/rdbms/sicsstb/sicsstb/incident/incdir_9721/sicsstb_rbal_4147_i9721.trc
ORA-00600: internal error code, arguments: [kfdskAlloc0], [], [], [], [], [], [], [], [], [], [], []
error 488 detected in background process
ORA-00600: internal error code, arguments: [kfdskAlloc0], [], [], [], [], [], [], [], [], [], [], []
kjzduptcctx: Notifying DIAG for crash event
----- Abridged Call Stack Trace -----
ksedsts()+465