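A test-environment Oracle 11.2.0.3 database on Linux at our company was running without archive logging and had no backup. On the evening of the 21st, the / filesystem reached 100% usage; the Oracle home directory lives under /.
With the disk completely full, nobody could connect and the Oracle database hung.
Someone then checked and saw that the undotbs1 data file was the largest file, and moved it to another directory with mv. The system was rebooted at the same time, which corrupted the undotbs1 data file and made it unusable. Finally an rm was run on it and the database was restarted, which produced the failures below.
Error 1: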
Wed Jan 22 09:42:50 2014
ALTER DATABASE OPEN
Errors in file /u01/app/oracle/diag/rdbms/gtadata13/gtadata13/trace/gtadata13_dbw0_4245.trc:
ORA-01157: cannot identify/lock data file 3 - see DBWR trace file
ORA-01110: data file 3: '/u01/app/oracle/oradata/gtadata13/undotbs01.dbf'
ORA-27047: unable to read the header block of file
Linux-x86_64 Error: 25: Inappropriate ioctl for device
Additional information: 1
Wed Jan 22 09:42:52 2014
Checker run found 1 new persistent data failures
Errors in file /u01/app/oracle/diag/rdbms/gtadata13/gtadata13/trace/gtadata13_ora_4361.trc:
ORA-01157: cannot identify/lock data file 3 - see DBWR trace file
ORA-01110: data file 3: '/u01/app/oracle/oradata/gtadata13/undotbs01.dbf'
ORA-1157 signalled during: ALTER DATABASE OPEN...
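--- In other words, after mounting, Oracle could not go on to the open state.
2 Next step: since undotbs1, the data file of the undo tablespace, was gone, the idea was to create a new undo tablespace, undotbs2, and get the database to the open state.
Steps: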
SQL> show parameter undo
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
undo_management string AUTO
undo_retention integer 900
undo_tablespace string UNDOTBS1
SQL> CREATE UNDO TABLESPACE UNDOTBS2 DATAFILE '/XXXX.DBF' SIZE 32M AUTOEXTEND ON NEXT 32M MAXSIZE 10G;   -- create the new undo tablespace
SQL> SELECT * FROM V$TABLESPACE;
SQL> SELECT NAME,STATUS FROM V$DATAFILE;   -- check the status values
SQL> ALTER SYSTEM SET UNDO_TABLESPACE=UNDOTBS2 SCOPE=BOTH;   -- then run show parameter undo to confirm it is in use
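To confirm that the switch took effect, something like the following could be run. This is only a sketch: it uses the standard views dba_tablespaces and v$parameter, and assumes the new tablespace is named UNDOTBS2 as in the statements above.

```sql
-- List all undo tablespaces and their status.
SELECT tablespace_name, status, contents
  FROM dba_tablespaces
 WHERE contents = 'UNDO';

-- Show which undo tablespace the instance is currently configured to use.
SELECT name, value
  FROM v$parameter
 WHERE name = 'undo_tablespace';
```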
3 At this point the database could be opened, but connecting from a client or as any other user failed with:
Error 2
SQL> conn input/INPUT
ERROR:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/app/oracle/oradata/gtadata13/undotbs01.dbf'
ORA-02002: error while writing to audit trail
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/app/oracle/oradata/gtadata13/undotbs01.dbf'
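4 The error shows that the problem is not only the undotbs1 data file: auditing is also enabled, and writing the audit trail fails. So,
turn auditing off first
SQL> SHOW PARAMETER AUDIT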
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
audit_file_dest string /u01/app/oracle/admin/gtadata13/adump
audit_sys_operations boolean FALSE
audit_syslog_level string
audit_trail string DB
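SQL> alter system set audit_trail=none scope=spfile;   -- takes effect only after the database is restarted; for details see the notes on auditing
5 Next, deal with the undotbs1 data file by taking it offline (to see whether that works)
SQL> alter database datafile 3 offline drop;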
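A quick way to see what the offline drop did is to look at the file's status as recorded in the control file. A minimal sketch, assuming file 3 is still the damaged undo file; v$datafile and v$recover_file are standard views:

```sql
-- Status of the damaged file as recorded in the control file.
SELECT file#, status, enabled, name
  FROM v$datafile
 WHERE file# = 3;

-- Files that Oracle considers in need of media recovery, with the reason.
SELECT file#, online_status, error
  FROM v$recover_file;
```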
6 Check the state of the tablespaces and data files through v$logfile, dba_tablespaces and dba_data_files:
SQL> select tablespace_name,file_id,file_name from dba_data_files;
TABLESPACE_NAME FILE_ID FILE_NAME
------- ---------- -------------------------------------------------------------
USERS 4 /u01/app/oracle/oradata/gtadata13/users01.dbf
UNDOTBS1 3 /u01/app/oracle/oradata/gtadata13/undotbs01.dbf
SQL> select status,tablespace_name from dba_tablespaces;
STATUS TABLESPACE_NAME
--------- ------------------------------
ONLINE SYSTEM
ONLINE SYSAUX
ONLINE UNDOTBS1
7 This shows that the undotbs1 data file is still registered and that the undotbs1 tablespace is still ONLINE.
So the next attempt was:
Error 3
SQL> alter tablespace UNDOTBS1 offline;
alter tablespace UNDOTBS1 offline
*
ERROR at line 1:
ORA-01191: file 3 is already offline - cannot do a normal offline
ORA-01110: data file 3: '/u01/app/oracle/oradata/gtadata13/undotbs01.dbf'
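--- At this point I wondered why it could not be taken offline, and whether a temporary offline would work instead.
Query the data file headers: select FILE#,checkpoint_change#,recover,fuzzy from v$datafile_header;
Finally, take a checkpoint and try again:
SQL> alter system checkpoint;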
System altered.
SQL> alter tablespace undotbs1 offline temporary;
Tablespace altered.
Checking dba_tablespaces again shows that undotbs1 is now OFFLINE.
8 Test again whether other users or a client can connect:
-- It works: other users can connect now, but some programs still fail:
Error 4:
Stored procedure execution failed ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/app/oracle/oradata/gtadata13/undotbs01.dbf'
ORA-06512: at "GTA_DATA.SP_QA_TIMELINESS", line 54
ORA-06512: at line 1
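On reflection this makes sense: undotbs1 was deleted physically, so to keep the data consistent Oracle still needs that undo for recovery:
9 Now that it is offline, can it be dropped? (Probably troublesome: the undo itself has been wiped out, so how can anything be rolled back?)
dba_rollback_segs shows that many undotbs1 segments are still marked NEEDS RECOVERY; they have to be rolled back to keep the data consistent.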
SQL> select segment_name,tablespace_name,status from dba_rollback_segs;
SEGMENT_NAME TABLESPACE_NAME STATUS
------------------------------ ------------------------------ ----------------
SYSTEM SYSTEM ONLINE
_SYSSMU122_928896348$ UNDOTBS1 OFFLINE
_SYSSMU121_4101333926$ UNDOTBS1 OFFLINE
_SYSSMU120_471964226$ UNDOTBS1 OFFLINE
_SYSSMU119_3645569891$ UNDOTBS1 OFFLINE
_SYSSMU118_1816999230$ UNDOTBS1 OFFLINE
_SYSSMU117_3513527861$ UNDOTBS1 OFFLINE
_SYSSMU116_2167311593$ UNDOTBS1 OFFLINE
_SYSSMU90_1969094056$ UNDOTBS1 NEEDS RECOVERY
_SYSSMU89_2804401042$ UNDOTBS1 NEEDS RECOVERY
_SYSSMU88_3446396459$ UNDOTBS1 NEEDS RECOVERY
_SYSSMU87_268667266$ UNDOTBS1 NEEDS RECOVERY
_SYSSMU86_1912503840$ UNDOTBS1 NEEDS RECOVERY
_SYSSMU85_2732352333$ UNDOTBS1 NEEDS RECOVERY
_SYSSMU84_1805825668$ UNDOTBS1 NEEDS RECOVERY
_SYSSMU83_1984855352$ UNDOTBS1 NEEDS RECOVERY
_SYSSMU212_1777710046$ UNDOTBS2 ONLINE
_SYSSMU211_3260590093$ UNDOTBS2 ONLINE
_SYSSMU210_1915944113$ UNDOTBS2 ONLINE
_SYSSMU209_2868303011$ UNDOTBS2 ONLINE
_SYSSMU208_3687438092$ UNDOTBS2 ONLINE
_SYSSMU207_752508113$ UNDOTBS2 ONLINE
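The listing is easier to read when summarised per tablespace and status. A small sketch that uses only the columns shown above:

```sql
-- Count rollback segments per tablespace and status.
SELECT tablespace_name, status, COUNT(*) AS segments
  FROM dba_rollback_segs
 GROUP BY tablespace_name, status
 ORDER BY tablespace_name, status;
```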
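At this point I searched Baidu and asked some experts, who said it was best to take a backup first. So I tried an export/import with Data Pump (expdp/impdp):
Error 5: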
[oracle@gtadata13 dump_dir]$ impdp dcsys/DCSYS directory=dump_dir dumpfile=TBL_CHN_FN_ForecFin.dmp
Import: Release 11.2.0.3.0 - Production on Wed Jan 22 14:40:30 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORA-31626: job does not exist
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
ORA-06512: at "SYS.KUPV$FT", line 1042
ORA-31637: cannot create job SYS_IMPORT_FULL_01 for user DCSYS
ORA-31632: master table "DCSYS.SYS_IMPORT_FULL_01" not found, invalid, or inaccessible
ORA-31635: unable to establish job resource synchronization
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
ORA-06512: at "SYS.KUPV$FT_INT", line 2401
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/app/oracle/oradata/gtadata13/undotbs01.dbf'
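-- That does not work either; it looks like this has to be fixed the hard way.
10 , Try dropping the offline undotbs1 tablespace, to see whether that gets past the problem:
Error 6: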
SQL> drop tablespace undotbs1;
drop tablespace undotbs1
*
ERROR at line 1:
ORA-01548: active rollback segment '_SYSSMU1_1240252155$' found, terminate dropping tablespace
SQL> DROP ROLLBACK SEGMENT "_SYSSMU1_1240252155$";
DROP ROLLBACK SEGMENT "_SYSSMU1_1240252155$"
*
ERROR at line 1:
ORA-30025: DROP segment '_SYSSMU1_1240252155$' (in undo tablespace) not allowed
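After more searching on Baidu and asking the experts again: the hidden parameters _offline_rollback_segments ('xx') and _corrupted_rollback_segments ('xx') have to be added to the pfile before dropping, to see whether the check can be bypassed.
Add these parameters to the pfile
_offline_rollback_segments=('')
_corrupted_rollback_segments=('')   --- the values in parentheses are the undotbs1 segment names whose status in dba_rollback_segs is NEEDS RECOVERY, for example "_SYSSMU90_1969094056$"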
10 : So add the hidden parameters either in the pfile, or with
alter system set "_offline_rollback_segments" = 'value' scope=spfile;
alter system set "_corrupted_rollback_segments" = 'value' scope=spfile;
At the time I did it by rebuilding the pfile with these entries:
*._offline_rollback_segments=('_SYSSMU90_1969094056$',...)
*._corrupted_rollback_segments=('_SYSSMU90_1969094056$',...)
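Typing the segment list by hand is error-prone. As a sketch, the quoted, comma-separated value could be generated from dba_rollback_segs. LISTAGG is available from 11.2 onward; the filter assumes that only the NEEDS RECOVERY segments of UNDOTBS1 are wanted, and whether the OFFLINE ones should be listed too is not settled by the steps described here.

```sql
-- Build a quoted, comma-separated list of segment names for the hidden parameters.
SELECT LISTAGG('''' || segment_name || '''', ',')
         WITHIN GROUP (ORDER BY segment_id) AS segment_list
  FROM dba_rollback_segs
 WHERE tablespace_name = 'UNDOTBS1'
   AND status = 'NEEDS RECOVERY';
```

The result would be pasted between the parentheses of both parameters in the pfile.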
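Then drop every undotbs1 segment listed in dba_rollback_segs, and after that drop the undotbs1 tablespace:
SQL> drop rollback segment "_SYSSMU1_1240252155$";   --- note: no spaces inside the double quotes
Rollback segment dropped.   --- repeat for each segment name, one at a time.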
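Rather than typing each statement, the DROP commands could be generated from the dictionary. A sketch, assuming every remaining UNDOTBS1 segment is to be dropped; the output is meant to be reviewed and then run by hand:

```sql
-- Generate one DROP ROLLBACK SEGMENT statement per remaining UNDOTBS1 segment.
SELECT 'DROP ROLLBACK SEGMENT "' || segment_name || '";' AS drop_stmt
  FROM dba_rollback_segs
 WHERE tablespace_name = 'UNDOTBS1'
 ORDER BY segment_id;
```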
11 : Finally drop the undotbs1 tablespace
--- OK, the drop succeeds, and dba_rollback_segs no longer lists any segment in the undotbs1 tablespace.
SQL> select segment_name,tablespace_name,status from dba_rollback_segs;
SEGMENT_NAME TABLESPACE_NAME STATUS
------------------------------ ------------------------------ ----------------
SYSTEM SYSTEM ONLINE
_SYSSMU212_1777710046$ UNDOTBS2 ONLINE
_SYSSMU211_3260590093$ UNDOTBS2 ONLINE
_SYSSMU210_1915944113$ UNDOTBS2 ONLINE
_SYSSMU209_2868303011$ UNDOTBS2 ONLINE
_SYSSMU208_3687438092$ UNDOTBS2 ONLINE
_SYSSMU207_752508113$ UNDOTBS2 ONLINE
_SYSSMU206_883733676$ UNDOTBS2 ONLINE
_SYSSMU205_725465268$ UNDOTBS2 ONLINE
_SYSSMU204_1401227473$ UNDOTBS2 ONLINE
_SYSSMU203_3100642042$ UNDOTBS2 ONLINE
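The DROP statement used in step 11 is not shown. A plausible form, offered as an assumption rather than the exact command that was run, is:

```sql
-- Drop the old undo tablespace once none of its segments remain.
-- INCLUDING CONTENTS AND DATAFILES also removes the data file entry
-- (the file itself had already been deleted at OS level).
DROP TABLESPACE undotbs1 INCLUDING CONTENTS AND DATAFILES;

-- Confirm that UNDOTBS1 no longer appears.
SELECT tablespace_name, status
  FROM dba_tablespaces;
```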
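12 : Wrap-up: a: restore the original auditing setting
b: switch logs a few times and check the business data
c: this procedure gets the database working again, but some business data is lost
d: take proper backups
e: as the master says: when trouble comes, do not panic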
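Items a and b of the wrap-up might look like the following. This is a sketch: audit_trail=DB is the value shown in the earlier SHOW PARAMETER AUDIT output, reading item b as redo log switches is an assumption, and SCOPE=SPFILE assumes the instance runs from an spfile (if it was started from the rebuilt pfile, audit_trail would be edited in that file instead).

```sql
-- a: restore the auditing setting (takes effect after a restart).
ALTER SYSTEM SET audit_trail=DB SCOPE=SPFILE;

-- b: force a few redo log switches, then check the alert log for errors.
ALTER SYSTEM SWITCH LOGFILE;
ALTER SYSTEM SWITCH LOGFILE;
ALTER SYSTEM SWITCH LOGFILE;
```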