有朋友跟我说了一个关于ORA-600[4000]错误的恢复,他是这么做的:
1、 插入了2000条记录,但不commit;
2、 马上shutdown abort;
3、 接着rm掉所有的redo log;
他尝试过一些恢复手段后,碰到了ORA-600[4000]错误。
此时他陷入了一种两难的境地:如果他不用_corrupted_rollback_segments,则上述ORA-600[4000]无法
解决;如果他用了_corrupted_rollback_segments,则Oracle报错:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number with name "" too small
我看了他发过来的trace文件和操作步骤,里面还是有一些不必要的操作。
这样吧,我在这里就把他做过的事情再做一遍,然后我会尝试恢复上述数据库,希望如下的过程能对朋友们有所帮助:
首先我插入2000条记录但不commit:
SQL_testdb>conn scott/tiger@testdb;
Connected.
SQL_testdb>create table testtb (type number,ts timestamp);
Table created.
SQL_testdb>begin
2 for i in 1..2000 loop
3 insert into testtb values (i,sysdate);
4 end loop;
5 end;
6 /
PL/SQL procedure successfully completed.
然后起另外一个session,执行shutdown abort:
SQL_testdb>shutdown abort
ORACLE instance shut down.
最后我把所有的redo log都rm掉:
$ ls -l
total 2543072
-rw-r----- 1 oracle dba 1531904 Dec 23 18:59 control01.ctl
-rw-r----- 1 oracle dba 1531904 Dec 23 18:59 control02.ctl
-rw-r----- 1 oracle dba 1531904 Dec 23 18:59 control03.ctl
-rw-r--r-- 1 oracle dba 20979712 Dec 23 18:27 cwmlite01.dbf
-rw-r--r-- 1 oracle dba 20979712 Dec 23 18:27 drsys01.dbf
-rw-r--r-- 1 oracle dba 144842752 Dec 23 18:27 example01.dbf
-rw-r--r-- 1 oracle dba 26222592 Dec 23 18:27 indx01.dbf
-rw-r--r-- 1 oracle dba 18 Dec 23 18:47 login.sql
-rw-r--r-- 1 oracle dba 20979712 Dec 23 18:27 odm01.dbf
-rw-r----- 1 oracle dba 104858112 Dec 23 18:27 redo01.log
-rw-r----- 1 oracle dba 104858112 Dec 23 18:27 redo02.log
-rw-r----- 1 oracle dba 104858112 Dec 23 18:59 redo03.log
-rw-r--r-- 1 oracle dba 419438592 Dec 23 18:58 system01.dbf
-rw-r--r-- 1 oracle dba 42999808 Dec 23 18:17 temp01.dbf
-rw-r--r-- 1 oracle dba 10493952 Dec 23 18:27 tools01.dbf
-rw-r--r-- 1 oracle dba 209723392 Dec 23 18:59 undotbs01.dbf
-rw-r--r-- 1 oracle dba 26222592 Dec 23 18:27 users01.dbf
-rw-r--r-- 1 oracle dba 39985152 Dec 23 18:27 xdb01.dbf
$rm redo*.log
$ ls -l
total 1928648
-rw-r----- 1 oracle dba 1531904 Dec 23 18:59 control01.ctl
-rw-r----- 1 oracle dba 1531904 Dec 23 18:59 control02.ctl
-rw-r----- 1 oracle dba 1531904 Dec 23 18:59 control03.ctl
-rw-r--r-- 1 oracle dba 20979712 Dec 23 18:27 cwmlite01.dbf
-rw-r--r-- 1 oracle dba 20979712 Dec 23 18:27 drsys01.dbf
-rw-r--r-- 1 oracle dba 144842752 Dec 23 18:27 example01.dbf
-rw-r--r-- 1 oracle dba 26222592 Dec 23 18:27 indx01.dbf
-rw-r--r-- 1 oracle dba 18 Dec 23 18:47 login.sql
-rw-r--r-- 1 oracle dba 20979712 Dec 23 18:27 odm01.dbf
-rw-r--r-- 1 oracle dba 419438592 Dec 23 18:58 system01.dbf
-rw-r--r-- 1 oracle dba 42999808 Dec 23 18:17 temp01.dbf
-rw-r--r-- 1 oracle dba 10493952 Dec 23 18:27 tools01.dbf
-rw-r--r-- 1 oracle dba 209723392 Dec 23 18:59 undotbs01.dbf
-rw-r--r-- 1 oracle dba 26222592 Dec 23 18:27 users01.dbf
-rw-r--r-- 1 oracle dba 39985152 Dec 23 18:27 xdb01.dbf
好了,现在我们来开始恢复。
现在直接open resetlog肯定是打不开的:
SQL_testdb>startup mount
ORACLE instance started.
Total System Global Area 505382744 bytes
Fixed Size 743256 bytes
Variable Size 285212672 bytes
Database Buffers 218103808 bytes
Redo Buffers 1323008 bytes
Database mounted.
SQL_testdb>recover database until cancel;
ORA-00279: change 188425 generated at 12/23/2009 18:27:25 needed for thread 1
ORA-00289: suggestion : /cadrasu01/app/oracle/product/9.2.0/dbs/arch1_1.dbf
ORA-00280: change 188425 for thread 1 is in sequence #1
Specify log: {=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/dras21/testdb/system01.dbf'
ORA-01112: media recovery not started
SQL_testdb>alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/dras21/testdb/system01.dbf'
我们来创建pfile:
SQL_testdb>create pfile='/dras21/testdb/inittestdb.ora' from spfile;
File created.
然后我们在pfile中加入*._allow_resetlogs_corruption=TRUE后再次open resetlog:
SQL_testdb>startupmountpfile='/dras21/testdb/inittestdb.ora';
ORACLE instance started.
Total System Global Area 505382744 bytes
Fixed Size 743256 bytes
Variable Size 285212672 bytes
Database Buffers 218103808 bytes
Redo Buffers 1323008 bytes
Database mounted.
SQL_testdb>alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
此时的alert log里会记录:
Errors in file /cadrasu01/app/oracle/admin/testdb/udump/testdb_ora_7090390.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number with name "" too small
Wed Dec 23 19:20:51 2009
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 7090390
ORA-1092 signalled during: alter database open resetlogs...
这里我们重现了那位朋友提到的错误。
上述trace文件(即/cadrasu01/app/oracle/admin/testdb/udump/testdb_ora_7090390.trc)不具备参
考价值,我们来做一个10046,步骤为:
SQL_testdb>startupmountpfile='/dras21/testdb/inittestdb.ora';
ORACLE instance started.
Total System Global Area 505382744 bytes
Fixed Size 743256 bytes
Variable Size 285212672 bytes
Database Buffers 218103808 bytes
Redo Buffers 1323008 bytes
Database mounted.
SQL_testdb>oradebug setmypid
Statement processed.
SQL_testdb>oradebug unlimit
Statement processed.
SQL_testdb>oradebug event 10046 trace name context forever,level 12
Statement processed.
SQL_testdb>recover database until cancel;
ORA-00279: change 208432 generated at 12/23/2009 19:45:06 needed for thread 1
ORA-00289: suggestion : /cadrasu01/app/oracle/product/9.2.0/dbs/arch1_1.dbf
ORA-00280: change 208432 for thread 1 is in sequence #1
Specify log: {=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/dras21/testdb/system01.dbf'
ORA-01112: media recovery not started
SQL_testdb>alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
此时,我们再去分析上述10046产生的trace文件,直接看上述trace文件的最末尾,我们发现有这样的内
容:
EXEC #10:c=0,e=236,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=1232000925678362
WAIT #10: nam='db file sequential read' ela= 176p1=1 p2=202p3=1
WAIT #10: nam='db file sequential read' ela= 183p1=1 p2=106p3=1
FETCH #10:c=0,e=454,p=2,cr=2,cu=0,mis=0,r=0,dep=2,og=4,tim=1232000925678842
FETCH #4:c=0,e=1713,p=3,cr=5,cu=0,mis=0,r=0,dep=1,og=4,tim=1232000925678907
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number with name "" too small
EXEC #1:c=200000,e=5922169,p=75,cr=705,cu=1,mis=0,r=0,dep=0,og=4,tim=1232000925679637
ERROR #1:err=1092 tim=1602843205
也就是说oracle在读file 1,block 202和file 1,block 106的时候可能出了问题。
我用BBED看了上述block,没发现有什么异常,也就是说这条路已经走不下去了。
好了,我们来换一条路,直接open:
SQL_testdb>alter database open;
alter database open
*
ERROR at line 1:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4193], [0], [57], [], [], [], [], []
呵呵,见到4193了,那这个库肯定可以打开了。
我们把*._corrupted_rollback_segments=(_SYSSMU1$, _SYSSMU2$, _SYSSMU3$, _SYSSMU4$, _SYSSMU5
$, _SYSSMU6$, _SYSSMU7$, _SYSSMU8$, _SYSSMU9$, _SYSSMU10$)加到上述pfile里再次open:
SQL_testdb>startupmountpfile='/dras21/testdb/inittestdb.ora';
ORACLE instance started.
Total System Global Area 505382744 bytes
Fixed Size 743256 bytes
Variable Size 285212672 bytes
Database Buffers 218103808 bytes
Redo Buffers 1323008 bytes
Database mounted.
SQL_testdb>alter database open;
Database altered.
好了,现在我们已经成功把上述数据库给打开了,接下来的收尾工作就是要切换undo tablespace,重建
spfile,这个大家肯定都知道,我这里无需再赘述。
最后我们来看一下那2000条数据还在不在:
SQL_testdb>conn scott/tiger@testdb;
Connected.
SQL_testdb>select count(*) from testtb;
select count(*) from testtb
*
ERROR at line 1:
ORA-00942: table or view does not exist
正如kamus所说,online redo log缺失通常意味着必然会有数据的丢失。
Categories:
oracle experiences
Tags:
recover of missing current online redo log
No TrackBacks
TrackBack URL: http://dbsnake.com/cgi-bin/mt/mt-tb.cgi/98
1 Comment
张晓明|December 24, 2009 11:07 AM|Reply
我又做了一次,新建了个库(原来那个库没了)
建库后
session 1
SQL> create tablespace zxm datafile '/zxm/testzxm1/zxm.dbf' size 100m
2 ;
Tablespace created.
SQL> create user zxm identified by zxm default tablespace zxm;
User created.
SQL> grant connect,resource to zxm;
Grant succeeded.
SQL> alter system archive log current;
System altered.
SQL> alter system switch logfile;
System altered.
SQL> conn zxm/zxm
Connected.
SQL> create table a (id number,name varchar2(200));
Table created.
SQL> declare
2 i number;
3 begin
4 i:=1;
5 while i 6 loop
7 insert into a values (i,'asfdasfasfd'||i);
8 i:=i+1;
9 end loop;
10 end;
11 /
PL/SQL procedure successfully completed.
session 2
SQL> shutdown abort;
ORACLE instance shut down.
rm *.log
然后开始测试
SQL> startup mount;
SQL> recover database until cancel;
ORA-00279: change 179545 generated at 12/24/2009 09:17:16 needed for thread 1
ORA-00289: suggestion : /zxm/testzxm1/1_10_706437772.dbf
ORA-00280: change 179545 for thread 1 is in sequence #10
Specify log: {=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/zxm/testzxm1/testzxm1/system01.dbf'
ORA-01112: media recovery not started
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/zxm/testzxm1/testzxm1/system01.dbf'
SQL> create pfile='/zxm/inita.ora' from spfile;
File created.
添加 _resetlogs....参数
SQL> startup pfile='/zxm/inita.ora' mount;
ORACLE instance started.
Total System Global Area 209715200 bytes
Fixed Size 2019608 bytes
Variable Size 79695592 bytes
Database Buffers 121634816 bytes
Redo Buffers 6365184 bytes
Database mounted.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01248: file 5 was created in the future of incomplete recovery
ORA-01110: data file 5: '/zxm/testzxm1/zxm.dbf'
SQL> alter database datafile 5 offline drop;
Database altered.
SQL> alter database open resetlogs;
Database altered.
: