今天在一套RAC环境下删除、切换其中一个实例的undo表空间时,遭遇ORA-30013及ORA-01548的错误,下面记录过程及解决方法。
1 创建新的UNDO表空间UNDOTBS3,并将当前实例的UNDO切换到UNDOTBS3之后,删除旧的UNDOTBS1时,遇到下述错误:
SQL> drop tablespace undotbs1 including contents and datafiles;
drop tablespace undotbs1 including contents and datafiles
*
ERROR at line 1:
ORA-30013: undo tablespace 'UNDOTBS1' is currently in use
2 从上可以看到UNDOTBS1目前正在被使用。查询MetaLink,Unable to Drop Undo Tablespace ORA-30013 [ID 835944.1]获取基本思路,依据该文档给出的提示,执行下述命令:
SQL> select segment_name,owner,tablespace_name,status from dba_rollback_segs
2 where tablespace_name='UNDOTBS1' and status = 'ONLINE';
SEGMENT_NAME OWNER TABLESPACE_NAME STATUS
---------------- ------ ------------------ ---------
_SYSSMU2$ PUBLIC UNDOTBS1 ONLINE
_SYSSMU3$ PUBLIC UNDOTBS1 ONLINE
_SYSSMU6$ PUBLIC UNDOTBS1 ONLINE
_SYSSMU8$ PUBLIC UNDOTBS1 ONLINE
SQL> SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */ KTUXESTA Status,
2 KTUXECFL Flags FROM x$ktuxe WHERE ktuxesta!='INACTIVE';
KTUXEUSN KTUXESLT KTUXESQN STATUS FLAGS
---------- ---------- ---------- ---------------- ------------------------
2 38 583286 PREPARED SCO|COL|REV|DEAD|EXTDTX
3 29 982959 PREPARED SCO|COL|REV|DEAD|EXTDTX
6 14 945326 PREPARED SCO|COL|REV|DEAD|EXTDTX
8 7 957413 PREPARED SCO|COL|REV|DEAD|EXTDTX
13 19 507098 PREPARED SCO|COL|REV|EXTDTX
SQL> select local_tran_id, state from dba_2pc_pending;
LOCAL_TRAN_ID STATE
---------------------- ----------------
14.28.100017 collecting
2.38.583286 prepared
8.7.957413 prepared
3.29.982959 prepared
6.14.945326 prepared
13.19.507098 prepared
6 rows selected.
3 从上步的结果,结合文档ID 835944.1,基本上可以找到问题的原因:当前数据库中有未结束的分布式事务,而这些未结束的分布式事务占用了UNDOTBS1,最终导致不能删除UNDOTBS1。 文档ID 835944.1给出的解决方案是结束掉这些分布式事务。MetaLink上给出的思路是参照 Note 401302.1文档,而不凑巧的是,现在这篇文档无法查看,不得已,尝试重启数据库并删除UNDOTBS1。 重启之后,alert日志里看到如下错误信息:
*** SERVICE NAME:(SYS$BACKGROUND) 2012-02-10 11:27:15.527
*** SESSION ID:(431.1) 2012-02-10 11:27:15.527
*** 2012-02-10 11:27:15.527
ERROR, tran=14.28.100017, session#=1, ose=0:
ORA-02019: connection description for remote database not found
ORA-02019: connection description for remote database not found
*** 2012-02-10 11:27:50.676
ERROR, tran=14.28.100017, session#=1, ose=0:
ORA-02019: connection description for remote database not found
ORA-02019: connection description for remote database not found
*** 2012-02-10 11:28:42.707
ERROR, tran=14.28.100017, session#=1, ose=0:
ORA-02019: connection description for remote database not found
ORA-02019: connection description for remote database not found
尝试再次删除UNDOTBS1:
SQL> drop tablespace undotbs1 including contents and datafiles;
drop tablespace undotbs1 including contents and datafiles
*
ERROR at line 1:
ORA-01548: active rollback segment '_SYSSMU2$' found, terminate dropping tablespace
这次报出ORA-01548的错误!!!原来还是分布式事务未提交导致的:
SQL> SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */ KTUXESTA Status,
2 KTUXECFL Flags FROM x$ktuxe WHERE ktuxesta!='INACTIVE';
KTUXEUSN KTUXESLT KTUXESQN STATUS FLAGS
---------- ---------- ---------- ---------------- ------------------------
2 38 583286 PREPARED SCO|COL|REV|DEAD|EXTDTX
3 29 982959 PREPARED SCO|COL|REV|DEAD|EXTDTX
6 14 945326 PREPARED SCO|COL|REV|DEAD|EXTDTX
8 7 957413 PREPARED SCO|COL|REV|DEAD|EXTDTX
13 19 507098 PREPARED SCO|COL|REV|EXTDTX
SQL> select local_tran_id, state from dba_2pc_pending;
LOCAL_TRAN_ID STATE
---------------------- ----------------
14.28.100017 collecting
2.38.583286 prepared
8.7.957413 prepared
3.29.982959 prepared
6.14.945326 prepared
13.19.507098 prepared
6 rows selected.
SQL>
4 继续MetaLink:ORA-1548 Dropping UNDO Tablespace Distributed Transaction Pending:Prepared / Dead [ID 1321093.1]根据该文档,执行下述命令:
SQL> Select segment_id,segment_name,status,tablespace_name
2 from dba_rollback_segs where status not in ('ONLINE','OFFLINE');
SEGMENT_ID SEGMENT_NAME STATUS TABLESPACE_NAME
---------- ------------------------------ ---------------- ------------------------------
2 _SYSSMU2$ PARTLY AVAILABLE UNDOTBS1
3 _SYSSMU3$ PARTLY AVAILABLE UNDOTBS1
6 _SYSSMU6$ PARTLY AVAILABLE UNDOTBS1
8 _SYSSMU8$ PARTLY AVAILABLE UNDOTBS1
SQL>
You find you have segments that are 'Partly Available' This usually means they still have active transactions pending and you can not drop the tablespace until the transaction is committed or rolled back. 当回滚段状态为Partly Available时,说明还是有事务没结束!!!
SQL> SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */
2 KTUXESTA Status,
3 KTUXECFL Flags
4 FROM x$ktuxe
5 WHERE ktuxesta!='INACTIVE'
6 AND ktuxeusn
7 in(2,3,6,8);
KTUXEUSN KTUXESLT KTUXESQN STATUS FLAGS
---------- ---------- ---------- ---------------- ------------------------
2 38 583286 PREPARED SCO|COL|REV|DEAD|EXTDTX
3 29 982959 PREPARED SCO|COL|REV|DEAD|EXTDTX
6 14 945326 PREPARED SCO|COL|REV|DEAD|EXTDTX
8 7 957413 PREPARED SCO|COL|REV|DEAD|EXTDTX
SQL> select local_tran_id, state from dba_2pc_pending;
LOCAL_TRAN_ID STATE
---------------------- ----------------
14.28.100017 collecting
2.38.583286 prepared
8.7.957413 prepared
3.29.982959 prepared
6.14.945326 prepared
13.19.507098 prepared
6 rows selected.
SQL>
通过上面的结果,看到2、3、6、8号回滚段上有活动的事务。该文档依然指出解决方案是查看Note 401302.1文档,而该文档又无法打开,不得已Google之,参照 获得解决问题的最终方法。
5 根据 直接强制提交这些活动的分布式事务:
SQL> select local_tran_id, state from dba_2pc_pending;
LOCAL_TRAN_ID STATE
---------------------- ----------------
14.28.100017 collecting
2.38.583286 prepared
8.7.957413 prepared
3.29.982959 prepared
6.14.945326 prepared
13.19.507098 prepared
6 rows selected.
SQL> commit force '2.38.583286';
Commit complete.
SQL> select local_tran_id, state from dba_2pc_pending;
LOCAL_TRAN_ID STATE
---------------------- ----------------
14.28.100017 collecting
2.38.583286 forced commit
8.7.957413 prepared
3.29.982959 prepared
6.14.945326 prepared
13.19.507098 prepared
6 rows selected.
SQL> commit force '8.7.957413';
Commit complete.
SQL> commit force '3.29.982959';
Commit complete.
SQL> commit force '6.14.945326';
Commit complete.
SQL> commit force '13.19.507098';
Commit complete.
SQL> select local_tran_id, state from dba_2pc_pending;
LOCAL_TRAN_ID STATE
---------------------- ----------------
14.28.100017 collecting
2.38.583286 forced commit
8.7.957413 forced commit
3.29.982959 forced commit
6.14.945326 forced commit
13.19.507098 forced commit
6 rows selected.
SQL> Select segment_id,segment_name,status,tablespace_name
2 from dba_rollback_segs where status not in ('ONLINE','OFFLINE');
no rows selected
SQL> SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */
2 KTUXESTA Status,
3 KTUXECFL Flags
4 FROM x$ktuxe
5 WHERE ktuxesta!='INACTIVE'
6 AND ktuxeusn
7 in(2,3,6,8);
no rows selected
SQL> drop tablespace undotbs1 including contents and datafiles;
Tablespace dropped.
SQL>
6 终于干掉了那个旧的UNDO,而此时,alert日志里的信息如下:
Fri Feb 10 15:08:16 CST 2012
DISTRIB TRAN 44444444.D7D4863A714B974489CD48496956271900000000
is local tran 2.38.583286 (hex=02.26.8e676)
change pending prepared tran, scn=125560421 (hex=0.077be665)
to pending forced commit tran, scn= (hex=0.00000000)
Fri Feb 10 15:08:56 CST 2012
DISTRIB TRAN 44444444.D88A6D59B486E44D8BAD8DABFDCF289C00000000
is local tran 8.7.957413 (hex=08.07.e9be5)
change pending prepared tran, scn=194136242 (hex=0.0b9248b2)
to pending forced commit tran, scn= (hex=0.00000000)
Fri Feb 10 15:09:08 CST 2012
DISTRIB TRAN 44444444.4A67F0F3F3EA464081883577EE646AAB00000000
is local tran 3.29.982959 (hex=03.1d.effaf)
change pending prepared tran, scn=195270309 (hex=0.0ba396a5)
to pending forced commit tran, scn= (hex=0.00000000)
Fri Feb 10 15:09:20 CST 2012
DISTRIB TRAN 44444444.97CB87A6BAE9E943B761C9C7FDA9844600000000
is local tran 6.14.945326 (hex=06.0e.e6cae)
change pending prepared tran, scn=196753377 (hex=0.0bba37e1)
to pending forced commit tran, scn= (hex=0.00000000)
Fri Feb 10 15:09:36 CST 2012
DISTRIB TRAN 44444444.192AEFF6D316B2468F5D74FE5EBDC9EC00000000
is local tran 13.19.507098 (hex=0d.13.7bcda)
change pending prepared tran, scn=332059676 (hex=0.13cad41c)
to pending forced commit tran, scn= (hex=0.00000000)
Fri Feb 10 15:13:17 CST 2012
drop tablespace undotbs1 including contents and datafiles
Fri Feb 10 15:13:25 CST 2012
Deleted Oracle managed file +ORADATA/glndb/datafile/undotbs1.261.726057859
Completed: drop tablespace undotbs1 including contents and datafiles
7 总结:对于分布式事务,目前还不是很清楚。而这个案例中涉及到的情况还有可能出现更为复杂的情况,需要深入研究一下,而我遇到的这种情况属于比较简单的。