只要存在有效的备份,恢复SYSTEM表空间数据文件丢失故障是比较容易的。这里演示的是最基本的使用物理备份恢复SYSTEM表空间丢失的方法。
1.环境准备
我们在Oracle11g中进行测试,数据库处于非归档状态。
SQL>
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL>
SQL> archive log list;
Database log mode No Archive Mode
Automatic archival Disabled
Archive destination /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch
Oldest online log sequence 6
Current log sequence 8
SQL>
2.打tar包,进行物理备份
首先要弄清两个概念:打包和压缩。打包是指将一大堆文件或目录变成一个总的文件;压缩则是将一个大的文件通过一些压缩算法变成一个小文件。linux下最常用的打包程序就是tar了,使用tar程序打出来的包我们常称为tar包,tar包文件的命令通常都是以.tar结尾的。生成tar包后,就可以用其它的程序来进行压缩。
我们使用tar命令将HOEGH数据库的物理文件打tar包,命名为HOEGH.tar.gz。注意,物理备份必须是在数据库关停阶段进行。
[oracle@hoegh oradata]$ tar -zcvf HOEGH.tar.gz HOEGH
HOEGH/
HOEGH/redo03.log
HOEGH/temp01.dbf
HOEGH/control01.ctl
HOEGH/control02.ctl
HOEGH/system01.dbf
HOEGH/sysaux01.dbf
HOEGH/users01.dbf
HOEGH/undotbs01.dbf
HOEGH/example01.dbf
HOEGH/redo02.log
HOEGH/redo01.log
3.启动数据库,删除system数据文件
下面,我们来模拟system数据文件丢失的故障场景。
首先,启动数据库。
SQL> startup
ORACLE instance started.
Total System Global Area 941600768 bytes
Fixed Size 1348860 bytes
Variable Size 515902212 bytes
Database Buffers 419430400 bytes
Redo Buffers 4919296 bytes
Database mounted.
Database opened.
SQL>
接下来,删除system01.dbf数据文件。
[oracle@hoegh HOEGH]$ rm system01.dbf
[oracle@hoegh HOEGH]$
4.重启数据库报错ORA-01157和ORA-01110
删除数据文件后,我们重启数据库,数据库在尝试启动到open状态时,由于找不到system表空间的数据文件,报错。
SQL>
SQL> shu immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>
SQL> startup
ORACLE instance started.
Total System Global Area 941600768 bytes
Fixed Size 1348860 bytes
Variable Size 515902212 bytes
Database Buffers 419430400 bytes
Redo Buffers 4919296 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 1 - see DBWR trace file
ORA-01110: data file 1: '/u01/app/oracle/oradata/HOEGH/system01.dbf'
SQL>
SQL> select status from v$instance;
STATUS
------------
MOUNTED
SQL>
我们看到,这个时候数据库处于mount状态。查看报警日志文件,我们可以更清晰的看到整个过程。
[oracle@hoegh trace]$ pwd
/u01/app/oracle/diag/rdbms/HOEGH/HOEGH/trace
[oracle@hoegh trace]$
[oracle@hoegh trace]$ tailf alert_HOEGH.log
……
ALTER DATABASE MOUNT
Successful mount of redo thread 1, with mount id 2106090167
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE MOUNT
Sat Jul 11 09:01:47 2015
ALTER DATABASE OPEN
Errors in file /u01/app/oracle/diag/rdbms/HOEGH/HOEGH/trace/HOEGH_dbw0_6016.trc:
ORA-01157: cannot identify/lock data file 1 - see DBWR trace file
ORA-01110: data file 1: '/u01/app/oracle/oradata/HOEGH/system01.dbf'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
Errors in file /u01/app/oracle/diag/rdbms/HOEGH/HOEGH/trace/HOEGH_ora_6135.trc:
ORA-01157: cannot identify/lock data file 1 - see DBWR trace file
ORA-01110: data file 1: '/u01/app/oracle/oradata/HOEGH/system01.dbf'
ORA-1157 signalled during: ALTER DATABASE OPEN...
5.恢复数据文件
我们需要把之前的数据备份恢复到数据库当中,因此,首先我们就要解tar包,恢复之前备份的数据文件;然后,将备份的system数据文件拷贝到HOEGH数据文件目录当中。
[oracle@hoegh oradata]$ mkdir -p back
[oracle@hoegh oradata]$
[oracle@hoegh oradata]$ tar -zxvf HOEGH.tar.gz -C back/
HOEGH/
HOEGH/redo03.log
HOEGH/temp01.dbf
HOEGH/control01.ctl
HOEGH/control02.ctl
HOEGH/system01.dbf
HOEGH/sysaux01.dbf
HOEGH/users01.dbf
HOEGH/undotbs01.dbf
HOEGH/example01.dbf
HOEGH/redo02.log
HOEGH/redo01.log
[oracle@hoegh oradata]$
[oracle@hoegh oradata]$ cp back/HOEGH/system01.dbf HOEGH/
[oracle@hoegh oradata]$
6.恢复数据库
首先,尝试使用alter database open;命令打开数据库,我们看到系统提示需要进行介质恢复。
接下来,使用recover database;命令恢复数据库;
最后,再次使用alter database open;命令打开数据库。
SQL>
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/u01/app/oracle/oradata/HOEGH/system01.dbf'
SQL>
SQL> recover database;
Media recovery complete.
SQL>
SQL> alter database open;
Database altered.
SQL>
SQL> select status from v$instance;
STATUS
------------
OPEN
SQL>
此时我们看到数据库已经处于open状态了,至此我们成功地使用物理备份恢复了之前“丢失”的system数据文件。通过alert报警日志我们再来看一下介质恢复以及打开数据库的整个过程。
Sat Jul 11 09:02:46 2015
alter database open
Errors in file /u01/app/oracle/diag/rdbms/HOEGH/HOEGH/trace/HOEGH_ora_6135.trc:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/u01/app/oracle/oradata/HOEGH/system01.dbf'
ORA-1113 signalled during: alter database open...
ALTER DATABASE RECOVER database
Media Recovery Start
started logmerger process
Sat Jul 11 09:02:53 2015
Recovering data file 1 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
Parallel Media Recovery started with 2 slaves
Recovery of Online Redo Log: Thread 1 Group 1 Seq 7 Reading mem 0
Mem# 0: /u01/app/oracle/oradata/HOEGH/redo01.log
Recovery of Online Redo Log: Thread 1 Group 2 Seq 8 Reading mem 0
Mem# 0: /u01/app/oracle/oradata/HOEGH/redo02.log
Media Recovery Complete (HOEGH)
Completed: ALTER DATABASE RECOVER database
Sat Jul 11 09:03:23 2015
alter database open
Beginning crash recovery of 1 threads
parallel recovery started with 2 processes
Started redo scan
Completed redo scan
read 0 KB redo, 0 data blocks need recovery
Started redo application at
Thread 1: logseq 8, block 878, scn 919739
Recovery of Online Redo Log: Thread 1 Group 2 Seq 8 Reading mem 0
Mem# 0: /u01/app/oracle/oradata/HOEGH/redo02.log
Completed redo application of 0.00MB
Completed crash recovery at
Thread 1: logseq 8, block 878, scn 939740
0 data blocks read, 0 data blocks written, 0 redo k-bytes read
Sat Jul 11 09:03:24 2015
Thread 1 advanced to log sequence 9 (thread open)
Thread 1 opened at log sequence 9
Current log# 3 seq# 9 mem# 0: /u01/app/oracle/oradata/HOEGH/redo03.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Jul 11 09:03:24 2015
SMON: enabling cache recovery
[6135] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:1328894 end:1328914 diff:20 (0 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Jul 11 09:03:24 2015
QMNC started with pid=22, OS id=6188
Completed: alter database open
从报警日志我们看到,数据库通过redo重做日志文件完成了介质恢复;然后,检查redo、undo等正常后,顺利打开数据库。
备份第一
备份是系统中需要考虑的最重要的事项,虽然它在系统的整个规划、开发和测试过程中甚至占不到1%,看似不太重要且默默无闻的工作只有到恢复的时候才能真正体现出其重要性,任何数据的丢失与长时间的数据down机,都是不可以被接受的。当然,仅仅依靠物理备份做不到这一点,RMAN的功能更为强大。
: