用_minimum_giga_scn解决无法启动的数据库


今天遇到一个数据库无法启动,看到alertlog中主要是ora 600和[2662]的报错:

SQL> startup
ORACLE instance started.
 
Total System Global Area 1076850392 bytes
Fixed Size                   736984 bytes
Variable Size             536870912 bytes
Database Buffers          536870912 bytes
Redo Buffers                2371584 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced
 
 
SQL>

其中alertlog中报错:

Completed: ALTER DATABASE   MOUNT
Thu Jan 22 13:05:08 2009
ALTER DATABASE OPEN
Thu Jan 22 13:05:09 2009
Beginning crash recovery of 1 threads
Thu Jan 22 13:05:09 2009
Started first pass scan
Thu Jan 22 13:05:09 2009
Completed first pass scan
 0 redo blocks read, 0 data blocks need recovery
Thu Jan 22 13:05:09 2009
Started recovery at
 Thread 1: logseq 2, block 3, scn 0.43536037
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/ora9i/redo01.log
Thu Jan 22 13:05:09 2009
Ended recovery at
 Thread 1: logseq 2, block 3, scn 0.43556038
 0 data blocks read, 0 data blocks written, 0 redo blocks read
Crash recovery completed successfully
Thu Jan 22 13:05:10 2009
Thread 1 advanced to log sequence 3
Thread 1 opened at log sequence 3
  Current log# 2 seq# 3 mem# 0: /oracle/oradata/ora9i/redo02.log
Successful open of redo thread 1.
Thu Jan 22 13:05:10 2009
SMON: enabling cache recovery
Thu Jan 22 13:05:10 2009
Errors in file /oracle/admin/ora9i/udump/ora9i_ora_12968.trc:
ORA-00600: internal error code, arguments: [2662], [0], [43556042], [261], [2396789971], [4194729], [], []

遇到ora-600和2662的问题,我们一般有2种方法解决:

一种是在open的状态下:用alter session set events 'IMMEDIATE trace name adjust_scn level n';
一种是在mount状态下:用alter session set events '10015 trace name adjust_scn level n';

其中n的运算如下:
根据alertlog中的报错:
ORA-00600: internal error code, arguments: [2662], [0], [43556042], [261], [2396789971], [4194729], [], []
这边,我们把2662后的参数[2662],[a],[b],[c],[d],[e]…
[a] Current SCN WRAP
[b] Current SCN BASE
[c] dependent SCN WRAP
[d] dependent SCN BASE
[e] Where present this is the DBA where the dependent SCN came from.

其中scn可以用十六进制表示0Xffff.ffffffff。为了方便,oracle把前面的4个字节表示scn wrap,后面的8个字节表示scn base。scn最低值是0X0000.00000000,最高值是0Xffff.ffffffff。高位是scn wrap,低位是scn base。根据报错,我们需要把scn增进到dependent SCN WRAP为261。

而我们增进的level n,n是表示1g(即1024×1024×1024),也就是说,调整是以g为单位进行的。

而高位的scn wrap的一个1,即0X0001.00000000=0X000100000000(去掉便于分隔高低位的点)=100000000000000000000000000000000=2^32(即2乘以10的32次方)=4×2^30(4乘以2的30次方)=4×(1024×1024×1024)=4g。因此我们要增加到的scn,根据level n,n表示g,调整的level为4×261。即1044,再比这个数字大一些,我们可以设置成1045,1047都可以。

尝试用上述的方法去解决。由于是mount状态,因此只能用10015 trace name的adjust scn:
其中的隐含参数:

cat initora9i.ora
……
*.user_dump_dest='/oracle/admin/ora9i/udump'
*._allow_resetlogs_corruption=TRUE
"initora9i.ora" 47 lines, 1465 characters
SQL> startup nomount pfile='?/dbs/initora9i.ora'
ORACLE instance started.
 
Total System Global Area 1076850392 bytes
Fixed Size                   736984 bytes
Variable Size             536870912 bytes
Database Buffers          536870912 bytes
Redo Buffers                2371584 bytes
SQLalter database mount;
 
Database altered.
 
SQL> alter session set events '10015 trace name ADJUST_SCN level 1045';
 
Session altered.
 
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

alertlog中:

Thu Jan 22 13:27:56 2009
SMON: enabling cache recovery
Thu Jan 22 13:27:56 2009
Errors in file /oracle/admin/ora9i/udump/ora9i_ora_13322.trc:
ORA-00600: internal error code, arguments: [2662], [0], [43576046], [261], [2396789971], [4194729], [], []
  Thu Jan 22 13:28:37 2009
Errors in file /oracle/admin/ora9i/udump/ora9i_ora_13322.trc:
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [0], [43576046], [261], [2396789971], [4194729], [], []
Thu Jan 22 13:28:37 2009
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 13322
ORA-1092 signalled during: alter database open...

看到报错信息中的scn还是没有到达目标scn。版本是9i的,应该不会限制啊,根据在某些10g版本中需要另外一个隐含参数_allow_error_simulation,才能增进scn,继续修改初始化参数,尝试启动:

cat initora9i.ora
……
*.user_dump_dest='/oracle/admin/ora9i/udump'
*._allow_resetlogs_corruption=TRUE
*._allow_error_simulation=TRUE
"initora9i.ora" 47 lines, 1465 characters
SQL> startup mount pfile='?/dbs/initora9i.ora'
ORACLE instance started.
 
Total System Global Area 1076850392 bytes
Fixed Size                   736984 bytes
Variable Size             536870912 bytes
Database Buffers          536870912 bytes
Redo Buffers                2371584 bytes
Database mounted.
SQLalter session set events '10015 trace name ADJUST_SCN level 20000';
 
Session altered.
 
SQL> alter database open;

alertlog中报错依旧:

Ended recovery at
 Thread 1: logseq 5, block 3, scn 0.43616049
 0 data blocks read, 0 data blocks written, 1 redo blocks read
Crash recovery completed successfully
Thu Jan 22 13:41:49 2009
Thread 1 advanced to log sequence 6
Thread 1 opened at log sequence 6
  Current log# 2 seq# 6 mem# 0: /oracle/oradata/ora9i/redo02.log
Successful open of redo thread 1.
Thu Jan 22 13:41:49 2009
SMON: enabling cache recovery
Thu Jan 22 13:41:49 2009
Errors in file /oracle/admin/ora9i/udump/ora9i_ora_13660.trc:
ORA-00600: internal error code, arguments: [2662], [0], [43616053], [261], [2396789971], [4194729], [], []

看来是不能用上述的方法了,小熊这个时候再次提出了一个隐含参数:_minimum_giga_scn,把该参数设置成1047再尝试启动:

cat initora9i.ora
……
*.user_dump_dest='/oracle/admin/ora9i/udump'
#*._allow_resetlogs_corruption=TRUE
*._allow_error_simulation=TRUE
*._minimum_giga_scn=1047
"initora9i.ora" 47 lines, 1465 characters
SQL> startup mount pfile='?/dbs/initora9i.ora'
ORACLE instance started.
 
Total System Global Area 1076850392 bytes
Fixed Size                   736984 bytes
Variable Size             536870912 bytes
Database Buffers          536870912 bytes
Redo Buffers                2371584 bytes
Database mounted.
SQL> alter database open;
 
Database altered.
 
SQL>

数据库终于起来了!

查询了一下orafaq,这个参数是表示Minimum SCN to start with in 2^30 units ,2乘以10的三十次方,也就是1024×1024×1024,也就是g了。这个参数是oracle723就开始有了,表示最小scn的起始值1g,我们这边的scn wrap有261,因此需要4×261,再比这个稍微大一些,就得出1047了。

总结:在一般情况下,遇到ora-600,2662的报错,可以通过10015的adjust scn起来,但是遇到Current SCN WRAP和dependent SCN WRAP相距比较远,通过上述方法起不来,我们可以通过隐含参数_minimum_giga_scn直接设置最小scn,启动数据库。

· 【文章发布信息】发表于: 2009-01-22 @ 17:05:38 · ||分类: ..experience, Working case

1 条评论 »

  1. 伊思 于 2010-05-05 @ 10:39:28 留言

    我发现你们在写技术时总是不给别人讲原理和原因细节啊,这样才显得高深

RSS 为此帖反馈评论 · 反向跟踪 网站

留条评论