建表空间的时候数据库宕机


今天在建表空间的时候,数据库不知道被谁停了,在建的时候报oracle not avaliable,等再次确认没有“雷锋”帮我shutdown数据库之后,startup了数据库。启动时,没有报错,提示数据库已经mount、已经open。

可是,再次重建表空间的时候,却报错了数据文件已经存在:

SQL> create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' reuse
 
2  extent management local segment space management auto uniform size 1024k;
create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' reuse
*
ERROR at line 1:
ORA-01537: cannot add data file '/opt/oradata/dfile/ts_data_r_001g_001.dbf' - file already part of database

检查了一下这个数据文件,发现在v$datafile中能找到,且状态是offine状态,但是在dba_data_files中找不到:

SQL> select TS#,status,name from v$datafile;
 
      
TS# STATUS  NAME
--------
-- ------- ---------------------------------------------------------
        
0 SYSTEM  /opt/oradata/dfile/system01.dbf
        
1 ONLINE  /opt/oradata/dfile/undotbs01.dbf
        
3 ONLINE  /opt/oradata/dfile/indx01.dbf
        
4 ONLINE  /opt/oradata/dfile/tools01.dbf
        
5 ONLINE  /opt/oradata/dfile/users01.dbf
        
6 OFFLINE /opt/oradata/dfile/ts_data_r_001g_001.dbf
 
6 rows selected.
 
SQL> select tablespace_name ,file_name from dba_data_files;
 
TABLESPACE_NAME                FILE_NAME
----------------------------
-- --------------------------------------------------------
SYSTEM                         /opt/oradata/dfile/system01.dbf
UNDOTBS1                       /opt/oradata/dfile/undotbs01.dbf
INDX                           /opt/oradata/dfile/indx01.dbf
TOOLS                          /opt/oradata/dfile/tools01.dbf
USERS                          /opt/oradata/dfile/users01.dbf
 
SQL>

为了能新(重)建表空间,尝试把表空间drop,但是报错表空间已经不存在:

SQL> drop tablespace TS_DATA_R including CONTENTS and datafiles;
drop tablespace TS_DATA_R including CONTENTS and datafiles
*
ERROR at line 1:
ORA-00959: tablespace 'TS_DATA_R' does not exist

那只能先offline drop 数据文件了(数据库在非归档模式):

SQL> alter database datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf'offline drop;
 
Database altered.
 
SQL>

ok!!在此以为总大功告成了吧,来我们重建表空间!可是接下来的事情却让人那么的沮丧:

SQL> create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' reuse
 
2  extent management local segment space management auto uniform size 1024k;
create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' reuse
*
ERROR at line 1:
ORA-01537: cannot add data file '/opt/oradata/dfile/ts_data_r_001g_001.dbf' - file already part of database
 
 
SQL> drop tablespace TS_DATA_R including CONTENTS and datafiles;
drop tablespace TS_DATA_R including CONTENTS and datafiles
*
ERROR at line 1:
ORA-00959: tablespace 'TS_DATA_R' does not exist
 
 
SQL> select TS#,status,name from v$datafile;
 
      
TS# STATUS  NAME
--------
-- ------- -----------------------------------
        
0 SYSTEM  /opt/oradata/dfile/system01.dbf
        
1 ONLINE  /opt/oradata/dfile/undotbs01.dbf
        
3 ONLINE  /opt/oradata/dfile/indx01.dbf
        
4 ONLINE  /opt/oradata/dfile/tools01.dbf
        
5 ONLINE  /opt/oradata/dfile/users01.dbf
        
6 OFFLINE /opt/oradata/dfile/ts_data_r_001g_001.dbf
 
6 rows selected.
 
SQL> alter database datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf'offline drop;
 
Database altered.
 
SQL> select TS#,status,name from v$datafile;
 
      
TS# STATUS  NAME
--------
-- ------- -------------------------
        
0 SYSTEM  /opt/oradata/dfile/system01.dbf
        
1 ONLINE  /opt/oradata/dfile/undotbs01.dbf
        
3 ONLINE  /opt/oradata/dfile/indx01.dbf
        
4 ONLINE  /opt/oradata/dfile/tools01.dbf
        
5 ONLINE  /opt/oradata/dfile/users01.dbf
        
6 OFFLINE /opt/oradata/dfile/ts_data_r_001g_001.dbf
 
6 rows selected.
 
SQL> drop tablespace TS_DATA_R;
drop tablespace TS_DATA_R
*
ERROR at line 1:
ORA-00959: tablespace 'TS_DATA_R' does not exist

由于ts_data_r_001g_001.dbf文件一直存在,无法offline drop,无论怎么操作,都能在v$datafile中看到它,那么我索性就干脆先把它online在尝试offline drop吧:

SQL> alter database datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' online;
 
Database altered.
 
SQL> select TS#,status,name from v$datafile;
 
      
TS# STATUS  NAME
--------
-- ------- ----------------------------------------------
        
0 SYSTEM  /opt/oradata/dfile/system01.dbf
        
1 ONLINE  /opt/oradata/dfile/undotbs01.dbf
        
3 ONLINE  /opt/oradata/dfile/indx01.dbf
        
4 ONLINE  /opt/oradata/dfile/tools01.dbf
        
5 ONLINE  /opt/oradata/dfile/users01.dbf
        
6 ONLINE  /opt/oradata/dfile/ts_data_r_001g_001.dbf
 
6 rows selected.
 
SQL> alter database open;
 
Database altered.
 
SQL> select tablespace_name ,file_name from dba_data_files;
 
TABLESPACE_NAME                FILE_NAME
----------------------------
-- ------------------------------------
SYSTEM                         /opt/oradata/dfile/system01.dbf
UNDOTBS1                       /opt/oradata/dfile/undotbs01.dbf
INDX                           /opt/oradata/dfile/indx01.dbf
TOOLS                          /opt/oradata/dfile/tools01.dbf
USERS                          /opt/oradata/dfile/users01.dbf
 
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
 
Total System Global Area 1646861168 bytes
Fixed Size                   453488 bytes
Variable Size             687865856 bytes
Database Buffers          956301312 bytes
Redo Buffers                2240512 bytes
Database mounted.
Database opened.
SQL> select tablespace_name ,file_name from dba_data_files;
 
TABLESPACE_NAME                FILE_NAME
----------------------------
-- ----------------------------------
SYSTEM                         /opt/oradata/dfile/system01.dbf
UNDOTBS1                       /opt/oradata/dfile/undotbs01.dbf
INDX                           /opt/oradata/dfile/indx01.dbf
TOOLS                          /opt/oradata/dfile/tools01.dbf
USERS                          /opt/oradata/dfile/users01.dbf
 
SQL> select TS#,status,name from v$datafile;
 
      
TS# STATUS  NAME
--------
-- ------- ---------------------------------------------
        
0 SYSTEM  /opt/oradata/dfile/system01.dbf
        
1 ONLINE  /opt/oradata/dfile/undotbs01.dbf
        
3 ONLINE  /opt/oradata/dfile/indx01.dbf
        
4 ONLINE  /opt/oradata/dfile/tools01.dbf
        
5 ONLINE  /opt/oradata/dfile/users01.dbf
        
6 ONLINE  /opt/oradata/dfile/ts_data_r_001g_001.dbf
 
6 rows selected.
 
SQL>
SQL>
SQL>
SQL>
SQL> drop tablespace TS_DATA_R including contents and datafiles;
drop tablespace TS_DATA_R including contents and datafiles
*
ERROR at line 1:
ORA-00959: tablespace 'TS_DATA_R' does not exist
 
 
SQL> alter database datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf'offline drop;
 
Database altered.
 
SQL> select TS#,status,name from v$datafile;
 
      
TS# STATUS  NAME
--------
-- ------- ---------------------------------
        
0 SYSTEM  /opt/oradata/dfile/system01.dbf
        
1 ONLINE  /opt/oradata/dfile/undotbs01.dbf
        
3 ONLINE  /opt/oradata/dfile/indx01.dbf
        
4 ONLINE  /opt/oradata/dfile/tools01.dbf
        
5 ONLINE  /opt/oradata/dfile/users01.dbf
        
6 RECOVER /opt/oradata/dfile/ts_data_r_001g_001.dbf
 
6 rows selected.
 
SQL>

这里,终于看到ts_data_r_001g_001.dbf再次执行offline drop之后,状态变成recover了……(但是也奇怪,如果offline drop了,这里应该也是看不到才对呀)。

好,我们现在再试试重建表空间:

SQL> create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' reuse
 
2  extent management local segment space management auto uniform size 1024k;
create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' reuse
*
ERROR at line 1:
ORA-01537: cannot add data file '/opt/oradata/dfile/ts_data_r_001g_001.dbf' - file already part of database
 
 
SQL> alter database datafile 6 offline drop;
 
Database altered.
 
SQL> select TS#,status,name from v$datafile;
 
      
TS# STATUS  NAME
--------
-- ------- ------------------------------
        
0 SYSTEM  /opt/oradata/dfile/system01.dbf
        
1 ONLINE  /opt/oradata/dfile/undotbs01.dbf
        
3 ONLINE  /opt/oradata/dfile/indx01.dbf
        
4 ONLINE  /opt/oradata/dfile/tools01.dbf
        
5 ONLINE  /opt/oradata/dfile/users01.dbf
        
6 RECOVER /opt/oradata/dfile/ts_data_r_001g_001.dbf
 
6 rows selected.

不幸的是,这个该死的ts_data_r_001g_001.dbf始终无法去掉!!

开始想偏方了……我为何不建一个表空间ts_data_r,且用另外一个数据文件ts_data_r_001g_002.dbf,等建立之后,我再drop tablespace,这样不就可以把ts_data_r_001g_001.dbf一起干掉了吗?

SQL> create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_002.dbf' size 1024m
 
2  extent management local segment space management auto uniform size 1024k;
create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_002.dbf' size 1024m
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error

orz……终于把oracle惹毛了,直接把我的session断掉了!!此时去看alertlog中,一堆600的报错:

……
[oracle@localhost bdump]$ tail -400 alert_ora9i.log
ORA-00600: internal error code, arguments: [25013], [0], [6], [TS_DATA_R], [TS_DATA_R], [6], [7], []
Mon Mar 31 15:38:02 2008
Errors in file /oracle/app/oracle/admin/ora9i/udump/ora9i_ora_7816.trc:
ORA-00600: internal error code, arguments: [kccocx_01], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], []
ORA-00600: internal error code, arguments: [25013], [0], [6], [TS_DATA_R], [TS_DATA_R], [6], [7], []
Mon Mar 31 15:38:02 2008
Errors in file /oracle/app/oracle/admin/ora9i/udump/ora9i_ora_7816.trc:
ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], []
ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], []
ORA-00600: internal error code, arguments: [25013], [0], [6], [TS_DATA_R], [TS_DATA_R], [6], [7], []
Mon Mar 31 15:38:02 2008
Errors in file /oracle/app/oracle/admin/ora9i/udump/ora9i_ora_7816.trc:
ORA-00600: internal error code, arguments: [kccocx_01], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], []
ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], []
ORA-00600: internal error code, arguments: [25013], [0], [6], [TS_DATA_R], [TS_DATA_R], [6], [7], []
Mon Mar 31 15:38:02 2008
Errors in file /oracle/app/oracle/admin/ora9i/udump/ora9i_ora_7816.trc:
ORA-00600: internal error code, arguments: [kccocx_01], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], []
ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], []
ORA-00600: internal error code, arguments: [25013], [0], [6], [TS_DATA_R], [TS_DATA_R], [6], [7], []
……

仔细看了一下报600错之前的alertlog,偶滴神!!原来alertlog中在意外宕机再次启动的时候,已经有报错信息和相关处理建议了。本来觉得v$datafile是mount时候能查到的视图,dba_data_files是open时候能查到的视图,2者不一致,能用绝招就是重建控制文件,但是……死活想用别的办法解决,就是不想重建控制文件!!-_-!!偶TMD还真固执:

……
SMON: enabling tx recovery
Fri Mar 28 11:51:03 2008
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: ALTER DATABASE OPEN
Mon Mar 31 14:52:03 2008
create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' size 1024m
extent management local segment space management auto uniform size 1024k
Mon Mar 31 14:52:12 2008
WARNING! CONTROLFILE SEQUENCE NUMBER TOO OLD, RE-READING...
WARNING! CONTROLFILE SEQUENCE NUMBER TOO OLD, RE-READING...
WARNING! CONTROLFILE SEQUENCE NUMBER TOO OLD, RE-READING...
********************* ATTENTION: ********************
 The controlfile header block returned by the OS
 has a sequence number that is too old.
 The controlfile might be corrupted.
 PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE
 without following the steps below.
 RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE
 TO THE DATABASE, if the controlfile is truly corrupted.
 In order to re-start the instance safely,
 please do the following:
 (1) Save all copies of the controlfile for later
     analysis and contact your OS vendor and Oracle support.
 (2) Mount the instance and issue:
     ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
 (3) Unmount the instance.
 (4) Use the script in the trace file to
     RE-CREATE THE CONTROLFILE and open the database.
*****************************************************
Instance terminated by DBW0, pid = 4437
……

最后,shutdown immedate,挂死,pmon切出大量600报错的日志,再尝试shutdown abort,终于关闭。再次startup,smon切出大量日志,提示数据库mount、数据库open,open之后,用alter database backup controlfile to trace as ‘/oracle/111.txt’获取重建控制文件的脚本,修改脚本中ts_data_r_001g_001.dbf存在的这个文件。再尝试shutdown immediate,正常关闭,再次startup 到nomount后重建控制文件,open resetlogs,终于……搞定了!:

SQL> select TS#,status,name from v$datafile;
 
 
      
TS# STATUS  NAME
--------
-- ------- ----------------------
        
0 SYSTEM  /opt/oradata/dfile/system01.dbf
        
1 ONLINE  /opt/oradata/dfile/undotbs01.dbf
        
3 ONLINE  /opt/oradata/dfile/indx01.dbf
        
4 ONLINE  /opt/oradata/dfile/tools01.dbf
        
5 ONLINE  /opt/oradata/dfile/users01.dbf
 
SQL> select tablespace_name ,file_name from dba_data_files;
 
TABLESPACE_NAME                FILE_NAME
----------------------------
-- ---------------------
SYSTEM                         /opt/oradata/dfile/system01.dbf
UNDOTBS1                       /opt/oradata/dfile/undotbs01.dbf
INDX                           /opt/oradata/dfile/indx01.dbf
TOOLS                          /opt/oradata/dfile/tools01.dbf
USERS                          /opt/oradata/dfile/users01.dbf
 
SQL>
SQL>
SQL> create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' size 1024m reuse
 
2  extent management local segment space management auto uniform size 1024k;
 
Tablespace created.

        本次故障解决的过程其实难度并不大,但是给了我们一个教训:对于异常宕机,再次启动,虽然在启动的时候不提示什么,但是只要是异常宕机,就一定要看alertlog!!

· 【文章发布信息】发表于: 2008-04-01 @ 01:14:50 · ||分类: ..experience, Working case

5 条评论 »

  1. NinGoo 于 2008-04-01 @ 09:17:57 留言

    alert应当实时监控,有错误就发告警短信和邮件

  2. David.Guo 于 2008-04-01 @ 11:22:14 留言

    明显没有看我的blog
    这个错误和提示,我们已经出现过一次了,哈哈

  3. 小荷 于 2008-04-01 @ 20:11:40 留言

    re NinGoo:你们alertlog是怎么做监控的呢,是grep ORA的关键字吗?

    re David.Guo:你的网站偶也常去的,不过,不可能每篇文章都记得住呀~~

  4. David.Guo 于 2008-04-02 @ 07:39:52 留言

    to NinGoo,俺们的db是电话通知,呵呵

  5. yanggq 于 2009-05-04 @ 23:03:58 留言

    没有24*7monitor scripts?真可怜的数据库…

RSS 为此帖反馈评论 · 反向跟踪 网站

留条评论