rman备份发现坏块之后的处理
今天收到某省的备份发生失败的告警:
……
四月 16 16:30:19 ur_bak01: NetWorker savegroup: (alert) urmdborafull completed, total 2 client(s), 0 Hostname(s) Unresolved, 1 Failed, 1 Succeeded. (ur_mdb01 Failed)
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on t2 channel at 03/20/2009 13:49:09
ORA-19566: exceeded limit of 0 corrupt blocks for file /dev/vg_mdb02/rdata_2g_050
ORA-000060: Deadlock detected. More info in file /oracle/app/oracle/admin/uradt/udump/uradt_ora_3035.trc.
***
Corrupt block relative dba: 0x1a43d4e3 (file 105, block 251107)
Fractured block found during backing up datafile
Data in bad block -
type: 0 format: 0 rdba: 0x00000000
last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
consistency value in tail: 0x00000000
check value in block header: 0x0, block checksum disabled
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of blocknum=251107, file=/dev/vg_mdb02/rdata_2g_050. found same corrupt data
Thu Apr 16 16:31:04 2009
……
四月 16 16:30:19 ur_bak01: NetWorker savegroup: (alert) urmdborafull completed, total 2 client(s), 0 Hostname(s) Unresolved, 1 Failed, 1 Succeeded. (ur_mdb01 Failed)
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on t2 channel at 03/20/2009 13:49:09
ORA-19566: exceeded limit of 0 corrupt blocks for file /dev/vg_mdb02/rdata_2g_050
ORA-000060: Deadlock detected. More info in file /oracle/app/oracle/admin/uradt/udump/uradt_ora_3035.trc.
***
Corrupt block relative dba: 0x1a43d4e3 (file 105, block 251107)
Fractured block found during backing up datafile
Data in bad block -
type: 0 format: 0 rdba: 0x00000000
last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
consistency value in tail: 0x00000000
check value in block header: 0x0, block checksum disabled
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of blocknum=251107, file=/dev/vg_mdb02/rdata_2g_050. found same corrupt data
Thu Apr 16 16:31:04 2009
……
用dbv检查发现有至少有45个坏块:
[oracle@ur_mdb01 /oracle$]dbv file=/dev/vg_mdb02/rdata_2g_050 BLOCKSIZE=8192
DBVERIFY: Release 9.2.0.6.0 - Production on Mon Apr 20 09:55:07 2009
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
DBVERIFY - Verification starting : FILE = /dev/vg_mdb02/rdata_2g_050
Page 251107 is influx - most likely media corrupt
***
Corrupt block relative dba: 0x1a43d4e3 (file 105, block 251107)
Fractured block found during dbv:
Data in bad block -
type: 0 format: 0 rdba: 0x00000000
last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
consistency value in tail: 0x00000000
check value in block header: 0x0, block checksum disabled
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Page 251108 is marked corrupt
***
Corrupt block relative dba: 0x1a43d4e4 (file 105, block 251108)
Bad header found during dbv:
Data in bad block -
type: 181 format: 6 rdba: 0x00000000
last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
consistency value in tail: 0x00000000
check value in block header: 0x0, block checksum disabled
spare1: 0x7, spare2: 0xc, spare3: 0x0
***
……
Corrupt block relative dba: 0x1a43d56f (file 105, block 251247)
Bad header found during dbv:
Data in bad block -
type: 65 format: 5 rdba: 0x527002c2
last change scn: 0x3131.02063033 seq: 0x30 flg: 0x31
consistency value in tail: 0x3635032d
check value in block header: 0x180, block checksum disabled
spare1: 0x50, spare2: 0x72, spare3: 0x430
***
DBVERIFY - Verification complete
Total Pages Examined : 262016
Total Pages Processed (Data) : 60240
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 568
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 201163
Total Pages Marked Corrupt : 45
Total Pages Influx : 11
Highest block SCN : 10816042273 (2.2226107681)
DBVERIFY: Release 9.2.0.6.0 - Production on Mon Apr 20 09:55:07 2009
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
DBVERIFY - Verification starting : FILE = /dev/vg_mdb02/rdata_2g_050
Page 251107 is influx - most likely media corrupt
***
Corrupt block relative dba: 0x1a43d4e3 (file 105, block 251107)
Fractured block found during dbv:
Data in bad block -
type: 0 format: 0 rdba: 0x00000000
last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
consistency value in tail: 0x00000000
check value in block header: 0x0, block checksum disabled
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Page 251108 is marked corrupt
***
Corrupt block relative dba: 0x1a43d4e4 (file 105, block 251108)
Bad header found during dbv:
Data in bad block -
type: 181 format: 6 rdba: 0x00000000
last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
consistency value in tail: 0x00000000
check value in block header: 0x0, block checksum disabled
spare1: 0x7, spare2: 0xc, spare3: 0x0
***
……
Corrupt block relative dba: 0x1a43d56f (file 105, block 251247)
Bad header found during dbv:
Data in bad block -
type: 65 format: 5 rdba: 0x527002c2
last change scn: 0x3131.02063033 seq: 0x30 flg: 0x31
consistency value in tail: 0x3635032d
check value in block header: 0x180, block checksum disabled
spare1: 0x50, spare2: 0x72, spare3: 0x430
***
DBVERIFY - Verification complete
Total Pages Examined : 262016
Total Pages Processed (Data) : 60240
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 568
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 201163
Total Pages Marked Corrupt : 45
Total Pages Influx : 11
Highest block SCN : 10816042273 (2.2226107681)
经检查,发现这些坏块上没有任何数据对象:
方法1.可以找到这个数据文件所在的表空间,这个表空间下的所有对象:如果都是表,可以对这些表都做一次全部扫描,发现没有报错。如果都是索引,可以都走一次iff,没有报错的话,说明对象都不在坏块上。
方法2.用select * from dba_extents where file_id=坏块的file_id and 坏块的block_id between block_id and block_id+blocks-1;如果所有的坏块执行这个sql检查出来的结果都是返回0行,说明没有对象在坏块上。
我们用格式化的方式来清除这样的坏块:
--预估剩余空间大小:
SQL> select tablespace_name,file_id,sum(bytes)/1024/1024 size_m from dba_free_space
2 where file_id=105 group by tablespace_name,file_id;
TABLESPACE_NAME FILE_ID SIZE_M
------------------------------ ---------- ----------
LOG_P8 105 1643
SQL>
--创建一个表在这个表空间上,用于做数据块的格式化:
SQL> create table for_fix(n number,c varchar2(4000)) nologging tablespace LOG_P8;
Table created.
SQL>
SQL>
SQL>
--填充满这个表:
SQL> alter table for_fix allocate extent(datafile '/dev/vg_mdb02/rdata_2g_050' size 1643m);
Table altered.
SQL>
SQL> insert into for_fix select rownum,rpad('dedwedew',3900) from dba_objects;
7285 rows created.
SQL> /
7285 rows created.
SQL> /
……
SQL> insert into for_fix select * from for_fix where rownum<=1000000;
1000000 rows created.
Elapsed: 00:00:06.96
SQL> /
1000000 rows created.
Elapsed: 00:00:03.50
SQL> /
10000 rows created.
Elapsed: 00:00:14.03
……
SQL> insert into for_fix select * from for_fix where rownum<=50000;
50000 rows created.
Elapsed: 00:00:39.28
SQL>
SQL>
……
SQL> insert into for_fix select * from for_fix where rownum<=1;
1 row created.
Elapsed: 00:00:00.00
SQL> /
1 row created.
Elapsed: 00:00:00.00
SQL> /
1 row created.
Elapsed: 00:00:00.00
SQL> /
1 row created.
Elapsed: 00:00:00.00
SQL> /
1 row created.
Elapsed: 00:00:00.01
SQL> /
insert into for_fix select * from for_fix where rownum<=1
*
ERROR at line 1:
ORA-01653: unable to extend table SYS.FOR_FIX by 128 in tablespace LOG_P8
Elapsed: 00:00:00.00
SQL>
SQL>
SQL> commit;
Commit complete.
Elapsed: 00:00:00.07
SQL>
SQL>
--记住,这里一定要做checkpoint,将所有的数据刷新到数据文件,不然不能保证所有的坏块被格式化:
SQL> alter system checkpoint;
System altered.
Elapsed: 00:00:04.25
SQL> /
System altered.
Elapsed: 00:00:00.15
SQL> /
System altered.
Elapsed: 00:00:00.14
SQL> select tablespace_name,file_id,sum(bytes)/1024/1024 size_m from dba_free_space
2 where file_id=105 group by tablespace_name,file_id;
TABLESPACE_NAME FILE_ID SIZE_M
------------------------------ ---------- ----------
LOG_P8 105 1643
SQL>
--创建一个表在这个表空间上,用于做数据块的格式化:
SQL> create table for_fix(n number,c varchar2(4000)) nologging tablespace LOG_P8;
Table created.
SQL>
SQL>
SQL>
--填充满这个表:
SQL> alter table for_fix allocate extent(datafile '/dev/vg_mdb02/rdata_2g_050' size 1643m);
Table altered.
SQL>
SQL> insert into for_fix select rownum,rpad('dedwedew',3900) from dba_objects;
7285 rows created.
SQL> /
7285 rows created.
SQL> /
……
SQL> insert into for_fix select * from for_fix where rownum<=1000000;
1000000 rows created.
Elapsed: 00:00:06.96
SQL> /
1000000 rows created.
Elapsed: 00:00:03.50
SQL> /
10000 rows created.
Elapsed: 00:00:14.03
……
SQL> insert into for_fix select * from for_fix where rownum<=50000;
50000 rows created.
Elapsed: 00:00:39.28
SQL>
SQL>
……
SQL> insert into for_fix select * from for_fix where rownum<=1;
1 row created.
Elapsed: 00:00:00.00
SQL> /
1 row created.
Elapsed: 00:00:00.00
SQL> /
1 row created.
Elapsed: 00:00:00.00
SQL> /
1 row created.
Elapsed: 00:00:00.00
SQL> /
1 row created.
Elapsed: 00:00:00.01
SQL> /
insert into for_fix select * from for_fix where rownum<=1
*
ERROR at line 1:
ORA-01653: unable to extend table SYS.FOR_FIX by 128 in tablespace LOG_P8
Elapsed: 00:00:00.00
SQL>
SQL>
SQL> commit;
Commit complete.
Elapsed: 00:00:00.07
SQL>
SQL>
--记住,这里一定要做checkpoint,将所有的数据刷新到数据文件,不然不能保证所有的坏块被格式化:
SQL> alter system checkpoint;
System altered.
Elapsed: 00:00:04.25
SQL> /
System altered.
Elapsed: 00:00:00.15
SQL> /
System altered.
Elapsed: 00:00:00.14
填充满这个表后,所以的块都被得到格式化,包括坏块也被格式化了。此时我们用dbv再次检查:
[oracle@ur_mdb01 /oracle$]dbv file=/dev/vg_mdb02/rdata_2g_050 BLOCKSIZE=8192
DBVERIFY: Release 9.2.0.6.0 - Production on Mon Apr 20 13:30:43 2009
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
DBVERIFY - Verification starting : FILE = /dev/vg_mdb02/rdata_2g_050
DBVERIFY - Verification complete
Total Pages Examined : 262016
Total Pages Processed (Data) : 260715
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 1181
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 120
Total Pages Marked Corrupt : 0
Total Pages Influx : 0
Highest block SCN : 10816271785 (2.2226337193)
[oracle@ur_mdb01 /oracle$]
DBVERIFY: Release 9.2.0.6.0 - Production on Mon Apr 20 13:30:43 2009
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
DBVERIFY - Verification starting : FILE = /dev/vg_mdb02/rdata_2g_050
DBVERIFY - Verification complete
Total Pages Examined : 262016
Total Pages Processed (Data) : 260715
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 1181
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 120
Total Pages Marked Corrupt : 0
Total Pages Influx : 0
Highest block SCN : 10816271785 (2.2226337193)
[oracle@ur_mdb01 /oracle$]
坏块已经处理掉了,测试rman备份正常。
· 【文章发布信息】发表于: 2009-04-20 @ 14:44:25 · ||分类: ..experience, Working case



CopyRight ©
paulyibinyi 于 2009-04-20 @ 20:34:19 留言 :
这种坏快是不是游离坏块
小荷 于 2009-04-21 @ 02:05:00 留言 :
re paulyibinyi:是没有任何对象在上面的坏块。
paulyibinyi 于 2009-04-23 @ 16:50:53 留言 :
to 小荷 好的知道了 谢谢
匿名 于 2009-08-03 @ 16:54:41 留言 :
为什么不用rman的blockrecover命令修复呢?
小荷 于 2009-08-03 @ 21:16:39 留言 :
re 匿名:blockrecover在9i中只能在mount状态下使用,这个系统是7×24的在线系统,因此只能采用这个方式了。
yyyll 于 2010-04-01 @ 22:49:15 留言 :
blockrecover是可以online进行的。
zhangjunhua 于 2010-05-26 @ 00:40:23 留言 :
无备份这应该算是唯一的恢复方法