rman备份发现坏块之后的处理


今天收到某省的备份发生失败的告警:

……
四月 16 16:30:19 ur_bak01: NetWorker savegroup: (alert) urmdborafull completed, total 2 client(s), 0 Hostname(s) Unresolved, 1 Failed, 1 Succeeded. (ur_mdb01 Failed)
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on t2 channel at 03/20/2009 13:49:09
ORA-19566: exceeded limit of 0 corrupt blocks for file /dev/vg_mdb02/rdata_2g_050
ORA-000060: Deadlock detected. More info in file /oracle/app/oracle/admin/uradt/udump/uradt_ora_3035.trc.
***
Corrupt block relative dba: 0x1a43d4e3 (file 105, block 251107)
Fractured block found during backing up datafile
Data in bad block -
 type: 0 format: 0 rdba: 0x00000000
 last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
 consistency value in tail: 0x00000000
 check value in block header: 0x0, block checksum disabled
 spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of blocknum=251107, file=/dev/vg_mdb02/rdata_2g_050. found same corrupt data
Thu Apr 16 16:31:04 2009
……

用dbv检查发现有至少有45个坏块:

[oracle@ur_mdb01 /oracle$]dbv file=/dev/vg_mdb02/rdata_2g_050 BLOCKSIZE=8192
 
DBVERIFY: Release 9.2.0.6.0 - Production on Mon Apr 20 09:55:07 2009
 
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
 
DBVERIFY - Verification starting : FILE = /dev/vg_mdb02/rdata_2g_050
Page 251107 is influx - most likely media corrupt
***
Corrupt block relative dba: 0x1a43d4e3 (file 105, block 251107)
Fractured block found during dbv:
Data in bad block -
 type: 0 format: 0 rdba: 0x00000000
 last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
 consistency value in tail: 0x00000000
 check value in block header: 0x0, block checksum disabled
 spare1: 0x0, spare2: 0x0, spare3: 0x0
***
 
Page 251108 is marked corrupt
***
Corrupt block relative dba: 0x1a43d4e4 (file 105, block 251108)
Bad header found during dbv:
Data in bad block -
 type: 181 format: 6 rdba: 0x00000000
 last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
 consistency value in tail: 0x00000000
 check value in block header: 0x0, block checksum disabled
 spare1: 0x7, spare2: 0xc, spare3: 0x0
***
 
……
 
Corrupt block relative dba: 0x1a43d56f (file 105, block 251247)
Bad header found during dbv:
Data in bad block -
 type: 65 format: 5 rdba: 0x527002c2
 last change scn: 0x3131.02063033 seq: 0x30 flg: 0x31
 consistency value in tail: 0x3635032d
 check value in block header: 0x180, block checksum disabled
 spare1: 0x50, spare2: 0x72, spare3: 0x430
***
 
 
 
DBVERIFY - Verification complete
 
Total Pages Examined         : 262016
Total Pages Processed (Data) : 60240
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 568
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 201163
Total Pages Marked Corrupt   : 45
Total Pages Influx           : 11
Highest block SCN            : 10816042273 (2.2226107681)

经检查,发现这些坏块上没有任何数据对象:

方法1.可以找到这个数据文件所在的表空间,这个表空间下的所有对象:如果都是表,可以对这些表都做一次全部扫描,发现没有报错。如果都是索引,可以都走一次iff,没有报错的话,说明对象都不在坏块上。
方法2.用select * from dba_extents where file_id=坏块的file_id and 坏块的block_id between block_id and block_id+blocks-1;如果所有的坏块执行这个sql检查出来的结果都是返回0行,说明没有对象在坏块上。

我们用格式化的方式来清除这样的坏块:

--预估剩余空间大小:
SQL> select tablespace_name,file_id,sum(bytes)/1024/1024 size_m from dba_free_space
 
2  where file_id=105 group by tablespace_name,file_id;
 
TABLESPACE_NAME                   FILE_ID     SIZE_M
----------------------------
-- ---------- ----------
LOG_P8                                105       1643
 
SQL>
 
 
--创建一个表在这个表空间上,用于做数据块的格式化:
SQL> create table for_fix(n number,c varchar2(4000)) nologging tablespace LOG_P8;
 
Table created.
 
SQL>
SQL>
SQL>
 
--填充满这个表:
SQL> alter table for_fix allocate extent(datafile '/dev/vg_mdb02/rdata_2g_050' size 1643m);
 
Table altered.
 
SQL>
SQL> insert into for_fix select rownum,rpad('dedwedew',3900) from dba_objects;
 
7285 rows created.
 
SQL> /
 
7285 rows created.
 
SQL> /
 
……
 
SQLinsert into for_fix select * from for_fix where rownum<=1000000;
 
1000000 rows created.
 
Elapsed: 00:00:06.96
SQL> /
 
1000000 rows created.
 
Elapsed: 00:00:03.50
SQL> /
 
10000 rows created.
 
Elapsed: 00:00:14.03
……
 
SQL> insert into for_fix select * from for_fix where rownum<=50000;
 
50000 rows created.
 
Elapsed: 00:00:39.28
SQL>
SQL>
 
……
 
SQL> insert into for_fix select * from for_fix where rownum<=1;
 
1 row created.
 
Elapsed: 00:00:00.00
SQL> /
 
1 row created.
 
Elapsed: 00:00:00.00
SQL> /
 
1 row created.
 
Elapsed: 00:00:00.00
SQL> /
 
1 row created.
 
Elapsed: 00:00:00.00
SQL> /
 
1 row created.
 
Elapsed: 00:00:00.01
SQL> /
insert into for_fix select * from for_fix where rownum<=1
*
ERROR at line 1:
ORA-01653: unable to extend table SYS.FOR_FIX by 128 in tablespace LOG_P8
 
 
Elapsed: 00:00:00.00
SQL>
SQL>
SQL> commit;
 
Commit complete.
 
Elapsed: 00:00:00.07
SQL>
SQL>
 
--记住,这里一定要做
checkpoint,将所有的数据刷新到数据文件,不然不能保证所有的坏块被格式化:
SQL> alter system checkpoint;
 
System altered.
 
Elapsed: 00:00:04.25
SQL> /
 
System altered.
 
Elapsed: 00:00:00.15
SQL> /
 
System altered.
 
Elapsed: 00:00:00.14

填充满这个表后,所以的块都被得到格式化,包括坏块也被格式化了。此时我们用dbv再次检查:

[oracle@ur_mdb01 /oracle$]dbv file=/dev/vg_mdb02/rdata_2g_050 BLOCKSIZE=8192
 
DBVERIFY: Release 9.2.0.6.0 - Production on Mon Apr 20 13:30:43 2009
 
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
 
DBVERIFY - Verification starting : FILE = /dev/vg_mdb02/rdata_2g_050
 
 
DBVERIFY - Verification complete
 
Total Pages Examined         : 262016
Total Pages Processed (Data) : 260715
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 1181
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 120
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 10816271785 (2.2226337193)
[oracle@ur_mdb01 /oracle$]

坏块已经处理掉了,测试rman备份正常。

· 【文章发布信息】发表于: 2009-04-20 @ 14:44:25 · ||分类: ..experience, Working case

7 条评论 »

  1. paulyibinyi 于 2009-04-20 @ 20:34:19 留言

    这种坏快是不是游离坏块

  2. 小荷 于 2009-04-21 @ 02:05:00 留言

    re paulyibinyi:是没有任何对象在上面的坏块。

  3. paulyibinyi 于 2009-04-23 @ 16:50:53 留言

    to 小荷 好的知道了 谢谢

  4. 匿名 于 2009-08-03 @ 16:54:41 留言

    为什么不用rman的blockrecover命令修复呢?

  5. 小荷 于 2009-08-03 @ 21:16:39 留言

    re 匿名:blockrecover在9i中只能在mount状态下使用,这个系统是7×24的在线系统,因此只能采用这个方式了。

  6. yyyll 于 2010-04-01 @ 22:49:15 留言

    blockrecover是可以online进行的。

  7. zhangjunhua 于 2010-05-26 @ 00:40:23 留言

    无备份这应该算是唯一的恢复方法

RSS 为此帖反馈评论 · 反向跟踪 网站

留条评论