3小时的sql调优到3分钟。
今天接到这样一个问题,某省的报表系统的一个某个处理进程在前几天处理的速度突然变慢,而且从应用的log上还看到1555的报错:
[2009-02-25 01:36:55] [200010000000007] [5]: servid(015027016400) not found in map_servid_servattr!!!
[2009-02-25 03:24:55] [200010000600008] [0]: 从光标中fetch数据出错 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6$" too small
当时第一个反应就是加大undo表空间大小和undo retention参数。但是,之前的程序跑的还比较正常,没经常性的报1555,现在突然执行很久都不出结果来,很大程度我就怀疑是执行计划发生了改变。
问了应用的同学,该程序是一个etl程序,先到报表系统取去时间值,根据时间到核心数据库查订购关系的历史,做完之后,update一个config表。大致的程序如下:
stmt1=select to_char(trunc(incdatatime),'yyyymmdd') old_date from dasc_subs_dataexport_task_cfg where taskname=1
type2=S2
stmt2=select \
to_char(actiondate,'yyyymmddhh24miss') actiondate,nvl(actionplace,5) actionplace,icpid, \
subsid, actionid,servid,\
decode(SUBSCAT, 1,0, 2,1, 3,1, 4,0, 5,1, 6,1, 7,2, 8,2, 9,3, 10,3) SUBSCAT,\
chargesubsid,nvl(channelid,'9999') channelid \
from subscription_history \
where ( actionid='SCS' or actionid='SGS' or actionid='ADS' ) and \
( actiondate >= to_date(:old_date,'yyyymmdd') ) and \
( actiondate < to_date(:old_date,'yyyymmdd')+1 ) and \
( subscat between 1 and 10 ) and \
( length(SUBSID) = 11 ) and \
( ( chargesubsid is not null ) and ( length(chargesubsid) = 11 ) )
type3=T1
stmt3=update dasc_time_config set config_value=sysdate where config_name='SUBSDATAEXP_SUBSCRIPTION'
而在alertlog中,报错1555的正是中间的那段sql:
ORA-01555 caused by SQL statement below (Query Duration=12053 sec, SCN: 0x0002.b7d94829):
Wed Feb 25 03:24:49 2009
select to_char(actiondate,'yyyymmddhh24miss') actiondate,nvl(actionplace,5) actionplace,icpid, subsid, actionid,servid, decode(SUBSCAT, 1,0, 2,1, 3,1, 4,0, 5,1, 6,1, 7,2,8,2, 9,3, 10,3) SUBSCAT, chargesubsid,nvl(channelid,'9999') channelid from subscription_history where ( actionid='SCS' or actionid='SGS' or actionid='ADS' ) and ( actiondate >= to_date(:old_date,'yyyymmdd') ) and ( actiondate < to_date(:old_date,'yyyymmdd')+1 ) and ( subscat between 1 and 10 ) and ( length(SUBSID) = 11 ) and ( ( chargesubsid is not null ) and ( length(chargesubsid) = 11 ) )
Wed Feb 25 03:34:57 2009
Thread 1 advanced to log sequence 64166
从时间上,该sql从0点开始,跑到了3点45,而undo retention是设置了3小时,已经超过了retention的值。除了这次执行时间在3小时,在应用的log中还发现从2月26日开始就有多次出现执行时间在3小时以上的。开始检查这个表的相关结构和sql的执行计划:
这个表是分区表,按照actiondate进行rang分区,每月一个分区,用于存放历史记录,目前已经有95G多。
sql的执行计划为:
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | 89 | | |
|* 1 | FILTER | | | | | | |
| 2 | PARTITION RANGE ITERATOR | | | | | KEY | KEY |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID| SUBSCRIPTION_HISTORY | 1 | 65 | 89 | KEY | KEY |
|* 4 | INDEX FULL SCAN | IDX_SUBNHISTORY_CHARGESUBSID | 441K| | 88 | KEY | KEY |
---------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_DATE(:Z,'yyyymmdd')<TO_DATE(:Z,'yyyymmdd')+1)
3 - filter(("SUBSCRIPTION_HISTORY"."ACTIONID"='ADS' OR "SUBSCRIPTION_HISTORY"."ACTIONID"='SCS' OR
"SUBSCRIPTION_HISTORY"."ACTIONID"='SGS') AND "SUBSCRIPTION_HISTORY"."ACTIONDATE">=TO_DATE(:Z,'yyyymmdd') AND
"SUBSCRIPTION_HISTORY"."ACTIONDATE"<TO_DATE(:Z,'yyyymmdd')+1 AND "SUBSCRIPTION_HISTORY"."SUBSCAT">=1 AND
"SUBSCRIPTION_HISTORY"."SUBSCAT"<=10 AND LENGTH("SUBSCRIPTION_HISTORY"."SUBSID")=11)
4 - filter("SUBSCRIPTION_HISTORY"."CHARGESUBSID" IS NOT NULL AND LENGTH("SUBSCRIPTION_HISTORY"."CHARGESUBSID")=11)
Note: cpu costing is off
22 rows selected.
SQL>
我们发现是走索引的index full scan。
尝试带入变量,查看其执行计划:
2 select to_char(actiondate, 'yyyymmddhh24miss') actiondate,
3 nvl(actionplace, 5) actionplace,
icpid,
subsid,
actionid,
servid,
decode(SUBSCAT,
4 5 6 7 8 9 1,
0,
2,
1,
3,
1,
4,
10 11 12 13 14 15 16 0,
5,
1,
6,
1,
7,
2,
17 18 19 20 21 22 23 8,
2,
9,
3,
10,
3) SUBSCAT,
24 25 26 27 28 29 chargesubsid,
nvl(channelid, '9999') channelid
from subscription_history
where (actionid = 'SCS' or actionid = 'SGS' or actionid = 'ADS')
and (actiondate >= to_date('20090225', 'yyyymmdd'))
30 31 32 33 34 and (actiondate < to_date('20090225', 'yyyymmdd') + 1)
and (subscat between 1 and 10)
and (length(SUBSID) = 11)
and ((chargesubsid is not null) and (length(chargesubsid) = 11)) 35 36 37
38 /
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | 2 | | |
|* 1 | TABLE ACCESS FULL | SUBSCRIPTION_HISTORY | 1 | 65 | 2 | 52 | 52 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(("SUBSCRIPTION_HISTORY"."ACTIONID"='ADS' OR
"SUBSCRIPTION_HISTORY"."ACTIONID"='SCS' OR "SUBSCRIPTION_HISTORY"."ACTIONID"='SGS') AND
"SUBSCRIPTION_HISTORY"."ACTIONDATE">=TO_DATE('2009-02-25 00:00:00', 'yyyy-mm-dd hh24:mi:ss')
AND "SUBSCRIPTION_HISTORY"."ACTIONDATE"<TO_DATE('2009-02-26 00:00:00', 'yyyy-mm-dd
hh24:mi:ss') AND "SUBSCRIPTION_HISTORY"."SUBSCAT">=1 AND "SUBSCRIPTION_HISTORY"."SUBSCAT"<=10
AND LENGTH("SUBSCRIPTION_HISTORY"."SUBSID")=11 AND "SUBSCRIPTION_HISTORY"."CHARGESUBSID" IS
NOT NULL AND LENGTH("SUBSCRIPTION_HISTORY"."CHARGESUBSID")=11)
Note: cpu costing is off
20 rows selected.
SQL>
发现根据程序带入的变量,是走full table scan的!!
看上去,走tsc的cost比走ifs的cost小很多,实际效果如何,我们再来通过set timing on 和autotrace来看看:
SQL>
SQL> ---- 不使用绑定变量的:
SQL> select to_char(actiondate, 'yyyymmddhh24miss') actiondate,
2 nvl(actionplace, 5) actionplace,
icpid,
subsid,
actionid,
servid,
decode(SUBSCAT,
3 4 5 6 7 8 1,
0,
2,
1,
3,
1,
9 10 11 12 13 14 4,
0,
5,
1,
6,
1,
7,
15 16 17 18 19 20 21 2,
8,
2,
9,
3,
22 23 24 25 26 10,
3) SUBSCAT,
chargesubsid,
nvl(channelid, '9999') channelid
from subscription_history
27 28 29 30 31 where (actionid = 'SCS' or actionid = 'SGS' or actionid = 'ADS')
and (actiondate >= to_date('20090224', 'yyyymmdd'))
and (actiondate < to_date('20090224', 'yyyymmdd') + 1)
and (subscat between 1 and 10)
and (length(SUBSID) = 11)
and ((chargesubsid is not null) and (length(chargesubsid) = 11)) 32 33 34 35 36
37 /
301951 rows selected.
Elapsed: 00:03:25.08
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=65)
1 0 TABLE ACCESS (FULL) OF 'SUBSCRIPTION_HISTORY' (Cost=2 Card
=1 Bytes=65)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
622076 consistent gets
468647 physical reads
0 redo size
17199853 bytes sent via SQL*Net to client
222086 bytes received via SQL*Net from client
20132 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
301951 rows processed
SQL>
SQL>
SQL> ---- 使用绑定变量:
SQL>
SQL> set autotrace trace
SQL> var old_date VARCHAR2(20);
SQL> exec :old_date:=20090224;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.01
SQL> select to_char(actiondate, 'yyyymmddhh24miss') actiondate,
2 nvl(actionplace, 5) actionplace,
icpid,
subsid,
actionid,
servid,
decode(SUBSCAT,
3 4 5 6 7 8 1,
0,
2,
1,
3,
1,
9 10 11 12 13 14 4,
0,
5,
1,
6,
1,
7,
15 16 17 18 19 20 21 2,
8,
2,
9,
3,
10,
3) SUBSCAT,
22 23 24 25 26 27 28 chargesubsid,
nvl(channelid, '9999') channelid
from subscription_history
where (actionid = 'SCS' or actionid = 'SGS' or actionid = 'ADS')
and (actiondate >= to_date(:old_date, 'yyyymmdd'))
and (actiondate < to_date(:old_date, 'yyyymmdd') + 1)
29 30 31 32 33 34 and (subscat between 1 and 10)
and (length(SUBSID) = 11)
and ((chargesubsid is not null) and (length(chargesubsid) = 11)) 35 36
37 /
301951 rows selected.
Elapsed: 00:03:31.43
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=89 Card=1 Bytes=65)
1 0 FILTER
2 1 PARTITION RANGE (ITERATOR)
3 2 TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'SUBSCRIPTION_H
ISTORY' (Cost=89 Card=1 Bytes=65)
4 3 INDEX (FULL SCAN) OF 'IDX_SUBNHISTORY_CHARGESUBSID'
(NON-UNIQUE) (Cost=88 Card=441874)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
623306 consistent gets
432928 physical reads
0 redo size
17199853 bytes sent via SQL*Net to client
222086 bytes received via SQL*Net from client
20132 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
301951 rows processed
SQL>
SQL>
SQL>
SQL> set autotrace off
我们发现无论使用绑定变量走ifs还是不使用绑定变量走tsc,二者在时间上消耗都是类似的,基本都在三分半钟。为什么会跑3个小时还没出来呢?难道是etl程序的问题?
开始通过oradebug开始trace,一边让应用开始跑etl的压缩进程,一边在数据库端查trace。发现trace中的等待基本都是:
WAIT #1: nam='db file sequential read' ela= 7172 p1=826 p2=11416 p3=1
WAIT #1: nam='db file sequential read' ela= 234 p1=826 p2=11417 p3=1
WAIT #1: nam='db file sequential read' ela= 6315 p1=683 p2=120892 p3=1
WAIT #1: nam='db file sequential read' ela= 461 p1=826 p2=11418 p3=1
WAIT #1: nam='db file sequential read' ela= 234 p1=826 p2=11419 p3=1
WAIT #1: nam='db file sequential read' ela= 7028 p1=585 p2=102779 p3=1
WAIT #1: nam='db file sequential read' ela= 8525 p1=587 p2=85577 p3=1
WAIT #1: nam='db file sequential read' ela= 278 p1=587 p2=85578 p3=1
WAIT #1: nam='db file sequential read' ela= 287 p1=587 p2=85579 p3=1
WAIT #1: nam='db file sequential read' ela= 6794 p1=719 p2=126644 p3=1
WAIT #1: nam='db file sequential read' ela= 1400 p1=587 p2=85580 p3=1
WAIT #1: nam='db file sequential read' ela= 5880 p1=742 p2=112742 p3=1
WAIT #1: nam='db file sequential read' ela= 259 p1=742 p2=112743 p3=1
WAIT #1: nam='db file sequential read' ela= 9805 p1=742 p2=112744 p3=1
WAIT #1: nam='db file sequential read' ela= 416 p1=742 p2=112745 p3=1
WAIT #1: nam='db file sequential read' ela= 168 p1=742 p2=112746 p3=1
WAIT #1: nam='db file sequential read' ela= 4134 p1=722 p2=81027 p3=1
WAIT #1: nam='db file sequential read' ela= 10307 p1=759 p2=114703 p3=1
WAIT #1: nam='db file sequential read' ela= 6342 p1=759 p2=115072 p3=1
WAIT #1: nam='db file sequential read' ela= 8258 p1=783 p2=91289 p3=1
WAIT #1: nam='db file sequential read' ela= 284 p1=783 p2=91290 p3=1
WAIT #1: nam='db file sequential read' ela= 225 p1=783 p2=91291 p3=1
WAIT #1: nam='db file sequential read' ela= 5672 p1=829 p2=4691 p3=1
WAIT #1: nam='db file sequential read' ela= 370 p1=829 p2=4692 p3=1
WAIT #1: nam='db file sequential read' ela= 538 p1=829 p2=4693 p3=1
WAIT #1: nam='db file sequential read' ela= 2279 p1=718 p2=95053 p3=1
WAIT #1: nam='db file sequential read' ela= 417 p1=829 p2=4694 p3=1
WAIT #1: nam='db file sequential read' ela= 168 p1=829 p2=4695 p3=1
WAIT #1: nam='db file sequential read' ela= 5722 p1=454 p2=112829 p3=1
WAIT #1: nam='db file sequential read' ela= 286 p1=454 p2=112830 p3=1
WAIT #1: nam='db file sequential read' ela= 331 p1=454 p2=112831 p3=1
……
从trace来看,etl确实是走了ifs。等了15分钟左右吧,该等待事件仍然在继续,看来,走ifs,可能是要3个小时以上了,而不是我在sqlplus中测试的3分半钟。
于是叫应用把etl进程先停掉了,为sql加上full的hints,再开始跑,开始trace……
WAIT #1: nam='db file scattered read' ela= 4888 p1=820 p2=2498 p3=16
WAIT #1: nam='db file sequential read' ela= 326 p1=820 p2=2514 p3=1
WAIT #1: nam='db file scattered read' ela= 636 p1=820 p2=2516 p3=5
WAIT #1: nam='db file sequential read' ela= 321 p1=820 p2=2522 p3=1
WAIT #1: nam='db file scattered read' ela= 487 p1=820 p2=2524 p3=3
WAIT #1: nam='db file sequential read' ela= 314 p1=820 p2=2528 p3=1
WAIT #1: nam='db file scattered read' ela= 1295 p1=820 p2=2530 p3=16
WAIT #1: nam='db file scattered read' ela= 1017 p1=820 p2=2546 p3=10
WAIT #1: nam='db file scattered read' ela= 1084 p1=820 p2=2557 p3=6
WAIT #1: nam='db file scattered read' ela= 3544 p1=820 p2=2564 p3=16
WAIT #1: nam='db file scattered read' ela= 1712 p1=820 p2=2580 p3=16
WAIT #1: nam='db file scattered read' ela= 588 p1=820 p2=2596 p3=4
WAIT #1: nam='db file scattered read' ela= 987 p1=820 p2=2601 p3=9
WAIT #1: nam='db file scattered read' ela= 1137 p1=820 p2=2611 p3=13
WAIT #1: nam='db file scattered read' ela= 1302 p1=820 p2=2625 p3=16
WAIT #1: nam='db file scattered read' ela= 613 p1=820 p2=2641 p3=4
WAIT #1: nam='db file scattered read' ela= 1836 p1=820 p2=2646 p3=16
WAIT #1: nam='db file scattered read' ela= 507 p1=820 p2=2662 p3=3
WAIT #1: nam='db file scattered read' ela= 1550 p1=820 p2=2666 p3=16
WAIT #1: nam='db file scattered read' ela= 1151 p1=820 p2=2682 p3=7
WAIT #1: nam='db file scattered read' ela= 387 p1=820 p2=2690 p3=2
WAIT #1: nam='db file scattered read' ela= 4222 p1=820 p2=2693 p3=16
WAIT #1: nam='db file scattered read' ela= 406 p1=820 p2=2709 p3=2
WAIT #1: nam='db file sequential read' ela= 377 p1=820 p2=2712 p3=1
WAIT #1: nam='db file scattered read' ela= 2832 p1=820 p2=2714 p3=16
WAIT #1: nam='db file scattered read' ela= 383 p1=820 p2=2730 p3=2
WAIT #1: nam='db file scattered read' ela= 520 p1=820 p2=2733 p3=4
WAIT #1: nam='db file scattered read' ela= 495 p1=820 p2=2738 p3=3
WAIT #1: nam='db file scattered read' ela= 517 p1=820 p2=2742 p3=4
WAIT #1: nam='db file scattered read' ela= 1191 p1=820 p2=2747 p3=8
……
时间,3分钟跑完!!
在这里,sql优化的对策基本已经找到,就是走tsc。但是,为什么在2月26日之后会突然执行计划变差,变成ifs?为什么我在sqlplus上测试的走ifs也只要3分班钟,而etl跑的走ifs的却要超过3个小时?
为了找个答案,我再做了一次10053的trace,去看看oracle是怎么样选择执行路径的?
Dump file /oracle/app/oracle/admin/gxmisc/udump/gxmisc_ora_2495.trc
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.6.0 - Production
ORACLE_HOME = /oracle/app/oracle/product/9.2.0
System name: HP-UX
Node name: gx_db01
Release: B.11.11
Version: U
Machine: 9000/800
Instance name: gxmisc
Redo thread mounted by this instance: 1
Oracle process number: 120
Unix process pid: 2495, image: oracle@gx_db01 (TNS V1-V3)
*** 2009-02-26 19:54:28.245
*** SESSION ID:(218.27134) 2009-02-26 19:54:28.202
QUERY
BEGIN :old_date:=20090224; END;
*** 2009-02-26 19:54:45.186
QUERY
select to_char(actiondate, 'yyyymmddhh24miss') actiondate,
nvl(actionplace, 5) actionplace,
icpid,
subsid,
actionid,
servid,
decode(SUBSCAT,
1,
0,
2,
1,
3,
1,
4,
0,
5,
1,
6,
1,
7,
2,
8,
2,
9,
3,
10,
3) SUBSCAT,
chargesubsid,
nvl(channelid, '9999') channelid
from subscription_history
where (actionid = 'SCS' or actionid = 'SGS' or actionid = 'ADS')
and (actiondate >= to_date(:old_date, 'yyyymmdd'))
and (actiondate < to_date(:old_date, 'yyyymmdd') + 1)
and (subscat between 1 and 10)
and (length(SUBSID) = 11)
and ((chargesubsid is not null) and (length(chargesubsid) = 11))
***************************************
SINGLE TABLE ACCESS PATH
Column: ACTIONID Col#: 4 Part#: 51 Table: SUBSCRIPTION_HISTORY Alias: SUBSCRIPTION_HISTORY
NO STATISTICS (using defaults)
NDV: 3 NULLS: 0 DENS: 3.9024e-01
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column: ACTIONID Col#: 4 Table: SUBSCRIPTION_HISTORY Alias: SUBSCRIPTION_HISTORY
NO STATISTICS (using defaults)
NDV: 3 NULLS: 0 DENS: 3.9024e-01
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column: ACTIONDATE Col#: 6 Part#: 51 Table: SUBSCRIPTION_HISTORY Alias: SUBSCRIPTION_HISTORY
NO STATISTICS (using defaults)
NDV: 3 NULLS: 0 DENS: 3.9024e-01
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column: ACTIONDATE Col#: 6 Table: SUBSCRIPTION_HISTORY Alias: SUBSCRIPTION_HISTORY
NO STATISTICS (using defaults)
NDV: 3 NULLS: 0 DENS: 3.9024e-01
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column: SUBSCAT Col#: 20 Part#: 51 Table: SUBSCRIPTION_HISTORY Alias: SUBSCRIPTION_HISTORY
NO STATISTICS (using defaults)
NDV: 3 NULLS: 0 DENS: 3.9024e-01
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column: SUBSCAT Col#: 20 Table: SUBSCRIPTION_HISTORY Alias: SUBSCRIPTION_HISTORY
NO STATISTICS (using defaults)
NDV: 3 NULLS: 0 DENS: 3.9024e-01
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column: CHARGESUBS Col#: 21 Part#: 51 Table: SUBSCRIPTION_HISTORY Alias: SUBSCRIPTION_HISTORY
NO STATISTICS (using defaults)
NDV: 3 NULLS: 0 DENS: 3.9024e-01
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column: CHARGESUBS Col#: 21 Table: SUBSCRIPTION_HISTORY Alias: SUBSCRIPTION_HISTORY
NO STATISTICS (using defaults)
NDV: 3 NULLS: 0 DENS: 3.9024e-01
NO HISTOGRAM: #BKT: 0 #VAL: 0
TABLE: SUBSCRIPTION_HISTORY ORIG CDN: 82 ROUNDED CDN: 1 CMPTD CDN: 0
Access path: tsc Resc: 2 Resp: 2
Access path: index (no sta/stp keys)
Index: IDX_SUBNHISTORY_CHARGESUBSID
TABLE: SUBSCRIPTION_HISTORY
RSC_CPU: 0 RSC_IO: 26
IX_SEL: 1.0000e+00 TB_SEL: 1.0000e+00
Access path: index (no sta/stp keys)
Index: IDX_SUBNHISTORY_HISTORYID
TABLE: SUBSCRIPTION_HISTORY
RSC_CPU: 0 RSC_IO: 26
IX_SEL: 1.0000e+00 TB_SEL: 1.0000e+00
Access path: index (no sta/stp keys)
Index: IDX_SUBNHISTORY_MID
TABLE: SUBSCRIPTION_HISTORY
RSC_CPU: 0 RSC_IO: 26
IX_SEL: 1.0000e+00 TB_SEL: 1.0000e+00
Access path: index (no sta/stp keys)
Index: IDX_SUBNHISTORY_SERVID
TABLE: SUBSCRIPTION_HISTORY
RSC_CPU: 0 RSC_IO: 26
IX_SEL: 1.0000e+00 TB_SEL: 1.0000e+00
Access path: index (no sta/stp keys)
Index: IDX_SUBNHISTORY_SUBSID
TABLE: SUBSCRIPTION_HISTORY
RSC_CPU: 0 RSC_IO: 26
IX_SEL: 1.0000e+00 TB_SEL: 1.0000e+00
BEST_CST: 2.00 PATH: 2 Degree: 1
***************************************
GENERAL PLANS
***********************
Join order[1]: SUBSCRIPTION_HISTORY[SUBSCRIPTION_HISTORY]#0
Best so far: TABLE#: 0 CST: 2 CDN: 1 BYTES: 65
Final - All Rows Plan:
JOIN ORDER: 1
CST: 2 CDN: 1 RSC: 2 RSP: 2 BYTES: 65
IO-RSC: 2 IO-RSP: 2 CPU-RSC: 0 CPU-RSP: 0
*** 2009-02-26 20:01:07.102
QUERY
alter session set events '10053 trace name context off'
oracle@gx_db01:/oracle/app/oracle/admin/gxmisc/udump >
令我感到惊奇的是:这个明明在sqlplus中显示走ifs的语句,在10053trace出来的文件中竟然看到是tsc的cost最小,就是还是走tsc的!!
于是这应该能够解释,为什么我用etl走ifs要3小时以上,而我在sqlplus中看到也是走ifs,但是只要3分半钟。——因为其实它也是走了tsc,不过在执行计划中看到走ifs了。不过,这样的情况之前都没遇到过,难道是个bug?
· 【文章发布信息】发表于: 2009-03-02 @ 14:57:53 · ||分类: ..experience, Working case



CopyRight ©
安佰胜 于 2009-03-02 @ 15:11:29 留言 :
这个恶了。。。。。。
棉花糖ONE 于 2009-03-02 @ 18:39:55 留言 :
你这表和索引的统计信息应该都不准确吧,看cost是没有意义的,你最后那个真实的执行计划应该是重用了share pool中的,而你的set autot trace看到的执行计划是这时候生成的,并不是真实的执行计划
棉花糖ONE 于 2009-03-02 @ 18:42:16 留言 :
etl的程序还用绑定变量,很强大很邪恶
木匠Charlie 于 2009-03-03 @ 02:37:59 留言 :
To 作者小荷,
提一些意见,不要不爱听. 良药苦口. 嘻嘻.
其中 棉花糖ONE 已经提到了一些,
* 没有表/分区统计信息, 分析10053 trace 没有指导意义.
* 数据仓库/批处理环境, 大的查询使用常量好一些, 可以帮助优化器得到有效SQL执行路径; case by case, 仍需分析应用逻辑后使用.
* 文章组织有些乱, 建议先写结果,再写测试案例
* 无法重现 测试结果, 给出完整测试案例
* test SQL和 production SQL运行环境不一样, 使用下面的查询显示实际运行的production SQL, 可能不支持 9.2, 赶紧升级吧.
SELECT t.*
FROM v$sql s, table(DBMS_XPLAN.DISPLAY_CURSOR(s.sql_id, s.child_number)) t
WHERE sql_text LIKE ‘%my_sql_text%’ and sql_text NOT LIKE ‘%v$sql%’;
To 棉花糖ONE,
你的内功深厚呀, 到处留下真知灼见. ^_^
How to contact you? e.g. email, MSN.
Thanks,
-MJ
小荷 于 2009-03-03 @ 14:01:36 留言 :
re 木匠:
谢谢你的意见,以后在写的时候,我会注意理清文章的组织。这次写文章的时候确实思路有点乱了,写这个文章的本意是抛砖引玉(还当了一回标题党,嘿嘿),希望大家能讨论这个结果是否是bug,我在处理的过程中哪里还分析的不对。
统计信息是比较早,05年7月之后就没有被分析过了。
要使用常量,作为一名维护人员来说比较困难的,涉及到修改代码甚至是业务逻辑。
上述的所有sql和执行计划是生产环境上的。
DISPLAY_CURSOR在9206上是没有的。数据库版本的升级,目前已经纳入考虑范围内了,一方面测试部已经开始测试,另一方面要等集团公司下文通知。
木匠Charlie 于 2009-03-03 @ 14:16:26 留言 :
有容乃大,分享知识,共同进步.
TOP(Christian Antognini的Troubleshooting Oracle Performance)里面讲了如何显示Oracle 9.2 v$sql_plan里面的执行计划, 你自己看看, 201~203页.
Display execution plan in the library cache.
(我眼下主要在Oracle 11.1.0.7做应用开发, 所以对9.2已经提不起兴趣了).
小荷 于 2009-03-03 @ 22:14:30 留言 :
re 木匠:呵呵,好书!谢谢推荐哈~~~
hoorace 于 2009-03-05 @ 09:05:48 留言 :
不能不佩服一下写sql的水平,现网的sql效率其实没有传说中的那么低滴。
onlyring 于 2009-09-20 @ 01:50:26 留言 :
Is this helpful to you?
http://www.oracle.com/technology/oramag/oracle/08-jan/o18asktom.html