3小时的sql调优到3分钟。


今天接到这样一个问题,某省的报表系统的一个某个处理进程在前几天处理的速度突然变慢,而且从应用的log上还看到1555的报错:

[2009-02-25 00:02:27] [200010000000006] [5]: 任务1执行失败!!!
[2009-02-25 01:36:55] [200010000000007] [5]: servid(015027016400) not found in map_servid_servattr!!!
[2009-02-25 03:24:55] [200010000600008] [0]: 从光标中fetch数据出错 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6$" too small

当时第一个反应就是加大undo表空间大小和undo retention参数。但是,之前的程序跑的还比较正常,没经常性的报1555,现在突然执行很久都不出结果来,很大程度我就怀疑是执行计划发生了改变。

问了应用的同学,该程序是一个etl程序,先到报表系统取去时间值,根据时间到核心数据库查订购关系的历史,做完之后,update一个config表。大致的程序如下:

type1=T1
stmt1=select to_char(trunc(incdatatime),'yyyymmdd') old_date from dasc_subs_dataexport_task_cfg where taskname=1
 
type2=S2
stmt2=select  \
        to_char(actiondate,'yyyymmddhh24miss') actiondate,nvl(actionplace,5) actionplace,icpid, \
        subsid, actionid,servid,\
        decode(SUBSCAT, 1,0, 2,1, 3,1, 4,0, 5,1, 6,1, 7,2, 8,2, 9,3, 10,3) SUBSCAT,\
        chargesubsid,nvl(channelid,'9999') channelid \
        from subscription_history \
        where ( actionid='SCS' or actionid='SGS' or actionid='ADS' ) and \
              ( actiondate >= to_date(:old_date,'yyyymmdd') ) and \
              ( actiondate
<  to_date(:old_date,'yyyymmdd')+1and \
              (
subscat between 1 and 10 ) and \
              (
length(SUBSID) = 11 ) and \
              ( (
chargesubsid is not null ) and ( length(chargesubsid) = 11 )  )
 
type3=T1
stmt3=update dasc_time_config set config_value=sysdate where config_name='SUBSDATAEXP_SUBSCRIPTION'

而在alertlog中,报错1555的正是中间的那段sql:

Wed Feb 25 03:24:49 2009
ORA-01555 caused by SQL statement below (Query Duration=12053 sec, SCN: 0x0002.b7d94829):
Wed Feb 25 03:24:49 2009
select   to_char(actiondate,'yyyymmddhh24miss') actiondate,nvl(actionplace,5) actionplace,icpid,  subsid, actionid,servid, decode(SUBSCAT, 1,0, 2,1, 3,1, 4,0, 5,1, 6,1, 7,2,8,2, 9,3, 10,3) SUBSCAT, chargesubsid,nvl(channelid,'9999') channelid  from subscription_history  where ( actionid='SCS' or actionid='SGS' or actionid='ADS' ) and  ( actiondate >= to_date(:old_date,'yyyymmdd') ) and  ( actiondate
<  to_date(:old_date,'yyyymmdd')+1and  ( subscat between 1 and 10 ) and  ( length(SUBSID) = 11 ) and  ( ( chargesubsid is not null ) and ( length(chargesubsid) = 11 )  )
Wed Feb 25 03:34:57 2009
Thread 1 advanced to log sequence 64166

从时间上,该sql从0点开始,跑到了3点45,而undo retention是设置了3小时,已经超过了retention的值。除了这次执行时间在3小时,在应用的log中还发现从2月26日开始就有多次出现执行时间在3小时以上的。开始检查这个表的相关结构和sql的执行计划:
这个表是分区表,按照actiondate进行rang分区,每月一个分区,用于存放历史记录,目前已经有95G多。
sql的执行计划为:

SQL> select * from table(dbms_xplan.display);
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--
 

-------------------------------------------------------------------------------------------------------------------
--
| Id  | Operation                           |  Name                         | Rows  | Bytes | Cost  | Pstart| Pstop |

-------------------------------------------------------------------------------------------------------------------
--
|   0 | SELECT STATEMENT                    |                               |     1 |    65 |    89 |       |       |

|* 
1FILTER                             |                               |       |       |       |       |       |
|   
2 |   PARTITION RANGE ITERATOR          |                               |       |       |       |   KEY |   KEY |
|* 
3 |    TABLE ACCESS BY LOCAL INDEX ROWID| SUBSCRIPTION_HISTORY          |     1 |    65 |    89 |   KEY |   KEY |
|* 
4 |     INDEX FULL SCAN                 | IDX_SUBNHISTORY_CHARGESUBSID  |   441K|       |    88 |   KEY |   KEY |
-------------------------------------------------------------------------------------------------------------------
--
 

Predicate Information (identified by operation id):
-------------------------------------------------
--
 

  
1 - filter(TO_DATE(:Z,'yyyymmdd')<TO_DATE(:Z,'yyyymmdd')+1)
  
3 - filter(("SUBSCRIPTION_HISTORY"."ACTIONID"='ADS' OR "SUBSCRIPTION_HISTORY"."ACTIONID"='SCS' OR
              
"SUBSCRIPTION_HISTORY"."ACTIONID"='SGS') AND "SUBSCRIPTION_HISTORY"."ACTIONDATE">=TO_DATE(:Z,'yyyymmdd') AND
              
"SUBSCRIPTION_HISTORY"."ACTIONDATE"<TO_DATE(:Z,'yyyymmdd')+1 AND "SUBSCRIPTION_HISTORY"."SUBSCAT">=1 AND
              
"SUBSCRIPTION_HISTORY"."SUBSCAT"<=10 AND LENGTH("SUBSCRIPTION_HISTORY"."SUBSID")=11)
  
4 - filter("SUBSCRIPTION_HISTORY"."CHARGESUBSID" IS NOT NULL AND LENGTH("SUBSCRIPTION_HISTORY"."CHARGESUBSID")=11)
 
Note: cpu costing is off
 
22 rows selected.
 
SQL>

我们发现是走索引的index full scan。

尝试带入变量,查看其执行计划:

SQL> explain plan for
 
2  select to_char(actiondate, 'yyyymmddhh24miss') actiondate,
 
3         nvl(actionplace, 5) actionplace,
      
icpid,
      
subsid,
      
actionid,
      
servid,
      
decode(SUBSCAT,
 
4    5    6    7    8    9                1,
              
0,
              
2,
              
1,
              
3,
              
1,
              
4,
 
10   11   12   13   14   15   16                0,
              
5,
              
1,
              
6,
              
1,
              
7,
              
2,
 
17   18   19   20   21   22   23                8,
              
2,
              
9,
              
3,
              
10,
              
3) SUBSCAT,
 
24   25   26   27   28   29         chargesubsid,
      
nvl(channelid, '9999') channelid
 
from subscription_history
 
where (actionid = 'SCS' or actionid = 'SGS' or actionid = 'ADS')
  
and (actiondate >= to_date('20090225', 'yyyymmdd'))
 
30   31   32   33   34     and (actiondate < to_date('20090225', 'yyyymmdd') + 1)
  
and (subscat between 1 and 10)
  
and (length(SUBSID) = 11)
  
and ((chargesubsid is not null) and (length(chargesubsid) = 11)) 35   36   37 
 
38  /
 
Explained.
 
SQLselect * from table(dbms_xplan.display);
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--
 

--------------------------------------------------------------------------------------------
--
| Id  | Operation            |  Name                 | Rows  | Bytes | Cost  | Pstart| Pstop |

--------------------------------------------------------------------------------------------
--
|   0 | SELECT STATEMENT     |                       |     1 |    65 |     2 |       |       |

|* 
1TABLE ACCESS FULL   | SUBSCRIPTION_HISTORY  |     1 |    65 |     2 |    52 |    52 |
--------------------------------------------------------------------------------------------
--
 

Predicate Information (identified by operation id):
-------------------------------------------------
--
 

  
1 - filter(("SUBSCRIPTION_HISTORY"."ACTIONID"='ADS' OR
              
"SUBSCRIPTION_HISTORY"."ACTIONID"='SCS' OR "SUBSCRIPTION_HISTORY"."ACTIONID"='SGS') AND
              
"SUBSCRIPTION_HISTORY"."ACTIONDATE">=TO_DATE('2009-02-25 00:00:00', 'yyyy-mm-dd hh24:mi:ss')
              
AND "SUBSCRIPTION_HISTORY"."ACTIONDATE"<TO_DATE('2009-02-26 00:00:00', 'yyyy-mm-dd
              hh24:mi:ss
') AND "SUBSCRIPTION_HISTORY"."SUBSCAT">=1 AND "SUBSCRIPTION_HISTORY"."SUBSCAT"<=10
              
AND LENGTH("SUBSCRIPTION_HISTORY"."SUBSID")=11 AND "SUBSCRIPTION_HISTORY"."CHARGESUBSID" IS
              
NOT NULL AND LENGTH("SUBSCRIPTION_HISTORY"."CHARGESUBSID")=11)
 
Note: cpu costing is off
 
20 rows selected.
 
SQL>

发现根据程序带入的变量,是走full table scan的!!

看上去,走tsc的cost比走ifs的cost小很多,实际效果如何,我们再来通过set timing on 和autotrace来看看:

SQL> set autotrace traceonly
SQL>
SQL> ---- 不使用绑定变量的:
SQL> select to_char(actiondate, 'yyyymmddhh24miss') actiondate,
 
2         nvl(actionplace, 5) actionplace,
      
icpid,
      
subsid,
      
actionid,
      
servid,
      
decode(SUBSCAT,
 
3    4    5    6    7    8                1,
              
0,
              
2,
              
1,
              
3,
              
1,
 
9   10   11   12   13   14                4,
              
0,
              
5,
              
1,
              
6,
              
1,
              
7,
 
15   16   17   18   19   20   21                2,
              
8,
              
2,
              
9,
              
3,
 
22   23   24   25   26                10,
              
3) SUBSCAT,
      
chargesubsid,
      
nvl(channelid, '9999') channelid
 
from subscription_history
 
27   28   29   30   31   where (actionid = 'SCS' or actionid = 'SGS' or actionid = 'ADS')
  
and (actiondate >= to_date('20090224', 'yyyymmdd'))
  
and (actiondate < to_date('20090224', 'yyyymmdd') + 1)
  
and (subscat between 1 and 10)
  
and (length(SUBSID) = 11)
  
and ((chargesubsid is not null) and (length(chargesubsid) = 11)) 32   33   34   35   36 
 
37  /
 
301951 rows selected.
 
Elapsed: 00:03:25.08
 
Execution Plan
--------------------------------------------------------
--
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=65)

  
1    0   TABLE ACCESS (FULL) OF 'SUBSCRIPTION_HISTORY' (Cost=2 Card
          =
1 Bytes=65)
 
 
 
 
 
Statistics
--------------------------------------------------------
--
          0  recursive calls

          
0  db block gets
    
622076  consistent gets
    
468647  physical reads
          
0  redo size
  
17199853  bytes sent via SQL*Net to client
    
222086  bytes received via SQL*Net from client
      
20132  SQL*Net roundtrips to/from client
          
0  sorts (memory)
          
0  sorts (disk)
    
301951  rows processed
 
SQL
SQL>
SQL> ---- 使用绑定变量:
SQL>
SQL> set autotrace trace   
SQL> var old_date VARCHAR2(20);
SQL> exec :old_date:=20090224;
 
PL/SQL procedure successfully completed.
 
Elapsed: 00:00:00.01
SQL> select to_char(actiondate, 'yyyymmddhh24miss') actiondate,
 
2         nvl(actionplace, 5) actionplace,
      
icpid,
      
subsid,
      
actionid,
      
servid,
      
decode(SUBSCAT,
 
3    4    5    6    7    8                1,
              
0,
              
2,
              
1,
              
3,
              
1,
 
9   10   11   12   13   14                4,
              
0,
              
5,
              
1,
              
6,
              
1,
              
7,
 
15   16   17   18   19   20   21                2,
              
8,
              
2,
              
9,
              
3,
              
10,
              
3) SUBSCAT,
 
22   23   24   25   26   27   28         chargesubsid,
      
nvl(channelid, '9999') channelid
 
from subscription_history
 
where (actionid = 'SCS' or actionid = 'SGS' or actionid = 'ADS')
  
and (actiondate >= to_date(:old_date, 'yyyymmdd'))
  
and (actiondate < to_date(:old_date, 'yyyymmdd') + 1)
 
29   30   31   32   33   34     and (subscat between 1 and 10)
  
and (length(SUBSID) = 11)
  
and ((chargesubsid is not null) and (length(chargesubsid) = 11)) 35   36 
 
37  /
 
301951 rows selected.
 
Elapsed: 00:03:31.43
 
Execution Plan
--------------------------------------------------------
--
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=89 Card=1 Bytes=65)

  
1    0   FILTER
  
2    1     PARTITION RANGE (ITERATOR)
  
3    2       TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'SUBSCRIPTION_H
          ISTORY
' (Cost=89 Card=1 Bytes=65)
 
  
4    3         INDEX (FULL SCAN) OF 'IDX_SUBNHISTORY_CHARGESUBSID'
          
(NON-UNIQUE) (Cost=88 Card=441874)
 
 
 
 
 
Statistics
--------------------------------------------------------
--
          0  recursive calls

          
0  db block gets
    
623306  consistent gets
    
432928  physical reads
          
0  redo size
  
17199853  bytes sent via SQL*Net to client
    
222086  bytes received via SQL*Net from client
      
20132  SQL*Net roundtrips to/from client
          
0  sorts (memory)
          
0  sorts (disk)
    
301951  rows processed
 
SQL>     
SQL>
SQL>
SQL> set autotrace off

我们发现无论使用绑定变量走ifs还是不使用绑定变量走tsc,二者在时间上消耗都是类似的,基本都在三分半钟。为什么会跑3个小时还没出来呢?难道是etl程序的问题?

开始通过oradebug开始trace,一边让应用开始跑etl的压缩进程,一边在数据库端查trace。发现trace中的等待基本都是:

WAIT #1: nam='db file sequential read' ela= 10189 p1=826 p2=11415 p3=1
WAIT #1: nam='db file sequential read' ela= 7172 p1=826 p2=11416 p3=1
WAIT #1: nam='db file sequential read' ela= 234 p1=826 p2=11417 p3=1
WAIT #1: nam='db file sequential read' ela= 6315 p1=683 p2=120892 p3=1
WAIT #1: nam='db file sequential read' ela= 461 p1=826 p2=11418 p3=1
WAIT #1: nam='db file sequential read' ela= 234 p1=826 p2=11419 p3=1
WAIT #1: nam='db file sequential read' ela= 7028 p1=585 p2=102779 p3=1
WAIT #1: nam='db file sequential read' ela= 8525 p1=587 p2=85577 p3=1
WAIT #1: nam='db file sequential read' ela= 278 p1=587 p2=85578 p3=1
WAIT #1: nam='db file sequential read' ela= 287 p1=587 p2=85579 p3=1
WAIT #1: nam='db file sequential read' ela= 6794 p1=719 p2=126644 p3=1
WAIT #1: nam='db file sequential read' ela= 1400 p1=587 p2=85580 p3=1
WAIT #1: nam='db file sequential read' ela= 5880 p1=742 p2=112742 p3=1
WAIT #1: nam='db file sequential read' ela= 259 p1=742 p2=112743 p3=1
WAIT #1: nam='db file sequential read' ela= 9805 p1=742 p2=112744 p3=1
WAIT #1: nam='db file sequential read' ela= 416 p1=742 p2=112745 p3=1
WAIT #1: nam='db file sequential read' ela= 168 p1=742 p2=112746 p3=1
WAIT #1: nam='db file sequential read' ela= 4134 p1=722 p2=81027 p3=1
WAIT #1: nam='db file sequential read' ela= 10307 p1=759 p2=114703 p3=1
WAIT #1: nam='db file sequential read' ela= 6342 p1=759 p2=115072 p3=1
WAIT #1: nam='db file sequential read' ela= 8258 p1=783 p2=91289 p3=1
WAIT #1: nam='db file sequential read' ela= 284 p1=783 p2=91290 p3=1
WAIT #1: nam='db file sequential read' ela= 225 p1=783 p2=91291 p3=1
WAIT #1: nam='db file sequential read' ela= 5672 p1=829 p2=4691 p3=1
WAIT #1: nam='db file sequential read' ela= 370 p1=829 p2=4692 p3=1
WAIT #1: nam='db file sequential read' ela= 538 p1=829 p2=4693 p3=1
WAIT #1: nam='db file sequential read' ela= 2279 p1=718 p2=95053 p3=1
WAIT #1: nam='db file sequential read' ela= 417 p1=829 p2=4694 p3=1
WAIT #1: nam='db file sequential read' ela= 168 p1=829 p2=4695 p3=1
WAIT #1: nam='db file sequential read' ela= 5722 p1=454 p2=112829 p3=1
WAIT #1: nam='db file sequential read' ela= 286 p1=454 p2=112830 p3=1
WAIT #1: nam='db file sequential read' ela= 331 p1=454 p2=112831 p3=1
……

从trace来看,etl确实是走了ifs。等了15分钟左右吧,该等待事件仍然在继续,看来,走ifs,可能是要3个小时以上了,而不是我在sqlplus中测试的3分半钟。

于是叫应用把etl进程先停掉了,为sql加上full的hints,再开始跑,开始trace……

WAIT #1: nam='db file scattered read' ela= 980 p1=820 p2=2488 p3=7
WAIT #1: nam='db file scattered read' ela= 4888 p1=820 p2=2498 p3=16
WAIT #1: nam='db file sequential read' ela= 326 p1=820 p2=2514 p3=1
WAIT #1: nam='db file scattered read' ela= 636 p1=820 p2=2516 p3=5
WAIT #1: nam='db file sequential read' ela= 321 p1=820 p2=2522 p3=1
WAIT #1: nam='db file scattered read' ela= 487 p1=820 p2=2524 p3=3
WAIT #1: nam='db file sequential read' ela= 314 p1=820 p2=2528 p3=1
WAIT #1: nam='db file scattered read' ela= 1295 p1=820 p2=2530 p3=16
WAIT #1: nam='db file scattered read' ela= 1017 p1=820 p2=2546 p3=10
WAIT #1: nam='db file scattered read' ela= 1084 p1=820 p2=2557 p3=6
WAIT #1: nam='db file scattered read' ela= 3544 p1=820 p2=2564 p3=16
WAIT #1: nam='db file scattered read' ela= 1712 p1=820 p2=2580 p3=16
WAIT #1: nam='db file scattered read' ela= 588 p1=820 p2=2596 p3=4
WAIT #1: nam='db file scattered read' ela= 987 p1=820 p2=2601 p3=9
WAIT #1: nam='db file scattered read' ela= 1137 p1=820 p2=2611 p3=13
WAIT #1: nam='db file scattered read' ela= 1302 p1=820 p2=2625 p3=16
WAIT #1: nam='db file scattered read' ela= 613 p1=820 p2=2641 p3=4
WAIT #1: nam='db file scattered read' ela= 1836 p1=820 p2=2646 p3=16
WAIT #1: nam='db file scattered read' ela= 507 p1=820 p2=2662 p3=3
WAIT #1: nam='db file scattered read' ela= 1550 p1=820 p2=2666 p3=16
WAIT #1: nam='db file scattered read' ela= 1151 p1=820 p2=2682 p3=7
WAIT #1: nam='db file scattered read' ela= 387 p1=820 p2=2690 p3=2
WAIT #1: nam='db file scattered read' ela= 4222 p1=820 p2=2693 p3=16
WAIT #1: nam='db file scattered read' ela= 406 p1=820 p2=2709 p3=2
WAIT #1: nam='db file sequential read' ela= 377 p1=820 p2=2712 p3=1
WAIT #1: nam='db file scattered read' ela= 2832 p1=820 p2=2714 p3=16
WAIT #1: nam='db file scattered read' ela= 383 p1=820 p2=2730 p3=2
WAIT #1: nam='db file scattered read' ela= 520 p1=820 p2=2733 p3=4
WAIT #1: nam='db file scattered read' ela= 495 p1=820 p2=2738 p3=3
WAIT #1: nam='db file scattered read' ela= 517 p1=820 p2=2742 p3=4
WAIT #1: nam='db file scattered read' ela= 1191 p1=820 p2=2747 p3=8
……

时间,3分钟跑完!!

在这里,sql优化的对策基本已经找到,就是走tsc。但是,为什么在2月26日之后会突然执行计划变差,变成ifs?为什么我在sqlplus上测试的走ifs也只要3分班钟,而etl跑的走ifs的却要超过3个小时?

为了找个答案,我再做了一次10053的trace,去看看oracle是怎么样选择执行路径的?

oracle@gx_db01:/oracle/app/oracle/admin/gxmisc/udump > cat gxmisc_ora_2495.trc
Dump file /oracle/app/oracle/admin/gxmisc/udump/gxmisc_ora_2495.trc
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.6.0 - Production
ORACLE_HOME = /oracle/app/oracle/product/9.2.0
System name:    HP-UX
Node name:      gx_db01
Release:        B.11.11
Version:        U
Machine:        9000/800
Instance name: gxmisc
Redo thread mounted by this instance: 1
Oracle process number: 120
Unix process pid: 2495, image: oracle@gx_db01 (TNS V1-V3)
 
*** 2009-02-26 19:54:28.245
*** SESSION ID:(218.27134) 2009-02-26 19:54:28.202
QUERY
BEGIN :old_date:=20090224; END;
*** 2009-02-26 19:54:45.186
QUERY
select to_char(actiondate, 'yyyymmddhh24miss') actiondate,
       nvl(actionplace, 5) actionplace,
       icpid,
       subsid,
       actionid,
       servid,
       decode(SUBSCAT,
              1,
              0,
              2,
              1,
              3,
              1,
              4,
              0,
              5,
              1,
              6,
              1,
              7,
              2,
              8,
              2,
              9,
              3,
              10,
              3) SUBSCAT,
       chargesubsid,
       nvl(channelid, '9999') channelid
  from subscription_history
 where (actionid = 'SCS' or actionid = 'SGS' or actionid = 'ADS')
   and (actiondate >= to_date(:old_date, 'yyyymmdd'))
   and (actiondate
< to_date(:old_date, 'yyyymmdd') + 1)
  
and (subscat between 1 and 10)
  
and (length(SUBSID) = 11)
  
and ((chargesubsid is not null) and (length(chargesubsid) = 11))
***************************************
SINGLE TABLE ACCESS PATH
Column:   ACTIONID  Col#: 4      Part#: 51     Table: SUBSCRIPTION_HISTORY   Alias: SUBSCRIPTION_HISTORY
    
NO STATISTICS  (using defaults)
    
NDV: 3         NULLS: 0         DENS: 3.9024e-01
    
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column:   ACTIONID  Col#: 4      Table: SUBSCRIPTION_HISTORY   Alias: SUBSCRIPTION_HISTORY
    
NO STATISTICS  (using defaults)
    
NDV: 3         NULLS: 0         DENS: 3.9024e-01
    
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column: ACTIONDATE  Col#: 6      Part#: 51     Table: SUBSCRIPTION_HISTORY   Alias: SUBSCRIPTION_HISTORY
    
NO STATISTICS  (using defaults)
    
NDV: 3         NULLS: 0         DENS: 3.9024e-01
    
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column: ACTIONDATE  Col#: 6      Table: SUBSCRIPTION_HISTORY   Alias: SUBSCRIPTION_HISTORY
    
NO STATISTICS  (using defaults)
    
NDV: 3         NULLS: 0         DENS: 3.9024e-01
    
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column:    SUBSCAT  Col#: 20     Part#: 51     Table: SUBSCRIPTION_HISTORY   Alias: SUBSCRIPTION_HISTORY
    
NO STATISTICS  (using defaults)
    
NDV: 3         NULLS: 0         DENS: 3.9024e-01
    
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column:    SUBSCAT  Col#: 20     Table: SUBSCRIPTION_HISTORY   Alias: SUBSCRIPTION_HISTORY
    
NO STATISTICS  (using defaults)
    
NDV: 3         NULLS: 0         DENS: 3.9024e-01
    
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column: CHARGESUBS  Col#: 21     Part#: 51     Table: SUBSCRIPTION_HISTORY   Alias: SUBSCRIPTION_HISTORY
    
NO STATISTICS  (using defaults)
    
NDV: 3         NULLS: 0         DENS: 3.9024e-01
    
NO HISTOGRAM: #BKT: 0 #VAL: 0
Column: CHARGESUBS  Col#: 21     Table: SUBSCRIPTION_HISTORY   Alias: SUBSCRIPTION_HISTORY
    
NO STATISTICS  (using defaults)
    
NDV: 3         NULLS: 0         DENS: 3.9024e-01
    
NO HISTOGRAM: #BKT: 0 #VAL: 0
 
TABLE: SUBSCRIPTION_HISTORY     ORIG CDN: 82  ROUNDED CDN: 1  CMPTD CDN: 0
 
Access path: tsc  Resc:  2  Resp:  2
 
Access path: index (no sta/stp keys)
      
Index: IDX_SUBNHISTORY_CHARGESUBSID
 
TABLE: SUBSCRIPTION_HISTORY
      
RSC_CPU: 0   RSC_IO: 26
 
IX_SEL:  1.0000e+00  TB_SEL:  1.0000e+00
 
Access path: index (no sta/stp keys)
      
Index: IDX_SUBNHISTORY_HISTORYID
 
TABLE: SUBSCRIPTION_HISTORY
      
RSC_CPU: 0   RSC_IO: 26
 
IX_SEL:  1.0000e+00  TB_SEL:  1.0000e+00
 
Access path: index (no sta/stp keys)
      
Index: IDX_SUBNHISTORY_MID
 
TABLE: SUBSCRIPTION_HISTORY
      
RSC_CPU: 0   RSC_IO: 26
 
IX_SEL:  1.0000e+00  TB_SEL:  1.0000e+00
 
Access path: index (no sta/stp keys)
      
Index: IDX_SUBNHISTORY_SERVID
 
TABLE: SUBSCRIPTION_HISTORY
      
RSC_CPU: 0   RSC_IO: 26
 
IX_SEL:  1.0000e+00  TB_SEL:  1.0000e+00
 
Access path: index (no sta/stp keys)
      
Index: IDX_SUBNHISTORY_SUBSID
 
TABLE: SUBSCRIPTION_HISTORY
      
RSC_CPU: 0   RSC_IO: 26
 
IX_SEL:  1.0000e+00  TB_SEL:  1.0000e+00
 
BEST_CST: 2.00  PATH: 2  Degree:  1
***************************************
GENERAL PLANS
***********************
Join order[1]:  SUBSCRIPTION_HISTORY[SUBSCRIPTION_HISTORY]#0
Best so far: TABLE#: 0  CST:          2  CDN:          1  BYTES:         65
Final - All Rows Plan:
 
JOIN ORDER: 1
 
CST: 2  CDN: 1  RSC: 2  RSP: 2  BYTES: 65
 
IO-RSC: 2  IO-RSP: 2  CPU-RSC: 0  CPU-RSP: 0
***
2009-02-26 20:01:07.102
QUERY
alter session set events '10053 trace name context off'
oracle@gx_db01:/oracle/app/oracle/admin/gxmisc/udump >

令我感到惊奇的是:这个明明在sqlplus中显示走ifs的语句,在10053trace出来的文件中竟然看到是tsc的cost最小,就是还是走tsc的!!

于是这应该能够解释,为什么我用etl走ifs要3小时以上,而我在sqlplus中看到也是走ifs,但是只要3分半钟。——因为其实它也是走了tsc,不过在执行计划中看到走ifs了。不过,这样的情况之前都没遇到过,难道是个bug?

· 【文章发布信息】发表于: 2009-03-02 @ 14:57:53 · ||分类: ..experience, Working case

9 条评论 »

  1. 安佰胜 于 2009-03-02 @ 15:11:29 留言

    这个恶了。。。。。。

  2. 棉花糖ONE 于 2009-03-02 @ 18:39:55 留言

    你这表和索引的统计信息应该都不准确吧,看cost是没有意义的,你最后那个真实的执行计划应该是重用了share pool中的,而你的set autot trace看到的执行计划是这时候生成的,并不是真实的执行计划

  3. 棉花糖ONE 于 2009-03-02 @ 18:42:16 留言

    etl的程序还用绑定变量,很强大很邪恶

  4. 木匠Charlie 于 2009-03-03 @ 02:37:59 留言

    To 作者小荷,

    提一些意见,不要不爱听. 良药苦口. 嘻嘻.

    其中 棉花糖ONE 已经提到了一些,

    * 没有表/分区统计信息, 分析10053 trace 没有指导意义.
    * 数据仓库/批处理环境, 大的查询使用常量好一些, 可以帮助优化器得到有效SQL执行路径; case by case, 仍需分析应用逻辑后使用.
    * 文章组织有些乱, 建议先写结果,再写测试案例
    * 无法重现 测试结果, 给出完整测试案例
    * test SQL和 production SQL运行环境不一样, 使用下面的查询显示实际运行的production SQL, 可能不支持 9.2, 赶紧升级吧.

    SELECT t.*
    FROM v$sql s, table(DBMS_XPLAN.DISPLAY_CURSOR(s.sql_id, s.child_number)) t
    WHERE sql_text LIKE ‘%my_sql_text%’ and sql_text NOT LIKE ‘%v$sql%’;

    To 棉花糖ONE,

    你的内功深厚呀, 到处留下真知灼见. ^_^

    How to contact you? e.g. email, MSN.

    Thanks,
    -MJ

  5. 小荷 于 2009-03-03 @ 14:01:36 留言

    re 木匠:
    谢谢你的意见,以后在写的时候,我会注意理清文章的组织。这次写文章的时候确实思路有点乱了,写这个文章的本意是抛砖引玉(还当了一回标题党,嘿嘿),希望大家能讨论这个结果是否是bug,我在处理的过程中哪里还分析的不对。
    统计信息是比较早,05年7月之后就没有被分析过了。
    要使用常量,作为一名维护人员来说比较困难的,涉及到修改代码甚至是业务逻辑。
    上述的所有sql和执行计划是生产环境上的。
    DISPLAY_CURSOR在9206上是没有的。数据库版本的升级,目前已经纳入考虑范围内了,一方面测试部已经开始测试,另一方面要等集团公司下文通知。

  6. 木匠Charlie 于 2009-03-03 @ 14:16:26 留言

    有容乃大,分享知识,共同进步.

    TOP(Christian Antognini的Troubleshooting Oracle Performance)里面讲了如何显示Oracle 9.2 v$sql_plan里面的执行计划, 你自己看看, 201~203页.
    Display execution plan in the library cache.
    (我眼下主要在Oracle 11.1.0.7做应用开发, 所以对9.2已经提不起兴趣了).

  7. 小荷 于 2009-03-03 @ 22:14:30 留言

    re 木匠:呵呵,好书!谢谢推荐哈~~~

  8. hoorace 于 2009-03-05 @ 09:05:48 留言

    不能不佩服一下写sql的水平,现网的sql效率其实没有传说中的那么低滴。

  9. onlyring 于 2009-09-20 @ 01:50:26 留言

    Is this helpful to you?
    http://www.oracle.com/technology/oramag/oracle/08-jan/o18asktom.html

RSS 为此帖反馈评论 · 反向跟踪 网站

留条评论