操作系统内核参数限制导致ora-4030
最近同事遇到一个问题,某个报表程序报错ora-4030的错误:
通过trace,发现是以下语句的问题:
- declare
- type subs_mobile_no_arr is table of trt_subs_GD_1li_wap_S.subs_mobile_no%type index by pls_integer;
- type icp_id_arr is table of trt_subs_GD_1li_wap_S.icp_id%type index by pls_integer;
- type icp_code_arr is table of trt_subs_GD_1li_wap_S.icp_code%type index by pls_integer;
- type channel_id_arr is table of trt_subs_GD_1li_wap_S.channel_id%type index by pls_integer;
- type serv_id_arr is table of trt_subs_GD_1li_wap_S.serv_id%type index by pls_integer;
- type brand_id_arr is table of trt_subs_GD_1li_wap_S.brand_id%type index by pls_integer;
- type fee_mobile_no_arr is table of trt_subs_GD_1li_wap_S.fee_mobile_no%type index by pls_integer;
- type action_id_arr is table of trt_subs_GD_1li_wap_S.action_id%type index by pls_integer;
- type crt_his_date_arr is table of trt_subs_GD_1li_wap_S.crt_his_date%type index by pls_integer;
- type city_id_arr is table of trt_subs_GD_1li_wap_S.city_id%type index by pls_integer;
- type attr_arr is table of trt_subs_GD_1li_wap_S.attr%type index by pls_integer;
- type srh_channel_id_arr is table of trt_subs_GD_1li_wap_S.srh_channel_id%type index by pls_integer;
- arr_subs_mobile_no subs_mobile_no_arr;
- arr_icp_id icp_id_arr;
- arr_icp_code icp_code_arr;
- arr_channel_id channel_id_arr;
- arr_serv_id serv_id_arr;
- arr_brand_id brand_id_arr;
- arr_fee_mobile_no fee_mobile_no_arr;
- arr_action_id action_id_arr;
- arr_crt_his_date crt_his_date_arr;
- arr_city_id city_id_arr;
- arr_attr attr_arr;
- arr_srh_channel_id srh_channel_id_arr;
- begin
- select subs_mobile_no,
- icp_id,
- icp_code,
- channel_id,
- serv_id,
- brand_id,
- fee_mobile_no,
- decode(action_id, 'SCS', 'C', 'ADS', 'D') action_id,
- crt_his_date,
- city_id,
- attr,
- srh_channel_id bulk collect
- into arr_subs_mobile_no,
- arr_icp_id,
- arr_icp_code,
- arr_channel_id,
- arr_serv_id,
- arr_brand_id,
- arr_fee_mobile_no,
- arr_action_id,
- arr_crt_his_date,
- arr_city_id,
- arr_attr,
- arr_srh_channel_id
- from trt_subs_GD_1li_wap_S
- where action_id = 'SCS'
- or action_id = 'ADS';
- if arr_subs_mobile_no.last > 0 then
- arr_srh_channel_id srh_channel_id_arr;
- begin
- select subs_mobile_no,
- icp_id,
- icp_code,
- channel_id,
- serv_id,
- brand_id,
- fee_mobile_no,
- decode(action_id, 'SCS', 'C', 'ADS', 'D') action_id,
- crt_his_date,
- city_id,
- attr,
- srh_channel_id bulk collect
- into arr_subs_mobile_no,
- arr_icp_id,
- arr_icp_code,
- arr_channel_id,
- arr_serv_id,
- arr_brand_id,
- arr_fee_mobile_no,
- arr_action_id,
- arr_crt_his_date,
- arr_city_id,
- arr_attr,
- arr_srh_channel_id
- from trt_subs_GD_1li_wap_S
- where action_id = 'SCS'
- or action_id = 'ADS';
- if arr_subs_mobile_no.last > 0 then
- forall i in arr_subs_mobile_no.first .. arr_subs_mobile_no.last
- update f_wap_subs_GD_p_s c
- set c.city_id = arr_city_id(i),
- c.icp_id = arr_icp_id(i),
- c.icp_code = arr_icp_code(i),
- c.subs_cancel_channel_id = arr_channel_id(i),
- c.brand_id = arr_brand_id(i),
- c.subs_status_id = arr_action_id(i),
- c.fee_mobile_no = arr_fee_mobile_no(i),
- c.usubs_date = arr_crt_his_date(i),
- c.usubs_day = to_char(arr_crt_his_date(i),
- 'YYYYMMDD'),
- c.idx_day = to_char(arr_crt_his_date(i),
- 'YYYYMMDD'),
- c.is_subs_valid = 0,
- c.flow_time = sysdate,
- c.srh_channel_id = arr_srh_channel_id(i)
- where subs_mobile_no = arr_subs_mobile_no(i)
- and serv_id = arr_serv_id(i)
- and is_subs_valid = 1;
- end if;
- end;
这个sql在执行的时候,pga的内存被迅速的消耗,通过v$sesstat看到这个session每次使用pga快到2G时,就报错了。而且每次都是这样。所以一开始,我们怀疑是不是pga不够,增加pga大小为3G继续观察程序的运行情况,结果发现还是一样的报错。
在metalink上发现类似的有个bug:Bug 4994859: ORA-04030 WHILE EXECUTING A PROCEDURE CONTAINING BULK COLLECT。于是打算分2步走:
1、向开发建议修改程序,不使用BULK COLLECT(上面sql的85行处)
2、由于bug是在Sun Solaris SPARC (64-bit)上的9207版本,我们的是HP UX的9208版本,可能不是这个bug。所以向oracle开SR进行确认。
oracle回复:
- Basically, an ORA-4030 indicates a limit has been reached with respect to process private memory allocation. An ORA-4030 error is an error in the PGA; Memory limitation related to an Oracle rdbms server process is reached.
- Typical causes:
- * OS Memory limit reached such as physical memory and/or swap/virtual paging
- * OS limits reached such as kernel or user shell limits that limit overall, user level or process level memory usage
- * OS limit on PGA memory size due to SGA attach address <<Note 262540.1>> Relocate SGABEG on 64-bit Oracle
- * Oracle internal limit example Bug 3130972
- * Application design causing limits to be reached
- * Bug – space leaks, heap leaks
于是用RDA收集主机信息,发现有一个参数设置:
--------
Ulimit
=====
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) 2097152 <----
stack(kbytes) 131072
memory(kbytes) unlimited
coredump(blocks) 4194303
nofiles(descriptors) 4096
从这里我们看到ulimit -d参数被限制了,大小为2G,我们通过修改2个内核参数从而达到修改这个参数的目的:
1、将maxdsiz从2147483648 (2G) 调整到4187593114(4G)
2、将maxdsiz_64bit从2147483648 (2G)调整6442450944 (6G).
ulimit -d自动的变成:
再次运行原来的应用程序,发现不再报错。
· 【文章发布信息】发表于: 2010-02-28 @ 12:13:34 · ||分类: ..experience, Working case



CopyRight ©
匿名 于 2010-03-05 @ 11:22:44 留言 :
最好将程序中的 “批量绑定” 修改为游标方式。
曾经遇到类似问题, 将PGA设置非常大, ulimits改为无限大,都不能解决问题, 最后让项目组把代码改成游标,问题终于解决。