Skip to content

Oracle 排障配置与调整 - 6. page

遭遇ORA-29913,KUP-04084

数据库告警日志报错

ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-04084: The ORACLE_DATAPUMP access driver does not support the ROWID column.

是数据库在收集统计信息时候出现.环境是rac.原因是11g的oracle在访问外部表的时候不支持在dmp格式下以rowid的方式访问.这里数据量不大,把外部表改为堆表后解决.其他方面问题可以参考以下文献.

Get “KUP-04084” error message when viewing an external table in the schema browser’s data tab
说明
I am getting the following error message when trying to view an external table in the schema browser’s data tab.

ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-04084: The ORACLE_DATAPUMP access driver does not support the ROWID column.

原因
The query the schema browser is running in the background to return the data is asking to also return the ROWID, which is not supported by Oracle for external tables.

解决办法
WORKAROUND:
1) Highlight your external table on the left hand side of the schema browser.
2) Select the data tab on the right hand side. (you will get the error)
3) Click OK on the error.
4) Click the ‘View/Edit Query’ button the data tab’s tool bar. The button is the second from the left on the toolbar and looks like a piece of paper with a magnifying glass.
5) Remove ‘ROWID,’ from the statement in the window and click OK.

This statement should stay modified until you close the schema browser, so you will need to do this each time you open a new schema browser and want to view your external table. I will let you know what development says once I hear back from them.

其他信息
This is only an issue in Oracle 11gR2 databases.

11g数据库自带 自动化任务的管理

接连几个客户的数据库自动化处理任务中收集统计信息部分出了问题,客户询问我这方面的事情,我整理了一翻如下的文档,可以了解11g,10g的自动任务构成,以及如何管理数据库自带的管理任务.

FAQ: Automatic Statistics Collection (文档 ID 1233203.1)
New 11g Default Jobs (文档 ID 755838.1)
How to Benefit from Automatic Maintenance Tasks Following the Removal of the GATHER_STATS_JOB in 11g? (文档 ID 743507.1)
DBMS_AUTO_TASK_ADMIN

在11g中,optimizer stats 的收集是通过自动维护任务实现

SQL> select client_name,task_name from dba_autotask_task;

CLIENT_NAME                              TASK_NAME
---------------------------------------- ------------------------------
auto space advisor                       auto_space_advisor_prog
auto optimizer stats collection          gather_stats_prog

SQL> SELECT CLIENT_NAME,
  2         STATUS
  3  FROM   DBA_AUTOTASK_CLIENT
  4  WHERE  CLIENT_NAME = 'auto optimizer stats collection';

CLIENT_NAME                              STATUS
---------------------------------------- --------
auto optimizer stats collection          ENABLED

如不想让这些自动维护任务去自动运行,你可以通过通过DBMS_AUTO_TASK_ADMIN.disable或enable配置.

可以将所有的自动维护任务全部disable或enable,也可以只针对某个task进行设置,比如:

BEGIN
DBMS_AUTO_TASK_ADMIN.disable( client_name => ‘auto space advisor’,
operation => NULL,
window_name => NULL);
END;
/

BEGIN
DBMS_AUTO_TASK_ADMIN.enable( client_name => ‘auto space advisor’,
operation => NULL,
window_name => NULL);
END;
/

skip_unusable_indexes参数使用建议

​SKIP_UNUSABLE_INDEXES的使用与索引失效是相关的,该参数10g开始引入,11g默认为TRUE.
当为TRUE时候,如果数据库中存在usable状态的索引,则会自动忽略该索引生成新的执行计划(不走该索引,也不提示该索引的异常);当为False时候,则会报错.我所运维的数据库在一些关键系统中,会将此参数设成False,让系统及时发现索引的异常以便及时去介入修复.
环境各有所异,设置值也可依据实际情况设置.如果sql使用了hint或者涉及到唯一索引的对应DML,该参数会失效.

该参数的一些使用场景可以参考如下的测试:

创建测试表和索引

SQL> conn test/test
已连接。
SQL> drop table a;
表已删除。
SQL> create table a(id number);
表已创建。
SQL> create unique index idx_a_id on a(id);
索引已创建。
SQL> declare
  2  begin
  3  for a in 1..1000 loop
  4  insert into a(id) values(a);
  5  end loop;
  6  end;
  7  /
PL/SQL 过程已成功完成。
SQL> commit;
提交完成。
SQL> show parameter SKIP_UNUSABLE_INDEXES;
NAME                                 TYPE        VALUE
------------------------------------ ----------- -------------------
skip_unusable_indexes                boolean     TRUE
SQL> select * from a where id=1;

执行计划
----------------------------------------------------------
Plan hash value: 277080427
------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |     1 |    13 |     1   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| IDX_A_ID |     1 |    13 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ID"=1)

统计信息
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
        124  redo size
        402  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

修改skip_unusable_indexes为false

SQL> alter system set skip_unusable_indexes=false scope=memory;
系统已更改。
将索引修改为不可用
SQL> alter index idx_a_id unusable;
索引已更改。
出现错误提示索引不可用
SQL> select * from a where id=1;
select * from a where id=1
*
第 1 行出现错误:
ORA-01502: 索引 'TEST.IDX_A_ID' 或这类索引的分区处于不可用状态

将skip_unusable_indexes修改为true

SQL> alter system set skip_unusable_indexes=true scope=memory;
系统已更改。

对于查询操作此时该sql能够正常运行,但是此时进行的是全表扫描

SQL> select * from a where id=1;

执行计划
----------------------------------------------------------
Plan hash value: 2248738933
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     4 |    52 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| A    |     4 |    52 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("ID"=1)

统计信息
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
        402  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

使用hint强制使用索引,此时会提示索引无效

SQL> select /*+index(a)*/ * from a where id=1;
select /*+index(a)*/ * from a where id=1
*
第 1 行出现错误:
ORA-01502: 索引 'TEST.IDX_A_ID' 或这类索引的分区处于不可用状态
--插入操作会出错
SQL> insert into a values(1002);
insert into a values(1002)
*
第 1 行出现错误:
ORA-01502: 索引 'TEST.IDX_A_ID' 或这类索引的分区处于不可用状态
SQL> delete from a where id=1;
delete from a where id=1
*
第 1 行出现错误:
ORA-01502: 索引 'TEST.IDX_A_ID' 或这类索引的分区处于不可用状态

SQL>

解决方法,重建索引

SQL> alter index test.idx_a_id rebuild;
索引已更改。
SQL> select /*+index(a)*/ * from a where id=1;

执行计划
----------------------------------------------------------
Plan hash value: 277080427
------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |     1 |    13 |     1   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| IDX_A_ID |     1 |    13 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ID"=1)

统计信息
----------------------------------------------------------
         15  recursive calls
          0  db block gets
          5  consistent gets
          1  physical reads
          0  redo size
        402  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> drop index test.idx_a_id;
索引已删除。
SQL> create index test.idx_a_id on a(id);
索引已创建。
SQL> alter index test.idx_a_id unusable;
索引已更改。
SQL> insert into a values(1002);
已创建 1 行。
SQL> commit;

测试证明SKIP_UNUSABLE_INDEXES对于使用hint强制使用索引的语句和唯一索引的插入、删除语句却不能生效。

该测试摘自互联网,同时做了一些修改.

ORA-63999 data file suffered media failure 导致实例Crash

KCF: read, write or open error, block=0xb79ab online=1
        file=85 '/dev/vgpmesdb12/rLV_FEM_PRD_I02'
        error=27063 txt: 'HPUX-ia64 Error: 11: Resource temporarily unavailable
Additional information: -1
Additional information: 8192'
Encountered write error

*** 2015-06-17 20:06:22.714
DDE rules only execution for: ORA 1110
----- START Event Driven Actions Dump ----
---- END Event Driven Actions Dump ----
----- START DDE Actions Dump -----
Executing SYNC actions
----- START DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----
Successfully dispatched
----- END DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (SUCCESS, 0 csec) -----
Executing ASYNC actions
----- END DDE Actions Dump (total 0 csec) -----
error 63999 detected in background process
ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 85 (block # 752043)
ORA-01110: data file 85: '/dev/vgpmesdb12/rLV_FEM_PRD_I02'
ORA-27063: number of bytes read/written is incorrect
HPUX-ia64 Error: 11: Resource temporarily unavailable
Additional information: -1
Additional information: 8192
kjzduptcctx: Notifying DIAG for crash event
----- Abridged Call Stack Trace -----
ksedsts()+544<-kjzdssdmp()+400<-kjzduptcctx()+432<-kjzdicrshnfy()+128<-$cold_ksuitm()+5872<-$cold_ksbrdp()+2704<-opirip()+1296<-opidrv()+1152<-sou2o()+256<-opimai_real()+352<-ssthrdmain()+576<-main()+336<-main_opd_entry()+80
----- End of Abridged Call Stack Trace -----

*** 2015-06-17 20:06:23.172
DBW1 (ospid: 5833): terminating the instance due to error 63999
ksuitm: waiting up to [5] seconds before killing DIAG(5807)

数据库文件所在设备的部分io错误,导致实例宕机。由于报错ORA-27063: number of bytes read/written is incorrect,跟踪下来是初步怀疑坏块导致,通过效验后未发现坏块,HPUX-ia64 Error: 11: Resource temporarily unavailable的错误引入考虑,初步怀疑是hp的系统内部在做一些操作时候导致/dev/vgpmesdb12/rLV_FEM_PRD_I02设备无法被访问到。
随后在官方找到相似bug 16884689 : DATABASE CRASH DUE TO ORA-27063 HPUX-IA64 ERROR: 11.从整体的诊断看,问题的原因还是因为出现了io问题导致的,而且集群内部是在发生io问题后才发现数据库资源的问题,所以需判断是否hp系统或者io系统各模块的问题.与发现的bug不同场景.

文件io错误时候实例重启受_datafile_write_errors_crash_instance控制影响。

参考:
1.


Description

This fix introduces a notable change in behaviour in that
from 11.2.0.2 onwards an I/O write error to a datafile will
now crash the instance.

Before this fix I/O errors to datafiles not in the system tablespace
offline the respective datafiles when the database is in archivelog mode.
This behavior is not always desirable. Some customers would prefer
that the instance crash due to a datafile write error.

This fix introduces a new hidden parameter to control if the instance
should crash on a write error or not:
 _datafile_write_errors_crash_instance



With this fix:
 If _datafile_write_errors_crash_instance = TRUE (default) then
  any write to a datafile which fails due to an IO error causes
  an instance crash.

 If _datafile_write_errors_crash_instance = FALSE then the behaviour
  reverts to the previous behaviour (before this fix) such that
  a write error to a datafile offlines the file (provided the DB is
  in archivelog mode and the file is not in SYSTEM tablespace in
  which case the instance is aborted)

2.

This is due to a problem with the I/O subsystem.
Issues of this nature are common when there is a problem in the I/O subsystem.
This can include, but is not limited to:

2.1 A bad sector on disk
2.2 An I/O card that is starting to fail
2.3 A bad array cable
2.4 An interruption in network connectivity, in the case of NFS mounts
2.5 Could also be caused by a OS level bug.
etc.
Review the OS Messages file as this will almost certainly reflect errors (for example   Error for Command: write(10) )

11g的deferred_segment_creation参数影响空表段创建导致报错一例 ORA-01455

用10g的EXP连接到11g进行数据导出时,遇到如下错误:

EXP-00056: 遇到 ORACLE 错误 1455
ORA-01455: 转换列溢出整数数据类型
EXP-00000: 导出终止失败

原因是Oracle11g增加了参数deferred_segment_creation,参数默认值是TRUE,这样,新建的表无记录时,会导致段创建延迟的,连DDL定义也无法获取,所以报错.

解决方法:执行以下步骤

 select 'alter table '||table_name||' allocate extent;' from dba_tables WHERE SEGMENT_CREATED='NO';

获取结果后,执行结果的alter table table_name allocate extent;语句

2. 分别执行以下的语句可以查看相关的空行段是否还存在。

select * from user_indexes WHERE SEGMENT_CREATED='NO';
select * from user_lobs where segment_created='NO';

一般这样处理后,oracle会生成。临时设置deferred_segment_creation为FALSE只会对后面创建的表有效,已经创建的表需要执行alter table TAB_XX allocate extent;才能处理exp。