- 数据库高效优化:架构、规范与SQL技巧
- 马立和 高振娇 韩锋
- 2126字
- 2020-08-27 05:47:11
7.1 表
“表”是大家最为熟知的一个对象。它也是保存数据的实体。Oracle数据库支持多种表的类型。大家最为常见的就是堆表,它也是适用范围最广的一种表。此外,还支持索引组织表、簇表等。除了按照表的结构分类外,还可以根据表的组织形式、用途等进行分类,比如常见的分区表、临时表等。
下面我们将从最常见的堆表开始介绍。
1.堆表
堆表是Oracle默认的表类型,也是最常用的表类型。除非有特殊原因要使用其他表类型,否则都使用堆表类型。对于堆表来说,最常见的影响性能的因素就是表的规模。这一点很容易理解,规模越大,扫描的块数越多,当然成本也就越高。前面两章提到,如果对表进行全表扫描,会扫描高水位线以下的所有块。这也解释了为什么删除数据后,扫描表仍然很慢。下面通过一个示例说明。
SQL> create table t1 as select * from dba_objects; //表已创建 SQL> insert into t1 select * from t1; //已创建 18865 行 SQL> insert into t1 select * from t1; //已创建 37730 行 SQL> insert into t1 select * from t1; //已创建 75460 行 SQL> insert into t1 select * from t1; //已创建 150920 行 SQL> commit; //提交完成 //这里我们构造一张大表,并插入了几十万条记录 SQL> set serveroutput on SQL> exec show_space('t1','auto'); Total Blocks............................4096 Total Bytes.............................33554432 Unused Blocks...........................0 Unused Bytes............................0 Last Used Ext FileId....................4 Last Used Ext BlockId...................6144 Last Used Block.........................128 PL/SQL 过程已成功完成。 /* 这里我们调用了一个存储过程(附录中会详细说明)。通过这个存储过程,我们可以观察到表的高水位线信息。对于上面这个示例,高水位线的位置在Total Blocks – Unused Blocks = 4096 */ SQL> set autotracetraceonly SQL> select count(*) from t1; ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1115 (2)| 00:00:14 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| T1 | 4559K| 1115 (2)| 00:00:14 | ------------------------------------------------------------------- 统计信息 ---------------------------------------------------------- 5903 consistent gets 4059 physical reads //通过上面执行的SQL语句可见,这个查询语句大约要执行5000多次逻辑读操作 SQL> delete from t1; //已删除301840行 SQL> commit; //提交完成 SQL> exec show_space('t1','auto'); Total Blocks............................4096 Total Bytes.............................33554432 Unused Blocks...........................0 Unused Bytes............................0 Last Used Ext FileId....................4 Last Used Ext BlockId...................6144 Last Used Block.........................128 //PL/SQL 过程已成功完成 //删除操作后,我们通过观察发现高水位线没有变化 SQL> select count(*) from t1; ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1096 (1)| 00:00:14 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| T1 | 1 | 1096 (1)| 00:00:14 | ------------------------------------------------------------------- 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 5166 consistent gets 4026 physical reads //删除之后执行查询,仍然需要5000多次的逻辑读操作 SQL> truncate table t1; //表被截断 SQL> set autotrace off SQL> exec show_space('t1','auto'); Total Blocks............................8 Total Bytes.............................65536 Unused Blocks...........................5 Unused Bytes............................40960 Last Used Ext FileId....................4 Last Used Ext BlockId...................2096 Last Used Block.........................3 //PL/SQL 过程已成功完成 //截断表后,高水位线明显降低了 SQL> select count(*) from t1; ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| T1 | 1 | 2 (0)| 00:00:01 | ------------------------------------------------------------------- 统计信息 ---------------------------------------------------------- 1 recursive calls 0 db block gets 4 consistent gets 1 physical reads /* 高水位线降低后,再次执行查询语句,可见其逻辑读非常小。这也说明了降低高水位线对全表扫描的影响 */
2.索引组织表
索引组织表,顾名思义,就是存储在一个索引结构中的表,也就是以B+树结构存储。换句话说,在索引组织表中,索引就是数据,数据就是索引,两者合二为一。索引组织表的好处并不在于解决磁盘空间的占用,而是可以减少I/O,进而减少访问缓冲区缓存。
下面我们通过一个示例说明普通堆表与索引组织表的访问对比。
SQL> create table t_normal( aint,bint,c varchar2(100)); //表已创建 SQL> create index idx_normal_a on t_normal(a); //索引已创建 SQL> insert into t_normal select rownum,object_id,object_name from dba_objects; //已创建 18867 行 SQL> commit; //提交完成 SQL> create table t_iot(a int,bint,c varchar2(100),primary key(a)) organization index; //表已创建 SQL> insert into t_iot select rownum,object_id,object_name from dba_objects; //已创建 18869 行 SQL> commit; //提交完成 //上面分别创建了普通表和索引组织表,并插入了相同数据 SQL> set autotracetraceonly SQL> select * from t_normal where a=1000; -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 78 | 1 (0)| 00:00 | 1 | TABLE ACCESS BY INDEX ROWID | T_NORMAL | 1 | 78 | 1 (0)| 00:00 |* 2 | INDEX RANGE SCAN | IDX_NORMAL_A | 74 | | 1 (0)| 00:00 -------------------------------------------------------------------------------- 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads //从上面输出可见,在堆表中访问这条记录需要4个逻辑读操作。从执行计划可见,需要一个回表查询 SQL> select * from t_iot where a=1000; -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 78 | 1 (0)| 00:00:01 | |* 1 | INDEX UNIQUE SCAN | SYS_IOT_TOP_21676 | 1 | 78 | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------- 统计信息 ---------------------------------------------------------- 1 recursive calls 0 db block gets 2 consistent gets 0 physical reads /* 从上面输出可见,在索引组织表中访问这条记录需要2个逻辑读,相较堆表访问大大减少了。从执行计划可见,可以直接访问,不需要回表;或者说,就是通过索引直接访问数据 */
3.分区表
分区表是Oracle数据库中应对大规模数据量的一种很好的解决方案。其基本原理很简单,就是将大的对象分解为若干个小对象。当访问表时,根据分区策略,可以精确地定位到小对象(单个分区),进而提高访问效率。需要说明的是,分区表中的每个分区也是一个独立的堆表,只是逻辑上看起来是一张完整的表。
下面通过一个示例说明普通表与分区表的访问区别。
[hf@testdb] SQL> create table t_part 2 ( 3 owner varchar2(30), 4 object_namevarchar2(128), 5 object_id number, 6 created date 7 ) 8 partition by range (created) 9 ( 10 partition part_201305 values less than(to_date('2013-06-01','yyyy-mm-dd')), 11 partition part_201306 values less than(to_date('2013-07-01','yyyy-mm-dd')), 12 partition part_201307 values less than(to_date('2013-08-01','yyyy-mm-dd')), 13 partition part_201308 values less than(to_date('2013-09-01','yyyy-mm-dd')), 14 partition part_201309 values less than(to_date('2013-10-01','yyyy-mm-dd')), 15 partition part_201310 values less than(to_date('2013-11-01','yyyy-mm-dd')), 16 partition part_201311 values less than(to_date('2013-12-01','yyyy-mm-dd')), 17 partition part_201312 values less than(to_date('2014-01-01','yyyy-mm-dd')), 18 partition part_201401 values less than(to_date('2014-02-01','yyyy-mm-dd')), 19 partition part_201402 values less than(to_date('2014-03-01','yyyy-mm-dd')), 20 partition part_201403 values less than(to_date('2014-04-01','yyyy-mm-dd')), 21 partition part_201404 values less than(to_date('2014-05-01','yyyy-mm-dd')), 22 partition part_201405 values less than(to_date('2014-06-01','yyyy-mm-dd')), 23 partitionpart_max values less than(maxvalue) 24 ); Table created. [hf@testdb] SQL> create table t_normal 2 ( 3 owner varchar2(30), 4 object_namevarchar2(128), 5 object_id number, 6 created date 7 ); Table created. [hf@testdb] SQL> insert into t_normal select owner,object_name,object_id,created from sys.dba_objects; 86299 rows created. [hf@testdb] SQL> commit; Commit complete. [hf@testdb] SQL> insert into t_part select owner,object_name,object_id,created from sys.dba_objects; 86299 rows created. [hf@testdb] SQL> commit; Commit complete. //上面创建了两张表,并插入了数万条记录 [hf@testdb] SQL> exec dbms_stats.gather_table_stats('hf', 't_normal'); PL/SQL procedure successfully completed. [hf@testdb] SQL> exec dbms_stats.gather_table_stats('hf', 't_part'); PL/SQL procedure successfully completed. [hf@testdb] SQL> select * from t_part where created between to_date('2013-11-01', 'yyyy-mm-dd') and to_date('2013-11-30','yyyy-mm-dd'); -------------------------------------------------------------------------------- | Id | Operation | Name |Rows|Bytes| Cost(%CPU)| Time |Pstart|Pstop -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 105 | 2 (0)| 00:00:01| | | 1 | PARTITION RANGE SINGLE | | 1 | 105 | 2 (0)| 00:00:01| 7| 7 |* 2 | TABLE ACCESS FULL |T_PART| 1 | 105 | 2 (0)| 00:00:01| 7| 7 -------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 0 consistent gets 0 physical reads //从分区表的访问可见,没有消耗逻辑读。原因是该分区内没有数据,所以无须读取 [hf@testdb] SQL> select * from t_normal where created between to_date('2013-11-01', 'yyyy-mm-dd') and to_date('2013-11-30','yyyy-mm-dd'); ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 75 | 3300 | 171 (1)| 00:00:03 | |* 1 | TABLE ACCESS FULL | T_NORMAL | 75 | 3300 | 171 (1)| 00:00:03 | ------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 12 recursive calls 0 db block gets 643 consistent gets 0 physical reads /* 如果直接对表进行访问,则需要600多次逻辑读。由此可见,通过分区访问可以更精确地定位数据,减少访问规模 */