- 数据库高效优化:架构、规范与SQL技巧
- 马立和 高振娇 韩锋
- 1253字
- 2020-08-27 05:47:11
案例5 COUNT(*)到底能有多快
1.案例说明
一个大表的COUNT究竟能有多快?除类似物化视图的做法,我们所能做到的极限能有多快?这不是一个真实的案例,而是根据笔者在网上发的一篇帖子整理而来。通过对一条SQL,采用多种方式持续优化过程,表明SQL优化的手段随着优化者掌握的技能增多,其可能存在的手段也在不断增多。
(1)数据准备
数据准备的代码如下:
create table t2 select * from dba_objects; insert into t2 select * from t2; ... select count(*) from t2; =>102400000 --数据量有1亿多条 select bytes/1024/1024 from user_segments where segment_name='T2'; => 10972 --数据对象超过10GB
(2)全表扫描
全表扫描的代码如下(共用124秒):
select count(*) from t2; Elapsed: 00:02:04.09 ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 381K (1) | 01:16:19 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| T2 | 102M| 381K (1)| 01:16:19 | ------------------------------------------------------------------- Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 1400379 consistent gets 1068862 physical reads
由上可知,全表扫描耗时较长。
(3)主键索引
主键索引的代码如下:
alter table t2 add constraint pk_t2 primary key(id); execdbms_stats.gather_index_stats('hf', 'pk_t2', estimate_percent =>10); select count(*) from t2; Elapsed: 00:00:33.18 ----------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU) | Time | ----------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 64271 (2) | 00:12:52 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| PK_T2 | 102M| 64271 (2) | 00:12:52 | ----------------------------------------------------------------------- Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 228654 consistent gets 205137 physical reads
通过引入索引,执行计划变成索引快速全扫描,由于扫描块数较少,因此耗时也大大减少,共用33秒,快多了。
(4)常数索引
常数索引的代码如下:
create index idx_0 on t2(0); execdbms_stats.gather_index_stats('hf', 'idx_0', estimate_percent =>10); select count(*) from t2; Elapsed: 00:00:28.92 ----------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU) | Time | ----------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 49601 (2) | 00:09:56 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| IDX_0 | 102M| 49601 (2) | 00:09:56 | ----------------------------------------------------------------------- Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 185899 consistent gets 167726 physical reads
常数索引在存储密度上要高于普通字段索引,因此扫描块数更少,耗时也更少,共用29秒。
(5)常数压缩索引
常数压缩索引的代码如下:
create index idx_0 on t2(0) compress; execdbms_stats.gather_index_stats('hf', 'idx_0', estimate_percent =>10); select count(*) from t2; Elapsed: 00:00:27.85 ----------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ----------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 43812 (3)| 00:08:46 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| IDX_0 | 102M| 43812 (3)| 00:08:46 | ----------------------------------------------------------------------- Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 157636 consistent gets //压缩后,减少了 141651 physical reads
索引压缩进一步减少了扫描规模,耗时缩减到27秒。
(6)位图索引
位图索引的代码如下:
create bitmap index idx_status2 on t2(status); execdbms_stats.gather_index_stats('hf', 'idx_status2', estimate_percent=> 10); select count(*) from t2; Elapsed: 00:00:00.9 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost(%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 2262 (1) | 00:00:28 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | BITMAP CONVERSION COUNT | | 102M | 2262 (1)| 00:00:28 | | 3 | BITMAP INDEX FAST FULL SCAN | IDX_STATUS2 | | | | -------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 2502 consistent gets //大大减少 351 physical reads
位图索引不同于B树索引,其存储密度更高。这里是采用status字段,如果使用常数索引,其规模将更小。这种手段用时0.9秒,这是质的飞跃。
(7)位图索引+并行
alter index idx_status2 parallel 8; select count(*) from t2; Elapsed: 00:00:00.03 -------------------------------------------------------------------------------- | Id | Operation | Name |Rows |Time | TQ |IN-OUT|PQ Distrib| -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 |00:00:27| | | | | 1 | SORT AGGREGATE | | 1 | | | | | | 2 | PX COORDINATOR | | | | | | | | 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | |Q1,00| P->S | QC (RAND)| | 4 | SORT AGGREGATE | | 1 | |Q1,00| PCWP | | | 5 | PX BLOCK ITERATOR | | 102M|00:00:27|Q1,00| PCWC | | | 6 | BITMAP CONVERSION COUNT | | 102M|00:00:27|Q1,00| PCWP | | | 7 | BITMAP INDEX FAST FULL SCAN |IDX_STATUS2 | | |Q1,00| PCWP | | -------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 265 recursive calls 3 db block gets 3059 consistent gets 0 physical reads
并行技术可以加快执行速度。一致性读有所增加,但并行还是能加快整体运行速度,这种手段耗时0.03秒,竟然又快了不少。
(8)分析结论
·位图索引可以按很高密度存储数据,因此往往比B树索引小很多,但前提是基数比较小。
·位图索引是保存空值的,因此可以在COUNT中利用。
·众所周知,位图索引不太适合OLTP类型数据库。该实例仅为了测试展示。
2.给我们的启示
优化没有止境,对数据库了解越多,你能想到的方法就越多。