1. Test Environment
OS: CentOS 7.8 (virtual machine)
PostgreSQL version: 10.11.3
CPU: Intel(R) Core(TM) i3-8100 CPU @ 3.60GHz, 1 core, 1 thread
Memory: 4 GB
Disk: 30 GB
shared_buffers = 1GB in postgresql.conf
2. Why Partitioned Tables Exist
As a database is used over time, its data volume keeps growing and queries become slower and slower.
There are many ways to speed up a database: adding targeted indexes, moving the log directory to a separate disk partition, tuning engine parameters, and so on. Each of these improves query performance to some degree.
In many application databases, much of the data is historical, and its importance fades over time. If this less important data could somehow be hidden, queries would get much faster. DELETE would achieve this, but the data would then be gone forever.
What is needed is an efficient way to hide historical data from current queries without losing it. Table partitioning, the subject of this article, does exactly that.
3. Structure of a Partitioned Table
Partitioning splits one large physical table into several smaller physical tables that can be treated logically as a single table.
- Parent table / Master Table: the template from which child tables are created. It is a normal table, but under normal circumstances it stores no data itself.
- Child table / Partition Table: tables that inherit from and belong to a parent table. All the data lives in the child tables. The relationship is one-to-many: one parent table has many partitions, while each partition belongs to exactly one parent.
4. Partitioning Support Across PostgreSQL Versions
Different documents use several names for the same feature: native partitioning = built-in partitioning = partitioned tables.
Before PostgreSQL 10.x, partitioning was "manual": it was built from inheritance plus triggers, a tedious process that required defining the parent table, the child tables and their constraints, creating indexes on each child, and writing triggers to handle routing, deletes, and updates.
PostgreSQL 10.x introduced built-in partitioning ("built-in" relative to the manual approach of earlier versions). It folds much of the work into the engine, so a partitioned table can be created in three simple steps, but it supports only range (RANGE) and list (LIST) partitioning.
PostgreSQL 11.x added support for HASH partitioning.
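As a sketch of the PostgreSQL 11+ syntax (this will not run on the 10.x test environment used below), a HASH-partitioned table spreads rows across a fixed number of partitions by hashing the key:

```sql
create table order_hash(
    sensor_coding varchar(64) not null,
    value FLOAT4 not null
) partition by hash (sensor_coding);

-- each row goes to the partition where hash(key) % modulus = remainder
create table order_hash_0 partition of order_hash for values with (modulus 4, remainder 0);
create table order_hash_1 partition of order_hash for values with (modulus 4, remainder 1);
create table order_hash_2 partition of order_hash for values with (modulus 4, remainder 2);
create table order_hash_3 partition of order_hash for values with (modulus 4, remainder 3);
```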
Range partitioned tables
1. Partition parent table
create table order_range(
sensor_coding varchar(64) not null,
means_point_id INT4,
value FLOAT4 not null,
time TIMESTAMP not null
) partition by range ( time );
The parent table of a partitioned table cannot have global constraints. partition by range(xxx) declares the partitioning method; xxx can be one or more columns, or an expression (see the official documentation for details).
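For example, a multi-column key or an expression key might look like this (hypothetical sketches, not used elsewhere in this article):

```sql
-- multi-column key: bounds are compared column by column, like row values
create table order_range_multi(
    sensor_coding varchar(64) not null,
    value FLOAT4 not null,
    time TIMESTAMP not null
) partition by range (time, sensor_coding);

-- expression key: note the extra parentheses around the expression
create table order_range_expr(
    sensor_coding varchar(64) not null,
    value FLOAT4 not null,
    time TIMESTAMP not null
) partition by range ((date_trunc('month', time)));
```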
2. Partition child tables
create table order_range_202101 partition of order_range(
sensor_coding not null,
means_point_id,
value not null,
time not null
) for values from ('2021-01-01') to ('2021-02-01');
create table order_range_202102 partition of order_range(
sensor_coding not null,
means_point_id,
value not null,
time not null
) for values from ('2021-02-01') to ('2021-03-01');
bridge=# \d+ order_range
Table "public.order_range"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
sensor_coding | character varying(64) | | not null | | extended | |
means_point_id | integer | | | | plain | |
value | real | | not null | | plain | |
time | timestamp without time zone | | not null | | plain | |
Partition key: RANGE ("time")
Partitions: order_range_202101 FOR VALUES FROM ('2021-01-01 00:00:00') TO ('2021-02-01 00:00:00'),
order_range_202102 FOR VALUES FROM ('2021-02-01 00:00:00') TO ('2021-03-01 00:00:00')
Notes:
- A partition must be created against a specified parent table.
- A partition must have exactly the same columns and definitions as its parent.
- A partition's columns may be given their own DEFAULT values or constraints.
- When a row is inserted into the parent, the system automatically routes it to the matching partition; if no matching partition is found, an error is raised.
- Partition bounds (ranges, LIST values) must not overlap; overlapping bounds would make routing ambiguous, and PostgreSQL rejects them.
- The partition key columns must be NOT NULL; if the parent is created without this, the system adds it automatically.
- A range partition covers >= lower bound AND < upper bound (a half-open interval).
- Moving a row from one partition to another via UPDATE is not supported and raises an error (PostgreSQL 11 adds support for this). To move a row, DELETE the old record and INSERT a new one.
- Renaming or retyping a column on the parent automatically applies the change to all partitions.
- TRUNCATE on the parent clears the records of all partitions; to clear a single partition, run it against that partition.
- DROP on the parent drops all partitions along with it; to drop a single partition, run it against that partition.
- psql can show the full definition of a partitioned table.
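A handy way to verify the routing described above is to query the hidden tableoid column through the parent; it reports which partition each row physically lives in:

```sql
-- tableoid::regclass prints the name of the partition storing each row
select tableoid::regclass as partition_name, *
from order_range
order by time;
```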
List partitioned tables
1. Partition parent table
create table order_list(
sensor_coding varchar(64),
means_point_id INT4,
area varchar(32),
value FLOAT4,
time TIMESTAMP
) partition by list ( area);
2. Partition child tables
create table order_list_gd partition of order_list for values in ('广东');
create table order_list_fj partition of order_list for values in ('福建');
Multi-level partitioned tables
Partition first by region, then by date.
1. Parent table
create table order_range_list(
sensor_coding varchar(64),
means_point_id INT4,
area varchar(32),
value FLOAT4,
time TIMESTAMP
) partition by list ( area);
2. First-level partitions
create table order_range_list_gd partition of order_range_list for values in ('广东') partition by range(time);
create table order_range_list_fj partition of order_range_list for values in ('福建') partition by range(time);
3. Second-level partitions
create table order_range_list_gd_202101 partition of order_range_list_gd(
sensor_coding, means_point_id, area, value, time
) for values from ('2021-01-01') to ('2021-02-01');
create table order_range_list_gd_202102 partition of order_range_list_gd(
sensor_coding, means_point_id, area, value, time
) for values from ('2021-02-01') to ('2021-03-01');
create table order_range_list_fj_202101 partition of order_range_list_fj(
sensor_coding, means_point_id, area, value, time
) for values from ('2021-01-01') to ('2021-02-01');
create table order_range_list_fj_202102 partition of order_range_list_fj(
sensor_coding, means_point_id, area, value, time
) for values from ('2021-02-01') to ('2021-03-01');
Operating directly on a partition is still subject to its partition constraint.
insert into order_range_202101 (sensor_coding,means_point_id,value,time) values('CH001',1,2.9,'2021-02-02');
ERROR: new row for relation "order_range_202101" violates partition constraint
DETAIL: Failing row contains (CH001, 1, 2.9, 2021-02-02 00:00:00).
The INSERT fails with a partition constraint violation.
Adding and removing partitions with ALTER TABLE ... ATTACH / DETACH PARTITION
1. Detaching a partition
Insert two test rows:
insert into order_range (sensor_coding,means_point_id,value,time) values('CH001',1,2.9,('2021-02-02'::date+(random()*31)::integer));
insert into order_range (sensor_coding,means_point_id,value,time) values('CH001',1,3.1,('2021-02-02'::date+(random()*31)::integer));
bridge=# select * from order_range;
sensor_coding | means_point_id | value | time
---------------+----------------+-------+---------------------
CH001 | 1 | 2.9 | 2021-02-25 00:00:00
CH001 | 1 | 3.1 | 2021-02-10 00:00:00
(2 rows)
Detach the partition:
alter table order_range detach partition order_range_202102;
Confirm the partition is gone:
bridge=# \d+ order_range;
Table "public.order_range"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
sensor_coding | character varying(64) | | not null | | extended | |
means_point_id | integer | | | | plain | |
value | real | | not null | | plain | |
time | timestamp without time zone | | not null | | plain | |
Partition key: RANGE ("time")
Partitions: order_range_202101 FOR VALUES FROM ('2021-01-01 00:00:00') TO ('2021-02-01 00:00:00')
Its data can no longer be queried through the parent:
bridge=# select * from order_range;
sensor_coding | means_point_id | value | time
---------------+----------------+-------+------
(0 rows)
But the detached table itself still exists:
bridge=# select * from order_range_202102;
sensor_coding | means_point_id | value | time
---------------+----------------+-------+---------------------
CH001 | 1 | 2.9 | 2021-02-25 00:00:00
CH001 | 1 | 3.1 | 2021-02-10 00:00:00
(2 rows)
2. Attaching a partition
alter table order_range attach partition order_range_202102 FOR VALUES FROM ('2021-02-01') TO ('2021-03-01');
The partition is attached again. When reattaching, its bounds may be changed, but all existing data must fall within the new bounds.
bridge=# \d+ order_range
Table "public.order_range"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
sensor_coding | character varying(64) | | not null | | extended | |
means_point_id | integer | | | | plain | |
value | real | | not null | | plain | |
time | timestamp without time zone | | not null | | plain | |
Partition key: RANGE ("time")
Partitions: order_range_202101 FOR VALUES FROM ('2021-01-01 00:00:00') TO ('2021-02-01 00:00:00'),
order_range_202102 FOR VALUES FROM ('2021-02-01 00:00:00') TO ('2021-03-01 00:00:00')
bridge=# select * from order_range;
sensor_coding | means_point_id | value | time
---------------+----------------+-------+---------------------
CH001 | 1 | 2.9 | 2021-02-25 00:00:00
CH001 | 1 | 3.1 | 2021-02-10 00:00:00
(2 rows)
Using a foreign table as a partition
Create a new database with the table it will expose:
# create the database
create database bridge_fdw;
# connect to it
\c bridge_fdw
# create the table
create table order_range_fdw(
sensor_coding varchar(64) not null,
means_point_id INT4,
value FLOAT4 not null,
time TIMESTAMP not null
);
# insert one test row
insert into order_range_fdw(sensor_coding,means_point_id,value,time)values
('CH002',2, 4.4,('2021-01-01'::date-(random()*31)::integer));
# check the data
bridge_fdw=# select * from order_range_fdw;
sensor_coding | means_point_id | value | time
---------------+----------------+-------+---------------------
CH002 | 2 | 4.4 | 2020-12-28 00:00:00
(1 row)
Add the foreign table order_range_fdw in the bridge database:
# switch databases
\c bridge
# install the extension manually
create extension postgres_fdw;
# create the foreign server
create server foreign_server foreign data wrapper postgres_fdw options (host '127.0.0.1', port '5432', dbname 'bridge_fdw');
# create the user mapping for the foreign server
create user mapping for postgres server foreign_server options (user 'postgres', password 'xxxxxxx');
# create the foreign table
create foreign table order_range_fdw(
sensor_coding varchar(64) not null,
means_point_id INT4,
value FLOAT4 not null,
time TIMESTAMP not null
) server foreign_server
options (schema_name 'public', table_name 'order_range_fdw');
# query the data
bridge=# select * from order_range_fdw;
sensor_coding | means_point_id | value | time
---------------+----------------+-------+---------------------
CH002 | 2 | 4.4 | 2020-12-28 00:00:00
(1 row)
Attach the foreign table to order_range as a partition:
# attach the partition
alter table order_range attach partition order_range_fdw for values from ('2020-01-01') to ('2021-01-01');
Query:
# list all partitions of order_range
bridge=# \d+ order_range
Table "public.order_range"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
sensor_coding | character varying(64) | | not null | | extended | |
means_point_id | integer | | | | plain | |
value | real | | not null | | plain | |
time | timestamp without time zone | | not null | | plain | |
Partition key: RANGE ("time")
Partitions: order_range_202101 FOR VALUES FROM ('2021-01-01 00:00:00') TO ('2021-02-01 00:00:00'),
order_range_202102 FOR VALUES FROM ('2021-02-01 00:00:00') TO ('2021-03-01 00:00:00'),
order_range_fdw FOR VALUES FROM ('2020-01-01 00:00:00') TO ('2021-01-01 00:00:00')
# query the data
bridge=# select * from order_range where time<'2021-01-01';
sensor_coding | means_point_id | value | time
---------------+----------------+-------+---------------------
CH002 | 2 | 4.4 | 2020-12-28 00:00:00
(1 row)
# check the execution plan
bridge=# explain select * from order_range where time<'2021-01-01';
QUERY PLAN
---------------------------------------------------------------------------------
Append (cost=100.00..118.44 rows=147 width=162)
-> Foreign Scan on order_range_fdw (cost=100.00..118.44 rows=147 width=162)
(2 rows)
Test whether the data can be modified:
bridge=# insert into order_range(sensor_coding,means_point_id,value,time)values ('CH002',2, 4.6,('2021-01-01'::date-(random()*31)::integer));
ERROR: cannot route inserted tuples to a foreign table
bridge=# update order_range set time='2020-12-12' where time = '2020-12-28';
UPDATE 1
bridge=# select * from order_range where time<'2021-01-01';
sensor_coding | means_point_id | value | time
---------------+----------------+-------+---------------------
CH002 | 2 | 4.4 | 2020-12-12 00:00:00
(1 row)
bridge=# delete from order_range where time='2020-12-12';
DELETE 1
bridge=# select * from order_range where time<'2021-01-01';
sensor_coding | means_point_id | value | time
---------------+----------------+-------+------
(0 rows)
Inserted rows cannot be routed to a foreign table; the source code simply notes /* We do not yet have a way to insert into a foreign partition */
There are some further problems with using a foreign table as a partition.
1. The partition constraint cannot stop out-of-range rows from being inserted directly into the foreign table
bridge=# \d+ order_range
Table "public.order_range"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
sensor_coding | character varying(64) | | not null | | extended | |
means_point_id | integer | | | | plain | |
value | real | | not null | | plain | |
time | timestamp without time zone | | not null | | plain | |
Partition key: RANGE ("time")
Partitions: order_range_202101 FOR VALUES FROM ('2021-01-01 00:00:00') TO ('2021-02-01 00:00:00'),
order_range_202102 FOR VALUES FROM ('2021-02-01 00:00:00') TO ('2021-03-01 00:00:00'),
order_range_fdw FOR VALUES FROM ('2020-01-01 00:00:00') TO ('2021-01-01 00:00:00')
bridge=# insert into order_range_fdw(sensor_coding,means_point_id,value,time)values
('CH002',2, 5.9,'2021-01-01');
bridge=# select * from order_range;
sensor_coding | means_point_id | value | time
---------------+----------------+-------+---------------------
CH001 | 1 | 2.9 | 2021-02-11 00:00:00
CH001 | 1 | 3.1 | 2021-02-23 00:00:00
CH002 | 2 | 5.9 | 2021-01-01 00:00:00
(3 rows)
bridge=# select * from order_range where time='2020-01-01';
sensor_coding | means_point_id | value | time
---------------+----------------+-------+------
(0 rows)
An insert like this leaves a row that violates the partition bounds, so queries that rely on partition pruning can miss it and return inconsistent results.
2. SQL execution is not pushed down
When execution cannot be pushed down to the remote server, aggregate queries suffer badly, so be especially careful here. For example:
bridge=# delete from order_range_fdw;
DELETE 1
bridge=# insert into order_range_fdw(sensor_coding,means_point_id,value,time)values
('CH002',2, 5.9,'2020-01-01');
INSERT 0 1
bridge=# insert into order_range_fdw(sensor_coding,means_point_id,value,time)values
('CH002',2, 6.6,'2020-02-01');
INSERT 0 1
# query through order_range
bridge=# explain analyze select count(1) from order_range where time<'2021-01-01';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=178.27..178.28 rows=1 width=8) (actual time=0.189..0.189 rows=1 loops=1)
-> Append (cost=100.00..175.42 rows=1138 width=0) (actual time=0.187..0.187 rows=2 loops=1)
-> Foreign Scan on order_range_fdw (cost=100.00..175.42 rows=1138 width=0) (actual time=0.187..0.187 rows=2 loops=1)
Planning time: 0.124 ms
Execution time: 0.472 ms
(5 rows)
# query the foreign table directly
bridge=# explain analyze select count(1) from order_range_fdw where time<'2021-01-01';
QUERY PLAN
----------------------------------------------------------------------------------------------
Foreign Scan (cost=102.84..155.54 rows=1 width=8) (actual time=0.207..0.207 rows=1 loops=1)
Relations: Aggregate on (public.order_range_fdw)
Planning time: 0.082 ms
Execution time: 0.363 ms
(4 rows)
3. When a query has to touch both foreign partitions and regular partitions, Parallel Seq Scan cannot be used. For example:
# insert 1,000,000 rows into the partitioned table
INSERT INTO order_range ( sensor_coding, means_point_id, VALUE, TIME )
SELECT
'CH'||round( random() * 1000 ),
round(random() * 100),
random() * 10 - 5,
'2021-01-01'::DATE+( random( ) * 58 ) :: INTEGER
FROM
generate_series ( 1, 1000000 );
# scan all partitions
bridge=# explain select count(1) from order_range;
QUERY PLAN
--------------------------------------------------------------------------------------
Aggregate (cost=19090.92..19090.93 rows=1 width=8)
-> Append (cost=0.00..16582.39 rows=1003413 width=0)
-> Seq Scan on order_range_202101 (cost=0.00..8609.21 rows=525921 width=0)
-> Seq Scan on order_range_202102 (cost=0.00..7760.79 rows=474079 width=0)
-> Foreign Scan on order_range_fdw (cost=100.00..212.39 rows=3413 width=0)
(5 rows)
# scan only the regular (non-foreign) partitions
bridge=# explain select count(1) from order_range where time>='2021-01-01';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=14840.27..14840.28 rows=1 width=8)
-> Gather (cost=14840.06..14840.27 rows=2 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=13840.06..13840.07 rows=1 width=8)
-> Append (cost=0.00..12595.05 rows=498004 width=0)
-> Parallel Seq Scan on order_range_202101 (cost=0.00..6089.17 rows=219134 width=0)
Filter: ("time" >= '2021-01-01 00:00:00'::timestamp without time zone)
-> Parallel Seq Scan on order_range_202102 (cost=0.00..6505.88 rows=278870 width=0)
Filter: ("time" >= '2021-01-01 00:00:00'::timestamp without time zone)
(9 rows)
5. Creating Test Tables
The following simulates a sensor data table.
-- non-partitioned table
create table tb_data_all(
sensor_coding varchar(64) not null,
value FLOAT4 not null,
time date not null
);
create index tb_data_all_coding_idx on tb_data_all using btree(sensor_coding);
create index tb_data_all_time_idx on tb_data_all using btree(time);
-- partitioned table: 12 partitions, one per month
create table tb_data(
sensor_coding varchar(64) not null,
value FLOAT4 not null,
time date not null
) partition by range (time);
create table tb_data_202101 partition of tb_data(sensor_coding,value,time) for values from ('2021-01-01') to ('2021-02-01');
create index tb_data_202101_time_idx on tb_data_202101 using btree(time);
create index tb_data_202101_coding_idx on tb_data_202101 using btree(sensor_coding);
create table tb_data_202102 partition of tb_data(sensor_coding,value,time) for values from ('2021-02-01') to ('2021-03-01');
create index tb_data_202102_time_idx on tb_data_202102 using btree(time);
create index tb_data_202102_coding_idx on tb_data_202102 using btree(sensor_coding);
create table tb_data_202103 partition of tb_data(sensor_coding,value,time) for values from ('2021-03-01') to ('2021-04-01');
create index tb_data_202103_time_idx on tb_data_202103 using btree(time);
create index tb_data_202103_coding_idx on tb_data_202103 using btree(sensor_coding);
create table tb_data_202104 partition of tb_data(sensor_coding,value,time) for values from ('2021-04-01') to ('2021-05-01');
create index tb_data_202104_time_idx on tb_data_202104 using btree(time);
create index tb_data_202104_coding_idx on tb_data_202104 using btree(sensor_coding);
create table tb_data_202105 partition of tb_data(sensor_coding,value,time) for values from ('2021-05-01') to ('2021-06-01');
create index tb_data_202105_time_idx on tb_data_202105 using btree(time);
create index tb_data_202105_coding_idx on tb_data_202105 using btree(sensor_coding);
create table tb_data_202106 partition of tb_data(sensor_coding,value,time) for values from ('2021-06-01') to ('2021-07-01');
create index tb_data_202106_time_idx on tb_data_202106 using btree(time);
create index tb_data_202106_coding_idx on tb_data_202106 using btree(sensor_coding);
create table tb_data_202107 partition of tb_data(sensor_coding,value,time) for values from ('2021-07-01') to ('2021-08-01');
create index tb_data_202107_time_idx on tb_data_202107 using btree(time);
create index tb_data_202107_coding_idx on tb_data_202107 using btree(sensor_coding);
create table tb_data_202108 partition of tb_data(sensor_coding,value,time) for values from ('2021-08-01') to ('2021-09-01');
create index tb_data_202108_time_idx on tb_data_202108 using btree(time);
create index tb_data_202108_coding_idx on tb_data_202108 using btree(sensor_coding);
create table tb_data_202109 partition of tb_data(sensor_coding,value,time) for values from ('2021-09-01') to ('2021-10-01');
create index tb_data_202109_time_idx on tb_data_202109 using btree(time);
create index tb_data_202109_coding_idx on tb_data_202109 using btree(sensor_coding);
create table tb_data_202110 partition of tb_data(sensor_coding,value,time) for values from ('2021-10-01') to ('2021-11-01');
create index tb_data_202110_time_idx on tb_data_202110 using btree(time);
create index tb_data_202110_coding_idx on tb_data_202110 using btree(sensor_coding);
create table tb_data_202111 partition of tb_data(sensor_coding,value,time) for values from ('2021-11-01') to ('2021-12-01');
create index tb_data_202111_time_idx on tb_data_202111 using btree(time);
create index tb_data_202111_coding_idx on tb_data_202111 using btree(sensor_coding);
create table tb_data_202112 partition of tb_data(sensor_coding,value,time) for values from ('2021-12-01') to ('2022-01-01');
create index tb_data_202112_time_idx on tb_data_202112 using btree(time);
create index tb_data_202112_coding_idx on tb_data_202112 using btree(sensor_coding);
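The twelve near-identical CREATE statements above can also be generated in a loop with a PL/pgSQL DO block and dynamic SQL (a sketch, assuming the same tb_data parent and naming scheme as above):

```sql
do $$
declare
    m date;       -- first day of the month being created
    part text;    -- partition name, e.g. tb_data_202101
begin
    for i in 0..11 loop
        m := date '2021-01-01' + (i * interval '1 month');
        part := 'tb_data_' || to_char(m, 'YYYYMM');
        -- create the monthly partition with half-open bounds [m, m + 1 month)
        execute format(
            'create table %I partition of tb_data for values from (%L) to (%L)',
            part, m, m + interval '1 month');
        execute format('create index %I on %I using btree(time)',
                       part || '_time_idx', part);
        execute format('create index %I on %I using btree(sensor_coding)',
                       part || '_coding_idx', part);
    end loop;
end $$;
```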
6. Performance Tests
Generate 10,000,000 test rows (spread as evenly as possible):
# enable display of SQL execution time
\timing on
copy(
SELECT
'CH'||round(random() * 50),
random() * 10 - 5,
'2021-01-01'::DATE+(random() * 356) :: INTEGER
FROM
generate_series (1, 10000000)
) to '/tmp/data.csv';
1. Data import, non-partitioned table
bridge=# copy tb_data_all(sensor_coding,value,time) from '/tmp/data.csv';
COPY 10000000
Time: 82998.526 ms (01:22.999)
2. Data import, partitioned table
bridge=# copy tb_data(sensor_coding,value,time) from '/tmp/data.csv';
COPY 10000000
Time: 105434.381 ms (01:45.434)
Conclusion: the partitioned table imports somewhat slower, so for bulk loads it is better to export the data per partition and load each partition directly.
Querying a single day of data (served from cache)
-- non-partitioned table
bridge=# explain (analyze,buffers) select * from tb_data_all where time='2021-06-01';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on tb_data_all (cost=1053.96..56058.56 rows=36842 width=154) (actual time=5.993..102.906 rows=28047 loops=1)
Recheck Cond: ("time" = '2021-06-01'::date)
Heap Blocks: exact=21866
Buffers: shared hit=15 read=21954 written=2
-> Bitmap Index Scan on tb_data_all_time_idx (cost=0.00..1044.75 rows=36842 width=0) (actual time=3.323..3.323 rows=28047 loops=1)
Index Cond: ("time" = '2021-06-01'::date)
Buffers: shared read=103
Planning time: 0.067 ms
Execution time: 103.956 ms
(9 rows)
Time: 104.475 ms
-- partitioned table
bridge=# explain (analyze,buffers) select * from tb_data where time='2021-06-01';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=68.19..2488.95 rows=1003 width=154) (actual time=1.715..9.844 rows=28047 loops=1)
Buffers: shared hit=4658
-> Bitmap Heap Scan on tb_data_202106 (cost=68.19..2488.95 rows=1003 width=154) (actual time=1.715..8.431 rows=28047 loops=1)
Recheck Cond: ("time" = '2021-06-01'::date)
Heap Blocks: exact=4555
Buffers: shared hit=4658
-> Bitmap Index Scan on tb_data_202106_time_idx (cost=0.00..67.94 rows=1003 width=0) (actual time=1.239..1.239 rows=28047 loops=1)
Index Cond: ("time" = '2021-06-01'::date)
Buffers: shared hit=103
Planning time: 0.218 ms
Execution time: 11.008 ms
(11 rows)
Time: 11.510 ms
Conclusion: the partitioned table's planning time is noticeably higher than the non-partitioned table's, but it is negligible next to the execution time.
Querying a time range
1. Range falls within a single partition
-- non-partitioned table
bridge=# explain (analyze,buffers) select * from tb_data_all where time>'2021-06-01' and time < '2021-07-01';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on tb_data_all (cost=21100.98..93037.85 rows=823858 width=12) (actual time=84.850..366.697 rows=815085 loops=1)
Recheck Cond: (("time" > '2021-06-01'::date) AND ("time" < '2021-07-01'::date))
Heap Blocks: exact=54055
Buffers: shared read=56963 written=134
-> Bitmap Index Scan on tb_data_all_time_idx (cost=0.00..20895.02 rows=823858 width=0) (actual time=75.516..75.516 rows=815085 loops=1)
Index Cond: (("time" > '2021-06-01'::date) AND ("time" < '2021-07-01'::date))
Buffers: shared read=2908
Planning time: 0.061 ms
Execution time: 391.173 ms
(9 rows)
Time: 391.509 ms
-- partitioned table
bridge=# explain (analyze,buffers) select * from tb_data where time>'2021-06-01' and time < '2021-07-01';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..17204.98 rows=814859 width=12) (actual time=0.022..123.207 rows=815085 loops=1)
Buffers: shared hit=192 read=4366
-> Seq Scan on tb_data_202106 (cost=0.00..17204.98 rows=814859 width=12) (actual time=0.021..82.032 rows=815085 loops=1)
Filter: (("time" > '2021-06-01'::date) AND ("time" < '2021-07-01'::date))
Rows Removed by Filter: 28047
Buffers: shared hit=192 read=4366
Planning time: 0.292 ms
Execution time: 147.448 ms
(8 rows)
Time: 148.031 ms
Within a single partition, the advantage is clear.
2. Queries spanning partitions
-- non-partitioned table
bridge=# explain (analyze,buffers) select * from tb_data_all where time>'2021-06-01' and time < '2021-12-01';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Seq Scan on tb_data_all (cost=0.00..209507.61 rows=5123213 width=12) (actual time=14.182..974.184 rows=5110991 loops=1)
Filter: (("time" > '2021-06-01'::date) AND ("time" < '2021-12-01'::date))
Rows Removed by Filter: 4889009
Buffers: shared hit=15910 read=43669
Planning time: 0.059 ms
Execution time: 1142.604 ms
(6 rows)
Time: 1142.927 ms (00:01.143)
-- partitioned table
bridge=# explain (analyze,buffers) select * from tb_data where time>'2021-06-01' and time < '2021-12-01';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..104866.57 rows=5110765 width=12) (actual time=0.018..832.956 rows=5110991 loops=1)
Buffers: shared hit=508 read=27273
-> Seq Scan on tb_data_202106 (cost=0.00..17204.98 rows=814859 width=12) (actual time=0.017..100.094 rows=815085 loops=1)
Filter: (("time" > '2021-06-01'::date) AND ("time" < '2021-12-01'::date))
Rows Removed by Filter: 28047
Buffers: shared hit=188 read=4370
-> Seq Scan on tb_data_202107 (cost=0.00..17779.23 rows=871282 width=12) (actual time=0.014..90.054 rows=871282 loops=1)
Filter: (("time" > '2021-06-01'::date) AND ("time" < '2021-12-01'::date))
Buffers: shared hit=64 read=4646
-> Seq Scan on tb_data_202108 (cost=0.00..17759.83 rows=870322 width=12) (actual time=0.011..116.802 rows=870322 loops=1)
Filter: (("time" > '2021-06-01'::date) AND ("time" < '2021-12-01'::date))
Buffers: shared hit=64 read=4641
-> Seq Scan on tb_data_202109 (cost=0.00..17186.94 rows=842263 width=12) (actual time=0.012..82.560 rows=842263 loops=1)
Filter: (("time" > '2021-06-01'::date) AND ("time" < '2021-12-01'::date))
Buffers: shared hit=64 read=4489
-> Seq Scan on tb_data_202110 (cost=0.00..17741.64 rows=869443 width=12) (actual time=0.013..82.646 rows=869443 loops=1)
Filter: (("time" > '2021-06-01'::date) AND ("time" < '2021-12-01'::date))
Buffers: shared hit=64 read=4636
-> Seq Scan on tb_data_202111 (cost=0.00..17193.94 rows=842596 width=12) (actual time=0.011..82.794 rows=842596 loops=1)
Filter: (("time" > '2021-06-01'::date) AND ("time" < '2021-12-01'::date))
Buffers: shared hit=64 read=4491
Planning time: 0.337 ms
Execution time: 1005.739 ms
(23 rows)
Time: 1006.434 ms (00:01.006)
For queries spanning partitions, performance evens out at roughly the point where about half the partitions are scanned.
Querying one sensor's data for one month (served from cache)
1. All rows fall in one partition, and the row count is large
-- non-partitioned table
bridge=# explain (analyze,buffers) select * from tb_data_all where time>'2021-06-01' and time < '2021-07-01' and sensor_coding='CH25';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on tb_data_all (cost=25394.10..61795.58 rows=16203 width=12) (actual time=88.549..147.860 rows=16415 loops=1)
Recheck Cond: (((sensor_coding)::text = 'CH25'::text) AND ("time" > '2021-06-01'::date) AND ("time" < '2021-07-01'::date))
Heap Blocks: exact=14211
Buffers: shared read=17813
-> BitmapAnd (cost=25394.10..25394.10 rows=16203 width=0) (actual time=86.520..86.520 rows=0 loops=1)
Buffers: shared read=3602
-> Bitmap Index Scan on tb_data_all_coding_idx (cost=0.00..4490.73 rows=196573 width=0) (actual time=19.394..19.394 rows=199359 loops=1)
Index Cond: ((sensor_coding)::text = 'CH25'::text)
Buffers: shared read=694
-> Bitmap Index Scan on tb_data_all_time_idx (cost=0.00..20895.02 rows=823858 width=0) (actual time=64.585..64.585 rows=815085 loops=1)
Index Cond: (("time" > '2021-06-01'::date) AND ("time" < '2021-07-01'::date))
Buffers: shared read=2908
Planning time: 0.092 ms
Execution time: 148.409 ms
(14 rows)
Time: 148.840 ms
-- partitioned table
bridge=# explain (analyze,buffers) select * from tb_data where time>'2021-06-01' and time < '2021-07-01' and sensor_coding='CH25';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=403.81..5285.93 rows=17900 width=12) (actual time=1.859..8.190 rows=16415 loops=1)
Buffers: shared hit=4518
-> Bitmap Heap Scan on tb_data_202106 (cost=403.81..5285.93 rows=17900 width=12) (actual time=1.858..7.374 rows=16415 loops=1)
Recheck Cond: ((sensor_coding)::text = 'CH25'::text)
Filter: (("time" > '2021-06-01'::date) AND ("time" < '2021-07-01'::date))
Rows Removed by Filter: 561
Heap Blocks: exact=4454
Buffers: shared hit=4518
-> Bitmap Index Scan on tb_data_202106_coding_idx (cost=0.00..399.33 rows=18521 width=0) (actual time=1.355..1.355 rows=16976 loops=1)
Index Cond: ((sensor_coding)::text = 'CH25'::text)
Buffers: shared hit=64
Planning time: 0.319 ms
Execution time: 8.688 ms
(13 rows)
Time: 9.530 ms
This is where partitioning shows its biggest advantage; the performance gain is dramatic.
You can generate the same 10 million rows using 1,000,000 distinct sensors and compare query speed when the rows all fall in one partition but very few are returned. According to results reported elsewhere, when very few records are returned the non-partitioned table wins, because the partitioned table's planning time overhead is larger.
Index maintenance
-- non-partitioned table
bridge=# reindex index tb_data_all_time_idx;
REINDEX
Time: 10687.386 ms (00:10.687)
-- partitioned table
bridge=# reindex index tb_data_202106_time_idx;
REINDEX
Time: 712.289 ms
Another advantage of partitioning: indexes can be rebuilt per partition.
Deleting a whole partition's data
-- non-partitioned table
bridge=# delete from tb_data_all where time >='2021-06-01' and time <='2021-07-01';
DELETE 871425
Time: 5873.425 ms (00:05.873)
-- partitioned table
bridge=# truncate table tb_data_202106;
TRUNCATE TABLE
Time: 19.743 ms
Another advantage of partitioning: a single partition can simply be TRUNCATEd.
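The same property makes time-based retention cheap: instead of a slow DELETE, an old partition can be detached and then dropped or archived as a whole (a sketch using the tables above):

```sql
-- turn the oldest partition back into an ordinary table...
alter table tb_data detach partition tb_data_202101;
-- ...then drop it (or dump/archive it first)
drop table tb_data_202101;
```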
Full table scan
-- non-partitioned table
bridge=# explain analyse select count(1) from tb_data_all;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=112662.55..112662.56 rows=1 width=8) (actual time=1270.763..1270.763 rows=1 loops=1)
-> Gather (cost=112662.33..112662.54 rows=2 width=8) (actual time=1270.756..1270.805 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=111662.33..111662.34 rows=1 width=8) (actual time=1248.389..1248.389 rows=1 loops=3)
-> Parallel Seq Scan on tb_data_all (cost=0.00..101245.67 rows=4166667 width=0) (actual time=4.939..704.202 rows=3042858 loops=3)
Planning time: 0.053 ms
Execution time: 1270.865 ms
(8 rows)
Time: 1271.136 ms (00:01.271)
-- partitioned table
bridge=# explain analyse select count(1) from tb_data;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=98219.24..98219.25 rows=1 width=8) (actual time=1571.950..1571.950 rows=1 loops=1)
-> Gather (cost=98219.02..98219.23 rows=2 width=8) (actual time=1566.493..1572.109 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=97219.02..97219.03 rows=1 width=8) (actual time=1535.437..1535.437 rows=1 loops=3)
-> Append (cost=0.00..87677.62 rows=3816562 width=0) (actual time=0.305..1141.568 rows=3052289 loops=3)
-> Parallel Seq Scan on tb_data_202101 (cost=0.00..8195.30 rows=356730 width=0) (actual time=0.303..67.581 rows=285384 loops=3)
-> Parallel Seq Scan on tb_data_202102 (cost=0.00..7534.25 rows=327925 width=0) (actual time=0.028..82.660 rows=262340 loops=3)
-> Parallel Seq Scan on tb_data_202103 (cost=0.00..8335.15 rows=362815 width=0) (actual time=0.026..78.525 rows=290252 loops=3)
-> Parallel Seq Scan on tb_data_202104 (cost=0.00..8063.74 rows=350974 width=0) (actual time=0.029..56.413 rows=280779 loops=3)
-> Parallel Seq Scan on tb_data_202105 (cost=0.00..8347.05 rows=363305 width=0) (actual time=0.031..63.793 rows=290644 loops=3)
-> Parallel Seq Scan on tb_data_202106 (cost=0.00..22.00 rows=1200 width=0) (actual time=0.000..0.000 rows=0 loops=3)
-> Parallel Seq Scan on tb_data_202107 (cost=0.00..8340.34 rows=363034 width=0) (actual time=0.024..65.825 rows=290427 loops=3)
-> Parallel Seq Scan on tb_data_202108 (cost=0.00..8331.34 rows=362634 width=0) (actual time=0.028..64.876 rows=290107 loops=3)
-> Parallel Seq Scan on tb_data_202109 (cost=0.00..8062.43 rows=350943 width=0) (actual time=0.028..42.131 rows=280754 loops=3)
-> Parallel Seq Scan on tb_data_202110 (cost=0.00..8322.68 rows=362268 width=0) (actual time=0.026..49.744 rows=289814 loops=3)
-> Parallel Seq Scan on tb_data_202111 (cost=0.00..8065.82 rows=351082 width=0) (actual time=0.027..68.692 rows=280865 loops=3)
-> Parallel Seq Scan on tb_data_202112 (cost=0.00..6057.52 rows=263652 width=0) (actual time=0.011..57.623 rows=210921 loops=3)
Planning time: 0.284 ms
Execution time: 1572.157 ms
(20 rows)
Time: 1572.769 ms (00:01.573)
The partitioned table loses on a full scan, but the difference is still acceptable.
Adding a new partition and importing data
-- generate data for the new partition
copy(
SELECT sensor_coding,value,time+31 FROM tb_data_202112
) to '/tmp/202201.csv';
-- create the new partition
create table tb_data_202201 partition of tb_data(sensor_coding,value,time) for values from ('2022-01-01') to ('2022-02-01');
create index tb_data_202201_time_idx on tb_data_202201 using btree(time);
create index tb_data_202201_coding_idx on tb_data_202201 using btree(sensor_coding);
-- non-partitioned table
bridge=# copy tb_data_all(sensor_coding,value,time) from '/tmp/202201.csv';
COPY 632764
Time: 4019.644 ms (00:04.020)
-- partitioned table
bridge=# copy tb_data_202201(sensor_coding,value,time) from '/tmp/202201.csv';
COPY 632764
Time: 3840.127 ms (00:03.840)
Importing directly into a specific partition still has a slight edge, while importing through the parent of the partitioned table is a bit slower.
------ Conclusion ------
Based on these tests, PostgreSQL's built-in partitioning holds up very well.