现象
记录一个生产问题,DBA在执行下面这个GRANT
操作的时候,偶发的会出现这个报错
[guqi@intel175 ~]$ psql -p 51300
psql (14devel)
Type "help" for help.
guqi=# grant select on all tables in schema public to guqi ;
ERROR: tuple concurrently updated
guqi=# \q
分析
GRANT
操作不会更新表内的数据,应该只是会更新系统表内的行。换言之,当执行GRANT
的时候,有些数据表在系统表内的行被别人更新了。设置下errverbose
,发现是在更新pg_class
的relacl
属性
[guqi@intel175 ~]$ psql -p 51300
psql (14devel)
Type "help" for help.
guqi=#\errverbose
guqi=# grant select on all tables in schema public to guqi ;
^CCancel request sent
ERROR: canceling statement due to user request
CONTEXT: while updating tuple (6,6) in relation "pg_class"
guqi=# select * from pg_class where ctid = '(6,6)';
oid | relname | relnamespace | reltype | reloftype | relowner | relam | relfilenode | reltablespace | relpages | reltuples
| relallvisible | reltoastrelid | relhasindex | relisshared | relpersistence | relkind | relnatts | relchecks | relhasrules |
relhastriggers | relhassubclass | relrowsecurity | relforcerowsecurity | relispopulated | relreplident | relispartition | relr
ewrite | relfrozenxid | relminmxid | relacl | reloptions | relpartbound
-------+---------+--------------+---------+-----------+----------+-------+-------------+---------------+----------+-----------
+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+-------------+-
---------------+----------------+----------------+---------------------+----------------+--------------+----------------+-----
-------+--------------+------------+--------+------------+--------------
16458 | tb1 | 2200 | 16460 | 0 | 10 | 2 | 16462 | 0 | 0 | -1
| 0 | 0 | f | f | p | r | 2 | 0 | f |
f | f | f | f | t | d | f |
0 | 580 | 1 | | |
(1 row)
同时查看pg_stat_activity
,发现有TRUNCATE
在运行,与内核开发的小伙伴沟通了下,truncate
在内部会重建表,重建会导致表的物理存储发生变化。基于这一点,我测试了下
# 窗口1
guqi=# begin;
BEGIN
guqi=*# truncate tb1 ;
TRUNCATE TABLE
-- 暂不提交
# 窗口2
guqi=# grant select on all tables in schema public to guqi ;
-- 操作会处于锁等状态
# 窗口1
guqi=*# commit;
COMMIT
# 窗口2
guqi=# grant select on all tables in schema public to guqi ;
ERROR: tuple concurrently updated
按照上述顺序执行,100%触发,而且通过跟踪tb1
在pg_class
内的ctid
值,发现TRUNCATE
前后确实变了,那么基本原因就是这个了。
解决的话,把GRANT
和TRUNCATE
分开执行,不要并发。