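Building on the WAL file structure described in the previous sections, we could write a small program of our own to parse the transaction log and inspect its contents. PG already ships a tool for dumping the transaction log, though: pg_waldump.
Note: pg_waldump applies to PG 10.x and later. On PG 9.x (9.3 and later), the same tool is named pg_xlogdump.
1. Introduction to pg_waldump
Run it on Linux with --help to see the usage.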
[xdb@localhost pg_wal]$ pg_waldump --help
pg_waldump decodes and displays PostgreSQL write-ahead logs for debugging.
Usage:
pg_waldump [OPTION]... [STARTSEG [ENDSEG]]
Options:
-b, --bkp-details output detailed information about backup blocks
-e, --end=RECPTR stop reading at WAL location RECPTR
-f, --follow keep retrying after reaching end of WAL
-n, --limit=N number of records to display
-p, --path=PATH directory in which to find log segment files or a
directory with a ./pg_wal that contains such files
(default: current directory, ./pg_wal, $PGDATA/pg_wal)
-r, --rmgr=RMGR only show records generated by resource manager RMGR;
use --rmgr=list to list valid resource manager names
-s, --start=RECPTR start reading at WAL location RECPTR
-t, --timeline=TLI timeline from which to read log records
(default: 1 or the value used in STARTSEG)
-V, --version output version information, then exit
-x, --xid=XID only show records with transaction ID XID
-z, --stats[=record] show statistics instead of records
(optionally, show per-record statistics)
-?, --help show this help, then exit
[xdb@localhost pg_wal]$
-b, --bkp-details output detailed information about backup blocks
Outputs detailed information about backup blocks, i.e. full-page writes (full-page images).
-e, --end=RECPTR stop reading at WAL location RECPTR
Stop reading at this LSN.
-f, --follow keep retrying after reaching end of WAL
Keep retrying after reaching the end of the WAL.
-n, --limit=N number of records to display
Number of XLOG Records to output.
-p, --path=PATH directory in which to find log segment files or a
directory with a ./pg_wal that contains such files
(default: current directory, ./pg_wal, $PGDATA/pg_wal)
Directory in which to look for WAL segment files.
-r, --rmgr=RMGR only show records generated by resource manager RMGR;
use --rmgr=list to list valid resource manager names
Show only the XLOG Records generated by the specified resource manager (RMGR).
-s, --start=RECPTR start reading at WAL location RECPTR
Start reading at this LSN.
-t, --timeline=TLI timeline from which to read log records
(default: 1 or the value used in STARTSEG)
Timeline from which to read the records.
-V, --version output version information, then exit
Output version information, then exit.
-x, --xid=XID only show records with transaction ID XID
Output only the XLOG Records that belong to the specified transaction ID.
-z, --stats[=record] show statistics instead of records
(optionally, show per-record statistics)
Output statistics instead of the records themselves.
-?, --help show this help, then exit
Output the help text, then exit.
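When pg_waldump is driven from a script, the options above map naturally onto a small helper that assembles the command line. The following is only a sketch: the function name build_waldump_argv and its parameter names are made up for illustration, and it uses only flags that appear in the --help output above.

```python
import shlex


def build_waldump_argv(path=None, start=None, end=None, limit=None,
                       rmgr=None, xid=None, timeline=None, stats=False):
    """Assemble a pg_waldump argument list (hypothetical helper)."""
    argv = ["pg_waldump"]
    if path is not None:
        argv += ["-p", path]            # directory holding the segment files
    if start is not None:
        argv += ["-s", start]           # LSN to start reading at
    if end is not None:
        argv += ["-e", end]             # LSN to stop reading at
    if limit is not None:
        argv += ["-n", str(limit)]      # number of records to display
    if rmgr is not None:
        argv += ["-r", rmgr]            # restrict to one resource manager
    if xid is not None:
        argv += ["-x", str(xid)]        # restrict to one transaction ID
    if timeline is not None:
        argv += ["-t", str(timeline)]   # timeline to read from
    if stats:
        argv.append("-z")               # statistics instead of records
    return argv


if __name__ == "__main__":
    # Print the command as a shell-quoted string instead of running it.
    print(shlex.join(build_waldump_argv(path="./", start="1/48000000", limit=4)))
```

The resulting list can be passed to subprocess.run on a machine where pg_waldump is on the PATH.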
2. Using pg_waldump
Below are the files in the pg_wal directory on the test machine.
[xdb@localhost pg_wal]$ ll
total 98332
-rw-------. 1 xdb xdb 16777216 Dec 20 12:02 000000010000000100000048
-rw-------. 1 xdb xdb 16777216 Dec 19 16:47 000000010000000100000049
-rw-------. 1 xdb xdb 16777216 Dec 19 16:47 00000001000000010000004A
-rw-------. 1 xdb xdb 16777216 Dec 19 16:47 00000001000000010000004B
-rw-------. 1 xdb xdb 16777216 Dec 19 16:47 00000001000000010000004C
-rw-------. 1 xdb xdb 16777216 Dec 19 16:47 00000001000000010000004D
drwx------. 2 xdb xdb 6 Nov 16 15:48 archive_status
Print the first 4 XLOG Records of the file 000000010000000100000048.
Command: pg_waldump -p ./ -s 1/48000000 -n 4
[xdb@localhost pg_wal]$ pg_waldump -p ./ -s 1/48000000 -n 4
rmgr: Heap len (rec/tot): 77/ 77, tx: 1964, lsn: 1/48000070, prev 1/47FFFFF8, desc: INSERT off 117, blkref #0: rel 1663/16402/17028 blk 1110
rmgr: Heap len (rec/tot): 77/ 77, tx: 1964, lsn: 1/480000C0, prev 1/48000070, desc: INSERT off 7, blkref #0: rel 1663/16402/17031 blk 1111
rmgr: Heap len (rec/tot): 77/ 77, tx: 1964, lsn: 1/48000110, prev 1/480000C0, desc: INSERT off 8, blkref #0: rel 1663/16402/17031 blk 1111
rmgr: Heap len (rec/tot): 77/ 77, tx: 1964, lsn: 1/48000160, prev 1/48000110, desc: INSERT off 9, blkref #0: rel 1663/16402/17031 blk 1111
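Note the first record: its previous LSN is 1/47FFFFF8 (prev 1/47FFFFF8), which tells us that the last XLOG Record of the previous page crosses the page boundary and continues on this page, directly after the XLogLongPageHeaderData. The space taken by that continuation can be derived from this record's LSN (1/48000070) and the size of XLogLongPageHeaderData (40 bytes); interested readers can work it out themselves.
Note: for how LSNs are calculated, see PostgreSQL DBA(15) - WAL文件结构 (WAL file structure).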
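The calculation mentioned above can be sketched in a few lines of Python. The 16 MB segment size is taken from the file listing and the 40-byte header size from the text; the 8-byte alignment and the 77-byte record length in the last step are assumptions used only to show that the numbers are plausible, not facts read from the log.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024   # 16777216, as in the directory listing
LONG_PAGE_HEADER_SIZE = 40            # XLogLongPageHeaderData, per the text
MAXALIGN = 8                          # assumption: 8-byte alignment


def parse_lsn(lsn):
    """Convert 'hi/lo' hexadecimal LSN text into a 64-bit byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def maxalign(n):
    """Round n up to the next multiple of MAXALIGN."""
    return (n + MAXALIGN - 1) & ~(MAXALIGN - 1)


def carried_over_bytes(first_lsn):
    """Bytes between the long page header and the first record of a segment."""
    return parse_lsn(first_lsn) % WAL_SEGMENT_SIZE - LONG_PAGE_HEADER_SIZE


def bytes_before_boundary(prev_lsn):
    """Bytes of the previous record that still fit in the old segment."""
    return WAL_SEGMENT_SIZE - parse_lsn(prev_lsn) % WAL_SEGMENT_SIZE


if __name__ == "__main__":
    print(carried_over_bytes("1/48000070"))      # continuation area on this page
    print(bytes_before_boundary("1/47FFFFF8"))   # part left on the previous page
    # Hypothetical check: a 77-byte record (the size of the INSERT records
    # shown above) that starts at 1/47FFFFF8 would need this much space here.
    print(maxalign(77 - bytes_before_boundary("1/47FFFFF8")))
```

Under these assumptions the continuation area is 112 - 40 = 72 bytes, which is exactly what a 77-byte record starting 8 bytes before the segment boundary would occupy after alignment.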
Inspect the XLOG Records after the redo point.
First, use the pg_controldata command to find the redo point --> 1/484336A0
[xdb@localhost pg_wal]$ pg_controldata
pg_control version number: 1100
Catalog version number: 201809051
Database system identifier: 6624362124887945794
Database cluster state: in production
pg_control last modified: Thu 20 Dec 2018 12:17:39 PM CST
Latest checkpoint location: 1/484336D8
Latest checkpoint's REDO location: 1/484336A0
Latest checkpoint's REDO WAL file: 000000010000000100000048
Latest checkpoint's TimeLineID: 1
...
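The "REDO WAL file" shown above follows directly from the REDO location and the timeline. As a rough sketch, assuming the default 16 MB segment size seen in the directory listing, the file name can be rebuilt like this (wal_file_name is an illustrative name, not a PostgreSQL API):

```python
def wal_file_name(timeline, lsn, segment_size=16 * 1024 * 1024):
    """Return the 24-character WAL segment file name that contains lsn."""
    hi, lo = lsn.split("/")
    position = (int(hi, 16) << 32) | int(lo, 16)   # absolute byte position
    segno = position // segment_size               # global segment number
    segs_per_xlogid = 0x100000000 // segment_size  # segments per 4 GB "xlogid"
    return "%08X%08X%08X" % (timeline,
                             segno // segs_per_xlogid,
                             segno % segs_per_xlogid)


if __name__ == "__main__":
    # REDO location and TimeLineID taken from the pg_controldata output above.
    print(wal_file_name(1, "1/484336A0"))
```

For the values above this yields 000000010000000100000048, the same file that pg_controldata reports.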
Then inspect the records with pg_waldump.
Command: pg_waldump -p ./ -s 1/484336A0
[xdb@localhost pg_wal]$ pg_waldump -p ./ -s 1/484336A0
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 1/484336A0, prev 1/48433668, desc: RUNNING_XACTS nextXid 1971 latestCompletedXid 1970 oldestRunningXid 1971
rmgr: XLOG len (rec/tot): 106/ 106, tx: 0, lsn: 1/484336D8, prev 1/484336A0, desc: CHECKPOINT_ONLINE redo 1/484336A0; tli 1; prev tli 1; fpw true; xid 0:1971; oid 17046; multi 1; offset 0; oldest xid 561 in DB 16402; oldest multi 1 in DB 16402; oldest/newest commit timestamp xid: 0/0; oldest running xid 1971; online
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 1/48433748, prev 1/484336D8, desc: RUNNING_XACTS nextXid 1971 latestCompletedXid 1970 oldestRunningXid 1971
pg_waldump: FATAL: error in WAL record at 1/48433748: invalid record length at 1/48433780: wanted 24, got 0
[xdb@localhost pg_wal]$
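For further processing, each record line can be split into its fields. The sketch below is based only on the line layout visible in the output above; the regular expression and the parse_record name are illustrative and may need adjusting for other PostgreSQL versions. Lines that are not records, such as the final FATAL message (which here most likely just marks the end of the valid WAL), are returned as None.

```python
import re

# Field layout inferred from the sample output above.
RECORD_RE = re.compile(
    r"rmgr:\s+(?P<rmgr>\S+)\s+"
    r"len \(rec/tot\):\s*(?P<rec>\d+)/\s*(?P<tot>\d+),\s+"
    r"tx:\s*(?P<xid>\d+),\s+"
    r"lsn:\s*(?P<lsn>[0-9A-F]+/[0-9A-F]+),\s+"
    r"prev\s+(?P<prev>[0-9A-F]+/[0-9A-F]+),\s+"
    r"desc:\s*(?P<desc>.*)"
)


def parse_record(line):
    """Parse one pg_waldump record line into a dict, or None if it is not one."""
    m = RECORD_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    for key in ("rec", "tot", "xid"):
        fields[key] = int(fields[key])   # numeric fields as integers
    return fields


if __name__ == "__main__":
    sample = ("rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 1/484336A0, "
              "prev 1/48433668, desc: RUNNING_XACTS nextXid 1971 "
              "latestCompletedXid 1970 oldestRunningXid 1971")
    print(parse_record(sample))
```

With records parsed this way, it is easy to check that each record's prev field equals the lsn of the record printed before it, as is the case in the output above.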
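3. References
PG 11 Documentation: pg_waldump
PostgreSQL 源码解读(109)- WAL#5(相关数据结构) (PostgreSQL source code walkthrough (109) - WAL#5, related data structures)
PostgreSQL DBA(15) - WAL文件结构 (WAL file structure)
PostgreSQL DBA(16) - WAL segment file内部结构 (internal structure of a WAL segment file)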