原文链接:https://www.depesz.com/2020/01/29/which-tables-should-be-auto-vacuumed-or-auto-analyzed/,译者(guobo507)水平有限,不足之处,请将就看吧,哈哈。文中所有命令及输出,全部照搬原文,等有时间了我再自己验证的,各位看官请轻喷。
让我们开始吧!
最近,我正在烦恼自动回收无法跟上数据库表变化的情况。为了解决该问题,我最终决定手动对所有表进行 vacuum 和 analyze(手动 vacuum/analyze 比通过 autovacuum 守护程序自动运行更快)。
但是令我感到恼火的是,我对如何检查哪些表正在等待自动清理工作没有好的方法。
所以,我写了这篇文章。
首先,让我们从最基础的地方开始。在确定是否应该进行 VACUUM 和/或 ANALYZE 时,自动清理(Autovacuum)会查看以下参数配置:
- 参数
autovacuum_analyze_scale_factor
- 参数
autovacuum_analyze_threshold
- 参数
autovacuum_vacuum_scale_factor
- 参数
autovacuum_vacuum_threshold
pg_class
视图中的reltuples
列的值(这是通过 vacuum 更新的行数的估计值)。pg_stat_all_tables
视图中n_dead_tup
列中的值(这是尚未清除的死行数)。pg_stat_all_tables
视图的n_mod_since_analyze
列中的值(自上次分析以来已修改的行数)。
在我的测试案例中,配置参数的值为:
=$ SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum_analyze_scale_factor', 'autovacuum_analyze_threshold', 'autovacuum_vacuum_scale_factor', 'autovacuum_vacuum_threshold')
ORDER BY name;
name | setting
---------------------------------+---------
autovacuum_analyze_scale_factor | 0.1
autovacuum_analyze_threshold | 50
autovacuum_vacuum_scale_factor | 0.2
autovacuum_vacuum_threshold | 50
(4 ROWS)
规则很简单
- 如果
n_dead_tup
大于reltuples * autovacuum_vacuum_scale_factor + autovacuum_vacuum_threshold
,则运行 vacuum 操作。 - 如果
n_mod_since_analyze
大于reltuples * autovacuum_analyze_scale_factor + autovacuum_analyze_threshold
,则运行 analyze 操作。
让我们做一些数学运算。假设我的表有 10000 行,如果我删除(或更新)10000 x 0.2 + 50 = 2050 行,它将被执行 vacuum 操作。如果我更新(或插入)10000 x 0.1 + 50 = 1050 行,它将被执行 analyze 操作。非常简单,是吧!
有了数学运算,让我们看看这些值实际上是如何变化的。首先我们创建一些测试表:
=$ CREATE TABLE t1 ( id INT4 generated always AS IDENTITY, payload TEXT, PRIMARY KEY (id));
=$ CREATE TABLE t2 ( id INT4 generated always AS IDENTITY, payload TEXT, PRIMARY KEY (id));
=$ CREATE TABLE t3 ( id INT4 generated always AS IDENTITY, payload TEXT, PRIMARY KEY (id));
创建后立即就有明显的统计数据产生:
=$ SELECT c.relname, c.reltuples, s.n_dead_tup, s.n_mod_since_analyze
FROM pg_class c LEFT OUTER JOIN pg_stat_all_tables s ON c.oid = s.relid
WHERE c.relname ~ '^t[123]$' ORDER BY 1;
relname | reltuples | n_dead_tup | n_mod_since_analyze
---------+-----------+------------+---------------------
t1 | 0 | 0 | 0
t2 | 0 | 0 | 0
t3 | 0 | 0 | 0
(3 ROWS)
OK。让我们加载一些数据到测试表中:
=$ copy t1 (payload) FROM '/usr/share/dict/words';
COPY 102401
=$ copy t2 (payload) FROM '/usr/share/dict/words';
COPY 102401
=$ copy t3 (payload) FROM '/usr/share/dict/words';
COPY 102401
紧接着,在禁用自动真空的情况下,你会发现统计信息会更加有趣:
=$ SELECT c.relname, c.reltuples, s.n_dead_tup, s.n_mod_since_analyze
FROM pg_class c LEFT OUTER JOIN pg_stat_all_tables s ON c.oid = s.relid
WHERE c.relname ~ '^t[123]$' ORDER BY 1;
relname | reltuples | n_dead_tup | n_mod_since_analyze
---------+-----------+------------+---------------------
t1 | 0 | 0 | 102401
t2 | 0 | 0 | 0
t3 | 0 | 0 | 0
(3 ROWS)
请注意,reltuples
仍显示为 0,因为这个值是通过 vacuum 操作来更新的(当前 autovacuum 被禁用了)。
让我们更新一些行:
=$ UPDATE t1 SET payload = 'new ' || payload WHERE random() < 0.01;
UPDATE 1068
=$ UPDATE t2 SET payload = 'new ' || payload WHERE random() < 0.1;
UPDATE 10135
=$ UPDATE t3 SET payload = 'new ' || payload WHERE random() < 0.5;
UPDATE 51028
=$ SELECT c.relname, c.reltuples, s.n_dead_tup, s.n_mod_since_analyze
FROM pg_class c LEFT OUTER JOIN pg_stat_all_tables s ON c.oid = s.relid
WHERE c.relname ~ '^t[123]$' ORDER BY 1;
relname | reltuples | n_dead_tup | n_mod_since_analyze
---------+-----------+------------+---------------------
t1 | 0 | 1068 | 103469
t2 | 0 | 10135 | 10135
t3 | 0 | 51028 | 51028
(3 ROWS)
你会发现,n_dead_tup
变为了已更新的行数,并且n_mod_since_analyze
包含了插入的行和更新的行数之和。
最后,我们来测试 DELETE 操作。但是,为了能够清楚地看到它的变化,首先我们对表进行 vacuum/analyze,以便将计数器清零:
=$ vacuum analyze t1, t2, t3;
VACUUM
=$ SELECT c.relname, c.reltuples, s.n_dead_tup, s.n_mod_since_analyze
FROM pg_class c LEFT OUTER JOIN pg_stat_all_tables s ON c.oid = s.relid
WHERE c.relname ~ '^t[123]$' ORDER BY 1;
relname | reltuples | n_dead_tup | n_mod_since_analyze
---------+-----------+------------+---------------------
t1 | 102401 | 0 | 0
t2 | 102401 | 0 | 0
t3 | 102401 | 0 | 0
(3 ROWS)
=$ DELETE FROM t1 WHERE random() < 0.01;
DELETE 999
=$ DELETE FROM t2 WHERE random() < 0.1;
DELETE 10307
=$ DELETE FROM t3 WHERE random() < 0.5;
DELETE 51087
=$ SELECT c.relname, c.reltuples, s.n_dead_tup, s.n_mod_since_analyze
FROM pg_class c LEFT OUTER JOIN pg_stat_all_tables s ON c.oid = s.relid
WHERE c.relname ~ '^t[123]$' ORDER BY 1;
relname | reltuples | n_dead_tup | n_mod_since_analyze
---------+-----------+------------+---------------------
t1 | 102401 | 999 | 999
t2 | 102401 | 10307 | 10307
t3 | 102401 | 51087 | 51087
(3 ROWS)
很好,我们发现n_dead_tup
和n_mod_since_analyze
都增加了。
接下来,让我们写一个查询,它将我们显示哪些表正在等待自动 vacuum 操作。
首先,为了将所有参数放在同一个位置,我将编写如下的 CTE:
WITH s AS (
SELECT
current_setting('autovacuum_analyze_scale_factor')::float8 AS analyze_factor,
current_setting('autovacuum_analyze_threshold')::float8 AS analyze_threshold,
current_setting('autovacuum_vacuum_scale_factor')::float8 AS vacuum_factor,
current_setting('autovacuum_vacuum_threshold')::float8 AS vacuum_threshold
)
然后,基于这个 CTE 查询,我可以获得所有表的列表,以及有关是否应该清理或分析的具体信息:
=$ WITH s AS (
SELECT
current_setting('autovacuum_analyze_scale_factor')::float8 AS analyze_factor,
current_setting('autovacuum_analyze_threshold')::float8 AS analyze_threshold,
current_setting('autovacuum_vacuum_scale_factor')::float8 AS vacuum_factor,
current_setting('autovacuum_vacuum_threshold')::float8 AS vacuum_threshold
)
SELECT
n.nspname,
c.relname,
c.oid AS relid,
t.n_dead_tup,
t.n_mod_since_analyze,
c.reltuples * s.vacuum_factor + s.vacuum_threshold AS v_threshold,
c.reltuples * s.analyze_factor + s.analyze_threshold AS a_threshold
FROM
s,
pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
JOIN pg_stat_all_tables t ON c.oid = t.relid
WHERE
c.relkind = 'r'
AND c.relname ~ '^t[123]$'
ORDER BY c.relname;
nspname | relname | relid | n_dead_tup | n_mod_since_analyze | v_threshold | a_threshold
---------+---------+-------+------------+---------------------+-------------+-------------
public | t1 | 17626 | 999 | 999 | 20530.2 | 10290.1
public | t2 | 17636 | 10307 | 112708 | 20530.2 | 10290.1
public | t3 | 17646 | 51087 | 153488 | 20530.2 | 10290.1
(3 ROWS)
我在调试整个查询时包括了relname
的过滤条件(第 23 行),以避免在调试查询时显示此数据库中的所有表。当调试完成后,条件c.relname〜'^ t [123] $'
将被删除。
所以,我们有表名,以及它们的n_dead_tup
,n_mod_since_analyze
以及 vacuum 和 analyze 的阈值。因此,让我们以更好的方式展示这个查询/结果:
=$ WITH s AS (
SELECT
current_setting('autovacuum_analyze_scale_factor')::float8 AS analyze_factor,
current_setting('autovacuum_analyze_threshold')::float8 AS analyze_threshold,
current_setting('autovacuum_vacuum_scale_factor')::float8 AS vacuum_factor,
current_setting('autovacuum_vacuum_threshold')::float8 AS vacuum_threshold
), tt AS (
SELECT
n.nspname,
c.relname,
c.oid AS relid,
t.n_dead_tup,
t.n_mod_since_analyze,
c.reltuples * s.vacuum_factor + s.vacuum_threshold AS v_threshold,
c.reltuples * s.analyze_factor + s.analyze_threshold AS a_threshold
FROM
s,
pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
JOIN pg_stat_all_tables t ON c.oid = t.relid
WHERE
c.relkind = 'r'
)
SELECT
nspname,
relname,
relid,
CASE WHEN n_dead_tup > v_threshold THEN 'VACUUM' ELSE '' END AS do_vacuum,
CASE WHEN n_mod_since_analyze > a_threshold THEN 'ANALYZE' ELSE '' END AS do_analyze
FROM
tt
WHERE
n_dead_tup > v_threshold OR
n_mod_since_analyze > a_threshold
ORDER BY nspname, relname;
nspname | relname | relid | do_vacuum | do_analyze
---------+---------+-------+-----------+------------
public | t2 | 17636 | | ANALYZE
public | t3 | 17646 | VACUUM | ANALYZE
(2 ROWS)
该查询可以存储为视图:
=$ CREATE VIEW autovacuum_queue AS
WITH s AS (
SELECT
current_setting('autovacuum_analyze_scale_factor')::float8 AS analyze_factor,
current_setting('autovacuum_analyze_threshold')::float8 AS analyze_threshold,
current_setting('autovacuum_vacuum_scale_factor')::float8 AS vacuum_factor,
current_setting('autovacuum_vacuum_threshold')::float8 AS vacuum_threshold
), tt AS (
SELECT
n.nspname,
c.relname,
c.oid AS relid,
t.n_dead_tup,
t.n_mod_since_analyze,
c.reltuples * s.vacuum_factor + s.vacuum_threshold AS v_threshold,
c.reltuples * s.analyze_factor + s.analyze_threshold AS a_threshold
FROM
s,
pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
JOIN pg_stat_all_tables t ON c.oid = t.relid
WHERE
c.relkind = 'r'
)
SELECT
nspname,
relname,
relid,
CASE WHEN n_dead_tup > v_threshold THEN 'VACUUM' ELSE '' END AS do_vacuum,
CASE WHEN n_mod_since_analyze > a_threshold THEN 'ANALYZE' ELSE '' END AS do_analyze
FROM
tt
WHERE
n_dead_tup > v_threshold OR
n_mod_since_analyze > a_threshold;
CREATE VIEW
现在,我可以通过此视图,并附加一些其他条件/信息,来展示我希望看到的内容:
=$ SELECT *,
pg_relation_size(relid),
pg_total_relation_size(relid)
FROM autovacuum_queue
ORDER BY pg_total_relation_size(relid) DESC;
nspname | relname | relid | do_vacuum | do_analyze | pg_relation_size | pg_total_relation_size
---------+---------+-------+-----------+------------+------------------+------------------------
public | t3 | 17646 | VACUUM | ANALYZE | 7143424 | 11804672
public | t2 | 17636 | | ANALYZE | 5128192 | 8011776
(2 ROWS)
好了,希望您会发现它真的非常有用。