配置asp网站,微信开发者平台入口,音乐网站开发技术人员配置,现在做一个app大概多少钱一#xff0c;
表和索引的膨胀现象 Postgres SQL 实现的MVCC的机制不同于 oracle #xff0c; mysql innodb 的 undo tablespace 的机制。 表上所用的更新和删除等操作的行为#xff0c;都不会实际的删除或修改#xff0c;而是标记为死元祖 #xff08;dead rows or dead…一
表和索引的膨胀现象 Postgres SQL 实现的MVCC的机制不同于 oracle mysql innodb 的 undo tablespace 的机制。 表上所用的更新和删除等操作的行为都不会实际的删除或修改而是标记为死元祖 dead rows or dead tuples也因此在大表进行长事务增删改的时候表的磁盘使用空间会逐渐变大并且表的读写性能会随着表膨胀的程度加深而逐渐下降。
那么说人话就是表和索引的膨胀会造成两个后果第一是磁盘空间的占用比如某个几百G的大表delete 删除后并不会释放磁盘空间并且在删除的过程中还会引发wal日志的膨胀而数据库服务器的磁盘空间并不是无限的第二个就是会使得表的查询和写入性能下降也就是查询速度降低或者插入/更新速度明显下降。
因此我们在数据库的使用过程中应该避免表膨胀至少是将表膨胀控制在一个合理的可接受的范围内完全的避免表膨胀是确定无疑的不可能。
postgresql数据库对于表膨胀这个问题是有几种处理方式
第一是在postgresql的主配置文件内定义autovacuum也就是让postgresql数据库自己决定何时治理表膨胀
第二手动vacuum 治理表膨胀
第三CLUSTER命令治理表膨胀
第四利用外部插件例如pg_repack 治理表膨胀
第五recreate table or reindex : 相当于重建表和索引。
如果选择重建表的话 是类似于 create table tab_new as select * from tab_old, 然后在 创建相关索引最后进行表名的 rename 切换。还需注意表的权限需要重新赋权。另外这个也是需要应用系统的维护窗口时间的。如果选择重建索引的话 类似于 reindex CONCURRENTLY index_name, 需要注意的是需要2倍的索引存储空间进行online的索引重建。
CLUSTER背后的代码与VACUUM (FULL)相同只是增加了一个排序。因此CLUSTER存在和VACUUM (FULL)一样的问题:
CLUSTER以ACCESS EXCLUSIVE模式锁定表锁定期间阻塞所有操作VACUUM (FULL)也是一样的需要二倍于表的空间进行操作
对于大表来说根据表的数据规模大小很多时候vacuum或者CLUSTER的时候都是几个小时甚至十几个小时而在此期间表被锁住是无法接受的读写都有问题锁表会造成业务的中断。
pg_repack 锁表的时间相对vacuum或者cluster来说是比较少的大概是vacuum的锁表时间的20%因此pg_repack 是一个比较好的选择但pg_repack 的使用仍然是推荐在业务低峰期使用虽然锁表时间大幅减少。
repack 实际上是创建了一张临时表 并在原始表上创建触发器捕获数据变化同步到临时表中 并在临时表中重新创建索引最后进行临时表和原始表的切换。 工作原理和mysql 的 pt-online-schema-change 的工具是十分类似的.
下面将就pg_repack 的部署和基本使用做一个介绍
二
表膨胀治理的时机
在表膨胀治理之前我们需要了解哪些表需要治理在表膨胀治理之后我们需要清楚的知道具体治理了多少表膨胀
监控数据库级别的膨胀Show database bloat - PostgreSQL wiki
SQL语句如下 这个SQL语句比较简略主要关注上图标识的这两行wastedbytes的值越大表明表膨胀越严重
SELECTcurrent_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/ROUND((CASE WHEN otta0 THEN 0.0 ELSE sml.relpages::float/otta END)::numeric,1) AS tbloat,CASE WHEN relpages otta THEN 0 ELSE bs*(sml.relpages-otta)::BIGINT END AS wastedbytes,iname, /*ituples::bigint, ipages::bigint, iotta,*/ROUND((CASE WHEN iotta0 OR ipages0 THEN 0.0 ELSE ipages::float/iotta END)::numeric,1) AS ibloat,CASE WHEN ipages iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
FROM (SELECTschemaname, tablename, cc.reltuples, cc.relpages, bs,CEIL((cc.reltuples*((datahdrma-(CASE WHEN datahdr%ma0 THEN ma ELSE datahdr%ma END))nullhdr24))/(bs-20::float)) AS otta,COALESCE(c2.relname,?) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all colsFROM (SELECTma,bs,schemaname,tablename,(datawidth(hdrma-(case when hdr%ma0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,(maxfracsum*(nullhdrma-(case when nullhdr%ma0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2FROM (SELECTschemaname, tablename, hdr, ma, bs,SUM((1-null_frac)*avg_width) AS datawidth,MAX(null_frac) AS maxfracsum,hdr(SELECT 1count(*)/8FROM pg_stats s2WHERE null_frac0 AND s2.schemaname s.schemaname AND s2.tablename s.tablename) AS nullhdrFROM pg_stats s, (SELECT(SELECT current_setting(block_size)::numeric) AS bs,CASE WHEN substring(v,12,3) IN (8.0,8.1,8.2) THEN 27 ELSE 23 END AS hdr,CASE WHEN v ~ mingw32 THEN 8 ELSE 4 END AS maFROM (SELECT version() AS v) AS foo) AS constantsGROUP BY 1,2,3,4,5) AS foo) AS rsJOIN pg_class cc ON cc.relname rs.tablenameJOIN pg_namespace nn ON cc.relnamespace nn.oid AND nn.nspname rs.schemaname AND nn.nspname information_schemaLEFT JOIN pg_index i ON indrelid cc.oidLEFT JOIN pg_class c2 ON c2.oid i.indexrelid
) AS sml
ORDER BY wastedbytes DESC
监控表级别的膨胀
https://github.com/ioguix/pgsql-bloat-estimation/blob/master/table/table_bloat.sql
注稍作修改屏蔽了系统库 该SQL语句执行完后重点关注上图标识的地方就行了blooat_pct 越大表明表膨胀越严重
/* WARNING: executed with a non-superuser role, the query inspect only tables and materialized view (9.3) you are granted to read.
* This query is compatible with PostgreSQL 9.0 and more
*/
SELECT current_database(), schemaname, tblname, bs*tblpages AS real_size,(tblpages-est_tblpages)*bs AS extra_size,CASE WHEN tblpages 0 AND tblpages - est_tblpages 0THEN 100 * (tblpages - est_tblpages)/tblpages::floatELSE 0END AS extra_pct, fillfactor,CASE WHEN tblpages - est_tblpages_ff 0THEN (tblpages-est_tblpages_ff)*bsELSE 0END AS bloat_size,CASE WHEN tblpages 0 AND tblpages - est_tblpages_ff 0THEN 100 * (tblpages - est_tblpages_ff)/tblpages::floatELSE 0END AS bloat_pct, is_na-- , tpl_hdr_size, tpl_data_size, (pst).free_percent (pst).dead_tuple_percent AS real_frag -- (DEBUG INFO)
FROM (SELECT ceil( reltuples / ( (bs-page_hdr)/tpl_size ) ) ceil( toasttuples / 4 ) AS est_tblpages,ceil( reltuples / ( (bs-page_hdr)*fillfactor/(tpl_size*100) ) ) ceil( toasttuples / 4 ) AS est_tblpages_ff,tblpages, fillfactor, bs, tblid, schemaname, tblname, heappages, toastpages, is_na-- , tpl_hdr_size, tpl_data_size, pgstattuple(tblid) AS pst -- (DEBUG INFO)FROM (SELECT( 4 tpl_hdr_size tpl_data_size (2*ma)- CASE WHEN tpl_hdr_size%ma 0 THEN ma ELSE tpl_hdr_size%ma END- CASE WHEN ceil(tpl_data_size)::int%ma 0 THEN ma ELSE ceil(tpl_data_size)::int%ma END) AS tpl_size, bs - page_hdr AS size_per_block, (heappages toastpages) AS tblpages, heappages,toastpages, reltuples, toasttuples, bs, page_hdr, tblid, schemaname, tblname, fillfactor, is_na-- , tpl_hdr_size, tpl_data_sizeFROM (SELECTtbl.oid AS tblid, ns.nspname AS schemaname, tbl.relname AS tblname, tbl.reltuples,tbl.relpages AS heappages, coalesce(toast.relpages, 0) AS toastpages,coalesce(toast.reltuples, 0) AS toasttuples,coalesce(substring(array_to_string(tbl.reloptions, )FROM fillfactor([0-9]))::smallint, 100) AS fillfactor,current_setting(block_size)::numeric AS bs,CASE WHEN version()~mingw32 OR version()~64-bit|x86_64|ppc64|ia64|amd64 THEN 8 ELSE 4 END AS ma,24 AS page_hdr,23 CASE WHEN MAX(coalesce(s.null_frac,0)) 0 THEN ( 7 count(s.attname) ) / 8 ELSE 0::int END CASE WHEN bool_or(att.attname oid and att.attnum 0) THEN 4 ELSE 0 END AS tpl_hdr_size,sum( (1-coalesce(s.null_frac, 0)) * coalesce(s.avg_width, 0) ) AS tpl_data_size,bool_or(att.atttypid pg_catalog.name::regtype)OR sum(CASE WHEN att.attnum 0 THEN 1 ELSE 0 END) count(s.attname) AS is_naFROM pg_attribute AS attJOIN pg_class AS tbl ON att.attrelid tbl.oidJOIN pg_namespace AS ns ON ns.oid tbl.relnamespaceLEFT JOIN pg_stats AS s ON s.schemanamens.nspnameAND s.tablename tbl.relname AND s.inheritedfalse AND s.attnameatt.attnameLEFT JOIN pg_class AS toast ON tbl.reltoastrelid toast.oidWHERE NOT att.attisdroppedAND tbl.relkind in (r,m)AND schemaname not IN(pg_catalog,information_schema,repack)GROUP BY 1,2,3,4,5,6,7,8,9,10ORDER BY 2,3) AS s) AS s2
) AS s3
-- WHERE NOT is_na
-- AND tblpages*((pst).free_percent (pst).dead_tuple_percent)::float4/100 1
ORDER BY schemaname, tblname; 监控索引级别的膨胀
https://github.com/ioguix/pgsql-bloat-estimation/blob/master/btree/btree_bloat.sql
注稍作修改屏蔽了系统库
-- WARNING: executed with a non-superuser role, the query inspect only index on tables you are granted to read.
-- WARNING: rows with is_na t are known to have bad statistics (name type is not supported).
-- This query is compatible with PostgreSQL 8.2 and after
SELECT current_database(), nspname AS schemaname, tblname, idxname, bs*(relpages)::bigint AS real_size,bs*(relpages-est_pages)::bigint AS extra_size,100 * (relpages-est_pages)::float / relpages AS extra_pct,fillfactor,CASE WHEN relpages est_pages_ffTHEN bs*(relpages-est_pages_ff)ELSE 0END AS bloat_size,100 * (relpages-est_pages_ff)::float / relpages AS bloat_pct,is_na-- , 100-(pst).avg_leaf_density AS pst_avg_bloat, est_pages, index_tuple_hdr_bm, maxalign, pagehdr, nulldatawidth, nulldatahdrwidth, reltuples, relpages -- (DEBUG INFO)
FROM (SELECT coalesce(1 ceil(reltuples/floor((bs-pageopqdata-pagehdr)/(4nulldatahdrwidth)::float)), 0 -- ItemIdData size computed avg size of a tuple (nulldatahdrwidth)) AS est_pages,coalesce(1 ceil(reltuples/floor((bs-pageopqdata-pagehdr)*fillfactor/(100*(4nulldatahdrwidth)::float))), 0) AS est_pages_ff,bs, nspname, tblname, idxname, relpages, fillfactor, is_na-- , pgstatindex(idxoid) AS pst, index_tuple_hdr_bm, maxalign, pagehdr, nulldatawidth, nulldatahdrwidth, reltuples -- (DEBUG INFO)FROM (SELECT maxalign, bs, nspname, tblname, idxname, reltuples, relpages, idxoid, fillfactor,( index_tuple_hdr_bm maxalign - CASE -- Add padding to the index tuple header to align on MAXALIGNWHEN index_tuple_hdr_bm%maxalign 0 THEN maxalignELSE index_tuple_hdr_bm%maxalignEND nulldatawidth maxalign - CASE -- Add padding to the data to align on MAXALIGNWHEN nulldatawidth 0 THEN 0WHEN nulldatawidth::integer%maxalign 0 THEN maxalignELSE nulldatawidth::integer%maxalignEND)::numeric AS nulldatahdrwidth, pagehdr, pageopqdata, is_na-- , index_tuple_hdr_bm, nulldatawidth -- (DEBUG INFO)FROM (SELECT n.nspname, i.tblname, i.idxname, i.reltuples, i.relpages,i.idxoid, i.fillfactor, current_setting(block_size)::numeric AS bs,CASE -- MAXALIGN: 4 on 32bits, 8 on 64bits (and mingw32 ?)WHEN version() ~ mingw32 OR version() ~ 64-bit|x86_64|ppc64|ia64|amd64 THEN 8ELSE 4END AS maxalign,/* per page header, fixed size: 20 for 7.X, 24 for others */24 AS pagehdr,/* per page btree opaque data */16 AS pageopqdata,/* per tuple header: add IndexAttributeBitMapData if some cols are null-able */CASE WHEN max(coalesce(s.null_frac,0)) 0THEN 8 -- IndexTupleData sizeELSE 8 (( 32 8 - 1 ) / 8) -- IndexTupleData size IndexAttributeBitMapData size ( max num filed per index 8 - 1 /8)END AS index_tuple_hdr_bm,/* data len: we remove null values save space using it fractionnal part from stats */sum( (1-coalesce(s.null_frac, 0)) * coalesce(s.avg_width, 1024)) AS nulldatawidth,max( CASE WHEN i.atttypid pg_catalog.name::regtype THEN 1 ELSE 0 END ) 0 AS is_naFROM (SELECT ct.relname AS tblname, ct.relnamespace, ic.idxname, ic.attpos, ic.indkey, ic.indkey[ic.attpos], ic.reltuples, ic.relpages, ic.tbloid, ic.idxoid, ic.fillfactor,coalesce(a1.attnum, a2.attnum) AS attnum, coalesce(a1.attname, a2.attname) AS attname, coalesce(a1.atttypid, a2.atttypid) AS atttypid,CASE WHEN a1.attnum IS NULLTHEN ic.idxnameELSE ct.relnameEND AS attrelnameFROM (SELECT idxname, reltuples, relpages, tbloid, idxoid, fillfactor, indkey,pg_catalog.generate_series(1,indnatts) AS attposFROM (SELECT ci.relname AS idxname, ci.reltuples, ci.relpages, i.indrelid AS tbloid,i.indexrelid AS idxoid,coalesce(substring(array_to_string(ci.reloptions, )from fillfactor([0-9]))::smallint, 90) AS fillfactor,i.indnatts,pg_catalog.string_to_array(pg_catalog.textin(pg_catalog.int2vectorout(i.indkey)), )::int[] AS indkeyFROM pg_catalog.pg_index iJOIN pg_catalog.pg_class ci ON ci.oid i.indexrelidWHERE ci.relam(SELECT oid FROM pg_am WHERE amname btree)AND ci.relpages 0) AS idx_data) AS icJOIN pg_catalog.pg_class ct ON ct.oid ic.tbloidLEFT JOIN pg_catalog.pg_attribute a1 ONic.indkey[ic.attpos] 0AND a1.attrelid ic.tbloidAND a1.attnum ic.indkey[ic.attpos]LEFT JOIN pg_catalog.pg_attribute a2 ONic.indkey[ic.attpos] 0AND a2.attrelid ic.idxoidAND a2.attnum ic.attpos) iJOIN pg_catalog.pg_namespace n ON n.oid i.relnamespaceJOIN pg_catalog.pg_stats s ON s.schemaname n.nspnameAND s.tablename i.attrelnameAND s.attname i.attnameAND schemaname not IN(pg_catalog,information_schema,repack)GROUP BY 1,2,3,4,5,6,7,8,9,10,11) AS rows_data_stats) AS rows_hdr_pdg_stats
) AS relation_stats
ORDER BY nspname, tblname, idxname;
三
pg_repack的部署
pg_repack现在支持到postgresql-16版本下载地址https://github.com/reorg/pg_repack/releases/tag/ver_1.5.0
下载下来的压缩包上传到服务器后进入解压目录
先安装编译依赖
yum install openssl openssl-devel readline readline-devel -y
然后编译三连就好了
make make install
编译日志如下
[rootcentos10 pg_repack-ver_1.5.0]# make
make[1]: Entering directory /root/pg_repack-ver_1.5.0/bin
gcc -stdgnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werrorvla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precisionstandard -O2 pg_repack.o pgut/pgut.o pgut/pgut-fe.o -L/data/pgsql/lib -Wl,--as-needed -Wl,-rpath,/data/pgsql/lib,--enable-new-dtags -L/data/pgsql/lib -lpq -L/data/pgsql/lib -lpgcommon -lpgport -lpthread -lz -lreadline -lrt -lcrypt -ldl -lm -o pg_repack
make[1]: Leaving directory /root/pg_repack-ver_1.5.0/bin
make[1]: Entering directory /root/pg_repack-ver_1.5.0/lib
gcc -stdgnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werrorvla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precisionstandard -O2 -fPIC -DREPACK_VERSION1.5.0 -I. -I./ -I/data/pgsql/include/server -I/data/pgsql/include/internal -D_GNU_SOURCE -c -o repack.o repack.c
gcc -stdgnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werrorvla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precisionstandard -O2 -fPIC -DREPACK_VERSION1.5.0 -I. -I./ -I/data/pgsql/include/server -I/data/pgsql/include/internal -D_GNU_SOURCE -c -o pgut/pgut-spi.o pgut/pgut-spi.c
( echo { global:; gawk /^[^#]/ {printf %s;\n,$1} exports.txt; echo local: *; }; ) exports.list
gcc -stdgnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werrorvla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precisionstandard -O2 -fPIC -shared -Wl,--version-scriptexports.list -o pg_repack.so repack.o pgut/pgut-spi.o -L/data/pgsql/lib -Wl,--as-needed -Wl,-rpath,/data/pgsql/lib,--enable-new-dtags
sed s,REPACK_VERSION,1.5.0,g pg_repack.sql.in \
| sed s,relhasoids,false,g pg_repack--1.5.0.sql;
sed s,REPACK_VERSION,1.5.0,g pg_repack.control.in pg_repack.control
make[1]: Leaving directory /root/pg_repack-ver_1.5.0/lib
make[1]: Entering directory /root/pg_repack-ver_1.5.0/regress
make[1]: Nothing to be done for all.
make[1]: Leaving directory /root/pg_repack-ver_1.5.0/regress
[rootcentos10 pg_repack-ver_1.5.0]# echo $?
0
[rootcentos10 pg_repack-ver_1.5.0]# make install
make[1]: Entering directory /root/pg_repack-ver_1.5.0/bin
/usr/bin/mkdir -p /data/pgsql/bin
/usr/bin/install -c pg_repack /data/pgsql/bin
make[1]: Leaving directory /root/pg_repack-ver_1.5.0/bin
make[1]: Entering directory /root/pg_repack-ver_1.5.0/lib
/usr/bin/mkdir -p /data/pgsql/lib
/usr/bin/mkdir -p /data/pgsql/share/extension
/usr/bin/mkdir -p /data/pgsql/share/extension
/usr/bin/install -c -m 755 pg_repack.so /data/pgsql/lib/pg_repack.so
/usr/bin/install -c -m 644 .//pg_repack.control /data/pgsql/share/extension/
/usr/bin/install -c -m 644 pg_repack--1.5.0.sql pg_repack.control /data/pgsql/share/extension/
make[1]: Leaving directory /root/pg_repack-ver_1.5.0/lib
make[1]: Entering directory /root/pg_repack-ver_1.5.0/regress
make[1]: Nothing to be done for install.
make[1]: Leaving directory /root/pg_repack-ver_1.5.0/regress
[rootcentos10 pg_repack-ver_1.5.0]#
[rootcentos10 pg_repack-ver_1.5.0]# cd
[rootcentos10 ~]# whereis pg_repack
pg_repack: /data/pgsql/bin/pg_repack登录postgresql的命令行激活插件这里是哪个数据库需要此插件就切换到哪个数据库内
比如我需要在名为test的数据库内使用此插件
\c test
create extension pg_repack;
此插件激活后将会看到一个名为repack的scheme和一系列相关函数和两个视图 四 pg_repack的表膨胀治理能力测试
1
创建新库和大表
新库名称为test大表的创建语句如下
大表名称为testpg数据量级为2000w只有两列数据
create or replace function gen_id( a date, b date
)
returns text as $$
select lpad((random()*99)::int::text, 3, 0) || lpad((random()*99)::int::text, 3, 0) || lpad((random()*99)::int::text, 3, 0) || to_char(a (random()*(b-a))::int, yyyymmdd) || lpad((random()*99)::int::text, 3, 0) || random()::int || (case when random()*10 9 then xy else (random()*9)::int::text end ) ;
$$ language sql strict;CREATE SEQUENCE test START 1;
create table if not exists testpg (id int8 not null DEFAULT nextval(test::regclass),CONSTRAINT user_vendorcode_pkey PRIMARY KEY (id),suijishuzi VARCHAR ( 255 ) COLLATE pg_catalog.default
);insert into testpg SELECT generate_series(1,20000000) as xm, gen_id(1949-01-01, 2023-10-16) as num;2
查看大表大小和该表的膨胀情况SQL语句如下
SELECT
table_schema,
table_name,
pg_size_pretty(table_size) AS table_size,
pg_size_pretty(indexes_size) AS indexes_size,
pg_size_pretty(total_size) AS total_size
FROM (
SELECT
table_schema,
table_name,
pg_table_size(table_name) AS table_size,
pg_indexes_size(table_name) AS indexes_size,
pg_total_relation_size(table_name) AS total_size
FROM (
SELECT table_schema,( || table_schema || . || table_name || ) AS table_name
FROM information_schema.tables
) AS all_tables
ORDER BY total_size DESC
) AS pretty_sizes where table_schema not in (pg_catalog,information_schema,repack);
输出如下 使用上面的第二个查询表膨胀的语句结果如下 3
大量更新testpg这个表人为制造表膨胀
UPDATE testpg set suijishuzi123456789 WHERE suijishuzi like%1%查看表大小和膨胀率
可以看到膨胀率达到了45% 4
pg_repack表膨胀治理
[rootcentos10 ~]# su - postgres -c /data/pgsql/bin/pg_repack -d test -t public.testpg
INFO: repacking table public.testpg在膨胀治理期间可以正常的对该表读写治理完毕后查看表大小和表膨胀率
可以看到膨胀治理完全成功 观察磁盘使用可以看到符合本次膨胀治理的结果
[rootcentos10 ~]# du -sh /data/pgsql/data/
4.0G /data/pgsql/data/
[rootcentos10 ~]# du -sh /data/pgsql/data/
2.7G /data/pgsql/data/未完待续