环境部署:
1.CentOS7最小安装+PG5.5源代码(使用最新版本的PG,在tap监控时存在问题)
2.编译PG代码的依赖(开启DebugInfo包搜索):systemtap、systemtap-sdt-devel.x86_64、readline-static.x86_64、zlib-static.x86_64、kernel-devel-version【依赖perl-core.x86_64】、kernel-debuginfo-version、kernel-debuginfo-common-version
3.安装PG时的配置项‘./configure --enable-dtrace --enable-debug --enable-cassert’
4.测试systemtap安装情况:stap -v -e ‘probe vfs.read {printf("read performed\n"); exit()}‘
校准参数:
seq_page_cost(stap), cpu_tuple_cost(公式)
步骤:
1.创建测试表,插入测试数据,并刷新统计信息
①create table tbl_cost_align (id int, info text, crt_time timestamp);
②insert into tbl_cost_align select (random()*2000000000)::int, md5(random()::text), clock_timestamp() from generate_series(1,100000);
③insert into tbl_cost_align select (random()*2000000000)::int, md5(random()::text), clock_timestamp() from generate_series(1,10000000);
④analyze tbl_cost_align;
2.CHECKPOINT,并关闭数据库
3.将操作系统的缓存刷入硬盘 sync; echo 3 > /proc/sys/vm/drop_caches
4.以非0号cpu亲和启动数据库 taskset -c 1 /usr/local/pgsql/bin/postgres >/dev/null 2>&1
5.连接数据库,查看pid:select pg_backend_pid();
6.启动监控代码:
taskset -c 0 stap -e ‘ global a probe process("/usr/local/pgsql/bin/postgres").mark("query__start") { delete a println("query__start ", user_string($arg1), "pid:", pid()) } probe vfs.read.return { t = gettimeofday_ns() - @entry(gettimeofday_ns()) # if (execname() == "postgres" && devname != "N/A") a[pid()] <<< t } probe process("/usr/local/pgsql/bin/postgres").mark("query__done") { if (@count(a[pid()])) printdln("**", pid(), @count(a[pid()]), @avg(a[pid()])) println("query__done ", user_string($arg1), "pid:", pid()) if (@count(a[pid()])) { println(@hist_log(a[pid()])) #println(@hist_linear(a[pid()],1024,4096,100)) } delete a }‘ -x 18026
7.执行SQL:explain (analyze,verbose,costs,buffers,timing) select * from tbl_cost_align;
8.查询结束,观察stap及explain输出结果:
QUERY PLAN ------------------------------------------------------------------------------------------------------------------ --------------------- Seq Scan on public.tbl_cost_align (cost=0.00..195391.91 rows=10099891 width=45) (actual time=5.557..17236.458 ro ws=10100000 loops=1) Output: id, info, crt_time Buffers: shared read=94393【注意这个read指的是未命中shared buffer, 如果是命中的话会有hit=?】 Planning time: 46.697 ms Execution time: 27467.293 ms (5 rows)
2558**81109**13444【io平均响应时间】 query__done explain (analyze,verbose,costs,buffers,timing) select * from tbl_cost_align;pid:2558 value |-------------------------------------------------- count 512 | 0 1024 | 0 2048 |@@@@@@@@@@@ 14011 4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 61227 8192 | 694 16384 | 96 32768 |@@@ 4787 65536 | 263 131072 | 4 262144 | 0 524288 | 0 1048576 | 0 2097152 | 1 4194304 | 14 8388608 | 1 16777216 | 9 33554432 | 1 67108864 | 1 134217728 | 0 268435456 | 0
shuai=# show seq_page_cost; seq_page_cost --------------- 1 (1 row) shuai=# shuai=# show cpu_tuple_cost; cpu_tuple_cost ---------------- 0.01 (1 row)
9.验证公式:195391.91= (shared read=)94393*1(seq_page_cost) +(rows=)10099891*0.01(cpu_tuple_cost)
10.seq_page_cost=0.013444,真是的执行时间为17236.458,所以:17236.458 = 94393*0.013444+10100000*cpu_tuple_cost,得到cpu_tuple_cost=0.00158093450574257426
10.重启数据库,刷新系统缓存,调整成本因子后再次检查执行计划:
shuai=# set seq_page_cost = 0.013444 shuai-# ; SET shuai=# set cpu_tuple_cost = 0.00158093450574257426; SET shuai=# explain (analyze,verbose,costs,buffers,timing) select * from tbl_cost_align; QUERY PLAN ------------------------------------------------------------------------------------------------------------------ --------------------- Seq Scan on public.tbl_cost_align (cost=0.00..17236.29 rows=10099891 width=45) (actual time=24.497..18461.308 ro ws=10100000 loops=1) Output: id, info, crt_time Buffers: shared read=94393 Planning time: 93.559 ms Execution time: 28700.791 ms (5 rows)
原文:http://www.cnblogs.com/pg-libs/p/6225090.html