PostgreSQL索引不用于IP范围查询

编程入门 行业动态 更新时间:2024-10-27 16:24:22
本文介绍了PostgreSQL索引不用于IP范围查询的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述

我正在使用PostgreSQL 9.2,并具有IP范围表.这是SQL:

I'm using PostgreSQL 9.2 and have a table of IP ranges. Here's the SQL:

CREATE TABLE ips ( id serial NOT NULL, begin_ip_num bigint, end_ip_num bigint, country_name character varying(255), CONSTRAINT ips_pkey PRIMARY KEY (id ) )

我在begin_ip_num和end_ip_num上都添加了普通的B树索引:

I've added plain B-tree indices on both begin_ip_num and end_ip_num:

CREATE INDEX index_ips_on_begin_ip_num ON ips (begin_ip_num); CREATE INDEX index_ips_on_end_ip_num ON ips (end_ip_num );

正在使用的查询是:

SELECT ips.* FROM ips WHERE 3065106743 BETWEEN begin_ip_num AND end_ip_num;

问题是我的BETWEEN查询仅使用begin_ip_num上的索引.使用索引后,它将使用end_ip_num过滤结果.这是EXPLAIN ANALYZE结果:

The problem is that my BETWEEN query is only using the index on begin_ip_num. After using the index, it filters the result using end_ip_num. Here's the EXPLAIN ANALYZE result:

Index Scan using index_ips_on_begin_ip_num on ips (cost=0.00..2173.83 rows=27136 width=76) (actual time=16.349..16.350 rows=1 loops=1) Index Cond: (3065106743::bigint >= begin_ip_num) Filter: (3065106743::bigint <= end_ip_num) Rows Removed by Filter: 47596 Total runtime: 16.425 ms

我已经尝试了各种索引组合,包括在begin_ip_num和end_ip_num上都添加了复合索引.

I've already tried various combinations of indices including adding a composite index on both begin_ip_num and end_ip_num.

推荐答案

尝试多列索引,但第二列的顺序相反:

Try a multicolumn index, but with reversed order on the second column:

CREATE INDEX index_ips_begin_end_ip_num ON ips (begin_ip_num, end_ip_num DESC);

对于单列索引,排序几乎是无关紧要的,因为它可以几乎一样快地向后扫描.但这对于多列索引很重要.

Ordering is mostly irrelevant for a single-column index, since it can be scanned backwards almost as fast. But it is important for multicolumn indexes.

使用我建议的索引,Postgres可以扫描第一列并找到地址,其余索引满足第一个条件.然后,对于第一列的每个值,它可以返回满足第二个条件的所有行,直到第一个条件失败.然后跳转到第一列的下一个值,等等. 这仍然 效率不高 ,而Postgres可能会更快,只需扫描第一个索引列并过滤第二个索引列即可.很大程度上取决于您的数据分布.

With the index I propose, Postgres can scan the first column and find the address, where the rest of the index fulfills the first condition. Then it can, for each value of the first column, return all rows that fulfill the second condition, until the first one fails. Then jump to the next value of the first column, etc. This is still not very efficient and Postgres may be faster just scanning the first index column and filtering for the second. Very much depends on your data distribution.

无论哪种方式, CLUSTER 都使用来自上面可以帮助提高性能:

Either way, CLUSTER using the multicolumn index from above can help performance:

CLUSTER ips USING index_ips_begin_end_ip_num

这样,将满足您的第一个条件的候选者打包到相同或相邻的数据页上.如果第一列的每个值都有很多行,则可以极大地提高性能.否则效果不佳. (也有用于此目的的非阻塞外部工具: pg_repack 或 pg_squeeze .)

This way, candidates fulfilling your first condition are packed onto the same or adjacent data pages. Can help performance a lot with if you have lots of rows per value of the first column. Else it is hardly effective. (There are also non-blocking external tools for the purpose: pg_repack or pg_squeeze.)

此外,自动清理还是已正确配置或正在运行?在桌子上运行ANALYZE?您需要Postgres的当前统计信息来选择合适的查询计划.

Also, is autovacuum running and configured properly or have you run ANALYZE on the table? You need current statistics for Postgres to pick appropriate query plans.

真正有用的是 GiST索引表示int8range列,自PostgreSQL 9.2起可用.

What would really help here is a GiST index for a int8range column, available since PostgreSQL 9.2.

进一步阅读:

  • 针对一系列时间戳(两列)优化查询

如果,您的IP范围可以使用内置网络类型inet或cidr ,请考虑替换您的两个bigint列.或者,更好的是,查看Andrew Gierth的附加模块 ip4r (不在标准分布中.索引策略也会相应更改.

If your IP ranges can be covered with one of the built-in network types inet or cidr, consider to replace your two bigint columns. Or, better yet, look to the additional module ip4r by Andrew Gierth (not in the standard distribution. The indexing strategy changes accordingly.

除非如此,否则您可以使用带有部分索引的复杂机制,在dba.SE上查看此相关答案.进阶的东西,但它提供出色的性能:

Barring that, you can check out this related answer on dba.SE with using a sophisticated regime with partial indexes. Advanced stuff, but it delivers great performance:

  • 空间索引可以帮助范围-限制-排序"查询

更多推荐

PostgreSQL索引不用于IP范围查询

本文发布于:2023-10-17 02:47:03,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1499546.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:索引   PostgreSQL   IP

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!