WHERE子句不同组合的索引策略,包括文字样式

编程入门 行业动态 更新时间:2024-10-24 16:26:29
本文介绍了WHERE子句不同组合的索引策略,包括文字样式的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述

此处继续其他问题:

如何获取date_part查询以命中索引?

在执行以下查询时,它命中了复合索引在datelocal,视图,印象,性别,年龄组字段上创建:

When executing the following query, it hits a compound index I created on the datelocal, views, impressions, gender, agegroup fields:

SELECT date_part('hour', datelocal) AS hour , SUM(views) FILTER (WHERE gender = 'male') AS male , SUM(views) FILTER (WHERE gender = 'female') AS female FROM reportimpression WHERE datelocal >= '2019-02-01' AND datelocal < '2019-03-01' GROUP BY 1 ORDER BY 1;

但是,我也希望能够根据例如,在哪里:

However, I'd like to be able to also filter this query down based on additional clauses in the WHERE, for example:

SELECT date_part('hour', datelocal) AS hour , SUM(views) FILTER (WHERE gender = 'male') AS male , SUM(views) FILTER (WHERE gender = 'female') AS female FROM reportimpression WHERE datelocal >= '2019-02-01' AND datelocal < '2019-03-01' AND network LIKE '%' GROUP BY 1 ORDER BY 1;

第二个查询比第一个查询慢很多,尽管它在更少的记录上运行

This second query is MUCH slower than the first, although it should be operating on far fewer records, in addition to the fact that it doesn't hit my index.

表模式:

CREATE TABLE reportimpression ( datelocal timestamp without time zone, devicename text, network text, sitecode text, advertisername text, mediafilename text, gender text, agegroup text, views integer, impressions integer, dwelltime numeric ); -- Indices ------------------------------------------------------- CREATE INDEX reportimpression_datelocal_index ON reportimpression(datelocal timestamp_ops); CREATE INDEX reportimpression_viewership_index ON reportimpression(datelocal timestamp_ops,views int4_ops,impressions int4_ops,gender text_ops,agegroup text_ops); CREATE INDEX reportimpression_test_index ON reportimpression(datelocal timestamp_ops,(date_part('hour'::text, datelocal)) float8_ops);

分析输出:

Finalize GroupAggregate (cost=1005368.37..1005385.70 rows=3151 width=24) (actual time=70615.636..70615.649 rows=24 loops=1) Group Key: (date_part('hour'::text, datelocal)) -> Sort (cost=1005368.37..1005369.94 rows=3151 width=24) (actual time=70615.631..70615.634 rows=48 loops=1) Sort Key: (date_part('hour'::text, datelocal)) Sort Method: quicksort Memory: 28kB -> Gather (cost=1005005.62..1005331.75 rows=3151 width=24) (actual time=70615.456..70641.208 rows=48 loops=1) Workers Planned: 1 Workers Launched: 1 -> Partial HashAggregate (cost=1004005.62..1004016.65 rows=3151 width=24) (actual time=70613.132..70613.152 rows=24 loops=2) Group Key: date_part('hour'::text, datelocal) -> Parallel Seq Scan on reportimpression (cost=0.00..996952.63 rows=2821195 width=17) (actual time=0.803..69876.914 rows=2429159 loops=2) Filter: ((datelocal >= '2019-02-01 00:00:00'::timestamp without time zone) AND (datelocal < '2019-03-01 00:00:00'::timestamp without time zone) AND (network ~~ '%'::text)) Rows Removed by Filter: 6701736 Planning time: 0.195 ms Execution time: 70641.349 ms

我是否需要创建其他索引,调整SELECT或完全其他?

Do I need to create additional indexes, tweak my SELECT, or something else entirely?

推荐答案

您添加的谓词使用 Like 运算符:

Your added predicate uses the LIKE operator:

AND network LIKE '%'

实际查询计划取决于您通过的内容而不是'%'。 但是,通常,纯btree索引对此没有用。您将需要一个三字母组合索引或使用文本搜索基础结构或类似工具,具体取决于您要寻找的模式。

The actual query plan depends on what you pass instead of '%'. But, generally, plain btree indexes are useless for this. You'll need a trigram index or use the text search infrastructure or similar, depending on what patterns you might be looking for.

请参阅:

  • PostgreSQL喜欢的查询性能差异
  • 与LIKE,SIMILAR TO或正则表达式中的模式匹配PostgreSQL
  • PostgreSQL LIKE query performance variations
  • Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL

您甚至可以结合使用多种索引策略。示例:

You might even combine multiple indexing strategies. Example:

  • PostgreSQL:查找最接近给定句子的句子
  • PostgreSQL: Find sentences closest to a given sentence

如果应该这样:

AND network = '<input_string>'

然后,一定要使用 = 运算符,而不是 Like 。按重要性升序排列的原因:

then, by all means, actually use the = operator, not LIKE. Reasons in ascending order of importance:

  • 较短
  • 较少混淆
  • 使Postgres计划程序的工作更简单(便宜得多)
  • 正确

  • shorter
  • less confusing
  • makes the job for the Postgres planner simpler (very slightly cheaper)
  • correct

    不小心包含特殊字符的字符串,可能会得到错误的结果。请参阅:

    If you pass a string with special characters inadvertently, you might get incorrect results. See:

    • 用于正则表达式或LIKE模式的转义函数
    • Escape function for regular expression or LIKE patterns
  • 更多推荐

    WHERE子句不同组合的索引策略,包括文字样式

    本文发布于:2023-11-29 13:35:27,感谢您对本站的认可!
    本文链接:https://www.elefans.com/category/jswz/34/1646531.html
    版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
    本文标签:子句   组合   样式   索引   策略

    发布评论

    评论列表 (有 0 条评论)
    草根站长

    >www.elefans.com

    编程频道|电子爱好者 - 技术资讯及电子产品介绍!