此处继续其他问题:
如何获取date_part查询以命中索引?
在执行以下查询时,它命中了复合索引在datelocal,视图,印象,性别,年龄组字段上创建:
When executing the following query, it hits a compound index I created on the datelocal, views, impressions, gender, agegroup fields:
SELECT date_part('hour', datelocal) AS hour , SUM(views) FILTER (WHERE gender = 'male') AS male , SUM(views) FILTER (WHERE gender = 'female') AS female FROM reportimpression WHERE datelocal >= '2019-02-01' AND datelocal < '2019-03-01' GROUP BY 1 ORDER BY 1;但是,我也希望能够根据例如,在哪里:
However, I'd like to be able to also filter this query down based on additional clauses in the WHERE, for example:
SELECT date_part('hour', datelocal) AS hour , SUM(views) FILTER (WHERE gender = 'male') AS male , SUM(views) FILTER (WHERE gender = 'female') AS female FROM reportimpression WHERE datelocal >= '2019-02-01' AND datelocal < '2019-03-01' AND network LIKE '%' GROUP BY 1 ORDER BY 1;第二个查询比第一个查询慢很多,尽管它在更少的记录上运行
This second query is MUCH slower than the first, although it should be operating on far fewer records, in addition to the fact that it doesn't hit my index.
表模式:
CREATE TABLE reportimpression ( datelocal timestamp without time zone, devicename text, network text, sitecode text, advertisername text, mediafilename text, gender text, agegroup text, views integer, impressions integer, dwelltime numeric ); -- Indices ------------------------------------------------------- CREATE INDEX reportimpression_datelocal_index ON reportimpression(datelocal timestamp_ops); CREATE INDEX reportimpression_viewership_index ON reportimpression(datelocal timestamp_ops,views int4_ops,impressions int4_ops,gender text_ops,agegroup text_ops); CREATE INDEX reportimpression_test_index ON reportimpression(datelocal timestamp_ops,(date_part('hour'::text, datelocal)) float8_ops);分析输出:
Finalize GroupAggregate (cost=1005368.37..1005385.70 rows=3151 width=24) (actual time=70615.636..70615.649 rows=24 loops=1) Group Key: (date_part('hour'::text, datelocal)) -> Sort (cost=1005368.37..1005369.94 rows=3151 width=24) (actual time=70615.631..70615.634 rows=48 loops=1) Sort Key: (date_part('hour'::text, datelocal)) Sort Method: quicksort Memory: 28kB -> Gather (cost=1005005.62..1005331.75 rows=3151 width=24) (actual time=70615.456..70641.208 rows=48 loops=1) Workers Planned: 1 Workers Launched: 1 -> Partial HashAggregate (cost=1004005.62..1004016.65 rows=3151 width=24) (actual time=70613.132..70613.152 rows=24 loops=2) Group Key: date_part('hour'::text, datelocal) -> Parallel Seq Scan on reportimpression (cost=0.00..996952.63 rows=2821195 width=17) (actual time=0.803..69876.914 rows=2429159 loops=2) Filter: ((datelocal >= '2019-02-01 00:00:00'::timestamp without time zone) AND (datelocal < '2019-03-01 00:00:00'::timestamp without time zone) AND (network ~~ '%'::text)) Rows Removed by Filter: 6701736 Planning time: 0.195 ms Execution time: 70641.349 ms我是否需要创建其他索引,调整SELECT或完全其他?
Do I need to create additional indexes, tweak my SELECT, or something else entirely?
推荐答案您添加的谓词使用 Like 运算符:
Your added predicate uses the LIKE operator:
AND network LIKE '%'实际查询计划取决于您通过的内容而不是'%'。 但是,通常,纯btree索引对此没有用。您将需要一个三字母组合索引或使用文本搜索基础结构或类似工具,具体取决于您要寻找的模式。
The actual query plan depends on what you pass instead of '%'. But, generally, plain btree indexes are useless for this. You'll need a trigram index or use the text search infrastructure or similar, depending on what patterns you might be looking for.
请参阅:
- PostgreSQL喜欢的查询性能差异
- 与LIKE,SIMILAR TO或正则表达式中的模式匹配PostgreSQL
- PostgreSQL LIKE query performance variations
- Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL
您甚至可以结合使用多种索引策略。示例:
You might even combine multiple indexing strategies. Example:
- PostgreSQL:查找最接近给定句子的句子
- PostgreSQL: Find sentences closest to a given sentence
如果应该这样:
AND network = '<input_string>'然后,一定要使用 = 运算符,而不是 Like 。按重要性升序排列的原因:
then, by all means, actually use the = operator, not LIKE. Reasons in ascending order of importance:
正确
correct
不小心包含特殊字符的字符串,可能会得到错误的结果。请参阅:
If you pass a string with special characters inadvertently, you might get incorrect results. See:
- 用于正则表达式或LIKE模式的转义函数
- Escape function for regular expression or LIKE patterns
更多推荐
WHERE子句不同组合的索引策略,包括文字样式
发布评论