Replace a loop with a single query for INSERT/UPDATE

Updated: 2024-10-08 02:19:49

Problem description

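I am writing a function in PostgreSQL. It basically does 3 steps:

Fetch a record from the source table.

Look up the value from the fetched record in the target table.

If a matching record is found in the target table, update all of its values with the fetched record; otherwise insert the fetched record into the target table.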

Instead of looping, would a single query for the insert/update be faster than the approach above? How can I achieve the same result with a single query, rather than looping through every record and updating or inserting one row at a time?

My current approach is as follows:

CREATE OR REPLACE FUNCTION fun1() RETURNS void AS
$BODY$
DECLARE
   source_tab_row RECORD;
   v_col1 TEXT;
   v_col2 TEXT;
   v_col3 TEXT;
   v_col4 double precision;
   cnt    integer;
BEGIN
   FOR source_tab_row IN (SELECT * FROM source_tab WHERE col5 = 'abc')
   LOOP
      v_col1 = source_tab_row.col1;
      v_col2 = source_tab_row.col2;
      v_col3 = source_tab_row.col3;
      v_col4 = source_tab_row.col4;

      SELECT count(*) INTO cnt FROM dest_tab WHERE col1 = v_col1;

      IF (cnt = 0) THEN
         -- if no record is found, insert it
         INSERT INTO dest_tab(col1, col2, col3, col4)
         VALUES (v_col1, v_col2, v_col3, v_col4);
      ELSE
         -- if a record is found, update it
         UPDATE dest_tab
         SET    col1 = v_col1, col2 = v_col2, col3 = v_col3, col4 = v_col4
         WHERE  col1 = v_col1;
      END IF;
   END LOOP;
END;
$BODY$ LANGUAGE plpgsql;

Recommended answer

Better SQL

If you are using PostgreSQL 9.1 or later, you should definitely use a data-modifying CTE for this:

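WITH x AS (
   -- update rows that already exist in dest_tab and report their keys
   UPDATE dest_tab d
   SET    col2 = s.col2
        , col3 = s.col3
     -- , ...
   FROM   source_tab s
   WHERE  s.col5 = 'abc'
   AND    s.col1 = d.col1
   RETURNING d.col1   -- qualified: both d and s have a col1
   )
-- insert the source rows that the UPDATE did not touch
INSERT INTO dest_tab(col1, col2, col3, col4)
SELECT s.col1, s.col2, s.col3, s.col4
FROM   source_tab s
LEFT   JOIN x USING (col1)
WHERE  s.col5 = 'abc'
AND    x.col1 IS NULL;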
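To see what a data-modifying CTE of this shape does, here is a minimal sketch on throwaway data. The temporary table definitions and the sample rows are assumptions for illustration (the question does not show its schema); only the table and column names come from the question.

```sql
-- Hypothetical schema: temp tables, so nothing persists after the session.
CREATE TEMP TABLE source_tab (
   col1 text PRIMARY KEY, col2 text, col3 text,
   col4 double precision, col5 text
);
CREATE TEMP TABLE dest_tab (
   col1 text PRIMARY KEY, col2 text, col3 text,
   col4 double precision
);

-- Hypothetical sample data.
INSERT INTO source_tab VALUES
   ('a', 'a2-new', 'a3-new', 1.5, 'abc'),   -- already in dest_tab: will be updated
   ('b', 'b2',     'b3',     2.5, 'abc'),   -- not in dest_tab: will be inserted
   ('c', 'c2',     'c3',     3.5, 'xyz');   -- filtered out by col5
INSERT INTO dest_tab VALUES ('a', 'a2-old', 'a3-old', 0.0);

WITH x AS (
   UPDATE dest_tab d
   SET    col2 = s.col2, col3 = s.col3, col4 = s.col4
   FROM   source_tab s
   WHERE  s.col5 = 'abc'
   AND    s.col1 = d.col1
   RETURNING d.col1
   )
INSERT INTO dest_tab(col1, col2, col3, col4)
SELECT s.col1, s.col2, s.col3, s.col4
FROM   source_tab s
LEFT   JOIN x USING (col1)
WHERE  s.col5 = 'abc'
AND    x.col1 IS NULL;

-- Two rows remain: 'a' carries the new values, 'b' was inserted, 'c' was skipped.
SELECT * FROM dest_tab ORDER BY col1;
```

The INSERT never reads dest_tab itself; it learns which keys already existed only through the RETURNING output of the UPDATE.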

As @Craig already posted, such operations are typically much faster as set-based SQL than when iterating through individual rows.

However, this form is faster and simpler. It also largely avoids the inherent (tiny!) race condition. For one thing, since this is a single SQL command, the time window is even shorter. For another, if a concurrent transaction enters competing rows between the UPDATE and the INSERT, you get a duplicate key violation (provided you have a primary key or unique constraint, as you should), because the statement does not query dest_tab a second time and instead reuses the original set for the INSERT. Faster, better.

If you ever see a duplicate key violation: nothing bad happened, just retry the query.
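One way to automate such a retry is to wrap the statement in a plpgsql block that traps unique_violation. This is only a sketch, not part of the original answer: the function name upsert_dest_with_retry and the _max_attempts parameter are hypothetical, and it assumes the source_tab and dest_tab tables from the question, a unique constraint on dest_tab.col1, and the default READ COMMITTED isolation level (so that the retried statement can see the row committed by the competing transaction).

```sql
CREATE OR REPLACE FUNCTION upsert_dest_with_retry(_max_attempts integer DEFAULT 3)
  RETURNS void AS
$BODY$
BEGIN
   FOR _attempt IN 1 .. _max_attempts LOOP
      BEGIN
         WITH x AS (
            UPDATE dest_tab d
            SET    col2 = s.col2, col3 = s.col3, col4 = s.col4
            FROM   source_tab s
            WHERE  s.col5 = 'abc'
            AND    s.col1 = d.col1
            RETURNING d.col1
            )
         INSERT INTO dest_tab(col1, col2, col3, col4)
         SELECT s.col1, s.col2, s.col3, s.col4
         FROM   source_tab s
         LEFT   JOIN x USING (col1)
         WHERE  s.col5 = 'abc'
         AND    x.col1 IS NULL;

         RETURN;  -- success: stop retrying
      EXCEPTION WHEN unique_violation THEN
         -- A concurrent transaction inserted a competing row. The changes
         -- made inside this block are rolled back; loop and try again.
         NULL;
      END;
   END LOOP;

   RAISE EXCEPTION 'upsert still failing after % attempts', _max_attempts;
END
$BODY$ LANGUAGE plpgsql;

-- Usage:
SELECT upsert_dest_with_retry();
```

A block with an EXCEPTION clause is more expensive to enter and exit than one without, so a wrapper like this is only worth it if concurrent writers are actually expected.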

This does not cover the opposite case, where a concurrent transaction DELETEs a row in the meantime. IMO, that is the less important and less frequent case.
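As an aside that goes beyond the original answer: PostgreSQL 9.5 and later provide INSERT ... ON CONFLICT, which performs the insert-or-update in one statement and handles concurrent inserts without a duplicate key error. The sketch below assumes a primary key or unique constraint on dest_tab.col1, and that the selected source rows contain each col1 value at most once (otherwise the statement fails because it would affect the same row twice).

```sql
-- Requires PostgreSQL 9.5+ and a unique constraint or index on dest_tab(col1).
INSERT INTO dest_tab AS d (col1, col2, col3, col4)
SELECT s.col1, s.col2, s.col3, s.col4
FROM   source_tab s
WHERE  s.col5 = 'abc'
ON     CONFLICT (col1) DO UPDATE
SET    col2 = EXCLUDED.col2   -- EXCLUDED refers to the row proposed for insertion
     , col3 = EXCLUDED.col3
     , col4 = EXCLUDED.col4;
```

On older versions this syntax is not available, and the data-modifying CTE remains the single-statement option.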

If you use plpgsql for this, simplify:

CREATE OR REPLACE FUNCTION fun1() RETURNS void AS
$BODY$
DECLARE
   _source source_tab;  -- name of table = type
BEGIN
   FOR _source IN
      SELECT * FROM source_tab WHERE col5 = 'abc'
   LOOP
      UPDATE dest_tab
      SET    col2 = _source.col2  -- don't update col1, it doesn't change
           , col3 = _source.col3
           , col4 = _source.col4
      WHERE  col1 = _source.col1;

      IF NOT FOUND THEN  -- no row found
         INSERT INTO dest_tab(col1, col2, col3, col4)
         VALUES (_source.col1, _source.col2, _source.col3, _source.col4);
      END IF;
   END LOOP;
END
$BODY$ LANGUAGE plpgsql;


Published: 2023-10-26 06:43:41

Link: https://www.elefans.com/category/jswz/34/1529382.html