我正在尝试学习plpgsql代码以自动执行数据库中的某些数据清理.
I am trying to learn plpgsql code to automate some data cleaning in a database.
我当前的任务是将数字字段中的所有"999"值替换为"NaN".我正在尝试做的是: 1)查找架构中所有数字列 2)遍历这些内容并使用更新/替换"
My current task is to replace all of '999' values in numeric fields with 'NaN'. What I am trying to do is: 1) find all columns in a schema that are numeric 2) loop through these and use 'update/replace'
我的代码如下.我认为我的主要问题是找出如何在update语句中引用schema.table(但我敢肯定,还有其他事情我做得还不够好).
My code is below. I think my main problem is finding out how to reference the schema.table in the update statement (but I am sure there are other things I have not done too well).
我得到的错误是该关系未被识别. 任何帮助将不胜感激
The error that I am getting is that the relation is not recognised. Any assistance would be appreciated
贝基
CREATE OR REPLACE FUNCTION household.nodata_replace(schemanm text) RETURNS VOID as $$ DECLARE cname text; tname text; BEGIN --FOR col IN for cname,tname in SELECT column_name::text,table_name::text FROM information_schema.columns where table_schema = schemanm and data_type in ('integer','double precision') LOOP RAISE NOTICE 'cname is: % from %', cname, tname; EXECUTE 'update '||schemanm::regclass||'.' ||tname::regclass|| ' set ' || quote_ident(cname) ||' = replace(' || quote_ident(cname) ||', 999, NaN);'; END LOOP; END; $$ LANGUAGE plpgsql;推荐答案
为此,我宁愿使用format().占位符%I会在需要时正确引用标识符.
I would rather use format() for this. The placeholder %I takes care of properly quoting identifiers if needed.
replace()用于字符串操作,而不用于替换数字.要分配值NaN,请使用set xxx = 'NaN' 但是,但您不能对integer值执行此操作.整数不支持NaN
replace() is for string manipulation not for replacing numbers. To assign the value NaN use set xxx = 'NaN' but you cannot do this for an integer value. Integers do not support NaN
因此,您的动态SQL归结为:
So your dynamic SQL boils down to:
execute format('update %I.%I set %I = ''NaN'' where %I = 999, schemanm, tname, cname, cname);但是您将需要更改where子句以不包含integer列.您可能还希望同时包含numeric和real:
But you will need to change your where clause to not include integer columns. You probably want to include numeric and real as well:
and data_type in ('numeric','double precision', 'real')
如果您只想标记缺少信息",我宁愿将null存储在这些列中.在这种情况下,您无需区分不同的数据类型:
If you just want to mark the "absence of information", I would rather store null in those columns. In that case you don't need to distinguish between the different data types:
execute format('update %I.%I set %I = null where %I = 999, schemanm, tname, cname, cname);更多推荐
如何在plpgsql中引用架构变量
发布评论