I have a table with three columns, A, B and C, where A is not the primary key. For each distinct A (group by A), we need to select the (B, C) pairs and append the results at the end of the final result set. Is this possible in SQL?
A  | B  | C
a1 | b1 | c1
a1 | b2 | c2
a1 | b3 | c3
a2 | b1 | c2
a2 | b2 | c5

I need to get

a1 | (c1,b1) ; (c2,b2) ; (c3,b3)
a2 | (c2,b1) ; (c5,b2)

as the rows appended at the end. I normally do this via SQLAlchemy and then end up transforming the data in Python. Is there a way to do this directly in SQL?
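To make the target concrete, the query shape being asked for is a grouped string aggregation. Below is a minimal sketch that uses SQLite's group_concat through Python's sqlite3 module purely as a stand-in engine. This is an assumption for illustration only and says nothing about what Redshift supports. The table name my_table matches the query in Edit 4; the sample rows are the ones from the table above.

```python
import sqlite3

# Sample rows (A, B, C) taken from the table in the question.
rows = [
    ("a1", "b1", "c1"),
    ("a1", "b2", "c2"),
    ("a1", "b3", "c3"),
    ("a2", "b1", "c2"),
    ("a2", "b2", "c5"),
]

# In-memory SQLite database used only as a stand-in, NOT Redshift.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (A TEXT, B TEXT, C TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)

# One output row per distinct A; each (C,B) pair is formatted as text and
# the pairs of a group are joined with ' ; '.
query = """
SELECT A, group_concat('(' || C || ',' || B || ')', ' ; ') AS pairs
FROM my_table
GROUP BY A
ORDER BY A
"""
result = conn.execute(query).fetchall()

for a, pairs in result:
    print("{} | {}".format(a, pairs))
```

SQLite does not guarantee the order in which group_concat joins the elements of a group, so the pairs within a row may come back in a different order than they were inserted.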
EDIT & open question: What is the alternative to string_agg() in Redshift (Postgres 8.0.2)? More information on the use case is above.
On using string_agg I get:

    ERROR: function string_agg(text, "unknown") does not exist
    Hint: No function matches the given name and argument types. You may need to add explicit type casts
Edit 2: Adding the errors I get when using the custom aggregate function
    An error occurred when executing the SQL command:
    CREATE FUNCTION cut_semicolon(text) RETURNS text AS $$ BEGIN RETURN SUBSTRING($1 FROM 4)

    ERROR: unterminated dollar-quoted string at or near "$$ BEGIN RETURN SUBSTRING($1 FROM 4)"
    Position: 53

    CREATE FUNCTION cut_semicolon(text) RETURNS text AS $$
                                                        ^
    Execution time: 0.24s
    (Statement 1 of 7 finished)

    0 rows affected
    END executed successfully
    Execution time: 0.22s
    (Statement 2 of 7 finished)

    An error occurred when executing the SQL command:
    $$ LANGUAGE 'plpgsql' IMMUTABLE

    ERROR: unterminated dollar-quoted string at or near "$$ LANGUAGE 'plpgsql' IMMUTABLE"
    Position: 1

    $$ LANGUAGE 'plpgsql' IMMUTABLE
    ^
    Execution time: 0.22s
    (Statement 3 of 7 finished)

    An error occurred when executing the SQL command:
    CREATE FUNCTION concat_semicolon(text, text) RETURNS text AS $$ BEGIN RETURN $1 || ' ; ' || $2

    ERROR: unterminated dollar-quoted string at or near "$$ BEGIN RETURN $1 || ' ; ' || $2"
    Position: 62

    CREATE FUNCTION concat_semicolon(text, text) RETURNS text AS $$
                                                                 ^
    Execution time: 0.22s
    (Statement 4 of 7 finished)

    0 rows affected
    END executed successfully
    Execution time: 0.22s
    (Statement 5 of 7 finished)

    An error occurred when executing the SQL command:
    $$ LANGUAGE 'plpgsql' IMMUTABLE

    ERROR: unterminated dollar-quoted string at or near "$$ LANGUAGE 'plpgsql' IMMUTABLE"
    Position: 1

    $$ LANGUAGE 'plpgsql' IMMUTABLE
    ^
    Execution time: 0.22s
    (Statement 6 of 7 finished)

    An error occurred when executing the SQL command:
    CREATE AGGREGATE concat_semicolon( BASETYPE=text, SFUNC=concat_semicolon, STYPE=text, FINALFUNC=cut_semicolon, INITCOND='' )

    ERROR: SQL command "CREATE AGGREGATE concat_semicolon( BASETYPE=text, SFUNC=concat_semicolon, STYPE=text, FINALFUNC=cut_semicolon, INITCOND='' )" not supported.
    Execution time: 0.23s
    (Statement 7 of 7 finished)

    5 statements failed.
    Script execution finished
    Total script execution time: 1.55s

I also looked through a related answer in Google Groups, and it looks like replacing the separator ";" might help, though I am not sure which ";" to replace in this function definition. Reference: groups.google/forum/#!topic/sql-workbench/5LHVUXTm3BI
Edit 3: Perhaps CREATE FUNCTION itself is not supported in Redshift? "ERROR: CREATE FUNCTION is not supported". A 2013 thread says so: forums.aws.amazon/thread.jspa?threadID=121137
Edit 4:

    select A, concat(concat(concat(C, ','), cast(B as varchar)), ',')
    from my_table
    group by A, B, C
    -- Is it OK to group by all of A, B, C? I can't group by A alone, since that removes the related "C" columns

gives:

    a1 c1b1b2b3
    a2 c2b1b2

But this does not include ALL the entries for C (with semicolons), like this:

    a1 c1,b1;c2,b2;c2,b3
    a2 c2,b1;c5,b2

I would like the commas in between, and I also need to know whether grouping by A, B, C is OK.
Recommended answer

I came to the conclusion that this can't be solved in the Postgres + Redshift stack. This is how I solved it:
    import pandas as pd

    df = pd.DataFrame({'A': [1, 1, 1, 2, 2, 3, 3, 3],
                       'B': ['aaa', 'bbb', 'cc', 'gg', 'aaa', 'bbb', 'cc', 'gg']})

    def f(x):
        # Collect the B values of one group into a single list cell
        return [x['B'].values]

    series = df.groupby('A').apply(f)
    series.name = 'metric'
    s = pd.DataFrame(series.reset_index())
    print(s)

Output:

       A            metric
    0  1  [[aaa, bbb, cc]]
    1  2       [[gg, aaa]]
    2  3  [[bbb, cc, gg]]
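If pandas is not at hand, the same client-side grouping can be sketched with plain Python on the (A, B, C) rows from the question, producing the "(C,B) ; (C,B)" format that was asked for. The function name collect_pairs and the hard-coded rows list are hypothetical; in practice the rows would come from the SQLAlchemy query.

```python
# Sample rows (A, B, C) from the question; stand-in for a query result.
rows = [
    ("a1", "b1", "c1"),
    ("a1", "b2", "c2"),
    ("a1", "b3", "c3"),
    ("a2", "b1", "c2"),
    ("a2", "b2", "c5"),
]

def collect_pairs(rows):
    """Group rows by A and join each group's (C,B) pairs with ' ; '."""
    grouped = {}  # dicts keep insertion order in Python 3.7+
    for a, b, c in rows:
        grouped.setdefault(a, []).append("({},{})".format(c, b))
    return [(a, " ; ".join(pairs)) for a, pairs in grouped.items()]

result = collect_pairs(rows)
for a, pairs in result:
    print("{} | {}".format(a, pairs))
```

The pairs keep the order in which the rows arrive, so adding an ORDER BY A, B to the query that feeds this function gives a stable result.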