What is the maximum limit of group_concat / string_agg in BigQuery output?

Updated: 2024-10-25 18:26:14
Problem description

I am using group_concat / string_agg (possibly varchar) and want to ensure that BigQuery won't drop any of the concatenated data.

Solution

BigQuery will not drop data if a particular query runs out of memory; you will get an error instead. You should try to keep your row sizes below ~100MB, since beyond that you'll start getting errors. You can try creating a large string with an example like this:

#standardSQL
SELECT STRING_AGG(word) AS words
FROM `bigquery-public-data.samples.shakespeare`;

There are 164,656 rows in this table, and this query creates a string with 1,168,286 characters (around a megabyte in size). However, you'll start to see an error if a query requires more than roughly a few hundred megabytes on a single node of execution:

#standardSQL
SELECT STRING_AGG(CONCAT(word, corpus)) AS words
FROM `bigquery-public-data.samples.shakespeare`
CROSS JOIN UNNEST(GENERATE_ARRAY(1, 1000));

This results in an error:

Resources exceeded during query execution.

If you click on the "Explanation" tab in the UI, you can see that the failure happened during stage 1 while building the results of STRING_AGG. In this case, the string would have been 3,303,599,000 characters long, or approximately 3.3 GB in size.
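To make the size reasoning above concrete, here is a minimal Python sketch of a rough pre-check you could do before running an aggregation. It is only an illustration under stated assumptions: the helper names (`estimate_string_agg_length`, `exceeds_guidance`) are hypothetical, the ~100 MB threshold is the guidance quoted in this answer rather than an official constant, and one character is treated as one byte, which only holds for ASCII text.

```python
# Rough size check for a STRING_AGG result, done before running the query.
# Assumption: ~100 MB is the row-size guidance quoted in the answer above,
# not an official BigQuery constant.
ROW_SIZE_GUIDANCE_BYTES = 100 * 1000 * 1000


def estimate_string_agg_length(value_lengths, delimiter_length=1):
    """Return the length of the string produced by joining the values.

    STRING_AGG separates values with a comma by default, so every value
    after the first contributes one extra delimiter. An empty input gives
    0 here (BigQuery itself would return NULL in that case).
    """
    lengths = list(value_lengths)
    if not lengths:
        return 0
    return sum(lengths) + delimiter_length * (len(lengths) - 1)


def exceeds_guidance(estimated_bytes, limit=ROW_SIZE_GUIDANCE_BYTES):
    """True when the estimate is above the assumed row-size guidance."""
    return estimated_bytes > limit


# Tiny stand-in for a column of words (hypothetical data).
words = ["to", "be", "or", "not"]
estimate = estimate_string_agg_length(len(w) for w in words)
assert estimate == len(",".join(words))

# Figures quoted in the answer, treating one character as one byte.
first_query_chars = 1_168_286
second_query_chars = 3_303_599_000

print(exceeds_guidance(first_query_chars))   # False
print(exceeds_guidance(second_query_chars))  # True
print(round(second_query_chars / 1e9, 1))    # 3.3
```

In practice the per-row lengths would come from something like a `SUM(LENGTH(...))` query over the same table, which is cheap compared with building the full string.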

Published: 2023-10-24 09:28:33