BigQuery: Is it possible to execute another query inside a UDF?

Updated: 2024-10-12 14:20:01
Problem description


    I have a table that records one row per unique user per day, with some aggregated stats for that user on that day. I need to produce a report that tells me, for each day, the number of unique users in the last 30 days including that day.

    For example:

    • for Aug 31st, it'll count the unique users from Aug 2nd to Aug 31st
    • for Aug 30th, it'll count the unique users from Aug 1st to Aug 30th
    • and so on...

    I've looked at some related questions but they aren't quite what I need - if a user logs in on multiple days in the last 30 days he should be counted only once, so I can't just sum the DAU count for the last 30 days.

    • Bigquery SQL for sliding window aggregate
    • BigQuery SQL for 28-day sliding window aggregate (without writing 28 lines of SQL)
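To make the over-counting concrete, here is a small Python sketch using made-up data (the `logins` mapping and the user names are hypothetical and not taken from the actual table). It compares the sum of daily active user counts with the number of distinct users over the same days.

```python
from datetime import date

# Hypothetical sample data: day -> set of users active on that day.
logins = {
    date(2016, 8, 29): {"alice", "bob"},
    date(2016, 8, 30): {"alice"},
    date(2016, 8, 31): {"alice", "carol"},
}

# Summing the DAU counts a user once for every day they were active.
dau_sum = sum(len(users) for users in logins.values())

# A distinct count over the whole window counts each user exactly once.
distinct_users = len(set().union(*logins.values()))

print(dau_sum, distinct_users)
```

With this sample, the DAU sum is 5 while only 3 distinct users were active, because "alice" appears on all three days.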

    So far, my ideas are to either:

    • write a simple script that'll execute a separate BigQuery query for each of the relevant days
    • write a BigQuery UDF that'll execute basically the same query for each day selected from another query

    but I've not found any examples of how to execute another BigQuery query inside a UDF, or whether it's possible at all.

    Solution

    "I need to produce a report that tells me for each day, the no. of unique users in the last 30 days including that day."

    The query below (written in BigQuery legacy SQL) should do this:

    SELECT
      calendar_day,
      EXACT_COUNT_DISTINCT(userID) AS unique_users
    FROM (
      SELECT calendar_day, userID
      FROM YourTable
      CROSS JOIN (
        SELECT DATE(DATE_ADD('2016-08-08', pos - 1, "DAY")) AS calendar_day
        FROM (
          SELECT ROW_NUMBER() OVER() AS pos, *
          FROM (FLATTEN((
            SELECT SPLIT(RPAD('', 1 + DATEDIFF('2016-09-08', '2016-08-08'), '.'),'') AS h
            FROM (SELECT NULL)),h
          ))
        )
      ) AS calendar
      WHERE DATEDIFF(calendar_day, dt) BETWEEN 0 AND 29
    )
    GROUP BY calendar_day
    ORDER BY calendar_day DESC

    It assumes YourTable has userID and dt fields, for example:

    dt          userID
    2016-09-08  1
    2016-09-08  2
    ...

    You can control:

    • the reporting date range, by changing 2016-08-08 (start) and 2016-09-08 (end)
    • the aggregation window size, by changing 29 in BETWEEN 0 AND 29
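As a way to check what the query computes, the following Python sketch reproduces the same windowing logic in memory. The function name `rolling_unique_users`, its parameters, and the sample rows are illustrative assumptions, not part of the original answer or of any BigQuery API.

```python
from datetime import date, timedelta

def rolling_unique_users(rows, start, end, window_days=30):
    """Return {calendar_day: distinct users in the window ending on that day}.

    rows is an iterable of (dt, user_id) pairs, mirroring the dt and userID
    fields that the query assumes.
    """
    result = {}
    day = start
    while day <= end:
        # Same filter as DATEDIFF(calendar_day, dt) BETWEEN 0 AND window_days - 1.
        users = {
            user_id
            for dt, user_id in rows
            if 0 <= (day - dt).days <= window_days - 1
        }
        result[day] = len(users)
        day += timedelta(days=1)
    return result

# Made-up sample rows, for illustration only.
rows = [
    (date(2016, 8, 1), 1),
    (date(2016, 8, 2), 2),
    (date(2016, 8, 30), 1),
    (date(2016, 8, 31), 3),
]
report = rolling_unique_users(rows, date(2016, 8, 30), date(2016, 8, 31))
for day in sorted(report, reverse=True):
    print(day, report[day])
```

One difference from the query: this sketch reports 0 for a day with no users in its window, whereas the query returns no row for such a day, because the WHERE filter after the CROSS JOIN leaves nothing to group.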

    Published: 2023-10-26 18:18:12