I have a table that records one row per unique user per day, with some aggregated stats for that user on that day. I need to produce a report that tells me, for each day, the number of unique users in the last 30 days including that day.
e.g.
- for Aug 31st, it'll count the unique users from Aug 2nd to Aug 31st
- for Aug 30th, it'll count the unique users from Aug 1st to Aug 30th
- and so on...
I've looked at some related questions but they aren't quite what I need - if a user logs in on multiple days in the last 30 days he should be counted only once, so I can't just sum the DAU count for the last 30 days.
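A small Python sketch on made-up data illustrates the difference between summing daily counts and counting distinct users. The rows and user IDs below are hypothetical, not taken from the real table.

```python
from collections import defaultdict

# Hypothetical sample rows: (dt, userID), one row per user per day.
rows = [
    ("2016-08-30", 1),
    ("2016-08-30", 2),
    ("2016-08-31", 1),  # user 1 is active again the next day
    ("2016-08-31", 3),
]

# Daily active users: the set of distinct users seen on each day.
users_by_day = defaultdict(set)
for dt, user_id in rows:
    users_by_day[dt].add(user_id)

# Summing the per-day counts counts user 1 twice.
dau_sum = sum(len(users) for users in users_by_day.values())

# Counting distinct users over the whole period counts each user once.
unique_users = len({user_id for _, user_id in rows})

print(dau_sum)       # 4
print(unique_users)  # 3
```

The gap between the two numbers grows with the number of returning users, which is why the distinct count has to be taken over the raw rows of the whole window.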
Bigquery SQL for sliding window aggregate
BigQuery SQL for 28-day sliding window aggregate (without writing 28 lines of SQL)
So far, my ideas are to either:
- write a simple script that'll execute a separate BigQuery for each of the relevant days
- write a BigQuery UDF that'll execute basically the same query for each day selected from another query
but I've not found any examples of how to execute another BigQuery query inside a UDF, or whether it's possible at all.
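As a rough sketch of the first idea, the script could build one query per reporting day. This only generates query text and does not submit anything to BigQuery. The column names `userID` and `dt`, the table name `YourTable`, and the assumption that `dt` is stored as a `YYYY-MM-DD` string are placeholders for illustration, and `per_day_queries` is a hypothetical helper.

```python
from datetime import date, timedelta

def per_day_queries(start, end, window_days=30, table="YourTable"):
    """Yield (day, query) pairs, one query string per reporting day.

    Assumes (hypothetically) that `dt` is a 'YYYY-MM-DD' string column,
    so that BETWEEN compares dates correctly as text.
    """
    day = start
    while day <= end:
        # A 30-day window that includes `day` starts 29 days earlier.
        window_start = day - timedelta(days=window_days - 1)
        query = (
            "SELECT EXACT_COUNT_DISTINCT(userID) AS unique_users "
            f"FROM {table} "
            f"WHERE dt BETWEEN '{window_start.isoformat()}' AND '{day.isoformat()}'"
        )
        yield day, query
        day += timedelta(days=1)

for day, query in per_day_queries(date(2016, 8, 30), date(2016, 8, 31)):
    print(day, query)
```

This costs one query per day of the report, which is the main drawback compared with a single query that covers the whole date range.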
Solution

"I need to produce a report that tells me, for each day, the number of unique users in the last 30 days including that day."

The query below should do this:
    SELECT
      calendar_day,
      EXACT_COUNT_DISTINCT(userID) AS unique_users
    FROM (
      SELECT calendar_day, userID
      FROM YourTable
      CROSS JOIN (
        SELECT DATE(DATE_ADD('2016-08-08', pos - 1, "DAY")) AS calendar_day
        FROM (
          SELECT ROW_NUMBER() OVER() AS pos, *
          FROM (FLATTEN((
            SELECT SPLIT(RPAD('', 1 + DATEDIFF('2016-09-08', '2016-08-08'), '.'), '') AS h
            FROM (SELECT NULL)
          ), h))
        )
      ) AS calendar
      WHERE DATEDIFF(calendar_day, dt) BETWEEN 0 AND 29
    )
    GROUP BY calendar_day
    ORDER BY calendar_day DESC

It assumes YourTable has userID and dt fields (like below, for example):
    dt          userID
    2016-09-08  1
    2016-09-08  2
    ...

And you can control:
- the reporting date range, by changing 2016-08-08 and 2016-09-08 respectively
- the aggregation size, by changing 29 in BETWEEN 0 AND 29
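To sanity-check the query's output on a small extract, the same logic can be sketched in plain Python. This is only a reference sketch, not something that runs in BigQuery: `rolling_unique_users` and the sample rows are hypothetical, and the window test mirrors the condition DATEDIFF(calendar_day, dt) BETWEEN 0 AND 29.

```python
from datetime import date, timedelta

def rolling_unique_users(rows, start, end, window_days=30):
    """For each day in [start, end], count the distinct users seen in the
    window of `window_days` days that ends on that day (inclusive)."""
    result = {}
    day = start
    while day <= end:
        window_start = day - timedelta(days=window_days - 1)
        # Same filter as: DATEDIFF(calendar_day, dt) BETWEEN 0 AND 29
        users = {user_id for dt, user_id in rows if window_start <= dt <= day}
        result[day] = len(users)
        day += timedelta(days=1)
    return result

# Hypothetical sample data: (dt, userID) pairs.
sample = [
    (date(2016, 8, 1), 1),
    (date(2016, 8, 2), 2),
    (date(2016, 8, 31), 1),
    (date(2016, 8, 31), 3),
]

counts = rolling_unique_users(sample, date(2016, 8, 30), date(2016, 8, 31))
for day in sorted(counts, reverse=True):
    print(day, counts[day])
# 2016-08-31 3
# 2016-08-30 2
```

For Aug 30th the window runs from Aug 1st to Aug 30th and sees users 1 and 2. For Aug 31st the window starts on Aug 2nd, so user 1 is counted only through the Aug 31st row, alongside users 2 and 3.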