展平BQ日志表?(Flatten BQ Logs tables?)

编程入门 行业动态 更新时间:2024-10-28 10:26:19
展平BQ日志表?(Flatten BQ Logs tables?)

我正在试图弄清楚如何压扁bigquery日志表(logs.cloudaudit_googleapis_com_data_access_20160404等),以便我基本上可以看到任何给定目标表的所有已完成的作业。

我理想情况下只需要下面的内容向我展示触及表[dataset_xyz.table_abc]的作业的所有条目,然后我就可以弄清楚如何理解根据作业类型填充的某些字段等。

SELECT * FROM [logs.cloudaudit_googleapis_com_data_access_20160404] where ( protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.destinationTable.datasetId='dataset_xyz' and protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.destinationTable.tableId='table_abc' ) or ( protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.tableCopy.destinationTable.datasetId='dataset_xyz' and protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.tableCopy.destinationTable.tableId='table_abc' ) or ( protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.datasetId='dataset_xyz' and protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.tableId='table_abc' )

我试图做很多嵌套的扁平化但我无法弄明白,因为发现日志结构有点复杂。

我基本上希望能够查询日志导出,说“向我展示与编辑表[dataset_xyz.table_abc]有关的所有内容”,所以我猜主要是load,tableCopy和查询作业,它们附加或覆盖了任何数据table [dataset_xyz.table_abc]。

我唯一能想到的就是沿着它的嵌套记录逐字地分开表格然后以某种方式将它们全部重新组合在一起,但这似乎是一个疯狂的想法。 我确信有一种方法可以反复压扁它,但我不知道如何压扁这么复杂的结构。 即使我可以在protoPayload.serviceData.jobCompletedEvent下展平一切。*所以我可以做

select protoPayload.serviceData.jobCompletedEvent.* from flatten(...

或者也许有一种更简单的方法可以解决这个问题,我错过了?

ps我认为这可能是导游的一个很好的例子,因为我认为这是人们想做的常见事情。

I'm trying to figure out how to flatten the bigquery log tables (logs.cloudaudit_googleapis_com_data_access_20160404 etc) so that i can basically see all completed jobs for any given destination table.

I ideally just want something like the below to show me all the entries for jobs that touched the table [dataset_xyz.table_abc] and then i can figure out how to make sense of some fields being populated based on type of job etc.

SELECT * FROM [logs.cloudaudit_googleapis_com_data_access_20160404] where ( protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.destinationTable.datasetId='dataset_xyz' and protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.destinationTable.tableId='table_abc' ) or ( protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.tableCopy.destinationTable.datasetId='dataset_xyz' and protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.tableCopy.destinationTable.tableId='table_abc' ) or ( protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.datasetId='dataset_xyz' and protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.tableId='table_abc' )

I was trying to do lots of nested flattens but i could not really figure it out tbh as find the log structure a bit complex.

I basically want to be able to query the log exports to say "show me everything that had anything to do with editing table [dataset_xyz.table_abc]" so i guess mainly load, tableCopy, and query jobs that either appended or overwrote any data in table [dataset_xyz.table_abc].

The only next thing i can think of it to literally pick apart the table along its nested records and then somehow join them all back together separately but that seems like a crazy idea. I'm sure there is a way to flatten it repeatedly but i just cant figure out how to flatten such a complicated structure. Even if i could just flatten everything under protoPayload.serviceData.jobCompletedEvent.* so i could do

select protoPayload.serviceData.jobCompletedEvent.* from flatten(...

Or maybe there is a much easier way to go about this that i'm missing?

p.s. i think this could be a good example for the guide as i'd imagine is a common enough thing people want to do.

最满意答案

你试过这个:

SELECT protoPayload.serviceData.jobCompletedEvent.job.jobName.jobId FROM [audit_logs.cloudaudit_googleapis_com_data_access_20160406]省略记录if(sum(sum)

(protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.destinationTable.tableId ='table'和protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.destinationTable.datasetId ='ds')或(protoPayload.serviceData.jobCompletedEvent.job .jobConfiguration.load.destinationTable.datasetId ='ds'AND protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.tableId ='table'

))= 0)

Have you tried this:

SELECT protoPayload.serviceData.jobCompletedEvent.job.jobName.jobId FROM [audit_logs.cloudaudit_googleapis_com_data_access_20160406] omit record if ( sum(

(protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.destinationTable.tableId = 'table' and protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.destinationTable.datasetId = 'ds' ) or ( protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.datasetId = 'ds' AND protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.tableId = 'table'

) ) = 0)

更多推荐

本文发布于:2023-08-07 15:12:00,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1464677.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:日志   BQ   展平   tables   Logs

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!