Problem description
I am trying to run an Apache Beam pipeline on the Dataflow runner. The job reads data from a BigQuery table and writes it to a database.
I am running the job with the classic template option in Dataflow, which means I first have to stage the pipeline and then run it with the appropriate arguments.
My pipeline options are as follows:
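import apache_beam as beam
from apache_beam.options.pipeline_options import (
    GoogleCloudOptions, PipelineOptions, SetupOptions)

# ImporterOptions is the custom options class that declares --input-table.
options = PipelineOptions()
options.view_as(SetupOptions).save_main_session = True
importer_options = options.view_as(ImporterOptions)
google_options = options.view_as(GoogleCloudOptions)
with beam.Pipeline(options=options) as p:
    p | 'BigQuery Read' >> beam.io.ReadFromBigQuery(
        table=importer_options.input_table)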
ImporterOptions currently accepts input_table as an argument:
parser.add_value_provider_argument(
    '--input-table',
    help='The bigquery input table in the format dataset.table_name')
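The flag is declared as --input-table, while the pipeline reads importer_options.input_table. A minimal sketch of why both names refer to the same option, using plain argparse as a stand-in and assuming Beam's option parser follows the standard argparse naming rule here (the sample value dataset.table_name is only illustrative):

```python
import argparse

# Plain argparse stand-in; Beam's add_value_provider_argument is not used here.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--input-table",
    help="The bigquery input table in the format dataset.table_name",
)

# Illustrative value only, in the dataset.table_name format from the help text.
args = parser.parse_args(["--input-table", "dataset.table_name"])

# argparse derives the attribute name by replacing "-" with "_".
print(args.input_table)
```

So the mismatch between the dashed flag and the underscored attribute is probably not what triggers the error.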
But running the pipeline throws an error like the one below:
File "/usr/local/lib/python3.8/site-packages/apache_beam/io/gcp/bigquery.py", line 791, in split
    if not self.table_reference.projectId:
AttributeError: 'RuntimeValueProvider' object has no attribute 'projectId'
Does anyone have any idea what I am missing here?
I am building the template using the command below.
python -m main \
    --runner DataflowRunner \
    --project test-project \
    --region=europe-west1 \
    --staging_location gs://test/staging_python \
    --temp_location gs://test/test \
    --template_location gs://test/templates_python/test
Note: I also tried running the pipeline with a fully qualified table name for input_table (that is, including the project ID), but that didn't help either.
Recommended answer
We are facing the same issue. It is a bug that has been present since version 2.26.0, which was released at the end of 2020.
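One plausible reading of the traceback is that, while the template is being built, the table option is still an unresolved placeholder, yet the failing line treats it as an already parsed table reference. A minimal sketch of that failure mode follows. DeferredValue and TableRef are hypothetical stand-ins written for this illustration, not Beam's classes, and the check only mirrors the shape of the line shown in the traceback:

```python
class DeferredValue:
    """Hypothetical stand-in for an option that is only known at run time."""

    def __init__(self, name):
        self.name = name
        self._value = None
        self._resolved = False

    def resolve(self, value):
        self._value = value
        self._resolved = True

    def get(self):
        if not self._resolved:
            raise RuntimeError(f"{self.name} is not available until the job runs")
        return self._value


class TableRef:
    """Hypothetical parsed table reference exposing a projectId attribute."""

    def __init__(self, project_id, dataset_id, table_id):
        self.projectId = project_id
        self.datasetId = dataset_id
        self.tableId = table_id


def needs_default_project(table_reference):
    # Same shape as the failing line: `if not self.table_reference.projectId:`
    return not table_reference.projectId


def try_check(table_reference):
    """Run the check and report the outcome as a string instead of raising."""
    try:
        return str(needs_default_project(table_reference))
    except AttributeError as exc:
        return f"AttributeError: {exc}"


parsed = TableRef(None, "dataset", "table_name")
deferred = DeferredValue("input_table")

print(try_check(parsed))    # works: the attribute exists on a parsed reference
print(try_check(deferred))  # fails: the placeholder has no projectId attribute
```

If this reading is right, it would also explain why passing a fully qualified table name did not help: the attribute access fails before the value is ever looked at.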
Bug report created: https://issues.apache.org/jira/browse/BEAM-12514
A pull request is now available: https://github.com/apache/beam/pull/15040
Hopefully it will be fixed in the next release (2.31.0).
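To check whether a given Beam version is at or after the one where the bug is reported to start, a small comparison helper may be enough. This is only a sketch: parse_version and at_or_after_first_affected are hypothetical helper names, and no upper bound is applied because the fix in 2.31.0 is a hope rather than something confirmed here.

```python
import re


def parse_version(version):
    """Turn '2.26.0' (or '2.31.0rc1') into a tuple of three ints."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as '2.26' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


# Taken from the answer: the bug is reported as present since 2.26.0.
FIRST_AFFECTED = (2, 26, 0)


def at_or_after_first_affected(version):
    # No upper bound: the answer only hopes for a fix in 2.31.0.
    return parse_version(version) >= FIRST_AFFECTED


for candidate in ("2.25.0", "2.26.0", "2.30.0"):
    print(candidate, at_or_after_first_affected(candidate))
```

In an environment where Beam is installed, the string to pass in would be apache_beam.__version__.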