问题描述
限时送ChatGPT账号..我无法使用 python 3.7 暂存云数据流模板.它在 apache_beam.error.RuntimeValueProviderError: RuntimeValueProvider(option: input, type: str, default_value: 'gs://dataflow-samples/shakespeare/kinglear.txt') 无法访问的一个参数化参数上失败
I can't stage a cloud dataflow template with python 3.7. It fails on the one parametrized argument with apache_beam.error.RuntimeValueProviderError: RuntimeValueProvider(option: input, type: str, default_value: 'gs://dataflow-samples/shakespeare/kinglear.txt') not accessible
使用 python 2.7 暂存模板工作正常.
Staging the template with python 2.7 works fine.
我尝试过使用 3.7 运行数据流作业,它们运行良好.只有模板暂存被破坏.数据流模板中仍然不支持 python 3.7 还是 python 3 中的暂存语法发生了变化?
I have tried running dataflow jobs with 3.7 and they work fine. Only the template staging is broken. Is python 3.7 still not supported in dataflow templates or did the syntax for staging in python 3 change?
这是管道部分
class WordcountOptions(PipelineOptions):
@classmethod
def _add_argparse_args(cls, parser):
parser.add_value_provider_argument(
'--input',
default='gs://dataflow-samples/shakespeare/kinglear.txt',
help='Path of the file to read from',
dest="input")
def main(argv=None):
options = PipelineOptions(flags=argv)
setup_options = options.view_as(SetupOptions)
wordcount_options = options.view_as(WordcountOptions)
with beam.Pipeline(options=setup_options) as p:
lines = p | 'read' >> ReadFromText(wordcount_options.input)
if __name__ == '__main__':
main()
这是带有暂存脚本的完整存储库 https://github/firemuzzy/dataflow-templates-bug-python3
Here is the full repo with the staging scripts https://github/firemuzzy/dataflow-templates-bug-python3
之前有一个类似的问题,但我不确定它是如何相关的,因为它是在 python 2.7 中完成的,但我的模板在 2.7 中运行良好,但在 3.7 中失败
There was a previous similar issues, but am not sure how related it is since that was done in python 2.7 but my template stages fine in 2.7 but fails in 3.7
如何创建 Google Cloud Dataflow WordcountPython 中的自定义模板?
**** 堆栈跟踪 ****
**** Stack Trace ****
Traceback (most recent call last):
File "run_pipeline.py", line 44, in <module>
main()
File "run_pipeline.py", line 41, in main
lines = p | 'read' >> ReadFromText(wordcount_options.input)
File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py", line 906, in __ror__
return self.transform.__ror__(pvalueish, self.label)
File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py", line 515, in __ror__
result = p.apply(self, pvalueish, label)
File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 490, in apply
return self.apply(transform, pvalueish)
File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 525, in apply
pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 183, in apply
return m(transform, input, options)
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 189, in apply_PTransform
return transform.expand(input)
File "/usr/local/lib/python3.7/site-packages/apache_beam/io/textio.py", line 542, in expand
return pvalue.pipeline | Read(self._source)
File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py", line 515, in __ror__
result = p.apply(self, pvalueish, label)
File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 525, in apply
pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 183, in apply
return m(transform, input, options)
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1020, in apply_Read
return self.apply_PTransform(transform, pbegin, options)
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 189, in apply_PTransform
return transform.expand(input)
File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 863, in expand
return pbegin | _SDFBoundedSourceWrapper(self.source)
File "/usr/local/lib/python3.7/site-packages/apache_beam/pvalue.py", line 113, in __or__
return self.pipeline.apply(ptransform, self)
File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 525, in apply
pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 183, in apply
return m(transform, input, options)
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 189, in apply_PTransform
return transform.expand(input)
File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 1543, in expand
| core.ParDo(self._create_sdf_bounded_source_dofn()))
File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 1517, in _create_sdf_bounded_source_dofn
estimated_size = source.estimate_size()
File "/usr/local/lib/python3.7/site-packages/apache_beam/options/value_provider.py", line 136, in _f
raise error.RuntimeValueProviderError('%s not accessible' % obj)
apache_beam.error.RuntimeValueProviderError: RuntimeValueProvider(option: input, type: str, default_value: 'gs://dataflow-samples/shakespeare/kinglear.txt') not accessible
推荐答案
不幸的是,Apache Beam 的 Python SDK 2.18.0 上的模板似乎已损坏.
Unfortunately, it looks like templates are broken on Apache Beam's Python SDK 2.18.0.
目前,解决方案是避免使用 Beam 2.18.0,因此在您的需求/依赖项中,定义 apache-beam[gcp]<2.18.0
或 apache-光束[gcp]>2.18.0
For now, the solution to this is to avoid Beam 2.18.0, so in your requirements / dependencies, define apache-beam[gcp]<2.18.0
or apache-beam[gcp]>2.18.0
这篇关于使用 Apache Beam python 创建谷歌云数据流模板时出现 RuntimeValueProviderError的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
更多推荐
[db:关键词]
发布评论