Is there a way to publish a message onto Google Pubsub after a Google Dataflow job completes? We have a need to notify dependent systems that the processing of incoming data is complete. How could Dataflow publish after writing data to the sink?
We want to notify after a pipeline completes writing to GCS. Our pipeline looks like this:
```java
Pipeline p = Pipeline.create(options);
p.apply(....)
 .apply(AvroIO.Write.named("Write to GCS")
     .withSchema(Extract.class)
     .to(options.getOutputPath())
     .withSuffix(".avro"));
p.run();
```
If we add logic outside of the pipeline.apply(...) methods, we are notified when the code completes execution, not when the pipeline completes. Ideally we could add another .apply(...) after the AvroIO sink and publish a message to PubSub.
Answer
You have two options to get notified when your pipeline finishes, and then publish a message (or do whatever else you want) once the pipeline has finished running.
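One way to do this is to block the main program until the job completes and then publish from there. This is a minimal sketch assuming the Apache Beam SDK, where `run()` returns a `PipelineResult` that can be waited on (in the older Dataflow 1.x SDK shown in the question, the equivalent is running with `BlockingDataflowPipelineRunner`). The project ID `my-project` and topic `pipeline-done` are placeholder names, not from the original question.

```java
import com.google.cloud.pubsub.v1.Publisher;
import com.google.protobuf.ByteString;
import com.google.pubsub.v1.PubsubMessage;
import com.google.pubsub.v1.TopicName;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;

public class NotifyAfterPipeline {
  public static void main(String[] args) throws Exception {
    Pipeline p = Pipeline.create(/* options */);
    // ... build the pipeline, ending in the AvroIO sink ...

    // On the Dataflow runner, run() is asynchronous; block until the job finishes.
    PipelineResult result = p.run();
    result.waitUntilFinish();

    if (result.getState() == PipelineResult.State.DONE) {
      // Publish a completion message. "my-project" and "pipeline-done" are
      // hypothetical placeholders for your project and topic.
      Publisher publisher =
          Publisher.newBuilder(TopicName.of("my-project", "pipeline-done")).build();
      publisher.publish(
          PubsubMessage.newBuilder()
              .setData(ByteString.copyFromUtf8("GCS write complete"))
              .build());
      publisher.shutdown();
    }
  }
}
```

Note that this notifies from the launching program, not from inside the pipeline itself, so the process that submits the job must stay alive until the job completes.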