Is it possible to get multiple DStreams out of a single DStream in Spark? My use case is as follows: I am getting a stream of log data from an HDFS file. Each log line contains an id (id=xyz). I need to process log lines differently based on the id, so I was trying to create a different DStream for each id from the input DStream. I couldn't find anything related in the documentation. Does anyone know how this can be achieved in Spark, or can point to any link for this?
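Before splitting anything, each line's id has to be pulled out of the text. A minimal sketch of that step, assuming the log lines carry an `id=xyz` token somewhere in the line (the sample lines and the `extract_id` helper are illustrative, not from the original post):

```python
import re

# Hypothetical log lines in the format described above: each carries "id=xyz".
lines = [
    "2016-01-01 10:00:01 id=abc msg=started",
    "2016-01-01 10:00:02 id=xyz msg=payload",
    "2016-01-01 10:00:03 id=abc msg=done",
]

def extract_id(line):
    """Pull the id=... token out of a log line; returns None if absent."""
    m = re.search(r"\bid=(\w+)", line)
    return m.group(1) if m else None

ids = [extract_id(l) for l in lines]
print(ids)  # ['abc', 'xyz', 'abc']
```

The same helper can then be used inside a `map` or `filter` on the DStream.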
Thanks.
Recommended answer
You cannot split multiple DStreams out of a single DStream. The best you can do is:
I would always prefer #1 as that is the cleaner solution, but there are exceptions for which #2 needs to be implemented.