Replicate a Spark Row N Times

This article describes how to replicate a Spark Row N times.

Problem Description

I want to duplicate a Row in a DataFrame. How can I do that?

For example, I have a DataFrame consisting of one row, and I want to make a DataFrame with 100 identical rows. I came up with the following solution:

var data: DataFrame = singleRowDF
for (i <- 1 to 100 - 1) {
  data = data.unionAll(singleRowDF)
}

But this introduces many transformations, and it seems my subsequent actions become very slow. Is there another way to do it?

Recommended Answer

You can add a column whose literal value is an Array of size 100, then use explode so that each of its elements creates its own row; finally, just get rid of this "dummy" column:

import org.apache.spark.sql.functions._

// Attach an array of 100 literals, explode it so every element produces its own copy of the row,
// then select only the original columns to drop the helper "dummy" column.
val result = singleRowDF
  .withColumn("dummy", explode(array((1 to 100).map(lit): _*)))
  .selectExpr(singleRowDF.columns: _*)
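As a small follow-up, here is a minimal sketch (not from the original answer) that wraps the same explode-a-literal-array trick in a reusable helper so the replication count becomes a parameter; the name replicateRows and its arguments are hypothetical:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

// Hypothetical helper: replicate every row of df n times by exploding an array of n literals,
// then keep only the original columns.
def replicateRows(df: DataFrame, n: Int): DataFrame =
  df.withColumn("dummy", explode(array((1 to n).map(lit): _*)))
    .selectExpr(df.columns: _*)

// Usage, assuming the singleRowDF from the question:
// val result = replicateRows(singleRowDF, 100)
// result.count()  // expected: 100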
