This article shows how to compute multiple aggregations from a single Group By on a Spark Scala DataFrame.

Problem description
I want a Spark Scala DataFrame to have multiple aggregations of a single group by. For example:

```scala
val groupped = df.groupBy("firstName", "lastName").sum("Amount").toDF()
```

But what if I need Count, Sum, Max, etc.?
```scala
/* Below does not work, but this is what the intention is:
val groupped = df.groupBy("firstName", "lastName").sum("Amount").count().toDF()
*/
```

Expected output of `groupped.show()`:

```
--------------------------------------------------
| firstName | lastName | Amount | count | Max | Min |
--------------------------------------------------
```

Answer

```scala
case class soExample(firstName: String, lastName: String, Amount: Int)
val df = Seq(soExample("me", "zack", 100)).toDF

import org.apache.spark.sql.functions._

val groupped = df.groupBy("firstName", "lastName").agg(
  sum("Amount"),
  mean("Amount"),
  stddev("Amount"),
  count(lit(1)).alias("numOfRecords")
).toDF()

display(groupped)
```

Note that `display` is a Databricks notebook helper; outside Databricks, use `groupped.show()` instead.
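Since the question also asks for Max and Min, the same `agg` call can list those functions as well, each with an alias so the output columns are readable. A minimal sketch, assuming the same `df` as above (the alias names `totalAmount`, `maxAmount`, `minAmount` are illustrative choices, not from the original answer):

```scala
import org.apache.spark.sql.functions._

// One pass over the grouped data, naming each aggregate column explicitly.
val summary = df.groupBy("firstName", "lastName").agg(
  count("Amount").alias("count"),
  sum("Amount").alias("totalAmount"),
  max("Amount").alias("maxAmount"),
  min("Amount").alias("minAmount")
)

summary.show()
```

Packing all aggregations into a single `agg` keeps everything in one shuffle over the grouped data, rather than chaining separate `sum`/`count` calls that each produce their own DataFrame.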