FPGrowth/Association Rules Using Sparklyr

Updated: 2024-10-14 08:23:46
Problem description

I am trying to build an association rules algorithm using Sparklyr and have been following this blog, which is really well explained.

However, just after fitting the FPGrowth algorithm, the author extracts the rules from the returned "FPGrowthModel object", and I am not able to reproduce that step to extract my own rules.

The section where I am struggling is this piece of code:

rules = FPGmodel %>% invoke("associationRules")

Could someone please explain where FPGmodel comes from?

My code looks as follows, and I do not see an FPGmodel object from which I can extract my rules. Any help would be greatly appreciated.

# CACHE HIVE TABLE INTO SPARK
tbl_cache(sc, 'claims', force = TRUE)
med_tbl <- tbl(sc, 'claims')

# SELECT VARIABLES OF INTEREST
med_tbl <- med_tbl %>% select(proc_desc, alt_claim_id)

# REMOVE DUPLICATED ROWS
med_tbl <- dplyr::distinct(med_tbl)
med_tbl <- med_tbl %>% group_by(alt_claim_id)

# AGGREGATING CLAIMS BY CLAIM ID
med_agg <- med_tbl %>%
  group_by(alt_claim_id) %>%
  summarise(procedures = collect_list(proc_desc))

# CREATE UNIQUE STRING TO IDENTIFY THE MACHINE LEARNING ESTIMATOR
uid = sparklyr:::random_string("fpgrowth_")

# INVOKE THE FPGrowth JAVA CLASS
jobj = invoke_new(sc, "org.apache.spark.ml.fpm.FPGrowth", uid)

jobj %>%
  invoke("setItemsCol", "procedures") %>%
  invoke("setMinConfidence", 0.03) %>%
  invoke("setMinSupport", 0.01) %>%
  invoke("fit", spark_dataframe(med_agg))

Recommended answer

The blog post you've linked has been obsolete for almost two years. Since 2b0994c, sparklyr provides a native wrapper for o.a.s.ml.fpm.FPGrowth:

df <- copy_to(sc, tibble(items = c("a b c", "a b", "c f g", "b c"))) %>%
  mutate(items = split(items, "\\\\s+"))

fp_growth_model <- ml_fpgrowth(df)
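Once the model is fitted, the rules can be read from it directly. The following is a minimal, self-contained sketch of that step, assuming a local Spark installation (the `spark_connect(master = "local")` call and the table name `transactions` are assumptions for illustration; the original post already has a connection `sc`). It uses sparklyr's `ml_association_rules()` and `ml_freq_itemsets()` accessors:

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark installation is available.
sc <- spark_connect(master = "local")

# Toy transactions from the answer, split on whitespace into arrays of items.
df <- copy_to(
  sc,
  tibble(items = c("a b c", "a b", "c f g", "b c")),
  name = "transactions",  # hypothetical table name
  overwrite = TRUE
) %>%
  mutate(items = split(items, "\\\\s+"))

# Fit with sparklyr's defaults (min_support = 0.3, min_confidence = 0.8).
fp_growth_model <- ml_fpgrowth(df)

# Both accessors return Spark DataFrames (tbl_spark), not local data frames.
rules <- ml_association_rules(fp_growth_model)
itemsets <- ml_freq_itemsets(fp_growth_model)

# Printing triggers the Spark job and shows the first rows.
rules
```

Because the result is still a Spark table, `collect()` is needed to bring the rules into local R memory. Printing `rules` gives a table like: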

  antecedent consequent confidence  lift
  <list>     <list>          <dbl> <dbl>
1 <list [1]> <list [1]>          1  1.33
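As for where FPGmodel comes from: in Spark's API, `fit` on an `FPGrowth` estimator returns the `FPGrowthModel`, so in the question's code the object exists but is never assigned to a name. The sketch below shows both routes on a small stand-in for the claims data. The sample rows, the table name `claims_demo`, the fixed uid string, and the local connection are assumptions for illustration only; the column names and thresholds are taken from the question.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark installation; the question uses an existing `sc`.
sc <- spark_connect(master = "local")

# Hypothetical stand-in for the Hive table 'claims' from the question.
claims <- copy_to(
  sc,
  tibble(
    alt_claim_id = c(1, 1, 2, 2, 3),
    proc_desc = c("x-ray", "cast", "x-ray", "cast", "x-ray")
  ),
  name = "claims_demo",
  overwrite = TRUE
)

# Same aggregation as in the question: one array of procedures per claim.
med_agg <- claims %>%
  distinct() %>%
  group_by(alt_claim_id) %>%
  summarise(procedures = collect_list(proc_desc))

# Option 1: the native wrapper, with the question's column and thresholds.
fp_model <- ml_fpgrowth(
  med_agg,
  items_col = "procedures",
  min_confidence = 0.03,
  min_support = 0.01
)
rules <- ml_association_rules(fp_model)

# Option 2: the invoke-based code, this time keeping the result of `fit`.
FPGmodel <- invoke_new(sc, "org.apache.spark.ml.fpm.FPGrowth", "fpgrowth_demo") %>%
  invoke("setItemsCol", "procedures") %>%
  invoke("setMinConfidence", 0.03) %>%
  invoke("setMinSupport", 0.01) %>%
  invoke("fit", spark_dataframe(med_agg))

# `associationRules` returns a Spark DataFrame reference; register it as a tbl.
rules_invoke <- FPGmodel %>%
  invoke("associationRules") %>%
  sdf_register("rules_demo")
```

The first option is usually the simpler one on a sparklyr version that ships `ml_fpgrowth()`. The second mainly helps when an older version has to be kept.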

Published: 2023-11-30 17:22:43
Link: https://www.elefans.com/category/jswz/34/1650876.html