Sorting Association Rules in R
I'm trying to accomplish the goals stated below and am getting a lot of errors. I've spent a lot of time trying to sort the rules and print just the top ten. I know how to print out the entire list.

Use R to explore generating rules in larger data files. Consider the Adult data (available in R with the data(Adult) command). Generate the association rules with a confidence threshold of 0.8.

1) Print out the top 10 rules sorted by support. Consider using the inspect command along with sort and indexing into the sorted rules.
2) Print out the top 10 rules sorted by confidence.
3) Look at generating rules that are restricted to have income on the lhs of the rule. Note that income has two possible values: small and large. Consider including the appearance parameter of the apriori function. Print the first 10 rules sorted by lift.

Here is my code so far:

    library(arules)
    library(arulesViz)
    data(Adult)
    head(Adult)
    rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.8))
    top.support <- sort(rules, decreasing = TRUE, na.last = NA, by = "support")
    top.ten.support <- sort.list(top.support, partial=10)
    inspect(top.ten.support)
    top.confidence <- sort(rules, decreasing = TRUE, na.last = NA, by = "confidence")
    top.ten.confidence <- sort.list(top.support,partial=10)
    inspect(top.ten.confidence)
    rules2 <- apriori(Adult, parameter=list(supp = 0.5, conf = 0.8), appearance = income)
    top.lift <- sort(rules2, decreasing = TRUE, na.last = NA, by = "lift")
    top.ten.lift <- sort.list(top.lift, partial=10)
    inspect(top.ten.lift)

Best answer
1) Print out the top 10 rules sorted by support:

    R> top.support <- sort(rules, decreasing = TRUE, na.last = NA, by = "support")
    R> inspect(head(top.support, 10)) # or inspect(sort(top.support)[1:10])
      lhs                    rhs                            support confidence lift
    1 {}                  => {capital-loss=None}            0.9533  0.9533     1.0000
    2 {}                  => {capital-gain=None}            0.9174  0.9174     1.0000
    3 {}                  => {native-country=United-States} 0.8974  0.8974     1.0000
    4 {capital-gain=None} => {capital-loss=None}            0.8707  0.9491     0.9956
    5 {capital-loss=None} => {capital-gain=None}            0.8707  0.9133     0.9956
    ...

2) Print out the top 10 rules sorted by confidence:

    R> top.confidence <- sort(rules, decreasing = TRUE, na.last = NA, by = "confidence")
    R> inspect(head(top.confidence, 10))
      lhs                                                  rhs                 support confidence lift
    1 {hours-per-week=Full-time}                        => {capital-loss=None} 0.5607  0.9583     1.0052
    2 {workclass=Private}                               => {capital-loss=None} 0.6640  0.9565     1.0034
    3 {workclass=Private, native-country=United-States} => {capital-loss=None} 0.5897  0.9555     1.0023
    4 {capital-gain=None, hours-per-week=Full-time}     => {capital-loss=None} 0.5192  0.9551     1.0019
    5 {workclass=Private, race=White}                   => {capital-loss=None} 0.5675  0.9550     1.0018
    ...

3) Restrict the lhs to income and print the first 10 rules sorted by lift:

    R> rules2 <- apriori(Adult, parameter=list(supp = 0.1, conf = 0.8), appearance = list(lhs = c("income=small", "income=large"), default = "rhs"))
    R> top.lift <- sort(rules2, decreasing = TRUE, na.last = NA, by = "lift")
    R> inspect(head(subset(top.lift, lhs %pin% "income"), 10))
      lhs               rhs                                 support confidence lift
    1 {income=large} => {marital-status=Married-civ-spouse} 0.1370  0.8535     1.8627
    2 {income=large} => {sex=Male}                          0.1364  0.8496     1.2710
    3 {income=large} => {race=White}                        0.1457  0.9077     1.0615
    4 {income=small} => {capital-gain=None}                 0.4849  0.9581     1.0444
    5 {income=large} => {native-country=United-States}      0.1468  0.9146     1.0191
    ...
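As a sanity check on the lift column, the value for one rule can be recomputed by hand from the numbers printed in the support-sorted output. This is only a base R sketch: it uses the standard definition lift = confidence / support(rhs), the four-decimal values copied from that output, and variable names that are purely illustrative.

```r
# Values copied from the support-sorted output (rounded to 4 decimals there).
supp_loss  <- 0.9533  # support of {capital-loss=None}, from rule {} => {capital-loss=None}
supp_gain  <- 0.9174  # support of {capital-gain=None}, from rule {} => {capital-gain=None}
supp_both  <- 0.8707  # support of {capital-gain=None} => {capital-loss=None}
conf_rule  <- 0.9491  # confidence of {capital-gain=None} => {capital-loss=None}

# lift = confidence(lhs => rhs) / support(rhs)
lift_from_confidence <- conf_rule / supp_loss

# Equivalent form: support(lhs and rhs) / (support(lhs) * support(rhs))
lift_from_supports <- supp_both / (supp_gain * supp_loss)

# Both round to the 0.9956 printed for that rule.
print(round(lift_from_confidence, 4))
print(round(lift_from_supports, 4))
```

A lift of 1 means the rhs is no more likely given the lhs than it is overall, which is why every rule with an empty lhs shows a lift of 1.0000. Because the inputs are already rounded, recomputing other rows this way can differ from the printed lift in the last digit.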