What is the output of clf.tree_.feature?

Updated: 2024-10-26 04:22:11
Problem description

I have observed that scikit-learn's clf.tree_.feature occasionally returns negative values, for example -2. As far as I understand, clf.tree_.feature is supposed to return the sequential index of the feature used at each node. If we have an array of feature names ['feature_one', 'feature_two', 'feature_three'], then -2 would refer to feature_two. I am surprised by the use of a negative index. It would make more sense to refer to feature_two by index 1 (-2 is a reference convenient for human reading, not for machine processing). Am I reading this correctly?

Update: here is an example:

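import numpy as np
from sklearn.tree import DecisionTreeClassifier

def leaf_ordering():
    # Load the feature matrix and the labels from CSV files
    X = np.genfromtxt('X.csv', delimiter=',')
    Y = np.genfromtxt('Y.csv', delimiter=',')
    dt = DecisionTreeClassifier(min_samples_leaf=10, random_state=99)
    dt.fit(X, Y)
    # One entry per tree node: the index of the feature that node splits on
    print(dt.tree_.feature)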

Here are the files X and Y.

Here is the output:

[ 8 9 -2 -2 9 4 -2 9 8 -2 -2 0 0 9 9 8 -2 -2 9 -2 -2 6 -2 -2 -2 2 -2 9 8 6 9 -2 -2 -2 8 9 -2 9 6 -2 -2 -2 6 -2 -2 9 -2 6 -2 -2 2 -2 -2]

Recommended answer

By reading the Cython source code for the tree builder, we see that the -2 values are not negative indices at all. They are just dummy values stored in the feature split attribute of leaf nodes, which do not split on any feature.

Line 63:

TREE_UNDEFINED = -2

Line 359:

if is_leaf:
    # Node is not expandable; set node as leaf
    node.left_child = _TREE_LEAF
    node.right_child = _TREE_LEAF
    node.feature = _TREE_UNDEFINED
    node.threshold = _TREE_UNDEFINED
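
To see the effect of this branch on a fitted model, here is a minimal sketch. It uses the bundled iris dataset as a stand-in for the X.csv and Y.csv files, which are not available here, and it again relies on the private sklearn.tree._tree constants. The label "<leaf>" is an arbitrary placeholder chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, _tree

# Stand-in data; the original X.csv / Y.csv are not available
iris = load_iris()
dt = DecisionTreeClassifier(min_samples_leaf=10, random_state=99)
dt.fit(iris.data, iris.target)

tree = dt.tree_
# A leaf has no children: its left child id is TREE_LEAF (-1)
is_leaf = tree.children_left == _tree.TREE_LEAF
# Leaves are exactly the nodes whose feature is TREE_UNDEFINED (-2)
assert np.array_equal(is_leaf, tree.feature == _tree.TREE_UNDEFINED)

# Look up names for split nodes only; never index the name list with -2
node_names = [
    "<leaf>" if leaf else iris.feature_names[idx]
    for idx, leaf in zip(tree.feature, is_leaf)
]
print(node_names)
```

Filtering on the leaf mask before the lookup matters because Python accepts negative list indices: feature_names[-2] would silently return the second-to-last name instead of raising an error.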

Published: 2023-11-30 05:18:58
Link: https://www.elefans.com/category/jswz/34/1648857.html