I ran a clustering based on the k-medoids algorithm. My problem is the plot: the result I get uses the two components as axes:
    library(fpc)
    rez <- pamk(tab$Presence)
    plot(rez$pamobject)

(The values of the Presence column are the ones I want to assign to groups or classes; tab is my data, a sample of which is shown in this question.)

What I want is for the clusters to be shown with the datetime column on the x axis (the first column, Dat_Heure) and the values that make up the clusters on the y axis (the 13th column, Presence).
A subset of my data:

                Dat_Heure    Devtype Devidx Capt_radio Fonction Fonction_nom Spec1 Spec2 Spec3
    1 2015-09-22 00:00:08 IntelliTag      1         17        6       Alarme   -55  2423 -1085
      Spec4 Spec5      Spec6 Presence Spec8 Spec9 Spec10           timeserie
    1  -503   145 1442880008        0     0     0     NA 2015-09-22 00:00:08

Best answer
Then don't use the clusplot automagic. According to its description:

"clusplot uses the functions princomp and cmdscale. These functions are data reduction techniques. They will represent the data in a bivariate plot."

In other words, it projects your data automatically and does not preserve the original coordinate system.

It is meant for cases where you have no coordinates at all, or where the data is too high-dimensional. It is also meant to automate as much as possible, at the cost of being less customizable. You'll have to do it the long way to understand what is happening.

More precisely, it looks as if you are simply visualizing the values 0, 1, 2. Note that the first component explains 100% of the variance. So your data has one dimension with three values. Cluster analysis is a multivariate technique; for one-dimensional data, use other approaches.
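
As a rough illustration of "the long way", here is a minimal sketch that skips clusplot and draws Dat_Heure against Presence directly, coloring each point by its cluster. The data frame below is a made-up stand-in for tab (only the two relevant columns, with random values), and k = 3 is an assumption based on the three values 0, 1, 2 mentioned above; cluster::pam is used directly instead of fpc::pamk so that k is fixed. With the real data, Dat_Heure may first need converting with as.POSIXct.

```r
library(cluster)  # provides pam(), the k-medoids routine that fpc::pamk() wraps

# Hypothetical stand-in for `tab`: one reading per minute, Presence in {0, 1, 2}.
set.seed(1)
tab <- data.frame(
  Dat_Heure = as.POSIXct("2015-09-22 00:00:08", tz = "UTC") +
    seq(0, by = 60, length.out = 60),
  Presence  = sample(0:2, 60, replace = TRUE)
)

# k = 3 is an assumption (one cluster per distinct Presence value).
fit <- pam(as.numeric(tab$Presence), k = 3)
tab$cluster <- factor(fit$clustering)

# Plot in the original coordinate system: time on x, Presence on y,
# with the cluster assignment shown as the point color.
plot(tab$Dat_Heure, tab$Presence,
     col = as.integer(tab$cluster), pch = 19,
     xlab = "Dat_Heure", ylab = "Presence")
legend("topright", title = "cluster",
       legend = levels(tab$cluster),
       col = seq_along(levels(tab$cluster)), pch = 19)
```

If you keep pamk, the same vector of assignments should be available as rez$pamobject$clustering. Note that with a single variable taking three values, the colors only restate the y value, which is the point made above about one-dimensional data.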