Quick Summary of PCA:
1. Organize the data as an m x n matrix, where m is the number of measurement types and n is the number of samples
2. Subtract off the mean for each measurement type
3. Calculate the SVD of the data or the eigenvectors of the covariance matrix
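The three steps above can be sketched in a few lines of numpy. This is a minimal illustration, not a production implementation; the function name `pca` and the choice of returning components as rows are my own conventions.

```python
import numpy as np

def pca(X):
    """PCA of an m x n data matrix X (m measurement types, n samples).

    Returns the principal components (one per row) and the variance
    explained by each, following the three steps above.
    """
    # Step 2: subtract the mean of each measurement type (each row)
    X_centered = X - X.mean(axis=1, keepdims=True)
    # Step 3: SVD of the centered data; the columns of U are the
    # principal directions, and the singular values give the variances
    U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
    n = X.shape[1]
    variances = s ** 2 / (n - 1)   # eigenvalues of the covariance matrix
    return U.T, variances          # components as rows, largest variance first
```

Equivalently, one could form the covariance matrix `X_centered @ X_centered.T / (n - 1)` and call `np.linalg.eigh`; the SVD route is usually preferred for numerical stability.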
A deeper appreciation of the limits of PCA requires some consideration of the underlying assumptions and, in tandem, a more rigorous description of the source of the data. Generally speaking, the primary motivation behind this method is to decorrelate the data set, i.e. remove second-order dependencies.
In the context of dimensional reduction, one measure of success is the degree to which a reduced representation can predict the original data. In statistical terms, we must define an error function (or loss function). It can be proved that under a common loss function, mean squared error (i.e. the L2 norm), PCA provides the optimal reduced representation of the data. This means that selecting orthogonal directions for the principal components is the best solution for predicting the original data.
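This optimality claim can be checked numerically: projecting centered data onto the top principal direction yields a lower mean squared reconstruction error than projecting onto any other direction. A minimal sketch, where the data matrix and the helper `mse_of_projection` are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# a correlated 3 x 200 data set (3 measurement types, 200 samples)
A = rng.normal(size=(3, 3))
X = A @ rng.normal(size=(3, 200))
Xc = X - X.mean(axis=1, keepdims=True)

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

def mse_of_projection(directions):
    """Mean squared error after projecting onto the given orthonormal directions."""
    P = directions @ directions.T   # orthogonal projector onto the subspace
    return np.mean((Xc - P @ Xc) ** 2)

err_top = mse_of_projection(U[:, :1])    # top principal direction
err_last = mse_of_projection(U[:, 2:3])  # least-variance direction
```

Here `err_top` is the smallest error achievable by any one-dimensional projection, which is precisely the optimality statement above.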
The goal of the analysis is to decorrelate the data or, said in other terms, to remove the second-order dependencies that exist between the variables.
Multiple solutions exist for removing higher-order dependencies. For instance, if prior knowledge about the problem is available, then a nonlinearity might be applied to the data to transform it to a more appropriate naive basis.
Another direction is to impose more general statistical definitions of dependency within a data set, e.g. requiring that data along reduced dimensions be statistically independent. This class of algorithms, termed independent component analysis (ICA), has been demonstrated to succeed in many domains where PCA fails.
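The distinction ICA targets can be made concrete: two variables can be perfectly uncorrelated (no second-order dependence, so PCA sees nothing to remove) while being fully dependent. A small numpy demonstration, with the particular construction `y = x^2 - 1` chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=100_000)
y = x ** 2 - 1  # fully determined by x, yet uncorrelated with it

# Second-order dependence (correlation) is essentially zero,
# since E[x * (x^2 - 1)] = E[x^3] - E[x] = 0 for standard normal x...
corr = np.corrcoef(x, y)[0, 1]

# ...but a higher-order statistic, cov(x^2, y), exposes the dependence
# (its population value is E[x^4] - E[x^2] = 3 - 1 = 2)
higher_order = np.mean(x**2 * y) - np.mean(x**2) * np.mean(y)
```

Decorrelation alone cannot separate such variables; ICA-style criteria based on statistical independence (all orders, not just second) are needed.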