Mahout Source Code Analysis: K-Means Clustering Algorithm (3)
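The earlier analysis of the center-point file was basically correct. However, the first post (the overall walkthrough) did not explain how the center-vector file is produced, so the second post described one way to obtain it. In fact, Mahout has a built-in method that generates the center file automatically, which I missed before. To fill that gap, first write the following debug code: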
package mahout.fansy.test.kmeans;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.ToolRunner;
import org.apache.mahout.clustering.kmeans.KMeansDriver;
import org.apache.mahout.common.distance.ManhattanDistanceMeasure;

public class KmeansTest {

    /**
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        test2();
    }

    // Indirectly invoke the run method
    public static void test2() throws Exception {
        String[] arg = {
            "-fs", "fansyPC:9000",
            "-jt", "fansyPC:9001",
            "--input", "hdfs://fansyPC:9000/user/fansy/output/kmeans-in-transform/part-r-00000",
            "--output", "hdfs://fansyPC:9000/user/fansy/output/kmeans-output",
            "-dm", "org.apache.mahout.common.distance.ManhattanDistanceMeasure",
            "-c", "hdfs://fansyPC:9000/user/fansy/output/kmeans-center",
            "-k", "2",
            "-x", "4",
            "--tempDir", "hdfs://fansyPC:9000/user/fansy/output/kmeans-tmp"
        };
        ToolRunner.run(new
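The argument array above passes both `-c` (the center path) and `-k 2`. As far as I understand Mahout's option description, when `-k` is given the driver picks k input vectors at random as the initial centers and writes them to the `-c` path. The sketch below only illustrates that idea in plain Java using reservoir sampling. It is not Mahout's code: the class name `RandomSeedSketch`, the method `pickSeeds`, and the sample points are all made up for illustration, and Mahout's real implementation works on HDFS sequence files rather than in-memory lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not part of Mahout.
public class RandomSeedSketch {

    // Picks up to k points from the input with reservoir sampling, so that
    // every point has the same chance of ending up as an initial center.
    public static List<double[]> pickSeeds(Iterable<double[]> points, int k, Random rng) {
        List<double[]> seeds = new ArrayList<>(k);
        int seen = 0;
        for (double[] p : points) {
            seen++;
            if (seeds.size() < k) {
                // Fill the reservoir with the first k points.
                seeds.add(p.clone());
            } else {
                // Replace a random slot with probability k / seen.
                int j = rng.nextInt(seen);
                if (j < k) {
                    seeds.set(j, p.clone());
                }
            }
        }
        return seeds;
    }

    public static void main(String[] args) {
        // Made-up 2-D points standing in for the input vectors.
        List<double[]> data = Arrays.asList(
            new double[] {1.0, 1.0},
            new double[] {2.0, 1.0},
            new double[] {8.0, 8.0},
            new double[] {9.0, 8.0});

        // k = 2, matching the "-k", "2" arguments in the debug code.
        for (double[] center : pickSeeds(data, 2, new Random())) {
            System.out.println(Arrays.toString(center));
        }
    }
}
```

Each run prints two of the four points. Because the centers are chosen at random, different runs can start k-means from different centers, which is worth keeping in mind when comparing debug output across runs.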