This is for a K-means algorithm. It is homework, so I do not want to use a built-in KMeans function. I have two NumPy arrays: one of centroids and one of data points. I am trying to find the distance from each centroid to each data point, but I don't know how to pass the arrays to my function so that it prints the right values. I want to end up with as many arrays of distances as there are centroids. Then I can compare the distances for each point, choose the smallest, and assign that point to the corresponding cluster. Finally, I take the mean of each cluster, and those means become my new centroids.
import numpy as np

centroids = np.array([[3,44],[5,15]])
dataPoints = np.array([[2,4],[17,4],[45,2],[45,7],[16,32],[32,14],[20,56],[68,33]])

def distance(a,b):
    for x in a:  # for each point in centroids array
        for y in b:  # for each point in the dataPoints array
            print np.sqrt((a[0] - b[0])**2 + (a[1] - b[1])**2)  # print the distance

distance(centroids, dataPoints)  # call the function with the data

The output I get:
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
[ 12.04159458 41.48493703]
What am I doing that is obviously wrong here? I should end up with 2 different arrays with 8 distances each.
Recommended answer

import numpy as np

centroids = np.array([[3,44],[5,15]])
dataPoints = np.array([[2,4],[17,4],[45,2],[45,7],[16,32],[32,14],[20,56],[68,33]])

def size(vector):
    # Euclidean length of a single vector
    return np.sqrt(sum(x**2 for x in vector))

def distance(vector1, vector2):
    # Distance between two points is the length of their difference
    return size(vector1 - vector2)

def distances(array1, array2):
    # One inner list per vector in array1, holding its distance to every vector in array2
    return [[distance(vector1, vector2) for vector2 in array2] for vector1 in array1]

print(distances(centroids, dataPoints))
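In the question's loop, the loop variables x and y are never used: a[0], a[1], b[0] and b[1] index the whole arrays, so the same value is printed on all 16 iterations. As a possible alternative to the nested list comprehension, here is a minimal sketch that uses NumPy broadcasting to build the full distance table and then performs the assignment and mean-update steps described in the question. The names diffs, dists, labels and new_centroids are illustrative only, and the update step assumes every cluster receives at least one point.

```python
import numpy as np

centroids = np.array([[3, 44], [5, 15]])
dataPoints = np.array([[2, 4], [17, 4], [45, 2], [45, 7],
                       [16, 32], [32, 14], [20, 56], [68, 33]])

# Broadcasting: shape (2, 1, 2) minus shape (1, 8, 2) gives shape (2, 8, 2),
# i.e. the coordinate differences for every (centroid, data point) pair.
diffs = centroids[:, np.newaxis, :] - dataPoints[np.newaxis, :, :]

# Euclidean norm over the last axis: one row of 8 distances per centroid.
dists = np.linalg.norm(diffs, axis=2)

# For each data point (column), the index of its nearest centroid.
labels = np.argmin(dists, axis=0)

# Mean of the points assigned to each cluster.
# Assumption: no cluster is empty (an empty cluster would produce NaN here).
new_centroids = np.array([dataPoints[labels == k].mean(axis=0)
                          for k in range(len(centroids))])

print(dists)
print(labels)
print(new_centroids)
```

Here dists has shape (2, 8), which matches the "2 different arrays with 8 distances each" the question asks for. Repeating the last two steps with new_centroids in place of centroids would give one possible form of the iteration loop.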