In brief, information density is just the sum (or mean) of similarities!
Uncertainty sampling (and similar strategies) does not take the structure of the data into account, which can lead to suboptimal queries. One way to alleviate this is to use information density measures to guide the queries.
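Before turning to modAL, here is a tiny worked example of the "mean of similarities" idea on a made-up three-point dataset, using the 1/(1 + distance) similarity that appears in the implementation later in this post:

```python
import numpy as np
from sklearn.metrics.pairwise import pairwise_distances

# Toy data: two nearby points and one outlier
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])

# Turn pairwise distances into similarities, then average each row
sim = 1 / (1 + pairwise_distances(X, metric="euclidean"))
density = sim.mean(axis=1)

# The two clustered points get a higher density than the outlier
print(density)
```

Points sitting in a dense region end up with a larger mean similarity, so the density score rewards representative samples.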
from sklearn.datasets import make_blobs
from modAL.density import information_density
import matplotlib.pyplot as plt
from pathlib import Path
import numpy as np
import pandas as pd
# Generate synthetic data
X, y = make_blobs(n_features=2, n_samples=1000, centers=3, random_state=0, cluster_std=0.7)
# Alternatively, load data from a local CSV file
# data_path = Path(r"D:\OCdata")
# name = "FourBolbs"
# path_data = str(data_path.joinpath(name + ".csv"))
# data = np.array(pd.read_csv(path_data, header=None))
# X = data[:,:-1]
# y = data[:, -1]
euclidean_density = information_density(X, "euclidean")
cosine_density = information_density(X, "cosine")
plt.style.use('seaborn-white')  # on matplotlib >= 3.6 use 'seaborn-v0_8-white'
plt.figure(figsize=(14,7))
plt.subplot(1,2,1)
plt.scatter(x=X[:, 0], y=X[:, 1], c=cosine_density, cmap='viridis', s=50)
plt.title('The cosine information density')
plt.colorbar()
plt.subplot(1,2,2)
plt.scatter(x=X[:, 0], y=X[:, 1], c=euclidean_density, cmap='viridis', s=50)
plt.title('The euclidean information density')
plt.colorbar()
plt.show()
Reference: https://modal-python.readthedocs.io/en/latest/content/query_strategies/information_density.html
The underlying computation of the distances and similarities boils down to:

from sklearn.metrics.pairwise import pairwise_distances

def information_density(X, metric='euclidean'):
    similarity_mtx = 1/(1+pairwise_distances(X, X, metric=metric))
    return similarity_mtx.mean(axis=1)
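As a sketch of how the density can actually guide queries (the multiplicative weighting, the classifier, and the seed-selection here are my own illustrative choices, not part of modAL's API), one can weight each pool sample's uncertainty by its information density before picking the query:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import pairwise_distances

def information_density(X, metric="euclidean"):
    # Mean of 1/(1 + distance) similarities, as above
    return (1 / (1 + pairwise_distances(X, X, metric=metric))).mean(axis=1)

X, y = make_blobs(n_features=2, n_samples=200, centers=3,
                  random_state=0, cluster_std=0.7)

# Small labeled seed (3 samples per class); the rest is the unlabeled pool
labeled = np.concatenate([np.flatnonzero(y == c)[:3] for c in np.unique(y)])
pool = np.setdiff1d(np.arange(len(X)), labeled)
clf = LogisticRegression().fit(X[labeled], y[labeled])

# Uncertainty = 1 - max class probability (least-confident sampling)
uncertainty = 1 - clf.predict_proba(X[pool]).max(axis=1)

# Density-weighted score favors points that are uncertain AND representative
score = uncertainty * information_density(X)[pool]
query_idx = pool[np.argmax(score)]
print(query_idx)
```

Compared with plain uncertainty sampling, the density factor pulls queries away from isolated outliers and toward uncertain points that sit in dense regions of the data.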