spaCy similarity method does not work correctly

Updated: 2024-10-27 18:18:37

Problem description

I am doing simple natural language processing with spaCy. I want to filter out words by measuring the similarity between them.

I ran the following simple code, taken from the spaCy documentation, but the result does not match what the documentation shows.

import spacy
nlp = spacy.load('en_core_web_lg')
tokens = nlp('dog cat banana')

for token1 in tokens:
    for token2 in tokens:
        sim = token1.similarity(token2)
        print("{:>6s}, {:>6s}: {}".format(token1.text, token2.text, sim))

The output is as follows.

   dog,    dog: 1.0
   dog,    cat: 2.307269867164827e-21
   dog, banana: 0.0
   cat,    dog: 2.307269867164827e-21
   cat,    cat: 1.0
   cat, banana: -0.04468117654323578
banana,    dog: -7.828739256116838e+17
banana,    cat: -8.242222286053048e+17
banana, banana: 1.0

In particular, the similarity between "dog" and "cat" should be about 0.8, but the code reports an extremely small value instead (about 2.3e-21).

In addition, the similarity between "dog" and "banana" is 0.0, but the similarity between "banana" and "dog" is -7.828739256116838e+17, so the result is not even symmetric.

I don't know how to fix this. Please help.

Recommended answer

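First, install the large English model (or all models). The command below is the spaCy 1.x form; on spaCy 2.0 and later the equivalent is python3 -m spacy download followed by the model name, for example en_core_web_md.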

python3 -m spacy.en.download all

Next, try the sample code from the documentation, loading the model with:

nlp = spacy.load('en_core_web_md')

If that doesn't work, try loading this instead:

nlp = spacy.load('en')
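
Before comparing words, it may help to confirm that the loaded model actually provides a usable vector for each token, since similarity scores are only meaningful when it does. The sketch below is a possible check, not part of the original answer. It relies on the has_vector and vector_norm attributes of spaCy tokens; the helper names vector_report and missing_vectors, and the stand-in tokens, are hypothetical and exist only so the example runs without a model installed.

```python
from types import SimpleNamespace


def vector_report(tokens):
    """Return (text, has_vector, vector_norm) for each token-like object."""
    return [(t.text, bool(t.has_vector), float(t.vector_norm)) for t in tokens]


def missing_vectors(tokens):
    """Return the texts of tokens with no vector or a zero-length vector."""
    return [
        text
        for text, has_vec, norm in vector_report(tokens)
        if not has_vec or norm == 0.0
    ]


# Hypothetical stand-ins for spaCy tokens (made-up values, not real model data).
fake_tokens = [
    SimpleNamespace(text="dog", has_vector=True, vector_norm=7.0),
    SimpleNamespace(text="cat", has_vector=True, vector_norm=6.7),
    SimpleNamespace(text="banana", has_vector=False, vector_norm=0.0),
]
print(missing_vectors(fake_tokens))

# With spaCy and a model installed, the same helper can take real tokens:
#   import spacy
#   nlp = spacy.load('en_core_web_md')
#   print(vector_report(nlp('dog cat banana')))
```

If any of the three words shows up as missing a vector, the model that was loaded is probably not the one the documentation example assumes.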

After making the changes above, the result matches the documentation.

python3 /tmp/c.py
   dog,    dog: 1.000000078333395
   dog,    cat: 0.8016855098942641
   dog, banana: 0.2432764518408807
   cat,    dog: 0.8016855098942641
   cat,    cat: 1.0000001375986456
   cat, banana: 0.2815436412709355
banana,    dog: 0.2432764518408807
banana,    cat: 0.2815436412709355
banana, banana: 1.000000107068369
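
As a sanity check on what a correct result looks like: spaCy's default similarity is a cosine similarity over word vectors, so a healthy result should be symmetric (dog/cat equals cat/dog) and stay within about -1 to 1, apart from small floating-point rounding like the 1.000000078 above. The following is a minimal pure-Python sketch of cosine similarity under that assumption. The three-dimensional toy vectors are invented for illustration and are not real spaCy vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # No direction to compare; report 0.0 rather than divide by zero.
        return 0.0
    return dot / (norm_u * norm_v)


# Invented toy vectors, chosen only so that dog and cat point the same way.
toy_vectors = {
    "dog": [0.9, 0.8, 0.1],
    "cat": [0.8, 0.9, 0.2],
    "banana": [0.1, 0.2, 0.9],
}

for w1, v1 in toy_vectors.items():
    for w2, v2 in toy_vectors.items():
        sim = cosine_similarity(v1, v2)
        print("{:>6s}, {:>6s}: {:.4f}".format(w1, w2, sim))
```

Values such as -7.8e+17, or a score that changes when the two words are swapped, cannot come from a well-formed cosine computation, which points to a problem with the installed model or its vectors rather than with the loop in the question.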

Published: 2023-04-25 06:05:42

Article link: https://www.elefans.com/category/jswz/34/1079993.html