NLP Evaluation Metrics

Updated: 2024-10-25 13:17:34


1. Introduction

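BLEU (Bilingual Evaluation Understudy) measures the similarity between two sentences by the frequency of the words (n-grams) they share, which indicates how closely the two sentences agree. It is mainly used to evaluate translation quality.
Idea: the closer a machine translation is to a professional human translation, the more accurate the model is considered to be.
A higher BLEU score is taken to mean a better model.
Advantages of BLEU: it is convenient, fast to compute, and its results are a useful reference.
Disadvantages of BLEU: it does not consider grammatical correctness; the score can be distorted by common words; short translations sometimes score high; synonyms and similar expressions are not taken into account.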
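The drawbacks listed above can be seen with a deliberately simplified overlap score. The sketch below is not BLEU itself: `unigram_precision` is a hypothetical helper that only checks single-word overlap, with no clipping and no length penalty, and the sentences are made-up examples.

```python
def unigram_precision(reference, candidate):
    """Fraction of candidate tokens that also appear in the reference.

    Simplified on purpose: single words only, no clipping, no length penalty.
    """
    ref_vocab = set(reference)
    return sum(tok in ref_vocab for tok in candidate) / len(candidate)


reference = "the cat sat on the mat".split()

synonym = "the cat sat on the rug".split()    # reasonable paraphrase
scrambled = "mat the on sat cat the".split()  # same words, ungrammatical order
too_short = ["the"]                           # one common word only

print(unigram_precision(reference, synonym))    # penalized for "rug"
print(unigram_precision(reference, scrambled))  # full overlap despite bad grammar
print(unigram_precision(reference, too_short))  # full overlap despite being too short
```

Higher-order n-grams and the length penalty reduce the last two effects in real BLEU, but the synonym case stays unrewarded because only exact token matches count.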

2. Application

BLEU considers four n-gram orders (1, 2, 3, and 4), and a weight can be assigned to each order.
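As a rough illustration of how per-order weights might enter the score, the sketch below combines n-gram precisions with a weighted geometric mean, which is the usual BLEU formulation. `combine_precisions` is a hypothetical helper and the precision values are invented for the example.

```python
import math


def combine_precisions(precisions, weights=None):
    """Weighted geometric mean of per-order n-gram precisions (hypothetical helper)."""
    if weights is None:
        # Default assumption: uniform weights, e.g. 0.25 each for 1- to 4-grams
        weights = [1.0 / len(precisions)] * len(precisions)
    if len(weights) != len(precisions):
        raise ValueError("need exactly one weight per n-gram order")
    # Orders with zero weight are ignored entirely
    active = [(p, w) for p, w in zip(precisions, weights) if w > 0]
    # A zero precision on any weighted order drives the geometric mean to zero
    if any(p == 0 for p, _ in active):
        return 0.0
    return math.exp(sum(w * math.log(p) for p, w in active))


# Invented precisions for the 1-, 2-, 3-, and 4-gram orders
precisions = [0.8, 0.5, 0.5, 0.2]
print(combine_precisions(precisions))                        # uniform weights
print(combine_precisions(precisions, [0.5, 0.5, 0.0, 0.0]))  # 1- and 2-grams only
```

Putting all the weight on low orders makes the score more forgiving of word-order differences, while uniform weights reward longer exact matches.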
Steps:

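Split the candidate and the reference sentences into n-grams.
Count how often each n-gram occurs in each sentence.
Clip the counts of frequent n-grams: a candidate n-gram is counted at most as many times as it appears in a reference, so that heavy repetition cannot inflate the score.
Sum the clipped counts and divide by the total number of n-grams in the candidate; this ratio is the score.
Multiply the score by the sentence-length (brevity) penalty factor to get the final BLEU score. The penalty targets short sentences, so the model is not biased toward short output.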
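The counting, clipping, and ratio steps above can be sketched in plain Python as follows. This is a minimal sketch for a single n-gram order, not the NLTK implementation: `ngrams` and `modified_precision` are names chosen for this example, and the length penalty from the last step is left out.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(references, candidate, n):
    """Clipped n-gram precision of a candidate against one or more references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0  # candidate is shorter than n tokens

    # For each n-gram, the largest count observed in any single reference
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)

    # Clip each candidate count at the reference maximum, then take the ratio
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


references = [["the", "cat", "is", "on", "the", "mat"]]
candidate = ["the", "the", "the", "the"]

# Without clipping, all four "the" tokens would match; clipping allows only two.
print(modified_precision(references, candidate, 1))  # 0.5
```

The clipping step is what keeps a candidate made of one repeated common word from scoring perfectly on unigrams.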
Example usage:

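from nltk.translate.bleu_score import sentence_bleu

# Each reference is a list of tokens; several references may be supplied.
reference = [['this', 'is', 'a', 'test'], ['this', 'is', 'test']]
candidate = ['this', 'is', 'a', 'test']
# The candidate matches the first reference exactly, so the score is 1.0.
score = sentence_bleu(reference, candidate)
print(score)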

2.1 Computing BLEU for one sentence against a set of reference sentences: sentence-bleu

from collections import Counter
import numpy as np
from nltk.translate import bleu_score


def bp(references, candidate):
    # Brevity penalty: the sentence-length penalty factor
    # Use the reference whose length is closest to the candidate's
    ind = np.argmin([abs(len(i) - len(candidate)) for i in references])
    if len(references[ind]) < len(candidate):
        return 1
    scale =
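The listing above ends at `scale =`. As a separate illustration, and not a reconstruction of the missing lines, the sketch below uses the standard BLEU definition of the brevity penalty: 1 when the candidate is longer than the closest reference, otherwise exp(1 - r/c), where c is the candidate length and r is the closest reference length. The name `brevity_penalty` and the tie-breaking rule (prefer the shorter reference) are assumptions of this example.

```python
import math


def brevity_penalty(references, candidate):
    """Standard BLEU brevity penalty for one tokenized candidate (sketch)."""
    c = len(candidate)
    if c == 0:
        return 0.0  # an empty candidate gets no credit

    # Closest reference length; ties go to the shorter reference (assumption)
    r = min((len(ref) for ref in references),
            key=lambda ref_len: (abs(ref_len - c), ref_len))

    if c > r:
        return 1.0  # longer candidates are not penalized
    return math.exp(1 - r / c)


references = [["this", "is", "a", "test"], ["this", "is", "test"]]
print(brevity_penalty(references, ["this", "is", "a", "test"]))  # same length as a reference
print(brevity_penalty(references, ["this"]))                     # much shorter than any reference
```

With this definition the penalty is 1 when the candidate length equals the closest reference length and shrinks toward 0 as the candidate gets shorter.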


Published: 2024-02-11 18:03:34

Article link: https://www.elefans.com/category/jswz/34/1682461.html