Wordnet lemmatizer for R
Problem description
I would like to use the WordNet lemmatizer to lemmatize the words in a:
> a <- c("He saw a see-saw on a sea shore", "she is feeling cold")
> a
[1] "He saw a see-saw on a sea shore" "she is feeling cold"
I convert a into a corpus and apply pre-processing steps (such as stopword removal and lemmatization):
> a <- Corpus(VectorSource(a))
I wanted to do the lemmatization as follows:
> filter <- getTermFilter("ExactMatchFilter", a, TRUE)
> terms <- getIndexTerms("NOUN", 1, filter)
> sapply(terms, getLemma)
But I get this error:
> filter <- getTermFilter("ExactMatchFilter", a, TRUE)
Error in .jnew(paste("com.nexagis.jawbone.filter", type, sep = "."), word, :
  java.lang.NoSuchMethodError: <init>
My idea is to lemmatize the whole corpus rather than a single word. How can this be accomplished?
Recommended answer
Put your code in a loop; you can try something like this:
lapply(a, function(x) {
  # build a filter for this element, then look up matching noun index terms
  x.filter <- getTermFilter("ExactMatchFilter", x, TRUE)
  terms <- getIndexTerms("NOUN", 1, x.filter)
  sapply(terms, getLemma)
})
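The loop above still passes each whole element to getTermFilter. Since the question asks about the words inside each sentence, a word-level variant may be closer to the goal. The following is a minimal sketch, not a tested solution. It assumes the wordnet package and a WordNet dictionary are installed, and that a is still the plain character vector (before the Corpus conversion). The helper names lemmatize_word and lemmatize_docs are hypothetical, introduced only for illustration.

```r
# Sketch only: lemmatize_word and lemmatize_docs are hypothetical helpers,
# not part of the wordnet package.

lemmatize_word <- function(word, pos = "NOUN") {
  # getTermFilter() is called with one string at a time
  filter <- wordnet::getTermFilter("ExactMatchFilter", word, TRUE)
  terms <- wordnet::getIndexTerms(pos, 1, filter)
  # getIndexTerms() returns NULL when nothing matches; keep the word as-is
  if (is.null(terms)) word else wordnet::getLemma(terms[[1]])
}

lemmatize_docs <- function(docs, lookup = lemmatize_word) {
  # split each document on whitespace, then look up every word separately
  lapply(strsplit(docs, "[[:space:]]+"), function(words) {
    vapply(words, lookup, character(1), USE.NAMES = FALSE)
  })
}

a <- c("He saw a see-saw on a sea shore", "she is feeling cold")
```

With the dictionary available, lemmatize_docs(a) would return one character vector per sentence. Two caveats: getTermFilter appears to expect a single string, which is probably why passing the whole corpus raised the NoSuchMethodError; and ExactMatchFilter, as far as I can tell, only finds words that already exist as index entries for the given part of speech, so this is closer to a dictionary lookup than to full morphological lemmatization.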