Parsing text to get proper nouns (names and organizations)

Updated: 2024-10-25 12:24:13

Problem description

I am trying to extract proper nouns, such as personal names and organization names, from very short texts like SMS messages. The basic parsers available with NLTK (see "Finding Proper Nouns using NLTK WordNet") can pick out the nouns, but they fail when a proper noun does not start with a capital letter. In text like the following, a name such as sumit is not recognized as a proper noun:

>>> from nltk import pos_tag
>>> sentence = "i spoke with sumit and rajesh and Samit about the gridlock situation last night @ around 8 pm last nite"
>>> tagged_sent = pos_tag(sentence.split())
>>> print(tagged_sent)
[('i', 'PRP'), ('spoke', 'VBP'), ('with', 'IN'), ('sumit', 'NN'), ('and', 'CC'),
 ('rajesh', 'JJ'), ('and', 'CC'), ('Samit', 'NNP'), ('about', 'IN'), ('the', 'DT'),
 ('gridlock', 'NN'), ('situation', 'NN'), ('last', 'JJ'), ('night', 'NN'),
 ('@', 'IN'), ('around', 'IN'), ('8', 'CD'), ('pm', 'NN'), ('last', 'JJ'),
 ('nite', 'NN')]
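To make the failure concrete, here is a small sketch that filters the tagged output from the session above for the Penn Treebank proper-noun tags (NNP and NNPS). The tagged list is copied from the asker's reported output rather than recomputed, and the helper name proper_nouns is only illustrative.

```python
# Tagged output copied from the session above (as reported by the asker).
tagged_sent = [
    ('i', 'PRP'), ('spoke', 'VBP'), ('with', 'IN'), ('sumit', 'NN'),
    ('and', 'CC'), ('rajesh', 'JJ'), ('and', 'CC'), ('Samit', 'NNP'),
    ('about', 'IN'), ('the', 'DT'), ('gridlock', 'NN'), ('situation', 'NN'),
    ('last', 'JJ'), ('night', 'NN'), ('@', 'IN'), ('around', 'IN'),
    ('8', 'CD'), ('pm', 'NN'), ('last', 'JJ'), ('nite', 'NN'),
]

# Penn Treebank tags for singular and plural proper nouns.
PROPER_NOUN_TAGS = {'NNP', 'NNPS'}


def proper_nouns(tagged):
    """Return the words whose tag marks them as proper nouns."""
    return [word for word, tag in tagged if tag in PROPER_NOUN_TAGS]


found = proper_nouns(tagged_sent)
print(found)  # ['Samit']
```

Only the capitalized Samit survives the filter; sumit and rajesh are tagged NN and JJ, so a tag-based filter alone misses them.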

Recommended answer

You might want to have a look at python-nameparser. It also tries to guess the capitalization of names. Sorry for the incomplete answer, but I don't have much experience using python-nameparser.

Good luck!
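The idea of guessing capitalization before tagging can be sketched with the standard library alone. This is not how python-nameparser works internally; it is a swapped-in lookup approach that assumes a list of known names is available (for example, from a phone's contact list). The set KNOWN_NAMES and the function restore_name_case are hypothetical names introduced for illustration.

```python
# Hypothetical gazetteer: in practice this might come from a contact list
# or a names dataset; these three entries are only for illustration.
KNOWN_NAMES = {'sumit', 'rajesh', 'samit'}


def restore_name_case(tokens, known_names=KNOWN_NAMES):
    """Capitalize all-lowercase tokens that match a known name.

    Tokens that already contain uppercase letters, and tokens that are
    not in the name list, are returned unchanged.
    """
    return [
        tok.capitalize() if tok.islower() and tok in known_names else tok
        for tok in tokens
    ]


sentence = ("i spoke with sumit and rajesh and Samit about the gridlock "
            "situation last night @ around 8 pm last nite")
tokens = restore_name_case(sentence.split())
print(' '.join(tokens))
```

The recased tokens could then be passed to pos_tag in place of sentence.split(). Whether the tagger then labels them NNP is not guaranteed, and names missing from the list are still missed, so this is a starting point rather than a complete solution.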


Published: 2023-10-23 01:53:59

Article link: https://www.elefans.com/category/jswz/34/1519391.html