Lemmatizing an entire column with a lambda function

Updated: 2024-10-06 22:22:52

Problem description

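I have tested this code on a single sentence, and I want to convert it so that I can lemmatize an entire column in which each row consists of words without punctuation, like: deportivas calcetin hombres deportivas shoes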

    import nltk
    import pandas as pd
    from nltk.corpus import wordnet
    from nltk.stem import WordNetLemmatizer

    nltk.download('wordnet')
    # word_tokenize and pos_tag also need tokenizer and tagger data;
    # the resource names below may differ between NLTK versions.
    nltk.download('punkt')
    nltk.download('averaged_perceptron_tagger')

    df = pd.read_excel(r'C:\Test2\test.xlsx')

    # Init the Wordnet Lemmatizer
    lemmatizer = WordNetLemmatizer()
    sentence = 'FINAL_KEYWORDS'

    def get_wordnet_pos(word):
        """Map POS tag to first character lemmatize() accepts"""
        tag = nltk.pos_tag([word])[0][1][0].upper()
        tag_dict = {"J": wordnet.ADJ,
                    "N": wordnet.NOUN,
                    "V": wordnet.VERB,
                    "R": wordnet.ADV}
        return tag_dict.get(tag, wordnet.NOUN)

    # Lemmatize a Sentence with the appropriate POS tag
    sentence = "The striped bats are hanging on their feet for best"
    print([lemmatizer.lemmatize(w, get_wordnet_pos(w)) for w in nltk.word_tokenize(sentence)])
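
To make the mapping inside `get_wordnet_pos` concrete, here is a minimal pure-Python sketch of the same lookup, written without NLTK so that it runs without any downloaded data. The name `penn_to_wordnet` is hypothetical, and the single-letter values are assumed to mirror the `wordnet.ADJ`, `wordnet.NOUN`, `wordnet.VERB` and `wordnet.ADV` constants.

```python
# Sketch of the lookup inside get_wordnet_pos, without NLTK.
# Assumption: these letters mirror wordnet.ADJ / NOUN / VERB / ADV.
WORDNET_POS = {"J": "a", "N": "n", "V": "v", "R": "r"}


def penn_to_wordnet(tag):
    """Map a Penn Treebank tag (e.g. 'VBG') to a WordNet POS letter."""
    # Only the first character of the tag matters; unknown tags fall back to noun.
    return WORDNET_POS.get(tag[:1].upper(), "n")


for tag in ["JJ", "NNS", "VBG", "RB", "DT"]:
    print(tag, "->", penn_to_wordnet(tag))
```

Any tag outside the four families (for example a determiner, `DT`) falls back to noun, which matches the default used in the helper.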

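Let's suppose the column is df['keywords']. Can you help me use a lambda function to lemmatize the entire column the way I lemmatize the sentence above?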

Many thanks in advance

Solution

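Here you go:

  • Use apply to process each sentence in the column
  • Use a lambda expression that takes a sentence as input and applies the function you wrote, similar to how you used it in the print statement
  • As lemmatized keywords (a list per row):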

    # Lemmatize a Sentence with the appropriate POS tag
    df['keywords'] = df['keywords'].apply(
        lambda sentence: [lemmatizer.lemmatize(w, get_wordnet_pos(w))
                          for w in nltk.word_tokenize(sentence)])

    As a lemmatized sentence (join keywords using ' '):

    # Lemmatize a Sentence with the appropriate POS tag
    df['keywords'] = df['keywords'].apply(
        lambda sentence: ' '.join([lemmatizer.lemmatize(w, get_wordnet_pos(w))
                                   for w in nltk.word_tokenize(sentence)]))
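
One practical caveat when the column comes from `read_excel`: empty cells arrive as `NaN` (a float), and passing a float to the tokenizer will typically raise an error. The sketch below is not part of the original answer. It wraps the per-cell logic in a named function that leaves non-string cells untouched. To stay runnable without NLTK data it uses hypothetical stand-ins: `toy_lemma` replaces `lemmatizer.lemmatize(w, get_wordnet_pos(w))`, and `str.split()` replaces `nltk.word_tokenize`, which is a reasonable assumption only because the keywords contain no punctuation.

```python
def lemmatize_cell(cell, lemmatize_word):
    """Lemmatize one cell; return non-string cells (e.g. NaN) unchanged."""
    if not isinstance(cell, str):
        return cell
    # Whitespace split is an assumption that holds for punctuation-free keywords.
    return " ".join(lemmatize_word(w) for w in cell.split())


# Hypothetical stand-in for lemmatizer.lemmatize(w, get_wordnet_pos(w)).
TOY_LEMMAS = {"bats": "bat", "feet": "foot", "shoes": "shoe"}


def toy_lemma(word):
    """Look the word up in the toy table; unknown words pass through."""
    return TOY_LEMMAS.get(word, word)


# A plain list stands in for the 'keywords' column, including one empty cell.
rows = ["striped bats feet", "deportivas shoes", float("nan")]
result = [lemmatize_cell(r, toy_lemma) for r in rows]
print(result[:2])
```

With the real helpers, the same function could be plugged into the column as `df['keywords'].apply(lambda s: lemmatize_cell(s, lambda w: lemmatizer.lemmatize(w, get_wordnet_pos(w))))`.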



Published: 2023-11-28 06:17:11
Article link: https://www.elefans.com/category/jswz/34/1641260.html