根据 pandas 中的行匹配,有条件地用来自另一个DataFrame的值填充列

编程入门 行业动态 更新时间:2024-10-08 12:35:35
本文介绍了根据 pandas 中的行匹配,有条件地用来自另一个DataFrame的值填充列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述

我发现自己无法解决这个问题(自动完成税务文书工作).我有两个数据框:例如,一个具有欧元/美元汇率的季度历史记录,另一个具有我自己的发票,例如:

I find myself lost trying to solve this problem (automating tax paperwork). I have two dataframes: one with the quarterly historical records of EUR/USD exchange rates, and another with my own invoices, as an example:

import pandas as pd import numpy as np usdeur = [(pd.Timestamp('20170705'),1.1329), (pd.Timestamp('20170706'),1.1385), (pd.Timestamp('20170707'),1.1412), (pd.Timestamp('20170710'),1.1387), (pd.Timestamp('20170711'),1.1405), (pd.Timestamp('20170712'),1.1449)] labels = ['Date', 'Rate'] rates = pd.DataFrame.from_records(usdeur, columns=labels) transactions = [(pd.Timestamp('20170706'), 'PayPal', 'USD', 100, 1), (pd.Timestamp('20170706'), 'Fastspring', 'USD', 200, 1), (pd.Timestamp('20170709'), 'Fastspring', 'USD', 100, 1), (pd.Timestamp('20170710'), 'EU', 'EUR', 100, 1), (pd.Timestamp('20170710'), 'PayPal', 'USD', 200, 1)] labels = ['Date', 'From', 'Currency', 'Amount', 'Rate'] sales =pd.DataFrame.from_records(transactions, columns=labels)

导致:

我需要在sales['Rate']栏中填充来自rates['Rate']的正确汇率,即:

I would need to have the sales['Rate'] column filled with the proper exchange rates from the rates['Rate'], that is to say:

  • 如果sales['Currency']是'EUR',请不要理会.
  • 对于sales的每一行,在rates中找到具有匹配的'Date'的行;抓取非常rates['Rate']的值并将其放入sales['Rate']
  • 奖励:如果没有匹配的'Date'(例如,在假期期间,交易所市场关闭),请检查上一行直到找到合适的值.
  • if sales['Currency'] is 'EUR', leave it alone.
  • for each row of sales, find the row in rates with matching 'Date'; grab that very rates['Rate'] value and put it in sales['Rate']
  • bonus: if there's no matching 'Date' (e.g. during holidays, the exchange market is closed), check the previous row until a suitable value is found.

完整结果应如下所示(请注意,第2行的费率自2017年7月7日开始):

The full result should look like the following (note that row #2 has the rate from 2017-07-07):

我尝试遵循其他问题的一些建议解决方案,但没有运气. 提前非常感谢您

I've tried to follow several suggested solutions from other questions, but with no luck. Thank you very much in advance

推荐答案

我使用pd.merge_asof

sales=pd.merge_asof(sales,rates,on='Date',direction='backward',allow_exact_matches =True) sales.loc[sales.From=='EU','Rate_y']=sales.Rate_x sales Out[748]: Date From Currency Amount Rate_x Rate_y 0 2017-07-06 PayPal USD 100 1 1.1385 1 2017-07-06 Fastspring USD 200 1 1.1385 2 2017-07-09 Fastspring USD 100 1 1.1412 3 2017-07-10 EU EUR 100 1 1.0000 4 2017-07-10 PayPal USD 200 1 1.1387

然后

sales.drop('Rate_x',1).rename(columns={'Rate_y':'Rate'}) Out[749]: Date From Currency Amount Rate 0 2017-07-06 PayPal USD 100 1.1385 1 2017-07-06 Fastspring USD 200 1.1385 2 2017-07-09 Fastspring USD 100 1.1412 3 2017-07-10 EU EUR 100 1.0000 4 2017-07-10 PayPal USD 200 1.1387

更多推荐

根据 pandas 中的行匹配,有条件地用来自另一个DataFrame的值填充列

本文发布于:2023-11-30 07:44:02,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1649220.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:有条件   pandas   DataFrame

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!