Calculating sum of a combination of columns in pandas, row-wise, with output file with the name of said combination

Updated: 2024-06-14 17:00:14

I am looking for a way of generating a csv file for a specific combination of data from columns in a dataframe.

My data looks like this (except with 200 more rows)

+-------------------------------+-----+----------+---------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+---------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| Species                       | OGT | Domain   | A             | C            | D            | E            | F            | G            | H            | I            | K            | L             | M            | N            | P            | Q            | R            | S            | T            | V            | W            | Y            |
+-------------------------------+-----+----------+---------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+---------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| Aeropyrum pernix              | 95  | Archaea  | 9.7659115711  | 0.6720465616 | 4.3895390781 | 7.6501943794 | 2.9344881615 | 8.8666657183 | 1.5011817208 | 5.6901432494 | 4.1428307243 | 11.0604191603 | 2.21143353   | 1.9387130928 | 5.1038552753 | 1.6855017182 | 7.7664358772 | 6.266067034  | 4.2052190807 | 9.2692433532 | 1.318690698  | 3.5614200159 |
| Argobacterium fabrum          | 26  | Bacteria | 11.5698896021 | 0.7985475923 | 5.5884500155 | 5.8165463343 | 4.0512504104 | 8.2643271309 | 2.0116736244 | 5.7962804605 | 3.8931525401 | 9.9250463349  | 2.5980609708 | 2.9846761128 | 4.7828063605 | 3.1262365491 | 6.5684282943 | 5.9454781844 | 5.3740045968 | 7.3382308193 | 1.2519739683 | 2.3149400984 |
| Anaeromyxobacter dehalogenans | 27  | Bacteria | 16.0337898849 | 0.8860252895 | 5.1368827707 | 6.1864992608 | 2.9730203513 | 9.3167603253 | 1.9360386851 | 2.940143349  | 2.3473650439 | 10.898494736  | 1.6343905351 | 1.5247123262 | 6.3580285706 | 2.4715303021 | 9.2639057482 | 4.1890063803 | 4.3992339725 | 8.3885969061 | 1.2890166336 | 1.8265589289 |
| Aquifex aeolicus              | 85  | Bacteria | 5.8730327277  | 0.795341216  | 4.3287799008 | 9.6746388172 | 5.1386954322 | 6.7148035486 | 1.5438364179 | 7.3358775924 | 9.4641440609 | 10.5736658776 | 1.9263080969 | 3.6183861236 | 4.0518679067 | 2.0493569604 | 4.9229955632 | 4.7976564501 | 4.2005259246 | 7.9169763709 | 0.9292167138 | 4.1438942987 |
| Archaeoglobus fulgidus        | 83  | Archaea  | 7.8742687687  | 1.1695110027 | 4.9165979364 | 8.9548767369 | 4.568636662  | 7.2640358917 | 1.4998752909 | 7.2472039919 | 6.8957233203 | 9.4826333048  | 2.6014466253 | 3.206476915  | 3.8419576418 | 1.7789787933 | 5.7572748236 | 5.4763351139 | 4.1490633048 | 8.6330814159 | 1.0325605451 | 3.6494619148 |
+-------------------------------+-----+----------+---------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+---------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+

What I want to do is find a way of generating a csv with species, OGT, and then a combination of a few of the other columns, say A,C,E & G and the sum of the percentages of those particular values.

So output looking something like this: (these sums are just made up)

ACEG.csv

Species                          OGT    Sum of percentage
-------------------------------  -----  -------------------
Aeropyrum pernix                 95     23.4353
Anaeromyxobacter dehalogenans    26     20.3232
Argobacterium fabrum             27     14.2312
Aquifex aeolicus                 85     15.0403
Archaeoglobus fulgidus           83     34.0532

The aim is to do this for each of the 10 million combinations of the columns (A-Y), but I figure that's a simple for loop. I initially tried to achieve this in R, but on reflection, using pandas in Python is probably a better bet.

Best answer

Something like this?

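    # Copy the identifying columns so that adding the sum column
    # does not modify (or warn about) a slice of your_data.
    df = your_data[['Species', 'OGT']].copy()

    def subset_to_csv(cols):
        # cols is a string of single-letter column names, e.g. 'ACEG'
        df['Sum of percentage'] = your_data[list(cols)].sum(axis=1)  # row-wise sum
        # index=False leaves out the row index, matching the desired output
        df.to_csv(cols + '.csv', index=False)

    for c in your_list_of_combinations:
        subset_to_csv(c)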

Where cols is a string containing the columns you want to subset, e.g.: 'ABC'
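The answer leaves `your_list_of_combinations` undefined. One possible way to build it is with `itertools.combinations` from the standard library, as in the rough sketch below. The tiny sample frame (values rounded from the question, only four of the letter columns), the helper `iter_combinations`, its `min_size` parameter, and the temporary `out_dir` are all assumptions made for illustration and are not part of the original answer.

```python
import itertools
import os
import tempfile

import pandas as pd

# Small stand-in for the real table: values rounded from the question,
# and only 4 of the 20 letter columns (assumption for illustration).
your_data = pd.DataFrame({
    "Species": ["Aeropyrum pernix", "Aquifex aeolicus"],
    "OGT": [95, 85],
    "A": [9.77, 5.87],
    "C": [0.67, 0.80],
    "E": [7.65, 9.67],
    "G": [8.87, 6.71],
})

id_cols = ["Species", "OGT"]
value_cols = [c for c in your_data.columns if c not in id_cols]


def iter_combinations(columns, min_size=2):
    """Yield each combination of column names as a string such as 'ACEG'."""
    for size in range(min_size, len(columns) + 1):
        for combo in itertools.combinations(columns, size):
            yield "".join(combo)


def subset_to_csv(data, cols, out_dir):
    """Write Species, OGT and the row-wise sum of the columns named in cols."""
    out = data[id_cols].copy()
    out["Sum of percentage"] = data[list(cols)].sum(axis=1)
    path = os.path.join(out_dir, cols + ".csv")
    out.to_csv(path, index=False)
    return path


# Hypothetical output location; a real run would likely use a fixed folder.
out_dir = tempfile.mkdtemp()
written = [subset_to_csv(your_data, c, out_dir)
           for c in iter_combinations(value_cols)]
```

With all 20 letter columns, the number of non-empty subsets is 2^20 - 1 = 1,048,575, so one file per combination means roughly a million small files. If only some combination sizes are of interest, restricting the `range` in `iter_combinations` keeps the output manageable.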

Published: 2023-04-18 00:51:00
Link: https://www.elefans.com/category/dzcp/a490801b126b06de9b95641fd2386e3e.html