Group by max or min in a numpy array

Updated: 2024-10-27 04:28:11
Problem description

I have two equal-length 1D numpy arrays, id and data, where id is a sequence of repeating, ordered integers that define sub-windows on data. For example:

id    data
1     2
1     7
1     3
2     8
2     9
2     10
3     1
3     -10

I would like to aggregate data by grouping on id and taking either the max or the min of each group. In SQL, this would be a typical aggregation query such as SELECT MAX(data) FROM tablename GROUP BY id ORDER BY id. Is there a way to avoid Python loops and do this in a vectorized manner, or do I have to drop down to C?
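For reference, the loop-based approach the question hopes to avoid could look like the sketch below. It is only a baseline for comparison. The array name ids (used instead of id to avoid shadowing the Python built-in) and the helper group_reduce_loop are names made up for this illustration and are not part of the original post.

```python
import numpy as np

# Example arrays from the question; `ids` stands in for the question's `id`.
ids = np.array([1, 1, 1, 2, 2, 2, 3, 3])
data = np.array([2, 7, 3, 8, 9, 10, 1, -10])


def group_reduce_loop(ids, data, reducer):
    """Hypothetical baseline: one Python-level iteration per distinct id."""
    keys = np.unique(ids)  # sorted distinct ids, analogous to ORDER BY id
    values = np.array([reducer(data[ids == k]) for k in keys])
    return keys, values


keys, maxima = group_reduce_loop(ids, data, np.max)
_, minima = group_reduce_loop(ids, data, np.min)

print(keys.tolist())    # [1, 2, 3]
print(maxima.tolist())  # [7, 10, 1]
print(minima.tolist())  # [2, 8, -10]
```

Each iteration scans the whole array to build the mask ids == k, so the cost grows with the number of groups times the array length, which is why a vectorized alternative is attractive.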

Recommended answer

I've been seeing some very similar questions on Stack Overflow over the last few days. The following code is very similar to the implementation of numpy.unique, and because it takes advantage of the underlying numpy machinery, it will most likely be faster than anything you can do in a Python loop. The idea is to sort by group and then by value, so that the smallest element of each group comes first and the largest comes last; a boolean mask marking the borders between groups then picks those elements out.

import numpy as np

def group_min(groups, data):
    # sort with major key groups, minor key data
    order = np.lexsort((data, groups))
    groups = groups[order]  # this is only needed if groups is unsorted
    data = data[order]
    # construct an index which marks borders between groups
    index = np.empty(len(groups), 'bool')
    index[0] = True
    index[1:] = groups[1:] != groups[:-1]
    return data[index]

# max is very similar
def group_max(groups, data):
    order = np.lexsort((data, groups))
    groups = groups[order]  # this is only needed if groups is unsorted
    data = data[order]
    index = np.empty(len(groups), 'bool')
    index[-1] = True
    index[:-1] = groups[1:] != groups[:-1]
    return data[index]
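As a quick check, the sketch below applies the two functions to the example arrays from the question. The functions are restated in condensed form so the block runs on its own, and ids again stands in for the question's id. The final part is an assumption of this sketch, not part of the answer: when the ids are already sorted, as they are in the question, np.minimum.reduceat and np.maximum.reduceat should give the same results without sorting the data.

```python
import numpy as np


def group_min(groups, data):
    # Sort by group, then by value: each group's minimum comes first.
    order = np.lexsort((data, groups))
    groups, data = groups[order], data[order]
    first = np.empty(len(groups), dtype=bool)
    first[0] = True
    first[1:] = groups[1:] != groups[:-1]
    return data[first]


def group_max(groups, data):
    # Same sort: each group's maximum comes last.
    order = np.lexsort((data, groups))
    groups, data = groups[order], data[order]
    last = np.empty(len(groups), dtype=bool)
    last[-1] = True
    last[:-1] = groups[1:] != groups[:-1]
    return data[last]


ids = np.array([1, 1, 1, 2, 2, 2, 3, 3])
data = np.array([2, 7, 3, 8, 9, 10, 1, -10])

print(group_min(ids, data).tolist())  # [2, 8, -10]
print(group_max(ids, data).tolist())  # [7, 10, 1]

# Alternative for ids that are already sorted (not from the answer):
# find the start offset of each run of equal ids, then reduce each run.
starts = np.flatnonzero(np.r_[True, ids[1:] != ids[:-1]])
reduceat_min = np.minimum.reduceat(data, starts)
reduceat_max = np.maximum.reduceat(data, starts)
print(reduceat_min.tolist())  # [2, 8, -10]
print(reduceat_max.tolist())  # [7, 10, 1]
```

Both functions index the first or last element unconditionally, so they raise an IndexError on empty input; a caller that may pass empty arrays would need to guard against that. The reduceat variant relies on equal ids being contiguous, whereas the lexsort version also handles unsorted ids.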

Published: 2023-11-22 00:52:06
Article link: https://www.elefans.com/category/jswz/34/1615331.html