Optimizing Python: large arrays, memory problems

Updated: 2024-10-28 11:21:58

Problem description

I'm having a speed problem running some Python/NumPy code. I don't know how to make it faster; maybe someone else does?

Assume there is a surface with two triangulations, one fine (`..._fine`) with M points, one coarse with N points. Also, there's data on the coarse mesh at every point (N floats). I'm trying to do the following:

For every point on the fine mesh, find the k closest points on the coarse mesh and take their mean value. In short: interpolate data from coarse to fine.

My code currently looks like the snippet below. With large data (in my case M = 2e6, N = 1e4) it runs for about 25 minutes, presumably because the explicit for loop never enters NumPy. Any ideas how to solve this with smart indexing? M x N arrays would blow up the RAM.

```python
import numpy as np

# p_fine.shape => (m, 3), p.shape => (n, 3)
data_fine = np.empty((m,))
for i, ps in enumerate(p_fine):
    data_fine[i] = np.mean(data_coarse[np.argsort(np.linalg.norm(ps - p, axis=1))[:k]])
```
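For reference, here is a self-contained toy version of that loop. One modest, safe improvement over the original is `np.argpartition`, which selects the k smallest distances without fully sorting all n of them (the sizes below are illustrative, not the M = 2e6 case from the question):

```python
import numpy as np

# Toy sizes; the question uses m = 2e6, n = 1e4.
rng = np.random.default_rng(0)
m, n, k = 50, 20, 3
p_fine = rng.random((m, 3))      # fine-mesh points
p = rng.random((n, 3))           # coarse-mesh points
data_coarse = rng.random(n)      # one value per coarse point

data_fine = np.empty(m)
for i, ps in enumerate(p_fine):
    # argpartition finds the k smallest distances without a full sort.
    idx = np.argpartition(np.linalg.norm(ps - p, axis=1), k)[:k]
    data_fine[i] = data_coarse[idx].mean()
```

This keeps the per-point memory footprint at O(n) but still pays the Python-loop overhead m times, which is why chunked or tree-based approaches win at the sizes in the question.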

Cheers!

Answer

First of all, thanks for the detailed help.

First, Divakar, your solutions gave a substantial speed-up. With my data, the code ran for just under 2 minutes, depending a bit on the chunk size.

I also tried my way around it with sklearn and ended up with

```python
from sklearn.neighbors import NearestNeighbors

def sklearnSearch_v3(p, p_fine, k):
    neigh = NearestNeighbors(n_neighbors=k)
    neigh.fit(p)
    # kneighbors returns (distances, indices); [1] picks the indices.
    return data_coarse[neigh.kneighbors(p_fine)[1]].mean(axis=1)
```

which ended up being quite fast. For my data sizes, I get the following:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

m, n = 2000000, 20000
p_fine = np.random.rand(m, 3)
p = np.random.rand(n, 3)
data_coarse = np.random.rand(n)
k = 3
```

yields

```python
%timeit sklearnSearch_v3(p, p_fine, k)
1 loop, best of 3: 7.46 s per loop
```
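For comparison (this is not part of the original answer), SciPy's `cKDTree` exposes the same k-nearest-neighbors query and can serve as a drop-in alternative when sklearn is not available; a minimal sketch:

```python
import numpy as np
from scipy.spatial import cKDTree

def kdtree_interp(p, p_fine, data_coarse, k):
    """Mean of the k nearest coarse values per fine point, via a KD-tree."""
    # query returns (distances, indices), each of shape (len(p_fine), k).
    _, idx = cKDTree(p).query(p_fine, k=k)
    return data_coarse[idx].mean(axis=1)
```

Both sklearn's `NearestNeighbors` and `cKDTree` replace the O(M * N) brute-force distance scan with a tree query, which is where the speed-up over the original loop comes from.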

Published: 2023-08-05 11:15:58
Source: https://www.elefans.com/category/jswz/34/1305177.html