I'd like to know if there's a method or a Python package that lets me work with a large dataset without loading it entirely into RAM.
I'm also using pandas for statistical functions.
I need access to the entire dataset because many statistical functions need the full dataset to return credible results.
I'm using PyDev (with the Python 3.4 interpreter) on LiClipse with Windows 10.
Best answer
You could use SFrames or Dask for large-dataset support, or use pandas and read/iterate in chunks in order to minimise RAM usage. Also worth having a look at the blaze library.

Read in chunks:

    chunksize = 10 ** 6
    for chunk in pd.read_csv(filename, chunksize=chunksize):
        process(chunk)
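As a note on the "statistics need the whole dataset" concern: some statistics (counts, sums, and therefore the mean) can be accumulated across chunks without ever holding the full file in memory. Below is a minimal sketch of that pattern; the in-memory CSV and the `value` column name are placeholders standing in for a real on-disk file, and the tiny `chunksize` is only for demonstration.

```python
import io
import pandas as pd

# Hypothetical small CSV standing in for a large on-disk file.
csv_data = io.StringIO("value\n1\n2\n3\n4\n5\n6\n")

total = 0.0
count = 0
# chunksize=2 is just for illustration; for a real file you
# would use something much larger, e.g. 10 ** 6 rows per chunk.
for chunk in pd.read_csv(csv_data, chunksize=2):
    total += chunk["value"].sum()
    count += len(chunk)

mean = total / count
print(mean)  # mean of 1..6 -> 3.5
```

Statistics that cannot be decomposed this way (e.g. exact medians) are where out-of-core tools such as Dask become more useful, since they manage the chunking and aggregation for you.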