I use Python's unittest module and want to check if two complex data structures are equal. The objects can be lists of dicts with all sorts of values: numbers, strings, Python containers (lists/tuples/dicts) and numpy arrays. The latter are the reason for asking the question, because I cannot just do
    self.assertEqual(big_struct1, big_struct2)

because it produces a

    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I imagine that I need to write my own equality test for this. It should work for arbitrary structures. My current idea is a recursive function that:
- tries direct comparison of the current "node" of arg1 to the corresponding node of arg2;
- if no exception is raised, moves on ("terminal" nodes/leaves are processed here, too);
- if ValueError is caught, goes deeper until it finds a numpy.array;
- compares the arrays (e.g. like this).
What seems a little problematic is keeping track of "corresponding" nodes of two structures, but perhaps zip is all I need here.
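The recursive idea above could be sketched roughly as follows. This is only an illustration under a few assumptions: the name `deep_equal` is made up here, arrays are assumed to be numeric (so that `np.allclose` applies), and the sketch checks types up front instead of catching `ValueError` as described in the list, which keeps the pairing of corresponding nodes explicit.

```python
import numpy as np


def deep_equal(a, b):
    """Recursively compare two nested structures that may contain numpy arrays.

    Hypothetical helper, not part of unittest or numpy.
    """
    # Arrays first: comparing them with == yields an array, not a bool.
    if isinstance(a, np.ndarray) or isinstance(b, np.ndarray):
        return (isinstance(a, np.ndarray) and isinstance(b, np.ndarray)
                and a.shape == b.shape and bool(np.allclose(a, b)))
    # Dicts: same key set, then compare the values key by key.
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(deep_equal(a[k], b[k]) for k in a)
    # Lists/tuples: same type and length, then zip pairs up corresponding nodes.
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return (type(a) is type(b) and len(a) == len(b)
                and all(deep_equal(x, y) for x, y in zip(a, b)))
    # Leaves (numbers, strings, ...): plain equality is safe here.
    return a == b
```

In a test this would be used as `self.assertTrue(deep_equal(big_struct1, big_struct2))`. The cost is that a failure only reports `False`, not which node differed.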
The question is: are there good (simpler) alternatives to this approach? Maybe numpy presents some tools for this? If no alternatives are suggested, I will implement this idea (unless I have a better one) and post as an answer.
P.S. I have a vague feeling that I might have seen a question addressing this problem, but I can't find it now.
P.P.S. An alternative approach would be a function that traverses the structure and converts all numpy.arrays to lists, but is this any easier to implement? Seems the same to me.
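For comparison, the conversion approach might look like the sketch below. The function name `arrays_to_lists` is hypothetical, and the sketch assumes the containers are plain dicts, lists, and tuples (a namedtuple, for instance, would need extra handling). Note that after conversion the comparison is exact, so floating-point tolerance is lost.

```python
import numpy as np


def arrays_to_lists(obj):
    """Return a copy of obj with every numpy array replaced by a nested list.

    Hypothetical helper for illustration only.
    """
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    if isinstance(obj, dict):
        return {k: arrays_to_lists(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Rebuild the same container type from the converted items.
        return type(obj)(arrays_to_lists(v) for v in obj)
    return obj
```

The converted structures can then be passed to `self.assertEqual` directly, which also gives the usual unittest diff output on failure.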
Subclassing numpy.ndarray sounds very promising, but obviously I don't have both sides of the comparison hard-coded into a test. One of them, though, is indeed hardcoded, so I can:
- populate it with custom subclasses of numpy.array;
- change isinstance(other, SaneEqualityArray) to isinstance(other, np.ndarray) in jterrace's answer;
- always use it as LHS in comparisons.
My questions in this regard are:
Edit 2: I tried it out, the (seemingly) working implementation is shown in this answer.
Recommended answer
So the idea illustrated by jterrace seems to work for me with a slight modification:
    import numpy as np

    class SaneEqualityArray(np.ndarray):
        def __eq__(self, other):
            # Compare plain ndarray views so allclose never re-enters this __eq__.
            return (isinstance(other, np.ndarray) and
                    self.shape == other.shape and
                    np.allclose(np.asarray(self), np.asarray(other)))

Like I said, the container with these objects should be on the left side of the equality check. I create SaneEqualityArray objects from existing numpy.ndarrays like this:
    SaneEqualityArray(my_array.shape, my_array.dtype, my_array)

as per the ndarray constructor signature:

    ndarray(shape, dtype=float, buffer=None, offset=0, strides=None, order=None)

This class is defined within the test suite and serves for testing purposes only. The RHS of the equality check is an actual object returned by the tested function and contains real numpy.ndarray objects.
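Put together, a test using this class might look like the sketch below. Several things here are assumptions for illustration: `sane` and `function_under_test` are made-up names, `__ne__` is added so that `!=` mirrors `==`, and the arrays are converted with `np.asarray` before `np.allclose` as a defensive measure so the tolerance check cannot call back into the overridden `__eq__`. None of these additions come from the original class.

```python
import unittest

import numpy as np


class SaneEqualityArray(np.ndarray):
    """ndarray subclass whose == returns a single bool (test-only helper)."""

    def __eq__(self, other):
        return (isinstance(other, np.ndarray)
                and self.shape == other.shape
                and bool(np.allclose(np.asarray(self), np.asarray(other))))

    def __ne__(self, other):
        # Addition in this sketch: keep != consistent with ==.
        return not self.__eq__(other)


def sane(arr):
    """Hypothetical helper: wrap an existing (contiguous) ndarray."""
    return SaneEqualityArray(arr.shape, arr.dtype, arr)


def function_under_test():
    """Stand-in for the real code; it returns plain numpy arrays."""
    return [{"name": "run1", "values": np.array([1.0, 2.0, 3.0 + 1e-9])}]


class StructTest(unittest.TestCase):
    def test_structures_match(self):
        # The hard-coded expectation goes on the left-hand side.
        expected = [{"name": "run1", "values": sane(np.array([1.0, 2.0, 3.0]))}]
        self.assertEqual(expected, function_under_test())
```

Because the comparison goes through `np.allclose`, small floating-point differences such as the `1e-9` above are tolerated with the default tolerances.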
P.S. Thanks to the authors of both answers posted so far, they were both very helpful. If anyone sees any problems with this approach, I'd appreciate your feedback.