HashSet GetHashCode optimization

I have the following structure in C# to represent a graph edge:

    struct Edge
    {
        public Edge(int leftA, int leftB, int leftC, int leftD,
                    int rightA, int rightB, int rightC, int rightD)
        {
            LeftIdA = leftA;
            LeftIdB = leftB;
            LeftIdC = leftC;
            LeftIdD = leftD;
            RightIdA = rightA;
            RightIdB = rightB;
            RightIdC = rightC;
            RightIdD = rightD;
        }

        public readonly int LeftIdA;
        public readonly int LeftIdB;
        public readonly int LeftIdC;
        public readonly int LeftIdD;
        public readonly int RightIdA;
        public readonly int RightIdB;
        public readonly int RightIdC;
        public readonly int RightIdD;
    }

I need to store a lot of them (about 5 million) in a HashSet so that there are no duplicates. What would be a good implementation of GetHashCode that is optimized for speed?

I have tried storing the low 4 bits of each id in the returned integer, like this:

    public override int GetHashCode()
    {
        // Keep only the low 4 bits of each id.
        int A = LeftIdA & 0xF;
        int B = LeftIdB & 0xF;
        int C = LeftIdC & 0xF;
        int D = LeftIdD & 0xF;
        int E = RightIdA & 0xF;
        int F = RightIdB & 0xF;
        int G = RightIdC & 0xF;
        int H = RightIdD & 0xF;

        // Pack the eight 4-bit values into one 32-bit integer.
        int result = A;
        result = (result << 4) | B;
        result = (result << 4) | C;
        result = (result << 4) | D;
        result = (result << 4) | E;
        result = (result << 4) | F;
        result = (result << 4) | G;
        result = (result << 4) | H;
        return result;
    }

But it is about 80% slower than adding the items to a list.

Best answer

What would be a good implementation for GetHashCode so it is optimized for speed?

Since all your fields are read-only, your best bet is probably to pre-compute the hashcode in the constructor, and then just return that from GetHashCode.

To pre-compute the hashcode, you can use the formula from Guffa's answer.
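As a rough illustration of the pre-computation idea, here is a minimal sketch. Guffa's formula is not reproduced in this thread, so the multiply-by-31 combination below is only a stand-in assumption, and the `_hash` field and the `IEquatable<Edge>` implementation are additions that are not part of the original struct. The combination deliberately starts from the first id with no extra seed, so that an all-zero edge hashes to 0, the same value `default(Edge)` carries in `_hash`.

```csharp
using System;
using System.Collections.Generic;

struct Edge : IEquatable<Edge>
{
    public readonly int LeftIdA, LeftIdB, LeftIdC, LeftIdD;
    public readonly int RightIdA, RightIdB, RightIdC, RightIdD;

    // Hypothetical extra field: the hash, computed once in the constructor.
    private readonly int _hash;

    public Edge(int leftA, int leftB, int leftC, int leftD,
                int rightA, int rightB, int rightC, int rightD)
    {
        LeftIdA = leftA; LeftIdB = leftB; LeftIdC = leftC; LeftIdD = leftD;
        RightIdA = rightA; RightIdB = rightB; RightIdC = rightC; RightIdD = rightD;

        // Assumed stand-in formula (not Guffa's): multiply-and-add over all
        // eight ids. unchecked lets the arithmetic wrap instead of throwing.
        unchecked
        {
            int h = leftA;
            h = h * 31 + leftB;
            h = h * 31 + leftC;
            h = h * 31 + leftD;
            h = h * 31 + rightA;
            h = h * 31 + rightB;
            h = h * 31 + rightC;
            h = h * 31 + rightD;
            _hash = h;
        }
    }

    // Just return the cached value.
    public override int GetHashCode() { return _hash; }

    // Typed Equals lets HashSet<Edge> compare without boxing;
    // comparing the cached hash first rejects most non-equal edges early.
    public bool Equals(Edge other)
    {
        return _hash == other._hash
            && LeftIdA == other.LeftIdA && LeftIdB == other.LeftIdB
            && LeftIdC == other.LeftIdC && LeftIdD == other.LeftIdD
            && RightIdA == other.RightIdA && RightIdB == other.RightIdB
            && RightIdC == other.RightIdC && RightIdD == other.RightIdD;
    }

    public override bool Equals(object obj)
    {
        return obj is Edge && Equals((Edge)obj);
    }
}
```

The trade-off is one extra int per edge: 4 bytes each, or about 20 MB for 5 million edges. Whether the cached hash actually beats recomputing it is worth measuring on your own data.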
