I want to choose k elements uniformly at random out of a possible n without choosing the same number twice. There are two trivial approaches to this.
My question:
Is there an O(k) time, O(k) space algorithm for all k and n?
Recommended answer
With an O(1) hash table, the partial Fisher-Yates method can be made to run in O(k) time and space. The trick is simply to store only the changed elements of the array in the hash table.
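For contrast, here is a minimal sketch of the plain partial Fisher-Yates shuffle that the trick starts from. It materializes the full array of n values, so it needs O(n) time and space; only the first k positions and the slots they were swapped with ever change, and those are the entries a hash table would need to remember. The class and method names (`PartialShuffleBaseline`, `selectWithFullArray`) are made up for this illustration.

```java
import java.util.Arrays;
import java.util.Random;

class PartialShuffleBaseline {
    // Baseline only: allocates all n values, so O(n) time and space.
    static int[] selectWithFullArray(int k, int n, Random rng) {
        if (k < 0 || k > n)
            throw new IllegalArgumentException("Cannot choose " + k + " elements out of " + n + ".");
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;   // identity array 0..n-1
        for (int i = 0; i < k; i++) {
            int j = i + rng.nextInt(n - i);     // uniform index in [i, n)
            int tmp = a[i];                     // swap a[i] and a[j]
            a[i] = a[j];
            a[j] = tmp;
        }
        return Arrays.copyOf(a, k);             // the first k slots are the sample
    }
}
```

After the loop, at most 2k positions of the array differ from their initial value, which is why storing only the changed positions is enough.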
Here's a simple example in Java:
import java.util.HashMap;
import java.util.Random;

public static int[] getRandomSelection (int k, int n, Random rng) {
    if (k > n) throw new IllegalArgumentException(
        "Cannot choose " + k + " elements out of " + n + "."
    );

    // Holds only the positions whose value differs from their index.
    HashMap<Integer, Integer> hash = new HashMap<Integer, Integer>(2*k);
    int[] output = new int[k];

    for (int i = 0; i < k; i++) {
        // Pick a uniform position in [i, n) and take its current value.
        int j = i + rng.nextInt(n - i);
        output[i] = (hash.containsKey(j) ? hash.remove(j) : j);
        // Move the value at position i into position j (the "swap").
        if (j > i) hash.put(j, (hash.containsKey(i) ? hash.remove(i) : i));
    }
    return output;
}
This code allocates a HashMap of 2×k buckets to store the modified elements (which should be enough to ensure that the hash table is never rehashed), and just runs a partial Fisher-Yates shuffle on it.
Here's a quick test on Ideone; it picks two elements out of three 30,000 times, and counts the number of times each pair of elements gets chosen. For an unbiased shuffle, each ordered pair should appear approximately 5,000 (±100 or so) times, except for the impossible cases where both elements would be equal.
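The linked test is not reproduced here, but a check of that kind could look roughly like the sketch below. It repeats the answer's `getRandomSelection` so that it is self-contained; the class name `PairCountTest` and the helper `countPairs` are assumptions made for this illustration, not part of the original test.

```java
import java.util.HashMap;
import java.util.Random;

class PairCountTest {
    // Same method as in the answer, copied so this sketch compiles on its own.
    static int[] getRandomSelection(int k, int n, Random rng) {
        if (k > n) throw new IllegalArgumentException(
            "Cannot choose " + k + " elements out of " + n + ".");
        HashMap<Integer, Integer> hash = new HashMap<Integer, Integer>(2 * k);
        int[] output = new int[k];
        for (int i = 0; i < k; i++) {
            int j = i + rng.nextInt(n - i);
            output[i] = (hash.containsKey(j) ? hash.remove(j) : j);
            if (j > i) hash.put(j, (hash.containsKey(i) ? hash.remove(i) : i));
        }
        return output;
    }

    // Hypothetical helper: counts[a][b] is how often the ordered pair (a, b) was drawn.
    static int[][] countPairs(int trials, Random rng) {
        int[][] counts = new int[3][3];
        for (int t = 0; t < trials; t++) {
            int[] pick = getRandomSelection(2, 3, rng);
            counts[pick[0]][pick[1]]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[][] counts = countPairs(30000, new Random());
        for (int a = 0; a < 3; a++)
            for (int b = 0; b < 3; b++)
                System.out.println("(" + a + ", " + b + "): " + counts[a][b]);
    }
}
```

With 30,000 trials and six possible ordered pairs, each off-diagonal count should land near 5,000, and the diagonal entries (a, a) should stay at zero.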