经验估计大噢时间效率(Empirically estimating big

经验估计大噢时间效率(Empirically estimating big-oh time efficiency)

背景

我想通过基准估计图书馆中某些方法的大喔表现。我不需要精确度 - 足以证明O（1），O（logn），O（n），O（nlogn），O（n ^ 2）或更差的东西。由于big-oh表示上限，因此对于O（log logn）的某个值估计O（logn）不是问题。

现在，我正在考虑找到最适合每个大哦的数据的恒定乘数k（但是将会显示所有结果），然后选择最适合的大哦。

问题

有比这更好的方法吗？如果是，那是什么？否则，任何人都可以指出算法来估计k以获得最佳拟合，并比较每条曲线适合数据的程度如何？

注释和限制

鉴于迄今为止的意见，我需要说清楚一点：

这需要自动化。我不能“查看”数据并进行判断。我要用多种n尺寸的方法进行基准测试。对于每个大小，我将使用一个经过验证的基准框架，提供可靠的统计结果。我实际上事先知道大部分将要测试的方法的大哦。我的主要目的是为他们提供性能回归测试。代码将用Scala编写，任何可用的Java库都可以使用。

例

这是我想测量的那种东西的一个例子。我有一个这样的签名方法：

def apply(n: Int): A

Background

I'd like to estimate the big-oh performance of some methods in a library through benchmarks. I don't need precision -- it suffices to show that something is O(1), O(logn), O(n), O(nlogn), O(n^2) or worse than that. Since big-oh means upper-bound, estimating O(logn) for something that is O(log logn) is not a problem.

Right now, I'm thinking of finding the constant multiplier k that best fits data for each big-oh (but will top all results), and then choosing the big-oh with the best fit.

Questions

Are there better ways of doing it than what I'm thiking of? If so, what are they? Otherwise, can anyone point me to the algorithms to estimate k for best fitting, and comparing how well each curve fits the data?

Notes & Constraints

Given the comments so far, I need to make a few things clear:

This needs to be automated. I can't "look" at data and make a judgment call. I'm going to benchmark the methods with multiple n sizes. For each size n, I'm going to use a proven benchmark framework that provides reliable statistical results. I actually know beforehand the big-oh of most of the methods that will be tested. My main intention is to provide performance regression testing for them. The code will be written in Scala, and any free Java library can be used.

Example

Here's one example of the kind of stuff I want to measure. I have a method with this signature:

def apply(n: Int): A

Given an n, it will return the nth element of a sequence. This method can have O(1), O(logn) or O(n) given the existing implementations, and small changes can get it to use a suboptimal implementation by mistake. Or, more easily, could get some other method that depends on it to use a suboptimal version of it.

最满意答案

为了开始，你必须做出几个假设。

n与任何常数项相比都很大。您可以有效地随机输入您的输入数据您可以以足够的密度进行采样，以便掌握运行时间的分布情况

特别是（3）难以与（1）一致。所以你可能会得到一个指数最坏的情况，但从来没有遇到这种最坏的情况，因此认为你的算法要好于平均水平。

就这样说，你需要的就是任何标准曲线拟合图书馆。 Apache Commons数学有一个完全足够的数学。然后，您可以创建一个所有要测试的常用术语的函数（例如常量，日志n，n，n log n，n n，n n * n，e ^ n），或者取数据的日志并适合指数，然后如果你得到一个不接近整数的指数，那么看看抛出一个日志n是否更合适。

（更详细地说，如果您适合C*x^a和a C*x^a ，或者更容易地log C + a log x ，则可以得到指数a ;在所有常见术语中，我会得到每个术语的权重，所以如果你有n*n + C*n*log(n) ，其中C大，你也会拿起这个术语）

你会想要改变大小，以便你可以分辨不同的情况（如果你关心那些话可能很难记录日志条款），并且安全地更多的不同的大小，比你有参数（大概3x多余会开始好的，只要你做至少十几个运行总计）。

编辑：这是Scala代码，为您做这一切。而不是解释每一个小块，我会留给你调查; 它使用C * x ^ a拟合实现上述方案，并返回（（a，C），（a的下限，a的上限））。边界是相当保守的，你可以从运行的东西看几次。 C的单位是秒（ a是无单位），但不要太信任，因为有一些循环开销（也有一些噪音）。

class TimeLord[A: ClassManifest,B: ClassManifest](setup: Int => A, static: Boolean = true)(run: A => B) { @annotation.tailrec final def exceed(time: Double, size: Int, step: Int => Int = _*2, first: Int = 1): (Int,Double) = { var i = 0 val elapsed = 1e-9 * { if (static) { val a = setup(size) var b: B = null.asInstanceOf[B] val t0 = System.nanoTime var i = 0 while (i < first) { b = run(a) i += 1 } System.nanoTime - t0 } else { val starts = if (static) { val a = setup(size); Array.fill(first)(a) } else Array.fill(first)(setup(size)) val answers = new Array[B](first) val t0 = System.nanoTime var i = 0 while (i < first) { answers(i) = run(starts(i)) i += 1 } System.nanoTime - t0 } } if (time > elapsed) { val second = step(first) if (second <= first) throw new IllegalArgumentException("Iteration size increase failed: %d to %d".format(first,second)) else exceed(time, size, step, second) } else (first, elapsed) } def multibench(smallest: Int, largest: Int, time: Double, n: Int, m: Int = 1) = { if (m < 1 || n < 1 || largest < smallest || (n>1 && largest==smallest)) throw new IllegalArgumentException("Poor choice of sizes") val frac = (largest.toDouble)/smallest (0 until n).map(x => (smallest*math.pow(frac,x/((n-1).toDouble))).toInt).map{ i => val (k,dt) = exceed(time,i) if (m==1) i -> Array(dt/k) else { i -> ( (dt/k) +: (1 until m).map(_ => exceed(time,i,first=k)).map{ case (j,dt2) => dt2/j }.toArray ) } }.foldLeft(Vector[(Int,Array[Double])]()){ (acc,x) => if (acc.length==0 || acc.last._1 != x._1) acc :+ x else acc.dropRight(1) :+ (x._1, acc.last._2 ++ x._2) } } def alpha(data: Seq[(Int,Array[Double])]) = { // Use Theil-Sen estimator for calculation of straight-line fit for exponent // Assume timing relationship is t(n) = A*n^alpha val dat = data.map{ case (i,ad) => math.log(i) -> ad.map(x => math.log(i) -> math.log(x)) } val slopes = (for { i <- dat.indices j <- ((i+1) until dat.length) (pi,px) <- dat(i)._2 (qi,qx) <- dat(j)._2 } yield (qx - px)/(qi - pi)).sorted val mbest = slopes(slopes.length/2) val mp05 = slopes(slopes.length/20) val mp95 = slopes(slopes.length-(1+slopes.length/20)) val intercepts = dat.flatMap{ case (i,a) => a.map{ case (li,lx) => lx - li*mbest } }.sorted val bbest = intercepts(intercepts.length/2) ((mbest,math.exp(bbest)),(mp05,mp95)) } }

请注意，假设使用静态初始化数据并且与运行的任何数据相比， multibench方法都需要大约sqrt（2） n m *运行时间。以下是一些示例，其中参数选择需要〜15秒才能运行：

val tl1 = new TimeLord(x => List.range(0,x))(_.sum) // Should be linear // Try list sizes 100 to 10000, with each run taking at least 0.1s; // use 10 different sizes and 10 repeats of each size scala> tl1.alpha( tl1.multibench(100,10000,0.1,10,10) ) res0: ((Double, Double), (Double, Double)) = ((1.0075537890632216,7.061397125245351E-9),(0.8763463348353099,1.102663784225697)) val longList = List.range(0,100000) val tl2 = new TimeLord(x=>x)(longList.apply) // Again, should be linear scala> tl2.alpha( tl2.multibench(100,10000,0.1,10,10) ) res1: ((Double, Double), (Double, Double)) = ((1.4534378213477026,1.1325696181862922E-10),(0.969955396265306,1.8294175293676322)) // 1.45?! That's not linear. Maybe the short ones are cached? scala> tl2.alpha( tl2.multibench(9000,90000,0.1,100,1) ) res2: ((Double, Double), (Double, Double)) = ((0.9973235607566956,1.9214696731124573E-9),(0.9486294398193154,1.0365312207345019)) // Let's try some sorting val tl3 = new TimeLord(x=>Vector.fill(x)(util.Random.nextInt))(_.sorted) scala> tl3.alpha( tl3.multibench(100,10000,0.1,10,10) ) res3: ((Double, Double), (Double, Double)) = ((1.1713142886974603,3.882658025586512E-8),(1.0521099621639414,1.3392622111121666)) // Note the log(n) term comes out as a fractional power // (which will decrease as the sizes increase) // Maybe sort some arrays? // This may take longer to run because we have to recreate the (mutable) array each time val tl4 = new TimeLord(x=>Array.fill(x)(util.Random.nextInt), false)(java.util.Arrays.sort) scala> tl4.alpha( tl4.multibench(100,10000,0.1,10,10) ) res4: ((Double, Double), (Double, Double)) = ((1.1216172965292541,2.2206198821180513E-8),(1.0929414090177318,1.1543697719880128)) // Let's time something slow def kube(n: Int) = (for (i <- 1 to n; j <- 1 to n; k <- 1 to n) yield 1).sum val tl5 = new TimeLord(x=>x)(kube) scala> tl5.alpha( tl5.multibench(10,100,0.1,10,10) ) res5: ((Double, Double), (Double, Double)) = ((2.8456382116915484,1.0433534274508799E-7),(2.6416659356198617,2.999094292838751)) // Okay, we're a little short of 3; there's constant overhead on the small sizes

无论如何，对于指定的用例 - 您正在检查以确保顺序不会改变 - 这可能是足够的，因为您可以在设置测试时玩一些值，以确保他们给出明智的一些。还可以创建搜索稳定性的启发式方法，但这可能是过度的。

（顺便说一句，这里没有明确的预热步骤; Theil-Sen估计器的稳健配合应该使得没有必要的明智的大基准，这也是为什么我不使用任何其他的框架;任何统计数据，它只是失去了这个测试的力量）

再次编辑：如果将alpha方法替换为以下内容：

// We'll need this math @inline private[this] def sq(x: Double) = x*x final private[this] val inv_log_of_2 = 1/math.log(2) @inline private[this] def log2(x: Double) = math.log(x)*inv_log_of_2 import math.{log,exp,pow} // All the info you need to calculate a y value, e.g. y = x*m+b case class Yp(x: Double, m: Double, b: Double) {} // Estimators for data order // fx = transformation to apply to x-data before linear fitting // fy = transformation to apply to y-data before linear fitting // model = given x, slope, and intercept, calculate predicted y case class Estimator(fx: Double => Double, invfx: Double=> Double, fy: (Double,Double) => Double, model: Yp => Double) {} // C*n^alpha val alpha = Estimator(log, exp, (x,y) => log(y), p => p.b*pow(p.x,p.m)) // C*log(n)*n^alpha val logalpha = Estimator(log, exp, (x,y) =>log(y/log2(x)), p => p.b*log2(p.x)*pow(p.x,p.m)) // Use Theil-Sen estimator for calculation of straight-line fit case class Fit(slope: Double, const: Double, bounds: (Double,Double), fracrms: Double) {} def theilsen(data: Seq[(Int,Array[Double])], est: Estimator = alpha) = { // Use Theil-Sen estimator for calculation of straight-line fit for exponent // Assume timing relationship is t(n) = A*n^alpha val dat = data.map{ case (i,ad) => ad.map(x => est.fx(i) -> est.fy(i,x)) } val slopes = (for { i <- dat.indices j <- ((i+1) until dat.length) (pi,px) <- dat(i) (qi,qx) <- dat(j) } yield (qx - px)/(qi - pi)).sorted val mbest = slopes(slopes.length/2) val mp05 = slopes(slopes.length/20) val mp95 = slopes(slopes.length-(1+slopes.length/20)) val intercepts = dat.flatMap{ _.map{ case (li,lx) => lx - li*mbest } }.sorted val bbest = est.invfx(intercepts(intercepts.length/2)) val fracrms = math.sqrt(data.map{ case (x,ys) => ys.map(y => sq(1 - y/est.model(Yp(x,mbest,bbest)))).sum }.sum / data.map(_._2.length).sum) Fit(mbest, bbest, (mp05,mp95), fracrms) }

那么当有一个日志项也可以得到一个指数的估计 - 存在误差估计，以选择是否有正确的方法去做，但是由你决定（即我假设你最初会监督这个，然后阅读掉数字）：

val tl3 = new TimeLord(x=>Vector.fill(x)(util.Random.nextInt))(_.sorted) val timings = tl3.multibench(100,10000,0.1,10,10) // Regular n^alpha fit scala> tl3.theilsen( timings ) res20: tl3.Fit = Fit(1.1811648421030059,3.353753446942075E-8,(1.1100382697696545,1.3204652930525234),0.05927994882343982) // log(n)*n^alpha fit--note first value is closer to an integer // and last value (error) is smaller scala> tl3.theilsen( timings, tl3.logalpha ) res21: tl3.Fit = Fit(1.0369167329732445,9.211366397621766E-9,(0.9722967182484441,1.129869067913768),0.04026308919615681)

（编辑：固定RMS计算，这实际上是平均值，加上证明你只需要做一次定时，然后可以尝试两者）。

In order to get started, you have to make a couple of assumptions.

n is large compared to any constant terms. You can effectively randomize your input data You can sample with sufficient density to get a good handle on the distribution of runtimes

In particular, (3) is difficult to achieve in concert with (1). So you may get something with an exponential worst case, but never run into that worst case, and thus think your algorithm is much better than it is on average.

With that said, all you need is any standard curve fitting library. Apache Commons Math has a fully adequate one. You then either create a function with all the common terms that you want to test (e.g. constant, log n, n, n log n, nn, nn*n, e^n), or you take the log of your data and fit the exponent, and then if you get an exponent not close to an integer, see if throwing in a log n gives a better fit.

(In more detail, if you fit C*x^a for C and a, or more easily log C + a log x, you can get the exponent a; in the all-common-terms-at-once scheme, you'll get weights for each term, so if you have n*n + C*n*log(n) where C is large, you'll pick up that term also.)

You'll want to vary the size by enough so that you can tell the different cases apart (might be hard with log terms, if you care about those), and safely more different sizes than you have parameters (probably 3x excess would start being okay, as long as you do at least a dozen or so runs total).

Edit: Here is Scala code that does all this for you. Rather than explain each little piece, I'll leave it to you to investigate; it implements the scheme above using the C*x^a fit, and returns ((a,C),(lower bound for a, upper bound for a)). The bounds are quite conservative, as you can see from running the thing a few times. The units of C are seconds (a is unitless), but don't trust that too much as there is some looping overhead (and also some noise).

class TimeLord[A: ClassManifest,B: ClassManifest](setup: Int => A, static: Boolean = true)(run: A => B) { @annotation.tailrec final def exceed(time: Double, size: Int, step: Int => Int = _*2, first: Int = 1): (Int,Double) = { var i = 0 val elapsed = 1e-9 * { if (static) { val a = setup(size) var b: B = null.asInstanceOf[B] val t0 = System.nanoTime var i = 0 while (i < first) { b = run(a) i += 1 } System.nanoTime - t0 } else { val starts = if (static) { val a = setup(size); Array.fill(first)(a) } else Array.fill(first)(setup(size)) val answers = new Array[B](first) val t0 = System.nanoTime var i = 0 while (i < first) { answers(i) = run(starts(i)) i += 1 } System.nanoTime - t0 } } if (time > elapsed) { val second = step(first) if (second <= first) throw new IllegalArgumentException("Iteration size increase failed: %d to %d".format(first,second)) else exceed(time, size, step, second) } else (first, elapsed) } def multibench(smallest: Int, largest: Int, time: Double, n: Int, m: Int = 1) = { if (m < 1 || n < 1 || largest < smallest || (n>1 && largest==smallest)) throw new IllegalArgumentException("Poor choice of sizes") val frac = (largest.toDouble)/smallest (0 until n).map(x => (smallest*math.pow(frac,x/((n-1).toDouble))).toInt).map{ i => val (k,dt) = exceed(time,i) if (m==1) i -> Array(dt/k) else { i -> ( (dt/k) +: (1 until m).map(_ => exceed(time,i,first=k)).map{ case (j,dt2) => dt2/j }.toArray ) } }.foldLeft(Vector[(Int,Array[Double])]()){ (acc,x) => if (acc.length==0 || acc.last._1 != x._1) acc :+ x else acc.dropRight(1) :+ (x._1, acc.last._2 ++ x._2) } } def alpha(data: Seq[(Int,Array[Double])]) = { // Use Theil-Sen estimator for calculation of straight-line fit for exponent // Assume timing relationship is t(n) = A*n^alpha val dat = data.map{ case (i,ad) => math.log(i) -> ad.map(x => math.log(i) -> math.log(x)) } val slopes = (for { i <- dat.indices j <- ((i+1) until dat.length) (pi,px) <- dat(i)._2 (qi,qx) <- dat(j)._2 } yield (qx - px)/(qi - pi)).sorted val mbest = slopes(slopes.length/2) val mp05 = slopes(slopes.length/20) val mp95 = slopes(slopes.length-(1+slopes.length/20)) val intercepts = dat.flatMap{ case (i,a) => a.map{ case (li,lx) => lx - li*mbest } }.sorted val bbest = intercepts(intercepts.length/2) ((mbest,math.exp(bbest)),(mp05,mp95)) } }

Note that the multibench method is expected to take about sqrt(2)nm*time to run, assuming that static initialization data is used and is relatively cheap compared to whatever you're running. Here are some examples with parameters chosen to take ~15s to run:

val tl1 = new TimeLord(x => List.range(0,x))(_.sum) // Should be linear // Try list sizes 100 to 10000, with each run taking at least 0.1s; // use 10 different sizes and 10 repeats of each size scala> tl1.alpha( tl1.multibench(100,10000,0.1,10,10) ) res0: ((Double, Double), (Double, Double)) = ((1.0075537890632216,7.061397125245351E-9),(0.8763463348353099,1.102663784225697)) val longList = List.range(0,100000) val tl2 = new TimeLord(x=>x)(longList.apply) // Again, should be linear scala> tl2.alpha( tl2.multibench(100,10000,0.1,10,10) ) res1: ((Double, Double), (Double, Double)) = ((1.4534378213477026,1.1325696181862922E-10),(0.969955396265306,1.8294175293676322)) // 1.45?! That's not linear. Maybe the short ones are cached? scala> tl2.alpha( tl2.multibench(9000,90000,0.1,100,1) ) res2: ((Double, Double), (Double, Double)) = ((0.9973235607566956,1.9214696731124573E-9),(0.9486294398193154,1.0365312207345019)) // Let's try some sorting val tl3 = new TimeLord(x=>Vector.fill(x)(util.Random.nextInt))(_.sorted) scala> tl3.alpha( tl3.multibench(100,10000,0.1,10,10) ) res3: ((Double, Double), (Double, Double)) = ((1.1713142886974603,3.882658025586512E-8),(1.0521099621639414,1.3392622111121666)) // Note the log(n) term comes out as a fractional power // (which will decrease as the sizes increase) // Maybe sort some arrays? // This may take longer to run because we have to recreate the (mutable) array each time val tl4 = new TimeLord(x=>Array.fill(x)(util.Random.nextInt), false)(java.util.Arrays.sort) scala> tl4.alpha( tl4.multibench(100,10000,0.1,10,10) ) res4: ((Double, Double), (Double, Double)) = ((1.1216172965292541,2.2206198821180513E-8),(1.0929414090177318,1.1543697719880128)) // Let's time something slow def kube(n: Int) = (for (i <- 1 to n; j <- 1 to n; k <- 1 to n) yield 1).sum val tl5 = new TimeLord(x=>x)(kube) scala> tl5.alpha( tl5.multibench(10,100,0.1,10,10) ) res5: ((Double, Double), (Double, Double)) = ((2.8456382116915484,1.0433534274508799E-7),(2.6416659356198617,2.999094292838751)) // Okay, we're a little short of 3; there's constant overhead on the small sizes

Anyway, for the stated use case--where you are checking to make sure the order doesn't change--this is probably adequate, since you can play with the values a bit when setting up the test to make sure they give something sensible. One could also create heuristics that search for stability, but that's probably overkill.

(Incidentally, there is no explicit warmup step here; the robust fitting of the Theil-Sen estimator should make it unnecessary for sensibly large benchmarks. This also is why I don't use any other benching framework; any statistics that it does just loses power from this test.)

Edit again: if you replace the alpha method with the following:

// We'll need this math @inline private[this] def sq(x: Double) = x*x final private[this] val inv_log_of_2 = 1/math.log(2) @inline private[this] def log2(x: Double) = math.log(x)*inv_log_of_2 import math.{log,exp,pow} // All the info you need to calculate a y value, e.g. y = x*m+b case class Yp(x: Double, m: Double, b: Double) {} // Estimators for data order // fx = transformation to apply to x-data before linear fitting // fy = transformation to apply to y-data before linear fitting // model = given x, slope, and intercept, calculate predicted y case class Estimator(fx: Double => Double, invfx: Double=> Double, fy: (Double,Double) => Double, model: Yp => Double) {} // C*n^alpha val alpha = Estimator(log, exp, (x,y) => log(y), p => p.b*pow(p.x,p.m)) // C*log(n)*n^alpha val logalpha = Estimator(log, exp, (x,y) =>log(y/log2(x)), p => p.b*log2(p.x)*pow(p.x,p.m)) // Use Theil-Sen estimator for calculation of straight-line fit case class Fit(slope: Double, const: Double, bounds: (Double,Double), fracrms: Double) {} def theilsen(data: Seq[(Int,Array[Double])], est: Estimator = alpha) = { // Use Theil-Sen estimator for calculation of straight-line fit for exponent // Assume timing relationship is t(n) = A*n^alpha val dat = data.map{ case (i,ad) => ad.map(x => est.fx(i) -> est.fy(i,x)) } val slopes = (for { i <- dat.indices j <- ((i+1) until dat.length) (pi,px) <- dat(i) (qi,qx) <- dat(j) } yield (qx - px)/(qi - pi)).sorted val mbest = slopes(slopes.length/2) val mp05 = slopes(slopes.length/20) val mp95 = slopes(slopes.length-(1+slopes.length/20)) val intercepts = dat.flatMap{ _.map{ case (li,lx) => lx - li*mbest } }.sorted val bbest = est.invfx(intercepts(intercepts.length/2)) val fracrms = math.sqrt(data.map{ case (x,ys) => ys.map(y => sq(1 - y/est.model(Yp(x,mbest,bbest)))).sum }.sum / data.map(_._2.length).sum) Fit(mbest, bbest, (mp05,mp95), fracrms) }

then you can get an estimate of the exponent when there's a log term also--error estimates exist to pick whether the log term or not is the correct way to go, but it's up to you to make the call (i.e. I'm assuming you'll be supervising this initially and reading the numbers that come off):

val tl3 = new TimeLord(x=>Vector.fill(x)(util.Random.nextInt))(_.sorted) val timings = tl3.multibench(100,10000,0.1,10,10) // Regular n^alpha fit scala> tl3.theilsen( timings ) res20: tl3.Fit = Fit(1.1811648421030059,3.353753446942075E-8,(1.1100382697696545,1.3204652930525234),0.05927994882343982) // log(n)*n^alpha fit--note first value is closer to an integer // and last value (error) is smaller scala> tl3.theilsen( timings, tl3.logalpha ) res21: tl3.Fit = Fit(1.0369167329732445,9.211366397621766E-9,(0.9722967182484441,1.129869067913768),0.04026308919615681)

(Edit: fixed the RMS computation so it's actually the mean, plus demonstrated that you only need to do timings once and can then try both fits.)

更多推荐