Given a discrete distribution, how do I round a number to the closest value in that distribution?

Updated: 2024-10-13 06:13:42

Problem description

What I ultimately want to do is round the expected value of a discrete random variable's distribution to a valid value in that distribution. For example, if I am drawing uniformly from the numbers [1, 5, 6], the expected value is 4, but I want to return the closest valid number to that (i.e., 5).

```python
import numpy as np
from scipy.stats import rv_discrete

xk = (1, 5, 6)
pk = np.ones(len(xk)) / len(xk)
custom = rv_discrete(name='custom', values=(xk, pk))
print(custom.expect())  # 4.0

def round_discrete(discrete_rv_dist, val):
    # do something here
    return answer

print(round_discrete(custom, custom.expect()))  # desired output: 5.0
```

I don't know a priori what distribution will be used (i.e., the values might not be integers, and the distribution might be unbounded), so I'm really struggling to think of an algorithm that is sufficiently generic. I just learned that rv_discrete doesn't work on non-integer xk values.

As to why I want to do this: I'm putting together a Monte Carlo simulation and want a "nominal" value for each distribution. I think the expected value (EV) is the most physically appropriate choice, rather than the mode or median. Some values in the downstream simulation may have to be one of several discrete choices, so passing a value that is not within that set is not acceptable.

If there's already a nice way to do this in Python, that would be great; otherwise I can translate the math into code.

Recommended answer

Figured it out and tested that it works. If I plug my value X into the cdf, I can then plug the resulting probability P = cdf(X) into the ppf. The values at ppf(P ± epsilon) give me the closest values in the set on either side of X.

Or, more geometrically: for a discrete pmf, the point (X, P) lies on a horizontal portion of the corresponding cdf. When you invert the cdf, (P, X) lies on a vertical section of the ppf. Taking P ± eps gives you the two nearest flat portions of the ppf connected to that vertical jump, which correspond to the valid values X1 and X2. You can then take a simple difference to figure out which is closer to your target value.
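To make the geometric picture concrete, here is a small sketch using the [1, 5, 6] example from the question. The offset `delta = 1e-9` is an illustrative choice rather than anything from the original post. It is deliberately much larger than machine epsilon so the step is easy to see.

```python
import numpy as np
from scipy.stats import rv_discrete

# Same uniform distribution over (1, 5, 6) as in the question.
xk = (1, 5, 6)
pk = np.ones(len(xk)) / len(xk)
custom = rv_discrete(name="custom", values=(xk, pk))

# Every x between 1 (inclusive) and 5 (exclusive) sits on the same flat cdf step.
flat = custom.cdf([1, 2, 3, 4, 4.9])

# Nudging the probability slightly down or up lands on the two support
# values that border the step.
delta = 1e-9  # hypothetical offset, chosen only for illustration
p = custom.cdf(4)
below = custom.ppf(p - delta)
above = custom.ppf(p + delta)
print(flat, below, above)
```

Here `below` and `above` are the two candidates X1 and X2 described above.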

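```python
import numpy as np

eps = np.finfo(float).eps  # machine epsilon for float64

# `custom` is the rv_discrete instance defined in the question.
ev = custom.expect()
p = custom.cdf(ev)
# Probe the ppf just below, at, and just above p to reach both neighbours of ev.
ev_candidates = custom.ppf([p - eps, p, p + eps])
ev_candidates_distance = np.abs(ev_candidates - ev)
# nanargmin skips the nan that ppf returns when p + eps exceeds 1.
ev_closest = ev_candidates[np.nanargmin(ev_candidates_distance)]
print(ev_closest)  # 5.0
```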
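The snippet above could be wrapped into the `round_discrete` function that the question asks for. The following is a minimal sketch of that, not a definitive implementation. The `eps` default and the decision to drop probes outside (0, 1] are assumptions added here. Probes at or below 0 are dropped because SciPy's discrete `ppf(0)` returns the lower bound minus one, which is not a valid value. Probes above 1 are dropped because `ppf` returns nan there.

```python
import numpy as np
from scipy.stats import rv_discrete


def round_discrete(dist, val, eps=np.finfo(float).eps):
    """Return the support value of `dist` closest to `val`."""
    p = float(dist.cdf(val))
    probes = np.array([p - eps, p, p + eps])
    # Keep only probabilities for which ppf yields a genuine support value.
    probes = probes[(probes > 0.0) & (probes <= 1.0)]
    candidates = np.asarray(dist.ppf(probes), dtype=float)
    # An unbounded distribution may report an infinite upper bound at ppf(1).
    candidates = candidates[np.isfinite(candidates)]
    distances = np.abs(candidates - val)
    return float(candidates[np.argmin(distances)])


xk = (1, 5, 6)
pk = np.ones(len(xk)) / len(xk)
custom = rv_discrete(name="custom", values=(xk, pk))
print(round_discrete(custom, custom.expect()))
```

In principle the same function accepts any frozen SciPy discrete distribution. That only holds if the distribution's `ppf` can resolve a change of size `eps` in its argument, which is worth checking for each distribution rather than assuming.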

Terms:
pmf - probability mass function
cdf - cumulative distribution function (cumulative sum of the pmf)
ppf - percent point function (inverse of the cdf)
eps - machine epsilon (the gap between 1.0 and the next representable float)
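When the support is finite and known explicitly, as in the [1, 5, 6] example, a simpler alternative to the cdf/ppf approach is to compare the target against every support value directly. This is a different technique from the one in the answer, sketched here with plain NumPy. The function name `nearest_support_value` and the non-integer example values are hypothetical. It does not depend on rv_discrete, so non-integer values are not a problem, but it does not cover unbounded distributions.

```python
import numpy as np


def nearest_support_value(xk, pk, val=None):
    """Return the entry of xk closest to val (default: the expected value)."""
    xk = np.asarray(xk, dtype=float)
    pk = np.asarray(pk, dtype=float)
    if val is None:
        # Expected value of the discrete distribution: sum of x * P(x).
        val = np.sum(xk * pk)
    return float(xk[np.argmin(np.abs(xk - val))])


# The example from the question: EV is 4, nearest valid value is 5.
print(nearest_support_value([1, 5, 6], [1 / 3, 1 / 3, 1 / 3]))

# A made-up non-integer support: EV is 2.475, nearest valid value is 1.25.
print(nearest_support_value([0.5, 1.25, 4.0], [0.2, 0.3, 0.5]))
```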

Published: 2023-11-07 23:31:48
Article link: https://www.elefans.com/category/jswz/34/1567732.html