有向概率图

编程入门行业动态更新时间:2024-10-11 21:20:26

本文介绍了有向概率图-减少周期的算法?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！问题描述

考虑一个有向图，该图从第一个节点 1 遍历到某些最终节点(没有更多的输出边).图中的每个边都有与之相关的概率.汇总将所有可能路径通向所有可能的最终节点的概率返回 1 .(这意味着，我们可以保证最终到达最后一个节点之一.)

Consider a directed graph which is traversed from first node 1 to some final nodes (which have no more outgoing edges). Each edge in the graph has a probability associated with it. Summing up the probabilities to take each possible path towards all possible final nodes returns 1. (Which means, we are guaranteed to arrive at one of the final nodes eventually.)

如果图形中的循环不存在，问题将很简单.不幸的是，图中可能出现相当复杂的循环，可以遍历无数次(显然，每次循环遍历的概率都会成倍降低).

The problem would be simple if loops in the graph would not exist. Unfortunately rather convoluted loops can arise in the graph, which can be traversed an infinite amount of times (probability decreases multiplicatively with each loop traversal, obviously).

是否存在一种通用算法来找到到达每个最终节点的概率?

Is there a general algorithm to find the probabilities to arrive at each of the final nodes?

一个特别讨厌的例子:

我们可以将边缘表示为矩阵(从行(节点) x 到行(节点) y 的概率在条目(x，y))

We can represent the edges as a matrix (probability to go from row (node) x to row (node) y is in the entry (x,y))

{{0, 1/2, 0, 1/14, 1/14, 0, 5/14}, {0, 0, 1/9, 1/2, 0, 7/18, 0}, {1/8, 7/16, 0, 3/16, 1/8, 0, 1/8}, {0, 1, 0, 0, 0, 0, 0}, {0, 0, 0, 0, 0, 0, 0}, {0, 0, 0, 0, 0, 0, 0}, {0, 0, 0, 0, 0, 0, 0}}

或作为有向图:

起始节点 1 为蓝色，最终节点 5,6,7 为绿色.所有边缘均从从其起源的节点开始时遍历它们的可能性进行标记.

The starting node 1 is blue, the final nodes 5,6,7 are green. All edges are labelled by the probability to traverse them when starting from the node where they originate.

从起始节点 1 到最终节点，这具有八种不同的路径:

This has eight different paths from starting node 1 to the final nodes:

{{1/14, {1, 5}}, {5/14, {1, 7}}, {7/36, {1, 2, 6}}, {1/144, {1, 2, 3, 5}}, {1/144, {1, 2, 3, 7}}, {1/36, {1, 4, 2, 6}}, {1/1008, {1, 4, 2, 3, 5}}, {1/1008, {1, 4, 2, 3, 7}}}

(每个路径的表示法是{概率，访问的节点的顺序})

(The notation for each path is {probability,sequence of nodes visited})

有五个不同的循环:

{{1/144, {2, 3, 1}}, {7/144, {3, 2}}, {1/2, {4, 2}}, {1/48, {3, 4, 2}}, {1/1008, {4, 2, 3, 1}}}

(循环符号为{遍历循环的概率，访问的节点的顺序}.

(Notation for loops is {probability to traverse loop once,sequence of nodes visited}).

如果仅可以解决这些循环以获得有效的树状图，就可以解决问题.

If only these cycles could be resolved to obtain an effectively tree like graph, the problem would be solved.

有关如何解决此问题的任何提示?

Any hint on how to tackle this?

我熟悉Java，C ++和C，因此首选使用这些语言的建议.

I'm familiar with Java, C++ and C, so suggestions in these languages are preferred.

推荐答案

我不是马尔可夫链领域的专家，尽管我认为算法因您所遇到的问题而闻名，但我找不到它们.

I'm not expert in the area of Markov chains, and although I think it's likely that algorithms are known for the kind of problem you present, I'm having difficulty finding them.

如果从那个方向没有帮助，那么您可以考虑自己动手.我在这里至少看到两种不同的方法:

If no help comes from that direction, then you can consider rolling your own. I see at least two different approaches here:

模拟.

通过从处于状态1的系统开始，以100％的概率开始，并执行许多迭代，其中您应用转换概率来计算采取步骤后获得的状态的概率，以检查系统的状态随时间的变化.如果可以从每个节点(以非零概率)到达至少一个最终(吸收")节点，那么经过足够的步骤，系统处于非最终状态的概率将渐近地朝零减小.您可以将系统在最终状态S下结束的概率估计为系统在 n 步骤后处于状态S的概率，该估计误差的上限由系统的概率给出在 n 步骤之后处于非最终状态.

Examine how the state of the system evolves over time by starting with the system in state 1 at 100% probability, and performing many iterations in which you apply your transition probabilities to compute the probabilities of the state obtained after taking a step. If at least one final ("absorbing") node can be reached (at non-zero probability) from every node, then over enough steps, the probability that the system is in anything other than a final state will decrease asymptotically toward zero. You can estimate the probability that the system ends in final state S as the probability that it is in state S after n steps, with an upper bound on the error in that estimate given by the probability that the system is in a non-final state after n steps.

实际上，这与计算 Tr n 相同，其中 Tr 是您的过渡概率矩阵，在所有最终状态下均以100％的概率增加自边缘.

As a practical matter, this is the same is computing Trn, where Tr is your transition probability matrix, augmented with self-edges at 100% probability for all the final states.

精确计算.

请考虑您所描述的图形G.给定两个顶点 i 和 f ，因此从 i 到 f 至少有一条路径，而 f 除了自身边缘外没有其他输出边缘，我们可以将从 i 到 f 的路径划分为以其重访次数为特征的类 i 到达 f 之前.可能有无数这样的类，我将指定 C if ( n )，其中 n 表示如果 C if ( n )重新访问节点的路径的次数 i .特别是 C ii (em)(0)包含G中所有包含 i (说明:以及其他路径).

Consider a graph, G, such as you describe. Given two vertices i and f, such that there is at least one path from i to f, and f has no outgoing edges other than self-edges, we can partition the paths from i to f into classes characterized by the number of times they revisit i prior to reaching f. There may be an infinite number of such classes, which I will designate Cif(n), where n represents the number of times the paths in Cif(n) revisit node i. In particular, Cii(0) contains all the simple loops in G that contain i (clarification: as well as other paths).

假设系统从节点 i 开始遍历图G，则终止于节点 f 的总概率为

The total probability of ending at node f given that the system traverses graph G starting at node i is given by

Pr( f | i ，G)= Pr( C if (0)| G)+ Pr( C 如果 (1)| G)+ Pr( C if (2)| G)...

Pr(f|i, G) = Pr(Cif(0)|G) + Pr(Cif(1)|G) + Pr(Cif(2)|G) ...

现在请注意，如果 n > 0，则 C if ( n )具有两个路径 c 和 t 的并集形式，其中 c 属于 C ii ( n -1)和 t 属于 C if (0).也就是说， c 是一条路径，该路径始于节点 i ，终止于节点 i ，经过 i n -1倍，而 t 是从 i 到 f 的一条路径，该路径不通过我我们可以用它来重写我们的概率公式:

Now observe that if n > 0 then each path in Cif(n) has the form of a union of two paths c and t, where c belongs to Cii(n-1) and t belongs to Cif(0). That is, c is a path that starts at node i and ends at node i, passing through i n-1 times between, and t is a path from i to f that does not pass through i again. We can use that to rewrite our probability formula:

但是请注意， C ii ( n )中的每个路径都是 n的组成属于 C ii (0)的 +1条路径.由此得出Pr( C ii ( n )| G)= Pr( C ii (0)| G) n +1 ，这样我们得到

But note that every path in Cii(n) is a composition of n+1 paths belonging to Cii(0). It follows that Pr(Cii(n)|G) = Pr(Cii(0)|G)n+1, so we get

现在，一些代数给了我们

And now, a little algebra gives us

Pr( f | i ，G)-Pr( C if (0)| G)= Pr( C ii (0)| G)* Pr( f | i ，G)

Pr(f|i,G) - Pr(Cif(0)|G) = Pr(Cii(0)|G) * Pr(f|i,G)

，我们可以解决Pr( f | i ，G)以获得

, which we can solve for Pr(f|i,G) to get

Pr( f | i ，G)= Pr( C if (0)| G)/(1-Pr( C ii (0)| G))

Pr(f|i,G) = Pr(Cif(0)|G) / (1 - Pr(Cii(0)|G))

因此，就不返回起始节点的路径(可能作为其结束节点)，我们将问题简化为一个.这些并不排除具有不包含起始节点的循环的路径，但是我们仍然可以根据在原始图的子图上计算的原始问题的多个实例来重写此问题.

We've thus reduced the problem to one in terms of paths that do not return to the starting node, except possibly as their end node. These do not preclude paths that have loops that don't include the starting node, but we can we nevertheless rewrite this problem in terms of several instances of the original problem, computed on a subgraph of the original graph.

尤其是，让 S ( i ，G)为图G中顶点 i 的后继者集合-即，一组顶点 s ，使得G中从 i 到 s 有一条边，并令X(G， i )是G的子图，该子图是通过删除以 i 开头的所有边而形成的.此外，令 p is 是与G中的边( i ， s )相关的概率.

In particular, let S(i, G) be the set of successors of vertex i in graph G -- that is, the set of vertices s such that there is an edge from i to s in G, and let X(G,i) be the subgraph of G formed by removing all edges that start at i. Furthermore, let pis be the probability associated with edge (i, s) in G.

Pr( C if (0)| G)= S中 s 的总和( i ，G)的 p is * Pr( f | s ，X(G， i ))

Pr(Cif(0)|G) = Sum over s in S(i, G) of pis * Pr(f|s,X(G,i))

换句话说，从 i 到G到达 f 且在两次之间不重新访问 i 的概率是所有<一步从 i 达到 s 的概率与< f 的概率 i em> s 到G，而不会遍历 i 出站的任何边.这适用于G中的所有 f ，包括 i .

In other words, the probability of reaching f from i through G without revisiting i in between is the sum over all successors of i of the product of the probability of reaching s from i in one step with the probability of reaching f from s through G without traversing any edges outbound from i. That applies for all f in G, including i.

现在观察到 S ( i ，G)和所有 p is 是已知的，并且计算Pr( f | s ，X(G， i ))的问题是原始问题的一个新的，严格较小的实例.因此，可以递归地执行该计算，并且保证这种递归终止.但是，如果您的图很复杂，则可能要花费很长时间，并且看来这种递归方法的幼稚实现会在节点数上成倍地扩展.您可以通过多种方法来加快计算速度，以换取更高的内存使用率(即记忆).

Now observe that S(i, G) and all the pis are knowns, and that the problem of computing Pr(f|s,X(G,i)) is a new, strictly smaller instance of the original problem. Thus, this computation can be performed recursively, and such a recursion is guaranteed to terminate. It may nevertheless take a long time if your graph is complex, and it looks like a naive implementation of this recursive approach would scale exponentially in the number of nodes. There are ways you could speed the computation in exchange for higher memory usage (i.e. memoization).

还有其他可能性.例如，我对解决方案可能采用自底向上的动态编程方法感到怀疑，但我无法说服图表中的循环并不会带来无法解决的问题.

There are likely other possibilities as well. For example, I'm suspicious that there may be a bottom-up dynamic programming approach to a solution, but I haven't been able to convince myself that loops in the graph don't present an insurmountable problem there.

更多推荐

有向概率图

本文发布于:2023-11-29 22:44:33，感谢您对本站的认可！

本文链接:https://www.elefans.com/category/jswz/34/1647846.html