PGMs represent joint probability distributions over a set of variables $\mathcal{X}$. Let’s use this representation to answer actual queries.
A query over a graphical model asks to compute simple statistics over the function such as its minimum or average value.
Given: Variables $\mathcal{X}$, Evidence $E = e$, Query $Y$.
Task: Compute $P(Y \mid E = e)$.
$$ \boxed{P(Y \mid E = e) = \frac{P(Y, e)}{P(e)}} $$
Let $W = \mathcal{X} - Y - E$ be variables that are neither query nor evidence (hidden variables).
$$ \boxed{P(Y, E = e) = \sum_{W} P(Y, E = e, W)} $$
$$ \boxed{P(e) = \sum_{Y, W} P(Y, E = e, W) = \sum_Y P(Y, E = e)} $$
Example
$$ P(Y = y, E = e) = \sum_{x_3, x_4, x_5} P(Y = y, X_1 = 1, \underbrace{X_3 = x_3, X_4 = x_4, X_5 = x_5}_{W}) $$
$\rightarrow$ Compute the joint probability for all 8 possible assignments of $x_3, x_4, x_5$.
In case of 20 variables? $2^{20}$.
$$ P(J) = \sum_{C,D,I,G,S,L,H} \phi_C(C) \phi_D(C,D) \phi_I(I) \phi_G(G,I,D) \phi_S(S,I) \phi_L(L,G) \phi_J(J,L,S) \phi_H(H,G,J) $$
Observe evidences $I = i$ and $H = h$, plug in the original factors. $$ P(J \mid I = i, H = h) = \sum_{C,D,G,S,L} \phi_C(C) \phi_D(C,D) \phi_I(i) \phi_G(G,i,D) \phi_S(S,i) \phi_L(L,G) \phi_J(J,L,S) \phi_H(h,G,J) $$
$$ \tilde{P}(D) = \sum_{ABC} \phi_1(A, B) \times \phi_2(B, C) \times \phi_3(C, D) \times \phi_4(A, D) $$
Observe $A = a^0$, reduce factors by removing rows that are inconsistent with the observed evidence. $$ \tilde{P}(D, A = a^0) = \sum_{ABC} \phi^{\prime}_1(A = a^0, B) \times \phi_2(B, C) \times \phi_3(C, D) \times \phi^{\prime}_4(A = a^0, D) $$
Given: Variables $\mathcal{X}$, Evidence $E = e$, Query $Y = \mathcal{X} - E$.
Task: Compute $\text{MAP}(Y \mid E = e) = \text{argmax}_yP(Y = y \mid E = e)$.
Max-Product
$$ P(Y \mid E = e) = \frac{P(Y, e)}{\underbrace{P(e)}_{\text{constant w.r.t Y}}} \propto P(Y, E = e) $$
$$ \begin{align*} P(Y, E = e) &= \frac{1}{\underbrace{Z}_{\text{const}}} \prod_k \phi^{\prime}_k(D^{\prime}_k) \\ &\propto \prod_k \phi^{\prime}_k(D^{\prime}_k) \end{align*} $$
$$ \boxed{\text{argmax}_Y P(Y \mid E = e) = \text{argmax}_Y \prod_k \phi^{\prime}_k (D^{\prime}_k)} $$
Can use dynamic programming techniques to perform inference even for certain large and complex networks in a reasonable time.
Fibonacci
$ F_0 = 1 \\ F_1 = 1 \\ F_n = F_{n - 1} + F_{n - 2}. $
Recursive implementation
1(define (fib n)
2 (if (< n 2)
3 n
4 (+ (fib (- n 1)) (fib (- n 2)))))
$\rightarrow \mathcal{O}(2^n)$
Tail-Recursive implementation
1(define (fib n)
2 (define (fib-iter a b count)
3 (if (= count 0)
4 a
5 (fib-iter b (+ a b) (- count 1))))
6 (fib-iter 0 1 n))
$\rightarrow \mathcal{O}(n)$
Factor Product
The product factor $\phi_1 \times \phi_2$ is a factor $\psi: \text{Val}(X, Y, Z) \to \mathbb{R}$ st:
$$ \boxed{\psi(X, Y, Z) = \phi_1(X, Y) \cdot \phi_2(Y, Z)} $$
Example: $\psi(A, B, C) = \phi_1(A, B) \cdot \phi_2(B, C)$.
Factor Marginalization
The factor marginalization of $Y$ in $\phi$, denoted $\sum_Y\phi$ is a factor $\psi$ over $X$ st:
$$ \boxed{\psi(X) = \sum_Y \phi(X, Y)} $$
summing out of $Y$ in $\psi$.
Commutative $$ \boxed{\phi_1 \cdot \phi_2 = \phi_2 \cdot \phi_1 \quad \text{and} \quad \sum_X \sum_Y \phi = \sum_Y \sum_X \phi} $$
Associative $$ \boxed{(\phi_1 \cdot \phi_2) \cdot \phi_3 = \phi_1 \cdot (\phi_2 \cdot \phi_3)} $$
Exchange summation and product: If $X \notin \text{Scope}[\phi_1]$, then $$ \boxed{\sum_X(\phi_1 \cdot \phi_2) = \phi_1 \cdot\sum_X\phi_2} $$
$\Phi = \{\phi_{X_i}\}^n_{i=1}$: Set of factors
$Z = \mathcal{X} - Y - E$.
General idea
$\rightarrow$ this suggests an “elimination order” of latent variables to be marginalized.
$$ \psi = \prod_{\phi_i \in \Phi^{\prime}} \phi_i \quad ({\text{factor product}}). $$
$$ \tau = \sum_{Z_k} \psi \quad ({\text{sum out $Z_k$ / marginalization}}). $$
$$ \Phi := (\Phi - \Phi^{\prime}) \cup { \tau }. $$
$$ P(Y, e) = \sum_{Z_1} \dots \sum_{Z_{k - 1}} \sum_{Z_{k + 1}} \dots \prod_{\phi \in \Phi}\phi \cdot \tau. $$
If $\phi$ are MN facors $\rightarrow$ normalize.
Compute $P(J)$. Elimination ordering: $C, D, I, H, G, S, L$.
$$ P(J) = \sum_{C,D,I,G,S,L,H} \phi_C(C) \phi_D(C,D) \phi_I(I) \phi_G(G,I,D) \phi_S(S,I) \phi_L(L,G) \phi_J(J,L,S) \phi_H(H,G,J) $$
$$ P(J) = \sum_{D,I,G,S,L,H} \phi_I(I) \phi_G(G,I,D) \phi_S(S,I) \phi_L(L,G) \phi_J(J,L,S) \phi_H(H,G,J) \underbrace{\sum_{C} \phi_C(C) \phi_D(C,D)}_{\tau_1 (D)} $$
$$ P(J) = \sum_{D,I,G,S,L,H} \phi_I(I) \phi_G(G,I,D) \phi_S(S,I) \phi_L(L,G) \phi_J(J,L,S) \phi_H(H,G,J) \tau_1 (D) $$
$$ P(J) = \sum_{I,G,S,L,H} \phi_I(I) \phi_S(S,I) \phi_L(L,G) \phi_J(J,L,S) \phi_H(H,G,J) \underbrace{\sum_{D} \phi_G(G,I,D) \tau_1 (D)}_{\tau_2 (G, I)} $$
$$ P(J) = \sum_{I,G,S,L,H} \phi_I(I) \phi_S(S,I) \phi_L(L,G) \phi_J(J,L,S) \phi_H(H,G,J) \tau_2 (G, I) $$
$$ P(J) = \sum_{G,S,L,H} \phi_L(L,G) \phi_J(J,L,S) \phi_H(H,G,J) \underbrace{\sum_{I} \phi_I(I) \phi_S(S,I) \tau_2 (G, I)}_{\tau_3 (G, S)} $$
$\dots$
$$ P(J) = \sum_{L}\tau_6 (J, L) $$
$$ P(J) = \tau_7 (J) $$
Compute $P(J , I = i^1, H = h^0 )$. Elimination ordering: $C, D, G, S, L$.
Reduced factors, eliminate as before.
$$ \sum_{L, S, G,, D, C} \phi_J(J, L, S) \phi_L(L, G) \phi_S^{\prime}(S) \phi_G^{\prime}(G, D) \phi_H^{\prime}(G, J) \underbrace{\phi_I()}_{\phi_I(i^1): \text{ constant}} \phi_D(C, D) \phi_C(C) $$
Factors are not always correspond to marginal or conditional probabilities in the network.
$$ \psi_k(X_k) = \prod_{i = 1}^{m_k}\phi_i \quad \text{(factor product)} $$
$$ \tau_k(X_k - \{Z\}) = \sum_{Z} \psi_k(X_k) \quad \text{(marginalization)} $$
Factor Product
$$ \psi_k(X_k) = \prod_{i = 1}^{m_k}\phi_i $$
Number of rows in the resulting table: $N_k = |\text{Val}(X_k)| \rightarrow$ $3 \times 2 \times 2 = 12$
Each row: $m_k - 1$ products $\rightarrow$ 2 in this case.
$$ \text{Cost: }\boxed{(m_k - 1)N_k}\text{ multiplications} $$
Factor Marginalization
$$ \tau_k(X_k - \{Z\}) = \sum_{Z} \psi_k(X_k) $$
$$\boxed{N_k = |\text{Val}(X_k)|} \text{ additions}$$
$$ \underbrace{\sum_k (m_k - 1)N_k}_{\text{products}} + \sum_k N_k \leq (m + n)N = \mathcal{O}((m + n)N) = \mathcal{O}(mN) $$
Moralization procedure
Eliminating a variable $X$:
To reflect this in graph:
Example
$$ \begin{align*} &\phi_C(C) \phi_D(C,D) \phi_I(I) \phi_G(G,I,D) \\ &\phi_S(S,I) \phi_L(L,G) \phi_J(J,L,S) \phi_H(H,G,J) \end{align*} $$
$$ \tau_1(D) = \sum_C \phi_C(C)\phi_D(C, D) $$
$$ \tau_2(G, I) = \sum_D \phi_G(G, I, D)\tau_1(D) $$
$$ \tau_3(S, G) = \sum_I \phi_S(S, I)\phi_I(I)\tau_2(G, I) $$