3. Each node has a conditional probability table (CPT) that quantifies the effects that the parents have on the node. The parents of a node X are all those nodes that have arrows pointing to X.
4. The graph has no directed cycles (hence is a DAG).
Figure 12.35. Two examples of Bayesian network architectures.
Each node in the BN corresponds to a random variable X and has an associated probability distribution P(X). If there is a directed arc from node X to node Y, this indicates that X has a direct influence on Y. The influence is specified by the conditional probability P(Y|X). Nodes and arcs define the structure of the BN; the probabilities are the parameters of that structure.
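To make the structure/parameters split concrete, the following minimal Python sketch stores the topology as parent lists and the parameters as one CPT per node. The node names and probability values are hypothetical illustrations, not values read from Figure 12.35.

```python
# Minimal sketch of a discrete Bayesian network: the structure is the set of
# parent lists (a DAG), and the parameters are one CPT per node.
# Node names and probabilities are hypothetical, not taken from Figure 12.35.
bn = {
    "Rain":     {"parents": [],       "cpt": {(): 0.4}},
    "WetGrass": {"parents": ["Rain"], "cpt": {(True,): 0.9, (False,): 0.2}},
}

def prob(node, value, parent_values=()):
    """P(node = value | parents = parent_values), read from the node's CPT."""
    p_true = bn[node]["cpt"][tuple(parent_values)]
    return p_true if value else 1.0 - p_true

print(prob("WetGrass", True, (True,)))  # P(WetGrass = true | Rain = true) = 0.9
```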
We turn now to the problem of inference in graphical models, in which some of the nodes in a graph are clamped to observed values and we wish to compute the posterior distributions of one or more subsets of the other nodes. The network supports the computation of the probabilities of any subset of variables given evidence about any other subset. We can exploit the graphical structure both to find efficient algorithms for inference and to make the structure of those algorithms transparent. Specifically, many inference algorithms can be expressed in terms of the propagation of local probabilities around the graph. A BN can be considered a probabilistic graph in which the probabilistic knowledge is represented by the topology of the network and the conditional probabilities at each node. The main purpose of building this probabilistic knowledge is to use it for inference, that is, for computing the answer for particular cases about the domain.
For example, we may assume that rain causes the grass to get wet. The causal graph in Figure 12.36 explains the cause–effect relation between these variables, including the corresponding probabilities. If P(Rain) = P(R) = 0.4 is given, that also means P(¬R) = 0.6. Also, note that the sum of the presented conditional probabilities is not equal to 1. If you analyze the relations between the probabilities, it is P(W|R) + P(¬W|R) = 1 and P(W|¬R) + P(¬W|¬R) = 1, not the sum of the given probabilities. In these expressions R means “Rain,” and W means “Wet grass.” Based on the given BN, we may check the probability of “Wet grass” by marginalizing over R:

P(W) = P(W|R) * P(R) + P(W|¬R) * P(¬R)
Figure 12.36. Simple causal graph.
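As a quick numeric check of this marginalization: P(R) = 0.4 is given in the text, while the two conditional values below are assumed stand-ins for the CPT entries of Figure 12.36, which are not reproduced here.

```python
# P(W) = P(W|R) * P(R) + P(W|not R) * P(not R)
p_r = 0.4                # given in the text
p_w_given_r = 0.9        # assumed P(W | R)
p_w_given_not_r = 0.2    # assumed P(W | not R)

p_w = p_w_given_r * p_r + p_w_given_not_r * (1 - p_r)
print(p_w)  # 0.9 * 0.4 + 0.2 * 0.6 = 0.48
```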
Bayes’ rule allows us to invert the dependencies and obtain the probabilities of parents in the graph based on the probabilities of children. That could be useful in many applications, such as determining the probability of a diagnosis based on symptoms. For example, based on the BN in Figure 12.36, we may determine the conditional probability P(Rain|Wet grass) = P(R|W). We know that

P(R, W) = P(W|R) * P(R) = P(R|W) * P(W)

and therefore

P(R|W) = P(W|R) * P(R) / P(W)
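Continuing the numeric sketch above (same assumed CPT values), the inversion is a one-line application of Bayes’ rule:

```python
# P(R|W) = P(W|R) * P(R) / P(W), reusing p_w computed above
p_r_given_w = p_w_given_r * p_r / p_w
print(p_r_given_w)  # 0.36 / 0.48 = 0.75
```

So under these assumed numbers, observing wet grass raises the probability of rain from 0.4 to 0.75.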
Let us now consider more complex problems, and the more complex BN represented in Figure 12.37. In this case we have three nodes connected serially, often called a head-to-tail connection of three events. An additional event, “Cloudy,” with yes and no values, is included as a variable at the beginning of the network. The R node blocks the path from C to W; it separates them, so W is conditionally independent of C given R. If the R node is removed, there is no path from C to W. Therefore, the joint probability for the graph factors as: P(C, R, W) = P(C) * P(R|C) * P(W|R).
Figure 12.37. An extended causal graph.
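The factorization can be encoded directly. In the sketch below, all numeric values are assumed illustrations, not the actual entries of Figure 12.37.

```python
# Joint probability of the head-to-tail chain C -> R -> W, using
# P(C, R, W) = P(C) * P(R|C) * P(W|R).
P_C = 0.5                                  # assumed P(Cloudy)
cpt_R_given_C = {True: 0.8, False: 0.1}    # assumed P(Rain | Cloudy)
cpt_W_given_R = {True: 0.9, False: 0.2}    # assumed P(Wet | Rain)

def joint(c, r, w):
    """P(C=c, R=r, W=w) from the chain factorization."""
    pc = P_C if c else 1 - P_C
    pr = cpt_R_given_C[c] if r else 1 - cpt_R_given_C[c]
    pw = cpt_W_given_R[r] if w else 1 - cpt_W_given_R[r]
    return pc * pr * pw

# Sanity check: the eight joint probabilities sum to 1.
vals = [joint(c, r, w) for c in (True, False)
        for r in (True, False) for w in (True, False)]
assert abs(sum(vals) - 1.0) < 1e-9
```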
In our case, based on the BN in Figure 12.37, it is possible to determine and use “forward” and “backward” conditional probabilities, as was done for the previous BN. We start with the forward probability, obtained by summing over both values of R:

P(W|C) = P(W|R) * P(R|C) + P(W|¬R) * P(¬R|C)
Then, we may use Bayes’ rule for the inverted conditional probabilities:

P(C|W) = P(W|C) * P(C) / P(W)
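Continuing the chain sketch above (same assumed numbers), both directions can be computed in a few lines:

```python
# Forward: P(W|C) = P(W|R) * P(R|C) + P(W|not R) * P(not R|C)
def p_w_given_c(c):
    p_r = cpt_R_given_C[c]
    return cpt_W_given_R[True] * p_r + cpt_W_given_R[False] * (1 - p_r)

# Backward (Bayes' rule): P(C|W) = P(W|C) * P(C) / P(W),
# where P(W) = P(W|C) * P(C) + P(W|not C) * P(not C).
p_w = p_w_given_c(True) * P_C + p_w_given_c(False) * (1 - P_C)
p_c_given_w = p_w_given_c(True) * P_C / p_w
print(p_w_given_c(True), p_c_given_w)  # 0.76 and about 0.74
```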
More complex connections may also be analyzed in a BN. Figure 12.38 shows the graph structure and the assumed input parameters.
Figure 12.38. Four-node architecture of a Bayesian network.
The parameters of a graphical model are represented by the conditional probability distributions, in the form of a CPT for each node given its parents. The CPT is the simplest form of a formalized distribution and is suitable when the nodes are discrete-valued. All nodes in Figure 12.38 are represented