Data Mining_ Concepts and Techniques - Jiawei Han [248]

weight and bias values of the network are given in Table 9.1, along with the first training tuple, X = (1, 0, 1), with a class label of 1.

This example shows the calculations for backpropagation, given the first training tuple, X. The tuple is fed into the network, and the net input and output of each unit are computed. These values are shown in Table 9.2. The error of each unit is computed and propagated backward. The error values are shown in Table 9.3. The weight and bias updates are shown in Table 9.4.

Figure 9.5 Example of a multilayer feed-forward neural network.

Table 9.1 Initial Input, Weight, and Bias Values

x1    x2    x3    w14    w15    w24    w25    w34    w35    w46    w56    θ4     θ5    θ6
1     0     1     0.2    −0.3   0.4    0.1    −0.5   0.2    −0.3   −0.2   −0.4   0.2   0.1

Table 9.2 Net Input and Output Calculations

Unit, j   Net Input, Ij                                  Output, Oj
6         (−0.3)(0.332) − (0.2)(0.525) + 0.1 = −0.105    1/(1 + e^0.105) = 0.474
5         −0.3 + 0 + 0.2 + 0.2 = 0.1                     1/(1 + e^−0.1) = 0.525
4         0.2 + 0 − 0.5 − 0.4 = −0.7                     1/(1 + e^0.7) = 0.332
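The forward-pass arithmetic of Table 9.2 can be checked with a short script (a sketch; variable names follow the unit numbering of Figure 9.5):

```python
import math

def sigmoid(x):
    """Logistic activation: O_j = 1 / (1 + e^(-I_j))."""
    return 1.0 / (1.0 + math.exp(-x))

# Initial input, weight, and bias values from Table 9.1
x1, x2, x3 = 1, 0, 1
w14, w15, w24, w25 = 0.2, -0.3, 0.4, 0.1
w34, w35, w46, w56 = -0.5, 0.2, -0.3, -0.2
theta4, theta5, theta6 = -0.4, 0.2, 0.1

# Hidden units: net input I_j = sum_i w_ij * O_i + theta_j
I4 = w14*x1 + w24*x2 + w34*x3 + theta4   # -0.7
I5 = w15*x1 + w25*x2 + w35*x3 + theta5   # 0.1
O4, O5 = sigmoid(I4), sigmoid(I5)        # 0.332, 0.525

# Output unit 6 takes the hidden outputs O4, O5 as its inputs
I6 = w46*O4 + w56*O5 + theta6            # about -0.105
O6 = sigmoid(I6)                         # about 0.474
```

Rounded to three decimal places, the computed outputs match the table entries.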

Table 9.3 Calculation of the Error at Each Node

Unit, j   Errj
6         (0.474)(1 − 0.474)(1 − 0.474) = 0.1311
5         (0.525)(1 − 0.525)(0.1311)(−0.2) = −0.0065
4         (0.332)(1 − 0.332)(0.1311)(−0.3) = −0.0087
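The error values in Table 9.3 follow from the usual backpropagation formulas. A sketch of the computation, which (like the table) plugs in the rounded outputs from Table 9.2:

```python
# Rounded unit outputs from Table 9.2 and weights from Table 9.1
O4, O5, O6 = 0.332, 0.525, 0.474
w46, w56 = -0.3, -0.2
T = 1   # true class label of tuple X

# Output unit: Err_j = O_j * (1 - O_j) * (T_j - O_j)
Err6 = O6 * (1 - O6) * (T - O6)      # 0.1311

# Hidden units: Err_j = O_j * (1 - O_j) * sum_k Err_k * w_jk
Err5 = O5 * (1 - O5) * Err6 * w56    # -0.0065
Err4 = O4 * (1 - O4) * Err6 * w46    # -0.0087
```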

Table 9.4 Calculations for Weight and Bias Updating

Weight or Bias   New Value
w46              −0.3 + (0.9)(0.1311)(0.332) = −0.261
w56              −0.2 + (0.9)(0.1311)(0.525) = −0.138
w14              0.2 + (0.9)(−0.0087)(1) = 0.192
w15              −0.3 + (0.9)(−0.0065)(1) = −0.306
w24              0.4 + (0.9)(−0.0087)(0) = 0.4
w25              0.1 + (0.9)(−0.0065)(0) = 0.1
w34              −0.5 + (0.9)(−0.0087)(1) = −0.508
w35              0.2 + (0.9)(−0.0065)(1) = 0.194
θ6               0.1 + (0.9)(0.1311) = 0.218
θ5               0.2 + (0.9)(−0.0065) = 0.194
θ4               −0.4 + (0.9)(−0.0087) = −0.408
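The updates in Table 9.4 apply the rules wij = wij + l · Errj · Oi and θj = θj + l · Errj with learning rate l = 0.9. A sketch that reproduces each entry from the rounded values of Tables 9.2 and 9.3:

```python
l = 0.9                                        # learning rate
x1, x2, x3 = 1, 0, 1                           # inputs (O_i for the input units)
O4, O5 = 0.332, 0.525                          # hidden outputs (Table 9.2, rounded)
Err4, Err5, Err6 = -0.0087, -0.0065, 0.1311    # errors (Table 9.3, rounded)

# Update rules: w_ij += l * Err_j * O_i ; theta_j += l * Err_j
w46 = -0.3 + l * Err6 * O4    # -0.261
w56 = -0.2 + l * Err6 * O5    # -0.138
w14 =  0.2 + l * Err4 * x1    #  0.192
w15 = -0.3 + l * Err5 * x1    # -0.306
w24 =  0.4 + l * Err4 * x2    #  0.4  (x2 = 0, so no change)
w25 =  0.1 + l * Err5 * x2    #  0.1
w34 = -0.5 + l * Err4 * x3    # -0.508
w35 =  0.2 + l * Err5 * x3    #  0.194
theta6 =  0.1 + l * Err6      #  0.218
theta5 =  0.2 + l * Err5      #  0.194
theta4 = -0.4 + l * Err4      # -0.408
```

Note that weights on links from input units with value 0 (here x2) are unchanged, since the update term l · Errj · Oi vanishes.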

“How can we classify an unknown tuple using a trained network?” To classify an unknown tuple, X, the tuple is input to the trained network, and the net input and output of each unit are computed. (There is no need for computation and/or backpropagation of the error.) If there is one output node per class, then the output node with the highest value determines the predicted class label for X. If there is only one output node, then output values greater than or equal to 0.5 may be considered as belonging to the positive class, while values less than 0.5 may be considered negative.
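The single-output decision rule can be sketched as a small function (illustrative names; the forward pass follows the network of Figure 9.5):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def classify(x, weights, biases):
    """Feed tuple x through the trained two-layer network and
    threshold the single output at 0.5 (illustrative helper)."""
    w14, w15, w24, w25, w34, w35, w46, w56 = weights
    theta4, theta5, theta6 = biases
    x1, x2, x3 = x
    O4 = sigmoid(w14*x1 + w24*x2 + w34*x3 + theta4)
    O5 = sigmoid(w15*x1 + w25*x2 + w35*x3 + theta5)
    O6 = sigmoid(w46*O4 + w56*O5 + theta6)
    return 1 if O6 >= 0.5 else 0   # positive class iff output >= 0.5

# With the initial weights of Table 9.1 (before any training), the
# output for X = (1, 0, 1) was 0.474 in Table 9.2, so X falls below
# the 0.5 threshold and is labeled negative.
label = classify((1, 0, 1),
                 (0.2, -0.3, 0.4, 0.1, -0.5, 0.2, -0.3, -0.2),
                 (-0.4, 0.2, 0.1))
```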

Several variations and alternatives to the backpropagation algorithm have been proposed for classification in neural networks. These may involve the dynamic adjustment of the network topology and of the learning rate or other parameters, or the use of different error functions.

9.2.4. Inside the Black Box: Backpropagation and Interpretability

“Neural networks are like a black box. How can I 'understand' what the backpropagation network has learned?” A major disadvantage of neural networks lies in their knowledge representation. Acquired knowledge in the form of a network of units connected by weighted links is difficult for humans to interpret. This factor has motivated research in extracting the knowledge embedded in trained neural networks and in representing that knowledge symbolically. Methods include extracting rules from networks and sensitivity analysis.

Various algorithms for rule extraction have been proposed. The methods typically impose restrictions regarding procedures used in training the given neural network, the network topology, and the discretization of input values.

Fully connected networks are difficult to articulate. Hence, often the first step in extracting rules from neural networks is network pruning. This consists of simplifying the network structure by removing weighted links that have the least effect on the trained network. For example, a weighted link may be deleted if such removal does not result in a decrease in the classification accuracy of the network.

Once the trained network has been pruned, some approaches will then perform link, unit, or activation value clustering. In one method, for example, clustering is used to find the set of common activation values for each hidden unit in a given trained two-layer neural network (Figure 9.6). The combinations of these activation values for each hidden
