Data Mining - Mehmed Kantardzic [135]
(c) What is the bias that will do the job with a log-sigmoid–activation function?
8. Consider a classification problem defined with the set of 3-D samples X, where two dimensions are inputs and the third one is the output.
(a) Draw a graph of the data points X labeled according to their classes. Is the problem of classification solvable with a single-neuron perceptron? Explain the answer.
(b) Draw a diagram of the perceptron you would use to solve the problem. Define the initial values for all network parameters.
(c) Apply a single iteration of the delta-learning algorithm. What is the final vector of weight factors?
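One iteration of the delta rule can be sketched as follows. Since the sample set X and the initial parameters from the problem are given in the book (not in this excerpt), the inputs and weights below are hypothetical placeholders; the update rule itself, w ← w + η(d − y)·y(1 − y)·x for a log-sigmoid neuron, is the standard one.

```python
import math

def logsig(net):
    """Log-sigmoid activation: y = 1 / (1 + e^(-net))."""
    return 1.0 / (1.0 + math.exp(-net))

def delta_step(w, x, d, eta=0.5):
    """One delta-rule update: w <- w + eta*(d - y)*y*(1 - y)*x."""
    net = sum(wi * xi for wi, xi in zip(w, x))
    y = logsig(net)
    return [wi + eta * (d - y) * y * (1 - y) * xi for wi, xi in zip(w, x)]

# Hypothetical 2-D inputs, augmented with a constant bias input of 1.
samples = [([0.0, 0.0, 1.0], 0), ([1.0, 1.0, 1.0], 1)]
w = [0.1, -0.2, 0.0]            # hypothetical initial weights (last = bias)
for x, d in samples:            # one full iteration over the training set
    w = delta_step(w, x, d)
print(w)
```

Substituting the actual samples and initial weights from the problem statement into `samples` and `w` gives the requested final weight vector after one pass.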
9. A one-neuron network is trained to classify the following input–output samples (inputs x1, x2 and desired output d):

   x1  x2 |  d
    1   0 |  1
    1   1 | -1
    0   1 |  1
Show that this problem cannot be solved unless the network uses a bias.
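The analytic argument is short: without a bias, a hard-limiter neuron needs w1 > 0 (sample (1, 0)), w2 > 0 (sample (0, 1)), and w1 + w2 < 0 (sample (1, 1)), which is contradictory. A coarse grid search, shown below purely as an illustration (it is not a proof), confirms that no bias-free weight pair works while a solution with a bias is easy to find:

```python
import itertools

# Samples from the problem: (1,0)->1, (1,1)->-1, (0,1)->1,
# classified with a hard-limiter (sign) activation.
samples = [((1, 0), 1), ((1, 1), -1), ((0, 1), 1)]
sign = lambda v: 1 if v > 0 else -1
grid = [i / 2 for i in range(-8, 9)]    # candidate weights in [-4, 4]

def solves(w1, w2, b):
    return all(sign(w1 * x1 + w2 * x2 + b) == d for (x1, x2), d in samples)

no_bias = any(solves(w1, w2, 0) for w1, w2 in itertools.product(grid, grid))
with_bias = any(solves(w1, w2, b)
                for w1, w2, b in itertools.product(grid, grid, grid))
print(no_bias, with_bias)
```

The search reports that `no_bias` is False and `with_bias` is True; for example w1 = w2 = -1 with bias b = 1.5 classifies all three samples correctly.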
10. Consider the classification problem based on the set of samples X:
(a) Draw a graph of the data points labeled according to their classification. Is the problem solvable with one artificial neuron? If yes, graph the decision boundaries.
(b) Design a single-neuron perceptron to solve this problem. Determine the final weight factors as a weight vector orthogonal to the decision boundary.
(c) Test your solution with all four samples.
(d) Using your network classify the following samples: (−2, 0), (1, 1), (0, 1), and (−1, −2).
(e) Which of the samples in (d) will always be classified the same way, and for which samples may the classification vary depending on the particular solution?
11. Implement a program that performs the computation (and learning) of a single-layer perceptron.
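A minimal sketch of such a program is given below, assuming a single neuron with a bipolar hard-limiter activation, the classic perceptron learning rule, and the bias handled as an extra weight on a constant input of 1. This is one possible implementation for the exercise, not the book's own code.

```python
class Perceptron:
    def __init__(self, n_inputs, eta=0.1):
        self.w = [0.0] * (n_inputs + 1)   # last entry is the bias weight
        self.eta = eta

    def output(self, x):
        """Bipolar hard-limiter: +1 if net > 0, else -1."""
        net = sum(wi * xi for wi, xi in zip(self.w, list(x) + [1.0]))
        return 1 if net > 0 else -1

    def train(self, samples, max_epochs=100):
        """Perceptron rule: w <- w + eta*(d - y)*x on each misclassification."""
        for _ in range(max_epochs):
            errors = 0
            for x, d in samples:
                y = self.output(x)
                if y != d:
                    errors += 1
                    xe = list(x) + [1.0]
                    self.w = [wi + self.eta * (d - y) * xi
                              for wi, xi in zip(self.w, xe)]
            if errors == 0:               # converged: all samples correct
                return True
        return False

# Example: the logical AND function in bipolar encoding.
data = [((-1, -1), -1), ((-1, 1), -1), ((1, -1), -1), ((1, 1), 1)]
p = Perceptron(2)
print(p.train(data))   # True: AND is linearly separable
```

Because AND is linearly separable, the perceptron convergence theorem guarantees the loop terminates with zero errors; on a non-separable set (e.g., XOR) `train` would return False after `max_epochs`.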
12. For the given competitive network:
(a) find the output vector [Y1, Y2, Y3] if the input sample is [X1, X2, X3] = [1, −1, −1];
(b) what are the new weight factors in the network?
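The two steps of the exercise, competition followed by a winner-take-all weight update, can be sketched as below. The weight matrix is a hypothetical placeholder (the actual values come from the network figure in the book), and this sketch picks the winner by minimum Euclidean distance; some competitive networks instead use the maximum net input, so adapt the criterion to the one defined in the problem.

```python
import math

W = [[0.3, -0.5, 0.2],    # hypothetical weights into Y1
     [0.7,  0.1, -0.4],   # hypothetical weights into Y2
     [-0.2, 0.6,  0.9]]   # hypothetical weights into Y3
x = [1, -1, -1]           # input sample [X1, X2, X3]
eta = 0.5                 # hypothetical learning rate

# (a) Competition: the node whose weight vector is closest to the input wins.
dists = [math.dist(w, x) for w in W]
winner = dists.index(min(dists))
y = [1 if j == winner else 0 for j in range(3)]   # output vector [Y1,Y2,Y3]

# (b) Winner-take-all learning: only the winner's weights move toward x.
W[winner] = [wj + eta * (xj - wj) for wj, xj in zip(W[winner], x)]
print(y, W[winner])
```

With the placeholder weights above, node Y2 wins, so the output vector is [0, 1, 0] and only Y2's weights are updated; the losers' weights are unchanged.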
13. Search the Web to find the basic characteristics of publicly available or commercial software tools that are based on ANNs. Document the results of your search. Which of them support learning with a teacher, and which support learning without a teacher?
14. For a neural network, which one of the following structural assumptions most affects the trade-off between underfitting (i.e., a high-bias model) and overfitting (i.e., a high-variance model):
(a) the number of hidden nodes,
(b) the learning rate,
(c) the initial choice of weights, or
(d) the use of a constant-term unit input.
15. Is it true that the Vapnik-Chervonenkis (VC) dimension of a perceptron is smaller than the VC dimension of a simple linear support vector machine (SVM)? Discuss your answer.
7.9 REFERENCES FOR FURTHER STUDY
Engel, A., C. Van den Broeck, Statistical Mechanics of Learning, Cambridge University Press, Cambridge, UK, 2001.
The subject of this book is the contribution made to machine learning over the last decade by researchers applying the techniques of statistical mechanics. The authors provide a coherent account of important concepts and techniques that are otherwise scattered across individual papers, and they include many examples and exercises, making the book suitable for courses, for self-study, or as a handy reference.
Haykin, S., Neural Networks and Learning Machines, 3rd edition, Prentice Hall, Upper Saddle River, NJ, 2009.
Fluid and authoritative, this well-organized book represents the first comprehensive treatment of neural networks from an engineering perspective, providing extensive, state-of-the-art coverage that will expose readers to the myriad facets of neural networks and help them appreciate the technology’s origin, capabilities, and potential applications. The book examines all the important aspects of this emerging technology, covering the learning process, backpropagation, radial basis functions, recurrent networks, self-organizing systems, modular networks, temporal processing, neurodynamics, and VLSI implementation. It integrates computer experiments throughout to demonstrate how neural networks are designed and perform in practice. Chapter objectives, problems, worked examples, a bibliography,