Data Mining - Mehmed Kantardzic
a discriminant function that yields different scores when computed with data from different output classes. A linear discriminant function has the following form:

z = w1x1 + w2x2 + … + wkxk

where x1, x2, … , xk are independent variables. The quantity z is called the discriminant score, and w1, w2, … ,wk are called weights. A geometric interpretation of the discriminant score is shown in Figure 5.5. As the figure shows, the discriminant score for a data sample represents its projection onto a line defined by the set of weight parameters.

Figure 5.5. Geometric interpretation of the discriminant score.

The construction of a discriminant function z involves finding a set of weight values wi that maximizes the ratio of the between-class to the within-class variance of the discriminant score for a preclassified set of samples. Once constructed, the discriminant function z is used to predict the class of a new, nonclassified sample. Cutting scores serve as the criteria against which each individual discriminant score is judged. The choice of cutting scores depends on the distribution of samples across classes. Letting za and zb be the mean discriminant scores of preclassified samples from classes A and B, respectively, the optimal choice for the cutting score zcut-ab is given as

zcut-ab = (za + zb) / 2

when the two classes of samples are of equal size and are distributed with uniform variance. A new sample is assigned to one class or the other depending on whether its score satisfies z > zcut-ab or z < zcut-ab. A weighted average of the mean discriminant scores is used as the optimal cutting score when the sets of samples for the two classes are not of equal size:

zcut-ab = (na zb + nb za) / (na + nb)

The quantities na and nb represent the numbers of samples in the two classes. Although a single discriminant function z with several cutting scores can separate samples into several classes, multiple discriminant analysis is used for more complex problems. The term multiple discriminant analysis refers to situations in which a separate discriminant function is constructed for each class. The classification rule then takes the following form: decide in favor of the class whose discriminant score is highest. This is illustrated in Figure 5.6.

Figure 5.6. Classification process in multiple-discriminant analysis.
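The discriminant-score classification described above can be sketched in Python. The weights and two-class training samples below are hypothetical; in practice the weights wi would be estimated from preclassified data so as to maximize the ratio of between-class to within-class variance.

```python
# Minimal sketch of classification with a linear discriminant function.
# Weights and samples are hypothetical; equal class sizes are assumed,
# so the cutting score is the midpoint of the two mean scores.

def discriminant_score(x, w):
    """z = w1*x1 + w2*x2 + ... + wk*xk"""
    return sum(wi * xi for wi, xi in zip(w, x))

def cutting_score(scores_a, scores_b):
    """Optimal cut for equal-sized classes with uniform variance."""
    za = sum(scores_a) / len(scores_a)
    zb = sum(scores_b) / len(scores_b)
    return (za + zb) / 2

# Hypothetical weights and preclassified samples (two classes, equal size)
w = [0.8, 0.6]
class_a = [[1.0, 1.2], [1.4, 0.9]]
class_b = [[3.1, 2.8], [2.7, 3.3]]

scores_a = [discriminant_score(x, w) for x in class_a]
scores_b = [discriminant_score(x, w) for x in class_b]
z_cut = cutting_score(scores_a, scores_b)      # midpoint of 1.59 and 4.15

def classify(x):
    return "B" if discriminant_score(x, w) > z_cut else "A"

print(classify([1.1, 1.0]))   # near the class A samples -> "A"
print(classify([3.0, 3.0]))   # near the class B samples -> "B"
```

With unequal class sizes, only the `cutting_score` function would change, following the weighted-average formula above.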

5.9 REVIEW QUESTIONS AND PROBLEMS

1. What are the differences between statistical testing and estimation as basic areas in statistical inference theory?

2. A data set for analysis includes only one attribute X:

X = {7, 12, 5, 18, 5, 9, 13, 12, 19, 7, 12, 12, 13, 3, 4, 5, 13, 8, 7, 6}.

(a) What is the mean of the data set X?

(b) What is the median?

(c) What is the mode, and what is the modality of the data set X?

(d) Find the standard deviation for X.

(e) Give a graphical summarization of the data set X using a boxplot representation.

(f) Find outliers in the data set X. Discuss the results.
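Parts (a)-(d) and (f) can be checked with Python's standard statistics module (quartile conventions vary between tools; statistics.quantiles uses the "exclusive" method by default):

```python
# Numerical check for problem 2 on the given data set X.
import statistics

X = [7, 12, 5, 18, 5, 9, 13, 12, 19, 7, 12, 12, 13, 3, 4, 5, 13, 8, 7, 6]

mean = statistics.mean(X)        # (a)
median = statistics.median(X)    # (b)
mode = statistics.mode(X)        # (c) most frequent value
stdev = statistics.stdev(X)      # (d) sample standard deviation

# (e)/(f) quartiles and the 1.5*IQR outlier rule used by boxplots
q = statistics.quantiles(X, n=4)
q1, q3 = q[0], q[2]
iqr = q3 - q1
outliers = [x for x in X if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

print(mean, median, mode)        # 9.5 8.5 12
```

Under the 1.5·IQR rule the outlier list comes back empty, although the extreme values 18 and 19 are the natural candidates to discuss in part (f).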

3. For the training set given in Table 5.1, predict the classification of the following samples using a simple Bayesian classifier.

(a) {2, 1, 1}

(b) {0, 1, 1}
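Table 5.1 is not reproduced in this excerpt, so the sketch below uses a small hypothetical training set with three categorical attributes and two classes; the computation itself is the simple (naive) Bayesian rule the problem asks for.

```python
# Simple (naive) Bayesian classification on a hypothetical training set
# (Table 5.1 is not reproduced here). Each row is (A1, A2, A3, class).
from collections import Counter

train = [
    (2, 1, 1, "C1"), (0, 1, 1, "C1"), (1, 0, 1, "C1"),
    (2, 2, 2, "C2"), (1, 1, 2, "C2"), (2, 1, 0, "C2"),
]

def naive_bayes(sample, data):
    classes = Counter(row[-1] for row in data)
    scores = {}
    for c, nc in classes.items():
        p = nc / len(data)                       # prior P(C)
        for i, v in enumerate(sample):
            match = sum(1 for row in data if row[-1] == c and row[i] == v)
            p *= match / nc                      # P(Ai = v | C), no smoothing
        scores[c] = p
    return max(scores, key=scores.get), scores

print(naive_bayes((2, 1, 1), train))
print(naive_bayes((0, 1, 1), train))
```

Note that without smoothing, a single attribute value unseen in a class drives that class's score to zero; Laplace smoothing is the usual remedy.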

4. Given a data set with two dimensions X and Y:

X    Y
1    5
4    2.75
3    3
5    2.5

(a) Use a linear regression method to calculate the parameters α and β where y = α + β x.

(b) Estimate the quality of the model obtained in (a) using the correlation coefficient r.

(c) Use an appropriate nonlinear transformation (one of those represented in Table 5.3) to improve regression results. What is the equation for a new, improved, and nonlinear model? Discuss a reduction of the correlation coefficient value.
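The least-squares estimates for part (a) and the correlation coefficient for part (b) can be checked numerically:

```python
# Least-squares fit y = alpha + beta*x and correlation coefficient r
# for the four (X, Y) points of problem 4.
import math

xs = [1, 4, 3, 5]
ys = [5, 2.75, 3, 2.5]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

beta = sxy / sxx                   # slope
alpha = my - beta * mx             # intercept
r = sxy / math.sqrt(sxx * syy)     # correlation coefficient

print(round(alpha, 4), round(beta, 4), round(r, 4))
# 5.3786 -0.6357 -0.9496
```

The strongly negative r already indicates a good linear fit; part (c) asks whether a nonlinear transformation of X or Y can push |r| even closer to 1.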

5. A logit function, obtained through logistic regression, has the form:

Find the probability of output values 0 and 1 for the following samples:

(a) { 1, −1, −1 }

(b) { −1, 1, 0 }

(c) { 0, 0, 0 }
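The fitted logit itself is not reproduced in this excerpt, but for any logit z = w0 + w1x1 + w2x2 + w3x3 the class probabilities follow from the logistic function P(1) = 1 / (1 + e^(-z)) and P(0) = 1 - P(1). A sketch with hypothetical coefficients:

```python
# Converting a logit score into class probabilities. The weights below
# are hypothetical, since the fitted logit is not reproduced here.
import math

w = [1.0, -0.5, 0.8, -0.3]   # hypothetical (w0, w1, w2, w3)

def probabilities(x):
    z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    p1 = 1.0 / (1.0 + math.exp(-z))   # P(output = 1)
    return 1.0 - p1, p1               # (P(0), P(1))

for sample in [(1, -1, -1), (-1, 1, 0), (0, 0, 0)]:
    p0, p1 = probabilities(sample)
    print(sample, round(p0, 3), round(p1, 3))
```

Substituting the book's actual coefficients into w solves parts (a)-(c) directly.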

6. Analyze the dependency between categorical attributes X and Y if the data set is summarized in a 2 × 3 contingency table:
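The contingency table itself is not reproduced in this excerpt; the chi-square test of independence that solves this kind of problem can be sketched on a hypothetical 2 × 3 table of counts:

```python
# Chi-square test of independence for a 2x3 contingency table. The
# counts below are hypothetical, since the table is not reproduced here.
table = [
    [20, 30, 10],   # rows: values of X
    [10, 15, 25],
]

rows = [sum(r) for r in table]
cols = [sum(c) for c in zip(*table)]
total = sum(rows)

chi2 = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = rows[i] * cols[j] / total   # under independence
        chi2 += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(cols) - 1)        # degrees of freedom = 2
print(round(chi2, 3), df)
```

The computed chi2 is then compared against the chi-square critical value for df degrees of freedom at the chosen significance level; exceeding it means X and Y are dependent.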

7. Implement the algorithm for a boxplot representation of each numeric attribute in an input flat file.

8. What are the basic principles in the construction of a discriminant function applied in an LDA?

9. Implement the algorithm
