Data Mining - Mehmed Kantardzic [244]
14.5.2 A Problem of Evaluating Teaching
Assume that the basic factors that influence students’ evaluation of teaching are f1 = clarity and understandability, f2 = proficiency in teaching, f3 = liveliness and stimulation, and f4 = writing neatness or clarity, that is, F = {f1, f2, f3, f4}. Let E = {e1, e2, e3, e4} = {excellent, very good, good, poor} be the verbal grade set. We evaluate a teacher u. By selecting an appropriate group of students and faculty, we can have them respond with their ratings on each factor and then obtain the single-factor evaluation. As in the previous example, we can combine the single-factor evaluation into an evaluation matrix. Suppose that the final matrix R(u) is
For a specific weight vector W(u) = {0.2, 0.3, 0.4, 0.1}, which describes the importance of each teaching-evaluation factor fi, the multifactorial-evaluation model makes it easy to find
In the evaluation result D(u), d2 = 0.4 is the maximum, so we may conclude that teacher u should be rated as “very good.”
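The computation can be sketched in a few lines of Python. This is a minimal illustration, assuming that the multifactorial-evaluation model is the max-min composition D(u) = W(u) ∘ R(u), with d_j = max_i min(w_i, r_ij). The weight vector is the one given above; the matrix R below is a hypothetical stand-in chosen only for illustration, not the values of the original example, and the helper name max_min_composition is likewise an assumption.

```python
# Multifactorial evaluation by max-min composition (assumed model):
#   d_j = max over factors i of min(w_i, r_ij)

W = [0.2, 0.3, 0.4, 0.1]  # weights of f1..f4, as given in the text

# HYPOTHETICAL evaluation matrix (not from the text):
# rows are factors f1..f4, columns are grades e1..e4.
R = [
    [0.7, 0.2, 0.1, 0.0],
    [0.6, 0.3, 0.1, 0.0],
    [0.2, 0.6, 0.2, 0.0],
    [0.1, 0.1, 0.6, 0.2],
]

GRADES = ["excellent", "very good", "good", "poor"]


def max_min_composition(weights, matrix):
    """Return D where D[j] = max_i min(weights[i], matrix[i][j])."""
    n_grades = len(matrix[0])
    return [
        max(min(w, row[j]) for w, row in zip(weights, matrix))
        for j in range(n_grades)
    ]


D = max_min_composition(W, R)
best_grade = GRADES[D.index(max(D))]  # grade with the largest membership
print(D, best_grade)
```

With this hypothetical matrix the largest component is again d2 = 0.4, so the selected grade is "very good"; a different R would of course give a different D(u).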
14.6 EXTRACTING FUZZY MODELS FROM DATA
In the context of different data-mining analyses, it is of great interest to see how fuzzy models can automatically be derived from a data set. Besides prediction, classification, and all other data-mining tasks, understandability is of prime concern, because the resulting fuzzy model should offer an insight into the underlying system. To achieve this goal, different approaches exist. Let us explain a common technique that constructs grid-based rule sets using a global granulation of the input and output spaces.
Grid-based rule sets usually model each input variable through a small set of linguistic values. The resulting rule base uses all, or a subset, of the possible combinations of these linguistic values, which results in a global granulation of the feature space into rectangular regions. Figure 14.14 illustrates this approach in two dimensions, with three linguistic values (low, medium, high) for the first dimension x1 and two linguistic values (young, old) for the second dimension x2.
Figure 14.14. A global granulation for a two-dimensional space using three membership functions for x1 and two for x2.
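A global granulation of this kind can be sketched as follows. The shapes and breakpoints of the membership functions are assumptions made for illustration (triangular functions on a normalized [0, 1] range), as are the names tri, X1, X2, and cell_memberships; the use of min to combine the two antecedent memberships is one common choice for the fuzzy AND, not something fixed by the text.

```python
from itertools import product


def tri(x, left, peak, right):
    """Triangular membership; left == peak or peak == right gives a shoulder."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# HYPOTHETICAL membership functions on a normalized [0, 1] range.
X1 = {"low": (0.0, 0.0, 0.5), "medium": (0.0, 0.5, 1.0), "high": (0.5, 1.0, 1.0)}
X2 = {"young": (0.0, 0.0, 1.0), "old": (0.0, 1.0, 1.0)}

# Every combination of linguistic values is one rectangular region of the grid.
CELLS = list(product(X1, X2))  # 3 x 2 = 6 regions


def cell_memberships(x1, x2):
    """Degree to which the point (x1, x2) belongs to each grid region."""
    return {
        (a, b): min(tri(x1, *X1[a]), tri(x2, *X2[b]))
        for a, b in CELLS
    }


memberships = cell_memberships(0.25, 0.25)
for cell, degree in memberships.items():
    print(cell, degree)
```

Because neighboring membership functions overlap, a single point generally belongs to several regions with different degrees, which is why the regions of the grid are fuzzy rather than crisp rectangles.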
Extracting grid-based fuzzy models from data is straightforward when the input granulation is fixed, that is, the antecedents of all rules are predefined. Then, only a matching consequent for each rule needs to be found. This approach, with fixed grids, is usually called the Mamdani model. After predefining the granulation of all input variables and of the output variable, one sweeps through the entire data set, determines the example closest to the geometrical center of each rule, and assigns the closest output fuzzy value to the corresponding rule. Using a graphical interpretation in a 2-D space, the global steps of the procedure are illustrated through an example in which only one input dimension x and one output dimension y exist. The formal analytical specification, even with more than one input/output dimension, is very easy to establish.
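For the one-input, one-output case, the procedure might be sketched as below. This is one possible reading of the description, under stated assumptions: the centers of the equidistant membership functions are evenly spaced over a given range, "closest" means smallest absolute distance, and every antecedent receives a rule even when no example lies near its center. The function names, the data points, and the ranges are all hypothetical.

```python
LABELS = ["low", "below average", "above average", "high"]


def centers(lo, hi, n):
    """Centers of n equidistant membership functions covering [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def extract_rules(data, x_range, y_range, labels=LABELS):
    """For each input label, pick the example closest to the label's center
    and assign the output label whose center is closest to that example."""
    x_centers = centers(x_range[0], x_range[1], len(labels))
    y_centers = centers(y_range[0], y_range[1], len(labels))
    rules = {}
    for label, cx in zip(labels, x_centers):
        # Representative example: smallest distance to the antecedent center.
        _, y = min(data, key=lambda point: abs(point[0] - cx))
        # Consequent: output label with the nearest center.
        j = min(range(len(y_centers)), key=lambda k: abs(y - y_centers[k]))
        rules[label] = labels[j]
    return rules


# HYPOTHETICAL (x, y) examples on the ranges [0, 3] x [0, 3].
data = [(0.1, 0.9), (1.0, 2.1), (1.9, 2.9), (2.9, 1.2), (2.5, 2.0)]
rules = extract_rules(data, x_range=(0.0, 3.0), y_range=(0.0, 3.0))
for antecedent, consequent in rules.items():
    print(f"IF x is {antecedent} THEN y is {consequent}")
```

A practical variant would probably discard a rule when the closest example is still far from the region's center; that threshold is omitted here to keep the sketch short.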
1. Granulate the Input and Output Space. Divide each variable xi into ni equidistant, triangular membership functions (MFs). In our example, both input x and output y are granulated using the same four linguistic values: low, below average, above average, and high. A representation of the input–output granulated space is given in Figure 14.15.
2. Analyze the Entire Data Set in the Granulated Space. First, place the data set in the granulated space, and then find the points that lie closest to the centers of the granulated regions. Mark these points and the centers of the regions. In our example, after entering all discrete data, the selected center points (closest to the data) are additionally marked with x, as in Figure 14.16.
3. Generate Fuzzy Rules from Given Data. The representative data points directly select the regions in the granulated space. These regions may be described with the corresponding fuzzy rules. In