Online Book Reader

Data Mining - Mehmed Kantardzic [251]

to augment the data-mining process. Some data-mining techniques and algorithms are difficult for decision makers to understand and use. Visualization can make the data and the mining results more accessible, allowing comparison and verification of results. Visualization can also be used to steer the data-mining algorithm.

It is useful to develop a taxonomy for data visualization, not only because it brings order to disjointed techniques, but also because it clarifies and interprets ideas and purposes behind these techniques. Taxonomy may trigger the imagination to combine existing techniques or discover a totally new technique.

Visualization techniques can be classified in a number of ways: by whether their focus is geometric or symbolic, by whether the stimulus is 2-D, 3-D, or n-dimensional, or by whether the display is static or dynamic. Many visualization tasks involve detecting differences in data rather than measuring absolute values. According to the well-known Weber's Law, the likelihood of detection is proportional to the relative change, not the absolute change, of a graphical attribute. In general, visualizations can be used to explore data, to confirm a hypothesis, or to manipulate a view.
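The point about relative versus absolute change can be illustrated with a small calculation. The following is only a sketch: the function name relative_change and the bar lengths in pixels are assumptions made for illustration, not part of the original text.

```python
def relative_change(old: float, new: float) -> float:
    """Relative change of a graphical attribute (hypothetical example:
    the length of a bar in pixels)."""
    if old == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return abs(new - old) / abs(old)


# The same absolute change (5 pixels) applied to two different baselines.
short_bar = relative_change(10.0, 15.0)    # short bar grows by 5 pixels
long_bar = relative_change(200.0, 205.0)   # long bar grows by 5 pixels

print(f"short bar: {short_bar:.3f}, long bar: {long_bar:.3f}")
```

Under the statement of Weber's Law above, the 5-pixel change on the short bar would be more likely to be detected than the same change on the long bar, because its relative change is larger.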

In exploratory visualizations, the user does not necessarily know what he/she is looking for. This creates a dynamic scenario in which interaction is critical. The user is searching for structures or trends and is attempting to arrive at some hypothesis. In confirmatory visualizations, the user has a hypothesis that needs only to be tested. This scenario is more stable and predictable. System parameters are often predetermined and visualization tools are necessary for the user to confirm or refute the hypothesis. In manipulative (production) visualizations, the user has a validated hypothesis and so knows exactly what is to be presented. Therefore, he/she focuses on refining the visualization to optimize the presentation. This type is the most stable and predictable of all visualizations.

The accepted taxonomy in this book is primarily based on different approaches in visualization caused by different types of source data. Visualization techniques are divided roughly into two classes, depending on whether physical data are involved. These two classes are scientific visualization and information visualization.

Scientific visualization focuses primarily on physical data such as the human body, the earth, and molecules. Scientific visualization also deals with multidimensional data, but most of the data sets used in this field rely on the spatial attributes of the data for visualization purposes, for example, computer-aided tomography (CAT) and computer-aided design (CAD). Also, many Geographical Information Systems (GIS) use either the Cartesian coordinate system or some modified geographical coordinates to achieve a reasonable visualization of the data.

Information visualization focuses on abstract, nonphysical data such as text, hierarchies, and statistical data. Data-mining techniques are primarily oriented toward information visualization. The challenge for nonphysical data is in designing a visual representation of multidimensional samples (where the number of dimensions is greater than three). Multidimensional information visualizations present data that are not primarily planar or spatial. One-, two-, and three-dimensional schemes, as well as temporal information-visualization schemes, can be viewed as subsets of multidimensional information visualization. One approach is to map the nonphysical data to a virtual object such as a cone tree, which can be manipulated as if it were a physical object. Another approach is to map the nonphysical data to the graphical properties of points, lines, and areas.
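The second approach, mapping nonphysical data to the graphical properties of points, can be sketched as follows. This is a minimal illustration under stated assumptions: the PointGlyph class, the choice of position, size, and gray level as the four properties, the min-max scaling, and the sample values are all hypothetical and are not taken from the book.

```python
from dataclasses import dataclass


@dataclass
class PointGlyph:
    """Hypothetical graphical properties of one plotted point."""
    x: float      # horizontal position, scaled to [0, 1]
    y: float      # vertical position, scaled to [0, 1]
    size: float   # marker size, scaled to [2, 20]
    gray: float   # gray level, scaled to [0, 1]


def min_max_scale(values, lo=0.0, hi=1.0):
    """Linearly rescale values to the interval [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # A constant column carries no visual information; pin it to lo.
        return [lo for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]


def samples_to_glyphs(samples):
    """Map each 4-dimensional sample to one point glyph, one dimension
    per graphical property."""
    columns = list(zip(*samples))
    xs = min_max_scale(columns[0])
    ys = min_max_scale(columns[1])
    sizes = min_max_scale(columns[2], 2.0, 20.0)
    grays = min_max_scale(columns[3])
    return [PointGlyph(*props) for props in zip(xs, ys, sizes, grays)]


# Made-up 4-dimensional samples, for illustration only.
samples = [
    (1.0, 10.0, 100.0, 0.2),
    (2.0, 30.0, 300.0, 0.4),
    (3.0, 20.0, 200.0, 0.8),
]
glyphs = samples_to_glyphs(samples)
for glyph in glyphs:
    print(glyph)
```

Each sample here has four dimensions, one more than can be shown by position alone in a 3-D display, so the extra dimensions are carried by marker size and gray level. Other assignments of dimensions to properties are equally possible.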

Using historical development as the criterion, we can divide information visualization techniques (IVT) into two broad categories: traditional IVT and novel IVT. Traditional methods of 2-D and 3-D graphics offer an opportunity for information visualization, even though these techniques are more often used for presentation of physical data in scientific
