Data Mining - Mehmed Kantardzic

is not necessarily required as the concepts and techniques discussed within the book can be utilized without deeper knowledge of the underlying theory.

1.8 REVIEW QUESTIONS AND PROBLEMS

1. Explain why it is not possible to analyze some large data sets using classical modeling techniques.

2. Can you identify problems in your business or academic environment whose solutions can be obtained through classification, regression, or deviation detection? Give examples and explain.

3. Explain the differences between statistical and machine-learning approaches to the analysis of large data sets.

4. Why are preprocessing and dimensionality reduction important phases in successful data-mining applications?

5. Give examples of data where the time component may be recognized explicitly, and other data where the time component is given implicitly in a data organization.

6. Why is it important that the data miner understand data well?

7. Give examples of structured, semi-structured, and unstructured data from everyday situations.

8. Can a set with 50,000 samples be called a large data set? Explain your answer.

9. Enumerate the tasks that a data warehouse may solve as a part of the data-mining process.

10. Many authors include OLAP tools as a standard data-mining tool. Give the arguments for and against this classification.

11. Churn is a concept originating in the telephone industry. How can the same concept apply to banking or to human resources?

12. Describe the concept of actionable information.

13. Go to the Internet and find a data-mining application. Report the decision problem involved, the type of input available, and the value contributed to the organization that used it.

14. Determine whether or not each of the following activities is a data-mining task. Discuss your answer.

(a) Dividing the customers of a company according to their age and sex.

(b) Classifying the customers of a company according to the level of their debt.

(c) Analyzing the total sales of a company in the next month based on current-month sales.

(d) Classifying a student database based on a department and sorting based on student identification numbers.

(e) Determining the influence of the number of new University of Louisville students on the stock market value.

(f) Estimating the future stock price of a company using historical records.

(g) Monitoring the heart rate of a patient with abnormalities.

(h) Monitoring seismic waves for earthquake activities.

(i) Extracting frequencies of a sound wave.

(j) Predicting the outcome of tossing a pair of dice.

1.9 REFERENCES FOR FURTHER STUDY


Berson, A., S. Smith, K. Thearling, Building Data Mining Applications for CRM, McGraw-Hill, New York, 2000.

The book is written primarily for the business community, explaining the competitive advantage of data-mining technology. It bridges the gap between understanding this vital technology and implementing it to meet a corporation’s specific needs. Basic phases in a data-mining process are explained through real-world examples.

Han, J., M. Kamber, Data Mining: Concepts and Techniques, 2nd edition, Morgan Kaufmann, San Francisco, CA, 2006.

This book gives a sound understanding of data-mining principles. The primary orientation of the book is for database practitioners and professionals, with emphasis on OLAP and data warehousing. In-depth analysis of association rules and clustering algorithms is an additional strength of the book. All algorithms are presented in easily understood pseudo-code, and they are suitable for use in real-world, large-scale data-mining projects, including advanced applications such as Web mining and text mining.

Hand, D., H. Mannila, P. Smyth, Principles of Data Mining, MIT Press, Cambridge, MA, 2001.

The book consists of three sections. The first, foundations, provides a tutorial overview of the principles underlying data-mining algorithms and their applications. The second section, data-mining algorithms, shows how algorithms are constructed to solve specific problems in a principled manner. The third section shows how
