Newsletter Volume 5 Number 1 ERUDIT

EU New Books / Journals / Book Review


Data Mining. Methods for Knowledge Discovery

by Krzysztof Cios, Witold Pedrycz and Roman Swiniarski
Kluwer Academic Publishers, 1998, Boston/Dordrecht/London, 495 pp.

The overwhelming information flow of modern times urgently calls for computer-based methods to distill the knowledge that is both comprehensible and important to us. This book is a collection of such methods. Data Mining (DM) is a technology for Knowledge Discovery (KD), which means finding novel, understandable, and potentially useful patterns in data. DM for KD started taking shape as a discipline in its own right in the late 1980s, so the book addresses a relatively new area that is currently gaining momentum. Despite the immaturity of the area, the authors are not overly euphoric about it: they warn the reader to be cautious about calling KD an "evolutionary and unique approach". This view has resulted in an "umbrella" book consisting of nine self-contained chapters, each one introducing a specific area. These areas (machine learning, pattern recognition, neural networks, fuzzy systems, rough sets, evolutionary computing and data preprocessing) differ in terminology and technology but share the same main goal -- discovery of knowledge from data. It is always a good idea to give the methods their own labels before renaming them as the bricks of the new domain (DM and KD). Some coherence may be lost; however, preserving the individuality of the DM methods within their own theoretical frameworks makes the book an excellent compact reference tool.

The book begins with an enlightening introduction to the terminology and philosophy of DM and KD. The authors characterize the data (large volume, missing values, uncertainties of various types), outline the main knowledge patterns (rules and trees), and specify the basic models, which are detailed in the ensuing chapters.

Rough sets are introduced in Chapter 2 with the help of 23 examples, guiding the reader through the concepts of information systems, indiscernibility relations, lower and upper approximation of a set, and accuracy of approximation. Classification and extraction of decision rules by rough sets are explained, with data reduction playing a major role.
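
For readers unfamiliar with these notions, the following minimal Python sketch may help. It is an illustration written for this review and is not taken from the book; the toy table `patients`, the target set `flu`, and the function names are hypothetical. Objects with identical attribute values are grouped into indiscernibility classes; the lower approximation collects the classes lying entirely inside the target set, the upper approximation collects the classes that overlap it, and the accuracy is the ratio of their sizes.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return (lower, upper) approximations of `target`.

    objects    -- dict mapping object name -> dict of attribute values
    attributes -- attributes that define the indiscernibility relation
    target     -- set of object names to approximate
    """
    # Group objects that are indiscernible on the chosen attributes.
    classes = defaultdict(set)
    for name, row in objects.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(name)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # class lies entirely inside the target
            lower |= block
        if block & target:    # class overlaps the target
            upper |= block
    return lower, upper


def accuracy(lower, upper):
    """Accuracy of approximation: |lower| / |upper| (1.0 for an empty upper set)."""
    return len(lower) / len(upper) if upper else 1.0


# Hypothetical toy information system.
patients = {
    "p1": {"fever": "yes", "cough": "yes"},
    "p2": {"fever": "yes", "cough": "yes"},
    "p3": {"fever": "no", "cough": "yes"},
    "p4": {"fever": "no", "cough": "no"},
}
flu = {"p1", "p3"}

lower, upper = approximations(patients, ["fever", "cough"], flu)
print(sorted(lower), sorted(upper), accuracy(lower, upper))
```

Here p1 and p2 are indiscernible, so p1 cannot be placed in the target set with certainty: it appears only in the upper approximation, together with p2.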

Chapter 3 introduces fuzzy sets. Unlike many similar texts, the authors give proper attention to the crucial problem of how to determine membership functions: empirically (4 methods are given) or by parametric optimization. Starting with the simple concepts, the authors build the next level: fuzzy relations, triangular norms and co-norms, the extension principle, fuzzy arithmetic, measures of fuzziness, and "defuzzification" (representation of a fuzzy set by a single number). The links between fuzzy sets, rough sets, and probability are sketched. The "frame of cognition" (a set of semantically linked fuzzy sets on the same universal set) is presented in detail as the basis of fuzzy reasoning, a highly adequate model for DM and KD.
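
A few of these concepts can be made concrete with a short Python sketch. It is an illustration written for this review, not the authors' code: the triangular membership shape, the choice of minimum and maximum as the triangular norm and co-norm, the centroid method of defuzzification, and the fuzzy sets `warm` and `hot` are all assumptions chosen for simplicity.

```python
def triangular(a, b, c):
    """Return a triangular membership function with support (a, c) and peak at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu


def t_norm_min(u, v):
    """Minimum t-norm: one common model of fuzzy intersection (AND)."""
    return min(u, v)


def s_norm_max(u, v):
    """Maximum co-norm: one common model of fuzzy union (OR)."""
    return max(u, v)


def centroid(universe, mu):
    """Centroid defuzzification over a discrete universe of discourse."""
    total = sum(mu(x) for x in universe)
    if total == 0:
        raise ValueError("fuzzy set is empty on this universe")
    return sum(x * mu(x) for x in universe) / total


# Hypothetical fuzzy sets on a universe of discourse 0..10.
universe = range(0, 11)
warm = triangular(2, 5, 8)
hot = triangular(5, 8, 11)

x = 6
print("warm AND hot at 6:", t_norm_min(warm(x), hot(x)))
print("warm OR hot at 6:", s_norm_max(warm(x), hot(x)))
print("single-number representation of 'warm':", centroid(universe, warm))
```

Other norms (for example the product) or other defuzzification methods could be substituted without changing the structure of the sketch.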

Bayesian decision theory is presented in Chapter 4. Relevant notions from classical pattern recognition are introduced: Bayes risk, minimum risk classifier, decision regions, discriminant functions (deriving the linear and quadratic discriminant classifiers for normal class-conditional probability density functions), and parametric, nonparametric, and semiparametric probability density estimation. Chapter 4 concludes with a probabilistic neural network (PNN) model. Compared with the rest of the book, this chapter is quite condensed, has a much more theoretical flavor, and requires some mathematical background. The authors use clear, adequate, and consistent notation and terminology, widely accepted within the pattern recognition community, which makes the text readable and very useful for the interested audience.

Combinatorial problems arise frequently in data mining, e.g., when selecting a robust, consistent and comprehensible core from a large rule base. To address this type of problem Chapter 5 introduces evolutionary computing, covering the basics of Genetic Algorithms (GAs), and exemplifying evolutionary strategies, evolutionary programming, and genetic programming.

Machine Learning (ML) methods are described in Chapter 6. Unlike Bayesian methods and neural networks, which are "black boxes", ML output does not require further deciphering: typically it comes as a rule base or a decision tree. This makes ML methods a very suitable model for DM and KD. Chapter 6 details two popular ML designs: AQ algorithms for rule extraction, and ID algorithms for constructing decision trees. A hybrid of the two is also explained, and all three types are compared on synthetic and real data. A collection of DM methods is illustrated on a cardiological diagnostic problem. Discretization of continuous-valued attributes is also well covered.

A glimpse of Neural Networks (NN) is given in Chapter 7, beginning with NN architectures and learning modes. Radial Basis Function (RBF) networks are chosen as the main DM design. Although RBF networks are not particularly suitable for large data volumes, their relationship with a class of fuzzy if-then systems makes the trained RBF network interpretable and understandable, as explained in 7.3. Kohonen's Self Organizing Map (SOM) and an image recognition NN complete Chapter 7.

Clustering is undoubtedly an important facet of DM and KD. Chapter 8 begins with an introduction to hierarchical and objective-function-based (traditional and fuzzy) clustering and proceeds with the well-illustrated advanced concept of context-oriented fuzzy clustering.

Preprocessing of data is treated methodically in Chapter 9. Principal Component Analysis (PCA), Karhunen-Loeve expansion, and Fisher's projection are suggested for feature extraction. A general paradigm for feature selection is presented, discussing feature selection criteria and search algorithms. This chapter is a natural complement to Chapter 4 (Bayesian methods); together they give an excellent succinct account of the statistical pattern recognition background of data mining and knowledge discovery.

The book is suitable as a course text not only in Data Mining but in a variety of subjects. By giving a brief introduction to diverse areas and guiding the reader to some advanced stages, the book has great reference value -- not only for beginners but also for specialists who want to learn more about related areas. Finally, the book summarizes the building blocks of a relatively new discipline, and thereby sets up the perspective for these (for now) separate subjects to merge and evolve into a new monolithic Knowledge Discovery theory.

Ludmila I. Kuncheva
School of Mathematics
University of Wales

