Data Mining. Methods for Knowledge Discovery
by Krzysztof Cios, Witold Pedrycz and Roman
Swiniarski
Kluwer Academic Publishers, 1998, Boston/Dordrecht/London, 495
pp.
The overwhelming information flow of modern times is urgently
calling for computer-based methods to distill the knowledge that is
both comprehensible and important to us. This book is a collection of
such methods. Data Mining (DM) is a technology for Knowledge Discovery
(KD), which means finding novel, understandable, and potentially
useful patterns in data. DM for KD started taking shape as a
distinct discipline in the late 1980s, so the book addresses a
relatively new area that is currently gaining momentum. Despite the
immaturity of the area, the authors are not overly euphoric about it:
they warn the reader to be cautious about calling KD an "evolutionary
and unique approach". This view has resulted in an "umbrella"
book consisting of nine self-contained chapters, each one introducing
a specific area. These areas (machine learning, pattern recognition,
neural networks, fuzzy systems, rough sets, evolutionary computing and
data preprocessing) differ in terminology and technology but have the
same main goal -- discovery of knowledge from data. It is always a
good idea to give the methods their own labels before renaming them as
the bricks of the new domain (DM and KD). A certain coherence may be
lost; however, the preserved individuality of the DM methods within
their own theoretical frameworks makes the book an excellent compact
reference tool.
The book begins with an enlightening introduction to the
terminology and philosophy of DM and KD. The authors characterize the
data (large volume, missing values, uncertainties of various types),
outline the main knowledge patterns (rules and trees), and specify the
basic models, detailed in the ensuing chapters.
Rough sets are introduced in Chapter 2 with the help of 23 examples,
guiding the reader through the concepts of information systems,
indiscernibility relations, lower and upper approximations of a set,
and accuracy of approximation. Classification and extraction of
decision rules by rough sets are explained, with data reduction
playing a major role.
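
For readers new to these notions, the following is a minimal sketch of lower and upper approximations and the accuracy of approximation. It is not taken from the book: the toy information system, the attribute names, and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def approximations(table, attributes, target):
    """Return (lower, upper) approximations of `target`, a set of object ids.

    Objects with identical values on `attributes` are indiscernible and
    fall into the same equivalence class.
    """
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # class lies entirely inside the target set
            lower |= block
        if block & target:    # class overlaps the target set
            upper |= block
    return lower, upper


def accuracy(lower, upper):
    """Accuracy of approximation: |lower| / |upper| (1.0 for an empty upper set)."""
    return len(lower) / len(upper) if upper else 1.0


# Hypothetical information system: four patients, two condition attributes.
TABLE = {
    "x1": {"headache": "yes", "temp": "high"},
    "x2": {"headache": "yes", "temp": "high"},
    "x3": {"headache": "no", "temp": "high"},
    "x4": {"headache": "no", "temp": "normal"},
}
FLU = {"x1", "x3"}  # hypothetical decision class to approximate

lower, upper = approximations(TABLE, ["headache", "temp"], FLU)
print(sorted(lower), sorted(upper), accuracy(lower, upper))
```

Here x1 and x2 are indiscernible but only x1 belongs to the target set, so both fall in the upper approximation and neither in the lower one.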
Chapter 3 introduces fuzzy sets. Unlike many similar texts, the
authors give proper attention to the crucial problem of how to
determine membership functions: empirically (four methods are given) or
by parametric optimization. Starting with the simple concepts, the
authors build the next level: fuzzy relations, triangular norms and
co-norms, extension principle, fuzzy arithmetic, measures of
fuzziness, "defuzzification" (representation of a fuzzy set
by a single number). The links between fuzzy sets, rough sets, and
probability are sketched. The "frame of cognition" (a set of
semantically linked fuzzy sets on the same universal set) is presented
in detail as the basis of fuzzy reasoning, a highly adequate model for
DM and KD.
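
As a rough illustration of some of these concepts, the sketch below builds a triangular membership function, applies the minimum and maximum as one common choice of triangular norm and co-norm, and defuzzifies by a discrete centroid. The parameter values and function names are assumptions made for this example, and the centroid is only one of several possible defuzzification methods; none of this is claimed to be the book's own formulation.

```python
def triangular(a, b, c):
    """Triangular membership function with feet a, c and peak b (assumes a < b < c)."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu


def t_norm_min(u, v):
    """Minimum t-norm, a common model of fuzzy intersection."""
    return min(u, v)


def t_conorm_max(u, v):
    """Maximum t-conorm, a common model of fuzzy union."""
    return max(u, v)


def centroid(mu, xs):
    """Discrete centroid defuzzification of `mu` over the sample points `xs`."""
    weights = [mu(x) for x in xs]
    total = sum(weights)
    if total == 0:
        raise ValueError("membership is zero on all sample points")
    return sum(x * w for x, w in zip(xs, weights)) / total


# Hypothetical fuzzy set "warm" on a universe sampled from 0 to 10.
warm = triangular(2.0, 5.0, 8.0)
xs = [i * 0.1 for i in range(101)]
print(warm(3.5), centroid(warm, xs))
```

Because the example set is symmetric about its peak, the centroid coincides with the peak; for asymmetric sets the two differ.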
Bayesian decision theory is presented in Chapter 4. Relevant notions
from classical pattern recognition are introduced: Bayes risk, minimum
risk classifier, decision regions, discriminant functions (deriving
the linear and quadratic discriminant classifier for normal
class-conditional probability density functions), parametric,
nonparametric and semiparametric probability density estimation.
Chapter 4 concludes with a probabilistic neural network model (PNN).
Compared with the rest of the book, this chapter is quite condensed,
has a much more theoretical flavor, and requires some mathematical
background. The authors use clear, adequate, and consistent notations
and terminology, widely accepted within the pattern recognition
community, which makes the text readable and very useful for the
interested audience.
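
The minimum risk classifier mentioned above can be sketched in a few lines. The posterior probabilities and loss values below are invented for illustration, and the function name is hypothetical; the sketch assumes the posteriors are already available rather than estimated from data.

```python
def min_risk_decision(posteriors, loss):
    """Pick the action with the smallest conditional risk.

    posteriors[j] is P(class j | x); loss[i][j] is the cost of taking
    action i when the true class is j.
    """
    risks = [
        sum(loss[i][j] * posteriors[j] for j in range(len(posteriors)))
        for i in range(len(loss))
    ]
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks


# Hypothetical two-class problem.
posteriors = [0.7, 0.3]

# Zero-one loss: the decision reduces to choosing the largest posterior.
zero_one = [[0, 1], [1, 0]]
choice_01, risks_01 = min_risk_decision(posteriors, zero_one)

# Asymmetric loss: wrongly choosing class 0 is ten times as costly.
asymmetric = [[0, 10], [1, 0]]
choice_as, risks_as = min_risk_decision(posteriors, asymmetric)

print(choice_01, risks_01)
print(choice_as, risks_as)
```

With the asymmetric loss the decision flips to class 1 even though class 0 has the larger posterior, which is the point of weighting decisions by risk rather than by probability alone.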
Combinatorial problems arise frequently in data mining, e.g., when
selecting a robust, consistent and comprehensible core from a large
rule base. To address this type of problem Chapter 5 introduces
evolutionary computing, covering the basics of Genetic Algorithms
(GAs), and exemplifying evolutionary strategies, evolutionary
programming, and genetic programming.
Machine Learning (ML) methods are described in Chapter 6. Unlike
Bayesian methods and neural networks, which are "black boxes", ML
output does not require further deciphering: typically it comes as a
rule base or a decision tree. This makes ML methods a very suitable
model for DM and KD. Chapter 6 details two popular ML designs: AQ
algorithms for rule extraction, and ID algorithms for constructing
decision trees. A hybrid algorithm between the two is also explained,
and all three types are compared on synthetic and real data. A
collection of DM methods is illustrated on a cardiological diagnostic
problem. Discretization of continuous-valued attributes is also well
covered.
A glimpse of Neural Networks (NN) is given in Chapter 7, beginning
with NN architectures and learning modes. Radial Basis Function (RBF)
networks are chosen as the main DM design. Although RBF networks are
not particularly suitable for large data volumes, their relationship
with a class of fuzzy if-then systems makes the trained RBF network
interpretable and understandable, as explained in 7.3. Kohonen's
Self-Organizing Map (SOM) and an image recognition NN complete Chapter 7.
Clustering is undoubtedly an important facet of DM and KD.
Chapter 8 begins with an introduction to hierarchical and objective
function based (traditional and fuzzy) clustering and proceeds with
the well illustrated advanced concept of context-oriented fuzzy
clustering.
Data preprocessing is treated methodically in Chapter 9.
Principal Component Analysis (PCA), Karhunen-Loeve expansion, and
Fisher's projection are suggested for feature extraction. A general
paradigm for feature selection is presented, discussing feature
selection criteria and search algorithms. This chapter is a natural
complement to Chapter 4 (Bayesian methods); together they give an
excellent succinct account of the statistical pattern recognition
background of data mining and knowledge discovery.
The book is suitable as a course text not only in Data Mining but in
a variety of subjects. Giving a brief introduction to diverse areas
and guiding the reader to some advanced stages, the book has a great
reference value -- not only for beginners but also for specialists who
want to learn more about related areas. Finally, the book summarizes
the building blocks of a relatively new discipline, and thereby sets
up the perspective for these (for now) separate subjects to merge
and evolve into a new monolithic Knowledge Discovery theory.
Ludmila I. Kuncheva
School of Mathematics
University of Wales