Fuzzy classifiers

Our aim is to develop an algorithm for obtaining accurate and interpretable fuzzy rule-based classifiers from labeled observation data. A step-wise approach is proposed. In the first step, the structure of the rule model is initialized using statistical analysis of the labeled data and straightforward data-mining tools: from the labeled data set, one or more individual models are constructed for each class. The next step transforms the extracted information into an initial rule-based fuzzy classifier. The obtained classifier is then optimized for accuracy by adapting the model parameters with a real-coded genetic algorithm (GA). The GA search space is limited by constraining the parameter variations, which both improves the GA's convergence and maintains the transparency of the initial classifier.
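
The step-wise idea can be illustrated with a minimal Python sketch. It is not the published MATLAB implementation: the Gaussian membership functions, the one-rule-per-class initialization from class means and standard deviations, the tournament/blend/mutation operators, the bound factor alpha, and all function names are assumptions chosen for illustration only.

```python
import math
import random

def gauss(x, c, s):
    """Membership of x in a Gaussian fuzzy set with center c and width s."""
    return math.exp(-0.5 * ((x - c) / s) ** 2)

def init_rules(X, y):
    """Initialization: one rule per class, (center, width) per feature from class mean/std."""
    rules = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        params = []
        for j in range(len(X[0])):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
            params.append((mean, std or 1.0))  # fall back to 1.0 to avoid a zero width
        rules[label] = params
    return rules

def classify(rules, x):
    """Return the class whose rule has the largest product of memberships."""
    def activation(params):
        return math.prod(gauss(v, c, s) for v, (c, s) in zip(x, params))
    return max(rules, key=lambda label: activation(rules[label]))

def accuracy(rules, X, y):
    return sum(classify(rules, x) == t for x, t in zip(X, y)) / len(y)

def flatten(rules):
    return [p for label in sorted(rules) for pair in rules[label] for p in pair]

def unflatten(vec, rules):
    it = iter(vec)
    return {label: [(next(it), next(it)) for _ in rules[label]] for label in sorted(rules)}

def bounds(rules, alpha):
    """Constrain each parameter to a box around its initial value (assumed scheme)."""
    lo, hi = [], []
    for label in sorted(rules):
        for c, s in rules[label]:
            lo += [c - alpha * s, s * (1 - alpha)]
            hi += [c + alpha * s, s * (1 + alpha)]
    return lo, hi

def clip(vec, lo, hi):
    return [min(max(g, l), h) for g, l, h in zip(vec, lo, hi)]

def optimize(rules, X, y, generations=30, pop_size=20, alpha=0.5, seed=0):
    """Real-coded GA with elitism; every individual stays inside the bounds."""
    rng = random.Random(seed)
    lo, hi = bounds(rules, alpha)
    fitness = lambda vec: accuracy(unflatten(vec, rules), X, y)
    pop = [flatten(rules)]  # the initial classifier is always part of the population
    while len(pop) < pop_size:
        pop.append(clip([rng.uniform(a, b) for a, b in zip(lo, hi)], lo, hi))
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        nxt = scored[:2]  # elitism: the two best individuals survive unchanged
        while len(nxt) < pop_size:
            p1, p2 = (max(rng.sample(scored, 3), key=fitness) for _ in range(2))
            w = rng.random()
            child = [w * a + (1 - w) * b for a, b in zip(p1, p2)]  # arithmetic crossover
            child = [g + rng.gauss(0, 0.1 * (h - l)) if rng.random() < 0.2 else g
                     for g, l, h in zip(child, lo, hi)]  # Gaussian mutation
            nxt.append(clip(child, lo, hi))
        pop = nxt
    return unflatten(max(pop, key=fitness), rules)

# Toy two-class data, made up for this example
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [3.0, 0.5], [3.2, 0.7], [2.8, 0.4]]
y = [0, 0, 0, 1, 1, 1]
rules = init_rules(X, y)
tuned = optimize(rules, X, y)
base_acc, tuned_acc = accuracy(rules, X, y), accuracy(tuned, X, y)
print(base_acc, tuned_acc)
```

Because the initial classifier is kept in the population and the best individuals are carried over, the tuned accuracy on the training data cannot fall below the initial one, and the bounds keep every fuzzy set close to its initial, interpretable position.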

J. Abonyi, F. Szeifert, Supervised fuzzy clustering for the identification of fuzzy classifiers, Pattern Recognition Letters, Accepted, 2003 (MATLAB implementation)
J. Abonyi, J. A. Roubos, F. Szeifert, Data-driven generation of compact, accurate, and linguistically sound fuzzy classifiers based on a decision tree initialization, International Journal of Approximate Reasoning, (32) 1-21, 2003 (MATLAB implementation)
J.A. Roubos, M. Setnes, J. Abonyi, Learning fuzzy classification rules from labeled data, International Journal of Information Sciences, (150) 77–93, 2003
J. Abonyi, H. Roubos, Simple fuzzy classifier based on inconsistency analysis of labeled data, Chapter 12 in: CoIL Challenge 2000: The Insurance Company Case, Peter van der Putten and Maarten van Someren (eds), Sentient Machine Research, Amsterdam and Leiden Institute of Advanced Computer Science, Leiden, LIACS Technical Report 2000
J. Abonyi, B. Feil, Computational intelligence in data mining (keywords: KDD, computational intelligence, soft computing, fuzzy classifier system, rule base reduction, visualization), December 20, 2004
J. Abonyi and J.A. Roubos, Initialization of fuzzy classification rules, 5th Online World Conference on Soft Computing in Industrial Applications (WSC5), Internet, September 2000

Knowledge extraction and transformation

A framework for the extraction of linguistic rules from data was developed. Special attention was given to the interpretability of the models obtained by different computational intelligence techniques. Information transfer between different model types (decision trees, neural networks, and fuzzy models) has been studied.

J. Abonyi, F. Szeifert, Supervised fuzzy clustering for the identification of fuzzy classifiers, Pattern Recognition Letters, Accepted, 2003 (MATLAB implementation)
J. Abonyi, J. A. Roubos, F. Szeifert, Data-driven generation of compact, accurate, and linguistically sound fuzzy classifiers based on a decision tree initialization, International Journal of Approximate Reasoning, (32) 1-21, 2003 (MATLAB implementation)
H. Roubos, M. Setnes, J. Abonyi, Learning fuzzy classification rules from labeled data, International Journal of Information Sciences, (150) 77–93, 2003
J. Abonyi, R. Babuska, F. Szeifert, Modified Gath-Geva fuzzy clustering for identification of Takagi-Sugeno fuzzy models, IEEE Trans. on Systems, Man and Cybernetics, Part B, Oct, 2002

Interpretable fuzzy feature extraction and selection methods

The extraction of characteristic features and the projection of complex multivariate data onto lower-dimensional spaces are considered. With the proposed algorithm, nonlinearly correlated process data can be reduced by an interpretable fuzzy model.

J. Abonyi, J.A. Roubos, M. Oosterom, F. Szeifert, Compact TS-fuzzy models through clustering and OLS plus FIS model reduction, FUZZ-IEEE'01 Conference, Sydney, Australia, 2001, (MATLAB implementation)

A Simple Fuzzy Classifier based on Inconsistency Analysis of Labeled Data

An extremely simple fuzzy classifier is identified based on the inconsistency analysis of labeled training data. The method was applied to the CoIL Challenge 2000 direct-mail problem and yielded 121 caravan policies among the first 800 selected customers. As this result is identical to that of the winner of the competition, the presented method is an example of how the "try the simplest first" approach can be effective in real-life problems.

J. Abonyi, H. Roubos, Simple fuzzy classifier based on inconsistency analysis of labeled data, Chapter 12 in: CoIL Challenge 2000: The Insurance Company Case, Peter van der Putten and Maarten van Someren (eds), Sentient Machine Research, Amsterdam and Leiden Institute of Advanced Computer Science, Leiden, LIACS Technical Report, 1-10, 2000 (MATLAB implementation)

Supervised clustering based decision tree induction

A new method based on supervised clustering was developed for the discretization of continuous features to form efficient fuzzy decision tree based classifiers. A proper classification rule structure is obtained by the feature discretization, rule-induction and rule-optimization procedures. The resulting fuzzy classifiers are very compact and easily interpretable, while their accuracy remains comparable to the best results reported in the literature.
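
To make the notion of fuzzy discretization concrete, the following Python sketch turns a set of cluster centers along one continuous feature into overlapping triangular fuzzy sets. It does not reproduce the supervised clustering method itself; the triangular shape, the function name, and the example centers are assumptions made for illustration.

```python
def triangular_partition(centers):
    """Build a fuzzifier from distinct centers: one triangular set peaks at each
    center, and the memberships of any input sum to 1."""
    cs = sorted(centers)

    def memberships(x):
        mu = [0.0] * len(cs)
        if x <= cs[0]:          # left of the first center: fully in the first set
            mu[0] = 1.0
        elif x >= cs[-1]:       # right of the last center: fully in the last set
            mu[-1] = 1.0
        else:
            for k in range(len(cs) - 1):
                a, b = cs[k], cs[k + 1]
                if a <= x <= b:  # linear interpolation between neighboring sets
                    mu[k + 1] = (x - a) / (b - a)
                    mu[k] = 1.0 - mu[k + 1]
                    break
        return mu

    return memberships

# Hypothetical centers for the linguistic terms "low", "medium", "high"
fuzzify = triangular_partition([1.0, 3.0, 5.0])
print(fuzzify(2.0))  # halfway between "low" and "medium"
```

Each input value then belongs to at most two neighboring fuzzy sets, which keeps the resulting rule antecedents readable as linguistic terms.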

F. P. Pach, J. Abonyi, Association rule and decision tree based methods for fuzzy rule base generation, Enformatika (Transactions on Engineering, Computing and Technology), Volume 13, 45-50, 2006

Applications in Chemometrics

Fuzzy c-means (FCM) clustering models were used for the discrimination of organic compounds using piezoelectric chemical sensor array data of 14 analytes. Appropriate clusters are found by minimizing the sum of the weighted quadratic distances between data points and cluster prototypes. A priori known information can be integrated into the clustering algorithm by using constrained prototypes. The sensor array was built from four AT-cut piezoelectric quartz crystal sensors with 9 MHz fundamental frequencies. The sensing materials were OV1, OV275, ASI50, and polyphenyl ether. The appropriate coating materials were selected by principal component analysis. The fuzzy clustering method proved to be a reliable way of identifying similar, pure organic compounds.
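
For readers unfamiliar with FCM, a minimal pure-Python sketch of the standard algorithm is given here. It shows the objective (the membership-weighted sum of squared distances) and the alternating prototype and membership updates. It does not include the constrained prototypes mentioned above, and the toy points, the fuzziness exponent m = 2, and the function names are assumptions, not the sensor data or code of the study.

```python
import random

def sq_dist(p, c):
    return sum((a - b) ** 2 for a, b in zip(p, c))

def objective(points, centers, U, m=2.0):
    """Sum over points and clusters of membership^m times squared distance."""
    return sum(U[i][k] ** m * sq_dist(p, c)
               for i, p in enumerate(points) for k, c in enumerate(centers))

def fcm(points, n_clusters, m=2.0, iters=100, seed=0):
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, each row normalized to sum to 1
    U = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        U.append([u / total for u in row])
    centers = []
    for _ in range(iters):
        # Prototype update: mean of the points weighted by membership^m
        centers = []
        for k in range(n_clusters):
            w = [U[i][k] ** m for i in range(len(points))]
            centers.append([sum(wi * p[d] for wi, p in zip(w, points)) / sum(w)
                            for d in range(dim)])
        # Membership update: inversely related to squared distance to each prototype
        for i, p in enumerate(points):
            inv = [max(sq_dist(p, c), 1e-12) ** (-1.0 / (m - 1.0)) for c in centers]
            total = sum(inv)
            U[i] = [v / total for v in inv]
    return centers, U

# Two well-separated toy groups, made up for this example
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, U = fcm(points, n_clusters=2)
hard = [row.index(max(row)) for row in U]  # defuzzified cluster labels
print(hard, objective(points, centers, U))
```

The membership matrix U carries graded assignments, so a compound that lies between two prototypes shows up with comparable memberships in both clusters instead of being forced into one.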

G. Barkó, J. Abonyi, J. Hlavay, Application of fuzzy clustering and piezoelectric chemical sensor array for investigation on organic compounds, Analytica Chimica Acta, 398 (2-3), 219-22, 1999

Applications in qualitative identification of clinkers

The trace element content of clinkers (and possibly of cements) can be used for qualitative identification, i.e. for determining the manufacturing factory. Several hundred clinker sorts were analysed (using replicated quarterly samples collected from all Hungarian cement factories as well as from factories in 8 foreign countries) to determine their Mg, Sr, Ba, Mn, Ti, Zr, Zn and V content.

F. D. Tamás, J. Abonyi, World map of clinkers - Visualization of trace element content of clinkers by self-organizing map, 11th International Congress on the Chemistry of Cement (ICCC), South Africa, 2003
F. D. Tamás, J. Abonyi, Trace elements in clinker I. - A graphical representation, Cement and Concrete Research, 32/8, 1319-1323, 2002
F. D. Tamás, J. Abonyi, Trace elements in clinker II. - Qualitative identification by fuzzy clustering, Cement and Concrete Research, 32/8, 1325-1330, 2002