Fuzzy association rule mining for feature and model structure selection

Effective methods for feature and model structure selection are essential for data-driven modeling and system identification. A new method was developed for selecting important variables in nonlinear (dynamic) models with mixed discrete (categorical, fuzzy) and continuous inputs and outputs. The method applies fuzzy association rule mining, and the selection of important variables (the model structure) is based on two rule interestingness measures. It is able to select the most relevant variables in nonlinear feature selection problems. Moreover, it selects the right model order of strongly nonlinear dynamical systems, so it can be an efficient tool for process modeling.

F. P. Pach, A. Gyenesei and J. Abonyi, MOSSFARM: Model structure selection by fuzzy association rule mining, Journal of Intelligent and Fuzzy Systems, pp. 399-407 (2008)
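
To make the idea of scoring variables with fuzzy rule measures more concrete, the following is a minimal Python sketch. The summary above does not name the two interestingness measures, so fuzzy support and fuzzy confidence are used here purely as an assumption. The triangular fuzzy sets, the product t-norm, the toy records and all function names are likewise hypothetical, and this is not the MOSSFARM implementation.

```python
# Illustrative sketch only: fuzzy support and confidence of rules "x is low -> y is low".
# The measures, fuzzy sets, data and names below are assumptions, not taken from the paper.

def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy partition of the unit interval.
FUZZY_SETS = {
    "low": (-0.5, 0.0, 0.5),
    "high": (0.5, 1.0, 1.5),
}

def membership(record, itemset):
    """Degree to which a record matches all (variable, label) items (product t-norm)."""
    degree = 1.0
    for variable, label in itemset:
        degree *= triangular(record[variable], *FUZZY_SETS[label])
    return degree

def fuzzy_support(records, itemset):
    """Average matching degree of the itemset over all records."""
    return sum(membership(r, itemset) for r in records) / len(records)

def fuzzy_confidence(records, antecedent, consequent):
    """Support of antecedent and consequent together, relative to the antecedent alone."""
    base = fuzzy_support(records, antecedent)
    if base == 0.0:
        return 0.0
    return fuzzy_support(records, antecedent + consequent) / base

# Toy data: y copies x1, while x2 is unrelated to y.
records = [
    {"x1": 0.0, "x2": 0.0, "y": 0.0},
    {"x1": 0.1, "x2": 1.0, "y": 0.1},
    {"x1": 0.9, "x2": 0.0, "y": 0.9},
    {"x1": 1.0, "x2": 1.0, "y": 1.0},
]

supp_x1 = fuzzy_support(records, [("x1", "low")])
conf_x1 = fuzzy_confidence(records, [("x1", "low")], [("y", "low")])
conf_x2 = fuzzy_confidence(records, [("x2", "low")], [("y", "low")])
print(conf_x1, conf_x2)
```

On this toy data the rule built on x1 reaches a higher confidence than the rule built on x2, which is the kind of signal a rule-based selection procedure could use to rank candidate inputs.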

Compact and accurate fuzzy classifiers can be constructed by fuzzy association rule mining

Interpretability and accuracy are critical issues in many classification applications. Associative classifiers can achieve high accuracy, but their predictions rely on very large rule sets. In contrast, a new method was developed that produces fuzzy classifiers that are both very compact and accurate. It therefore helps in understanding the relationships in the data and the prediction mechanism in several types of classification problems.


F. P. Pach, A. Gyenesei and J. Abonyi, Compact fuzzy association rule-based classifier, Expert Systems with Applications, vol. 34, no. 4, pp. 2406-2416 (2008)

Bit-Table Based Biclustering and Frequent Closed Itemset Mining in High-Dimensional Binary Data

During the last decade, various algorithms have been developed and proposed for discovering overlapping clusters in high-dimensional data. The two most prominent application fields in this research, proposed independently, are frequent itemset mining (developed for market basket data) and biclustering (applied to gene expression data analysis). The common limitation of both methodologies is their limited applicability to very large binary data sets. In this paper we propose a novel and efficient method to find both frequent closed itemsets and biclusters in high-dimensional binary data. The method is based on simple but very powerful matrix and vector multiplication approaches that ensure that all patterns can be discovered quickly. The proposed algorithm has been implemented in the commonly used MATLAB environment.

Bit-table representation of market basket data.
A. Király, A. Gyenesei, J. Abonyi, Bit-Table Based Biclustering and Frequent Closed Itemset Mining in High-Dimensional Binary Data, The Scientific World Journal, vol. 2014, Article ID 870406, 7 pages
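
As a rough illustration of how matrix and vector products on a bit-table can yield itemset supports and closed itemsets, here is a small Python sketch. It is not the Bittable_TID algorithm and is written in Python rather than MATLAB. The toy bit-table, the helper names and the closure procedure are assumptions chosen only to show the principle.

```python
# Illustrative sketch only: supports and closures from a binary bit-table.
# Rows are transactions, columns are items; B[r][j] == 1 if transaction r contains item j.

B = [
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

def mat_vec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

Bt = transpose(B)

# Co-occurrence matrix B^T B: entry [i][j] counts transactions containing items i and j.
# The diagonal holds the support count of each single item.
C = [mat_vec(Bt, col) for col in Bt]

def closure(B, items):
    """Return (rows, closed_items) for an itemset, using two matrix-vector products."""
    n_items = len(B[0])
    s = [1 if j in items else 0 for j in range(n_items)]
    # A row supports the itemset if it contains every selected item.
    hits = mat_vec(B, s)
    t = [1 if h == len(items) else 0 for h in hits]
    support = sum(t)
    # An item belongs to the closure if it occurs in every supporting row.
    counts = mat_vec(transpose(B), t)
    closed_items = {j for j, c in enumerate(counts) if c == support}
    rows = {i for i, flag in enumerate(t) if flag}
    return rows, closed_items

rows, closed_items = closure(B, {0})
print(rows, closed_items)
```

In this toy table items 0 and 1 always occur together, so the closure of {0} also contains item 1. The supporting rows and the closed itemset together form an all-ones submatrix, which is how a closed itemset can be read as a bicluster.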
Bittable_TID - a Bit-Table based biclustering tool
Bittable_TID is a biclustering tool written in MATLAB. It provides a fast solution for finding all biclusters within a binary data matrix.
Quick Download and running guide
You can download the MATLAB source code, the other software tools used for comparison, and the data sets from here:
After downloading and unpacking the program package, run the program by opening the file bittable_TID.m in MATLAB. The resulting closed itemsets are stored in the variable itemsCell. Each row represents a closed itemset: the first column contains the rows involved and the second column contains the columns involved.