Mining from incomplete quantitative data by fuzzy rough sets

Abstract

Machine learning can extract desired knowledge from existing training examples and ease the development bottleneck in building expert systems. Most learning approaches derive rules from complete data sets. A data set is called incomplete if some of its attribute values are unknown, and learning from incomplete data sets is usually more difficult than learning from complete ones. Rough-set theory has been widely used for data classification problems, but most conventional mining algorithms based on it identify relationships among data using crisp attribute values. Data with quantitative values, however, are common in real-world applications. In this paper, we therefore address the problem of learning from incomplete quantitative data sets based on rough sets. We propose a learning algorithm that simultaneously derives certain and possible fuzzy rules from incomplete quantitative data sets and estimates the missing values during the learning process. Quantitative values are first transformed into fuzzy sets of linguistic terms using membership functions. Unknown attribute values are then assumed to be any possible linguistic term and are gradually refined according to the fuzzy incomplete lower and upper approximations derived from the given quantitative training examples. The examples and the approximations then interact with each other to derive certain and possible rules and to estimate appropriate values for the unknowns. The derived rules can then serve as knowledge about the incomplete quantitative data set.
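
The first two steps described in the abstract, transforming quantitative values into fuzzy sets of linguistic terms and treating an unknown value as any possible term, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three terms `Low`, `Middle`, and `High`, their triangular membership functions over a 0 to 100 scale, and the helper names `triangular`, `fuzzify`, and `candidate_terms`. The refinement of unknown values through fuzzy incomplete lower and upper approximations is not shown.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical triangular membership functions (a, b, c) for one attribute
# on an assumed 0-100 scale; the paper's actual functions may differ.
TERMS: Dict[str, Tuple[float, float, float]] = {
    "Low": (0.0, 0.0, 50.0),
    "Middle": (0.0, 50.0, 100.0),
    "High": (50.0, 100.0, 100.0),
}


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership of x in a triangle with feet a and c and peak b."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)  # rising edge (a < b is guaranteed here)
    return (c - x) / (c - b)      # falling edge (b < c is guaranteed here)


def fuzzify(value: float) -> Dict[str, float]:
    """Map a quantitative value to its nonzero linguistic-term memberships."""
    memberships = {
        term: triangular(value, *params) for term, params in TERMS.items()
    }
    return {term: mu for term, mu in memberships.items() if mu > 0.0}


def candidate_terms(value: Optional[float]) -> List[str]:
    """Linguistic terms an attribute value may take.

    A missing value (None) is initially assumed to be any possible term.
    """
    if value is None:
        return sorted(TERMS)
    return sorted(fuzzify(value))


if __name__ == "__main__":
    print(fuzzify(25.0))          # known value
    print(candidate_terms(None))  # unknown value: every term is a candidate
```

In this sketch a known value contributes only the terms with nonzero membership, while an unknown value starts with the full term set; an algorithm of the kind the abstract describes would then narrow that set as the approximations are computed.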

Keywords: Rough set, Machine learning, Fuzzy certain rule, Fuzzy possible rule, Incomplete data, Quantitative data

Review history: Available online 20 August 2009.

Official paper URL: https://doi.org/10.1016/j.eswa.2009.08.002