Logical analysis of binary data with missing bits

作者:

摘要

We model a given pair of sets of positive and negative examples, each of which may contain missing components, as a partially defined Boolean function with missing bits (pBmb) (T~,F~), where T~-⊂{0,1*}n and F~-⊂{0,1*}n, and “*” stands for a missing bit. Then we consider the problem of establishing a Boolean function (an extension) f : 0, 1n → 0, 1 belonging to a given function class C, such that f is true (respectively, false) for every vector in T~ (respectively, in F~. This is a fundamental problem, encountered in many areas such as learning theory, pattern recognition, example-based knowledge bases, logical analysis of data, knowledge discovery and data mining. In this paper, depending upon how to deal with missing bits, we formulate three types of extensions called robust, consistent and most robust extensions, for various classes of Boolean functions such as general, positive, Horn, threshold, decomposable and k-DNF. The complexity of the associated problems are then clarified; some of them are solvable in polynomial time while the others are NP-hard.

论文关键词:Knowledge discovery,Data mining,Logical analysis of data,Boolean functions,Partially defined Boolean functions,Missing bits,NP-hardness

论文评审过程:Received 21 November 1997, Available online 30 June 1999.

论文官网地址:https://doi.org/10.1016/S0004-3702(98)00110-6