Uncertain deductive databases: A hybrid approach

摘要

Uncertainty in deductive databases and logic programming has been modeled using a variety of (numeric and non-numeric) formalisms in the past, including probabilistic, possibilistic, and fuzzy set-theoretic approaches, and many valued logic programming. In this paper, we consider a hybrid approach to the modeling of uncertainty in deductive databases. Our model, called deductive IST (DIST) is based on an extension of the Information Source Tracking (IST) model, recently proposed for relational databases. The DIST model permits uncertainty to be modeled and manipulated in essentially qualitative terms with an option to convert qualitative expressions of uncertainty into numeric form (e.g., probabilities). An uncertain deductive database is modeled as a Horn clause program in the DIST framework, where each fact and rule is annotated with an expression indicating the “sources” contributing to this information and their nature of contribution. (1) We show that positive DIST programs enjoy the least model/least fixpoint semantics analogous to classical logic programs. (2) We show that top-down (e.g., SLD-resolution) and bottom-up (e.g., magic sets rewriting followed by semi-naive evaluation) query processing strategies developed for datalog can be easily extended to DIST programs. (3) Results and techniques for handling negation as failure in classical logic programming can be easily extended to DIST. As an illustration of this, we show how stratified negation can be so extended. We next study the problem of query optimization in such databases and establish the following results. (4) We formulate query containment in qualitative as well as quantitative terms. Intuitively, our qualitative sense of containment would say a query Q1 is contained in a query Q2 provided for every input database D, for every tuple t, t ϵ Q2(D) holds in every “situation” in which t ϵ Q1(D) is true. The quantitative notion of containment would say Q1 is contained in Q2 provided on every input, the certainty associated with any tuple computed by Q1 is no more than the certainty associated with the same tuple by Q2 on the given input. We also prove that qualitative and quantitative notions of containment (both absolute and uniform versions) coincide. (5) We establish necessary and sufficient conditions for the qualitative containment of conjunctive queries. (6) We extend the well-known chase technique to develop a test for uniform containment and equivalence of positive DIST programs. (7) Finally, we prove that the complexity of testing containment of conjunctive DIST queries remains the same as in the classical case when number of information sources is regarded as a constant (so, it's NP-complete in the size of the queries). We also show that testing containment of conjunctive queries is co-NP-complete in the number of information sources.