Bag Query Containment and Information Theory
The query containment problem is a fundamental algorithmic problem in data management. While this problem is well understood under set semantics, it is by far less understood under bag semantics. In this paper we unveil tight connections between information theory and the conjunctive query containment under bag semantics.
Authors: Mahmoud Abo Khamis, Phokion G. Kolaitis, Hung Q. Ngo, Dan Suciu. 2020.
In Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS ‘20).
The query containment problem is a fundamental algorithmic problem in data management. While this problem is well understood under set semantics, it is by far less understood under bag semantics. In particular, it is a long-standing open question whether or not the conjunctive query containment problem under bag semantics is decidable.We unveil tight connections between information theory and the conjunctive query containment under bag semantics.These connections are established using information inequalities, which are considered to be the laws of information theory. Our first main result asserts that deciding the validity of a generalization of information inequalities is many-one equivalent to the restricted case of conjunctive query containment in which the containing query is acyclic; thus, either both these problems are decidable or both are undecidable. Our second main result identifies a new decidable case of the conjunctive query containment problem under bag semantics. Specifically, we give an exponential-time algorithm for conjunctive query containment under bag semantics, provided the containing query is chordal and admits a simple junction tree.
Read the PDF: Bag Query Containment and Information Theory (opens in a new tab)
Related Posts
Computer Vision: Deep Dive into Object Segmentation Approaches
Join optimization has been dominated by Selinger-style, pairwise optimizers for decades. But, Selinger-style algorithms are asymptotically suboptimal for applications in graphic analytics. This suboptimality is one of the reasons that many have advocated supplementing relational engines with specialized graph processing engines.
Functional Aggregate Queries with Additive Inequalities
Motivated by fundamental applications in databases and relational machine learning, we formulate and study the problem of answering functional aggregate queries (FAQ) in which some of the input factors are defined by a collection of additive inequalities between variables.