Using slice join for efficient evaluation of multi-way joins

作者:

Highlights:

摘要

A standard hash join algorithm joins two relations at a time and requires reading the entire smaller input before results are generated. There has been recent focus on constructing join algorithms that produce results faster and can join more than two relations simultaneously. Early joins that are capable of producing results before reading the smaller relation are useful for network joins where the input arrival rates may vary as the operator can adapt without explicit query re-optimization. Multi-way joins improve performance by reducing the number of intermediate results generated and are more resilient to poor estimates by the query optimizer. The only join algorithm that combines the two features of multi-way support and early result production is limited to processing joins where all inputs are joined on the same attribute. In this work, we propose a new hash-based join algorithm called slice join. Slice join is an early, multi-way join algorithm capable of joining relations on common attributes and relations connected by a sequence of functional dependencies. Slice join is useful for a larger number of query plans, performs fewer disk operations, and has a simpler duplicate detection technique than previous approaches. Experimental results demonstrate that slice join outperforms other multi-way join operators and binary join plans.

论文关键词:Early join algorithm,Reading policy,Interactive querying,Adaptive,Hashing

论文评审过程:Received 6 December 2007, Revised 26 May 2008, Accepted 6 June 2008, Available online 20 June 2008.

论文官网地址:https://doi.org/10.1016/j.datak.2008.06.001