SharesSkew: An algorithm to handle skew for joins in MapReduce

Authors:

Highlights:

• A novel algorithm to handle skew for multiway joins.

• Introduction of residual joins to deal with skew.

• Communication cost minimization based on reducer capacity.

• Examination of important classes of joins.

Keywords: MapReduce, Data Skew, Parallel Join Processing

Article history: Received 22 August 2016, Revised 13 February 2018, Accepted 4 June 2018, Available online 14 June 2018, Version of Record 27 June 2018.

Official paper page: https://doi.org/10.1016/j.is.2018.06.005