FP-Hadoop: Efficient processing of skewed MapReduce jobs

Authors:

Highlights:

• A novel approach for dealing with data skew in the reduce side of MapReduce.

• Parallel reducing of each key, using multiple reduce workers.

• Hierarchical execution of MapReduce jobs.

• Non-overwhelming reducing of intermediate data.
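
The highlights above describe reducing a single key with several reduce workers and running the job hierarchically. The following is a minimal sketch of that general idea, not the paper's implementation. It assumes a word-count style job and an associative, commutative combine function (here `operator.add`). The names `split_into_fragments`, `partial_reduce`, `final_reduce`, and `run_job` are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from operator import add


def split_into_fragments(pairs, fragment_size):
    """Cut intermediate (key, value) pairs into fixed-size fragments.

    Fragments ignore key boundaries, so the values of one heavy key
    can be spread over several fragments (and thus several workers).
    """
    pairs = list(pairs)
    return [pairs[i:i + fragment_size]
            for i in range(0, len(pairs), fragment_size)]


def partial_reduce(fragment, combine=add):
    """First level: reduce one fragment to a partial result per key."""
    partial = {}
    for key, value in fragment:
        partial[key] = combine(partial[key], value) if key in partial else value
    return partial


def final_reduce(partials, combine=add):
    """Second level: merge the partial results into one value per key."""
    result = {}
    for partial in partials:
        for key, value in partial.items():
            result[key] = combine(result[key], value) if key in result else value
    return result


def run_job(records, fragment_size=3, workers=4):
    """Hypothetical driver: map, partial reduce in parallel, final reduce."""
    pairs = [(word, 1) for word in records]            # map phase
    fragments = split_into_fragments(pairs, fragment_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_reduce, fragments))
    return final_reduce(partials)


# Skewed input: key "a" dominates the data set.
records = ["a"] * 7 + ["b", "c"]
print(run_job(records))
```

In this toy input the seven values of key "a" fall into three fragments, so no single worker has to process all of them. The final step only sees one small partial result per fragment.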

Keywords: MapReduce, Data skew, Parallel data processing

Review history: Received 10 February 2016, Accepted 23 March 2016, Available online 1 April 2016, Version of Record 14 April 2016.

Official paper URL: https://doi.org/10.1016/j.is.2016.03.008