ForkXplorer: an approach of fork summary generation

作者:Zhang Zhang, Xinjun Mao, Chao Zhang, Yao Lu

摘要

Pull-based development has become an important paradigm for distributed software development. In this model, each developer independently works on a copied repository (i.e., a fork) from the central repository. It is essential for developers to maintain awareness of the state of other forks to improve collaboration efficiency. In this paper, we propose a method to automatically generate a summary of a fork. We first use the random forest method to generate the label of a fork, i.e., feature implementation or a bug fix. Based on the information of the fork-related commits, we then use the TextRank algorithm to generate detailed activity information of the fork. Finally, we apply a set of rules to integrate all related information to construct a complete fork summary. To validate the effectiveness of our method, we conduct 30 groups of manual experiment and 77 groups of case studies on Github. We propose Feaavg to evaluate the performance of the generated fork summary, considering the content accuracy, content integrity, sentence fluency, and label extraction accuracy. The results show that the average of Feaavg of the fork summary generated by this method is 0.672. More than 63% of project maintainers and the contributors believe that the fork summary can improve development efficiency.

论文关键词:open source software, pull-based development, fork summary, distributed cooperative development

论文评审过程:

论文官网地址:https://doi.org/10.1007/s11704-020-0047-4