Learning software requirements syntax: An unsupervised approach to recognize templates

Authors:

Abstract

Requirements are textual representations of desired software capabilities. Many templates, such as Rupp's, EARS, and User Stories, have been used to standardize the structure of requirement statements. Templates help improve various Requirements Engineering (RE) tasks because their well-defined syntax simplifies the text processing steps on which RE automation research relies. However, many empirical studies have concluded that there is a gap between this research and its adoption in industrial, real-life projects. The success of RE automation approaches depends strongly on how consistently the requirements follow the syntax of the predefined templates. Such consistency cannot be guaranteed in real projects, especially in large development projects or when one has little control over the requirements authoring environment.

In this paper, we propose an unsupervised approach that recognizes templates from the requirements themselves by extracting their common syntactic structures. The resulting templates reflect the actual syntactic structure of the requirements; hence the approach can recognize both standard and non-standard templates. Our approach uses techniques from Natural Language Processing and Graph Theory and proceeds in three main stages: (1) we formulate the problem as a graph problem, where each requirement is represented as a vertex and each pair of requirements has a structural similarity; (2) we detect the main communities in the resulting graph by applying a hybrid technique that combines limited dynamic programming and greedy algorithms; (3) finally, we reinterpret the detected communities as templates.

Our experiments show that the suggested approach can detect templates that follow well-known standards with an F1-measure of 0.90. Moreover, the approach can detect common syntactic features of non-standard templates in more than 73.5% of the cases. Our evaluation indicates that these results are robust regardless of the number and the length of the processed requirements.
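
To make the three stages concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that each requirement has already been reduced to a sequence of coarse syntactic tags (the tagging step is not shown), uses `difflib.SequenceMatcher` as a stand-in structural similarity, and replaces the hybrid dynamic programming and greedy community detection described in the abstract with plain connected components over a thresholded graph. The tag sequences, the names `build_graph`, `find_communities`, and `detect_templates`, and the threshold of 0.8 are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher
from functools import reduce
from itertools import combinations

# Hypothetical input: requirement id -> sequence of coarse syntactic tags.
REQUIREMENTS = {
    "R1": ["DET", "NOUN", "MODAL", "VERB", "DET", "NOUN"],
    "R2": ["DET", "NOUN", "MODAL", "VERB", "DET", "ADJ", "NOUN"],
    "R3": ["ADP", "DET", "NOUN", "PRON", "VERB", "PART", "VERB", "NOUN"],
    "R4": ["ADP", "DET", "NOUN", "PRON", "VERB", "PART", "VERB", "DET", "NOUN"],
}


def structural_similarity(a, b):
    """Stand-in similarity in [0, 1] between two tag sequences."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def build_graph(reqs, threshold=0.8):
    """Stage 1: one vertex per requirement, an edge where similarity >= threshold."""
    graph = {name: set() for name in reqs}
    for u, v in combinations(reqs, 2):
        if structural_similarity(reqs[u], reqs[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def find_communities(graph):
    """Stage 2 (simplified): connected components instead of the paper's hybrid method."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        groups.append(sorted(group))
    return sorted(groups)


def common_structure(a, b):
    """Tags shared by both sequences, in order, taken from the matching blocks."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    shared = []
    for block in matcher.get_matching_blocks():
        shared.extend(a[block.a:block.a + block.size])
    return shared


def detect_templates(reqs, threshold=0.8):
    """Stage 3: reinterpret each community as the structure its members share."""
    groups = find_communities(build_graph(reqs, threshold))
    return {
        tuple(group): reduce(common_structure, (reqs[name] for name in group))
        for group in groups
    }


if __name__ == "__main__":
    for members, template in detect_templates(REQUIREMENTS).items():
        print(members, "->", " ".join(template))
```

On this toy input, the two "shall"-style requirements and the two user-story-style requirements fall into separate groups, and each group's template is the tag sequence its members share. A faithful reproduction would need the paper's own similarity measure and community detection procedure, which the abstract does not specify in detail.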

Keywords: Requirements Engineering, Requirements templates recognition, Natural Language Processing (NLP), Syntax learning, Graph community detection

Review history: Received 27 September 2021, Revised 4 April 2022, Accepted 27 April 2022, Available online 4 May 2022, Version of Record 13 May 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2022.108933