Classifying emotions in Stack Overflow and JIRA using a multi-label approach

Authors:

Highlights:

Abstract

A forum or social media post can express multiple emotions, such as love, joy or anger. Emotion classification has proven useful for measuring aspects such as user satisfaction. Despite its usefulness, research on emotion classification remains limited, because the task is multi-label and publicly available data sets and lexica are scarce. A number of emotion classifiers for general-domain text have been proposed recently, but only a few, such as EmoTxt, target text in the domain of Open Source Software (OSS). In this paper, we explore different lexica and two multi-label algorithms for classifying emotions in text related to OSS. We trained various multi-label classifiers using HOMER and RAkEL on a data set of Stack Overflow posts and a data set of JIRA Issue Tracker comments. The classifiers were enriched with features derived from different state-of-the-art lexica. We achieved multi-label Micro F-scores of up to 0.811 and a Subset 0/1 Loss of 0.290. These results represent a statistically significant improvement over the state of the art.
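
The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: the Micro F-score pools true positives, false positives and false negatives over all labels and posts, and the Subset 0/1 Loss is the fraction of posts whose predicted label set is not an exact match. The function names and the toy labels below are hypothetical and are not taken from the paper or its data sets.

```python
from typing import List, Set


def micro_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Micro-averaged F-score: counts are pooled over all posts and labels."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))  # labels correctly predicted
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))  # labels predicted but absent
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))  # labels present but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def subset_zero_one_loss(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Fraction of posts whose predicted label set differs from the true set."""
    if not y_true:
        raise ValueError("y_true must not be empty")
    mismatches = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return mismatches / len(y_true)


# Hypothetical toy example: four posts, each with a set of emotion labels.
y_true = [{"joy", "love"}, {"anger"}, {"sadness"}, set()]
y_pred = [{"joy"}, {"anger"}, {"fear"}, set()]

print(round(micro_f1(y_true, y_pred), 3))    # 0.571 (tp=2, fp=1, fn=2)
print(subset_zero_one_loss(y_true, y_pred))  # 0.5 (two of four sets mismatch)
```

Note that the two measures point in opposite directions: a higher Micro F-score is better, while a lower Subset 0/1 Loss is better, since the latter penalizes any post whose label set is not predicted exactly.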

Keywords: Multi-label classification, Emotion classification, Stack Overflow, Jira Issue Tracker

Review history: Received 14 October 2019, Revised 4 February 2020, Accepted 7 February 2020, Available online 14 February 2020, Version of Record 4 April 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2020.105633