A machine learning-based investigation utilizing the in-text features for the identification of dominant emotion in an email

Abstract

Identifying the emotion hidden in a limited amount of text is an active research problem. This work presents a framework for doing so with email text. The framework is based on machine learning and uses three classifiers and three feature selection methods. Its novelty lies in the use of in-text features to identify the emotion contained in short texts and in the development of a dataset for this purpose. Six emotions are considered, chosen on the basis of baseline theories of human emotion: neutral, happy, sad, angry, positively surprised, and negatively surprised. Experiments are performed on three datasets, including a benchmark dataset and one local dataset, by extracting 14 in-text features from the data. The framework is evaluated using four standard evaluation metrics. Based on the feature selection results, each dataset is vertically partitioned into all features, top features, and bottom features, and the experiments are run on each partition. The proposed work is also compared qualitatively and quantitatively with two state-of-the-art methods. The results suggest that the proposed framework performs better, with an average accuracy of 83%. The framework can be applied in a variety of domains to identify human emotion from limited text input.
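The vertical partitioning step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not list the 14 in-text features, the feature selection methods, or the cut-off between top and bottom features, so the feature names, the scores, and the `top_k` value below are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def partition_features(scores: Dict[str, float], top_k: int) -> Dict[str, List[str]]:
    """Split feature names into 'all', 'top', and 'bottom' groups by score."""
    # Rank features from highest to lowest score.
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {"all": ranked, "top": ranked[:top_k], "bottom": ranked[top_k:]}


def select_columns(rows: Sequence[Dict[str, float]], names: Sequence[str]) -> List[List[float]]:
    """Keep only the named feature columns (a vertical partition of the data)."""
    return [[row[name] for name in names] for row in rows]


# Hypothetical relevance scores for four made-up in-text features.
scores = {
    "exclamation_count": 0.42,
    "word_count": 0.05,
    "uppercase_ratio": 0.31,
    "emoticon_count": 0.27,
}
groups = partition_features(scores, top_k=2)

# Two made-up emails, already converted to feature values.
rows = [
    {"exclamation_count": 3, "word_count": 12, "uppercase_ratio": 0.5, "emoticon_count": 0},
    {"exclamation_count": 0, "word_count": 40, "uppercase_ratio": 0.02, "emoticon_count": 1},
]
top_matrix = select_columns(rows, groups["top"])
```

Each of the three column groups could then be passed to a classifier separately, which is one way to compare how much the highest-ranked and lowest-ranked features contribute.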

Keywords: Emotion recognition, Email analysis, Supervised learning, Sentiment analysis

Review history: Received 21 March 2020, Revised 19 August 2020, Accepted 1 September 2020, Available online 19 September 2020, Version of Record 21 September 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2020.106443