Incorporating geometry knowledge into an incremental learning structure for few-shot intent recognition

Authors:

Highlights:

Abstract

Few-shot incremental intent recognition aims to continually identify users’ intents from utterances given only limited labeled novel data. To mitigate the catastrophic forgetting inherent in incremental learning, existing methods generally store informative base exemplars as model memory, which are then replayed when learning novel classes. However, merely preserving these base exemplars while ignoring the relationships between them raises two problems during incremental training. First, in each iteration, the novel samples update the embedding space so that the novel labels become well separated, which in turn shifts the embeddings of previously learned labels and reduces their separability. Second, the risk of overfitting is amplified in incremental learning, causing the model to be overconfident on seen base classes while generalizing poorly to unseen novel classes. To address these problems, a geometry-aware learning (GAL) model is proposed for few-shot incremental intent recognition, bridging the two research gaps mentioned above. Specifically, for the embedding-shift issue, GAL constructs a geometric structure over the selected exemplars based on their spatial distribution in the embedding space and treats it as a strong constraint during subsequent training. For the overfitting issue, GAL introduces an episodic pretraining strategy together with a multisource contrastive loss to compensate for the scarcity of supervised signals when classifying data-scarce novel labels. Experimental results on the public dataset OOS (CLINC-150) verify the effectiveness of our proposal against state-of-the-art baselines. Specifically, our model outperforms the best baseline by 1.34% and 0.39% on the 5-way 1-shot meta task for the base and novel classes, respectively, and by 1.89% and 0.80% on the 5-way 5-shot meta task for the base and novel classes, respectively. Furthermore, cross-domain experiments between two datasets (CLINC-150 and ATIS) show that GAL generalizes better across domains. The method is superior to traditional intent recognition models in application scenarios where both new and previously trained categories must be distinguished accurately under low-data conditions. This common scenario also points to a promising direction for dialog recommender systems, enabling them to recommend continuously and accurately with only a few labeled samples.
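As a rough illustration of the two ingredients described in the abstract, the sketch below pairs a geometry-preserving constraint on stored exemplar embeddings with a standard supervised contrastive loss. The paper's exact formulation is not given here, so the function names, the hyperparameters (`temperature`, the weights `lambda_geo` and `lambda_con`), and the choice of a pairwise Euclidean distance matrix as the "geometric structure" are assumptions for illustration only, not the authors' method.

```python
# Hypothetical sketch (assumptions, not the paper's exact losses): preserve the
# pairwise-distance geometry of stored base exemplars while fine-tuning on a few
# novel samples, and add a supervised contrastive term for extra supervision.
import torch
import torch.nn.functional as F


def pairwise_distances(embeddings: torch.Tensor) -> torch.Tensor:
    """Euclidean distance matrix between all exemplar embeddings."""
    return torch.cdist(embeddings, embeddings, p=2)


def geometry_constraint_loss(current_emb: torch.Tensor,
                             reference_dist: torch.Tensor) -> torch.Tensor:
    """Penalize drift of the exemplars' spatial structure away from the
    distance matrix snapshotted before the incremental step."""
    return F.mse_loss(pairwise_distances(current_emb), reference_dist)


def supervised_contrastive_loss(features: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    """A standard supervised contrastive loss over L2-normalized features,
    standing in for the paper's multisource contrastive objective."""
    features = F.normalize(features, dim=1)
    logits = features @ features.T / temperature
    n = features.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=features.device)
    logits = logits.masked_fill(self_mask, float("-inf"))  # drop self-similarity
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)  # avoid -inf * 0 = NaN
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_counts
    return loss[pos_mask.any(dim=1)].mean()


# Illustrative incremental step: total loss = classification + geometry + contrastive.
# reference_dist = pairwise_distances(encoder(base_exemplars)).detach()  # frozen snapshot
# loss = ce_loss \
#     + lambda_geo * geometry_constraint_loss(encoder(base_exemplars), reference_dist) \
#     + lambda_con * supervised_contrastive_loss(encoder(batch), batch_labels)
```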

Keywords: Few-shot learning, Incremental learning, Intent recognition

Article history: Received 22 March 2022, Revised 15 June 2022, Accepted 16 June 2022, Available online 21 June 2022, Version of Record 1 July 2022.

DOI: https://doi.org/10.1016/j.knosys.2022.109296