Zero-shot fine-grained entity typing in information security based on ontology
Authors:
Abstract
The field of information security suffers from a lack of labelled entities. To address this issue, this study proposes a zero-shot hybrid approach that combines a clustering algorithm with a method for representing category labels, and uses it to perform fine-grained entity typing based on the unified cybersecurity ontology (UCO). However, certain category labels in UCO do not have distinct domain features, and the embeddings of certain abbreviations cannot be obtained directly from Word2vec. We therefore propose a new method, referred to as mixed entities and hierarchy of UCO (MEHC), to represent the category labels. To further improve the performance of fine-grained entity typing, we also propose the triClustering algorithm, which re-clusters coarse-grained classification results or determines the corresponding types for new entities. It is based on the triangle inequality: the sum of any two sides of a triangle is greater than the third. The experimental results show that the triClustering algorithm effectively shortens the computation time and that the proposed hybrid method outperforms the other baselines in information security applications.
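The abstract does not spell out how triClustering uses the triangle inequality, so the following is only a generic sketch of how that inequality can shorten nearest-centroid assignment; it is not the authors' algorithm. It assumes Euclidean distance and uses hypothetical names (`euclidean`, `assign_with_pruning`). The idea: if the distance between the current best centroid and a candidate centroid is at least twice the distance from the point to the current best, the candidate cannot be closer, so its distance need not be computed.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_with_pruning(points, centroids):
    """Assign each point to its nearest centroid.

    Returns (labels, skipped), where `skipped` counts the point-to-centroid
    distance computations avoided via the triangle inequality.
    Illustrative only: not the triClustering algorithm from the paper.
    """
    k = len(centroids)
    # Centroid-to-centroid distances are computed once and reused.
    cc = [[euclidean(centroids[i], centroids[j]) for j in range(k)]
          for i in range(k)]
    labels = []
    skipped = 0
    for p in points:
        best = 0
        best_d = euclidean(p, centroids[0])
        for j in range(1, k):
            # Triangle inequality: d(p, c_j) >= d(c_best, c_j) - d(p, c_best).
            # If d(c_best, c_j) >= 2 * d(p, c_best), then
            # d(p, c_j) >= d(p, c_best), so c_j cannot win.
            if cc[best][j] >= 2 * best_d:
                skipped += 1
                continue
            d = euclidean(p, centroids[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped
```

Calling `assign_with_pruning(points, centroids)` with lists of coordinate tuples returns the same labels as an exhaustive search, while the `skipped` counter indicates how many distance evaluations the pruning test avoided. In a zero-shot typing setting, the points would presumably be entity embeddings and the centroids category-label representations, but that mapping is an assumption here.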
Keywords: Fine-grained entity typing, Clustering algorithm, Representation method for categories, Information security, Unified cybersecurity ontology
Article history: Received 1 April 2021, Revised 6 September 2021, Accepted 7 September 2021, Available online 15 September 2021, Version of Record 24 September 2021.
Official URL: https://doi.org/10.1016/j.knosys.2021.107472