Information extraction and text summarization using linguistic knowledge acquisition

Abstract

Storing and accessing texts in a conceptual format has a number of advantages over traditional document retrieval methods. A conceptual format facilitates natural language access to text information. It can support imprecise and inexact queries, conceptual information summarization, and, ultimately, document translation.

The lack of extensive linguistic coverage is the major barrier to extracting useful information from large bodies of text. Current natural language processing (NLP) systems do not have rich enough lexicons to cover all the important words and phrases in extended texts. Two methods of overcoming this limitation are (1) to apply a text processing strategy that is tolerant of unknown words and gaps in linguistic knowledge, and (2) to acquire lexical information automatically from the texts.

These two methods have been implemented in a prototype intelligent information retrieval system called SCISOR (System for Conceptual Information Summarization, Organization and Retrieval). This article describes the text processing, language acquisition, and summarization components of SCISOR.

Review history: Received 8 June 1988, Accepted 21 October 1988, Available online 19 July 2002.

Official paper URL: https://doi.org/10.1016/0306-4573(89)90069-1