Anaphora in natural language processing and information retrieval

Authors:

Abstract

Anaphora is the discourse-level linguistic phenomenon of abbreviated subsequent reference, with pronouns being the most commonly used anaphors. Because anaphora plays an essential role in human processors' production and understanding of texts, its appropriate recognition and resolution are essential to information retrieval systems that manipulate natural language texts. The approaches to anaphora undertaken in theoretical linguistics and NLP are surveyed, and the results of research on anaphora as it affects information processes are presented, with specific attention to the detailed studies conducted at Syracuse University. These studies have provided essential baseline data on the extent to which anaphors occur, their likelihood of referring to concepts integral to the topic, their effect on a variety of term-weighting schemes, and their impact on retrieval results. Although the most effective means of processing anaphora may not yet have been determined, it is suggested that improved retrieval systems will need to represent the full meaning of natural language documents, including anaphoric references as well as all other discourse-level linguistic phenomena.

Keywords:

Review history: Received 7 February 1989, Accepted 13 July 1989, Available online 16 July 2002.

Paper URL: https://doi.org/10.1016/0306-4573(90)90008-P