On the notions of ambiguity and information loss

Abstract

One of the fundamental problems in information science is to distinguish objects (such as books or journal articles) on the basis of associated values (such as authors and titles). Where the values fail to distinguish two distinct objects, we say that the objects are ambiguous under the given value assignment. To obtain a measure of ambiguity, it is only necessary to count, for each set of ambiguous objects, the number of ways the objects can be arranged, multiply these counts, and take the logarithm. It is shown that this approach leads to a measure in the formal sense, and that the measure depends only on the definition of equality of values, so that it extends simply to sets of values and ordered sets of values. It is also shown that it is possible to construct a function of ambiguity that one can call “information”, and that the information loss that occurs when distinct values are grouped into equivalence classes (as in the use of search and sort keys) is also a measure. Finally, it is shown that ambiguity and information as defined here are directly related to Shannon's definition of “information”, thus tying this approach to the portion of information theory associated with the derivation of optimal distributions frequently used in information science models.
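
The counting recipe in the abstract can be illustrated with a minimal sketch. It assumes that "the number of ways the objects can be arranged" means the factorial of the size of each group of objects sharing a value, so that the ambiguity is the sum of log(n_i!) over the groups. The function names `ambiguity` and `ambiguity_increase`, the choice of base-2 logarithms, and the sample data are illustrative assumptions, not taken from the paper. The second function only shows how ambiguity grows when values are coarsened by a key; it is not claimed to be the paper's exact definition of information loss.

```python
import math
from collections import Counter


def ambiguity(values, base=2.0):
    """Log of the product of n_i! over groups of objects with equal values.

    Assumption: each set of n_i ambiguous objects can be arranged in n_i! ways.
    math.lgamma(n + 1) equals ln(n!), which avoids computing huge factorials.
    """
    counts = Counter(values)
    return sum(math.lgamma(n + 1) for n in counts.values()) / math.log(base)


def ambiguity_increase(values, key, base=2.0):
    """Extra ambiguity introduced by replacing each value with key(value).

    Hypothetical helper: 'key' maps distinct values into equivalence
    classes, as a search or sort key would.
    """
    coarse = [key(v) for v in values]
    return ambiguity(coarse, base) - ambiguity(values, base)


# Six objects whose values form groups of size 2, 1 and 3:
# 2! * 1! * 3! = 12 arrangements, so the ambiguity is log2(12).
sample = ["a", "a", "b", "c", "c", "c"]
sample_ambiguity = ambiguity(sample)

# Three distinct names collapse to two classes under a 5-character key.
names = ["Smith J", "Smith A", "Jones B"]
key_increase = ambiguity_increase(names, key=lambda s: s[:5])
```

When every object has a distinct value, each group has size one and the ambiguity is zero; the truncating key above merges two of the three names into one class, which raises the ambiguity from zero to log2(2!) = 1 bit.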

Article history: Available online 15 July 2002.

Official paper URL: https://doi.org/10.1016/0306-4573(77)90032-2