Taxonomy visualization in support of the semi-automatic validation and optimization of organizational schemas

Authors:

Abstract

Never before in history has mankind produced and had access to so much data, information, knowledge, and expertise as today. To organize, access, and manage these valuable assets effectively, we use taxonomies, classification hierarchies, ontologies, controlled vocabularies, and other approaches. We create directory structures for our files. We use organizational hierarchies to structure our work environment. However, the design and continuous update of these organizational schemas with potentially thousands of class nodes organizing millions of entities is challenging for any human being. The taxonomy visualization and validation (TV) tool introduced in this paper supports the semi-automatic validation and optimization of organizational schemas such as file directories, classification hierarchies, taxonomies, or other structures imposed on a data set for organization, access, and naming. By showing the “goodness of fit” for a schema and the potentially millions of entities it organizes, the TV tool eases the identification and reclassification of misclassified information entities, the identification of classes that grow too large, the evaluation of the size and homogeneity of existing classes, the examination of the “well-formedness” of an organizational schema, and more. As a demonstration, the TV tool is applied to display and examine the United States Patent and Trademark Office patent classification, which organizes more than three million patents into about 160,000 distinct patent classes. The paper concludes with a discussion and an outlook to future work.

Keywords: Patents, Taxonomy, Ontology, Classification hierarchy, Visualization

Article history: Received 27 July 2006, Revised 6 March 2007, Accepted 6 March 2007, Available online 7 May 2007.

Official paper URL: https://doi.org/10.1016/j.joi.2007.03.002