A methodology for simplification and interpretation of backpropagation-based neural network models

Abstract

A new methodology for building inductive expert systems known as neural networks has emerged as one of the most promising applications of artificial intelligence in the 1990s. The primary advantages of a neural network approach for modeling expert decision processes are: (1) the ability of the network to learn from examples of experts' decisions, which avoids the costly, time-consuming, and error-prone task of trying to extract knowledge of a problem domain directly from an expert, and (2) the ability of the network to handle the noisy, incomplete, and distorted data that are typically found in decision making under conditions of uncertainty. Unfortunately, a major limitation of neural network-based models has been the opacity of the inference process. Unlike conventional expert system decision support tools, neural networks generally give decision makers no way to understand the basis of the network's decisions. This problem often makes such systems undesirable for decision support applications. A new methodology is presented that allows the development of highly simplified backpropagation neural network models. This methodology simplifies networks by eliminating input variables that are not contributing to the network's ability to produce accurate predictions. Elimination of unnecessary input variables directly reduces the number of network parameters that must be estimated and, consequently, the complexity of the network structure. A primary benefit of this development methodology is that it is based on a variable importance measure, which addresses the problem of producing an interpretation of a neural network's functioning. Decision makers may easily understand the resulting networks in terms of the proportional contribution each input variable makes to the production of accurate predictions. Furthermore, in actual application the accuracy of these simplified models should be comparable to or better than that of the more complex models developed with the standard approach.
This new methodology is demonstrated on two classification problems based on sets of actual data.
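
The abstract does not state which variable importance measure is used, so the following is only a minimal sketch under stated assumptions. It assumes a network with a single hidden layer and a single output, and it uses a weight-partitioning scheme in the style of Garson as a stand-in for the measure. The function names, the example weights, and the 0.05 cutoff are hypothetical and chosen purely for illustration.

```python
def proportional_importance(w_in_hidden, w_hidden_out):
    """Return each input's proportional contribution (scores sum to 1).

    w_in_hidden[i][j] is the weight from input i to hidden unit j.
    w_hidden_out[j] is the weight from hidden unit j to the single output.
    Assumption: a Garson-style partitioning of absolute weights, not
    necessarily the measure used in the paper.
    """
    n_inputs = len(w_in_hidden)
    n_hidden = len(w_hidden_out)
    raw = [0.0] * n_inputs
    for j in range(n_hidden):
        # Total absolute weight entering hidden unit j.
        col_total = sum(abs(w_in_hidden[i][j]) for i in range(n_inputs))
        if col_total == 0.0:
            continue  # a hidden unit with no incoming weight contributes nothing
        for i in range(n_inputs):
            share = abs(w_in_hidden[i][j]) / col_total
            # Weight the input's share by the hidden unit's pull on the output.
            raw[i] += share * abs(w_hidden_out[j])
    total = sum(raw)
    if total == 0.0:
        return [0.0] * n_inputs
    return [r / total for r in raw]


def select_inputs(importance, cutoff):
    """Indices of inputs whose proportional contribution meets the cutoff."""
    return [i for i, score in enumerate(importance) if score >= cutoff]


# Hypothetical weights: 3 inputs, 2 hidden units, 1 output.
example_w_in_hidden = [
    [2.0, -1.0],
    [0.1, 0.05],
    [-1.9, 0.95],
]
example_w_hidden_out = [1.5, -0.5]

scores = proportional_importance(example_w_in_hidden, example_w_hidden_out)
kept = select_inputs(scores, cutoff=0.05)  # hypothetical cutoff
print(scores, kept)
```

In this sketch the second input receives a very small share and would be a candidate for elimination. A simplified network would then be retrained on the remaining inputs and its accuracy compared with that of the full model.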