Evaluating local explanation methods on ground truth

Authors:

Abstract

Evaluating local explanation methods is a difficult task due to the lack of a shared and universally accepted definition of explanation. In the literature, one of the most common ways to assess the performance of an explanation method is to measure the fidelity of the explanation with respect to the classification of a black box model adopted by an Artificial Intelligence system for making a decision. However, this kind of evaluation only measures the degree to which the local explainer reproduces the behavior of the black box classifier with respect to the final decision. Therefore, the explanation provided by the local explainer could differ in content even though it leads to the same decision as the AI system. In this paper, we propose an approach that measures the extent to which the explanations returned by local explanation methods are correct with respect to a synthetic ground truth explanation. Indeed, the proposed methodology enables the generation of synthetic transparent classifiers for which the reason for the decision taken, i.e., a synthetic ground truth explanation, is available by design. Experimental results show how the proposed approach makes it easy to evaluate local explanations against the ground truth and to characterize the quality of local explanation methods.
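To make the evaluation idea concrete, the sketch below illustrates the general setting described in the abstract, not the paper's actual procedure: a synthetic transparent classifier (here assumed to be a sparse linear model) whose non-zero coefficients serve as the ground truth explanation by design, a simple perturbation-based local surrogate explainer, and an agreement measure (precision/recall of the selected features) between the two. The choice of a ridge surrogate and of precision/recall is purely illustrative.

```python
# Minimal sketch of evaluating a local explanation against a synthetic
# ground truth (illustrative only; not the paper's implementation).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_features = 5

# Synthetic transparent classifier: a sparse linear model. Its non-zero
# coefficients are, by construction, the ground-truth explanation.
true_coef = np.array([2.0, -3.0, 0.0, 0.0, 1.5])

def black_box(X):
    return (X @ true_coef > 0).astype(int)

ground_truth_features = set(np.flatnonzero(true_coef))

# Instance to explain and a perturbation-based local surrogate explainer.
x = rng.normal(size=n_features)
Z = x + rng.normal(scale=0.3, size=(500, n_features))  # local neighborhood
surrogate = Ridge(alpha=1.0).fit(Z, black_box(Z))

# Local explanation: top-k features by absolute surrogate weight.
k = int((true_coef != 0).sum())
explained_features = set(np.argsort(-np.abs(surrogate.coef_))[:k])

# Compare the explanation with the ground truth via precision/recall.
tp = len(explained_features & ground_truth_features)
precision = tp / len(explained_features)
recall = tp / len(ground_truth_features)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this setup, a local explainer is judged not only on whether it mimics the black box decision, but on whether the features it highlights match those that actually drive the transparent classifier.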

Keywords: Evaluating explanations, Explainable AI, Interpretable models, Open the black box, Local explanation

Article history: Received 22 November 2019, Revised 26 October 2020, Accepted 7 November 2020, Available online 14 November 2020, Version of Record 18 November 2020.

DOI: https://doi.org/10.1016/j.artint.2020.103428