Rule extraction with guarantees from regression models

Authors:

Highlights:

• Almost all studies about rule extraction investigate classification. In this paper, we study rule extraction from opaque predictive regression models.

• Today, all black-box rule extraction methods suffer from potentially low fidelity on test data. By utilizing conformal prediction in a novel way, the fidelity can be guaranteed, thus solving the main problem with black-box rule extraction.

• Another problem with rule extraction for regression is the choice of representation language: a standard regression tree with point predictions in the leaves is typically too weak and conveys very little information, while more complex alternatives such as model trees are not truly comprehensible.

• We suggest a new representation language for the extracted models: standard regression trees augmented with valid and sharp prediction intervals in the leaves.

• In the extensive empirical investigation, the validity of the extracted models is demonstrated.

• In addition, it is shown how normalization can be used to provide individualized prediction intervals, thus providing highly informative extracted models.
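
The idea of leaf-level prediction intervals with a fidelity guarantee can be illustrated with a minimal split-conformal sketch. It is not the authors' implementation: `opaque_model`, `tree_predict`, and `difficulty` are hypothetical stand-ins, the extracted tree is hard-coded rather than learned, and the calibration targets are assumed to be the opaque model's own predictions, so that coverage is measured against the black box (fidelity). The standard variant gives every instance the same interval width, while the normalized variant scales the width by a per-instance difficulty estimate.

```python
import math
import random


def opaque_model(x):
    """Hypothetical stand-in for the black-box regressor."""
    return math.sin(3 * x) + 2 * x


def tree_predict(x):
    """Hypothetical extracted tree: one split, one point prediction per leaf."""
    return 1.1 if x < 0.5 else 2.2


def difficulty(x):
    """Hypothetical positive difficulty estimate used for normalization."""
    return 0.1 + abs(x - 0.5)


def conformal_quantile(scores, epsilon):
    """Return the ceil((n + 1) * (1 - epsilon))-th smallest calibration score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - epsilon))
    if rank > n:
        return math.inf  # too few calibration points for this confidence level
    return sorted(scores)[rank - 1]


random.seed(0)
epsilon = 0.1  # target error rate, i.e. 90% intervals
calib = [random.random() for _ in range(500)]
test = [random.random() for _ in range(2000)]

# Nonconformity scores: absolute error of the tree relative to the opaque model.
q_std = conformal_quantile(
    [abs(opaque_model(x) - tree_predict(x)) for x in calib], epsilon
)
# Normalized scores: the same error divided by the difficulty estimate.
q_norm = conformal_quantile(
    [abs(opaque_model(x) - tree_predict(x)) / difficulty(x) for x in calib],
    epsilon,
)


def interval_std(x):
    """Same half-width for every instance."""
    p = tree_predict(x)
    return p - q_std, p + q_std


def interval_norm(x):
    """Half-width scaled per instance by the difficulty estimate."""
    p = tree_predict(x)
    half = q_norm * difficulty(x)
    return p - half, p + half


def coverage(interval_fn, xs):
    """Fraction of instances whose opaque-model output falls in the interval."""
    hits = 0
    for x in xs:
        lo, hi = interval_fn(x)
        hits += lo <= opaque_model(x) <= hi
    return hits / len(xs)


print("standard coverage:", coverage(interval_std, test))
print("normalized coverage:", coverage(interval_norm, test))
```

Under exchangeability of calibration and test data, both variants should cover the opaque model's output on roughly a 1 − ε fraction of test instances; the normalized intervals differ in width from instance to instance, which is what makes them individualized.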


Keywords: Rule extraction, Interpretability, Conformal prediction, Explainable AI, Predictive regression

Review history: Received 17 September 2020, Revised 4 May 2021, Accepted 3 June 2021, Available online 2 February 2022, Version of Record 15 February 2022.

Paper URL: https://doi.org/10.1016/j.patcog.2022.108554