An empirical evaluation of the comprehensibility of decision table, tree and rule based predictive models

Abstract

An important objective of data mining is the development of predictive models. Based on a number of observations, a model is constructed that allows analysts to provide classifications or predictions for new observations. Currently, most research focuses on improving the accuracy or precision of these models, and comparatively little research has been undertaken to increase their comprehensibility to the analyst or end-user. This is mainly due to the subjective nature of ‘comprehensibility’, which depends on many factors outside the model, such as the user's experience and prior knowledge. Despite this influence of the observer, some representation formats are generally considered to be more easily interpretable than others. This paper presents an empirical study that investigates the suitability of a number of alternative representation formats for classification when interpretability is a key requirement. The formats under consideration are decision tables, (binary) decision trees, propositional rules, and oblique rules. An end-user experiment was designed to test the accuracy, response time, and answer confidence for a set of problem-solving tasks involving these representations. Analysis of the results reveals that decision tables perform significantly better on all three criteria, while post-test voting also reveals a clear preference of users for decision tables in terms of ease of use.

Keywords: Data mining, Classification, Knowledge representation, Comprehensibility, Decision tables

Article history: Received 26 November 2008, Revised 29 October 2010, Accepted 5 December 2010, Available online 10 December 2010.

Official article URL: https://doi.org/10.1016/j.dss.2010.12.003