First order random forests: Learning relational classifiers with complex aggregates

作者：Anneleen Van Assche, Celine Vens, Hendrik Blockeel, Sašo Džeroski

摘要

In relational learning, predictions for an individual are based not only on its own properties but also on the properties of a set of related individuals. Relational classifiers differ with respect to how they handle these sets: some use properties of the set as a whole (using aggregation), some refer to properties of specific individuals of the set, however, most classifiers do not combine both. This imposes an undesirable bias on these learners. This article describes a learning approach that avoids this bias, using first order random forests. Essentially, an ensemble of decision trees is constructed in which tests are first order logic queries. These queries may contain aggregate functions, the argument of which may again be a first order logic query. The introduction of aggregate functions in first order logic, as well as upgrading the forest’s uniform feature sampling procedure to the space of first order logic, generates a number of complications. We address these and propose a solution for them. The resulting first order random forest induction algorithm has been implemented and integrated in the ACE-ilProlog system, and experimentally evaluated on a variety of datasets. The results indicate that first order random forests with complex aggregates are an efficient and effective approach towards learning relational classifiers that involve aggregates over complex selections.

论文关键词：Relational learning, Random forests, Aggregation, Decision tree learning

论文评审过程：

论文官网地址：https://doi.org/10.1007/s10994-006-8713-9