Discovering simple rules in complex data: A meta-learning algorithm and some surprising musical discoveries

Authors:

Abstract

This article presents a new rule discovery algorithm named PLCG that can find simple, robust partial rule models (sets of classification rules) in complex data where it is difficult or impossible to find models that completely account for all the phenomena of interest. Technically speaking, PLCG is an ensemble learning method that learns multiple models via some standard rule learning algorithm, and then combines these into one final rule set via clustering, generalization, and heuristic rule selection. The algorithm was developed in the context of an interdisciplinary research project that aims at discovering fundamental principles of expressive music performance from large amounts of complex real-world data (specifically, measurements of actual performances by concert pianists). It will be shown that PLCG succeeds in finding some surprisingly simple and robust performance principles, some of which represent truly novel and musically meaningful discoveries. A set of more systematic experiments shows that PLCG usually discovers significantly simpler theories than more direct approaches to rule learning (including the state-of-the-art learning algorithm Ripper), while striking a compromise between coverage and precision. The experiments also show how easy it is to use PLCG as a meta-learning strategy to explore different parts of the space of rule models.
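
The abstract describes the PLCG pipeline only in outline: learn several rule models, then merge them by clustering, generalization, and heuristic rule selection. The following is a minimal Python sketch of that outline, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the rule representation, the toy base learner `learn_rules`, Jaccard similarity for clustering, set intersection as the generalization step, the Laplace-corrected precision threshold used for selection, and the name `plcg_sketch`.

```python
from itertools import combinations

# A rule is a frozenset of (attribute, value) tests; it predicts the positive
# class for every example that passes all tests. This representation is assumed.

def covers(rule, x):
    return all(x.get(a) == v for a, v in rule)

def laplace(rule, data):
    """Laplace-corrected precision of a rule on (example, label) pairs."""
    tp = sum(1 for x, y in data if y and covers(rule, x))
    fp = sum(1 for x, y in data if not y and covers(rule, x))
    return (tp + 1) / (tp + fp + 2)

def learn_rules(data):
    """Toy sequential covering; a stand-in for a real base rule learner."""
    rules, remaining = [], list(data)
    while any(y for _, y in remaining):
        rule = frozenset()
        # Specialize until no negative example is covered or no test is left.
        while any(not y and covers(rule, x) for x, y in remaining):
            tests = {t for x, y in remaining if y and covers(rule, x)
                     for t in x.items()} - rule
            if not tests:
                break
            rule |= {max(sorted(tests), key=lambda t: laplace(rule | {t}, remaining))}
        rules.append(rule)
        remaining = [(x, y) for x, y in remaining if not covers(rule, x)]
    return rules

def generalize(rules):
    """Agglomerative clustering: repeatedly merge the two most similar rules
    (Jaccard similarity, assumed) into their shared tests, keeping every
    intermediate generalization as a candidate."""
    pool, active = set(rules), set(rules)
    while len(active) > 1:
        a, b = max(combinations(sorted(active, key=sorted), 2),
                   key=lambda p: len(p[0] & p[1]) / len(p[0] | p[1]))
        merged = a & b
        active -= {a, b}
        if merged:  # an empty intersection would cover everything; drop it
            active.add(merged)
            pool.add(merged)
    return pool

def select(pool, data, min_precision=0.7):
    """Greedy selection: take precise rules that cover new positive examples."""
    chosen, covered = [], set()
    ranked = sorted(pool, key=lambda r: (-laplace(r, data), len(r), sorted(r)))
    for rule in ranked:
        new = {i for i, (x, y) in enumerate(data) if y and covers(rule, x)} - covered
        if new and laplace(rule, data) >= min_precision:
            chosen.append(rule)
            covered |= new
    return chosen

def plcg_sketch(data, n_subsets=3, min_precision=0.7):
    """Partition, learn per subset, cluster and generalize, then select."""
    subsets = [data[i::n_subsets] for i in range(n_subsets)]
    rules = {r for s in subsets for r in learn_rules(s)}
    return select(generalize(rules), data, min_precision)

# Hypothetical toy data: an example is positive exactly when shape == "round".
data = [({"shape": s, "color": c}, s == "round")
        for s in ("round", "square") for c in ("red", "blue", "green")] * 2
for rule in plcg_sketch(data):
    print(sorted(rule), round(laplace(rule, data), 3))
```

In this sketch, raising `min_precision` yields fewer, more precise rules and lowering it yields broader coverage. That is one simple way to picture the compromise between coverage and precision mentioned in the abstract, although the actual selection heuristic used by PLCG is not specified here.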

Keywords: Machine learning, Data mining, Rule discovery, Ensemble methods, Meta-learning, Partial models, Expressive music performance

Article history: Received 22 November 2001, Revised 8 November 2002, Available online 26 February 2003.

Official paper URL: https://doi.org/10.1016/S0004-3702(03)00016-X