Multi-Relational Learning, Text Mining, and Semi-Supervised Learning for Functional Genomics

Authors: Mark-A. Krogel, Tobias Scheffer

Abstract

We focus on the problem of predicting functional properties of the proteins corresponding to genes in the yeast genome. Our goal is to study the effectiveness of approaches that utilize all data sources available in this problem setting, including relational data, abstracts of research papers, and unlabeled data. We investigate a propositionalization approach that uses relational gene interaction data. We study the benefit of text classification and information extraction for utilizing a collection of scientific abstracts. We study transduction and co-training for using unlabeled data. We report both positive and negative results for the investigated approaches. The studied tasks are KDD Cup tasks of 2001 and 2002. The solutions that we describe achieved the highest score for task 2 in 2001, the fourth rank for task 3 in 2001, and, for task 2 in 2002, the highest score for one of the two subtasks and third place overall.

Keywords: propositionalization, information extraction, co-training

Official paper URL: https://doi.org/10.1023/B:MACH.0000035472.73496.0c