MULTITASK FEATURE SELECTION WITH TASK DESCRIPTORS

Victor Bellon, Veronique Stoven, Chloe-Agathe Azencott


MINES ParisTech, PSL-Research University, CBIO-Centre for Computational Biology
Email: victor.bellon@mines-paristech.fr

Pacific Symposium on Biocomputing 21:261-272(2016)

© 2016 World Scientific
Open Access chapter published by World Scientific Publishing Company and distributed under the terms of the Creative Commons Attribution (CC BY) 4.0 License.


Abstract

Machine learning applications in precision medicine are severely limited by the scarcity of data to learn from. Indeed, training data often contains many more features than samples. To alleviate the resulting statistical issues, the multitask learning framework proposes to learn different but related tasks jointly, rather than independently, by sharing information between these tasks. Within this framework, the joint regularization of model parameters yields models that have few non-zero coefficients and share similar sparsity patterns. We propose a new regularized multitask approach that incorporates task descriptors, hence modulating the amount of information shared between tasks according to their similarity. We show on simulated data that this method outperforms other multitask feature selection approaches, particularly in the case of scarce data. In addition, we demonstrate on peptide MHC-I binding data the ability of the proposed approach to make predictions for new tasks for which no training data is available.
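
As an illustration of how joint regularization can produce shared sparsity patterns, the sketch below applies row-wise group soft-thresholding, the proximal operator of an L2,1 penalty, to a small coefficient matrix. This is a generic multitask technique chosen for illustration and is not necessarily the formulation proposed in this paper; the function name `group_soft_threshold`, the matrix `W`, and the threshold `lam` are assumptions made for the example.

```python
import math

def group_soft_threshold(W, lam):
    """Row-wise proximal operator of lam * ||W||_{2,1}.

    W is a list of rows: one row per feature, one column per task.
    (Hypothetical helper, written for illustration only.)
    """
    out = []
    for row in W:
        norm = math.sqrt(sum(w * w for w in row))
        # A row whose norm is at most lam is zeroed for every task at once;
        # otherwise the whole row is shrunk by a common factor.
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out.append([scale * w for w in row])
    return out

# Toy coefficients for 3 features and 2 tasks (made-up numbers).
W = [[3.0, 4.0],    # feature 0: row norm 5
     [0.3, -0.4],   # feature 1: row norm 0.5
     [0.0, 2.0]]    # feature 2: row norm 2
W_shrunk = group_soft_threshold(W, lam=1.0)

# A feature counts as selected if any task keeps a non-zero coefficient.
selected = [j for j, row in enumerate(W_shrunk) if any(w != 0.0 for w in row)]
```

Because the penalty acts on whole rows, feature 1 is discarded for both tasks simultaneously while features 0 and 2 are kept, which is the sense in which the tasks share a sparsity pattern.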

