TY - JOUR
T1 - A non-parametric Bayesian joint model for latent individual molecular profiles and survival in oncology
AU - Rincourt, Sarah Laure
AU - Michiels, Stefan
AU - Drubay, Damien
N1 - Publisher Copyright: © 2022 The Author(s).
PY - 2022/10/1
Y1 - 2022/10/1
N2 - The development of prognostic molecular signatures that account for inter-patient heterogeneity is a key challenge for precision medicine. We propose a joint model of this heterogeneity and patient survival, assuming that tumor expression results from a mixture of a subset of independent signatures. We deconvolute the omics data using a non-parametric independent component analysis with a double sparseness structure for the source and the weight matrices, corresponding to the gene-component and individual-component associations, respectively. In a simulation study, our approach identified the correct number of components and reconstructed the sparseness of the weight (>0.85) and source (>0.75) matrices with high accuracy. The selection rate of components with high-to-moderate prognostic impacts was close to 95%, while components with weak impacts were selected with a frequency close to the observed false positive rate (<25%). When applied to the expression of 1063 genes from 614 breast cancer patients, our model identified 15 components, including six associated with patient survival, and related to three known prognostic pathways in early breast cancer (i.e. immune system, proliferation, and stromal invasion). The proposed algorithm provides new insight into the individual molecular heterogeneity associated with patient prognosis, helping to better understand complex tumor mechanisms.
KW - Independent component analysis
KW - prognosis
KW - survival
UR - http://www.scopus.com/inward/record.url?scp=85141939087&partnerID=8YFLogxK
U2 - 10.1142/S0219720022500226
DO - 10.1142/S0219720022500226
M3 - Article
C2 - 36287465
AN - SCOPUS:85141939087
SN - 0219-7200
VL - 20
JO - Journal of Bioinformatics and Computational Biology
JF - Journal of Bioinformatics and Computational Biology
IS - 5
M1 - 2250022
ER -