TY - JOUR
T1 - Prediction of breast cancer treatment–induced fatigue by machine learning using genome-wide association data
AU - Lee, Sangkyu
AU - Deasy, Joseph O.
AU - Oh, Jung Hun
AU - Di Meglio, Antonio
AU - Dumas, Agnes
AU - Menvielle, Gwenn
AU - Charles, Cecile
AU - Boyault, Sandrine
AU - Rousseau, Marina
AU - Besse, Celine
AU - Thomas, Emilie
AU - Boland, Anne
AU - Cottu, Paul
AU - Tredan, Olivier
AU - Levy, Christelle
AU - Martin, Anne Laure
AU - Everhard, Sibille
AU - Ganz, Patricia A.
AU - Partridge, Ann H.
AU - Michiels, Stefan
AU - Deleuze, Jean François
AU - Andre, Fabrice
AU - Vaz-Luis, Ines
N1 - Publisher Copyright: © The Author(s) 2020. Published by Oxford University Press. All rights reserved.
PY - 2020/1/1
Y1 - 2020/1/1
N2 - Background: We aimed at predicting fatigue after breast cancer treatment using machine learning on clinical covariates and germline genome-wide data. Methods: We accessed germline genome-wide data of 2799 early-stage breast cancer patients from the Cancer Toxicity study (NCT01993498). The primary endpoint was defined as scoring zero at diagnosis and higher than quartile 3 at 1 year after primary treatment completion on European Organization for Research and Treatment of Cancer quality-of-life questionnaires for Overall Fatigue and on the multidimensional questionnaire for Physical, Emotional, and Cognitive fatigue. First, we tested univariate associations of each endpoint with clinical variables and genome-wide variants. Then, using preselected clinical (false discovery rate < 0.05) and genomic (P < .001) variables, a multivariable preconditioned random-forest regression model was built and validated on a hold-out subset to predict fatigue. Gene set enrichment analysis identified key biological correlates (MetaCore). All statistical tests were 2-sided. Results: Statistically significant clinical associations were found only with Emotional and Cognitive Fatigue, including receipt of chemotherapy, anxiety, and pain. Some single nucleotide polymorphisms had some degree of association (P < .001) with the different fatigue endpoints, although there were no genome-wide statistically significant (P < 5.00 × 10^-8) associations. Only for Cognitive Fatigue, the predictive ability of the genomic multivariable model was statistically significantly better than random (area under the curve = 0.59, P = .01) and marginally improved with clinical variables (area under the curve = 0.60, P = .005). Single nucleotide polymorphisms found to be associated (P < .001) with Cognitive Fatigue belonged to genes linked to inflammation (false discovery rate adjusted P = .03), cognitive disorders (P = 1.51 × 10^-12), and synaptic transmission (P = 6.28 × 10^-8). Conclusions: Genomic analyses in this large cohort of breast cancer survivors suggest a possible genetic role for severe Cognitive Fatigue that warrants further exploration.
AB - Background: We aimed at predicting fatigue after breast cancer treatment using machine learning on clinical covariates and germline genome-wide data. Methods: We accessed germline genome-wide data of 2799 early-stage breast cancer patients from the Cancer Toxicity study (NCT01993498). The primary endpoint was defined as scoring zero at diagnosis and higher than quartile 3 at 1 year after primary treatment completion on European Organization for Research and Treatment of Cancer quality-of-life questionnaires for Overall Fatigue and on the multidimensional questionnaire for Physical, Emotional, and Cognitive fatigue. First, we tested univariate associations of each endpoint with clinical variables and genome-wide variants. Then, using preselected clinical (false discovery rate < 0.05) and genomic (P < .001) variables, a multivariable preconditioned random-forest regression model was built and validated on a hold-out subset to predict fatigue. Gene set enrichment analysis identified key biological correlates (MetaCore). All statistical tests were 2-sided. Results: Statistically significant clinical associations were found only with Emotional and Cognitive Fatigue, including receipt of chemotherapy, anxiety, and pain. Some single nucleotide polymorphisms had some degree of association (P < .001) with the different fatigue endpoints, although there were no genome-wide statistically significant (P < 5.00 × 10^-8) associations. Only for Cognitive Fatigue, the predictive ability of the genomic multivariable model was statistically significantly better than random (area under the curve = 0.59, P = .01) and marginally improved with clinical variables (area under the curve = 0.60, P = .005). Single nucleotide polymorphisms found to be associated (P < .001) with Cognitive Fatigue belonged to genes linked to inflammation (false discovery rate adjusted P = .03), cognitive disorders (P = 1.51 × 10^-12), and synaptic transmission (P = 6.28 × 10^-8). Conclusions: Genomic analyses in this large cohort of breast cancer survivors suggest a possible genetic role for severe Cognitive Fatigue that warrants further exploration.
UR - http://www.scopus.com/inward/record.url?scp=85101394371&partnerID=8YFLogxK
U2 - 10.1093/JNCICS/PKAA039
DO - 10.1093/JNCICS/PKAA039
M3 - Article
AN - SCOPUS:85101394371
SN - 2515-5091
VL - 4
JO - JNCI Cancer Spectrum
JF - JNCI Cancer Spectrum
IS - 5
M1 - PKAA039
ER -