A comparison between different prediction models for invasive breast cancer occurrence in the French E3N cohort

Laureen Dartois, Émilien Gauthier, Julia Heitzmann, Laura Baglietto, Stefan Michiels, Sylvie Mesrine, Marie Christine Boutron-Ruault, Suzette Delaloge, Stéphane Ragusa, Françoise Clavel-Chapelon, Guy Fagherazzi

    Research output: Contribution to journal › Article › peer-review

    10 Citations (Scopus)

    Abstract

    Breast cancer remains a global health concern, and highly discriminating prediction models are lacking. The k-nearest-neighbor algorithm (kNN) offers an intuitive way to estimate individual risk. This study compares the performance of this approach with that of the Cox and Gail models for 5-year breast cancer risk prediction. The study included 64,995 women from the French E3N prospective cohort. The sample was divided into a learning series (N = 51,821), used to train the models with fivefold cross-validation, and a validation series (N = 13,174), used to evaluate them. The area under the receiver operating characteristic curve (AUC) and the ratio of expected to observed number of cases (E/O) were estimated. In the two series, 393 and 78 premenopausal and 537 and 98 postmenopausal breast cancers were diagnosed. The discrimination values of the best combinations of predictors obtained from cross-validation ranged from 0.59 to 0.60. In the validation series, the AUC values in premenopausal and postmenopausal women were 0.583 [0.520; 0.646] and 0.621 [0.563; 0.679] using the kNN and 0.565 [0.500; 0.631] and 0.617 [0.561; 0.673] using the Cox model. The corresponding E/O ratios were 1.26 and 1.28 in premenopausal women and 1.44 and 1.40 in postmenopausal women. The applied Gail model provided AUC values of 0.614 [0.554; 0.675] and 0.549 [0.495; 0.604] and E/O ratios of 0.78 and 1.12. This study shows that prediction performance differed according to menopausal status when using parametric statistical tools. The k-nearest-neighbor approach performed well, and discrimination was improved in postmenopausal women compared with the Gail model.
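
The three quantities named in the abstract (a kNN risk estimate, the AUC, and the E/O ratio) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the Euclidean distance, the unweighted neighbor proportion, and the toy numbers are all assumptions made purely for illustration, and none of the data below comes from the E3N cohort.

```python
import math
from typing import Sequence


def knn_risk(query: Sequence[float],
             features: Sequence[Sequence[float]],
             outcomes: Sequence[int],
             k: int) -> float:
    """Estimate risk as the fraction of cases among the k nearest neighbors.

    Assumption: plain Euclidean distance and equal neighbor weights;
    the abstract does not specify either choice.
    """
    if not 1 <= k <= len(features):
        raise ValueError("k must be between 1 and the number of subjects")
    # Sort subject indices by distance to the query point.
    order = sorted(range(len(features)),
                   key=lambda i: math.dist(query, features[i]))
    nearest = order[:k]
    return sum(outcomes[i] for i in nearest) / k


def auc(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Probability that a random case has a higher predicted risk than a
    random non-case, counting ties as one half (discrimination)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def expected_over_observed(risks: Sequence[float],
                           outcomes: Sequence[int]) -> float:
    """Sum of predicted risks divided by the observed number of cases
    (calibration). Values above 1 indicate overestimation."""
    observed = sum(outcomes)
    if observed == 0:
        raise ValueError("no observed cases")
    return sum(risks) / observed


# Hypothetical toy data: one feature per subject, 1 = case, 0 = non-case.
toy_features = [(0.0,), (1.0,), (2.0,), (10.0,)]
toy_outcomes = [0, 1, 1, 0]
toy_risk = knn_risk((1.2,), toy_features, toy_outcomes, k=3)

# Hypothetical predicted risks for four subjects and their outcomes.
toy_risks = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_risks, toy_labels)
toy_eo = expected_over_observed(toy_risks, toy_labels)
```

Under this reading, an AUC of 0.5 corresponds to no discrimination and an E/O ratio of 1 to perfect calibration on average, which is the scale on which the values reported above are interpreted.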

    Original language: English
    Pages (from-to): 415-426
    Number of pages: 12
    Journal: Breast Cancer Research and Treatment
    Volume: 150
    Issue number: 2
    Publication status: Published - 1 Apr 2015

    Keywords

    • Breast cancer
    • Calibration
    • Discrimination
    • Gail model
    • Menopausal status
    • Nearest-neighbor algorithm
    • Postmenopausal women
    • Premenopausal women
    • Proportional hazard Cox regression
    • Risk score
    • Women
