TY - JOUR
T1 - Handling missing covariates in observational studies
T2 - an illustration with the assessment of prognostic factors of survival outcomes in soft-tissue or visceral sarcomas in irradiated fields (SIF)
AU - Huchet, Noémie
AU - Penel, Nicolas
AU - Bonvalot, Sylvie
AU - Thariat, Juliette
AU - Ducimetière, Françoise
AU - Giraud, Antoine
AU - Toulmonde, Maud
AU - Le Cesne, Axel
AU - Blay, Jean Yves
AU - Bellera, Carine
N1 - Publisher Copyright:
© The Author(s), 2024.
PY - 2024/1/1
Y1 - 2024/1/1
AB - Background: Missing covariates are common in observational research and can lead to bias and loss of statistical power. Limited data regarding prognostic factors of survival outcomes of sarcomas in irradiated fields (SIF) are available. Because of the long lag time after irradiation of the first cancer and the scarcity of SIF, missing data are a critical issue when analyzing long-term outcomes. We assessed prognostic factors of overall (OS), progression-free (PFS), and metastatic-progression-free (MPFS) survival in SIF using three methods to account for missing covariates. Methods: We relied on the NETSARC French Sarcoma Group database and on Cox (OS/PFS) and competing-risks (MPFS) survival models. Covariates investigated were age, sex, histological subtype, tumor size, depth and grade, metastasis, surgery, surgical resection, surgeon’s expertise, imaging, and neo-adjuvant treatment. We first applied multiple imputation (MI): observed data were used to estimate the missing covariate values. With the missing-data modality approach, a category ‘missing’ was created for qualitative variables. With the complete-case (CC) approach, analysis was restricted to patients without missing covariates. Results: CC subjects (N = 167; 33%) presented more often with soft-tissue sarcoma (versus visceral sarcoma) and grade I–II tumors as compared to the 504 eligible cases. With MI (N = 504), factors associated with a worse outcome included metastasis (p = 0.04) and R1/R2 resection (p < 0.001) for OS; higher grade/non-gradable tumors (p = 0.002) and R1/R2 resection (p < 0.001) for PFS; and metastasis (p = 0.01) for MPFS. The ‘missing-data modality’ approach (N = 504) led to different associations, including significance reached due to variables with the modality ‘missing’. The CC analysis led to different results and reduced precision. Conclusion: The CC population was not representative of the eligible population, introducing bias in addition to worse precision. The ‘missing-data modality’ method results in biased estimates in non-randomized studies, as outcomes may be related to variables with missing values. Appropriate statistical methods for missing covariates, for example, MI, should therefore be considered.
KW - competing risks
KW - irradiation
KW - sarcoma
KW - survival analysis
UR - http://www.scopus.com/inward/record.url?scp=85182815234&partnerID=8YFLogxK
U2 - 10.1177/17588359231220999
DO - 10.1177/17588359231220999
M3 - Article
AN - SCOPUS:85182815234
SN - 1758-8340
VL - 16
JO - Therapeutic Advances in Medical Oncology
JF - Therapeutic Advances in Medical Oncology
ER -