Methodological NEAR results: How to deal with missing data

Healthcare decisions should be based on high-quality evidence. Combining individual data from different studies in a single analysis, known as individual participant data meta-analysis (IPDMA), can help achieve this goal. Besides ensuring that the same statistical model and conditions are used across all studies, IPDMA can increase the statistical power to detect more refined effects that a single study cannot find. Yet practical challenges can arise, such as systematically missing data, where a variable was not measured at all in one or more studies. In such circumstances, researchers must weigh the trade-off between excluding data, and thus potentially losing power, and applying methods such as multiple imputation to infer missing values from observed data. However, there is little research on multiple imputation for systematically missing data in IPDMA. This study explored three multiple imputation methods for systematically missing data on gait speed in relation to 5-year mortality in four population-based studies of older adults aged 59 to ≥90 years. Four NEAR studies were used, all part of the Swedish National study on Aging and Care (SNAC): Skåne (SNAC-S), Kungsholmen (SNAC-K), Blekinge (SNAC-B), and Nordanstig (SNAC-N).

Photo: Ann H

Conditional quantile imputation (CQI) performed best among all imputation methods
Gait speed was systematically missing from one study. Information from the other three studies, which had complete data on gait speed, was therefore used to apply three multiple imputation strategies: fully conditional specification (FCS), multivariate normal (MVN), and conditional quantile imputation (CQI). All three methods performed relatively well in imputing the missing gait speed data, with CQI performing best. Developing appropriate methods for handling missing data is crucial for large pooling projects, but these results should be further evaluated and replicated in other contexts. In the long run, missing data imputation in large pooling projects can help inform evidence-based healthcare.
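For readers who want to see the general idea in code, here is a minimal sketch of multiple imputation for a variable that is missing from an entire study. It does not reproduce the FCS, MVN, or CQI procedures used in the publication. It swaps in a plain normal linear regression imputation model, uses synthetic data rather than NEAR data, ignores differences between studies, and combines the results with Rubin's rules. All names, sample sizes, and coefficients (`simulate_study`, `fit_logistic`, `pool_rubin`, and so on) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2024)

def simulate_study(n):
    """Synthetic stand-in for one cohort (NOT NEAR data; all values are made up)."""
    age10 = rng.uniform(-1.5, 2.0, n)                     # (age - 75) / 10
    gait = 1.0 - 0.15 * age10 + rng.normal(0, 0.2, n)     # gait speed in m/s
    logit = -1.0 + 0.5 * age10 - 1.5 * (gait - 1.0)
    died = rng.binomial(1, 1 / (1 + np.exp(-logit)))      # 5-year mortality
    return age10, gait, died.astype(float)

def fit_logistic(X, y, iters=25):
    """Logistic regression by Newton-Raphson; returns coefficients and covariance."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        hessian = X.T @ (X * (p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, X.T @ (y - p))
    p = 1 / (1 + np.exp(-X @ beta))
    cov = np.linalg.inv(X.T @ (X * (p * (1 - p))[:, None]))
    return beta, cov

def pool_rubin(estimates, variances):
    """Rubin's rules: pooled estimate and total variance over m imputations."""
    m = len(estimates)
    within = np.mean(variances)
    between = np.var(estimates, ddof=1)
    return float(np.mean(estimates)), float(within + (1 + 1 / m) * between)

# Three studies measured gait speed; in the fourth it is treated as never measured.
observed = [simulate_study(500) for _ in range(3)]
age_o, gait_o, died_o = (np.concatenate(parts) for parts in zip(*observed))
age_m, _, died_m = simulate_study(500)

# Imputation model fitted on the complete studies: gait ~ 1 + age + died.
# The outcome (died) is included so that imputed values keep their link to it.
X_o = np.column_stack([np.ones_like(age_o), age_o, died_o])
X_m = np.column_stack([np.ones_like(age_m), age_m, died_m])
xtx_inv = np.linalg.inv(X_o.T @ X_o)
beta_hat = xtx_inv @ X_o.T @ gait_o
resid = gait_o - X_o @ beta_hat
dof = len(gait_o) - X_o.shape[1]

estimates, variances = [], []
for _ in range(20):  # 20 imputed datasets (an arbitrary choice)
    # Draw model parameters first, so imputations reflect parameter uncertainty.
    sigma2 = resid @ resid / rng.chisquare(dof)
    beta = beta_hat + np.linalg.cholesky(sigma2 * xtx_inv) @ rng.standard_normal(3)
    gait_imp = X_m @ beta + rng.normal(0, np.sqrt(sigma2), len(age_m))

    # Analysis model on the pooled data: died ~ 1 + age + gait.
    age_all = np.concatenate([age_o, age_m])
    gait_all = np.concatenate([gait_o, gait_imp])
    died_all = np.concatenate([died_o, died_m])
    X_a = np.column_stack([np.ones_like(age_all), age_all, gait_all])
    coef, cov = fit_logistic(X_a, died_all)
    estimates.append(coef[2])
    variances.append(cov[2, 2])

pooled_est, pooled_var = pool_rubin(estimates, variances)
print(f"Pooled log-odds ratio for gait speed: {pooled_est:.2f} (SE {pooled_var ** 0.5:.2f})")
```

The key design point is that each imputed dataset uses a fresh draw of the imputation model's parameters and of the residual noise, so the pooled standard error reflects the uncertainty about the values that were never measured.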

Robert Thiesmeier, first author of the study. Photo: Stephanie Pitt

Congratulations Robert on your results! What was it like working with NEAR data?
The challenge was to simulate data from four databases within NEAR while keeping their statistical properties. By doing so, we could evaluate how multiple imputation strategies perform in our case of systematically missing gait speed data in one of the studies.

Most unexpected research finding:
Since one study completely lacked gait speed data, we expected it to have a more significant impact on the results. This is not to say that multiple imputation is unimportant. Rather, every research project must evaluate whether and how to apply it.

Best tips for working with NEAR data:
When using NEAR data, if you face methodological and practical challenges, such as missing data, it can be extremely beneficial to learn from other NEAR studies with similar data. This way you can maximize NEAR’s potential data use, as well as its data richness.

Best tips for improving your gait speed:
I leave this to aging research experts, although chasing missing data might certainly improve fitness 😊.

Best tips to regain focus:
Coffee and cinnamon buns😊.

 

Publication

Thiesmeier R, Abbadi A, Rizzuto D, Calderón-Larrañaga A, Hofer SM, Orsini N. Multiple imputation of systematically missing data on gait speed in the Swedish National Study on Aging and Care. Aging (Albany NY). 2024; 16(4):3056-67. https://doi.org/10.18632/aging.205552.