ChainImputer: A Neural Network-Based Iterative Imputation Method Using Cumulative Features

Published in MDPI Symmetry, 2025

The goal of missing value imputation is to replace the missing values in a dataset with estimated values. This preprocessing step plays a crucial role in knowledge discovery and data mining, because most data analysis methods assume complete data and cannot be applied directly to datasets with missing entries. Among the various approaches, neural network-based imputation has recently gained significant attention for its high prediction accuracy, which stems from its strong capacity to fit the given training dataset.

Figure: ChainImputer architecture. ChainImputer’s incremental feature construction process, demonstrating the cumulative approach to missing value imputation.

Conventionally, these approaches begin by applying a naive imputation to fill every missing entry in the dataset and then train the network on the completed data. Imputation performance can therefore be limited, because the neural network is trained on an unreliable dataset filled with roughly guessed values. An alternative strategy is to train only on features that have no missing values, together with features that were carefully imputed earlier in the imputation process. This can be regarded as an asymmetric process, because each newly imputed feature is progressively added to the training dataset.

In this study, we propose an effective neural network-based imputation method that incrementally constructs a cumulative feature set during training. Experimental results on 25 publicly available datasets show that the proposed method significantly outperforms conventional methods.
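The cumulative idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `chain_impute` is hypothetical, a linear least-squares fit stands in for the neural network, and the order in which incomplete columns are visited (fewest missing entries first) is an assumption made only for this example. The sketch also assumes that at least one column is fully observed and that every column has at least one observed value.

```python
import numpy as np

def chain_impute(X):
    """Impute NaNs column by column while growing a cumulative feature set.

    Illustrative sketch only: a linear least-squares model stands in for
    the neural network described in the paper.
    """
    X = np.asarray(X, dtype=float).copy()
    missing = np.isnan(X)
    n_rows, n_cols = X.shape

    # Start from the columns that contain no missing values.
    cumulative = [j for j in range(n_cols) if not missing[:, j].any()]
    if not cumulative:
        raise ValueError("this sketch needs at least one fully observed column")

    # Assumption: visit incomplete columns from fewest to most missing entries.
    todo = sorted(
        (j for j in range(n_cols) if missing[:, j].any()),
        key=lambda j: missing[:, j].sum(),
    )

    for j in todo:
        observed = ~missing[:, j]
        # Inputs: the cumulative features so far, plus an intercept column.
        A = np.column_stack([X[:, cumulative], np.ones(n_rows)])
        # Fit only on rows where the target column is observed.
        coef, *_ = np.linalg.lstsq(A[observed], X[observed, j], rcond=None)
        # Fill the missing entries of column j with the model's predictions.
        X[~observed, j] = A[~observed] @ coef
        # The newly imputed column joins the feature set for later columns.
        cumulative.append(j)

    return X

# Toy data: two columns that depend linearly on a fully observed one.
rng = np.random.default_rng(0)
x0 = rng.normal(size=50)
X = np.column_stack([x0, 2 * x0 + 1, x0 - 3])
X_missing = X.copy()
X_missing[::5, 1] = np.nan
X_missing[1::7, 2] = np.nan
X_hat = chain_impute(X_missing)
```

At no point is a model trained on roughly guessed values: each model sees only fully observed columns and columns that an earlier step has already imputed.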

Download paper here

Recommended citation: Seo, Wangduk, et al. "ChainImputer: A Neural Network-Based Iterative Imputation Method Using Cumulative Features." Symmetry 17.6 (2025): 869.