Comparison of MICE and Regression Imputation for Handling Missing Data

Berliana Devianti Putri, Hari Basuki Notobroto, Arief Wibowo

Abstract


Data collection activities carry a risk of missing data. Missing data may produce biased estimates and inflated standard errors, so an imputation method is needed. The purpose of this study was to investigate which imputation method is most appropriate for handling missing data. The strategies evaluated were complete case analysis, Multivariate Imputation by Chained Equations (MICE), and regression imputation. This was a non-reactive study that used raw data from the RPJMN 2015 Survey of BKKBN East Java Province. Three incomplete data sets, with 5%, 10%, and 15% missing data, were generated from a complete raw data set. The values were made missing completely at random (MCAR). Based on the Friedman test, both imputation methods produced estimates that did not differ from those of the complete raw data set. Based on mean square error (MSE) analysis, MICE yielded lower and more stable MSE values than regression imputation in all scenarios. Conclusion: MICE was the most recommended method for handling missing data of less than 15%.
Keywords: Missing data, MICE, Regression imputation
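
The evaluation design described in the abstract (remove values completely at random at 5%, 10%, and 15%, impute, then compare against the complete data by MSE) can be illustrated with a minimal Python sketch. It is not the authors' code: the data are synthetic rather than the RPJMN 2015 Survey, only single-predictor regression imputation is shown (MICE is not implemented), the function names are hypothetical, and MSE is computed over the imputed cells, which is an assumption because the abstract does not state how MSE was calculated.

```python
import random


def make_mcar(values, rate, rng):
    """Return a copy of `values` with a fraction `rate` set to None,
    chosen uniformly at random (missing completely at random)."""
    n_missing = round(len(values) * rate)
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def regression_impute(x, y_incomplete):
    """Fill missing y values with predictions from a least-squares line
    fitted on the complete cases only."""
    pairs = [(xi, yi) for xi, yi in zip(x, y_incomplete) if yi is not None]
    n = len(pairs)
    mean_x = sum(xi for xi, _ in pairs) / n
    mean_y = sum(yi for _, yi in pairs) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in pairs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * xi if yi is None else yi
            for xi, yi in zip(x, y_incomplete)]


def imputation_mse(y_true, y_incomplete, y_imputed):
    """Mean square error over the cells that were missing (an assumption;
    the study may have computed MSE on estimates instead)."""
    errors = [(t - f) ** 2
              for t, m, f in zip(y_true, y_incomplete, y_imputed)
              if m is None]
    return sum(errors) / len(errors)


# Synthetic complete data set (hypothetical, for illustration only).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [2.0 + 1.5 * xi + rng.gauss(0, 0.5) for xi in x]

results = {}
for rate in (0.05, 0.10, 0.15):
    y_incomplete = make_mcar(y, rate, rng)
    y_imputed = regression_impute(x, y_incomplete)
    results[rate] = imputation_mse(y, y_incomplete, y_imputed)
    print(f"{rate:.0%} missing: MSE = {results[rate]:.4f}")
```

Repeating the loop over many random seeds and over a second imputation method would give the kind of per-scenario MSE comparison the abstract reports.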



DOI: https://doi.org/10.33846/hn.v2i2.119


Copyright (c) 2018 Berliana Devianti Putri, Hari Basuki Notobroto, Arief Wibowo

"HEALTH NOTIONS" ISSN: 2580-4936 (online version only), published by Humanistic Network for Science and Technology

Cemara street 25, 001/002, Dare, Ds./Kec. Sukorejo, Ponorogo, East Java, Indonesia, 63453
Phone/WhatsApp: +6282132259611