Comparison of Multiple Imputation and EM algorithm on Predicting Children’s Weight at School Entry using Penalized Regression: Ridge, Lasso and Elastic Net

Khuneswari Gopal Pillay, Sya Sya Avtar, John H. McColl, Charlotte Wright

Penalized regression was tested to determine which method (Ridge, Lasso, or Elastic Net) performs best when combined with missing-data imputation. Ridge regression shrinks the coefficients of insignificant variables toward zero, while Lasso regression sets the coefficients of insignificant variables exactly to zero. Elastic Net combines the Ridge and Lasso penalties. Missing data were imputed using two methods, Multiple Imputation and the EM algorithm. The final model was chosen by the lowest mean squared error of prediction. Data from the Gateshead Millennium Study (GMS) are used to illustrate the problem. The results showed that Multiple Imputation is better than the EM algorithm for missing data imputation, and that Lasso regression performed better than Ridge regression. The best model overall came from Elastic Net regression with α=0.9.
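
For reference, the three penalties can be written as a single objective. The form below follows the parameterization used by the glmnet software, which is an assumption here: the abstract does not state which parameterization or software was used. The symbols n, y_i, x_i, β_0, β and λ are likewise introduced only for illustration.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
\;+\;
\lambda\left[\alpha\,\lVert\beta\rVert_1
+ \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\right],
\qquad 0 \le \alpha \le 1,\; \lambda \ge 0 .
```

Under this assumed form, α=0 gives Ridge regression, α=1 gives Lasso regression, and α=0.9 places most of the penalty weight on the Lasso term while keeping a small Ridge component.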

Volume 11 | 12-Special Issue

Pages: 178-185