Prediction can be safely used as a proxy for explanation in causally consistent Bayesian generalized linear models


Bayesian modeling provides a principled approach to quantifying uncertainty in model parameters and structure and has seen a surge of applications in recent years. Despite substantial existing work on an overarching Bayesian workflow, many individual steps still require further research to optimize the associated decision processes. In this paper, we present results from a large simulation study of Bayesian generalized linear models for double- and lower-bounded data, in which we analyze metrics on convergence, parameter recoverability, and predictive performance. We specifically investigate the validity of using predictive performance as a proxy for parameter recoverability in Bayesian model selection. Results indicate that – for a given, causally consistent predictor term – better out-of-sample predictions imply lower parameter RMSE, a lower false positive rate, and a higher true positive rate. In terms of initial model choice, we make recommendations for default likelihoods and link functions. We also find that, despite lacking structural faithfulness for bounded data, Gaussian linear models show error calibration on par with structurally faithful alternatives.
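To make the two quantities being related concrete, the following is a minimal sketch (not the study's actual simulation pipeline, and using a simple conjugate Gaussian linear model rather than the bounded-data likelihoods examined in the paper): it fits a Bayesian linear model to simulated data, then computes parameter recoverability as the RMSE of the posterior mean against the known ground truth and predictive performance as the mean out-of-sample log predictive density. All names and parameter values are illustrative.

```python
import numpy as np

# Illustrative sketch only: conjugate Bayesian linear regression with
# known noise sd, relating parameter RMSE to out-of-sample prediction.
rng = np.random.default_rng(1)
n_train, n_test, p = 150, 50, 3
beta_true = np.array([1.5, -2.0, 0.5])   # assumed ground-truth coefficients
sigma = 1.0                               # known noise sd, for simplicity

X = rng.normal(size=(n_train + n_test, p))
y = X @ beta_true + rng.normal(scale=sigma, size=n_train + n_test)
Xtr, ytr = X[:n_train], y[:n_train]
Xte, yte = X[n_train:], y[n_train:]

# Conjugate posterior under the prior beta ~ N(0, tau^2 I)
tau = 10.0
prec = Xtr.T @ Xtr / sigma**2 + np.eye(p) / tau**2
cov_post = np.linalg.inv(prec)
mean_post = cov_post @ Xtr.T @ ytr / sigma**2

# Parameter recoverability: RMSE of the posterior mean vs. ground truth
param_rmse = np.sqrt(np.mean((mean_post - beta_true) ** 2))

# Predictive performance: mean out-of-sample log predictive density,
# using the Gaussian posterior predictive of the conjugate model
pred_mean = Xte @ mean_post
pred_var = np.einsum("ij,jk,ik->i", Xte, cov_post, Xte) + sigma**2
lpd = np.mean(-0.5 * np.log(2 * np.pi * pred_var)
              - 0.5 * (yte - pred_mean) ** 2 / pred_var)

print(f"parameter RMSE: {param_rmse:.3f}, mean test lpd: {lpd:.3f}")
```

In the paper's setting, the same comparison is made across candidate predictor terms and likelihood/link choices; the reported result is that, holding the predictor term causally consistent, models with a better out-of-sample score also tend to recover the parameters more accurately.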