-
-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revise compare_models() for Bayesian models #716
Comments
I agree other than that I think we should keep LOOIC+SE (rather than ELPD). LOOIC is -2*ELPD, which puts it in the deviance metric that is also used in frequentist models with AIC, etc, and this allows us to show the stacking weights with our existing formatting functions |
Fair Should we also add some details in the documentation about how to interpret the weights, because I've been asked by students and I wasn't too clear in my answer... any rules of thumb? |
Moreover, in easystats/report#419, we compute the z-transformed ELPD diff (i.e., normalized by SE). But I think we might as well directly convert it to a p-value given that its asymptotic distribution, it would be the most easy and "intuitive" to read (rather than the z-value). EDIT: forget what I said about the duplicated code it wouldn't work |
Here is a skeleton, maybe it would make more sense to add it as a method of
.test_performance_bayesian <- function(objects, criterion="loo") {
# Compute differences ----
if (criterion %in% c("loo", "looic")) {
diff <- brms::loo_compare(lapply(objects, loo::loo))
col <- "looic"
} else if (criterion == "waic") {
diff <- brms::loo_compare(lapply(objects, loo::waic))
col <- "waic"
} else {
stop("Criterion not supported")
}
# Format ----
out <- as.data.frame(diff)
out$Name <- rownames(out)
# Select columns
out <- out[, c("Name", col, "elpd_diff", "se_diff")]
# Compute p-value ----
# ELPD difference is asympatotically normal, hence the z-score is computed
z <- out$elpd_diff[-1] / out$se_diff[-1]
out$p <- c(NA, 2 * pnorm(-abs(z)))
# Return
out
}
m1 <- brms::brm(wt ~ mpg, data=mtcars, refresh=0)
#> Compiling Stan program...
#> Start sampling
m2 <- brms::brm(wt ~ mpg + hp, data=mtcars, refresh=0)
#> Compiling Stan program...
#> Start sampling
m3 <- brms::brm(wt ~ mpg + hp + qsec, data=mtcars, refresh=0)
#> Compiling Stan program...
#> Start sampling
objects <- insight::ellipsis_info(m1, m2, m3, only_models = TRUE)
.test_performance_bayesian(objects)
#> Warning: Found 1 observations with a pareto_k > 0.7 in model 'X[[i]]'. We
#> recommend to set 'moment_match = TRUE' in order to perform moment matching for
#> problematic observations.
#> Warning: Found 1 observations with a pareto_k > 0.7 in model 'X[[i]]'. We
#> recommend to set 'moment_match = TRUE' in order to perform moment matching for
#> problematic observations.
#> Name looic elpd_diff se_diff p
#> m3 m3 45.99388 0.000000 0.000000 NA
#> m1 m1 50.78136 -2.393738 2.318327 0.30182476
#> m2 m2 53.35469 -3.680406 2.208913 0.09568128 Created on 2024-04-23 with reprex v2.0.2 |
I suggest we do LOOIC (with weights) and SE. If folks want ELPD or WAIC, they can request those instead. The SE is useful in the same way as for ELPD, whereas the weights are more like a transformed point estimate. |
The current output would benefit from some streamlining.
Suggestions:
What do you think?
The text was updated successfully, but these errors were encountered: