- Added new `method` argument to `calibrate()` to support isotonic regression calibration as described by van der Laan et al. (2024).
- Fixes to tests for CRAN.
- Improvements to weight calculation for continuous treatments with small densities.
- `vcov()`, `summary()`, `anova()`, and `confint()` for `glm_weightit` objects (and their relatives) now have a `vcov` argument that can be used to specify how the variance matrix is computed. This makes it possible to compute a variance matrix different from the one specified in the model fitting call without having to refit the model. `anova()` now displays which variance matrix was used.
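  A minimal sketch of how the new `vcov` argument might be used (assuming `fit` is an existing `glm_weightit` fit and that `"HC0"` is among the accepted variance types; see the package documentation for the actual options):

  ```r
  # Sketch only: recompute inference under a different variance matrix
  # without refitting. "HC0" is an assumed option name here.
  summary(fit, vcov = "HC0")
  confint(fit, vcov = "HC0")
  ```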
- Added `update()` methods for `glm_weightit`, `multinom_weightit`, `ordinal_weightit`, and `coxph_weightit` objects to update the model formula, dataset, or variance matrix. Updating the dataset also refits the included `weightit` object, if any.
- `anova()` for `glm_weightit` objects gets its own help page at `help("anova.glm_weightit")`.
- Changed defaults with `missing = "saem"` for binary and multi-category treatments to bypass a bug in `misaem` code. (#71)
- Preemptively fixed some bugs related to the use of `missing`, including when `missing` is used with `by`.
- The missingness method (if any) is now included in the output of `weightit()`, `weightitMSM()`, and `weightit.fit()` and is printed when using the `print()` method for these objects.
- When `missing = "saem"`, using `vcov = "FWB"` in `glm_weightit()`, etc., now appropriately results in an error. (#71)
- `model.matrix.ordinal_weightit()` now excludes the `(Intercept)` column.
- Fixed a bug with `predict.multinom_weightit()` and `predict.ordinal_weightit()` when the outcome was not included in `newdata`.
- Typo fixes in documentation.
- Added `anova()` methods for `glm_weightit`, `multinom_weightit`, `ordinal_weightit`, and `coxph_weightit` objects to perform Wald tests for comparing nested models. The models do not have to be symbolically nested.
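  An illustrative sketch of comparing two nested weighted outcome models with a Wald test (`d`, `W`, and the variable names are hypothetical; `W` is assumed to be a `weightit` object passed through the `weightit` argument):

  ```r
  # Sketch only: Wald test comparing nested models fit to the same weighted sample
  fit1 <- glm_weightit(y ~ x1,      data = d, weightit = W)
  fit2 <- glm_weightit(y ~ x1 + x2, data = d, weightit = W)
  anova(fit2, fit1)
  ```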
- Added the new user-facing object `.weightit_methods`, which contains information on each method and the options allowed with it. This is used within `WeightIt` for checking arguments but can also be used by other package developers who call functions in `WeightIt`. See `help(".weightit_methods")` for details.
- `plot.weightit()` can be used with `method = "optweight"` to display the dual variables.
- `missing` no longer allows partial matching.
- `moments` can now be set to 0 when `quantile` is supplied to ensure balance on the quantiles without the moments for the methods that accept `quantile`. Thanks to @BERENZ for the suggestion.
- For `ordinal_weightit` objects, `summary()` now has the option to omit thresholds from the output.
- Fixed a bug in `ordinal_weightit()` where the Hessian (and therefore the HC0 robust variance) was calculated incorrectly when some coefficients were aliased (i.e., due to linearly dependent predictors).
- Fixed a bug in `print.summary.glm_weightit()` when confidence intervals were requested. A new printing function is used that produces slightly nicer tables.
- Fixes to vignettes and tests to satisfy CRAN checks.
- Minor bug, performance, and readability fixes.
- Added two new functions, `multinom_weightit()` and `ordinal_weightit()`, for multinomial logistic regression and ordinal regression with capabilities to estimate a covariance matrix that accounts for estimation of the weights using M-estimation. Previously, multinomial logistic regression could be requested using `glm_weightit()` with `family = "multinomial"`; this has been deprecated.
- M-estimation can now be used for weights estimated using ordinal regression for multi-category ordered treatments with `method = "glm"`.
- M-estimation can now be used with bias-reduced regression as implemented in `brglm2` for the propensity score (`method = "glm"` with `link = "br.{.}"`) and for the outcome model (`glm_weightit()` with `br = TRUE`). Thanks to Ioannis Kosmidis for supplying some starter code to implement this.
- For any weighting methods with continuous treatments that support a `density` argument to specify the numerator and denominator densities of the weights, `density` can now be specified as `"kernel"` to request kernel density estimation. Previously, this was requested by setting `use.kernel = TRUE`, which is now deprecated.
- Standard errors are now correctly computed when an offset is included in `glm_weightit()`. Thanks to @zeynepbaskurt. (#63)
- Improved robustness of `get_w_from_ps()` to propensity scores of 0 and 1.
- Updates to `weightit()` with `method = "gbm"`:
  - `use.offset` is now tunable.
  - The same random seed is used across specifications, as requested by @mitchellcameron123. (#64)
  - For binary and multi-category treatments with cross-validation used as the criterion, `class.stratify.cv` is now set to `TRUE` by default to stratify on treatment.
  - For continuous treatments, the default density now corresponds to the distribution requested.
  - `plot()` can be used on the output of `weightit()` to display the results of the tuning process; see `help("plot.weightit")` for details.
  - Fixed a bug where `distribution` was not included in the output when tuned.
  - Fixed a bug when propensity scores were estimated to be 0 or 1; propensity scores are now shifted slightly away from 0 or 1. Thanks to @mitchellcameron123. (#64)
- When using `weightit()` with `method = "super"` for binary and multi-category treatments, cross-validation now stratifies on treatment, as recommended by Phillips et al. (2023).
- Fixed a bug and clarified some error messages when using ordered treatments with `method = "glm"`. Thanks to Steve Worthington for pointing them out.
- Updated the help page of `get_w_from_ps()` to include formulas for the weights.
- Added a new function, `coxph_weightit()`, for fitting Cox proportional hazards models in the weighted sample, with the option of accounting for estimation of the weights in computing standard errors via bootstrapping. This function uses the `summary()` and `print()` methods for `glm_weightit` objects, which are different from those for `coxph` objects.
- `glm_weightit()` gets a new `print()` method that omits some invalid statistics displayed by the `print()` method for `glm` objects and displays the type of standard error estimated.
- `summary.glm_weightit()` (which is also used for `coxph_weightit` objects) gets a new argument, `transform`, which can be used to transform the displayed coefficients and confidence interval bounds (if requested), e.g., by exponentiating them.
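  A sketch of the new `transform` argument, assuming it accepts a function such as `exp` (e.g., to report odds or hazard ratios; `fit` is a hypothetical `glm_weightit` or `coxph_weightit` fit):

  ```r
  # Sketch only: display exponentiated coefficients (and CI bounds, if requested).
  # That `transform` accepts a bare function is an assumption here.
  summary(fit, transform = exp)
  ```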
- M-estimation is now supported for `method = "glm"` with continuous treatments.
- A new estimator is now used for `method = "cbps"` with longitudinal treatments (i.e., using `weightitMSM()`). Previously, the weights from CBPS applied to each time point were multiplied together. Now, balance at all time points is optimized using a single set of weights. This implementation is close to that described by Huffman and van Gameren (2018), not that of Imai and Ratkovic (2015).
- A new estimator is now used for `method = "cbps"` with continuous treatments. The unconditional mean and variance are now included as parameters to be estimated. For the just-identified CBPS, this will typically improve balance, but results will depart from those found using `CBPS::CBPS()`.
- For point treatments (i.e., using `weightit()`), the `stabilize` argument has some new behavior. It can now be specified as a formula, and the stabilization factor is estimated separately and included in the M-estimation if allowed. It can now only be used when `estimand = "ATE"` (weights for other estimands should not be stabilized).
- For binary treatments with `method = "glm"`, `link` can now be specified as `"flic"` or `"flac"` to use Firth-corrected logistic regression as implemented in the `logistf` package.
- With `method = "gbm"`, an error is now thrown if `criterion` (formerly known as `stop.method`) is supplied as anything other than a string.
- For binary and continuous treatments with `method = "gbm"`, a new argument, `use.offset`, can be supplied; if `TRUE`, the linear predictor from a generalized linear model is used as an offset in the boosting model, which can improve performance.
- Added a section on conducting moderation analysis to the estimating effects vignette (`vignette("estimating-effects")`).
- Fixed a bug when using M-estimation for sequential treatments with `weightitMSM()` and `stabilize = TRUE`. Standard errors incorrectly accounted for estimation of the stabilization factor; they are now correct.
- Fixed a bug when using `method = "ipt"` for the ATE.
- Fixed a bug when some coefficients were aliased for `glm_weightit()`. Thanks to @kkwi5241.
- Updated the kernel balancing example in `method_user`.
- Improved warnings and errors for bad models throughout the package.
- Added a new function, `glm_weightit()` (along with wrapper `lm_weightit()`), and associated methods for fitting generalized linear models in the weighted sample, with the option of accounting for estimation of the weights in computing standard errors via M-estimation or two forms of bootstrapping. `glm_weightit()` also supports multinomial logistic regression in addition to all models supported by `glm()`. Cluster-robust standard errors are supported, and output is compatible with any functions that accept `glm()` objects. Not all weighting methods support M-estimation, but for those that do, a new component is added to the `weightit` output object. Currently, GLM propensity scores, entropy balancing, just-identified CBPS, and inverse probability tilting (described below) support M-estimation-based standard errors with `glm_weightit()`.
- Added inverse probability tilting (IPT) as described by Graham, Pinto, and Egel (2012), which can be requested by setting `method = "ipt"`. This is similar to entropy balancing and CBPS in that it enforces exact balance and yields a propensity score, but it has some theoretical advantages over both methods. IPT does not rely on any other packages and runs very quickly.
- Estimating covariate balancing propensity score weights (i.e., `method = "cbps"`) no longer depends on the `CBPS` package. The default is now the just-identified version of the method; the over-identified version can be requested by setting `over = TRUE`. The ATT for multi-category treatments is now supported, as are arbitrary numbers of treatment groups (`CBPS` only natively supports up to 4 groups and only the ATE for multi-category treatments). For binary treatments, generalized linear models other than logistic regression are now supported (e.g., probit or Poisson regression).
- New function `calibrate()` to apply Platt scaling to calibrate propensity scores, as recommended by Gutman et al. (2024).
- A new argument, `quantile`, can be supplied to `weightit()` with all the methods that accept `moments` and `int` (`"ebal"`, `"cbps"`, `"ipt"`, `"npcbps"`, `"optweight"`, and `"energy"`). This allows one to request balance on the quantiles of the covariates, which can add some robustness, as demonstrated by Beręsewicz (2023).
- `as.weightit()` now has a method for `weightit.fit` objects, which now have additional components included in the output.
- `trim()` now has a `drop` argument; setting it to `TRUE` sets the weights of all trimmed units to 0 (effectively dropping them).
- When using `weightit()` with a continuous treatment and a `method` that estimates the generalized propensity score (e.g., `"glm"`, `"gbm"`, `"super"`), sampling weights are now incorporated into the density when `use.kernel = FALSE` (the default) when supplied to `s.weights`. Previously they were ignored in calculating the density, but they have always been and remain used in modeling the treatment (when allowed).
- Fixed a bug when `criterion` was not specified when using `method = "gbm"`.
- Fixed a bug when `ps` was supplied for continuous treatments. Thanks to @taylordunn. (#53)
- Warning messages now display immediately rather than at the end of evaluation.
- The vignettes have been changed to use a slightly different estimator for weighted g-computation. The estimated weights are no longer to be included in the call to `avg_comparisons()`, etc.; that is, they are only used to fit the outcome model. This makes the estimators more consistent with other software, including `teffects ipwra` in Stata, and most of the literature on weighted g-computation. Note that this will not affect any estimates for the ATT or ATC and will yield at most minor changes for the ATE. For other estimands (e.g., ATO), the weights are still to be included.
- The word "multinomial" to describe treatments with more than two categories has been replaced with "multi-category" in all documentation and messages.
- Transferred all help files to Roxygen and reorganized package scripts.
- Reorganization of some functions.
- Fixed a bug when using `estimand = "ATC"` with multi-category treatments. (#47)
- Fixed a bug in the Estimating Effects vignette. (#46)
- `cobalt` version 4.5.1 or greater is now required.
- Fixed a bug when using balance SuperLearner with `cobalt` 4.5.1.
- Added a section to the Estimating Effects vignette (`vignette("estimating-effects")`) on estimating the effect of a continuous treatment after weighting.
- Added energy balancing for continuous treatments, requested using `method = "energy"`, as described in Huling et al. (2023). These weights minimize the distance covariance between the treatment and covariates while maintaining representativeness. This method supports exact balance constraints, distributional balance constraints, and sampling weights. The implementation is similar to that in the `independenceWeights` package. See `?method_energy` for details.
- Added a new vignette on estimating effects after weighting, accessible using `vignette("estimating-effects", package = "WeightIt")`. The new workflow relies on the `marginaleffects` package. The main vignette (`vignette("WeightIt")`) has been modernized as well.
- Added a new dataset, `msmdata`, to demonstrate capabilities for longitudinal treatments. `twang` is no longer a dependency.
- Methods that use a balance criterion to select a tuning parameter, in particular GBM and balance SuperLearner, now rely on `cobalt`'s `bal.init()` and `bal.compute()` functionality, which adds new balance criteria. The `stop.method` argument for these functions has been renamed to `criterion`, and `help("stop.method")` has been removed; the same page is now available at `help("bal.compute", package = "cobalt")`, which describes the additional statistics available. This also fixes some bugs that were present in some balance criteria.
- Renamed `method = "ps"` to `method = "glm"`. `"ps"` continues to work as it always has for back compatibility. `"glm"` is a more descriptive name, since many methods use propensity scores; what distinguishes this method is that it uses generalized linear models.
- Using `method = "ebcw"` for empirical balancing calibration weighting is no longer available because the `ATE` package has been removed. Use `method = "ebal"` for entropy balancing instead, which is essentially identical.
- Updated the `trim()` documentation to clarify the form of trimming that is implemented (i.e., winsorizing). Suggested by David Novgorodsky.
- Fixed bugs when some `s.weights` are equal to zero with `method = "ebal"`, `"cbps"`, and `"energy"`. Suggested by @statzhero. (#41)
- Improved performance of `method = "energy"` for the ATT.
- Fixed a bug when using `method = "energy"` with `by`.
- With `method = "energy"`, setting `int = TRUE` automatically sets `moments = 1` if unspecified.
- Errors and warnings have been updated to use `chk`.
- The missingness indicator approach now imputes the variable median rather than 0 for missing values. This will not change the performance of most methods but may change that of others, and it doesn't affect balance assessment.
- For ordinal multi-category treatments, setting `link = "br.logit"` now uses `brglm2::bracl()` to fit a bias-reduced ordinal regression model.
- Added the vignette "Installing Supporting Packages" to explain how to install the various packages that might be needed for `WeightIt` to use certain methods, including when the package is not on CRAN. See the vignette at `vignette("installing-packages")`.
- Fixed a bug that would occur when a factor or character predictor with a single level was passed to `weightit()`.
- Improved the code for entropy balancing, fixing a bug when using `s.weights` with a continuous treatment and improving messages when the optimization fails to converge. (#33)
- Improved robustness of documentation to missing packages.
- Updated the logo, thanks to Ben Stillerman.
- Fixed a bug that would occur when the `formula.tools` package was loaded, which would occur most commonly when `logistf` was loaded. It would cause the error `The treatment and covariates must have the same number of units.` (#25)
- Fixed a bug where the `info` component would not be included in the output of `weightit()` when using `method = "super"`.
- Added the ability to specify `num.formula` as a list of formulas in `weightitMSM()`. This is primarily to get around the fact that when `stabilize = TRUE`, a fully saturated model with all treatments is used to compute the stabilization factor, which, for many time points, is time-consuming and may be impossible (especially if not all treatment combinations are observed). Thanks to @maellecoursonnais for bringing up this issue. (#27)
- `ps.cont()` has been retired, since the same functionality is available using `weightit()` with `method = "gbm"` and in the `twangContinuous` package.
- With `method = "energy"`, a new argument, `lambda`, can be supplied, which puts a penalty on the square of the weights to control the effective sample size. Typically this is not needed but can help when the balancing is too aggressive.
- With `method = "energy"`, `min.w` can now be negative, allowing for negative weights.
- With `method = "energy"`, `dist.mat` can now be supplied as the name of a method to compute the distance matrix: `"scaled_euclidean"`, `"mahalanobis"`, or `"euclidean"`.
- Support for negative weights added to `summary()`. Negative weights are possible (though not by default) when using `method = "energy"` or `method = "optweight"`.
- Fixed a bug where `glm()` would fail to converge with `method = "ps"` for binary treatments due to bad starting values. (#31)
- `miss = "saem"` can once again be used with `method = "ps"` when missing values are present in the covariates.
- Fixed bugs with processing input formulas.
- An error is now thrown if an incorrect `link` is supplied with `method = "ps"`.
- The use of `method = "twang"` has been retired and will now give an error message. Use `method = "gbm"` for nearly identical functionality with more options, as detailed at `?method_gbm`.
- With multinomial treatments with `link = "logit"` (the default), if the `mclogit` package is installed, it can be requested for estimating the propensity score by setting the option `use.mclogit = TRUE`, which uses `mclogit::mblogit()`. It should give the same results as the default, which uses `mlogit`, but it can be faster and so is recommended.
- Added a `plot()` method for `summary.weightitMSM` objects that functions just like `plot.summary.weightit()` for each time point.
- Fixed a bug in `summary.weightit()` where the labels of the top weights were incorrect. Thanks to Adam Lilly.
- Fixed a bug in `sbps()` when using a stochastic search (i.e., `full.search = FALSE` or more than 8 moderator levels). (#17)
- Fixed a bug that would occur when all weights in a treatment group were `NA`. Bad weights (i.e., all the same) now produce a warning rather than an error so the weights can be diagnosed manually. (#18)
- Fixed a bug when using `method = "energy"` with `estimand = "ATE"` and `improved = TRUE` (the default). The between-treatment energy distance contribution was half of what it should have been; this has now been corrected.
- Added the L1 median measure as a balance criterion. See `?stop.method` for details.
- Fixed a bug where logical treatments would yield an error. (#21)
- Fixed a bug where `Warning: Deprecated` would appear sometimes when `purrr` (part of the `tidyverse`) was loaded. (#22) Thanks to MrFlick on StackOverflow for the solution.
- Added support for estimating propensity scores using Bayesian additive regression trees (BART) with `method = "bart"`. This method fits a BART model for the treatment using functions in the `dbarts` package to estimate propensity scores that are used in weights. Binary, multinomial, and continuous treatments are supported. BART uses Bayesian priors for its hyperparameters, so no hyperparameter tuning is necessary to get well-performing predictions.
- Fixed a bug when using `method = "gbm"` with `stop.method = "cv{#}"`.
- Fixed a bug when setting `estimand = "ATC"` for methods that produce a propensity score. In the past, the output propensity score was the probability of being in the control group; now, it is the probability of being in the treated group, as it is for all other estimands. This does not affect the weights.
- Setting `method = "twang"` is now deprecated. Use `method = "gbm"` for improved performance and increased functionality. `method = "twang"` relies on the `twang` package; `method = "gbm"` calls `gbm` directly.
- Using `method = "ebal"` no longer requires the `ebal` package. Instead, `optim()` is used, as it has been with continuous treatments. Balance is a little better, but some options have been removed.
- When using `method = "ebal"` with continuous treatments, a new argument, `d.moments`, can now be specified. This controls the number of moments of the covariate and treatment distributions that are constrained to be the same in the weighted sample as they are in the original sample. Vegetabile et al. (2020) recommend setting `d.moments` to at least 3 to ensure generalizability and reduce bias due to effect modification.
- Made some minor changes to `summary.weightit()` and `plot.summary.weightit()`. Fixed how negative entropy was computed.
- The option `use.mnlogit` in `weightit()` with multi-category treatments and `method = "ps"` has been removed because `mnlogit` appears uncooperative.
- Fixed a bug (#16) when using `method = "cbps"` with factor variables, thanks to @danielebottigliengo.
- Fixed a bug when using binary factor treatments, thanks to Darren Stewart.
- Cleaned up the documentation.
- Fixed a bug where treatment values were accidentally switched for some methods.
- With `method = "gbm"`, added the ability to tune hyperparameters like `interaction.depth` and `distribution` using the same criteria as are used to select the optimal tree. A summary of the tuning results is included in `info` in the `weightit` output object.
- Fixed a bug where `moments` and `int` were ignored unless both were specified.
- Effective sample sizes now print only up to two digits (believe me, you don't need three) and print more cleanly with whole numbers.
- Fixed a bug when using `by`, thanks to @frankpopham. (#11)
- Fixed a bug when using `weightitMSM` with methods that process `int` and `moments` (though you probably shouldn't use them anyway). Thanks to Sven Rieger.
- Fixed a bug when using `method = "npcbps"` where weights could be excessively small and mistaken for all being the same. The weights now sum to the number of units.
- Added support for energy balancing with `method = "energy"`. This method minimizes the energy distance between samples, which is a multivariate distance measure. This method uses code written specifically for `WeightIt` (i.e., it does not call a package specifically designed for energy balancing), using the `osqp` package for the optimization (same as `optweight`). See Huling & Mak (2020) for details on this method. Also included is an option to require exact balance on moments of the covariates while minimizing the energy distance. The method works for binary and multinomial treatments with the ATE, ATT, or ATC. Sampling weights are supported. Because the method requires the calculation and manipulation of a distance matrix for all units, it can be slow and/or memory-intensive for large datasets.
- Improvements to `method = "gbm"` and to `method = "super"` with `SL.method = "method.balance"`. A new suite of `stop.method`s are allowed. For binary treatments, these include the energy distance, sample Mahalanobis distance, and pseudo-R2 of the weighted treatment model, among others. See `?stop.method` for allowable options. In addition, performance for both is quite a bit faster.
- With multinomial treatments with `link = "logit"` (the default), if the `mnlogit` package is installed, it can be requested for estimating the propensity score by setting the option `use.mnlogit = TRUE`. It should give the same results as the default, which uses `mlogit`, but it can be faster for large datasets.
- Added the option `estimand = "ATOS"` for the "optimal subset" treatment effect as described by Crump et al. (2009). This estimand finds the subset of units that, with ATE weights applied, yields a treatment effect with the lowest variance, assuming homoscedasticity (and other assumptions). It is only available for binary treatments with `method = "ps"`. In general it makes more sense to use `estimand = "ATO"` if you want a low-variance estimate and don't care about the target population, but I added this here for completeness. It is available in `get_w_from_ps()` as well.
- `make_full_rank()` is now faster.
- Cleaning up of some error messages.
- Fixed a bug when using `link = "log"` for `method = "ps"` with binary treatments.
- Fixed a bug when using `method = "cbps"` with continuous treatments and sampling weights. Previously the returned weights included the sampling weights multiplied in; now they are separated, as they are in all other scenarios and for all other methods.
- Improved processing of non-0/1 binary treatments, including for `method = "gbm"`. A guess will be made as to which treatment is considered "treated"; this only affects produced propensity scores, not weights.
- Changed the default value of `at` in `trim()` from .99 to 0.
- Added output for the number of weights equal to zero in `summary.weightit`. This can be especially helpful when using the `"optweight"` or `"energy"` methods or when using `estimand = "ATOS"`.
- Added support for entropy balancing (`method = "ebal"`) for continuous treatments as described by Tübbicke (2020). Relies on hand-written code contributed by Stefan Tübbicke rather than another R package. Sampling weights and base weights are both supported, as they are with binary and multi-category treatments.
- Added support for Balance SuperLearner as described by Pirracchio and Carone (2018) with `method = "super"`. Rather than using NNLS to choose the optimal combination of predictions, you can now optimize balance. To do so, set `SL.method = "method.balance"`. You will need to set an argument to `stop.method`, which works identically to how it does for `method = "gbm"`. For example, for `stop.method = "es.max"`, the predicted values given will be the combination of predicted values that minimizes the largest absolute standardized mean difference of the covariates in the sample weighted using the predicted values as propensity scores.
- Changed some of the statistics displayed when using `summary()`: the weight ratio is gone (because weights can be 0, which is not problematic but would explode the ratio), and the mean absolute deviation and entropy of the weights are now present.
- Added `crayon` for prettier printing of `summary()` output.
- Formula interfaces now accept `poly(x, .)` and other matrix-generating functions of variables, including the `rms`-class-generating functions from the `rms` package (e.g., `pol()`, `rcs()`, etc.; the `rms` package must be loaded to use these) and the `basis`-class-generating functions from the `splines` package (i.e., `bs()` and `ns()`). A bug in an early version of this was found by @ahinton-mmc.
- Added support for marginal mean weighting through stratification (MMWS), as described by Hong (2010, 2012), for `weightit()` and `get_w_from_ps()` through the `subclass` argument (see References at `?get_w_from_ps`). With this method, subclasses are formed based on the propensity score, and weights are computed based on the number of units in each subclass. MMWS can be used with any method that produces a propensity score. The implementation here ensures all subclasses have at least one member by filling in empty subclasses with neighboring units.
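  A sketch of requesting MMWS weights through `get_w_from_ps()` (the propensity score vector `p`, binary treatment `t`, argument names, and the choice of 5 subclasses are all hypothetical here):

  ```r
  # Sketch only: MMWS weights from subclasses formed on the propensity score
  w <- get_w_from_ps(ps = p, treat = t, estimand = "ATT", subclass = 5)
  ```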
- Added a `stabilize` option to `get_w_from_ps()`.
- A new `missing` argument has been added to `weightit()` to choose how missing data in the covariates is handled. For most methods, only `"ind"` (i.e., missing indicators with single-value imputation) is allowed, but for `"ps"`, `"gbm"`, and `"twang"`, other methods are possible. For `method = "ps"`, a stochastic approximation of the EM algorithm (SAEM) can be used through the `misaem` package by setting `missing = "saem"`.
- For continuous treatments with the `"ps"`, `"gbm"`, and `"super"` methods (i.e., where the conditional density of the treatment needs to be estimated), the user can now supply their own density as a string or function rather than using the normal density or kernel density estimation. For example, to use the density of the t-distribution with 3 degrees of freedom, one can set `density = "dt_3"`. T-distributions often work better than normal distributions for extreme values of the treatment.
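  A sketch of supplying a custom density string (the dataset `d`, its columns, and the formula are hypothetical; `density = "dt_3"` is the documented usage):

  ```r
  # Sketch only: generalized propensity score weights for a continuous
  # treatment using a t(3) density for the weight calculation
  W <- weightit(dose ~ x1 + x2, data = d, method = "ps", density = "dt_3")
  ```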
- Some methods now have an `info` component in the output object. This contains information that might be useful in diagnosing or reporting the method. For example, when `method = "gbm"`, `info` contains the tree that was used to compute the weights and the balance resulting from all the trees, which can be plotted using `plot()`. When `method = "super"`, `info` contains the coefficients in the stacking model and the cross-validation risk of each of the component methods.
- For `method = "gbm"`, the best tree can be chosen using cross-validation rather than balance by setting `stop.method = "cv5"`, e.g., to do 5-fold cross-validation.
- For `method = "gbm"`, a new optional argument `start.tree` can be set to select the tree at which balance begins to be computed. This can speed things up when you know that the best tree is not within the first 100 trees, for example.
- When using `method = "gbm"` with multi-category treatments and estimands other than the ATE, ATT, or ATC with standardized mean differences as the stopping rule, the mean differences will be between the weighted overall sample and each treatment group. Otherwise, some efficiency improvements.
- When using `method = "ps"` with multi-category treatments, the use of `use.mlogit = FALSE` to request multiple binary regressions instead of multinomial regression is now documented, and an associated bug is now fixed, thanks to @ahinton-mmc.
- When using `method = "super"`, one can now set `discrete = TRUE` to use discrete SuperLearner instead of stacked SuperLearner, but you probably shouldn't.
- `moments` and `int` can now be used with `method = "npcbps"`.
- Performance enhancements.
- Fixed a bug when using `weightit()` inside another function that passed a `by` argument explicitly. Also changed the syntax for `by`; it must now either be a string (which was always possible) or a one-sided formula with the stratifying variable on the right-hand side. To use a variable that is not in `data`, you must use the formula interface.
- Fixed a bug when trying to use `ps` with `by` in `weightit()`.
- Added a new `sbps()` function for estimating subgroup balancing propensity score weights, including both the standard method and a new smooth version.
- Setting `method = "gbm"` and `method = "twang"` will now do two different things. `method = "gbm"` uses `gbm` and `cobalt` functions to estimate the weights and is much faster, while `method = "twang"` uses `twang` functions to estimate the weights. The results are similar between the two methods. Prior to this version, `method = "gbm"` and `method = "twang"` both did what `method = "twang"` does now.
- Bug fixes when `stabilize = TRUE`, thanks to @ulriksartipy and Sven Rieger.
- Fixes for using the `base.weight` argument with `method = "ebal"`. Now the supplied vector should have a length equal to the number of units in the dataset (in contrast to its use in `ebalance`, which requires a length equal to the number of control units).
Restored dependency on
cobalt
for examples and vignette. -
When
method = "ps"
and the treatment is ordered (i.e., ordinal),MASS::polr()
is used to fit an ordinal regression. Make the treatment un-ordered to to use multinomial regression instead. -
Added support for using bias-reduced fitting functions when
method = "ps"
as provided by thebrglm2
package. These can be accessed by changing thelink
to, for example,"br.logit"
or"br.probit"
. For multinomial treatments, settinglink = "br.logit"
fits a bias-reduced multinomial regression model usingbrglm2::brmultinom()
. This can be helpful when regular maximum likelihood models fail to converge, though this may also be a sign of lack of overlap.
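A sketch of requesting a bias-reduced propensity score model; the data here are simulated for illustration, and `brglm2` must be installed:

```r
library("WeightIt")

set.seed(123)
d <- data.frame(treat = rbinom(200, 1, .4),
                x1 = rnorm(200), x2 = rnorm(200))

# Bias-reduced logistic regression (Firth-type correction via brglm2)
w <- weightit(treat ~ x1 + x2, data = d,
              method = "ps", link = "br.logit")
summary(w)
```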
- Bug fixes. Functions now work better when used inside other functions (e.g., `lapply()`).

- Behavior of `weightit()` in the presence of non-`NULL` `focal` has changed. When `focal` is specified, `estimand` is assumed to be `"ATT"`. Previously, `focal` would be ignored unless `estimand = "ATT"`.

- Processing of `estimand` and `focal` is improved. Functions are smarter about guessing which group is the focal group when one isn't specified, especially with non-numeric treatments. `focal` can now be used with `estimand = "ATC"` to indicate which group is the control group, so `"ATC"` and `"ATT"` now function more similarly.

- Added function `get_w_from_ps()` to transform propensity scores into weights (instead of having to go through `weightit()`).

- Added functions `as.weightit()` and `as.weightitMSM()` to convert weights, treatments, and other components into `weightit` objects so that `summary.weightit()` can be used on them.

- Updated documentation to describe how missing data in the covariates is handled. Some bugs related to missing data have been fixed as well, thanks to Yong Hao Pua.
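A short sketch of `get_w_from_ps()` and `as.weightit()` together; the propensity scores below are made-up numbers:

```r
library("WeightIt")

ps <- c(.2, .6, .8, .3, .5)
treat <- c(0, 1, 1, 0, 1)

# ATT weights: 1 for the treated, ps/(1 - ps) for the controls
w <- get_w_from_ps(ps, treat, estimand = "ATT")

# Wrap the weights so summary.weightit() can be used on them
W <- as.weightit(w, treat = treat, estimand = "ATT", ps = ps)
summary(W)
```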
- `ps.cont()` had the "z-transformed correlation" options removed to simplify output. This function and its supporting functions will be deprecated as soon as the new version of `twang` is released.

- When using `method = "ps"` or `method = "super"` with continuous treatments and setting `use.kernel = TRUE` and `plot = TRUE`, the plot is now made with `ggplot2` rather than base R plots.

- Added `plot.summary.weightit()` to plot the distribution of weights (a feature also in `optweight`).

- Removed dependency on `cobalt` temporarily, which means the examples and vignette won't run.

- Added `ggplot2` to Imports.
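For example, the weight-distribution plot can be produced from the summary of any `weightit` object; the data below are simulated for illustration:

```r
library("WeightIt")

set.seed(123)
d <- data.frame(treat = rbinom(100, 1, .5), x1 = rnorm(100))
W <- weightit(treat ~ x1, data = d, method = "ps")

# Plot the distribution of the estimated weights by treatment group
plot(summary(W))
```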
- Fixed a bug when using the `ps` argument in `weightit()`.

- Fixed a bug when setting `include.obj = TRUE` in `weightitMSM()`.

- Added warnings for using certain methods with longitudinal treatments, as they are not validated and may lead to incorrect inferences.
- Added `super` method to estimate propensity scores using the `SuperLearner` package.

- Added `optweight` method to estimate weights using optimization (but you should probably just use the `optweight` package).

- `weightit()` now uses the correct formula to estimate weights for the ATO with multinomial treatments, as described by Li & Li (2018).

- Added `include.obj` option in `weightit()` and `weightitMSM()` to include the fitted object in the output object for inspection. For example, with `method = "ps"`, the `glm` object containing the propensity score model will be included in the output.

- Rearranged the help pages. Each method now has its own documentation page, linked from the `weightit` help page.

- Propensity scores are now included in the output for binary treatments with `gbm` and `cbps` methods. Thanks to @Blanch-Font for the suggestion.

- Other bug fixes and minor changes.
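A sketch of retrieving the fitted model with `include.obj`; the data are simulated, and the stored component is assumed to be named `obj` as in the package documentation:

```r
library("WeightIt")

set.seed(123)
d <- data.frame(treat = rbinom(100, 1, .5), x1 = rnorm(100))

W <- weightit(treat ~ x1, data = d, method = "ps",
              include.obj = TRUE)

# Inspect the propensity score model (a glm object) stored in the output
summary(W$obj)
```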
- Added `trim()` function to trim weights.

- Added `ps.cont()` function, which estimates generalized propensity score weights for continuous treatments using generalized boosted modeling, as in `twang`. This function uses the same syntax as `ps()` in `twang` and can also be accessed using `weightit()` with `method = "gbm"`. Support functions were added to make it compatible with `twang` functions for assessing balance (e.g., `summary()`, `bal.table()`, `plot()`). Thanks to Donna Coffman for enlightening me about this method and providing the code to implement it.

- The input formula is now much more forgiving, allowing objects in the environment to be included. The `data` argument to `weightit()` is now optional. To simplify things, the output object no longer contains a `data` field.

- Under-the-hood changes to facilitate adding new features and debugging. Some aspects of the output objects have been slightly changed, but it shouldn't affect use for most users.
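A minimal sketch of trimming the weights in a fitted `weightit` object; the data are simulated and the 90th-percentile cutoff is an arbitrary choice for illustration:

```r
library("WeightIt")

set.seed(123)
d <- data.frame(treat = rbinom(200, 1, .5), x1 = rnorm(200))
W <- weightit(treat ~ x1, data = d, method = "ps")

# Trim (winsorize) the weights at their 90th percentile
W.trim <- trim(W, at = .9)
summary(W.trim)
```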
- Fixed a bug where variables would be thrown out when `method = "ebal"`.
- Added new `moments` and `int` options for some `weightit()` methods to easily specify moments and interactions of covariates.

- Fixed a bug when using objects not in the data set in `weightit()`. Behavior has changed to include transformed covariates entered in the formula in the `weightit()` output.

- Fixed a bug resulting from potential collinearity when using `ebal` or `ebcw`.

- Added a vignette.
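As an illustration of the `moments` and `int` options with one of the methods that supports them (entropy balancing here; the data are simulated):

```r
library("WeightIt")

set.seed(123)
d <- data.frame(treat = rbinom(200, 1, .5),
                x1 = rnorm(200), x2 = rnorm(200))

# Balance means, second moments, and the pairwise interaction of x1 and x2
W <- weightit(treat ~ x1 + x2, data = d, method = "ebal",
              moments = 2, int = TRUE)
summary(W)
```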
- Edits to code and help files to protect against a missing `CBPS` package.

- Corrected sampling weights functionality so they work correctly. Also expanded sampling weights to be usable with all methods, including those that do not natively allow for sampling weights (e.g., `ATE`).

- Minor bug fixes and spelling corrections.
- Added `weightitMSM()` function (and supporting `print()` and `summary()` functions) to estimate weights for marginal structural models with time-varying treatments and covariates.

- Fixed some bugs, including when using CBPS with continuous treatments and when using `focal` incorrectly.
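A sketch of the `weightitMSM()` interface, with one treatment-model formula per time point; the variable names and data are invented for illustration:

```r
library("WeightIt")

set.seed(123)
d <- data.frame(x0 = rnorm(150),
                a1 = rbinom(150, 1, .5),
                x1 = rnorm(150),
                a2 = rbinom(150, 1, .5))

# One formula per time point, in temporal order; each conditions on the past
Wmsm <- weightitMSM(list(a1 ~ x0,
                         a2 ~ x1 + a1 + x0),
                    data = d, method = "ps")
summary(Wmsm)
```

The final weight for each unit is the product of its time-specific weights.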
- Added `method = "sbw"` for stable balancing weights (now removed and replaced with `method = "optweight"`).

- Allowed for estimation of multinomial propensity scores using multiple binary regressions if `mlogit` is not installed.

- Allowed for estimation of multinomial CBPS using multiple binary CBPS for more than 4 groups.

- Added README and NEWS.

- First version!