Matching strategy for clusters that don't include both treatment groups - multilevel matching #188
Comments
That's a great question. I can think of an ad hoc workaround that is fairly straightforward to implement but requires some manual coding. Essentially, you do regular matching but put a large penalty on any between-cluster matches. You implement the penalty by adding a large positive number to the entries of a distance matrix that correspond to pairs of units in different clusters. That way, between-cluster matches occur only if a within-cluster match is impossible (e.g., because no units are left or all remaining units are excluded by a caliper or other constraint). You also need to match in order of closeness, i.e., by setting m.order = "closest".
Here is how you might implement this using propensity score matching.
library(MatchIt)
#Compute PS
ps <- glm(A ~ X1 + X2 + cluster, data = data, family = binomial)$fitted.values
#Compute PS distance (rows are treated units, columns are control units)
dist <- euclidean_dist(A ~ ps, data = data)
#Create penalty matrix (entries > 0 mark pairs in different clusters)
cluster_dist <- euclidean_dist(A ~ cluster, data = data)
#Apply penalty matrix
dist[cluster_dist > 0] <- dist[cluster_dist > 0] + 100 * max(dist)
#Do matching
m <- matchit(A ~ X1 + X2 + cluster, data = data,
             distance = dist, m.order = "closest")
#Find which treated units received matches outside their cluster
rownames(m$match.matrix)[cluster_dist[cbind(rownames(m$match.matrix), m$match.matrix[,1])] > 0]
Setting the penalty to |
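To make the effect of the penalty concrete, here is a minimal base R sketch on made-up numbers. The unit names, propensity scores, and cluster labels are all hypothetical, and it only looks up the nearest control for each treated unit rather than running matchit(), so it illustrates the penalty idea and not the full matching procedure.

```r
# Hypothetical propensity scores and cluster labels for 3 treated and 3 control units
treated_ps      <- c(t1 = 0.30, t2 = 0.60, t3 = 0.50)
control_ps      <- c(c1 = 0.35, c2 = 0.31, c3 = 0.58)
treated_cluster <- c(t1 = "A", t2 = "B", t3 = "C")  # cluster C has no controls
control_cluster <- c(c1 = "A", c2 = "B", c3 = "A")

# Treated-by-control matrix of absolute propensity score differences
d <- abs(outer(treated_ps, control_ps, "-"))

# TRUE where the treated and control unit are in different clusters
different <- outer(treated_cluster, control_cluster, "!=")

# Add a penalty far larger than any real distance to between-cluster pairs
d_pen <- d
d_pen[different] <- d_pen[different] + 100 * max(d)

# Nearest control for each treated unit, without and with the penalty
nearest_raw <- colnames(d)[apply(d, 1, which.min)]
nearest_pen <- colnames(d_pen)[apply(d_pen, 1, which.min)]
print(rbind(nearest_raw, nearest_pen))
```

Without the penalty, t1 and t2 are each closest to a control in another cluster (c2 and c3). With the penalty, they switch to the controls in their own clusters (c1 and c2), while t3, whose cluster has no controls, still falls back to its nearest between-cluster control (c3).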
Thank you for this beautiful solution, Noah! Note that, for some reason, the code to find which treated units received matches outside their cluster does not work. It just produces a matrix of NAs, regardless of whether I run the code on lalonde or on my own test dataset. The rest of it works perfectly. Here is the test data I'm using: And your code using the variable names in the test dataset. The covariates included in the model below are just for testing purposes. The cluster variable indicating hospital is called 'DAG'.
Glad it worked! Change the |
Is there a workaround to get matchit to preferentially match within cluster and to find a match outside the cluster if one does not exist within? (Something similar to Cannas and Arpino (2019) CMatching "hybrid matching" that is no longer supported.)
My example,
I'm evaluating the effect of an intervention (treatment) applied to patients (subjects) in hospitals (clusters or groups). In more than a third of hospitals, either all or none of the patients were exposed to the intervention. Strict within-cluster matching options require me to subset (i.e., exclude) a large section of the study population.
I can group hospitals by hospital-level covariates to increase cluster size, but I was hoping there may be a more elegant approach to this problem that is common in my field.