The goal is to allow parameters (possibly from different classes) to be transformed before they are used as arguments to distributions. For example, linear combinations of normally-distributed parameters can still be used as the mean of a Gaussian observation. This may be useful in cases where we want to model multiple causes; for example, the rent of an apartment may be normally distributed around a linear combination of parameters representing the effects of (1) the apartment's location, (2) the apartment's size, (3) the apartment's landlord, etc.
There are (at least) two general strategies we could take for supporting this:
We could maintain, as global state during inference, a mean vector and covariance matrix for the multivariate Gaussian posterior over all MeanParameters in a program, updating them as necessary when values are observed or unobserved. Then, we could resample all MeanParameters jointly in a single blocked Gibbs update. This is probably the best approach from an inference perspective, as it fully takes into account all posterior correlations between the variables. I haven't yet worked out the math for what the update rules would be, or how expensive they'd be. A useful reference would be Dario Stein's recent paper, which describes the implementation of a language with Gaussian variables and affine transformations that supports exact conditioning, and uses the "maintain a covariance matrix and mean vector" approach: https://arxiv.org/pdf/2101.11351.pdf
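Conditioning the maintained multivariate Gaussian on an observation is the standard conjugate update for a linear-Gaussian observation. Here is a minimal sketch in NumPy; the `condition_gaussian` helper and the specific numbers are illustrative, not part of any existing API, and the example assumes two MeanParameters with standard-normal priors:

```python
import numpy as np

def condition_gaussian(mu, Sigma, A, R, y):
    """Condition N(mu, Sigma) on the linear observation y = A x + eps,
    eps ~ N(0, R). Returns the posterior mean vector and covariance
    matrix, which would replace the maintained global state."""
    S = A @ Sigma @ A.T + R              # innovation covariance
    K = Sigma @ A.T @ np.linalg.inv(S)   # gain
    mu_post = mu + K @ (y - A @ mu)
    Sigma_post = Sigma - K @ A @ Sigma
    return mu_post, Sigma_post

# Toy model: x1, x2 ~ N(0, 1) independently; observe N(x1 + x2, 0.5) == 3.0.
mu = np.zeros(2)
Sigma = np.eye(2)
A = np.array([[1.0, 1.0]])        # the affine transformation x1 + x2
R = np.array([[0.5 ** 2]])        # observation noise variance
mu_post, Sigma_post = condition_gaussian(mu, Sigma, A, R, np.array([3.0]))
# By symmetry, both posterior means equal 3 / 2.25, and the observation
# induces negative posterior correlation between x1 and x2.
```

The same formulas handle any affine transformation of the parameters; each observed value costs one solve against the (typically small) innovation covariance, while unobserving a value would need a corresponding downdate or a rebuild from the priors.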
We could perform individual Gibbs updates separately on each MeanParameter. Then, when observing that N(x1+x2, sigma) == y, we treat it as an observation of x1 equal to y-x2 when updating x1, and as an observation of x2 equal to y-x1 when updating x2. This requires fewer changes to the current architecture, at the cost of possibly worse inference: more Gibbs sweeps are needed to converge to the same local posterior that the blocked update would have sampled from directly.
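The single-site scheme can be sketched the same way. The hypothetical `gibbs_step` below (all names and numbers are illustrative) resamples each parameter in the toy model N(x1 + x2, sigma) == y from its normal-normal conditional, using y minus the other parameter as the pseudo-observation:

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_step(x, y, sigma, prior_mu, prior_sd):
    """One sweep of single-site Gibbs for the model N(x1 + x2, sigma) == y.
    Updating x[i] treats (y - x[other]) as a direct noisy observation of
    x[i], so the usual normal-normal conjugate update gives the
    conditional mean and precision to sample from."""
    for i in range(2):
        pseudo_obs = y - x[1 - i]
        prec = 1.0 / prior_sd[i] ** 2 + 1.0 / sigma ** 2
        mean = (prior_mu[i] / prior_sd[i] ** 2 + pseudo_obs / sigma ** 2) / prec
        x[i] = rng.normal(mean, np.sqrt(1.0 / prec))
    return x

# Same toy model as before: x1, x2 ~ N(0, 1); observe N(x1 + x2, 0.5) == 3.0.
x = np.zeros(2)
samples = []
for t in range(5000):
    x = gibbs_step(x, y=3.0, sigma=0.5, prior_mu=[0.0, 0.0], prior_sd=[1.0, 1.0])
    if t >= 1000:                 # discard burn-in sweeps
        samples.append(x.copy())
samples = np.asarray(samples)
# The sample means of x1 and x2 should each approach 3 / 2.25, matching
# the blocked update's exact posterior.
```

With observation noise comparable to the prior scale this mixes quickly; as sigma shrinks, the posterior correlation between x1 and x2 approaches -1 and the chain needs many more sweeps, which is exactly the inference cost noted above relative to the blocked update.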