ENH: Add an explicit dummy prior for predictive modeling #7372

Open
jessegrabowski opened this issue Jun 18, 2024 · 1 comment

@jessegrabowski
Member

Before

import pymc as pm
import numpy as np

with pm.Model() as m:
    mu = pm.Normal('mu')
    sigma = pm.Exponential('sigma', 1)
    obs = pm.Normal('obs', mu, sigma, observed=np.random.normal(size=(100,)))
    idata = pm.sample()

# Do a "predictive model", changing the mean
with pm.Model() as predictive_model:
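    mu = pm.Normal('new_mu', mu=10, sigma=1)  # new name, so drawn fresh from this prior
    sigma = pm.Flat('sigma')  # name matches idata, so values come from the posterior
    obs = pm.Normal('obs', mu, sigma)  # unobserved here; this is the variable being predicted
    idata_oos = pm.sample_posterior_predictive(idata, var_names=['obs'], predictions=True)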

After

import pymc as pm
import numpy as np

with pm.Model() as m:
    mu = pm.Normal('mu')
    sigma = pm.Exponential('sigma', 1)
    obs = pm.Normal('obs', mu, sigma, observed=np.random.normal(size=(100,)))
    idata = pm.sample()

# Do a "predictive model", changing the mean
with pm.Model() as predictive_model:
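    mu = pm.Normal('new_mu', mu=10, sigma=1)  # new name, so drawn fresh from this prior
    sigma = pm.FromIData('sigma')  # proposed API (name not final); does not exist in PyMC yet
    obs = pm.Normal('obs', mu, sigma)  # unobserved here; this is the variable being predicted
    idata_oos = pm.sample_posterior_predictive(idata, var_names=['obs'], predictions=True)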

Context for the issue:

It is extremely convenient to use pm.Flat as a "dummy distribution" when doing predictive modeling as shown above. This is because it has no random method, and thus ensures that an error will be raised by pm.sample_posterior_predictive if something goes wrong with name matching between the variables in idata and those declared in predictive_model. This is clearly an off-label use for pm.Flat, though. It's also not at all obvious why someone would want to do this without being in the know; the resulting code is not readable.

I propose adding a dummy distribution specifically for this purpose, one that makes it obvious to a reader which variables are being targeted for sampling from the idata and which are being given new values. I don't have any great ideas about the name, except to avoid pm.FromPosterior, which suggests to users that it could be used to do some kind of iterative sampling (like what pmx.prior_from_idata does).
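
Until such a distribution exists, the guarantee it would provide can be approximated by hand. This is only a rough pure-Python sketch of the name-matching check, not PyMC code; `check_from_idata` is a hypothetical helper introduced here for illustration.

```python
def check_from_idata(from_idata_names, posterior_names):
    """Raise if a variable expected to come from the trace is absent from it."""
    missing = sorted(set(from_idata_names) - set(posterior_names))
    if missing:
        raise KeyError(
            f"Variables marked as coming from idata are missing from the posterior: {missing}"
        )


# 'sigma' is reused from the trace; 'new_mu' gets a fresh prior, so it is not listed.
check_from_idata(["sigma"], ["mu", "sigma"])  # passes silently
```

With a real `InferenceData`, the second argument could be `idata.posterior.data_vars`. A misspelled name such as `"sgma"` would then fail loudly before any sampling happens.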

@jessegrabowski changed the title from "ENH: Add an explicit pm.FromIdata dummy prior for predictive modeling" to "ENH: Add an explicit dummy prior for predictive modeling" on Jun 18, 2024
@ricardoV94
Member

Sounds good, and the perform method can even raise a nice error message if a graph tries to evaluate it.
