Fully fledged marginal distributions #145
joelberkeley started this conversation in Ideas
I'm thinking of changing the model API to more accurately represent the notion of a probabilistic model, and to enable more powerful marginal distributions. Specifically, a `ProbabilisticModel` would have a marginal distribution which can be more than just a mean and covariance.

The starting point of this exploration is a `Distribution` type that has a mean and covariance, and that we can sample from (a sketch follows the assumptions below):

ASSUMPTION all joint distributions can (at least in theory) be sampled from, and have a mean and a covariance
ASSUMPTION we can fully express GPflow models with this
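A minimal sketch of what this starting point might look like. This is illustrative only: the class names, the method names (`mean`, `covariance`, `sample`) and the use of plain `tf.Tensor`s are assumptions, not a settled API.

```python
# Illustrative sketch only - names and signatures are assumptions, not a final API.
from abc import ABC, abstractmethod

import tensorflow as tf


class Distribution(ABC):
    """A joint distribution over function values, with a mean and covariance."""

    @abstractmethod
    def mean(self) -> tf.Tensor:
        """The mean of the distribution."""

    @abstractmethod
    def covariance(self) -> tf.Tensor:
        """The (full) covariance of the distribution."""

    @abstractmethod
    def sample(self, num_samples: int) -> tf.Tensor:
        """Draw ``num_samples`` samples from the distribution."""


class ProbabilisticModel(ABC):
    """A model whose predictions are full marginal distributions."""

    @abstractmethod
    def predict(self, query_points: tf.Tensor) -> Distribution:
        """The marginal distribution of the model at ``query_points``."""
```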
There are several things to note here:

- `ProbabilisticModel` classes can return whatever particular distribution type they want: Gaussian, Bernoulli etc.
- if we want to use existing `tfp.distributions.Distribution`s, we can, but we must wrap them

Sometimes we wish to know what kind of marginal distribution we have. If we're writing an algorithm that only works with Gaussian marginals, it would help to be able to say that. And if we have defined extra methods on our marginal, it would help to know we can use them. We can do this by parametrizing `ProbabilisticModel` over the distribution type, such that `ProbabilisticModel[MultivariateNormal].predict` returns a multivariate normal distribution (see the combined sketch under "Bringing these together" below).

ASSUMPTION TensorFlow Probability's distributions don't offer everything we need
ASSUMPTION TensorFlow Probability's `Distribution` is prohibitively large to implement custom distributions

It also seems unnecessary that we must wrap a `tfp.distributions.Distribution` to use it. At the same time, we do want to be able to implement our own distributions (e.g. to use GPflow's sampling algorithms), and for these distributions to be easy to implement (`tfp.distributions.Distribution` is a very large interface). We can do this by making `Distribution` a `Protocol`* and aligning it with a subset of `tfp.distributions.Distribution`. This means anything with the same API as a `Distribution` is a distribution, whether it inherits from it or not.

Bringing these together:
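Here's a rough sketch of how these pieces might fit together, under the assumptions above. The protocol methods, the `TypeVar` bound and the example model are all illustrative choices rather than a settled design; the point is that a `tfp` distribution satisfies the protocol structurally, with no wrapping, while `ProbabilisticModel` is generic in the type of marginal it returns.

```python
# Illustrative sketch only - the protocol methods and the generic parametrization
# are assumptions about the eventual design, not a final API.
from abc import ABC, abstractmethod
from typing import Generic, Protocol, TypeVar

import tensorflow as tf
import tensorflow_probability as tfp


class Distribution(Protocol):
    """Structural type: anything with these methods is a Distribution, whether or
    not it inherits from this class. The methods are a subset of the interface of
    tfp.distributions.Distribution."""

    def mean(self) -> tf.Tensor:
        ...

    def covariance(self) -> tf.Tensor:
        ...

    def sample(self, sample_shape: int) -> tf.Tensor:
        ...


D = TypeVar("D", bound=Distribution)


class ProbabilisticModel(ABC, Generic[D]):
    """A model parametrized over the type of marginal distribution it predicts."""

    @abstractmethod
    def predict(self, query_points: tf.Tensor) -> D:
        """The marginal distribution of the model at ``query_points``."""


class DummyGaussianModel(ProbabilisticModel[tfp.distributions.MultivariateNormalTriL]):
    """A dummy model whose marginals are statically known to be multivariate normal.

    tfp.distributions.MultivariateNormalTriL already has mean, covariance and sample,
    so it conforms to the Distribution protocol without any wrapping.
    """

    def predict(self, query_points: tf.Tensor) -> tfp.distributions.MultivariateNormalTriL:
        # Dummy marginal (zero mean, identity covariance), for illustration only.
        num_points = tf.shape(query_points)[0]
        return tfp.distributions.MultivariateNormalTriL(
            loc=tf.zeros([num_points], dtype=query_points.dtype),
            scale_tril=tf.eye(num_points, dtype=query_points.dtype),
        )
```

An algorithm that only works with Gaussian marginals can then ask for a `ProbabilisticModel[MultivariateNormalTriL]` (or similar), and the type checker knows what `predict` returns.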
QUESTION sometimes we don't know how to sample from our distribution, e.g. with Bayesian inference with non-trivial likelihoods, and we may also not be able to provide a covariance for our marginal.

I'm undecided about how to approach this, but for the former, we could move `sample` to a subclass, and not require it on all marginals (a rough sketch of this option follows). However, until we have intersection types in Python, I'm wary that this will lead to a complex inheritance hierarchy.
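One possible shape for the subclass option, sketched in terms of the protocol-based design above. The name `SampleableDistribution` is invented here for illustration:

```python
# Illustrative sketch only: sampling becomes an optional capability rather than
# part of every marginal. Names here are invented for the example.
from typing import Protocol

import tensorflow as tf


class Distribution(Protocol):
    """The minimal marginal: a mean and covariance, with no sampling required."""

    def mean(self) -> tf.Tensor:
        ...

    def covariance(self) -> tf.Tensor:
        ...


class SampleableDistribution(Distribution, Protocol):
    """A marginal that we additionally know how to sample from."""

    def sample(self, sample_shape: int) -> tf.Tensor:
        ...
```

An algorithm that needs samples could then require a `ProbabilisticModel[SampleableDistribution]`, while one that only needs moments could accept any `ProbabilisticModel[Distribution]`. Each extra capability (sampling, covariance, ...) would need its own subtype, though, which is where the worry about a complex hierarchy in the absence of intersection types comes in.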
NOTE bounding to `tfp.distributions.Distribution` would not allow for distributions that we can't sample from.

* A `Protocol` is a formal way of specifying structural sub-typing, or duck-typing. Any class with the same methods and attributes as a particular protocol conforms to that protocol, and wherever the protocol is expected, such as a function parameter, the class can be used.
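For concreteness, a tiny, self-contained example of a protocol in action (the names are made up for the example):

```python
# Toy example of structural sub-typing with typing.Protocol (names invented here).
from typing import Protocol, Sequence


class HasArea(Protocol):
    def area(self) -> float:
        ...


class Square:  # note: does not inherit from HasArea
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


def total_area(shapes: Sequence[HasArea]) -> float:
    return sum(shape.area() for shape in shapes)


print(total_area([Square(2.0), Square(3.0)]))  # Square conforms structurally: prints 13.0
```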