Alternative encoding for must-sets #533

ThomasHaas · 2023-10-12T12:54:40Z

Context: We encode must-edges of a relation r via r(x, y) <=> exec(x) /\ exec(y). This is due to the fact that must-sets are conditional lower-bounds of the relation. It would likely result in a performance gain, if we could somehow justify an unconditional encoding r(x,y) <=> true in which case the relation variable can be removed altogether from the encoding.
Obviously, this is not generally sound. Consider the following program and the relation let fence = po;[F];po

e1: W(x, v);
if (...) {
   e2: Fence;
}
e3: R(y);

We have that all po-edges are in the must-set of po, and (e2, e2) is in the must-set of [F].
If we encode po;[F];po(e1, e3), we get (modulo an intermediate relation) po;[F];po(e1, e3) <=> po(e1, e2) /\ [F](e2, e2) /\ po(e2, e3), which would (wrongly) simplify to po;[F];po(e1, e3) <=> true with unconditional must-edges. The encoding would say that e1 and e3 are always separated by a fence, even if the fence was not executed.
Notice, however, that setting po(e1, e2) <=> true and po(e2, e3) <=> true alone does not cause any problems!

Solution idea: The above example shows that "spurious edges" can cause problems but they do not necessarily have to.
Observe that a spurious edge can only affect the verdict of the CAT-model if it is directly or indirectly (over derived relations) observable by the CAT-model's axioms.
If we encoded our axioms to ignore edges of non-executed events, e.g., if we encoded empty(r) = forall x,y: exec(x) /\ exec(y) => not r(x,y) then the axioms would not be affected by any spurious edge of non-executed events, so we can just set them to true.
This does not fix the above problem, because the edge po;[F];po(e1, e3) is also spurious but only involves executed events (assuming e2 was not executed).
So we need to prevent edges of non-executed events to affect edges of executed events. This can only occur in relation operators that are able to project away (non-executed) events, i.e, composition, transitive closure, range, and domain.
So all we need to do is change the encoding of those to make sure the intermediate events they reason over were indeed executed:
(a;b)(x, y) <=> exists z: a(x, z) /\ exec(z) /\ b(z, y) instead of just (a;b)(x, y) <=> exists z: a(x, z) /\ b(z, y).

One may think that the extra conditions in the axioms and the few operators counteract the gain of being able to make the encoding of must-edges unconditional. However, I believe we can actually optimize away most of these extra checks to get a striclty better encoding.
Consider, for example, empty(r) and an edge (x,y). If (x,y) is in the must-set, then the new encoding would yield exec(x) /\ exec(y) => false which is equivalent to not (exec(x) /\ exec(y)). The old encoding would have encoded not r(x,y) == not (exec(x) /\ exec(y)) directly (i.e. we get an identical encoding!). If (x,y) is not in the must-set, then we get exec(x) /\ exec(y) => not r(x,y) as opposed to just not r(x,y). However, we can still retain the invariant that for unknown edges, we have r(x,y) => exec(x) /\ exec(y), making exec(x) /\ exec(y) => not r(x,y) equivalent to not r(x, y) again!

I think it is easy to implement this and see if it gives any improvements. I don't expect too much of a benefit though, because I think our encoding already optimizes w.r.t. must-edges in many places.

The text was updated successfully, but these errors were encountered:

ThomasHaas · 2023-10-14T10:53:22Z

I tried implementing this, because of how simple it is. However, I noticed that our encoding procedure is quite inconsistent w.r.t. handling must-edges:

            EncodingContext.EdgeEncoder e0 = context.edge(rel);
            encodeSets.get(rel).apply((e1, e2) -> {
                BooleanFormula edge = e0.encode(e1, e2); // This call already returns "exec(e1) /\ exec(e2)" for must-edges.
                if (k.getMustSet().contains(e1, e2)) {
                    enc.add(bmgr.equivalence(edge, execution(e1, e2))); // This equivalence becomes trivially true "exec(e1) /\ exec(e2) <=> exec(e1) /\ exec(e2)"

Code as the above exists many times, i.e., we have many places where the EdgeEncoder already optimizes away the edge variable but the encoding still emits an (unnecessary) equivalence for the optimized variable.

While this does not cause any problems right now, the code is only correct because both the EdgeEncoder and the relation encoding agree on their optimization of must-edges.
If, for example, we define EdgeEncoder.encode(e1, e2) to return true for must-edges, the code becomes wrong because the relation definition still requires this to match with exec(e1) /\ exec(e2), i.e., we get true <=> exec(e1) /\ exec(e2).

We should choose one source of optimization: either the EdgeEncoder already returns optimized variables or the relation encoding is responsible for optimizing the variables, but not both.

ThomasHaas added the enhancement label Oct 12, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Alternative encoding for must-sets #533

Alternative encoding for must-sets #533

ThomasHaas commented Oct 12, 2023

ThomasHaas commented Oct 14, 2023 •

edited

Loading

Alternative encoding for must-sets #533

Alternative encoding for must-sets #533

Comments

ThomasHaas commented Oct 12, 2023

ThomasHaas commented Oct 14, 2023 • edited Loading

ThomasHaas commented Oct 14, 2023 •

edited

Loading