We have decided not to enforce SSA form in our IR; however, this can cause worse encodings.
Whenever we encode a read access of register r, we need to iterate over all possible (last) assignments to r, of which there can be O(n) many. With O(n) registers, this can yield an encoding of size O(n^2).
This is expected, because the size of a use-def graph can be quadratic for non-SSA programs.
However, it is known to be only linear for SSA programs.
Consider the following program:
int i;
for (i = 0; i < N; i++) {
if (__VERIFIER_nondet_bool()) {
break;
}
}
int square = 0;
for (int j = 0; j < N; j++) {
square += i;
}
In the second loop, each read of i may read N-many different values (one from each iteration of the first loop), so the encoding is quadratic.
Introducing a temporary variable int temp = i; // ~PHI node after the first loop and reading that variable in the second loop instead will produce a linear encoding, since each read of temp has a single possible source.
Since our parsed programs are originally in SSA form, one might think that only loop unrolling produces non-SSA parts of the program (which are rather small, given the usual loop bounds). However, we also do copy propagation, which actually destroys SSA form by eliminating phi nodes (it would remove the int temp = i; assignment from the above example).
So I suggest that we try to recover an SSA-like form after all processing (so that SCCP cannot destroy it again) by inserting simple register copy events (r_phi := r;) that act as phi nodes.
Edit: I quickly checked how many opportunities we have for this optimization, and the numbers are not really high.
Even in qspinlock there are just around 9 cases per thread, most of which I believe have to do with the non-deterministic CAS. Linux RCU had even fewer cases. Our usual LFDS benchmarks have 1-2 cases per thread (safe_stack is the biggest outlier with 8 cases per thread!).
The largest benchmark was ck_epoch with 33 cases per thread.
I believe this is because ck_epoch has a function that searches for an element in a data structure (via a loop) and returns it. The value of the return register could come from any iteration, so every read access to that register has many possible source events.
In conclusion, it seems that the room for optimization is low, and one would only expect benefits for benchmarks that search in data structures.
In case we try to implement this, safe_stack seems to be the best choice among our unit tests to measure the impact.