Performance of alias analysis #369
Comments
The C-code of the benchmark contains bit-fiddling on pointers, which our alias analysis cannot handle.
Our alias analysis will assume that loads/stores involving the returned pointer can address anything, due to the bit-fiddling on pointers.
I understand this will have an effect on the precision of the analysis, but is this also related to why running the analysis is so slow?
Let me correct what I said. While the bit-fiddling is problematic, the fact that we load a pointer from memory should already give us a pointer without any aliasing information, so the bit-fiddling doesn't make it worse, I guess. I don't know exactly how the performance correlates with the precision, since the analysis might also get faster due to not being able to derive anything. However, this code is interesting for the alias analysis in general.
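For readers unfamiliar with the pattern, here is a minimal sketch of what "bit-fiddling on pointers" typically looks like. It is not taken from the benchmark: the `node` type, the two-bit tag, and all function names are assumptions made for illustration. The point is that the pointer round-trips through an integer and is loaded from memory before use, so an analysis that tracks base-plus-offset addresses has little to go on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical node type. Its alignment is at least 4 on common
   platforms, which leaves the two low address bits free for a tag. */
struct node {
    struct node *next;
    int value;
};

#define TAG_MASK ((uintptr_t)0x3)

/* Pack a small tag into the low bits of an aligned pointer.
   The result is kept as an integer, not as a pointer. */
static uintptr_t tag_ptr(struct node *p, unsigned tag) {
    assert(((uintptr_t)p & TAG_MASK) == 0); /* pointer must be aligned */
    assert(tag <= TAG_MASK);
    return (uintptr_t)p | (uintptr_t)tag;
}

/* Recover the original pointer by masking the tag bits away. */
static struct node *untag_ptr(uintptr_t tagged) {
    return (struct node *)(tagged & ~TAG_MASK);
}

/* Extract the tag stored in the low bits. */
static unsigned get_tag(uintptr_t tagged) {
    return (unsigned)(tagged & TAG_MASK);
}

/* Load a tagged pointer from memory, strip the tag, and dereference.
   Both the load and the mask hide where the address came from. */
static int load_value(const uintptr_t *slot) {
    struct node *p = untag_ptr(*slot);
    return p->value;
}
```

Calling `load_value` on a slot holding `tag_ptr(&n, 1)` reads `n.value`, even though no pointer-typed value referring to `n` is ever stored in the slot.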
This benchmark has around 1200 memory addresses and I ran out of Java heap space while profiling, but I got some data from that.
The last issue is just that we store a fresh Edit: PR #374 addresses the performance loss in
@ThomasHaas you mentioned that you discussed (before the improvement that you already implemented) some potential improvements with René. Do those differ from what was already implemented? I.e., can we improve performance even further, or should we close this issue?
What I discussed with René is orthogonal to what I improved so far. Overall there are more optimizable points, but I don't mind closing this issue if you think it is good enough for now.
Right now, it is less blocking than it used to be, but I would not complain if we can get the times down. Since we already have ideas on how to improve this, let's keep this issue open and get those done.
@ThomasHaas can you add to this issue the llvm file of tree-RCU so we have a concrete test showing the problem?
Here is a reduced version of the example: rcu-test-alias.ll.zip. Edit: The alias analysis on the reduced version seems to take around 30 minutes on my machine (
The alias analysis did not finish overnight for @xeren do you have any idea why the performance is so bad in this example?
On what branch do you have that code?
Sorry (spell checker). This code
BTW ... both the field-sensitive and the field-insensitive analyses finish in under 1 second and seem to produce good results, e.g.
Due to the recent update to mem2reg, we get a lot of warnings regarding potentially uninitialized variables. EDIT: Actually, in this example, proving that we do not access uninitialized registers seems to be very hard. In fact, I don't understand how this code is correct.
if (count >= q.mask) {
    // == (count >= size-1)
    int newsize = (q.mask == 0 ? q.InitialSize : 2 * (q.mask + 1));
    assert(newsize < q.MaxSize);
    Obj *newtasks[STATICSIZE]; // This gets mem2reg'd
    int i;
    for (i = 0; i < count; i++) {
        int temp = (h + i) & q.mask;
        newtasks[i] = q.elems[temp];
    }
    for (i = 0; i < newsize; i++) {
        q.elems[i] = newtasks[i]; // This is only initialized by the above loop if "newsize <= count".
    }
Consider the entry condition
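To make the concern concrete, the index arithmetic of the two loops can be checked in isolation. The sketch below is not part of the benchmark. It assumes only what the excerpt shows: the first loop writes `newtasks[0..count)`, the second loop reads `newtasks[0..newsize)`, and `newsize` is computed as above. The function name and its parameters are hypothetical.

```c
#include <assert.h>

/* Returns the smallest index of newtasks that the second loop reads
   without the first loop having written it, or -1 if there is none.
   Models only the loop bounds from the excerpt, nothing else. */
static int first_uninitialized_read(int count, int mask, int initial_size) {
    assert(count >= 0 && mask >= 0 && initial_size >= 0);
    if (count < mask) {
        return -1; /* resize branch not taken */
    }
    int newsize = (mask == 0) ? initial_size : 2 * (mask + 1);
    /* Loop 1 writes [0, count); loop 2 reads [0, newsize). */
    return (count < newsize) ? count : -1;
}
```

Under these assumptions, a queue with `mask == 3` and `count == 3` enters the branch with `newsize == 8`, so indices 3 through 7 are read without having been written. Only when `count >= newsize` does the first loop cover everything the second loop reads.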
I agree it is not trivial to know if the code has UB (even with the patch below). However, the patch below fixes all the warnings and the alias analysis still seems to have problems
I haven't checked Dartagnan on your patch, but I would be surprised if the patch really changed much.
This program seems to contain negative weight cycles. This seems to put IBPA into an endless loop. The best fix appears to be setting
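As general background on why negative-weight cycles are dangerous for iterative propagation, the sketch below uses a textbook Bellman-Ford relaxation loop. It is not IBPA's algorithm, and the `edge` type and function name are assumptions for illustration. It shows that repeated relaxation never stabilizes when a negative cycle is reachable, and that bounding the number of rounds is one standard way to detect this instead of looping forever.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical weighted edge between nodes numbered 0..n-1. */
struct edge {
    int from;
    int to;
    int weight;
};

/* Relax all edges until nothing changes, for at most n rounds.
   Returns true if a fixpoint was reached, false if values were still
   decreasing after n rounds (a negative cycle is reachable from source).
   dist must have room for n entries. */
static bool relax_to_fixpoint(int n, const struct edge *edges, int m,
                              int source, long *dist) {
    for (int i = 0; i < n; i++) {
        dist[i] = LONG_MAX; /* LONG_MAX marks "not reached yet" */
    }
    dist[source] = 0;

    for (int round = 0; round < n; round++) {
        bool changed = false;
        for (int e = 0; e < m; e++) {
            long d = dist[edges[e].from];
            if (d == LONG_MAX) {
                continue; /* source of this edge not reached yet */
            }
            if (d + edges[e].weight < dist[edges[e].to]) {
                dist[edges[e].to] = d + edges[e].weight;
                changed = true;
            }
        }
        if (!changed) {
            return true; /* stable: no further updates possible */
        }
    }
    return false; /* an unbounded loop would never terminate here */
}
```

Without the round bound, the cycle 0 → 1 → 2 → 0 with weights 1, -3, 1 would lower every value by 1 on each pass and the loop would never exit.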
Does this mean that, "in theory", the default algorithm is not always better than the others?
IIRC, the old algorithms were just unsound in those cases and that is why they were "more precise".
Exactly. In terms of precision, "in theory" cannot really apply here. In practice, the fixed default is sometimes less precise. Version 1 ( Version 2 ( In Version 3 (
Our alias analysis is very slow when run on this program.
These are the times I get (running this requires GlobalSettings.ARCH_PRECISION = 64)