Fix #12266 (Performance: valueFlowCondition slowdown) #5837
Conversation
lib/forwardanalyzer.cpp
Outdated
```cpp
@@ -646,6 +646,9 @@ namespace {
            }
        } else if (tok->isControlFlowKeyword() && Token::Match(tok, "if|while|for (") &&
                   Token::simpleMatch(tok->next()->link(), ") {")) {
            const int branchCount = analyzer->countBranch();
            if (settings.checkLevel == Settings::CheckLevel::normal && branchCount > 4)
```
We should generate a message that informs the user of this so it is clear that the exhaustive mode needs to be used for the full analysis.
We should also make the threshold configurable.
yes
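Not part of the actual patch, but here is a minimal, self-contained sketch of what those two suggestions could look like together: a configurable threshold instead of the hard-coded 4, plus a one-time informational message when the normal check level skips the deeper analysis. The class and option names are illustrative, not cppcheck's real API:

```cpp
#include <cstdio>

struct SketchSettings {
    enum class CheckLevel { normal, exhaustive };
    CheckLevel checkLevel = CheckLevel::normal;
    int maxBranchCount = 4;   // hypothetical option; the real name/flag would differ
};

// Returns true if forward analysis of this if/while/for should bail out.
bool bailOutOnBranch(const SketchSettings &settings, int branchCount, bool &informed)
{
    if (settings.checkLevel == SketchSettings::CheckLevel::normal &&
        branchCount > settings.maxBranchCount) {
        if (!informed) {
            // One message per run so the user knows the analysis is partial.
            std::printf("information: limited value flow analysis; "
                        "use the exhaustive check level for full analysis\n");
            informed = true;
        }
        return true;
    }
    return false;
}

int main()
{
    SketchSettings settings;
    bool informed = false;
    for (int branches = 1; branches <= 6; ++branches)
        std::printf("branches=%d bail=%d\n", branches,
                    bailOutOnBranch(settings, branches, informed));
}
```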
I wonder if this is the same issue I saw with 4d9e69e.
lib/analyzer.h
Outdated
```cpp
    virtual ~Analyzer() = default;
    Analyzer(const Analyzer&) = default;
protected:
    Analyzer() = default;
private:
    int mBranchCount = 0;
```
This should go into the ForwardTraversal class instead.
Okay, fixed by c517458.
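For illustration, a minimal sketch of the idea behind that fix: the branch counter lives in the traversal object, so it is naturally scoped to one forward pass instead of being carried in the shared Analyzer base class. The names are illustrative, not the actual cppcheck code:

```cpp
#include <iostream>

struct ForwardTraversalSketch {
    int branchCount = 0;          // counts if/while/for branches seen in this pass

    // Called whenever the traversal enters a new branch.
    int countBranch() { return ++branchCount; }
};

int main()
{
    ForwardTraversalSketch traversal;   // a fresh counter per forward pass
    for (int i = 0; i < 5; ++i)
        std::cout << "branch #" << traversal.countBranch() << '\n';
}
```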
I do think 190 FNs seems like a lot, especially if these are real bugs. That's roughly a 35% chance of getting a FN in a package, which seems kind of high.
Another measure I think would be interesting is how many extra warnings you get with the extra analysis. Imagine there were 100 thousand warnings in total in those 537 packages; then the exhaustive analysis will in general only give you 0.2% extra warnings for quite a substantial CPU penalty. I don't have the exact numbers, but I think it would be interesting to write and run a script that creates such stats.
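As a rough back-of-the-envelope version of those two figures, using the ~190 extra findings and ~537 packages mentioned in this thread and the hypothetical total of 100 000 warnings:

$$\frac{190}{537} \approx 0.35 \qquad\qquad \frac{190}{100\,000} \approx 0.002 = 0.2\%$$

That is, roughly a 35% chance of at least one missed finding per package (if they were evenly spread), but only about 0.2% of all reported warnings.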
I modified the test-my-pr.py script and here are some preliminary stats:
I will also publish the modified script later. I have all the results, both normal and exhaustive, in case anybody wants to dig deeper and make some more stats.
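The modified script itself is not shown here, but as an illustration of the kind of per-id statistics being discussed, here is a rough, hypothetical C++ sketch that compares two plain-text result files (one finding per line, message id in square brackets at the end, as in cppcheck's text output) and prints the per-id difference. The file format and program name are assumptions, not what test-my-pr.py actually does:

```cpp
#include <fstream>
#include <iostream>
#include <map>
#include <string>

// Count findings per message id, e.g. "... [nullPointer]" -> nullPointer.
static std::map<std::string, int> countIds(const char *filename)
{
    std::map<std::string, int> counts;
    std::ifstream f(filename);
    std::string line;
    while (std::getline(f, line)) {
        const auto open = line.rfind('[');
        const auto close = line.rfind(']');
        if (open != std::string::npos && close != std::string::npos && close > open)
            ++counts[line.substr(open + 1, close - open - 1)];
    }
    return counts;
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        std::cerr << "usage: diffstats normal.txt exhaustive.txt\n";
        return 1;
    }
    const auto normal = countIds(argv[1]);
    const auto exhaustive = countIds(argv[2]);

    // Print how many findings of each id are reported only with exhaustive analysis.
    for (const auto &idCount : exhaustive) {
        const auto it = normal.find(idCount.first);
        const int delta = idCount.second - (it == normal.end() ? 0 : it->second);
        if (delta != 0)
            std::cout << idCount.first << ": " << delta << " extra in exhaustive\n";
    }
    return 0;
}
```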
That's 5% for warnings and 2% for errors. Another measurement is the distribution across packages, e.g. whether 90% of the FNs are in, say, 5% of the packages. Also, what about tweaking the threshold? Would
No, wait: it is 0.5% for warnings and 0.2% for errors.
I have only tried 4. I will try 5 and see what happens.
I am running the comparison script right now. Current stats, with the "4" changed to "5":
It's not the same packages as yesterday unfortunately, so the stats can't be compared directly.

Here are some timing measurements for some files:

lib/tokenize.cpp: exhaustive=2m29s normal4=10.96s normal5=13.11s

I got p1.c-p6.c from a customer. Increasing the value from 4 to 5 seems to increase the analysis time by ~20% in these files. That's still considerably faster than the exhaustive mode, but the payoff is not that huge; imho it should be possible to detect more bugs (though not these particular bugs) with less CPU penalty.
So thinking about this more, this could cause FPs, because we rely on the lifetime being propagated to do better alias analysis. If the lifetime analysis bails out at 4 conditions, it could lead to us not recognizing that a variable was modified through an alias. Something like this:

```cpp
int f(int a, bool b, bool c, bool d, int e) {
    int* g = &a;       // g aliases a
    if (b)
        return 0;
    if (c)
        return 0;
    if (d)
        return 0;
    if (e < 1)
        return 0;
    if (a != 0)
        return 0;
    *g = 1;            // a is set to 1 through the alias, after the 5th condition
    return d / a;      // FP divide by zero
}
```

But I haven't tested this yet. Are there FPs in the run? If there are FPs, then the FNs would actually be higher.
OK, so here are the stats for each ID:
There seem to be at least extra const*, cstyleCast, duplicateCondition, identicalInnerCondition, legacyUninitVar and unreadVariable warnings in the "normal" checking. I will look into these.
I suspect there are some effects of
I see a number of such diffs:
meaning there is no FN/FP, just fewer details.
For information, here is an example of code where we get a
2823f17 to 348cdac
The results look acceptable and give a significant speedup. I'll apply this.
I published #6025 to add the missing message about the disabled part of the valueflow.
The tests regarding false negatives/positives are currently not representative, because the analysis pass is broken as soon as an incomplete variable is detected - see https://trac.cppcheck.net/ticket/12526. They need to be conducted again after #6153 has been merged. They might also change with each addition to the library configurations. So the generated statistics are just a current snapshot, and the results could be quite different, especially if we fix some of the top-ranking
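For readers unfamiliar with the term, here is a minimal, hypothetical illustration of an "incomplete variable": an identifier whose declaration cppcheck never sees, e.g. because of a missing library configuration. According to the ticket above, hitting such a token breaks the value flow pass, so later findings in the function can be lost:

```cpp
// UNKNOWN_LIMIT comes from a header/configuration that cppcheck does not see,
// so it is treated as incomplete and value flow analysis gives up early.
// (Deliberately not compilable on its own; it models code with missing declarations.)
int scale(int x)
{
    if (x > UNKNOWN_LIMIT)   // incomplete variable: no declaration available
        return 0;
    return 100 / x;          // a potential division by zero here may then go unreported
}
```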
The current diff values from daca:
Unless there were false positives fixed, we have over 5000 false negatives introduced by this.
I have executed test-my-pr.py on 532 packages
number of "main" warnings:
number of "your" warnings (genomicepidemiology-kma timed out without these changes):
=> There are some false negatives. But it's roughly ~190 in 532 packages, so it's not a major problem imho.
I can see a speedup. For the packages that took more than 1 second to analyze, the timing factor was: