Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --check-level=fast option that reuses much faster valueflow analysis from 1.90 #6097

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

danmar
Copy link
Owner

@danmar danmar commented Mar 7, 2024

A self check is completed much faster with --check-level=fast compared to --check-level=normal. For me:
linux: 5-6 times faster
windows: 10 times faster

Comparing warnings with --check-level=fast and --check-level=normal shows not a lot of differences.

@danmar
Copy link
Owner Author

danmar commented Mar 7, 2024

I ran --check-level=fast and --check-level=normal on 100 random packages. Total number of reports for each severity:

error:
1634 my_check_diff_fast.log
1665 my_check_diff_normal.log
warning:
5103 my_check_diff_fast.log
5112 my_check_diff_normal.log
style:
15629 my_check_diff_fast.log
15873 my_check_diff_normal.log
portability:
835 my_check_diff_fast.log
838 my_check_diff_normal.log
performance:
1038 my_check_diff_fast.log
1038 my_check_diff_normal.log

Assuming that all are true positives I think the fast results are pretty good. I assume that the fast analysis should have a good noise ratio but I will check it!

@danmar danmar marked this pull request as draft March 7, 2024 17:12
@danmar
Copy link
Owner Author

danmar commented Mar 7, 2024

Refactorings will be needed before I merge this to main. I feel that ideally there should not be lots of copy/pasted code between valueflow.cpp and valueflowfast.cpp. I want to reuse fast valueFlow.. functions in valueflow.cpp it's only the slow forward/reverse analysis that should be replaced.

@firewave
Copy link
Collaborator

firewave commented Mar 7, 2024

I have not looked at the changes yet but regardless here are some of my views on this option: #6025 (comment).

@danmar
Copy link
Owner Author

danmar commented Mar 7, 2024

ok --check-level=fast must be used explicitly by the user. It is not "fast" by default. Therefore in my opinion we should not write warnings that the analysis is fast and that there are slower options available.

@firewave
Copy link
Collaborator

IMO it is utter madness to have two different implementations of something as it means we have to provide performance and quality for both of them. And as we are just about 4 people actively working on this and essentially just a single person working on the ValueFlow I think this is not something that can realistically be achieved.

This also means we need to duplicate all testing (including daca) and that just seems mental. Especially since we might not have the proper testing coverage. We might check that a false positive does not occur but not the negative test that we detect the issue. So if the code is somehow no longer being triggered without feedback we would have no indication that we would never get any output even if it regresses in the future.

If just yesterday is an indicator we should not be adding any major features or code at all as existing parts might not have been working for years (or ever) and we should rather try to have less code and less jobs.

I still haven't looked at the code but a note on ProgramMemory.
As the copies of it have a performance impact I looked into this several times and only did things on the level the language allows and not changing the implementation. An idea to potentially improve that was to introduce an overlay for the map so data doesn't need to be copied.

Also offering even more options to the user with less feedback makes the support much harder. It also requires all plugins/integrations to change so people have the possibility to configure it and restore the previous behavior. And one of the main advantages of Cppcheck compared to most static analyzers is the low configuration approach.

@firewave
Copy link
Collaborator

I wonder if a better approach would be to defer the ValueFlow execution until we actually need the values. But since the various passes dependent on each other I doubt that is possible. But as they are usually run on scopes maybe that could be used as the entrypoint for a different approach. I am not very familiar with it so I am not sure if that would be possible.

@danmar
Copy link
Owner Author

danmar commented Apr 12, 2024

IMO it is utter madness to have two different implementations of something as it means we have to provide performance and quality for both of them.

Right now there is plenty of copy/paste we can get rid of easily. When I have finished refactoring, I am guessing there will be something like 500-1000 lines of code in valueflowfast.cpp instead of 5000.

Two different implementations makes sense because there are very different goals. We can continue to develop the normal/exhaustive analysis to detect more bugs and that will not have significant effect on "fast" analysis time. I will not actively improve the fast analysis to detect more bugs, I envision there will mostly be bug fixes in that.

Technically it would be possible to fork Cppcheck repo and provide this "fast" analysis in a separate repo. I am not totally against that option.

And as we are just about 4 people actively working on this and essentially just a single person working on the ValueFlow I think this is not something that can realistically be achieved.

As I read this.. this is a question if there are resources for this. This is important for Cppcheck Solutions AB and we can provide resources. For information we have paid for several bug fixes in normal/exhaustive valueflow analysis.

@firewave
Copy link
Collaborator

I will provide a proper reply later - I have not been feeling too well this week so looking into non-trivial stuff has been challenging.

Just a collection of existing ideas in helping with this:

@firewave
Copy link
Collaborator

See neutrinolabs/xrdp#3037 for the mess we are currently in...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants