Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid some redundant checks in TokenList::readfile() #324

Merged
merged 1 commit into from
Oct 8, 2023

Conversation

firewave
Copy link
Collaborator

@firewave firewave commented Sep 28, 2023

calling isspace() is unnecessary since we handle all non-ASCII characters and already have special handling of all remaining whitespaces according to https://en.cppreference.com/w/cpp/string/byte/isspace.

Runinng with -q -D__GNUC__ -Ilib lib/valueflow.cpp

Clang 16 276,826,655 -> 270,642,840
GCC 13 283,762,095-> 280,179,302

@firewave
Copy link
Collaborator Author

@@ -692,7 +690,7 @@ void simplecpp::TokenList::readfile(Stream &stream, const std::string &filename,
continue;
}

if (std::isspace(ch)) {
if (ch <= ' ') {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm I am not sure. The compiler knows the ascii codes. But if we assume the developer does not.. isn't it more explicit that '\t' '\n' '\r' are handled using std::isspace(ch) than ch <= ' '.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We basically separate the ASCII table is three parts with our code:

>= 0x80 - non-ASCII characters we bail out on

< ' ' - whitespaces and control characters which we treat as allowed and converted to ' '

... and the printable ASCII characters

So we only have either ' ' or a printable ASCII so the only character isspace() can match is ' '. So we get rid of the conversion step and simply exclude the lower part of the table.

Copy link
Owner

@danmar danmar Oct 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but with these changes you assume that < ' ' will match \t and \r and \n ... technically it does. but if we pretend it didn't then your changes would break things.

on the other hand we did assume that < ' ' would match all other control characters already.

I don't have a strong negative opinion about this .. just wanted to point out that there might be a good reason to be explicit. It extends to the tickets you wrote.. technically there are some false negative but I am skeptic about Cppcheck warnings.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong negative opinion about this .. just wanted to point out that there might be a good reason to be explicit

I think the code is explicit enough since we are limiting us to the printable ASCII characters. If we weren't ìsspace() would have been necessary.

@danmar danmar merged commit edb7b86 into danmar:master Oct 8, 2023
13 checks passed
@firewave firewave deleted the readfile branch October 8, 2023 19:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants