addons/namingng.py: Improve file name checking feature. #5802

Merged on Dec 27, 2023 (1 commit)


mvds00
Contributor

@mvds00 mvds00 commented Dec 23, 2023

(note: comment updated after force push; initial PR was incomplete)

namingng.py attempted to derive the source filename from the name of the dumpfile. However, the dumpfile is not necessarily named according to this pattern, e.g. cppcheck will add the pid to the filename, making RE_FILE rules fail. Taking the first item of data.files seems to be more robust.

To get the basename of the file, os.path.basename() is used. This solves (theoretical) issues on platforms with a different path separator.
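As a minimal sketch of the path separator point (not the addon's code): Python's standard `ntpath` and `posixpath` modules are the Windows and POSIX flavours behind `os.path`, so they can show how `basename()` behaves for each convention on any host. The paths and the hand-rolled split on `/` are hypothetical, chosen only for illustration.

```python
import ntpath
import posixpath

# Hypothetical paths, for illustration only.
windows_style = r"C:\project\src\main.c"
posix_style = "/project/src/main.c"

# os.path resolves to ntpath on Windows and posixpath elsewhere, so
# os.path.basename() follows the separator rules of the host platform.
print(ntpath.basename(windows_style))
print(posixpath.basename(posix_style))

# A hand-rolled split on "/" (an assumed non-portable alternative) finds
# no separator in the Windows-style path and returns it unchanged.
print(windows_style.split("/")[-1])
```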

With this patch, all filenames are checked, not just those provided on the cppcheck command line. This is useful as header files will now also be part of this check, even if not explicitly specified on the command line.

The "RE_FILE" key of the configuration JSON may contain a list of regular expressions, where any match will lead to acceptance of the filename.

Both the full path and the basename of the files are tested.

One use case for this combination of features is:

"RE_FILE":[
    "/.*\\.h\\Z",
    "[a-z][a-z0-9_]*[a-z0-9]\\.[ch]\\Z"
]

This will accept any file naming convention of the platform used (assuming platform files are all referenced using an absolute path), while enforcing a particular naming scheme for project files.
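The matching rule described above can be sketched as follows. This is a standalone illustration rather than the addon itself: the helper name `is_accepted` and the sample paths are hypothetical, and the two patterns are the ones from the example, written as Python raw strings instead of JSON-escaped strings.

```python
import os.path
import re

# The two RE_FILE patterns from the example above.
RE_FILE = [
    r"/.*\.h\Z",
    r"[a-z][a-z0-9_]*[a-z0-9]\.[ch]\Z",
]


def is_accepted(source_file, patterns):
    """Return True if any pattern matches the full path or the basename."""
    basename = os.path.basename(source_file)
    return any(
        re.match(exp, source_file) or re.match(exp, basename)
        for exp in patterns
    )


# Hypothetical paths, for illustration only.
for path in [
    "/opt/toolchain/include/sys/_sigset.h",  # absolute header path
    "src/my_module.c",                       # basename follows the scheme
    "src/_Bad.c",                            # basename violates the scheme
]:
    print(path, is_accepted(path, RE_FILE))
```

Because `re.match()` anchors at the start of the string, the first pattern only applies to paths that begin with `/`, while the second one effectively applies to the basename.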

@mvds00 mvds00 changed the title addons/namingng.py: Take filename from data.files. addons/namingng.py: Improve file name checking feature. Dec 23, 2023
good = False
for exp in conf["RE_FILE"]:
    good |= bool(re.match(exp, source_file))
    good |= bool(re.match(exp, basename))
Owner

it's not very obvious to me what basename does.. can you give an example why you want to match both source_file and basename?

Contributor Author

Certainly, there are multiple use cases. For clarity: the original code attempted to distill the basename from the dump filename (by removing the .dump suffix) and only tested that. With this patch the code tests filenames of all files, also including header files. Both the basename and the full path (source_file) are tested, which gives some added flexibility.

On my system I get file naming violations on system includes, such as /...../gcc-arm-none-eabi-10.3-2021.10/arm-none-eabi/include/sys/_sigset.h that don't follow my rules because the basename starts with _. Of course I can make a suppression for it, but a different approach is to make a pattern that matches and accepts such filenames. In the example this is any absolute path ("/.*\\.[ch]\\Z") but it may also be based on a particular path (".*arm-none-eabi/include").

Another use case is where a different convention is followed under a particular path. Again from my own experience, when developing for embedded systems we get a lot of boilerplate, e.g. CMSIS and the STM32 HAL, that we don't necessarily want to completely ignore or rename. So we can make a few different rules, such as "Drivers/CMSIS/<insert regexp>", "Drivers/STM32[^/]*/<insert regexp>", etc.

Yet another use case is that this also allows validating the naming and structure of directories, e.g. by having all RE_FILE rules prefixed, like "[A-Z][a-z]*/[a-z][a-z0-9_]*[a-z0-9]\\.[ch]\\Z" which would allow "Core/file_1.c" but not "tmp/hack.c" or "Core/tmp/quickfix.c" or "HEADERS/hdr.h".
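To check the directory-prefixed rule against the four paths mentioned, here is a small sketch. It assumes the paths reach the check as relative paths exactly as written; the helper name `matches_dir_rule` is hypothetical.

```python
import re

# The directory-prefixed rule from the comment, as a Python raw string.
DIR_RULE = r"[A-Z][a-z]*/[a-z][a-z0-9_]*[a-z0-9]\.[ch]\Z"


def matches_dir_rule(relative_path):
    """Return True if the path is one capitalised directory plus a valid file name."""
    return re.match(DIR_RULE, relative_path) is not None


for path in ["Core/file_1.c", "tmp/hack.c", "Core/tmp/quickfix.c", "HEADERS/hdr.h"]:
    print(path, matches_dir_rule(path))
```

Only "Core/file_1.c" matches. The basenames alone ("file_1.c", "hack.c", ...) do not match this rule either, since they do not start with an uppercase directory component.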

Of course there are various other ways to solve such cases, where different paths need different rules, or where a different tool should validate directory naming and structure, but as we're validating the filename of all files, it seemed natural to (ab)use RE_FILE for this.

Contributor Author

ps. I will add a unit test, but that depends on the namingng CLI patch going through first.

@danmar danmar merged commit 4c7aae3 into danmar:main Dec 27, 2023
68 checks passed