Warn on Fullwidth Exclamation Mark (U+FF01) in comment #134810
Comments
Hundreds of affected projects: |
Yes, we should either treat the characters identically (normalizing both to match
and then notices they're writing using a CJK input method which will preferentially emit fullwidth characters, even when using "English" punctuation inputs. They then switch linguistic gears (or at least, those of their HID/IME), back up, and write,
If they don't delete the entire line, they leave behind the original fullwidth character. I'm not aware of a fullwidth |
We do error for the fullwidth exclamation mark if we encounter it in normal Rust code:
fn foo() -> ! {
    let v = „hi“;
}
We could maybe make a specialized lint like how there is text_direction_codepoint_in_comment. This also gives an error btw:
fn foo() {
    /∗∗/
}
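As a rough illustration of what such a specialized check might look for, here is a minimal sketch. It is not rustc's implementation and not how text_direction_codepoint_in_comment works internally: the function name `find_fullwidth_bang_comments` and the constant `FULLWIDTH_BANG` are hypothetical, and the scan is a naive line-based text search that does not skip string literals. It only flags a comment opener (`//` or `/*`) that is immediately followed by U+FF01.

```rust
/// U+FF01 FULLWIDTH EXCLAMATION MARK (hypothetical constant name).
const FULLWIDTH_BANG: char = '\u{FF01}';

/// Hypothetical helper, not part of rustc: returns the 1-based numbers of
/// lines where `//` or `/*` is immediately followed by U+FF01.
/// Assumption: a plain text scan is good enough for illustration; a real
/// lint would work on lexer tokens and ignore string literals.
fn find_fullwidth_bang_comments(src: &str) -> Vec<usize> {
    let mut hits = Vec::new();
    for (idx, line) in src.lines().enumerate() {
        let flagged = ["//", "/*"].iter().any(|opener| {
            // Check the text right after every occurrence of the opener.
            line.match_indices(*opener)
                .any(|(pos, _)| line[pos + opener.len()..].starts_with(FULLWIDTH_BANG))
        });
        if flagged {
            hits.push(idx + 1);
        }
    }
    hits
}
```

Because the sketch only looks at the character directly after the opener, a fullwidth exclamation mark elsewhere in a comment, or one separated from the opener by a space, is not reported.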
This doesn't lint either and on a first glance it looks like
There is no lint here either (I don't have a Chinese keyboard, maybe one can hit this when trying for `?):
/*!
```
assert_eq!(true, false);
```
*/
or here (less visible and occurs once in the wild):
/*!
ˋˋˋ
assert_eq!(true, false);
ˋˋˋ
*/
//⁄ hello
fn foo() {}
Maybe one could make a |
While specialty keyboards for these languages do exist, using them is actually... er... a specialization? Like stenography keyboards are to English. These input methods are mostly software-driven, so people often use perfectly ordinary QWERTY keyboards, and the "switch input method" keys that are sometimes present (e.g. in the 109-key layout) can be replicated by chorded commands using some combination of the Shift, Ctrl, Alt, and Fn keys. |
So while unlikely, doesn't doing so technically warn/lint on "valid" comments? E.g. what if I happen to intentionally write a non-ASCII exclamation mark? Likely more unusual (but putting on an adversarial user's hat)
(The punctuation being formatted like this is not usual, but it is "technically" valid) I'd personally be in favor of adding such a warning, just thinking about whether it can be broken (specifically false positives). |
"proper" formatting of comments puts a space after the |
For the above Rust source code, rustdoc produces documentation containing only "A C". The line containing "B" is only a regular comment, not a doc comment: its comment marker is followed by a fullwidth exclamation mark (https://www.compart.com/en/unicode/U+FF01) rather than an ASCII one.
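To make the effect concrete, here is a small sketch that models the behavior described. It is not rustdoc's actual implementation: the helper name `inner_doc_text` is hypothetical, and the three-line input in the usage note is an assumed reconstruction of the kind of source in question, with the middle line using U+FF01 instead of the ASCII exclamation mark.

```rust
/// Hypothetical model of which lines contribute to inner documentation:
/// only lines that start with the ASCII marker `//!` are kept, and their
/// text is joined with single spaces. Real rustdoc does considerably more.
fn inner_doc_text(src: &str) -> String {
    src.lines()
        // `strip_prefix` matches the ASCII `!` (U+0021) only, so a line
        // beginning with `//` plus U+FF01 is treated as an ordinary comment.
        .filter_map(|line| line.trim_start().strip_prefix("//!"))
        .map(str::trim)
        .collect::<Vec<_>>()
        .join(" ")
}
```

Calling `inner_doc_text("//! A\n//\u{FF01} B\n//! C\n")` drops the middle line, because `'\u{FF01}'` and `'!'` are different code points even though they look nearly identical in many fonts.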
I caught this in a real-world documentation PR from a Chinese contributor. See #134241 (review).
In GitHub, the distinction is practically invisible.
In my text editor, it is more obvious.
Would it be reasonable for rustc and/or rustdoc to have reported a lint on such code?