Clarify relationship to PEP-0383 / UTF-8b #16

Open
Ericson2314 opened this issue Jun 24, 2022 · 1 comment

Comments

@Ericson2314

I just learned about these older things. The idea seems similar, but I cannot readily tell how similar. It would be great if multiple implementations were converging / did converge on the same thing.

@SimonSapin
Owner

I can’t easily find a definition of UTF-8b. Is it the same as PEP-0383?

PEP-0383 defines a superset of UTF-32 that can losslessly round-trip an arbitrary byte sequence [u8] by interpreting it as potentially-ill-formed UTF-8 and preserving the meaning of the well-formed parts.
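
A minimal sketch of that round trip, assuming Python 3's built-in `surrogateescape` error handler as the PEP-0383 implementation. The helper names `bytes_to_text` and `text_to_bytes` are made up for illustration:

```python
def bytes_to_text(raw: bytes) -> str:
    # Well-formed UTF-8 decodes normally; each undecodable byte 0xNN
    # becomes the lone surrogate U+DCNN.
    return raw.decode("utf-8", errors="surrogateescape")


def text_to_bytes(text: str) -> bytes:
    # The reverse mapping: U+DCNN turns back into the single byte 0xNN.
    return text.encode("utf-8", errors="surrogateescape")


raw = b"caf\xc3\xa9 \xff\xfe"          # "café " followed by two invalid bytes
text = bytes_to_text(raw)
print(ascii(text))                     # 'caf\xe9 \udcff\udcfe'
print(text_to_bytes(text) == raw)      # True
```

The well-formed part (`café`) keeps its meaning as ordinary code points, and only the two stray bytes are carried as lone surrogates.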

WTF-8 defines a superset of UTF-8 that can losslessly round-trip an arbitrary code unit sequence [u16] (a.k.a. "wide string") by interpreting it as potentially-ill-formed UTF-16 and preserving the meaning of the well-formed parts.
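
A rough sketch of the `[u16]` round trip, assuming Python 3's `surrogatepass` error handler can stand in for a WTF-8 codec. The function names are hypothetical, and the decoding direction is more lenient than the WTF-8 spec because it does not reject a surrogate pair written as two 3-byte sequences:

```python
import struct


def wtf8_from_utf16_units(units: list[int]) -> bytes:
    # Pack the 16-bit code units, then decode as UTF-16 while letting
    # unpaired surrogates through instead of raising an error.
    packed = struct.pack(f"<{len(units)}H", *units)
    text = packed.decode("utf-16-le", errors="surrogatepass")
    # Valid pairs become 4-byte sequences; lone surrogates become
    # 3-byte sequences (generalized UTF-8).
    return text.encode("utf-8", errors="surrogatepass")


def utf16_units_from_wtf8(data: bytes) -> list[int]:
    text = data.decode("utf-8", errors="surrogatepass")
    packed = text.encode("utf-16-le", errors="surrogatepass")
    return list(struct.unpack(f"<{len(packed) // 2}H", packed))


# "H", a valid surrogate pair (U+1F4A9), a lone lead surrogate, "i"
units = [0x0048, 0xD83D, 0xDCA9, 0xD800, 0x0069]
encoded = wtf8_from_utf16_units(units)
print(encoded)                                   # b'H\xf0\x9f\x92\xa9\xed\xa0\x80i'
print(utf16_units_from_wtf8(encoded) == units)   # True
```

Note the direction of each escape hatch: PEP-0383 smuggles bad bytes into a code point sequence, while WTF-8 smuggles unpaired surrogates into a byte sequence.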


PEP-0383 and WTF-8 take a similar approach, but they solve fundamentally different problems to begin with. What does "converging" even mean? I’m a bit confused about what you’re expecting here.

In any case, even if we could find a potentially beneficial change to either of them, PEP-0383 and WTF-8 are names for specific encodings/behaviors that already have implementations in use. Redefining either name to mean some other encoding would be harmful. If you come up with a different encoding that could be interesting, give it another name. https://github.com/kennytm/omgwtf8 is an example where this happened.
