Test in _is_valid_language_code(s) is not complete #29

Open
ThomasHoppe opened this issue Oct 7, 2022 · 0 comments
ThomasHoppe commented Oct 7, 2022

Valid language tags like "en-GB" and "de-AT" are not recognized.

The test for valid language tags is buggy. IETF BCP 47 says that, in the simple case, a language tag consists of a language subtag (the first two characters, e.g. "en") and an optional region subtag (the fourth and fifth characters, e.g. "GB"), separated by a hyphen ('-'), not an underscore ('_').

https://www.w3.org/TR/ltli/ says that "Specifications for the Web that require language identification MUST refer to [BCP47]". Since ontologies and RDF are specifications for the web, the function needs to be corrected.

I think this bug is caused by the Pythonic way you like to access ontologies, i.e. concept.label.en. Clearly, concept.label.en-GB wouldn't work, since - is not a valid character in Python names. On the other hand, ontologies and RDF may contain labels that follow the BCP 47 spec. So the test needs to be extended to handle both cases.
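
To illustrate what "handling both cases" could look like, here is a minimal sketch. It is not the library's actual code: the names `is_valid_language_tag` and `to_bcp47` are hypothetical, and the pattern covers only a simplified subset of BCP 47 (a language subtag plus an optional region subtag, without script, variant, or extension subtags).

```python
import re

# Simplified subset of BCP 47 (an assumption for this sketch):
#   language subtag: 2 or 3 letters
#   optional region subtag: 2 letters or 3 digits
# The separator may be '-' (BCP 47) or '_' (usable in Python attribute names).
_TAG_RE = re.compile(r"[A-Za-z]{2,3}(?:[-_](?:[A-Za-z]{2}|[0-9]{3}))?")


def is_valid_language_tag(tag):
    """Return True for tags such as 'en', 'en-GB', or 'en_GB'."""
    return isinstance(tag, str) and _TAG_RE.fullmatch(tag) is not None


def to_bcp47(tag):
    """Normalize an accepted tag to hyphenated form, e.g. 'en_gb' -> 'en-GB'."""
    if not is_valid_language_tag(tag):
        raise ValueError(f"not a recognized language tag: {tag!r}")
    parts = re.split(r"[-_]", tag)
    language = parts[0].lower()
    if len(parts) == 1:
        return language
    return f"{language}-{parts[1].upper()}"
```

With something like this, attribute-style access could use the underscore form (concept.label.en_GB) and map it to "en-GB" before looking up labels, while labels read from an ontology file keep their hyphenated tags.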
