[#103] drop utf8-string dependency #104
I'm generally okay with this change, but I would like to see whether performance and behavior differ from the utf8-string package. I think it's still a good idea to provide fast conversion between different data types even if it adds one extra dependency. But if the performance is almost the same and the results of the two functions are also the same, then it's okay to remove this dependency 👍
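For readers following along, here is a minimal sketch of what String/ByteString conversion can look like when it goes through the text package instead of utf8-string. The names `stringToBytes` and `bytesToString` are hypothetical and are not taken from this PR; the choice of lenient decoding is also an assumption made for illustration.

```haskell
module Main (main) where

import Data.ByteString (ByteString)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import qualified Data.Text.Encoding.Error as TE

-- Hypothetical helper: encode a String as UTF-8 bytes by going through Text.
stringToBytes :: String -> ByteString
stringToBytes = TE.encodeUtf8 . T.pack

-- Hypothetical helper: decode UTF-8 bytes back to a String.
-- Assumption: invalid byte sequences are replaced with U+FFFD (lenient)
-- rather than raising an exception.
bytesToString :: ByteString -> String
bytesToString = T.unpack . TE.decodeUtf8With TE.lenientDecode

main :: IO ()
main = do
  let s = "héllo, мир"
  -- Valid Unicode text survives the round trip unchanged.
  print (bytesToString (stringToBytes s) == s)
```

Comparing behavior then amounts to checking that functions like these agree with their utf8-string counterparts on the same inputs, and comparing performance means benchmarking both on identical data.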
Thanks @chshersh, it does seem like a sensible approach. On this note I created a small repo to test both performance and correctness here.
@sphaso Thanks for the great work! I will review the benchmarks and tests more carefully in the next couple of days. If this is indeed even faster, then it looks awesome! 👍
@sphaso Unit test looks ok. The more robust test will be something with the Regarding your benchmarks: they contain a single error, because of which the results you've observed are not reliable. Specifically, you used After changing this for every benchmark and running for GHC 8.4.3 and GHC 8.6.1 I've observed the following interesting behavior (numbers are slightly different for different GHC versions, but the relation is the same):
So, we can't say yet that the new implementation is strictly better... But I think that the problem with
@chshersh I wasn't sure
@chshersh Hi! I implemented a very basic test with
Let me know if I can do anything else to help with this PR!
@sphaso I think that the test results are ok, because text like "\55296" is not a valid Text in the end: it's only half of a surrogate pair.
I think that you only need to update the CHANGELOG and this can be merged 👍
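To illustrate the point about "\55296" (U+D800, a lone high surrogate): `Data.Text.pack` replaces invalid scalar values such as surrogate code points with the replacement character U+FFFD, so a round trip through `Text` cannot preserve such a `String`. The snippet below is only a small sketch of that behavior and is not the test from this PR.

```haskell
module Main (main) where

import qualified Data.Text as T

main :: IO ()
main = do
  let lone    = "\55296"               -- U+D800, half of a surrogate pair
      viaText = T.unpack (T.pack lone) -- round trip through Text
  -- The original String is not preserved...
  print (viaText == lone)      -- False
  -- ...because pack substituted U+FFFD (decimal 65533) for the surrogate.
  print (viaText == "\65533")  -- True
```

A property test that generates arbitrary `Char` values will therefore report a round-trip failure on this input even though the conversion is behaving as documented.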
@sphaso I don't understand one thing: the error in your test case is on the
@chshersh oops! you're right, interestingly I found this issue in hedgehog. It's funny how this character keeps popping up.
@sphaso Regarding unicode: this is indeed strange. The fix is merged to the Regarding changelog: this can go to the
@chshersh I'll add the change to the changelog as soon as I get home from work. I overlooked one major detail regarding our mystery character: the PR in
The maintainer of the utf8-string package confirmed that it's better to use the text package. So well done 👍 Sometimes a couple of lines of code lead to a rabbit hole and unexpected discoveries. The discussion around this issue is more valuable than the contribution itself, so don't be upset that it took so much time 👌
Resolves #103
Let me know if that's what you had in mind and if this change warrants a line in the CHANGELOG. If so, I'll be happy to add it.
Checklist:
HLint
hlint.dhall accordingly to my changes (add new rules for the new imports, remove old ones when they are outdated, etc.).
.hlint.yaml file (see this instructions).
General
stylish-haskell file.
[ci skip] text to the docs-only related commit's name.