Improve PyThaiNLP performance #685
Comments
I propose to remove
And it is confusing that while The performance improvement gained by using What is your opinion? If you agree, I would be happy to work on this and send a PR.
I agree.
When working on #691, I found a problem relating to the version checking behavior of the corpus downloader. The downloader checks both the name and the version here, so later, if the corpus is found in the local database and a re-download is not forced, the versions should always match. I'm not sure what the expected behavior is here. If a corpus with the same name but a different version is found in the local database (and
Yes, it is not a forced download, as you can see at https://github.com/PyThaiNLP/pythainlp/runs/7957249794?check_suite_focus=true#step:5:5219 and https://github.com/PyThaiNLP/pythainlp/blob/dev/tests/test_corpus.py#L86.
@wannaphong I mean that since both the name and the version of the corpus are checked, the else block would never be reached, so the user will never be notified that a newer version of the requested corpus is available. That is, in cases where the corpus is found in the local database but the version does not match, the latest version of the corpus would be silently downloaded and overwritten. If this is indeed the expected behavior, the else block would be redundant. If the user should be notified in these cases, the checking logic should be modified.
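To make the point above concrete, here is a minimal sketch of the two lookup strategies. It is not the actual PyThaiNLP downloader code: the function names, the list-of-dicts database layout, and the returned strings are all hypothetical and chosen only for illustration.

```python
# Hypothetical sketch, NOT the real PyThaiNLP downloader.
# The "local database" is modelled as a list of {"name", "version"} dicts.

def check_by_name_and_version(local_db, name, latest_version):
    """Look up the corpus by BOTH name and version (the behavior described above)."""
    found = [
        entry for entry in local_db
        if entry["name"] == name and entry["version"] == latest_version
    ]
    if not found:
        # An older local copy is treated as "not found", so the corpus
        # is silently downloaded again.
        return "download"
    if found[0]["version"] == latest_version:
        return "up-to-date"
    # Unreachable: any entry in `found` already has the matching version.
    return "notify"


def check_by_name_only(local_db, name, latest_version):
    """Look up the corpus by name only, then compare versions separately."""
    found = [entry for entry in local_db if entry["name"] == name]
    if not found:
        return "download"
    if found[0]["version"] == latest_version:
        return "up-to-date"
    # Reachable now: the user can be told that a newer version exists.
    return "notify"


local_db = [{"name": "corpus_a", "version": "0.1"}]
print(check_by_name_and_version(local_db, "corpus_a", "0.2"))
print(check_by_name_only(local_db, "corpus_a", "0.2"))
```

With an older copy in the local database, the first function falls through to a download and never reaches its last branch, while the second reaches the branch where a notification could be issued.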
OK. I agree.
@wannaphong So what is the expected behavior? If the user should be notified to use
I agree. The user should get notified when newer versions are available.
@wannaphong This should be fixed in #692; please review the PR.
I'm working on reducing import time in #719.
PyThaiNLP would like your help to improve its performance. You can fork this repository, write code that improves performance, and send a pull request to PyThaiNLP.
Here is a list of ideas for you.