Topic Modeling Crash #820
Comments
First of all, don't put another Corpus widget between Preprocess Text and Topic Modelling, because it will override all the preprocessing results. Second, I believe this is the reason for the crash: you did the preprocessing but then overrode it, so the default preprocessing runs instead, resulting in a large number of tokens that don't fit into your RAM. Decrease the number of tokens and try again. |
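To illustrate what "decrease the number of tokens" can mean in practice, here is a minimal sketch of filtering tokens by document frequency in plain Python. It is not how Orange or gensim prune tokens internally; the function name `filter_by_document_frequency` and the thresholds `min_df` and `max_df_ratio` are hypothetical and chosen only for this example.

```python
from collections import Counter


def filter_by_document_frequency(docs, min_df=2, max_df_ratio=0.9):
    """Drop tokens that occur in too few or too many documents.

    docs is a list of token lists. Both thresholds are illustrative
    assumptions, not values taken from Orange.
    """
    n_docs = len(docs)
    # Count in how many documents each token appears (not how often overall).
    doc_freq = Counter(token for doc in docs for token in set(doc))
    keep = {
        token
        for token, count in doc_freq.items()
        if count >= min_df and count / n_docs <= max_df_ratio
    }
    return [[token for token in doc if token in keep] for doc in docs]


docs = [
    ["topic", "model", "crash"],
    ["topic", "model", "windows"],
    ["topic", "gensim", "crash"],
]
filtered = filter_by_document_frequency(docs)
print(filtered)
```

Tokens that appear in only one document, or in nearly every document, carry little information for topic modelling, so removing them shrinks the vocabulary without changing the corpus size.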
I've collated all the reports of the same kind. Two other users say Topic Modelling crashes on Windows (it doesn't happen on Mac). One user doesn't even have a large corpus (174 documents and 909 tokens). We need to research why this happens. @noahnovsak @PrimozGodec |
@djukicn already identified the reason for the crashes. I think the crash described in this issue also has the same source. The problem is that Anyway, we got an idea for a different solution that could solve all these problems and minimize the probability of errors: we would deprecate I think this solution would minimize the probability that something does not work and even give users the option to manipulate bag-of-words features before topic modeling. |
Hi, I am sorry, perhaps I am in the wrong place, but is there a solution for this? I can't recover my work; Orange crashes after the latest update when running topic modeling. I use LDA... |
I have the same problem - Topic Modelling crashes when I want to run it. |
Hi, any news on this? Thanks... |
We just released Orange3-Text v. 1.10.0. Please update the add-on and let us know if it works. If not, we would appreciate it if you could provide a workflow, a data sample (if possible) and the pip freeze output, if you installed Orange via the terminal. |
@ajdapretnar I updated the add-on and it still crashes. I am sending additional information below. What's your environment/workflow?
How you installed Orange:
My data sample is here. Pip freeze output: |
@NAsic123 What happens if you select to wait? |
Also, I tried it on Mac with your data. I am assuming you are using the default preprocessing and LDA? It works normally for me. |
@ajdapretnar thank you for your answer and help. I will run LDA and I will leave it running and see what will happen. Then I will report what happens. Maybe it needs extra time. No, I am not using default preprocessing, I am using these preprocessors: |
Two comments, unrelated to the crashing widget. In Preprocess Text, you don't need Regexp, because the tokenization you've set already omits all punctuation. Also, your POS tag filter doesn't do anything, because your data is not tagged, so filtering cannot work. |
@ajdapretnar thank you. I will correct it. I let the LDA run for one hour; it was still at 0 % and then it crashed. |
@NAsic123 Is your gensim version currently 4.2.0? If so, could you please install 4.1.2 and see whether the same problem occurs? |
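For readers who want to check the gensim version without reading through the whole pip freeze output, the following is a minimal sketch using only the Python standard library. It assumes it is run with the same Python interpreter that Orange uses (for example from the Orange Command Prompt); the helper name `installed_version` is made up for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints something like "gensim: 4.2.0", or "gensim: None" if gensim is not
# installed in the environment this script runs in.
print("gensim:", installed_version("gensim"))
```

This only reads package metadata, so it does not import gensim and does not change the environment.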
@djukicn I am sorry for the very basic question, but where do I install gensim? In the Orange Command Prompt I have to type: ? |
@NAsic123 Sorry, this was a bit technical. I would be wary of tampering with the set up environment. |
@ajdapretnar thank you for your instruction. I ran pip freeze and got this information below (I copied and saved it in .txt). I hope it is useful and thank you again. |
Ok, it does indeed seem like you have |
@ajdapretnar thank you. I ran LDA and I get the results, so now it works. Thank you so much for your help. |
@ajdapretnar I was actually able to reproduce the error on Ubuntu (although for me the results were produced after clicking "Wait" a few times), so it's not just a Windows issue. Somewhere in the background gensim raises an exception. I'll look into it today and see what can be done. |
@djukicn Fantastic! Thanks! |
It would be highly appreciated if you could also provide info on updating that gensim library. I understand that's where the issue might be, but I don't know how to update it. Thank you. |
We have reported the issue to Gensim (the library that computes topics) and hope they will consider it soon: piskvorky/gensim#3368 |
@nadiaelen If you are on Windows, you could try opening Orange Command Prompt (a separate program available from Start menu). Then enter |
Hey, I tried a source installation with the merged files from #885 (newest biolab repository), but it didn't work. Orange still crashes. I also tried different versions of orange3 with older add-on versions; that didn't work either. |
@Katzengurke When you say Orange crashes, do you mean the software or the topic modelling widget? |
The software crashes. Edit: Tried the same on a Windows 10 laptop where I didn't tamper with any files whatsoever, and it works there. Is it maybe Windows 11? |
It might be. Does it happen even if you uninstall orange3-text add-on? |
Can you try running |
Yes, sadly. I'll uninstall my Python, my Anaconda and Orange, then try again with a new installation and let you know in a couple of minutes.
Same with that. |
Ok, so it is a Python bug not an Orange bug. Does running |
Alright, I got a bit farther, but the bug got stranger too. Python is 3.8.8. Edit: I actually got a log in the shell, although I think only for the topic modeler without the connection to the corpus: C:\Users\xxx\AppData\Local\Programs\Orange\lib\site-packages\xgboost\compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead. return tuple(filter(None, resolve_types(types))) Edit 2: Yep, retried it a couple of times, no chance to get the log of the crash. As soon as I connect the corpus to the topic modeler, I get "Python stopped working". |
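When the interpreter dies before any Python traceback is printed, the standard library's `faulthandler` module can sometimes still record where it happened. The sketch below is an assumption about what might help here, not something tried in this thread; the helper name `enable_crash_log` and the log file name `orange_crash.log` are made up for the example.

```python
import faulthandler
import os
import tempfile


def enable_crash_log(path):
    """Make Python dump tracebacks of all threads to `path` on a hard crash.

    The returned file object must stay open for as long as the process runs,
    because faulthandler writes to its file descriptor directly.
    """
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Hypothetical log location in the system's temporary directory.
log_path = os.path.join(tempfile.gettempdir(), "orange_crash.log")
crash_log = enable_crash_log(log_path)
print("Crash tracebacks will be written to", log_path)
```

The same handler can be switched on without any script by starting Python with the `-X faulthandler` option, for example `python -X faulthandler -m Orange.canvas`, in which case the traceback goes to the console instead of a file.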
So, the same thing remains: even with the new gensim library, it runs into the same problem. It just hangs forever, even after all the installing, uninstalling, etc. :( |
The fix #885 definitely works under Windows 10; I was able to install it and work with it, without crashes so far. You have to use the source installation with git and the Orange Command Prompt. It doesn't work on my Windows 11 laptop though. Edit: Never mind, it works now. I have to open Orange via python -m Orange.canvas though. I also uninstalled a bunch of programs like node.js. |
Hey, I am using Orange Data Mining for my master's thesis. However, when I try topic modelling, it crashes. I already read this thread, but it was a bit too technical for me to understand since I really do not have any experience with Orange or Python. Can someone help me out? My environment: Windows 11 Home 64Bit |
I really love Orange and appreciate you, your work and everything, but, truly, when it comes to topic modelling, which is the hottest topic right now, one week it works, the next week it crashes and stays like that for a month... |
@JosieVor and @nadiaelen, sorry for the late response. I tried to reproduce the error on MacOS and Windows, and it works for me. Can you please give me more information so I can dig deeper into the problem?
Thank you in advance. |
I am using a Twitter dataset, which I scraped directly in Orange. My dataset is quite big (about 30,000 tweets), but I have also tried topic modeling on smaller datasets (about 100 tweets) and it still did not work. I am using LDA, which crashes every time. LSI sometimes works, but more often than not it does not work either. Text Add-On Version: 1.12.0 Orange Version: 3.34.0 Gensim Version: the pip freeze command does not work for me, but when I use pip list, it says my version is 4.1.2 |
Hi, I'm using Windows 10 and have the same issue. I already ran pip freeze in the Orange Command Prompt, but Orange still crashed. I also tried to reinstall, with the same result. Is there any other solution? Orange version: 3.34.0 |
Thank you @JosieVor and @calliope212, for the additional information. We noticed that the newest release didn't support using gensim>=4.3.0 (on the master branch we already switched to >=4.3.0). We fixed the release. Can you please update the Text add-on to version 1.12.2 and try again? Please let us know if it helps. |
Thank you @PrimozGodec for the suggestion. I tried it but unfortunately, it still won't work. When I check the pip list, it says my gensim version is 4.3.1. Does that affect the result? |
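As a rough illustration of what the gensim>=4.3.0 requirement means for the versions mentioned in this thread, here is a small standard-library sketch that compares version strings. The helper names `version_tuple` and `meets_minimum` are made up for the example, and the parsing is deliberately simplified: it only looks at the leading numeric parts, so a pre-release such as 4.3.0rc1 is treated like 4.3.0.

```python
import re


def version_tuple(version):
    """Turn '4.3.1' into (4, 3, 1), ignoring non-numeric suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad short versions such as '4.3' so they compare equal to '4.3.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


for candidate in ("4.1.2", "4.2.0", "4.3.1"):
    print(candidate, "satisfies gensim>=4.3.0:", meets_minimum(candidate, "4.3.0"))
```

Satisfying the version requirement only shows that the add-on and gensim are meant to work together; it does not by itself show that the crash is gone.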
Hi, I also have similar issues when I run Topic Modelling, it hangs for more than 2 minutes and I have to kill it ultimately. Another laptop has the same problem. It is running on Windows 10. Is there any solution for this problem? |
Hi! I'm on Orange version 3.35. Do you have a solution for this problem yet? |
The new version of Gensim (4.3.2) is out. Can you try with this version? We still cannot reproduce the bug on the new or previous version.
|
I am closing this as stale. If the error persists, please open a new issue. |
It's 2024 and the error persists. If Preprocess Text uses an n-gram range, it crashes. |
Is this still a Windows issue? Because I ran grimm-tales with n-grams and TM worked. I have the latest gensim installed (gensim==4.3.2). |
What's wrong?
Topic Modeling Crash when used with Twitter Widget
How can we reproduce the problem?
Please see the screenshot. I cannot save the .owl file with Topic Modeling connected, because the app freezes and then shows "Not Responding" after I connect the widget from Corpus.
What's your environment?