TypeError: 'spacy.tokens.token.Token' object is not iterable #3547
-
How can I untokenize the output of this code?

class Core:
    def __init__(self, user_input):
        ...

I have tried both MosesDetokenizer and .join(), but for my last code (from this post) I get this error:

TypeError: 'spacy.tokens.token.Token' object is not iterable

With token.text == string:

AttributeError: 'spacy.tokens.token.Token' object has no attribute 'join'

And for MosesDetokenizer:

text = u" {} ".format(" ".join(tokens))
TypeError: can only join an iterable

Everything else I try gives me the same error:

TypeError: 'spacy.tokens.token.Token' object is not iterable

Operating System:
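For context, this error comes up whenever a single Token is treated as a sequence of strings. A minimal sketch of the failing pattern (the variable names are hypothetical, not taken from the code above):

import spacy

nlp = spacy.blank("en")              # a blank English pipeline is enough for tokenization
doc = nlp("Hello world!")
token = doc[0]                       # a single spacy.tokens.Token, not a list of tokens

# Each of these fails, because a Token is neither a string nor a sequence:
# " ".join(token)   -> TypeError: can only join an iterable
# token.join(" ")   -> AttributeError: 'spacy.tokens.token.Token' object has no attribute 'join'
# list(token)       -> TypeError: 'spacy.tokens.token.Token' object is not iterable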
Replies: 3 comments
-
As I already mentioned in #3537, a token in spaCy is a Token object. It's not a string and it's not a list – so you can't iterate over it or join it. To get a token's text, you can use the Token.text attribute.

I'm not 100% sure what you are trying to achieve with the detokenizer. But it sounds like the solution might be a lot easier than you think? If you're looking for the original text of a Doc, you can use the doc.text attribute.

doc = nlp("Hello world!")
print([token.text for token in doc])  # ['Hello', 'world', '!']
print(doc.text)                       # 'Hello world!'

You might find the documentation helpful, which explains spaCy's API in more detail.
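If the goal really is to run a Moses-style detokenizer over spaCy's output, note that the detokenizer expects a list of strings, not a single Token. A minimal sketch, assuming the MosesDetokenizer from the sacremoses package (any detokenizer that takes a list of strings works the same way):

import spacy
from sacremoses import MosesDetokenizer

nlp = spacy.blank("en")                      # tokenization only, no trained model needed
doc = nlp("Hello world!")

tokens = [token.text for token in doc]       # list of strings: ['Hello', 'world', '!']
detok = MosesDetokenizer(lang="en")
print(detok.detokenize(tokens))              # 'Hello world!'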
-
print([token.text for token in subject])

For me this is not working.
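That usually means subject is bound to a single Token rather than to a Doc or Span: a Token can't be iterated, but a Span can. A short sketch of the difference (the variable names are illustrative, since the original code isn't shown):

import spacy

nlp = spacy.blank("en")
doc = nlp("The quick brown fox jumps over the lazy dog.")

subject_token = doc[3]                           # a single Token: 'fox'
print(subject_token.text)                        # works: 'fox'
# [t.text for t in subject_token]                # TypeError: 'spacy.tokens.token.Token' object is not iterable

subject_span = doc[0:4]                          # a Span: 'The quick brown fox'
print([token.text for token in subject_span])    # works: ['The', 'quick', 'brown', 'fox']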
-
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.