TypeError: 'spacy.tokens.token.Token' object is not iterable #3547

As I already mentioned in #3537, a token in spaCy is a Token object. It's not a string and it's not a list – so you can't iterate over it or join it. To get a token's text, you can use the Token.text attribute.

I'm not 100% sure what you are trying to achieve with the detokenizer. But it sounds like the solution might be a lot easier than you think? If you're looking for the original text of a Doc, you can use the doc.text attribute.

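import spacy

nlp = spacy.blank("en")  # tokenizer-only pipeline; a loaded model works the same way here
doc = nlp("Hello world!")
print([token.text for token in doc])  # ['Hello', 'world', '!']
print(doc.text)  # 'Hello world!'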
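
To make the difference concrete, here is a minimal sketch of what triggers the error and two ways to get a string back. It assumes a blank English pipeline (`spacy.blank("en")`) purely so it runs without downloading a model; the variable names are illustrative and not taken from the original question.

```python
import spacy

# Assumption: a tokenizer-only pipeline is enough for this illustration.
nlp = spacy.blank("en")
doc = nlp("Hello world!")
token = doc[0]

# A single Token is not iterable, so trying to loop over it raises TypeError.
raised = False
try:
    list(token)
except TypeError as err:
    raised = True
    print(err)

# Iterate over the Doc instead and join each token's text.
# Joining with spaces discards the original whitespace.
joined = " ".join(t.text for t in doc)
print(joined)  # Hello world !

# text_with_ws includes each token's trailing whitespace,
# so concatenating it reproduces the original text.
rebuilt = "".join(t.text_with_ws for t in doc)
print(rebuilt == doc.text)  # True
```

If the goal is only to recover the input string, `doc.text` is the simpler choice; the `text_with_ws` join is mainly useful if you modify or filter tokens along the way.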

You might find the documentation helpful, which explains spaCy's API in more detail.

Answer selected by ines
Labels
feat / doc Feature: Doc, Span and Token objects
2 participants
This discussion was converted from issue #3547 on December 10, 2020 13:45.