TypeError: 'spacy.tokens.token.Token' object is not iterable #3547
-
How can I untokenize the output of this code?

class Core:
    def __init__(self, user_input):
        ...

I have tried both MosesDetokenizer and .join(), but for my last code (from this post) I get this error:

TypeError: 'spacy.tokens.token.Token' object is not iterable

With token.text == string:

AttributeError: 'spacy.tokens.token.Token' object has no attribute 'join'

And for MosesDetokenizer:

text = u" {} ".format(" ".join(tokens))
TypeError: can only join an iterable

Everything else I try gives me the same error:

TypeError: 'spacy.tokens.token.Token' object is not iterable

Operating System:
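For context, this error comes up whenever a single Token is treated as a sequence of strings. A minimal sketch of the failing pattern (the variable names are hypothetical, not taken from the code above):

import spacy

nlp = spacy.blank("en")              # a blank English pipeline is enough for tokenization
doc = nlp("Hello world!")
token = doc[0]                       # a single spacy.tokens.Token, not a list of tokens

# Each of these fails, because a Token is neither a string nor a sequence:
# " ".join(token)   -> TypeError: can only join an iterable
# token.join(" ")   -> AttributeError: 'spacy.tokens.token.Token' object has no attribute 'join'
# list(token)       -> TypeError: 'spacy.tokens.token.Token' object is not iterable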
Replies: 3 comments
-
As I already mentioned in #3537, a token in spaCy is a Token object. It's not a string and it's not a list – so you can't iterate over it or join it. To get a token's text, you can use the Token.text attribute.

I'm not 100% sure what you are trying to achieve with the detokenizer. But it sounds like the solution might be a lot easier than you think? If you're looking for the original text of a Doc, you can use the doc.text attribute.

doc = nlp("Hello world!")
print([token.text for token in doc])  # ['Hello', 'world', '!']
print(doc.text)                       # 'Hello world!'

You might find the documentation helpful, which explains spaCy's API in more detail.
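If the goal really is to run a Moses-style detokenizer over spaCy's output, note that the detokenizer expects a list of strings, not a single Token. A minimal sketch, assuming the MosesDetokenizer from the sacremoses package (any detokenizer that takes a list of strings works the same way):

import spacy
from sacremoses import MosesDetokenizer

nlp = spacy.blank("en")                      # tokenization only, no trained model needed
doc = nlp("Hello world!")

tokens = [token.text for token in doc]       # list of strings: ['Hello', 'world', '!']
detok = MosesDetokenizer(lang="en")
print(detok.detokenize(tokens))              # 'Hello world!'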
-
print([token.text for token in subject])

For me this is not working.
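That usually means subject is bound to a single Token rather than to a Doc or Span: a Token can't be iterated, but a Span can. A short sketch of the difference (the variable names are illustrative, since the original code isn't shown):

import spacy

nlp = spacy.blank("en")
doc = nlp("The quick brown fox jumps over the lazy dog.")

subject_token = doc[3]                           # a single Token: 'fox'
print(subject_token.text)                        # works: 'fox'
# [t.text for t in subject_token]                # TypeError: 'spacy.tokens.token.Token' object is not iterable

subject_span = doc[0:4]                          # a Span: 'The quick brown fox'
print([token.text for token in subject_span])    # works: ['The', 'quick', 'brown', 'fox']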
-
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.