Expected? json data generated from convert conll-u shows unicode? #13544
yosiasz
started this conversation in
Language Support
Hello,
I am converting an Amharic CoNLL-U file using the following command:
spacy convert --merge-subtokens '.\corpus\conllu\am_att-ud-test.conllu' -c conllu ./output/ -t json
and it generates JSON output. See the orth and lemma fields: they do not show the actual Amharic words but escapes like \uxyz.
Is that to be expected, or is this a bug in spaCy? The output is hard to read unless I use another external tool to parse it. Thanks!
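For context, here is a minimal sketch of what may be going on, assuming the output is ordinary JSON in which non-ASCII characters were written as \uXXXX escapes (this is an assumption about the writer, not something confirmed for spaCy here). Such escapes are valid JSON and decode back to the original characters. The token record and the helper name to_readable_json are hypothetical, and the Amharic word is only an illustration, not taken from the corpus.

```python
import json

# Hypothetical token record; the word is illustrative, not from am_att-ud-test.conllu.
token = {"orth": "ሰላም", "lemma": "ሰላም"}

# Python's json.dumps escapes non-ASCII characters by default (ensure_ascii=True).
escaped = json.dumps(token)

# With ensure_ascii=False the same data is written with the characters intact.
readable = json.dumps(token, ensure_ascii=False)


def to_readable_json(text: str) -> str:
    """Re-serialize JSON text so non-ASCII characters are human-readable.

    Hypothetical helper: parses the escaped JSON and dumps it again
    without ASCII-only escaping. The data itself is unchanged.
    """
    return json.dumps(json.loads(text), ensure_ascii=False, indent=2)


print(escaped)
print(to_readable_json(escaped))
```

Both forms parse to the same data, so under this assumption the escaped file would be harder to read but not corrupted. When writing the re-serialized text to disk, open the file with encoding="utf-8".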