Custom tokenizer for trf models from hf #13562
Unanswered
K-Grachev-2106756
asked this question in Help: Model Advice
Replies: 0
I need to build an NER pipeline consisting of a transformer component followed by an NER component. So far I have been using the standard tokenizer for Chinese.
Now I'm wondering whether it is correct to feed the transformer model Docs whose tokens come from a different tokenizer than the one the transformer was pretrained with.
Could you explain how the model's training works internally in this case? Is it important to write a custom tokenizer, or does the transformer learn to handle these tokens during training anyway?
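For context on why a mismatched tokenizer can still work: in spaCy-style transformer pipelines, the Doc's word-level tokens are mapped to the transformer's own subword tokens by comparing character offsets, so the transformer always receives its native wordpieces regardless of how the Doc was tokenized. Here is a minimal, simplified sketch of that offset-based alignment idea (the function and offsets are illustrative, not spaCy's actual implementation):

```python
def align(word_spans, subword_spans):
    """Map each word's (char_start, char_end) span to the indices of
    the subword spans that overlap it -- a toy version of the
    offset-based alignment a transformer pipeline performs internally."""
    alignment = []
    for w_start, w_end in word_spans:
        idxs = [
            i
            for i, (s_start, s_end) in enumerate(subword_spans)
            if s_start < w_end and s_end > w_start  # character spans overlap
        ]
        alignment.append(idxs)
    return alignment

# Example: one word "tokenizer" (chars 0-9) split by a transformer's
# subword tokenizer into "token" (0-5) and "izer" (5-9).
words = [(0, 9)]
subwords = [(0, 5), (5, 9)]
print(align(words, subwords))  # -> [[0, 1]]
```

Because the alignment is computed from character offsets, the word-level boundaries (e.g. from a Chinese segmenter) do not have to match the transformer's subword boundaries; the contextual vectors for the overlapping subwords are pooled back onto each word-level token.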