Hi there,
I'm new to Spark NLP and have a question about sparknlp.annotator.WordSegmenterModel. Applying it to English text seems impossible, since the pretrained models are only available for Chinese/Japanese/Korean: https://nlp.johnsnowlabs.com/models?task=Word+Segmentation
My texts can contain run-together words like "hellogoodday" and "howareyou", and I need to detect the separate words as tokens. Does anyone know how to get an English model for this annotator, or is there another annotator I should be using instead? If not, would it be feasible to raise this as a feature request?
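For context, here is a minimal sketch of the kind of segmentation I'm after, done outside Spark NLP with plain dictionary-based dynamic programming. The tiny hardcoded vocabulary is just a stand-in for a real English word list:

```python
# Minimal dictionary-based word segmentation sketch (pure Python, not
# Spark NLP). VOCAB is a placeholder for a real English word list.
VOCAB = {"hello", "good", "day", "how", "are", "you"}

def segment(text, vocab=VOCAB):
    """Split a concatenated string into known words via dynamic programming.

    best[i] holds one valid segmentation of text[:i], or None if none exists.
    """
    best = [None] * (len(text) + 1)
    best[0] = []  # empty prefix segments trivially
    for i in range(1, len(text) + 1):
        for j in range(i):
            # If text[:j] is segmentable and text[j:i] is a known word,
            # extend that segmentation.
            if best[j] is not None and text[j:i] in vocab:
                best[i] = best[j] + [text[j:i]]
                break
    return best[len(text)]

print(segment("hellogoodday"))  # ['hello', 'good', 'day']
print(segment("howareyou"))     # ['how', 'are', 'you']
```

Ideally I'd like this behavior from a pretrained annotator inside a Spark NLP pipeline rather than a hand-rolled post-processing step like this.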
Thanks in advance!