Language recognition error #159

Open
xujiaw opened this issue Nov 17, 2022 · 1 comment


xujiaw commented Nov 17, 2022

LanguageDetector detector = LanguageDetectorBuilder.fromLanguages(ENGLISH, CHINESE, THAI, VIETNAMESE).build();
SortedMap<Language, Double> languageDoubleSortedMap = detector.computeLanguageConfidenceValues("ี่มีประสิทธิภาพหลอดไฟพลังงานแสงอาทิตย์กลางแจ้งเซ็นเซอร์ตรวจจับการเคลื่อนไหวสวนกันน้ำ LED พลังงานแสงอาทิตย์โคมไฟสปอร์ตไลท์สำหรับ Garden เส้นทางถนนแบ็คดรอปเป่าลม Led Light");
System.out.println(languageDoubleSortedMap);

This prints: {ENGLISH=1.0, VIETNAMESE=0.5658177137374878}
I believe the text is Thai, yet the detector reports English and even Vietnamese, while Thai does not appear in the result at all.
Version: 1.2.2
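
The input above mixes Thai text with Latin-script words ("LED", "Garden", "Led Light"), so it can help to check which script dominates before looking at the detector's output. The following is a minimal diagnostic sketch using only the JDK's `Character.UnicodeScript`. The class and method names (`ScriptCheck`, `scriptCounts`, `dominantScript`) are hypothetical helpers for illustration and are not part of the library discussed in this issue.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper, not part of any language detection library.
final class ScriptCheck {

    // Counts code points per Unicode script, skipping characters that carry
    // no script information (spaces, digits, punctuation, unassigned).
    static Map<Character.UnicodeScript, Integer> scriptCounts(String text) {
        Map<Character.UnicodeScript, Integer> counts =
                new EnumMap<>(Character.UnicodeScript.class);
        text.codePoints().forEach(cp -> {
            Character.UnicodeScript script = Character.UnicodeScript.of(cp);
            if (script != Character.UnicodeScript.COMMON
                    && script != Character.UnicodeScript.INHERITED
                    && script != Character.UnicodeScript.UNKNOWN) {
                counts.merge(script, 1, Integer::sum);
            }
        });
        return counts;
    }

    // Returns the script with the most code points, or UNKNOWN if none were counted.
    static Character.UnicodeScript dominantScript(String text) {
        return scriptCounts(text).entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(Character.UnicodeScript.UNKNOWN);
    }

    public static void main(String[] args) {
        // Pass the text to inspect as the first command-line argument.
        String text = args.length > 0 ? args[0] : "";
        System.out.println(scriptCounts(text));
        System.out.println(dominantScript(text));
    }
}
```

If the dominant script of the string turns out to be `THAI`, that would support the expectation that Thai should at least appear among the reported confidence values. This check only describes the input; it does not explain how the detector arrives at its result.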

@pbcornelius

I'm not sure if it's helpful, but I also encountered some fairly straightforward misclassifications:

Good Luck Sarah ... "break a leg!"

TAGALOG=1.0, ENGLISH=0.9973366856575012, GERMAN=0.9332742094993591, ...

Thank you, Krista!

FINNISH=1.0, ENGLISH=0.9905743598937988, ESPERANTO=0.9733119606971741

@Evar So exciting!

TAGALOG=1.0, ENGLISH=0.9913086891174316, ESPERANTO=0.9132570028305054

To me, these do not seem like borderline cases (e.g., shared words across languages).
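
In the three results above, the top two confidence values differ by less than 0.01, so a caller may want to treat such results as near-ties instead of trusting the first entry. The sketch below computes that margin from a plain map of language name to confidence. The class name `ConfidenceMargin`, its methods, the `String` keys, and the 0.01 threshold are all assumptions made for illustration; none of them come from the library.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper operating on already computed confidence values.
final class ConfidenceMargin {

    // Difference between the highest and second-highest confidence.
    // With fewer than two entries there is nothing to compare against,
    // so the single value (or 0.0 for an empty map) is returned.
    static double topTwoMargin(Map<String, Double> confidences) {
        List<Double> sorted = confidences.values().stream()
                .sorted(Comparator.reverseOrder())
                .collect(Collectors.toList());
        if (sorted.isEmpty()) {
            return 0.0;
        }
        if (sorted.size() == 1) {
            return sorted.get(0);
        }
        return sorted.get(0) - sorted.get(1);
    }

    // True when the best and second-best candidates are closer than the threshold.
    static boolean isNearTie(Map<String, Double> confidences, double threshold) {
        return topTwoMargin(confidences) < threshold;
    }

    public static void main(String[] args) {
        // Values copied from the first example in this comment.
        Map<String, Double> breakALeg = Map.of(
                "TAGALOG", 1.0,
                "ENGLISH", 0.9973366856575012,
                "GERMAN", 0.9332742094993591);
        System.out.println(isNearTie(breakALeg, 0.01));
    }
}
```

A margin check like this only flags uncertain results. It does not correct the ranking, so English would still come second for these inputs.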
