Thanks for your convenient and fast Korean parsing system.
However, I found a wrong rule in the stemming system that does not follow Korean spelling rules.
For example,
"조사된" decomposes as "조사(하다) + 되 + ㄴ", so the correct stem should be either "조사+되다" or "조사되다". The stemming system instead returns "조사돼다" (which corresponds to "조사하다 + 되 + 어 + 다", with a redundant ending "어").
Other examples:
"대상이 되다" --> (Correct analysis) "대상 + 이 + 되다", so the output should be "대상+이+되다". (System says) "대상 + 이 + 돼다"
"판정됐다" --> (Correct analysis) "판정 + 되 + 었 + 다", so the stem should be "판정되다" or "판정하다". (System says) "판정돼다"
The stemming system seems to make this kind of spelling error only for the "되/돼" variations. Because "돼" is a contraction of "되 + 어", "되" is the base form ("어" is an ending morpheme that closes a word or sentence).
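To illustrate the rule, here is a minimal sketch of a post-processing step that rewrites a dictionary form ending in "돼다" to "되다". It is not the library's actual code; the function name `normalize_doeda` is hypothetical, and it assumes the stemmer returns the lemma as a plain string.

```python
def normalize_doeda(lemma: str) -> str:
    """Rewrite a dictionary form ending in '돼다' to '되다'.

    '돼' is a contraction of '되' + '어', so it should not appear
    directly before the dictionary-form ending '다'.
    Hypothetical helper for illustration only.
    """
    wrong, right = "돼다", "되다"
    if lemma.endswith(wrong):
        # Keep everything before the suffix and swap in the base form.
        return lemma[: -len(wrong)] + right
    return lemma


if __name__ == "__main__":
    # The three stems reported above, plus an already-correct form.
    for stem in ["조사돼다", "판정돼다", "돼다", "조사되다"]:
        print(stem, "->", normalize_doeda(stem))
```

A fix inside the stemming rules themselves would be preferable; this only shows what the expected output looks like for the cases reported here.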
Although this is a very common spelling mistake in Korean and is not really a development issue, I think the system should output the correct form.
FYI, here (Korean) is a semi-official article from the National Institute of the Korean Language.
Thanks again for the convenient system.
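I wrote the above in English in case some of the other developers are not Korean speakers.
To summarize, the stemming system has the "돼/되" distinction reversed.
The base form of "조사된" is "조사되다", yet it is output as "조사돼다";
"대상이 되다" is likewise output as "대상이 돼다";
and "판정됐다" is likewise output as "판정돼다".
This probably causes no major problems for data analysis or development, but I would suggest fixing it so that the output follows the spelling rules.
Thank you.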