Experimental Support for Dictionary Building Added
One feature fugashi hasn't had until now is the ability to build user dictionaries. This feature can be important for improving tokenization quality in many applications. This release adds fugashi-build-dict, a wrapper for MeCab's mecab-dict-index command. You can use it like this:
fugashi-build-dict -d [system-dic-dir] -u mydic.dic input.csv
If you're familiar with MeCab's user dictionary creation process, nothing has changed. This feature is experimental, so any feedback on usage or reports of errors you encounter would be appreciated. If you're not familiar with the dictionary process, just wait a bit: a guide should be released soon.
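Until the guide is out, here is a rough sketch of preparing input.csv and assembling the command above from Python. It rests on several assumptions: the example row follows the IPAdic-style layout used in MeCab's own documentation (surface, left context ID, right context ID, cost, then feature fields), and the right columns and IDs depend on the system dictionary you build against. The helper names write_entries and build_command and the path /path/to/system-dic are made up for illustration and are not part of fugashi.

```python
import csv
import shlex
import tempfile
from pathlib import Path

# Illustrative entry in an IPAdic-style layout (an assumption; UniDic and
# other dictionaries use different columns and context IDs).
EXAMPLE_ROWS = [
    ["工藤", "1223", "1223", "6058", "名詞", "固有名詞", "人名", "名",
     "*", "*", "くどう", "クドウ", "クドウ"],
]


def write_entries(csv_path, rows):
    """Write user dictionary entries as UTF-8 CSV, one entry per row."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


def build_command(system_dic_dir, user_dic, csv_path):
    """Assemble the fugashi-build-dict invocation from the release note."""
    return [
        "fugashi-build-dict",
        "-d", str(system_dic_dir),
        "-u", str(user_dic),
        str(csv_path),
    ]


# Demonstration in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "input.csv"
    write_entries(csv_path, EXAMPLE_ROWS)
    # "/path/to/system-dic" is a placeholder for your system dictionary.
    cmd = build_command("/path/to/system-dic", Path(tmp) / "mydic.dic", csv_path)
    # Print the command; run it yourself (or via subprocess.run) once the
    # placeholder path points at a real dictionary.
    print(shlex.join(cmd))
```

Once the dictionary is built, it should be loadable by passing MeCab's usual -u option through to fugashi, for example Tagger("-u mydic.dic"), since Tagger accepts a MeCab argument string.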