🚤 Vaporetto models

This repository provides word segmentation models for the fast tokenizer Vaporetto, along with the programs used to build each model.

Usage

Create a resources directory directly under the repository root, copy into it the *.xml files from the BCCWJ M-XML directory and lex_3_1.csv from UniDic 3.1.1, and then run build.sh in the models directory.
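The staging step above can be sketched as a small shell function. This is only an illustration: the function name stage_resources and its three path arguments are hypothetical, and the locations of your BCCWJ and UniDic copies will differ. Only the resources directory name, the *.xml pattern, and lex_3_1.csv come from the instructions above.

```shell
# Minimal sketch: stage the input data under <repo root>/resources.
# stage_resources is a hypothetical helper, not part of this repository.
#   $1: root of the vaporetto-models checkout
#   $2: BCCWJ M-XML directory containing the *.xml files (assumed path)
#   $3: directory containing lex_3_1.csv from UniDic 3.1.1 (assumed path)
stage_resources() {
    # Create the resources directory directly under the repository root.
    mkdir -p "$1/resources" || return 1
    # Copy every BCCWJ M-XML file; cp fails if no *.xml file is present.
    cp "$2"/*.xml "$1/resources/" || return 1
    # Copy the UniDic lexicon file.
    cp "$3/lex_3_1.csv" "$1/resources/" || return 1
}

# Example call (placeholder paths, left commented out on purpose):
#   stage_resources . /path/to/BCCWJ/M-XML /path/to/unidic-3.1.1
#   (cd models && ./build.sh)   # assumes build.sh is started from inside models/
```

The function stops at the first failing step, so a missing source directory is reported before build.sh is ever started.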

License

Licensed under either of

Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

Contribution

See the guidelines in CONTRIBUTING.md.
