Repositories list
13 repositories
vibrato (Public): 🎤 vibrato: Viterbi-based accelerated tokenizer
daachorse (Public): 🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure in Rust.
vaporetto (Public): 🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer
python-vaporetto (Public): 🛥 Vaporetto is a fast and lightweight pointwise prediction based tokenizer. This is a Python wrapper for Vaporetto.
python-vibrato (Public): Viterbi-based accelerated tokenizer (Python wrapper)
trie-match (Public): Fast match expression optimized for string comparison
vaporetto-models (Public)
crawdad (Public): 🦞 Rust library of natural language dictionaries using character-wise double-array tries.
include-bytes-zstd (Public)
guidelines (Public)
python-daachorse (Public): 🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure. (Python wrapper for daachorse)
rucrf (Public)
find-simdoc (Public): Finding all pairs of similar documents time- and memory-efficiently