
V2.0.0: Faster Grammar Compilation; Cleaner Codebase; Preparation For New Features

@daanzu daanzu released this 21 Mar 15:20
· 32 commits to master since this release

You can subscribe to announcements on GitHub (see the Watch panel above) or on Gitter (see instructions).

Donations are appreciated to encourage development.


Added

  • Native FST support, via direct wrapping of OpenFST, rather than the Python text-format implementation
    • Eliminates grammar (G) FST compilation step
  • Internalized many graph construction steps, via direct use of native Kaldi/OpenFST functions, rather than invoking separate CLI processes
    • Eliminates need for many temporary files (FSTs, .confs, etc) and pipes
  • Example usage for allowing mixing of free dictation with strict command phrases
  • Experimental support for "look ahead" graphs, as an alternative to full HCLG compilation
  • Experimental support for rescoring with CARPA LMs
  • Experimental support for rescoring with RNN LMs
  • Experimental support for "priming" the RNNLM with previous left context for each utterance

Changed

  • OpenBLAS is now the default linear algebra library (rather than Intel MKL) on Linux/macOS
    • Because it is open source and provides good performance on all hardware (including AMD)
    • Windows is more difficult in this respect; the switch there is planned for a later release
  • Default tmp_dir is now set to [model_dir]/cache.tmp
  • tmp_dir is now optional, and only needed if caching compiled FSTs (or for certain framework/option combinations)
  • File cache is now stored at [model_dir]/file_cache.json
  • Optimized adding many new words to the lexicon, across many different grammars, in one loading session: L_disambig.fst is now rebuilt only once, at the end.
  • External interfaces: Compiler.__init__(), decoding setup, etc.
  • Internal interfaces: wrappers, etc.
  • Major refactoring of C++ components, with a new inheritance hierarchy and configuration mechanism, making it easier to use and test features with and without "activity"
  • Many build changes
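
The new default locations listed above for tmp_dir and the file cache can be sketched as follows. This is only an illustration of the path layout described in these notes: resolve_cache_paths is a hypothetical helper, not part of the library's API, and the library's actual logic may differ.

```python
import os

def resolve_cache_paths(model_dir, tmp_dir=None):
    """Hypothetical helper: mirrors the defaults described in the notes above."""
    # tmp_dir is optional; when omitted it defaults to [model_dir]/cache.tmp
    if tmp_dir is None:
        tmp_dir = os.path.join(model_dir, "cache.tmp")
    # The file cache lives directly under the model directory
    file_cache = os.path.join(model_dir, "file_cache.json")
    return tmp_dir, file_cache

# Example with an assumed model directory name
tmp_dir, file_cache = resolve_cache_paths("kaldi_model")
print(tmp_dir)
print(file_cache)
```

Passing an explicit tmp_dir overrides only the temporary directory; in this sketch the file cache path always follows model_dir.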

Removed

  • Python 2.7 support: it may still work, but will not be a focus.
  • Google Cloud Speech-to-Text: removed as an unneeded dependency. Alternative dictation is still supported as an option, via a callback to an external provider.

Deprecated

  • Separate CLI Kaldi/OpenFST executables
  • Indirect AGF graph compilation (framework==agf-indirect)
  • Non-native FSTs
  • parsing_framework==text

Artifacts

  • Models are available here
  • kaldi-dragonfly-winpython: A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2. Just unzip and run!
  • kaldi-caster-winpython: A self-contained, portable, batteries-included (python & libraries & model) distribution of kaldi-active-grammar + dragonfly2 + caster. Just unzip and run!

If you have trouble downloading, try using wget --continue.
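
For environments without wget, the same idea (resuming a partial download instead of restarting it) can be sketched in Python with an HTTP Range request. This is a minimal sketch, not something shipped with the project: range_header and resume_download are hypothetical names, the URL is supplied by the caller, and it assumes the server supports byte ranges.

```python
import os
import urllib.request

def range_header(partial_path):
    """Build headers asking the server for only the bytes not yet on disk."""
    size = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    return {"Range": f"bytes={size}-"} if size > 0 else {}

def resume_download(url, dest_path, chunk_size=1 << 20):
    """Download url to dest_path, continuing from any partial file already there."""
    request = urllib.request.Request(url, headers=range_header(dest_path))
    # Note: if the file is already complete, the server may answer 416 and
    # urlopen will raise urllib.error.HTTPError; this sketch does not handle that.
    with urllib.request.urlopen(request) as response:
        # 206 means the Range request was honored, so append; otherwise start over
        mode = "ab" if response.status == 206 else "wb"
        with open(dest_path, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
```

Calling resume_download again after an interrupted transfer picks up where the partial file ends, provided the server honors the Range header.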