LinSpell

Fast Spelling correction & Approximate string search

The LinSpell spelling correction algorithm requires neither edit candidate generation (as Norvig's algorithm does) nor a specialized data structure (such as a BK-tree). In most cases LinSpell is faster and requires less memory than a BK-tree or Norvig's algorithm. LinSpell is language and character set independent.
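
To illustrate what a lookup without candidate generation or an index structure can look like, here is a minimal Python sketch of a linear scan over a plain word-to-frequency dictionary. It is an assumption-laden illustration, not the LinSpell implementation: the names `edit_distance` and `suggest`, the use of plain Levenshtein distance, and the ranking by distance and then frequency are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str, max_distance: int) -> int:
    """Levenshtein distance, or max_distance + 1 once the bound is exceeded."""
    # A length difference larger than the bound can never be bridged.
    if abs(len(a) - len(b)) > max_distance:
        return max_distance + 1
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        # The final distance is never smaller than the minimum of a row,
        # so the scan of this term can stop early.
        if min(current) > max_distance:
            return max_distance + 1
        previous = current
    return min(previous[-1], max_distance + 1)


def suggest(word: str, dictionary: Dict[str, int],
            max_distance: int = 2) -> List[Tuple[str, int, int]]:
    """Scan every entry; return (term, distance, count) tuples,
    closest first and most frequent first among equal distances."""
    results = []
    for term, count in dictionary.items():
        distance = edit_distance(word, term, max_distance)
        if distance <= max_distance:
            results.append((term, distance, count))
    results.sort(key=lambda r: (r[1], -r[2]))
    return results
```

With this sketch the only data structure is the dictionary itself, so memory use is bounded by the word list; the cost is that every lookup touches every entry, which is why the cheap length check and the early exit matter.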


Copyright (C) 2017 Wolf Garbe
Version: 1.0
Author: Wolf Garbe <[email protected]>
Maintainer: Wolf Garbe <[email protected]>
URL: https://github.com/wolfgarbe/linspell
Description:
https://seekstorm.com/blog/symspell-vs-bk-tree/
License:
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License, 
version 3.0 (LGPL-3.0) as published by the Free Software Foundation.
http://www.opensource.org/licenses/LGPL-3.0

Usage

single word + Enter: Display spelling suggestions
Enter without input: Terminate the program
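
The interaction described above can be sketched as a small loop. This is an illustrative Python sketch only; `run_console` and the `lookup` callback are hypothetical names, and the actual program's output format is not specified here.

```python
import sys
from typing import Callable, Iterable, List, TextIO


def run_console(lines: Iterable[str],
                lookup: Callable[[str], List[str]],
                out: TextIO = sys.stdout) -> None:
    """Print suggestions for each entered word; stop at the first empty line."""
    for line in lines:
        word = line.strip()
        if not word:
            # Enter without input terminates the program.
            break
        for suggestion in lookup(word):
            print(suggestion, file=out)
```

Passing `sys.stdin` as `lines` gives the interactive behavior; any function that maps a word to a list of suggestions can serve as `lookup`.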

Performance

[Figures: Benchmark; Benchmark 1]

Applications

  • Query correction (10–15% of queries contain misspelled terms),
  • Chatbots,
  • OCR post-processing,
  • Automated proofreading.

Frequency dictionary

The word frequency list was created by intersecting two word lists: through reciprocal filtering, only words that appear in both lists are kept. Additional filters were applied, and the resulting list was truncated to the ≈ 80,000 most frequent words.
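
The intersect-then-truncate step can be sketched as follows. This is a hypothetical Python illustration, not the script used to build the shipped dictionary: it assumes one list supplies word counts and the other is a plain vocabulary, and it omits the additional filters, which are not specified here. The function name `intersect_frequency_lists` is invented for this example.

```python
from typing import Dict, Set


def intersect_frequency_lists(frequencies: Dict[str, int],
                              vocabulary: Set[str],
                              max_words: int = 80_000) -> Dict[str, int]:
    """Keep words present in both inputs, then keep the most frequent ones."""
    # Reciprocal filter: a word survives only if both lists contain it.
    kept = {word: count for word, count in frequencies.items()
            if word in vocabulary}
    # Rank by descending count; ties are broken alphabetically for stability.
    ranked = sorted(kept.items(), key=lambda item: (-item[1], item[0]))
    return dict(ranked[:max_words])
```

Under these assumptions, the intersection removes frequent non-words from the counted list and unranked entries from the vocabulary in a single pass.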

Blog Posts: Algorithm, Benchmarks, Applications

SymSpell vs. BK-tree: 100x faster fuzzy string search & spell checking


LinSpell is contributed by SeekStorm - the high performance Search as a Service & search API