Skip to content

sholiday/cppBKTree

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

cppBKTree

By Stephen Holiday 2011

(Exception, Distance Algorithm by Anders Sewerin Johansen)

The code is under the Apache 2.0 license.

This a C++ implementation of a BKTree (Burkhard-Keller). Essentially it allows searching for an object in a metric space.

One common use is for fuzzy string matching, like in a spell checker. The search is performed by looking at the distance of the test with the current node and moving according to the distance of the test to the children.

This technique is faster than brute force as it does not need to look at every possible node in the space.

Implemented according to this post.

I also wrote a Python implementation.

Tools

Aside from the data structure itself, I wrote two tools to play with BKTrees.

Too make the tools just run: make tools

This assumes that you have the boost serialization libraries [installed and compiled]((http://www.boost.org/doc/libs/release/more/getting_started/index.html).

bkLoad

This is a utility for creating a BKTree from a text file and saving to disk. The input files should have one entry per line.

./bkLoad [from] [to]
./bkLoad /usr/share/dict/words wordTree.dat

bkSearch

This is a utility for loading a BKTree from a serialized file.

./bkSearch [serialized tree] [word] [max edxit distance]
./bkSearch bkTree.dat word 3

Comments/Pull Requests

If you think something could be improved, I'm happy to accept pull requests.

Releases

No releases published

Packages

No packages published

Languages