- What is this?
- When should I use this?
- Install
- Use
- API
- CLI
- Types
- Compatibility
- Related
- Contribute
- Security
- License
This package exposes a stemming algorithm. It takes a string (typically an English word) and reduces it to a shorter form (a stem). Stems of different words can then be compared to check whether the words are (likely) the same term.
You’re probably dealing with natural language and know you need this if you’re here!
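As a rough illustration of comparing stems, the sketch below groups words that reduce to the same stem. The helper `groupByStem` is hypothetical and not part of this package. It accepts any stemming function, so in real use you would pass `lancasterStemmer`. To stay runnable without installing anything, the demo uses `toyStem`, a deliberately crude stand-in that is not the Lancaster algorithm.

```javascript
// Hypothetical helper (not part of lancaster-stemmer): group words by stem.
// `stem` is any function that maps a word to its stem; with this package
// you would pass `lancasterStemmer`.
function groupByStem(words, stem) {
  const groups = new Map()
  for (const word of words) {
    const key = stem(word)
    const bucket = groups.get(key)
    if (bucket) {
      bucket.push(word)
    } else {
      groups.set(key, [word])
    }
  }
  return groups
}

// Stand-in stemmer for illustration only: NOT the Lancaster algorithm.
// It lowercases the word and strips one trailing "ing", "ed", or "s".
function toyStem(word) {
  return word.toLowerCase().replace(/(ing|ed|s)$/, '')
}

const groups = groupByStem(['Walks', 'walking', 'walked', 'talks'], toyStem)
console.log(groups.get('walk')) // => ['Walks', 'walking', 'walked']
console.log(groups.get('talk')) // => ['talks']
```

Taking the stemmer as a parameter keeps the grouping logic independent of which algorithm, or which style of it, produces the stems.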
This package is ESM only. In Node.js (version 16+), install with npm:
npm install lancaster-stemmer
In Deno with esm.sh:
import {lancasterStemmer} from 'https://esm.sh/lancaster-stemmer@2'
In browsers with esm.sh:
<script type="module">
import {lancasterStemmer} from 'https://esm.sh/lancaster-stemmer@2?bundle'
</script>
import {lancasterStemmer} from 'lancaster-stemmer'
console.log(lancasterStemmer('considerations')) // => 'consid'
console.log(lancasterStemmer('detestable')) // => 'detest'
console.log(lancasterStemmer('vileness')) // => 'vil'
console.log(lancasterStemmer('giggling')) // => 'giggl'
console.log(lancasterStemmer('anxious')) // => 'anxy'
// Case insensitive
console.log(lancasterStemmer('analytic') === lancasterStemmer('AnAlYtIc')) // => true
This package exports the identifier lancasterStemmer.
There is no default export.
Get the stem from a given value.
- value (string, required): value to stem
- options (Options, default: {}): configuration
Returns the stem for value (string).
Configuration (TypeScript type).
- style (Style, default: 'c'): style of algorithm
Style of algorithm (TypeScript type).
Implementations of the algorithm have differed in small ways over the years. The Algorithm Implementations page on the archived website lists four styles, in addition to the original paper.
The only difference currently implemented in this package is whether a final s is kept before stopping (paper) or dropped before stopping (c).
- 'c': rules from the ANSI C (Stark, 1994) and Perl (Taffet, 2001) implementations (compensation -> compen)
- 'paper': rules from the original paper (1990), and Pascal (Paice/Husk) and Java (O’Neill, 2000) implementations (compensation -> compens)
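To see how the two styles differ for a given word, a small comparison helper can be useful. The sketch below assumes only the signature documented above, `stem(value, options)` with `options.style` set to `'c'` or `'paper'`. The name `compareStyles` is hypothetical and not part of this package.

```javascript
// Hypothetical helper (not part of lancaster-stemmer): stem one word under
// both documented styles. `stem` must accept (value, options), as
// `lancasterStemmer` does according to the API description above.
function compareStyles(stem, word) {
  return {
    c: stem(word, {style: 'c'}),
    paper: stem(word, {style: 'paper'})
  }
}

// Usage with the real package (shown as a comment so that this block runs
// without the dependency installed):
//
//   import {lancasterStemmer} from 'lancaster-stemmer'
//   console.log(compareStyles(lancasterStemmer, 'compensation'))
//   // documented stems: 'compen' for style 'c', 'compens' for style 'paper'
```

Running this over a word list and keeping only the entries where `c` and `paper` differ is a quick way to check whether the choice of style matters for your data.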
Usage: lancaster-stemmer [options] <words...>
Lancaster stemming algorithm
Options:
-h, --help output usage information
-v, --version output version number
Usage:
# output stems
$ lancaster-stemmer considerations
consid
# output stems from stdin
$ echo "detestable vileness" | lancaster-stemmer
detest vil
This package is fully typed with TypeScript.
It exports the additional types Options and Style.
Projects maintained by the unified collective are compatible with maintained versions of Node.js.
When we cut a new major release, we drop support for unmaintained versions of Node.
This means we try to keep the current release line, lancaster-stemmer@^2, compatible with Node.js 12.
- stemmer: porter stemmer algorithm
- double-metaphone: double metaphone algorithm
- soundex-code: soundex algorithm
- dice-coefficient: sørensen–dice coefficient
- levenshtein-edit-distance: levenshtein edit distance
- syllable: syllable count of English words
Yes please! See How to Contribute to Open Source.
This package is safe.