Matcher on lower lemma #5630

This is more of a usage question, so let's keep this on SO. My answer is copied below for reference:

{'LOWER': {'LEMMA': 'education'}} isn't a valid pattern. Unless you turn on validation (see below), the Matcher silently discards ill-formed attributes, so in effect this pattern is treated like {}, which matches any token. That is why you get so many results.

You can use either

{'LOWER': 'education'}
{'LEMMA': 'education'}

but they can't be nested.
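As a minimal sketch of the flat form, the snippet below matches on `LOWER` only. It assumes a spaCy version where `Matcher.add` takes a list of patterns (v3, or v2.2.2 and later), and it uses `spacy.blank("en")` so no trained model is needed. The pattern name `EDUCATION` and the sample sentence are made up for illustration.

```python
import spacy
from spacy.matcher import Matcher

# Blank English pipeline: tokenizer only, no trained components required.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# One token, one flat attribute: compare the lowercased token text.
matcher.add("EDUCATION", [[{"LOWER": "education"}]])

# Illustrative sentence (not from the original question).
doc = nlp("Education matters, and higher EDUCATION costs money.")
matched = [doc[start:end].text for _, start, end in matcher(doc)]
print(matched)
```

A `LEMMA` pattern is added the same way, but it only matches if the pipeline actually assigns lemmas, which a blank pipeline may not do.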

Use Matcher(nlp.vocab, validate=True) for more thorough validation when writing patterns. (It's off by default because it makes adding patterns a lot slower.)
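A rough sketch of what validation looks like in practice, again assuming a blank English pipeline and the list-of-patterns form of `Matcher.add`. The exact exception class and message may differ between spaCy versions, so catching `ValueError` here is an assumption rather than the documented contract.

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")
# validate=True checks each pattern against the pattern schema when it is added.
strict_matcher = Matcher(nlp.vocab, validate=True)

# The nested pattern from the question, which is not well-formed.
bad_pattern = [{"LOWER": {"LEMMA": "education"}}]

try:
    strict_matcher.add("BAD", [bad_pattern])
    rejected = False
except ValueError as err:  # assumption: the validation error subclasses ValueError
    rejected = True
    print(f"Pattern rejected: {err}")
```

With validation on, the mistake surfaces when the pattern is added instead of showing up later as a flood of unexpected matches.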

Answer selected by ines
This discussion was converted from issue #5630 on December 11, 2020 00:17.