Retrieve tokens on the path to a matching #5

mpomarlan · 2019-07-26T13:20:37Z

It's often useful to know not only the leaf token selected by a pattern, but also some of the intermediate tokens on the path to it. Python's re package provides named groups for this purpose: they identify particular parts of a match that may be interesting on their own.

An example of what this might look like for grammaregex:
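For reference, here is a minimal sketch of named groups in Python's `re`. The regex and the input string are made up for illustration and have nothing to do with grammaregex itself.

```python
import re

# Each (?P<name>...) group labels one part of the match so it can be read back by name.
pattern = re.compile(r"(?P<verb>\w+) (?P<prep>from|in) (?P<rest>.+)")

m = pattern.search("graduated from the Wharton School")

# groupdict() maps every group name to the text it captured.
groups = m.groupdict()
print(groups)
```

The request below is for the same idea applied to tokens along a dependency path instead of substrings.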

example sentence: "Mrs. Robinson graduated from the Wharton School of the University of Pennsylvania in 1980."
pattern: ?PVBD/prep/?PIN/pobj/?P*
matchings: [{"root": "graduated", "prep": "from", "where": "School"}, {"root": "graduated", "prep": "in", "where": "1980"}]

An example implementation of this behavior can be found in this branch. It is backwards compatible: if no ?P<> appears in the pattern, tokens are returned as before.

mpomarlan · 2019-07-26T13:49:13Z

I meant to say:

pattern: ?P<root>VBD/prep/?P<prep>IN/pobj/?P<where>*
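To make the proposal concrete, here is a rough, self-contained sketch of how named captures along a path could be collected. It is not the code from the branch and does not use the grammaregex API. The `Tok` class, `parse`, and `match_paths` are hypothetical names, the dependency tree is built by hand rather than produced by a parser, and the sketch assumes a pattern alternates token tags and dependency labels, starting and ending with a token.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Tok:
    """Hypothetical stand-in for a parsed token (not a grammaregex or spaCy type)."""
    text: str
    tag: str                  # POS tag of this token
    dep: str = "ROOT"         # dependency label linking this token to its head
    children: list = field(default_factory=list)

# Optional "?P<name>" prefix followed by the tag or dependency label to match.
STEP = re.compile(r"^(?:\?P<(?P<name>\w+)>)?(?P<spec>.+)$")

def parse(pattern):
    """Split 'tag/dep/tag/...' into (capture_name_or_None, spec) steps."""
    steps = []
    for part in pattern.split("/"):
        m = STEP.match(part)
        steps.append((m.group("name"), m.group("spec")))
    if len(steps) % 2 == 0:
        raise ValueError("pattern must start and end with a token step")
    return steps

def match_paths(tok, steps, captured=None):
    """Yield one dict of named captures per path that matches steps from tok."""
    captured = dict(captured or {})
    (name, tag), rest = steps[0], steps[1:]
    if tag != "*" and tag != tok.tag:
        return
    if name:
        captured[name] = tok.text
    if not rest:
        yield captured
        return
    # The next step is a dependency label; names on edge steps are ignored here.
    (_, dep), rest = rest[0], rest[1:]
    for child in tok.children:
        if dep == "*" or child.dep == dep:
            yield from match_paths(child, rest, captured)

# Hand-built fragment of the example sentence's dependency tree (assumed tags and labels).
root = Tok("graduated", "VBD", children=[
    Tok("Robinson", "NNP", "nsubj"),
    Tok("from", "IN", "prep", [Tok("School", "NNP", "pobj")]),
    Tok("in", "IN", "prep", [Tok("1980", "CD", "pobj")]),
])

matches = list(match_paths(root, parse("?P<root>VBD/prep/?P<prep>IN/pobj/?P<where>*")))
print(matches)
```

On this toy tree the sketch yields the two matchings listed in the first comment. It leaves out the fallback to plain token results when the pattern contains no named steps.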
