Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

interfaces/prompting/patterns: add checks for duplicate pattern variants #14319

Conversation

bboozzoo
Copy link
Contributor

@bboozzoo bboozzoo commented Aug 7, 2024

Add a specific unit for parsing and then rendering a pattern which produces many identical variants.

Thanks for helping us make a better snapd!
Have you signed the license agreement and read the contribution guide?

@bboozzoo bboozzoo added Simple 😃 A small PR which can be reviewed quickly Skip spread Indicate that spread job should not run labels Aug 7, 2024
Copy link
Contributor

@zyga zyga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay optimization pass :-)

Copy link
Member

@olivercalder olivercalder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the options in the simple case end up being reduced by the optimization pass. But if there are non-trivial groups, this might not be possible.

Comment on lines 99 to 131
// all variants are the exact same path
pattern := "/{fo{obar},foo{b{ar,ar},bar,{ba}r}}"
scanned, err := scan(pattern)
c.Assert(err, IsNil)
parsed, err := parse(scanned)
c.Assert(err, IsNil)

expectedVariants := []string{
"/foobar",
}

variants := make([]string, 0, len(expectedVariants))
renderAllVariants(parsed, func(index int, variant PatternVariant) {
variants = append(variants, variant.String())
})

c.Check(variants, DeepEquals, expectedVariants)
c.Check(variants, HasLen, parsed.NumVariants())
c.Check(variants, HasLen, 1)
}
Copy link
Member

@olivercalder olivercalder Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// all variants are the exact same path
pattern := "/{fo{obar},foo{b{ar,ar},bar,{ba}r}}"
scanned, err := scan(pattern)
c.Assert(err, IsNil)
parsed, err := parse(scanned)
c.Assert(err, IsNil)
expectedVariants := []string{
"/foobar",
}
variants := make([]string, 0, len(expectedVariants))
renderAllVariants(parsed, func(index int, variant PatternVariant) {
variants = append(variants, variant.String())
})
c.Check(variants, DeepEquals, expectedVariants)
c.Check(variants, HasLen, parsed.NumVariants())
c.Check(variants, HasLen, 1)
}
// all variants are the exact same path
for _, testCase := range []struct {
pattern string
expectedVariants []string
}{
{
"/{fo{obar},foo{b{ar,ar},bar,{ba}r}}",
[]string{"/foobar"},
},
{
"/{{foo/{bar,baz},123},{123,foo{/bar,/baz}}}",
[]string{
"/foo/bar",
"/foo/baz",
"/123",
"/123",
"/foo/bar",
"/foo/baz",
},
},
} {
scanned, err := scan(testCase.pattern)
c.Assert(err, IsNil)
parsed, err := parse(scanned)
c.Assert(err, IsNil)
variants := make([]string, 0, len(testCase.expectedVariants))
renderAllVariants(parsed, func(index int, variant PatternVariant) {
variants = append(variants, variant.String())
})
c.Check(variants, DeepEquals, testCase.expectedVariants)
c.Check(variants, HasLen, parsed.NumVariants())
}
}

Copy link
Member

@olivercalder olivercalder Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might need to sort the expected variants and received variants as well, unless we figure out what the ordering ought to be. Or perhaps it's correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I've tested this locally, the new suggestion is what actually happens, we end up with duplicate patterns. I will look into whether this is okay, but if it's not, we'll want to add a uniqueness check so we don't self-conflict by accident.

@ernestl
Copy link
Collaborator

ernestl commented Sep 19, 2024

@olivercalder, what next?

@ernestl ernestl modified the milestones: 2.66, 2.67 Sep 19, 2024
bboozzoo and others added 4 commits September 19, 2024 23:39
Add a specific unit for parting and then rendering a pattern which produces many
identical variants.

Signed-off-by: Maciej Borzecki <[email protected]>
There's not a great way to identify whether there are duplicate patterns
when they are not caught by the optimization phase. Rather than omit
duplicate variants and thus have `NumVariants` not reflect the number of
patterns which will be returned by `RenderAllVariants`, instead allow
duplicate patterns in the output.

Signed-off-by: Oliver Calder <[email protected]>
@olivercalder olivercalder force-pushed the bboozzoo/prompting-patterns-rule-optimization-test branch from 681a235 to 09a6c16 Compare September 20, 2024 05:01
@olivercalder
Copy link
Member

So the optimization pass doesn't (can't?) catch all duplicate patterns, and if we add a deduplication pass in RenderAllVariants, then NumVariants doesn't match. So I'd rather add deduplication at the (1) call site, which is in requestrules: if a pattern variant conflicts with an existing variant but that variant belongs to the rule we're trying to add, ignore it.

I've made these changes, and rebased on master. Ready for re-review @bboozzoo. This is a small change, but it does fix a bug, so ideally it would be nice to land this in 2.66. CC @ernestl

Copy link

codecov bot commented Sep 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.88%. Comparing base (04d0ab0) to head (17711d9).
Report is 69 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #14319      +/-   ##
==========================================
+ Coverage   78.83%   78.88%   +0.04%     
==========================================
  Files        1078     1082       +4     
  Lines      145096   145954     +858     
==========================================
+ Hits       114389   115130     +741     
- Misses      23546    23629      +83     
- Partials     7161     7195      +34     
Flag Coverage Δ
unittests 78.88% <100.00%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@zyga
Copy link
Contributor

zyga commented Sep 20, 2024

@olivercalder can we sync on this? I'd like to understand how to progress.

You are right that the optimization pass doesn't reduce all the duplicates today, but I believe that is not a fundamental limitation.

It looks like what we need is recursive reduction. I will look after planning, if you want we can stay and chat.

Copy link
Contributor

@zyga zyga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a pretty long call about details and I think the following are true:

  • There was never a promise that the set of strings produced by the pattern renderer contains no duplicates
  • The extra precautions added to requestrules are correct
  • We can improve the optimizer for patterns (separately) reducing the complexity of the tree at the cost of making individual strings longer.

Here's a link to what we discussed: #14526

@olivercalder
Copy link
Member

Similar safety checks are required when removing a rule from the tree, so we don't treat absence of a variant we already rendered and removed from the tree as internal inconsistency.

Copy link
Member

@olivercalder olivercalder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, we now have checks in place in both places we call RenderAllVariants so we don't throw an error. This was a legitimate bug, which this PR fixes. I think this PR is good to land, pending approval from @bboozzoo.

The further optimizations discussed in #14526 can be added in a follow-up, if desired.

@olivercalder olivercalder reopened this Sep 25, 2024
@olivercalder olivercalder changed the title interfaces/prompting/patterns: add unit test for conflicting patterns interfaces/prompting/patterns: add checks for duplicate pattern variants Sep 25, 2024
@olivercalder olivercalder modified the milestones: 2.67, 2.66 Sep 26, 2024
@@ -461,6 +461,9 @@ func (rdb *RuleDB) addRulePermissionToTree(rule *Rule, permission string) []prom
switch {
case !exists:
newVariantEntries[variantStr] = newEntry
case conflictingVariantEntry.RuleID == rule.ID:
// Rule has duplicate variant, so ignore it
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why the new test TestAddRuleDuplicateVariants isn't covering this?? Investigating...

@olivercalder olivercalder removed the Skip spread Indicate that spread job should not run label Sep 27, 2024
@olivercalder olivercalder reopened this Sep 27, 2024
@olivercalder olivercalder modified the milestones: 2.66, 2.67 Sep 30, 2024
Copy link
Member

@olivercalder olivercalder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh I see. It's that we're always looking at the existing permVariants.VariantEntries, but we're only adding entries to the newVariantEntries variable, which is used to update the real permVariants.VariantEntries only after it has been determined that there's no conflict. So we only ever end up potentially overwritng an entry in newVariantEntries with an identical entry which was previously added, which isn't a problem. So this switch case is unnecessary, and should be replaced with a comment in case the logic is ever changed to iteratively add new entries to the real entries map (which is unlikely). Also, #14538 proposes a rewrite of the addVariant closure, so the changes there should be examined to ensure that duplicate variants in a rule doesn't self-conflict.

Most recent commit removes the superfluous check here. This now looks all good to me and ready to merge if @bboozzoo agrees.

Copy link
Contributor Author

@bboozzoo bboozzoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM

(note I can't self-approve)

@olivercalder olivercalder merged commit 0a99a8d into canonical:master Oct 1, 2024
54 of 56 checks passed
@olivercalder olivercalder modified the milestones: 2.67, 2.66 Oct 2, 2024
@bboozzoo bboozzoo deleted the bboozzoo/prompting-patterns-rule-optimization-test branch October 3, 2024 08:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Simple 😃 A small PR which can be reviewed quickly
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants