
Towards Handling Big LHS #361

Open
wants to merge 2 commits into base: main
Conversation

@zhengyang92 (Contributor) commented Oct 3, 2018

Always feed getGuesses() with the Vars in the LHS: the original findCands() for generating synthesis candidates is called with MaxCand = 15, which may miss input variables declared at the beginning of a large program. This patch fixes that.

Add a switch for the big query: in some cases the huge big query may cause Souper to abort.

Enable the ctpop test case.

@zhengyang92 changed the title from Handle Big LHS to Towards Handling Big LHS on Oct 3, 2018
@zhengyang92 (Author)

Resolving the merge conflicts.

findCands(LHS, Inputs, /*WidthMustMatch=*/false, /*FilterVars=*/false, MaxLHSCands);
std::vector<Inst *> Vars;
findVars(LHS, Vars);

Collaborator

we should talk about this part. why do you think it's a good idea? do you have data supporting that it does a better job?

@zhengyang92 (Author) commented Oct 3, 2018

I did this because findCands() stops when it reaches MaxLHSCands (set to 15 by default). However, this can miss input variables declared at the top of the program. Due to this limitation, the synthesizer cannot give the correct ctpop result, since the input %x is not harvested as a synthesis component.

Exhaustive synthesis already calls findVars() after doing the big query, so hoisting this call up does not have much impact on compilation performance, compared with the potential benefit of getting better synthesis results.

I admit this approach is naive: it basically runs the traversal on the LHS twice. I can write a better version of findCands() that runs only once.

@zhengyang92 (Author) commented Oct 3, 2018

This code basically guarantees that all the program inputs are collected as components for synthesis.

Collaborator

I agree that tuning MaxLHSCands would be useful, but just harvesting all the vars doesn't sound like the right solution. For one thing, what if there are 500 vars? Then synthesizing just one select instruction is going to involve evaluating 500x500x500 guesses.

Collaborator

the idea behind the hard bound is that it prevents these degenerate cases, but your patch will just open us up to them again

@zhengyang92 (Author) commented Oct 3, 2018

Okay, this makes sense. But instead of stopping at a hard bound while doing findCands(), can we traverse the whole tree, rank the candidates in descending order of their "benefit", and hard-bound the resulting list? A "benefit" tagging algorithm is already implemented in findCands().

@zhengyang92 (Author) commented Oct 3, 2018

The current "benefit" tagging algorithm follows a simple rule: the deeper the traversal goes, the higher the benefit a candidate gets. We could further customize this benefit function to fit our needs. Say some heuristic can tell that certain kinds of candidates are more likely to be used as components in synthesis; then we could increase the "benefit" of those candidates to reduce synthesis time.

Collaborator

well, the benefit is a property of the RHS cost vs. LHS cost, not a property of the choice of inputs, which is what we're talking about here.

my observation is that most rewrites are going to occur at the bottom of a souper LHS, which is why I wrote the code to get inputs using DFS on the LHS.

if we pick the ones likely to lead to the highest benefit, you're saying to pick inputs from the top of the LHS. I'm not opposed to this, but I think it is a mistake to make the change before we have data showing that the change is a good one.

Collaborator

anyway I'm saying:

  • this is really two patches, please split them
  • for each patch, I want you to justify it using data instead of making guesses about what will work best

@zhengyang92 (Author)

Sure, I will split this PR and collect data.

@@ -32,6 +32,9 @@ namespace {
"The larger the number is, the more fine-grained debug "
"information will be printed"),
cl::init(0));
static cl::opt<bool> NoBigQuery("souper-exhaustive-synthesis-no-big-query",
Collaborator

let's get some data about this also -- maybe run synthesis with and without it and see how it changes the CPU time used

@regehr (Collaborator) commented Oct 3, 2018

I think you should separate the bigquery part of this PR from the changes to how LHSs are chosen

@regehr (Collaborator) commented Oct 3, 2018

for bigquery, we should choose the default to be the one that runs faster -- but to make that decision we need data
