[WIP] Prioritize nodes with more waiting parents when adding new candidates #4155

Open
wants to merge 1 commit into
base: master
from

Conversation

acmorrow
Contributor

@acmorrow acmorrow commented May 20, 2022

Contributor Checklist:

  • I have created a new test or updated the unit tests to cover the new/changed functionality.
  • I have updated CHANGES.txt (and read the README.rst)
  • I have updated the appropriate documentation

 for p, subtract in parents.items():
     p.ref_count = p.ref_count - subtract
     if T: T.write(self.trace_message('Task.postprocess()',
                                      p,
                                      'adjusted parent ref count'))
     if p.ref_count == 0:
-        self.tm.candidates.append(p)
+        new_candidates.append(p)
+self.tm.candidates.extend(sorted(new_candidates, key = lambda c: len(c.waiting_parents)))
Contributor


Wouldn't it be better to append, then sort the whole array by # of parents?

Contributor Author


The candidates list can be very long, and I don't think I'd want to pay to re-sort it each time we complete a node. The goal of this change is just to ensure that when post-processing a node unblocks several others, the newly unblocked node that will itself unblock the most work is staged at the top of the candidates stack.
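A minimal sketch of that staging behaviour, for illustration only. `Node` and `stage_new_candidates` are hypothetical stand-ins, not SCons classes or functions, and the sketch assumes the candidates list is consumed from the end with `pop()`, as the "stack" wording implies:

```python
class Node:
    """Hypothetical stand-in for an SCons node; only the fields used here."""
    def __init__(self, name, waiting_parents=()):
        self.name = name
        self.waiting_parents = set(waiting_parents)


def stage_new_candidates(candidates, new_candidates):
    """Sort only the newly unblocked batch, then push it onto the stack.

    Ascending order means the node with the most waiting parents lands
    last, i.e. on top of a list that is consumed with pop().
    """
    candidates.extend(sorted(new_candidates, key=lambda c: len(c.waiting_parents)))


# Existing candidates keep whatever order they already had.
candidates = [Node("old1", "abcd"), Node("old2")]
# Newly unblocked batch: x has 1 waiting parent, y has 3, z has none.
batch = [Node("x", "a"), Node("y", "abc"), Node("z")]
stage_new_candidates(candidates, batch)

order = [n.name for n in candidates]   # bottom -> top of the stack
next_up = candidates.pop().name        # node that would be evaluated next
```

Only the batch is sorted, so the cost depends on the batch size rather than on the length of the whole list. The trade-off is visible in the example: `old1` has more waiting parents than anything in the batch, but it stays buried below the new entries.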

We could investigate whether a more sophisticated data structure could be used here, but I'd want to do that as a separate PR. At minimum, it would be important to consider what effect that might have on Random. The current approach implements Random by randomizing the order in which things are added to the candidates list, but if that list were something like a priority queue where "number of things unblocked" is the priority, insertion order would no longer matter. There is also the issue that the "number of things unblocked" changes dynamically, since we may not yet have evaluated all the candidates that feed into that metric.
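To make those two concerns concrete, here is a rough sketch using Python's `heapq`. The helper names (`push`, `pop_all`) and the `counts` dictionary are illustrative assumptions, not SCons API, and this is not a proposal for how the Taskmaster should be structured:

```python
import heapq
import itertools
import random

_counter = itertools.count()  # tie-breaker so equal priorities never compare names


def push(heap, name, waiting_parents):
    # heapq is a min-heap, so negate the count to pop the largest first.
    heapq.heappush(heap, (-waiting_parents, next(_counter), name))


def pop_all(heap):
    out = []
    while heap:
        out.append(heapq.heappop(heap)[2])
    return out


counts = {"a": 3, "b": 1, "c": 2}  # hypothetical waiting-parent counts

# 1) Insertion order stops mattering: a shuffled insert pops the same way
#    whenever the priorities are distinct.
names = list(counts)
random.shuffle(names)
heap = []
for n in names:
    push(heap, n, counts[n])
shuffled_order = pop_all(heap)

# 2) Priorities go stale: the key is captured at push time, so a later
#    change to the count is not reflected in the queue.
heap = []
for n in ("a", "b", "c"):
    push(heap, n, counts[n])
counts["b"] = 5                  # b gains waiting parents after being queued
stale_order = pop_all(heap)      # b is still popped last
```

In this sketch, ties would still fall back to insertion order through the counter, so a randomized build order would only survive among nodes with equal counts. Handling the stale-priority case would need something like re-pushing a node when its count changes and discarding the outdated entry, which is part of why this looks like separate work.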

@acmorrow
Contributor Author

Let's hold off on merging this. I do think there is something to be done along these lines, but 1) I agree that this is at best a partial implementation because it doesn't globally prioritize, and 2) I haven't been able to demonstrate the increased build throughput that I hoped it would provide.

@mwichmann mwichmann changed the title from "Prioritize nodes with more waiting parents when adding new candidates" to "[WIP] Prioritize nodes with more waiting parents when adding new candidates" on Oct 24, 2022
@mwichmann
Collaborator

I've added a WIP to the title to that end

@bdbaddog
Contributor

@acmorrow - is this PR still valid/useful/updatable with other changes already merged?

@acmorrow
Contributor Author

I still think there is something to this idea, since, given the option, it always seems better to act on nodes with lots of waiting parents preferentially, but without a meaningful performance testing environment for SCons it is hard to show whether it actually improves throughput.

3 participants