Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #12235 performance regression (hang) in 2.13dev #5715

Merged
merged 14 commits into from
Dec 15, 2023

Conversation

chrchr-github
Copy link
Collaborator

No description provided.

@chrchr-github
Copy link
Collaborator Author

I'm not quite sure about the fix. There might also be something else that's going wrong.

@chrchr-github chrchr-github marked this pull request as draft December 1, 2023 17:39
@chrchr-github
Copy link
Collaborator Author

We have this in findVariableChanged():

    for (Token *tok = start; tok != end; tok = tok->next()) {
        if (isExpressionChangedAt(getExprTok, tok, indirect, exprid, globalvar, settings, cpp, depth))
            return tok;
    }
    return nullptr;

It seems that we (incorrectly?) returned true from isExpressionChanged() for [, whereas we now iterate over the whole init list, where we then have to find the beginning again for every token.
Lower down in the call stack is bifurcate(), which is looking for a change to an instance variable in the global scope. That seems dubious as well. @pfultz2

@chrchr-github
Copy link
Collaborator Author

When getExprTok() returns nullptr (which it does for any Token not in a function scope), we assume an alias and call the expensive isVariableChanged().
I guess at least commas and literals can never alias anything? Or should this be fixed elsewhere?

@chrchr-github
Copy link
Collaborator Author

The isVariableChanged() test used to fail before 77bfec4.

@firewave
Copy link
Collaborator

@danmar @pfultz2 could you please chime in on this? I consider this a release blocker.

@@ -152,6 +152,45 @@ def test_progress_j(tmpdir):
assert stdout == "Checking {} ...\n".format(test_file)
assert stderr == ""

@pytest.mark.timeout(10)
def test_slow_initlist_varchanged(tmpdir):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the test-performance.py for the performance tests.

@danmar
Copy link
Owner

danmar commented Dec 14, 2023

looks good to me.

lib/astutils.cpp Outdated
@@ -2806,7 +2806,8 @@ Token* findVariableChanged(Token *start, const Token *end, int indirect, const n
if (depth < 0)
return start;
auto getExprTok = memoize([&] {
return findExpression(start, exprid);
const auto* e = findExpression(start, exprid);
return e ? e : start;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't return start. It supposed to return the token that has that exprid or nullptr if it can't be found.

I dont really understand how this would fix a hang though as you are still calling findExpression. As a perf improvement we could probably skip searching of a variable changed if its already a unique expression id.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, I see your other comment on why you did this, but it still shouldn't return start as that is wrong.

@pfultz2
Copy link
Contributor

pfultz2 commented Dec 14, 2023

I guess at least commas and literals can never alias anything? Or should this be fixed elsewhere?

Yea we could skip those in isExpressionChangedAt, something like:

static bool isExpressionChangedAt(const F& getExprTok,
                                  const Token* tok,
                                  int indirect,
                                  const nonneg int exprid,
                                  bool globalvar,
                                  const Settings* settings,
                                  bool cpp,
                                  int depth)
{
    if (depth < 0)
        return true;
    if (tok->isLiteral() || tok->isKeyword() || Token::Match(tok, ",|;|:|%type%"))
        return false;
    if (tok->exprId() != exprid) {
        ...
    }
    return (isVariableChanged(tok, indirect, settings, cpp, depth));
}

This could probably be extended to skip more things such as variable declarations, case labels or control flow parenthesis(ie Token::Match(tok->previous(), "%name% (") || tok->isControlFlowKeyword()).

@chrchr-github chrchr-github marked this pull request as ready for review December 14, 2023 21:25
@chrchr-github
Copy link
Collaborator Author

I guess at least commas and literals can never alias anything? Or should this be fixed elsewhere?

Yea we could skip those in isExpressionChangedAt, something like:

    ...
    if (tok->isLiteral() || tok->isKeyword() || Token::Match(tok, ",|;|:|%type%"))
        return false;

Looks like that was enough to pass the test. %type% causes test failures related to global variables. Maybe isStdType() would work? And what about a huge init list like { 1 + 1, 2 + 2, 3 + 3, ...}?
It might still be worth looking at bifurcate() and the memoize() function. It seems once we have found a nullptr, we are stuck with that and assume an alias(?).

@pfultz2
Copy link
Contributor

pfultz2 commented Dec 15, 2023

It might still be worth looking at bifurcate() and the memoize() function. It seems once we have found a nullptr, we are stuck with that and assume an alias(?).

We probably need to use findExpressionChanged which will take a token so we dont need to use this memoized function to find a token for the expression. We can probably update bifurcate() to pass var->nameToken() to it. There may be other places where isVariableChanged or findVariableChanged could be updated to use findExpressionChanged.

Another thing, is to call isVariableChanged in the symboldatabase and set flags if it changed(based on indirect 0,1,2) and if its inconclusive. Then calls to check if it mutates the token at that point would be very fast.

@chrchr-github
Copy link
Collaborator Author

Running the test with #define ROW 1 + 1 , 2 + 2 , 3 + 3 , 4 + 4 , 5 + 5 , 6 + 6 , 7 + 7 , 8 + 8 , still results in a hang.

@chrchr-github chrchr-github merged commit 02fed7a into danmar:main Dec 15, 2023
68 checks passed
@chrchr-github chrchr-github deleted the chr_Fix12235 branch December 15, 2023 10:01
@firewave
Copy link
Collaborator

Great work, everybody!

@danmar I know a release is imminent but we should let this soak a few days so we get updated daca results and can see if there is something else left in terms of performance regressions. I already started a run to update the known regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants