Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Yul: Avoid expensive operations in functions called very often #15574

Merged
merged 2 commits into from
Nov 20, 2024

Conversation

blishko
Copy link
Contributor

@blishko blishko commented Nov 15, 2024

Two changes are proposed here:

  1. Do not include expensive assertion check in the equality operator.
  2. Avoid expansive regex match (std::regex) whenever possible.

Add 1: The equality comparison is called very often in CommonSubexpressionEliminator, presumably because it is the comparison operator used for the map CommonSubexpressionEliminator::m_replacementCandidates. The method validLiteral is nontrivial and should not be called (twice!) on every call to the equality comparison.
It might be desirable to re-introduce the check in a more suitable place.

Add 2: Method Parser::fetchDebugDataFromComment was matching against a regex, but most of the time it was trying to match against an empty string. std::regex is known to be extremely slow. This would not be such a problem if this method was not called after every parsed token, see Parser::advance.
I believe even better solution would be to move the call to fetchDebugDataFromComment to a more appropriate place; my intuition says it is currently misplaced.

@blishko
Copy link
Contributor Author

blishko commented Nov 15, 2024

Benchmarking on my local laptop shows consistent improvement on external benchmarks:

Before:

File Pipeline Time Memory (peak) Exit code
openzeppelin-5.0.2 ir 14 s 465 MiB 0
openzeppelin-4.9.0 ir 18 s 573 MiB 0
liquity-2024-10-30 ir 19 s 802 MiB 0
openzeppelin-4.7.0 ir 26 s 592 MiB 0
openzeppelin-4.8.0 ir 29 s 692 MiB 0
uniswap-v4-2024-06-06 ir 53 s 1533 MiB 0
eigenlayer-0.3.0 ir 221 s 3059 MiB 0
sablier-v2-1.2.0 ir 627 s 6192 MiB 0

After:

File Pipeline Time Memory (peak) Exit code
openzeppelin-5.0.2 ir 14 s 498 MiB 0
openzeppelin-4.9.0 ir 18 s 607 MiB 0
liquity-2024-10-30 ir 18 s 812 MiB 0
openzeppelin-4.7.0 ir 24 s 595 MiB 0
openzeppelin-4.8.0 ir 28 s 715 MiB 0
uniswap-v4-2024-06-06 ir 50 s 1479 MiB 0
eigenlayer-0.3.0 ir 211 s 3167 MiB 0
sablier-v2-1.2.0 ir 592 s 6312 MiB 0

@blishko blishko marked this pull request as ready for review November 15, 2024 02:51
@blishko blishko force-pushed the optimize-optimizator branch from 3c8a332 to 846d615 Compare November 15, 2024 08:08
Copy link
Collaborator

@matheusaaguiar matheusaaguiar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Add 2 is fine and we could investigate your suggestion of relocating it to a better place.
I am not sure about the changes related to Add 1. Do you have the benchmarks with only Add 2?

Comment on lines +67 to +68
assert(validLiteral(_lhs));
assert(validLiteral(_rhs));
Copy link
Collaborator

@matheusaaguiar matheusaaguiar Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really significant? validLiteral is still being called twice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd myself just have removed these entirely. Any assertion that we deem ok to just do non-release we might as well not do at all in our context. But whatever, no need to keep this waiting.

Copy link
Member

@cameel cameel Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any assertion that we deem ok to just do non-release we might as well not do at all in our context.

I don't agree. The standard assert macro is perfectly fine as a baseline for sanity checks and even when disabled still adds value by documenting the assumptions. Having asserts checked in release builds is definitely nice to have and saved us many times, but when they're too costly I'd still rather fall back to using assert than remove them completely.

The method validLiteral is nontrivial and should not be called (twice!) on every call to the equality comparison.
It might be desirable to re-introduce the check in a more suitable place.

We're asserting validity here because == does not compare the whole LiteralValue. This is fine for valid literals but not in general and I don't think that's obvious.

A better place for these checks would be the implementation of ==, but that would introduce a circular dependency. And would not make it less costly anyway.

@blishko
Copy link
Contributor Author

blishko commented Nov 20, 2024

I don't have benchmarking results for the changes separately, although I could compute them if necessary.
Regarding the first point: We discussed this with @cameel. assert is called only in Debug mode. In Release mode, the macro results in no-op, so the calls to validLiteral disappear. yulAssert is always called. So with the proposed change we still get the sanity checks in Debug mode, but in Release mode, that is, in production, the compiler will no longer suffer from the performance penalty.

The equality comparison is called very often in
CommonSubexpressionEliminator, presumably because it is the comparison
operator used for the map
`CommonSubexpressionEliminator::m_replacementCandidates`.
The method `validLiteral` is nontrivial and should not be called
(twice!) on every call to the equality comparison.
@blishko blishko force-pushed the optimize-optimizator branch from 846d615 to 5150015 Compare November 20, 2024 16:30
@ekpyron ekpyron merged commit 4e954a3 into develop Nov 20, 2024
71 of 72 checks passed
@ekpyron ekpyron deleted the optimize-optimizator branch November 20, 2024 17:31
@cameel
Copy link
Member

cameel commented Nov 21, 2024

I am not sure about the changes related to Add 1. Do you have the benchmarks with only Add 2?

This is what it looks like for me:

IR

File Time (before) Time (commit 1) Time (after)
uniswap-v4-2022-06-16 17 s 19 s 18 s
openzeppelin-5.0.2 28 s 27 s 28 s
openzeppelin-4.9.0 36 s 35 s 36 s
liquity-2024-10-30 39 s 38 s 43 s
openzeppelin-4.7.0 51 s 50 s 52 s
openzeppelin-4.8.0 56 s 54 s 57 s
uniswap-v4-2024-06-06 112 s 105 s 118 s
eigenlayer-0.3.0 497 s 479 s 511 s
sablier-v2-1.2.0 1378 s 1371 s 1347 s

Legacy

File Time (before) Time (commit 1) Time (after)
uniswap-v4-2022-06-16 5 s 5 s 5 s
openzeppelin-5.0.2 9 s 9 s 10 s
openzeppelin-4.9.0 11 s 12 s 12 s
liquity-2024-10-30 10 s 9 s 10 s
openzeppelin-4.7.0 17 s 17 s 19 s
openzeppelin-4.8.0 20 s 20 s 21 s
uniswap-v4-2024-06-06 20 s 20 s 22 s
eigenlayer-0.3.0 75 s 70 s 78 s
sablier-v2-1.2.0 199 s 212 s 209 s

@blishko
Copy link
Contributor Author

blishko commented Nov 21, 2024

I am not sure about the changes related to Add 1. Do you have the benchmarks with only Add 2?

This is what it looks like for me:

Oh, this almost suggests we should revert the second commit.
And I am buffled how the first commit could have made sablier-v2-1.2.0 on legacy slower, because we are just removing an operation. Moreover on a code path that is not executed in legacy mode, no? That seems to be a fluke?

@cameel
Copy link
Member

cameel commented Nov 21, 2024

Probably a fluke given what's in the PR. I would not expect it slow anything down so I'd not worry too much about it. The differences are still just a few percent. Unfortunately these numbers are not consistent enough to evaluate very small differences with a single run (at least on my hardware, @clonker seems to be getting better results). Running them multiple times and averaging helps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants