Add runtime optimizations for math operations #1733
Use methods of the Math class that should be a tiny bit more efficient
return Integer.valueOf((int) r);
try {
    return Math.addExact(i1, i2);
} catch (ArithmeticException ae) {
I am wondering whether we should inline parts of "addExact" here and do the fallback directly instead of throwing an exception.
Definition of Math.addExact:
@IntrinsicCandidate
public static int addExact(int x, int y) {
    int r = x + y;
    // HD 2-12 Overflow iff both arguments have the opposite sign of the result
    if (((x ^ r) & (y ^ r)) < 0) {
        throw new ArithmeticException("integer overflow");
    }
    return r;
}
- The overflow case is IMHO rare, but I think it is now some orders of magnitude slower than before.
- Overflow detection is "bit" magic with only one "if" branch.
- The Math operations are IntrinsicCandidate, so they might be faster than inlining the same Java code by hand.
Do you think it is worth calling Math.addExact (and hoping that it is faster thanks to @IntrinsicCandidate), or should we take only the idea of the overflow detection?
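To make the second option concrete, here is a minimal sketch of the sign-bit overflow test from the Math.addExact definition above, used without the throw and cross-checked against a plain 64-bit range check. The class and method names (OverflowCheckSketch, addOverflows, addOverflowsViaLong) are made up for illustration and are not part of this PR.

```java
// Illustrative only: names are hypothetical, not taken from the PR.
final class OverflowCheckSketch {

    // Returns true iff x + y does not fit in a 32-bit int.
    static boolean addOverflows(int x, int y) {
        int r = x + y; // wraps silently on overflow
        // Overflow iff both operands have the opposite sign of the result.
        return ((x ^ r) & (y ^ r)) < 0;
    }

    // Reference check: do the sum in 64 bits and compare against the int range.
    static boolean addOverflowsViaLong(int x, int y) {
        long r = (long) x + (long) y;
        return r > Integer.MAX_VALUE || r < Integer.MIN_VALUE;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 2, 1 << 30, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            for (int y : samples) {
                if (addOverflows(x, y) != addOverflowsViaLong(x, y)) {
                    throw new AssertionError("mismatch for " + x + " + " + y);
                }
            }
        }
        System.out.println("sign-bit check agrees with the 64-bit check on all sample pairs");
    }
}
```

Both checks need a single branch and never throw, so the overflow path costs about the same as the normal path.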
Like you, I was assuming that, first, using a function of the Math class directly would be good because it allows JVM implementers in the future to implement it more efficiently, and second, that the overflow case would be very rare and therefore the exception path wouldn't be so bad.
Previously, I had been doing the arithmetic as a long and checking later, and I thought that "addExact" would be faster, but it wasn't. I hadn't benchmarked an overflow, though -- I might try that over the next few days.
Holy crud, that exception behavior is much worse than I thought by two orders of magnitude!
Here are three functions:
// Add as 64-bit longs, then range-check the result.
private Object addIntsAsLongs(Integer a, Integer b) {
    long r = a.longValue() + b.longValue();
    if ((r > Integer.MAX_VALUE) || (r < Integer.MIN_VALUE)) {
        return Double.valueOf(r);
    }
    return Integer.valueOf((int)r);
}

// Add as 32-bit ints and detect overflow with the sign-bit test from Math.addExact.
private Object addIntsTrickily(Integer a, Integer b) {
    int r = a + b;
    if (((a ^ r) & (b ^ r)) < 0) {
        return Double.valueOf(a.longValue() + b.longValue());
    }
    return r;
}

// Call Math.addExact and fall back to a double only when it throws.
private Object addIntsExact(Integer a, Integer b) {
    try {
        return Math.addExact(a, b);
    } catch (ArithmeticException ae) {
        long r = a.longValue() + b.longValue();
        return Double.valueOf((double)r);
    }
}
On my Intel box with Java 21, here are the results of a little benchmark.
With no overflow (2 + 2), all three run in about 0.203 nanoseconds (all are well within the standard deviation). (In other words, trying to do 32-bit addition versus 64-bit addition doesn't really matter.)
With overflow ((1<<31) + (1<<31)):
- addIntsAsLongs: 1.648 nanoseconds
- addIntsTrickily: 1.681 nanoseconds
- addIntsExact: 7173 nanoseconds (not a typo -- 7.173 microseconds)
So I think I will revert to the code that we had before!
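For anyone who wants to try this themselves, here is a rough sketch of how the overflow path could be timed. It is not the benchmark that produced the numbers above: it uses a plain System.nanoTime loop rather than a proper harness such as JMH, so treat its output as a coarse indication only. The class and method names (OverflowTimingSketch, addAsLongs, addExactWithFallback, nanosPerCall) are made up for illustration.

```java
import java.util.function.BiFunction;

// Illustrative only: a crude timing loop, not a substitute for a real benchmark harness.
final class OverflowTimingSketch {

    // Variant 1: add as longs and range-check.
    static Object addAsLongs(Integer a, Integer b) {
        long r = a.longValue() + b.longValue();
        if (r > Integer.MAX_VALUE || r < Integer.MIN_VALUE) {
            return Double.valueOf(r);
        }
        return Integer.valueOf((int) r);
    }

    // Variant 2: Math.addExact with an exception-driven fallback.
    static Object addExactWithFallback(Integer a, Integer b) {
        try {
            return Math.addExact(a, b);
        } catch (ArithmeticException ae) {
            return Double.valueOf((double) (a.longValue() + b.longValue()));
        }
    }

    // Average wall-clock nanoseconds per call over the given number of iterations.
    static double nanosPerCall(BiFunction<Integer, Integer, Object> f,
                               Integer a, Integer b, int iterations) {
        Object sink = null;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = f.apply(a, b); // keep the result so the call is not trivially dead
        }
        long elapsed = System.nanoTime() - start;
        if (sink == null) {
            throw new IllegalStateException("function returned null");
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        int n = 200_000;
        Integer big = Integer.MAX_VALUE; // big + big overflows an int

        // Warm-up runs so the JIT has a chance to compile both variants.
        nanosPerCall(OverflowTimingSketch::addAsLongs, big, big, n);
        nanosPerCall(OverflowTimingSketch::addExactWithFallback, big, big, n);

        System.out.printf("addAsLongs, overflow:           %.1f ns/call%n",
                nanosPerCall(OverflowTimingSketch::addAsLongs, big, big, n));
        System.out.printf("addExactWithFallback, overflow: %.1f ns/call%n",
                nanosPerCall(OverflowTimingSketch::addExactWithFallback, big, big, n));
    }
}
```

Both variants return the same values, so any difference in the printed times comes from the cost of creating and catching the exception.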
These have very bad worst-case performance
I'm not hearing too many comments here, so unless someone else wants to try this themselves, I'm probably going to merge it in the next day or so. On to new things!
Add type-optimized linkers for invokedynamic operations focused around math operations.
Many of these are implemented using complex trees of if...then statements in
ScriptRuntime. By having a suite of type-specific linkers, we can skip right
to the correct branch most of the time, while still being able to fall back
to generic operations that work on every type.
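
As a rough illustration of the idea (not the linker code from this PR), the sketch below uses MethodHandles.guardWithTest from java.lang.invoke to route a call to an Integer-only fast path when a type guard passes and to a generic fallback otherwise. All names (TypeGuardedAddSketch, bothIntegers, addIntegers, addGeneric) are hypothetical, and addGeneric is only a small stand-in for the generic ScriptRuntime operation.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Illustrative only: a type-guarded "add" built from method handles.
final class TypeGuardedAddSketch {

    // Guard: true when the fast path is applicable.
    static boolean bothIntegers(Object a, Object b) {
        return a instanceof Integer && b instanceof Integer;
    }

    // Fast path: only reached when the guard has passed, so the casts are safe.
    static Object addIntegers(Object a, Object b) {
        long r = ((Integer) a).longValue() + ((Integer) b).longValue();
        if (r > Integer.MAX_VALUE || r < Integer.MIN_VALUE) {
            return Double.valueOf(r);
        }
        return Integer.valueOf((int) r);
    }

    // Fallback: a simplified stand-in for a generic operation that works on every type.
    static Object addGeneric(Object a, Object b) {
        if (a instanceof Number && b instanceof Number) {
            return ((Number) a).doubleValue() + ((Number) b).doubleValue();
        }
        return String.valueOf(a) + b;
    }

    private static final MethodHandle ADD = buildAdd();

    private static MethodHandle buildAdd() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType binary = MethodType.methodType(Object.class, Object.class, Object.class);
            MethodHandle guard = lookup.findStatic(TypeGuardedAddSketch.class, "bothIntegers",
                    MethodType.methodType(boolean.class, Object.class, Object.class));
            MethodHandle fast = lookup.findStatic(TypeGuardedAddSketch.class, "addIntegers", binary);
            MethodHandle generic = lookup.findStatic(TypeGuardedAddSketch.class, "addGeneric", binary);
            // If guard(a, b) is true, call fast(a, b); otherwise call generic(a, b).
            return MethodHandles.guardWithTest(guard, fast, generic);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Convenience wrapper that hides the checked Throwable of MethodHandle.invoke.
    static Object add(Object a, Object b) {
        try {
            return ADD.invoke(a, b);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(add(2, 2));      // Integer fast path
        System.out.println(add(1.5, 2));    // generic fallback, numeric
        System.out.println(add("a", 1));    // generic fallback, string concatenation
    }
}
```

In a real invokedynamic setup the guarded handle would be installed as the target of a call site, so the type test replaces the walk through the if...then tree on every call.
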
This set of optimizations improves a few of the V8 benchmarks by 5% to 30%.
Since lots of Rhino code uses math operations like comparisons even if it is not
"doing lots of math" this should help in lots of places.