Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exp and pow return infinity too early #600

Open
MattyMuir opened this issue Oct 31, 2024 · 14 comments
Open

Exp and pow return infinity too early #600

MattyMuir opened this issue Oct 31, 2024 · 14 comments

Comments

@MattyMuir
Copy link

Describe the bug
The exp function (for SIMD and pure C) returns infinity for all values past a certain threshold, but this threshold is slightly lower than it should be. Looking at the code for double precision, infinity is returned for all x > 709.78271114955742909217217426. However, there are values past this limit whose exponential can still be represented with a finite double. All standard libraries I've tested use the following, slightly larger, threshold instead x > 709.782712893384 (0x1.62e42fefa39efp+9 in binary). This doesn't appear to be an issue for the single precision implementation.

This same threshold is also used in the pow function, so the bug is present there as well.

Command lines, logs and environment
All the build steps, environment and configuration are the same as in my other issue, #570, though they shouldn't matter in this case.

To Reproduce

int main()
{
	double x = 709.782712;
	std::cout << "libc:  " << exp(x) << '\n';
	std::cout << "Sleef: " << Sleef_exp_u10(x) << '\n';
}

This produces the following output on my machine

libc:  1.79769e+308
Sleef: inf

Fixing this issue might be as simple as just changing the constant in the implementation, provided the series is still accurate past this point.

@blapie
Copy link
Collaborator

blapie commented Oct 31, 2024

Hello @MattyMuir, thanks for reporting this!

We will try to adjust the threshold so it is a little closer to general expectation. I also think that constants should be represented using hexfloat, they currently use too many digits, which I don't think is necessary.

Please consider the 2 following notes just FYI

Note 1: I take back my original note, the ieee754 standard does mandate overflow for exp and pow (table 9.1) in the way defined in section 7.4:

The overflow exception shall be signaled if and only if the destination format’s largest finite number is exceeded in magnitude by what would have been the rounded floating-point result

Note 2: Other math libraries like Arm Optimized routines have a more elaborate approach, it evacuates the early overflow to a separate branch where it is fixed, see https://github.com/ARM-software/optimized-routines/blob/master/math/aarch64/sv_exp_1u5.c.

@blapie
Copy link
Collaborator

blapie commented Oct 31, 2024

Just edited my notes above, as the first one was mistaken.

@MattyMuir
Copy link
Author

Just had a look at some of the single precision functions. Looks like similar bugs are present in log1pf, sinhf, and coshf. I think the tester needs to be updated to catch these, especially if overflow is mandated by the IEEE standard.

@blapie
Copy link
Collaborator

blapie commented Nov 18, 2024

Hello @MattyMuir, Thank you for reporting these too. It does seem to impact several routines indeed.
I'll start looking into fixing implementation and updating tests shortly.

@blapie
Copy link
Collaborator

blapie commented Nov 20, 2024

I'll soon upload a PR to fix this and add some tests. I just realised #523 reported the same issue, and confirms that the ulp error is still 1.0 in the interval between the old and new thresholds.

@blapie
Copy link
Collaborator

blapie commented Nov 21, 2024

Similarly the log1pf issue was pointed out in #473.

@blapie
Copy link
Collaborator

blapie commented Nov 21, 2024

Hello @shibatch, We are looking at a range of functions for which SLEEF implements an early overflow bound (as in earlier than what the standard recommends). I suspect these bounds were set to maintain accuracy, could you please confirm that? If so we need to be careful if we are going to update them.
Were there any other reasons driving these choices of bounds?

Luckily it turns out that exp still provides 1ULP when threshold is changed to one meeting the standard, but it is notoriously hard to assess accuracy for bivariate functions like pow or functions for which the number of extra finite point to test is too large to be computed exhaustively (e.g. log1p).

I will run tests as thoroughly as I can, but I would appreciate your insight on this.

If you have specific comments on exp or pow please comment on #523, and on log1p(f) please comment on #473.

@shibatch
Copy link
Owner

Hello,
I don't see any problem in changing the overflow threshold.
I am aware of pretty many minor accuracy problems, and I have been fixing them bit by bit.
On the other hand, when I heard requests from users, they were mainly talking about how to make the computation faster rather than about those minor accuracy issues, so fixing them was a low priority.

@blapie
Copy link
Collaborator

blapie commented Nov 21, 2024

Ok, good to know. I think it is fair, they are relatively minor issues. Thanks!

blapie added a commit to blapie/sleef that referenced this issue Nov 28, 2024
Issue shibatch#600 highlighted a limit of current
implementation where early overflow is noticed,
which diverges from the C99 standard.

This patch adds test to catch this type of cases.

It only checks a few points therefore we don't
know for sure if overflow occurs at the right time
but we know if something is wrong in the overflow
region.

Fix both scalar and simd implementation (double precision)

Fixes shibatch#523.
@blapie
Copy link
Collaborator

blapie commented Nov 28, 2024

I've just created a PR to fix the threshold. As indicated in #523 we can confidently say that the change does not affect overall accuracy in exp.

I have checked pow accuracy with our in-house ulp error assessment program, and here is what I get

...
Largest error so far is 0.50511 at [0x1.7fed001e5f0edp-1, 0x1.1b5ce4d1fb0aep+11] ([0x3fe7fed001e5f0ed, 0x40a1b5ce4d1fb0ae])
  Got  0x1.6e9d97108d4e2p-942
  Want 0x1.6e9d97108d4e1p-942
Largest error so far is 0.57493 at [0x1.7f136e35a1af6p-1, 0x1.2c2f3c91cf6c5p+11] ([0x3fe7f136e35a1af6, 0x40a2c2f3c91cf6c5])
  Got  0x1.ee75e5d42b4bp-1006
  Want 0x1.ee75e5d42b4afp-1006
Largest error so far is 0.59458 at [0x1.7e7a67798b72dp-1, 0x1.e0157ee6672fbp+10] ([0x3fe7e7a67798b72d, 0x409e0157ee6672fb])
  Got  0x1.fb663ae3f3a63p-809
  Want 0x1.fb663ae3f3a62p-809
Largest error so far is 0.65361 at [0x1.7f5c8e80a3cf7p-1, 0x1.235db085e49b7p+11] ([0x3fe7f5c8e80a3cf7, 0x40a235db085e49b7])
  Got  0x1.f992991fa3f97p-974
  Want 0x1.f992991fa3f96p-974
Largest error so far is 0.74696 at [0x1.7ff1b57d71188p-1, 0x1.2e8d51b04ab8p+11] ([0x3fe7ff1b57d71188, 0x40a2e8d51b04ab80])
  Got  0x1.e75a8477de9e7p-1006
  Want 0x1.e75a8477de9e6p-1006
Largest error so far is 0.74696 at [0x1.7ff1b57d71188p-1, 0x1.2e8d51b04ab8p+11] ([0x3fe7ff1b57d71188, 0x40a2e8d51b04ab80])
  Got  0x1.e75a8477de9e7p-1006
  Want 0x1.e75a8477de9e6p-1006

@shibatch Can you please confirm wether the 1.0ULP threshold include the extra 0.5ULP offset introduced by the rounding of the input value (worst case in round to nearest)?

  • If so then, our result indicates that there are a few points above 1ULP.
  • If it does not, then the effective accuracy is 1.5ULP, so our result simply confirms that with high confidence pow is under 1.0ULP(+0.5ULP) error.

These max are independent of the change in oflow threshold, since |ylogx| < 709, for instance

x = 7.498910e-01
y = 2.420416e+03
ylogx = -6.966623e+02 // |ylogx| < 709
ret = 2.776058e-303 (0x1.e75a8477de9e7p-1006) versus exact return value = 0x1.e75a8477de9e6p-1006

Note that it is quite difficult to isolate cases for which ylog(x) > thres.

@shibatch
Copy link
Owner

I have confirmed that the error exceeds 1ulp when arguments are (0x1.7fed001e5f0edp-1,0x1.1b5ce4d1fb0aep+11).

pow : arg0 = 0x1.7fed001e5f0edp-1 (0.749855), arg1 = 0x1.1b5ce4d1fb0aep+11 (2266.9), ulp = 1.00511, t = 3.852255144842898e-284, c = 3.85226e-284

@blapie
Copy link
Collaborator

blapie commented Nov 29, 2024

Thanks for confirming.
Would you be able to confirm these 2 affirmations?

  • So the u10 routines are effectively 1.5ULP (u35 are effectively 4ULP) when considering worst-case rounding of the input?

  • Do you also confirm the maximum reported error of 0.74696 ULP (+0.5ULP)?

Largest error so far is 0.74696 at [0x1.7ff1b57d71188p-1, 0x1.2e8d51b04ab8p+11] ([0x3fe7ff1b57d71188, 0x40a2e8d51b04ab80])
  Got  0x1.e75a8477de9e7p-1006
  Want 0x1.e75a8477de9e6p-1006

@shibatch
Copy link
Owner

Rounding of inputs is not taken into account in calculating accuracy.
If we are going to take that into account, what happens when we give a very large argument for sin?

@MattyMuir
Copy link
Author

I have confirmed that the error exceeds 1ulp when arguments are (0x1.7fed001e5f0edp-1,0x1.1b5ce4d1fb0aep+11).

pow : arg0 = 0x1.7fed001e5f0edp-1 (0.749855), arg1 = 0x1.1b5ce4d1fb0aep+11 (2266.9), ulp = 1.00511, t = 3.852255144842898e-284, c = 3.85226e-284

Testing with these same arguments in the latest version also exceeds 1ULP error. It looks like this inaccuracy is unrelated to the new PR.

blapie added a commit that referenced this issue Dec 3, 2024
Issue #600 highlighted a limit of current
implementation where early overflow is noticed,
which diverges from the C99 standard.

This patch adds test to catch this type of cases.

It only checks a few points therefore we don't
know for sure if overflow occurs at the right time
but we know if something is wrong in the overflow
region.

Fix both scalar and simd implementation (double precision)

Fixes #523.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants