nv2a: Adjust NaN handling to be similar to HW #913

Draft
wants to merge 1 commit into
base: master

abaire
Contributor

@abaire abaire commented May 11, 2022

Fixes #365 by forcing treatment of NaN values generated in the vertex shader to follow the same pattern as HW.

Unfortunately this does not address the fact that on HW, +/- NaN * 0 (or very close to 0) should be 0. This difference in behavior is also responsible for #1008.

This also does not address the same issue for +/- inf. The shader version currently used by xemu explicitly notes that operations on inf/NaN values are undefined, which may explain why Otogi textures are not blacked out on an M1 Mac.

Tests
HW results

@abaire
Contributor Author

abaire commented May 11, 2022

Hey @NZJenkins I wasn't sure if you were already looking at a fix, but I started to suspect this might be partly to blame for #801 so I threw together a partial fix to unblock my debugging there.

This is mostly a proof of concept at the moment, as it doesn't fully match the HW behavior and some investigation will need to be done on the performance impact (and possibly some optimization).

" return src;\n"
" }\n"
" ivec4 signs = floatBitsToInt(src);\n"
" vec4 dest;\n"
Contributor Author

This stuff can be vectorized, assuming the logic will stay the same:

" return mix(src, mix(vec4(1.0), vec4(0.0), lessThan(signs, ivec4(0))), nans);\n"
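
For readers less familiar with the GLSL built-ins, the per-component effect of that one-liner can be sketched in scalar C. This is only an illustration under assumptions: `float_bits_to_int`, `fix_nan`, `nan_from_bits`, and `fix_nan_examples` are hypothetical names that do not come from the xemu source, and the sketch models the mapping as written in this comment, not verified hardware behavior.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from the xemu source: reinterpret the bits
 * of a float as a signed 32-bit integer, like GLSL's floatBitsToInt(). */
static int32_t float_bits_to_int(float f)
{
    int32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Scalar version of the per-component mapping in the snippet above:
 * non-NaN values pass through, a NaN with the sign bit set becomes 0.0,
 * and a NaN with the sign bit clear becomes 1.0. */
static float fix_nan(float src)
{
    if (!isnan(src)) {
        return src;
    }
    return float_bits_to_int(src) < 0 ? 0.0f : 1.0f;
}

/* Build a NaN with a known sign bit from raw bits so the result does not
 * depend on how the compiler evaluates expressions such as -NAN. */
static float nan_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

static void fix_nan_examples(void)
{
    assert(fix_nan(0.25f) == 0.25f);
    assert(fix_nan(nan_from_bits(0x7FC00000u)) == 1.0f); /* sign bit clear */
    assert(fix_nan(nan_from_bits(0xFFC00000u)) == 0.0f); /* sign bit set */
}
```

The integer comparison `< 0` plays the role of `lessThan(signs, ivec4(0))`: a set most significant bit makes the reinterpreted value negative.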

@ko81e24wy

ko81e24wy commented May 11, 2022

There is a glitch in NGB with this PR; it may be related to issue #783.
2022-05-11 11-48-33 mp4_20220511_115252406
The same glitch appears in DOA2U.
xemu_2022_05_11_12_52_47_379

@abaire
Contributor Author

abaire commented May 11, 2022

@ko81e24wy where is the nearest save point so I can reproduce that?

@ko81e24wy

> @ko81e24wy where is the nearest save point so I can reproduce that?

I think you just need to start a new game

@NZJenkins

NZJenkins commented May 11, 2022

> Hey @NZJenkins I wasn't sure if you were already looking at a fix, but I started to suspect this might be partly to blame for #801 so I threw together a partial fix to unblock my debugging there.

No not looking at a fix.
Did something similar for cxbxr where all outputs are de-NaN-ified, but probably need more tests to determine the correct behaviour.

For texcoord and fog outputs, additional processing happens on the VS output values before they are used in the PS, which may interpret NaN, INF, etc. differently to the mapping that happens when they are passed directly from the VS to the PS.
Not sure how Xemu shaders are set up but that might need to be accounted for here.

> Unfortunately this does not address the fact that on HW, +/- NaN * 0 (or very close to 0) should be 0.

The Min test is using "subnormal" float values. It might be interesting to try it with the minimum "normal" values instead, and see if they behave differently to 0.
Apparently it's more complex to support subnormal numbers correctly in hardware, so for a GPU it's likely not implemented.
(Added test cases: subnormals appear to be treated as 0, see abaire/nxdk_pgraph_tests#65.)
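
The distinction between the smallest "normal" float and a subnormal one can be illustrated with a small C sketch. It is a model under assumptions, not the hardware's actual logic: `flush_subnormal` and `flush_subnormal_examples` are hypothetical names, and keeping the sign of the flushed zero is an assumption that the thread does not confirm.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical helper: model an input stage that treats subnormal floats as
 * zero and passes everything else through unchanged. Preserving the sign of
 * the zero is an assumption made for this sketch. */
static float flush_subnormal(float x)
{
    return fpclassify(x) == FP_SUBNORMAL ? copysignf(0.0f, x) : x;
}

static void flush_subnormal_examples(void)
{
    float smallest_normal = FLT_MIN;  /* 2^-126, smallest normal float */
    float subnormal = FLT_MIN / 2.0f; /* 2^-127, below the normal range */

    assert(flush_subnormal(smallest_normal) == FLT_MIN);
    assert(flush_subnormal(subnormal) == 0.0f);
    assert(flush_subnormal(-subnormal) == 0.0f); /* -0.0f compares equal to 0.0f */
    assert(flush_subnormal(1.0f) == 1.0f);
}
```

A test that feeds `FLT_MIN` and `FLT_MIN / 2` through the same shader path would show whether only subnormal inputs collapse to 0.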

@abaire
Contributor Author

abaire commented May 16, 2022

> > @ko81e24wy where is the nearest save point so I can reproduce that?
>
> I think you just need to start a new game

There was a stupid mistake in my code that is now fixed, can you test again when you have a chance?

@ko81e24wy

I am just heading to work, and I tested with a combined build that has this PR merged.
The white glitch is gone, and I don't know whether this PR relates to #783.
xemu-2022-05-17-07-42-08

@abaire
Contributor Author

abaire commented May 17, 2022

@ko81e24wy cool, thanks for double checking!

I have not tried to debug 783 yet, so if this PR happens to fix the missing lighting that'd be a nice bonus. I'm not exactly sure what that screenshot should look like compared to HW though; are you saying it looks more correct with this PR than before?

@ko81e24wy

> @ko81e24wy cool, thanks for double checking!
>
> I have not tried to debug 783 yet, so if this PR happens to fix the missing lighting that'd be a nice bonus. I'm not exactly sure what that screenshot should look like compared to HW though; are you saying it looks more correct with this PR than before?

That white issue reminded me of that light effect; with the last PR it looks just as before, with that effect missing.

@mysteria25

Was hoping this would fix the black graphics of Inside Pitch 2003, but I guess that is a separate issue from this PR since that game still looks like this:

inside pitch 2003

@abaire
Contributor Author

abaire commented May 17, 2022

> Was hoping this would fix the black graphics of Inside Pitch 2003 but guess that is a separate issue from this pr since that game still looks like this:

@mysteria25 as far as I can see from the compatibility page there is no bug tracking this problem, please file one and it'll get looked into eventually (PRs are definitely not the place to report untracked issues). Grabbed by the Ghoulies has a black texture problem related to texture compression; there are many ways for things to fail so it's entirely possible what you're seeing with that game is something novel that will need a specific fix.

@abaire abaire marked this pull request as ready for review May 23, 2022 03:35
@abaire
Contributor Author

abaire commented May 23, 2022

This doesn't entirely match HW behavior on my GTX 1070 but it is much closer and fixes #365 despite the remaining departure around infinite values and zeros.

" return src;\n"
" }\n"
" ivec4 signs = floatBitsToInt(src);\n"
" return mix(src, mix(vec4(1.0), vec4(0.0), lessThan(signs, ivec4(0))), nans);\n"
Member

@mborgerson mborgerson May 25, 2022

  • The sign bit vectorization is neat, nice :)
  • So "negative NaN" (msb set) gets 0 and "positive NaN" gets 1.0?
  • Seems odd because you mentioned 0*NaN should == 0? So under what conditions do you expect it to be 1? (nvm, I see your comment about it) I don't think it makes sense to generalize this based on sign bit alone here. More about this thought below.
  • It looks like mix(genBType) was added in GL4.5 so this might fail in a 4.0 context. Does it work on your M1 Mac? You could cast the boolean to a float

As far as I can see in the general case, I think this NaN handling is not more correct than just using 0.0 in the event we do get a NaN (which would force all platforms to behave the same at least, but might be unnecessary if we don't have issues cataloged for it). Is there another case in vsh output that this solves for, beyond fog?

To fix this fog issue for Otogi (and now at least 1 other game (Triggerman) that is affected in the same way that I've seen), I do think it's a fair compromise to have specialized fog NaN handling (like we did with Inf previously); and if this is only intended to fix the fog issue, I think we should just stick with the specialized handling for that case.

The fishing game is a different, interesting case that appears to be setting light colors to NaN, so for this second specialized case I think it makes sense to interpret NaN as 1.0 before performing lighting calculations, so lighting is calculated correctly. Perhaps it makes sense to generalize NaNs in the uniform spaces to 1.0 for such cases but I'd have to think more about it. Needs more investigation to understand how hardware is handling that case.
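
The boolean-to-float idea from the last bullet above can be sketched in C. This is an illustration only: the helper names are hypothetical and not from xemu, and C follows IEEE 754 here while the thread notes that the shader version treats inf/NaN operations as undefined, so a GLSL driver may behave differently. The point of the sketch is that a float cast works for choosing between two finite constants, whereas an arithmetic blend cannot be relied on to drop a NaN input.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Inner select: both candidates are finite, so the boolean can be turned
 * into a float and used arithmetically (1.0 when false, 0.0 when true). */
static float select_by_sign(bool sign_bit_set)
{
    return 1.0f - (float)sign_bit_set;
}

/* Arithmetic blend, the floating-point form of mix(x, y, a). */
static float blend(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

/* Outer select written as a conditional, which never does arithmetic on src. */
static float select_if_nan(float src, float replacement)
{
    return isnan(src) ? replacement : src;
}

static void select_examples(void)
{
    assert(select_by_sign(true) == 0.0f);
    assert(select_by_sign(false) == 1.0f);
    /* With IEEE 754 arithmetic NaN * 0 is NaN, so the blend keeps the NaN. */
    assert(isnan(blend(NAN, 1.0f, 1.0f)));
    assert(select_if_nan(NAN, 1.0f) == 1.0f);
    assert(select_if_nan(0.5f, 1.0f) == 0.5f);
}
```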

Contributor Author

Partial response:

> As far as I can see in the general case, I think this NaN handling is not more correct than just using 0.0 in the event we do get a NaN (which would force all platforms to behave the same at least, but might be unnecessary if we don't have issues cataloged for it). Is there another case in vsh output that this solves for, beyond fog?

The tester itself is using the diffuse channel and the xemu output does not match HW, if that's what you're asking. I don't know of any game where this makes a difference offhand, but that doesn't mean there isn't one with graphics glitching attributable to this handling.

An example difference with all three cases in one place:

master:
-8_1

This branch:
-8_1

HW:
-8_1

> To fix this fog issue for Otogi (and now at least 1 other game (Triggerman) that is affected in the same way that I've seen), I do think it's a fair compromise to have specialized fog NaN handling (like we did with Inf previously); and if this is only intended to fix the fog issue, I think we should just stick with the specialized handling for that case.

Otogi's fog is the only case I know of (I assume someone has already looked into Triggerman and determined that just applying the fix to the fog fixes that game? I don't own that one to test). I guess the issue is balancing the performance impact of doing this check against everything in an attempt to be closer to HW behavior versus the time a contributor might end up spending tracing some glitch back to a FIXME in the shader code.

> The fishing game is a different, interesting case that appears to be setting light colors to NaN, so for this second specialized case I think it makes sense to interpret NaN as 1.0 before performing lighting calculations, so lighting is calculated correctly. Perhaps it makes sense to generalize NaNs in the uniform spaces to 1.0 for such cases but I'd have to think more about it. Needs more investigation to understand how hardware is handling that case.

Hopefully @Triticum0 wouldn't mind splitting Lakemasters into a new issue given that it sounds like it's a similar symptom due to a different bug?

Contributor Author

Verified that the shader works on macOS 12.4 (M1)

@abaire Lakemaster is fixed with this PR
xemu-2022-06-02-20-51-16

Contributor Author

#1008 is due to a similar cause, INF * 0 should == 0 but is instead NaN (at least on my machine and presumably other nvidia systems since the original report was on a 3060).

In this case the incorrect number causes a chain of errors and catching it at the very end where the outputs are assigned would also lead to incorrect behavior; in the test cases I examined for #1008 the output diffuse should be set to values between 0 and 1 so blindly detecting the NaN and setting to either would do the wrong thing.
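
The difference between the IEEE 754 result and the behavior described here can be sketched in C. `hw_like_mul` and `hw_like_mul_examples` are hypothetical names used only for illustration. The sketch assumes that any exactly-zero operand forces a +0.0 result; the "very close to 0" case mentioned in the PR description is not modelled, and none of this is verified hardware logic.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper modelling the behaviour described in this thread:
 * a product with a zero operand is 0 even if the other operand is inf or NaN.
 * Returning +0.0 regardless of operand signs is an assumption. */
static float hw_like_mul(float a, float b)
{
    if (a == 0.0f || b == 0.0f) {
        return 0.0f;
    }
    return a * b;
}

static void hw_like_mul_examples(void)
{
    assert(isnan(INFINITY * 0.0f)); /* IEEE 754 result */
    assert(hw_like_mul(INFINITY, 0.0f) == 0.0f);
    assert(hw_like_mul(0.0f, NAN) == 0.0f);
    assert(hw_like_mul(2.0f, 3.0f) == 6.0f);
}
```

Because the zero check happens at the multiply itself, the intermediate value never becomes NaN, which is why patching only the final outputs cannot reproduce the same result.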

@Fabxx
Contributor

Fabxx commented Jun 2, 2022

image
current handling causes this bug where the character is highlighted in proximity of lights

@abaire abaire marked this pull request as draft June 3, 2022 02:26
@abaire
Contributor Author

abaire commented Jun 3, 2022

> current handling causes this bug where the character is highlighted in proximity of lights

Looks like limiting the NaN fix to fog will fix that regression and align with Matt's suggestion, will update and re-push in a bit.

@Fabxx
Contributor

Fabxx commented Jun 9, 2022

The highlight problem seems to be fixed with #1045

@ko81e24wy

I tested the latest build of this PR, and it crashes when I run xemu.
xemu.log

@abaire
Contributor Author

abaire commented Jun 29, 2022

> I tested the latest build of this PR, and it crashes when I run xemu. xemu.log

Bad merge, should be fixed now. Thanks for letting me know!

@Triticum0

@abaire could you rebase this PR onto the work you did in "Match inv_w qualifier to attribute qualifier"? Thanks.

@Kohryujin

Can we get another pull request or rebase or however that works so there is a working download available for this branch for testing? The one from July is expired and no longer available.

@TheHollows

Any update on when this will be merged?

@mborgerson
Member

@TheHollows

> Any update on when this will be merged?

When it's ready

Development

Successfully merging this pull request may close these issues.

Blackened Texture Issues and Documentation
9 participants