Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x86-32 "f16" ABI needs SSE, incompatible with i586 targets #131819

Open
Tracked by #116909
RalfJung opened this issue Oct 17, 2024 · 10 comments
Open
Tracked by #116909

x86-32 "f16" ABI needs SSE, incompatible with i586 targets #131819

RalfJung opened this issue Oct 17, 2024 · 10 comments
Labels
A-ABI Area: Concerning the application binary interface (ABI) A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. F-f16_and_f128 `#![feature(f16)]`, `#![feature(f128)]` O-x86_32 Target: x86 processors, 32 bit (like i686-*) S-waiting-on-LLVM Status: the compiler-dragon is eepy, can someone get it some tea?

Comments

@RalfJung
Copy link
Member

RalfJung commented Oct 17, 2024

@beetrees wrote

On 32-bit x86, the ABI specifies that f16 gets returned in xmm0, which is what Clang and GCC do (neither Clang nor GCC support _Float16 when SSE2 is disabled)

So... for the "Rust" ABI we can do our own thing bypassing this (likely similar fo f32/f64, we should just always return f16 in via PassMode::Indirect), but what shall we do for the extern "C" ABI? We'll need a new dedicated compiler error (likely as part of the pass that we are adding here) that rejects returning f16 when the SSE registers are not available.

That will mean that libcore does not compile on i586 targets.... Actually libcore uses the "Rust" ABI so that one is fine. But if we have extern "C" float functions somewhere those would not be portable any more.

Cc @tgross35 @chorman0773

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Oct 17, 2024
@RalfJung RalfJung changed the title x86-32 "f16" ABI needs SSE x86-32 "f16" ABI needs SSE, incompatible with i586 targets Oct 17, 2024
@beetrees
Copy link
Contributor

I think this will require LLVM changes to properly fix, as f16 is used as the return value of builtins; this is especially apparent when when combined with #[target_feature(enable = "sse2")], like in this example. LLVM will use whatever target features a function has to decide which ABI to use for a builtin, meaning that the no_sse function presumes that the result of __truncdfhf2 has been returned in ax whereas the yes_sse2 function presumes that the result has been returned in xmm0.

A similar issue can occur on PowerPC targets with f128 and the vsx feature (cc #125109). Although #[target_feature(enable = "vsx")] is still unstable, this is still an issue if user code is compiled with -Ctarget-feature=+vsx but compiler-builtins is not (or the other way around: system f128 builtins are compiled with vsx enabled as Clang and GCC don't support _Float128/__float128 when vsx is disabled).

@RalfJung
Copy link
Member Author

Oh lol so LLVM inserts calls to builtins that use the wrong ABI? Yeah that sounds like an LLVM bug. Could you file an issue there? You probably are better at explaining the problem in their terms. :)

@jieyouxu jieyouxu added A-ABI Area: Concerning the application binary interface (ABI) A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. labels Oct 17, 2024
@tgross35
Copy link
Contributor

I have been thinking that is would probably be in our advantage to just submit MRs against the ABIs whenever there are ambiguities like this. LLVM seems to usually pass f16 and f128 in vector registers if available (and large enough for f128), otherwise pass it as the same-sized int - but it would be good to start codifying this where applicable, considering we support these types on far more targets than C (unstably of course). __float128/_Float128 really isn't available outside of popular 64-bit targets.

@RalfJung
Copy link
Member Author

RalfJung commented Oct 17, 2024 via email

@tgross35
Copy link
Contributor

tgross35 commented Oct 17, 2024

What would be the silent part? I was mostly suggesting that the ABIs should specify a concrete calling convention for primitives when their usual cc is feature-dependent (either via other registers or indirect). Which doesn't help us with the "target features change ABI" issue, but does help us with the "is our extern "C" correct" concern.

@RalfJung
Copy link
Member Author

The silent part is where using -Ctarget-feature is not expected to change the ABI. Code with arbitrary combinations of target features for the same target should generally be linkable with each other; any change in ABI should be explicitly requested via different flags. Also see #116344, #116558.

For "Rust" we can just define our own ABI in a consistent way -- we already special case f32/f64 returning on x86-32, we should just extend the logic here to the remaining float types.

For non-"Rust" ABI however, we have two options:

  • partition targets into "with SSE regs in the ABI" and "without SSE regs in the ABI", and reject using target features to cross that boundary
  • Reject compiling f16-returning extern "C" functions when the SSE regs are missing

@beetrees btw you said this is about sse2, but shouldn't sse be enough as that's the feature controlling the registers?

@beetrees
Copy link
Contributor

Oh lol so LLVM inserts calls to builtins that use the wrong ABI? Yeah that sounds like an LLVM bug. Could you file an issue there? You probably are better at explaining the problem in their terms. :)

I've filed an LLVM issue at llvm/llvm-project#112885.

@beetrees btw you said this is about sse2, but shouldn't sse be enough as that's the feature controlling the registers?

AFAIK it should be, but LLVM currently switches to returning in eax as soon as sse2 is disabled. I've filed an LLVM issue about it at llvm/llvm-project#112890.

@RalfJung
Copy link
Member Author

Reject compiling f16-returning extern "C" functions when the SSE regs are missing

I guess there is an alternative to this, which is to say that on x86 targets without SSE, f16 uses a different ABI for returns. Not sure if there is an official standard ABI for this, but LLVM implements some de-facto ABI. Then we have to reject ever enabling SSE on that target (including on a per-function level via #[target_feature]) to ensure the correct ABI is used everywhere.

workingjubilee added a commit to workingjubilee/rustc that referenced this issue Oct 19, 2024
…bilee

x86-32 float return for 'Rust' ABI: treat all float types consistently

This helps with rust-lang#131819: for our own ABI on x86-32, we want to *never* use the float registers. The previous logic only considered F32 and F64, but skipped F16 and F128. So I made the logic just apply to all float types.
@jieyouxu jieyouxu added S-waiting-on-LLVM Status: the compiler-dragon is eepy, can someone get it some tea? C-bug Category: This is a bug. O-x86_32 Target: x86 processors, 32 bit (like i686-*) and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Oct 19, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Oct 19, 2024
…bilee

x86-32 float return for 'Rust' ABI: treat all float types consistently

This helps with rust-lang#131819: for our own ABI on x86-32, we want to *never* use the float registers. The previous logic only considered F32 and F64, but skipped F16 and F128. So I made the logic just apply to all float types.
@RalfJung RalfJung added the F-f16_and_f128 `#![feature(f16)]`, `#![feature(f128)]` label Oct 19, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Oct 19, 2024
…bilee

x86-32 float return for 'Rust' ABI: treat all float types consistently

This helps with rust-lang#131819: for our own ABI on x86-32, we want to *never* use the float registers. The previous logic only considered F32 and F64, but skipped F16 and F128. So I made the logic just apply to all float types.
@RalfJung
Copy link
Member Author

Reject compiling f16-returning extern "C" functions when the SSE regs are missing

That seems to match what C does: https://godbolt.org/z/GYETdPaEe

bors added a commit to rust-lang-ci/rust that referenced this issue Oct 22, 2024
x86-32 float return for 'Rust' ABI: treat all float types consistently

This helps with rust-lang#131819: for our own ABI on x86-32, we want to *never* use the float registers. The previous logic only considered F32 and F64, but skipped F16 and F128. So I made the logic just apply to all float types.
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 22, 2024
x86-32 float return for 'Rust' ABI: treat all float types consistently

This helps with rust-lang#131819: for our own ABI on x86-32, we want to *never* use the float registers. The previous logic only considered F32 and F64, but skipped F16 and F128. So I made the logic just apply to all float types.

try-job: i686-gnu
try-job: i686-gnu-nopt
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 22, 2024
x86-32 float return for 'Rust' ABI: treat all float types consistently

This helps with rust-lang#131819: for our own ABI on x86-32, we want to *never* use the float registers. The previous logic only considered F32 and F64, but skipped F16 and F128. So I made the logic just apply to all float types.

try-job: i686-gnu
try-job: i686-gnu-nopt
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 22, 2024
x86-32 float return for 'Rust' ABI: treat all float types consistently

This helps with rust-lang#131819: for our own ABI on x86-32, we want to *never* use the float registers. The previous logic only considered F32 and F64, but skipped F16 and F128. So I made the logic just apply to all float types.

try-job: i686-gnu
try-job: i686-gnu-nopt
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 22, 2024
x86-32 float return for 'Rust' ABI: treat all float types consistently

This helps with rust-lang#131819: for our own ABI on x86-32, we want to *never* use the float registers. The previous logic only considered F32 and F64, but skipped F16 and F128. So I made the logic just apply to all float types.

try-job: i686-gnu
try-job: i686-gnu-nopt
lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Oct 29, 2024
x86-32 float return for 'Rust' ABI: treat all float types consistently

This helps with rust-lang/rust#131819: for our own ABI on x86-32, we want to *never* use the float registers. The previous logic only considered F32 and F64, but skipped F16 and F128. So I made the logic just apply to all float types.

try-job: i686-gnu
try-job: i686-gnu-nopt
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Oct 30, 2024
x86-32 float return for 'Rust' ABI: treat all float types consistently

This helps with rust-lang/rust#131819: for our own ABI on x86-32, we want to *never* use the float registers. The previous logic only considered F32 and F64, but skipped F16 and F128. So I made the logic just apply to all float types.

try-job: i686-gnu
try-job: i686-gnu-nopt
@RalfJung
Copy link
Member Author

One way to describe this is to say that there are 3 ABIs on x86-32: softfloat, full-hardfloat (with SSE2), and semi-hardfloat (with x87 but not using SSE registers for anything). Every target triple would have to decide which of these to use, and we'd use the check from #134794 to ensure the right target features are enabled.

However, LLVM doesn't actually have a way to force the semi-hardfloat ABI it seems, so wed have to forbid the SSE target feature on those targets...

We definitely need a larger discussion here before f16 gets stabilized. We either accept "some targets just can't pass f16 across a function boundary" or we come up with our own ABI and try to convince LLVM to support that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ABI Area: Concerning the application binary interface (ABI) A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. F-f16_and_f128 `#![feature(f16)]`, `#![feature(f128)]` O-x86_32 Target: x86 processors, 32 bit (like i686-*) S-waiting-on-LLVM Status: the compiler-dragon is eepy, can someone get it some tea?
Projects
None yet
Development

No branches or pull requests

5 participants