wip: constexpr vec (constructors & operators) when using simd #1313

Draft
wants to merge 52 commits into
base: master
from

@sharkautarch sharkautarch commented Sep 11, 2024

Yes, this creates a whole new vec implementation. Adding std::is_constant_evaluated() while making both constexpr and simd work well, and also maintaining backwards compatibility, would be too complicated to incorporate into the pre-existing vec classes.

Also, this is only compatible with gcc and clang, because it uses gcc's vector extension for simd in vec's constructors (for when std::is_constant_evaluated() returns false).
That lets me write one simd construction function that works for all vec lengths, including packed vecs.
Edit: support for packed vecs & conversion between packed and aligned vecs works now.

I don't intend this separate vec implementation to deprecate the pre-existing vec implementations. It is just the most straightforward way to make constexpr work well w/ simd without making the entire codebase unreadable.
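For readers unfamiliar with the pattern described above, here is a minimal sketch of dispatching between a scalar compile-time path and a vector-extension run-time path. `Vec4Sketch`, `simd4f`, `in_constant_evaluation`, `add_simd` and `add` are hypothetical names made up for illustration, not the types or functions in this PR; the fallback to `__builtin_is_constant_evaluated()` is only there so the sketch also builds before C++20.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a 4-float vec; not glm's actual type.
struct Vec4Sketch {
    float e[4];
};

// GCC/Clang vector extension: four floats in one 16-byte SIMD value.
typedef float simd4f __attribute__((vector_size(16)));

// True only while the compiler is evaluating a constant expression.
constexpr bool in_constant_evaluation() noexcept {
#ifdef __cpp_lib_is_constant_evaluated
    return std::is_constant_evaluated();       // C++20
#else
    return __builtin_is_constant_evaluated();  // GCC/Clang builtin for older modes
#endif
}

// Run-time path: one whole-vector add through the vector extension.
inline Vec4Sketch add_simd(const Vec4Sketch& a, const Vec4Sketch& b) {
    simd4f va, vb;
    std::memcpy(&va, a.e, sizeof va);
    std::memcpy(&vb, b.e, sizeof vb);
    const simd4f vr = va + vb;
    Vec4Sketch r;
    std::memcpy(r.e, &vr, sizeof vr);
    return r;
}

// Scalar loop during constant evaluation, SIMD otherwise.
constexpr Vec4Sketch add(const Vec4Sketch& a, const Vec4Sketch& b) {
    if (in_constant_evaluation()) {
        Vec4Sketch r{};
        for (int i = 0; i < 4; ++i) r.e[i] = a.e[i] + b.e[i];
        return r;
    }
    return add_simd(a, b);
}

constexpr Vec4Sketch kSum = add(Vec4Sketch{{1, 2, 3, 4}}, Vec4Sketch{{10, 20, 30, 40}});
static_assert(kSum.e[2] == 33.0f, "evaluated at compile time");
```

The non-constexpr `add_simd` is only reachable at run time, which is why calling it from a constexpr function is accepted.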

@sharkautarch
Author

sharkautarch commented Sep 11, 2024

Todo:

  • Add all the operators to this vec struct
  • Edit the setup headers so that the simd define bits are enabled when using the c++20 vec class
  • Add support for packed vec types by changing the gcc vec attributes to __attribute__((vector_size(sizeof(T)*L), aligned(alignof(data_t))))
  • Add support for converting a packed vec to an aligned vec & vice versa (I’ll probably just use a non-constexpr constructor for this to make things simpler)
    ^ Update: after adding support for packed vecs, I'm pretty sure the current constructors already allow converting between packed and unpacked vecs
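A rough illustration of the packed vs. aligned attribute idea from the todo list, assuming gcc or clang. The typedef names and the memcpy-based conversion helpers are hypothetical, picked for this sketch only; the PR itself says the conversion happens through the vec constructors.

```cpp
#include <cassert>
#include <cstring>

// Aligned: vector_size(16) alone normally implies 16-byte alignment.
typedef float aligned4f __attribute__((vector_size(16)));
// Packed: same 16 bytes of storage, alignment lowered to that of one float.
typedef float packed4f __attribute__((vector_size(16), aligned(alignof(float))));

static_assert(sizeof(aligned4f) == sizeof(packed4f), "same storage size");
static_assert(alignof(packed4f) == alignof(float), "packed keeps element alignment");

// Byte-wise copies are valid whatever the alignment of either side.
inline aligned4f to_aligned(const packed4f& p) {
    aligned4f a;
    std::memcpy(&a, &p, sizeof a);
    return a;
}

inline packed4f to_packed(const aligned4f& a) {
    packed4f p;
    std::memcpy(&p, &a, sizeof p);
    return p;
}
```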

sharkautarch commented Sep 11, 2024

I did a small amount of testing (though only with aligned highp vec4), and so far things seem to work OK~~, except:~~

When swizzling w/ .y (etc.), passing a direct swizzle expression to a C-style va_arg function is broken (ex: printf("%f\n", v.y); prints 0.000 no matter what).
I think it is because of the default argument promotions that apply only to C-style va_arg functions:
https://en.cppreference.com/w/cpp/language/variadic_arguments
Edit: I fixed the aforementioned issue
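To illustrate the kind of problem described above, here is a hedged sketch. It assumes the swizzle expression yields some proxy object rather than a plain float, which the comment above does not actually state; `ComponentProxy` and `format_component` are hypothetical. Arguments passed through `...` never go through a user-defined conversion operator, so converting explicitly before the call is one way to make the argument a real float.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical proxy type; an assumption made for this sketch only.
struct ComponentProxy {
    float value;
    constexpr operator float() const { return value; }
};

// Variadic arguments only receive the default argument promotions
// (float -> double, small integer types -> int). The explicit cast turns
// the proxy into a float first, which is then promoted to double for "%f".
inline int format_component(char* buf, std::size_t size, const ComponentProxy& c) {
    return std::snprintf(buf, size, "%f", static_cast<float>(c));
}
```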

sharkautarch commented Sep 27, 2024

New feature added: swizzling functions (enabled by default)
They leverage the compiler intrinsic __builtin_shufflevector() (for both packed and aligned vecs), so the swizzle functions get optimized into simd shuffle/blend instructions.

You have to specify at least two swizzle members and at most N swizzle members for a vec of length N, and you can't have duplicate members (you can't swizzle vec1s because that'd be silly). Example: v.yzx(), v.gbr()
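A small sketch of what a swizzle built on `__builtin_shufflevector()` can look like on a raw 4-float vector. `simd4f`, `swizzle_yzxw` and the `SKETCH_HAS_SHUFFLEVECTOR` guard are hypothetical and not the PR's code; the scalar fallback is only there for compilers that lack the builtin.

```cpp
#include <cassert>

// GCC/Clang vector extension: four floats in one 16-byte SIMD value.
typedef float simd4f __attribute__((vector_size(16)));

#ifdef __has_builtin
#  if __has_builtin(__builtin_shufflevector)
#    define SKETCH_HAS_SHUFFLEVECTOR 1
#  endif
#endif

// Equivalent of a hypothetical v.yzxw(): result = {v[1], v[2], v[0], v[3]}.
inline simd4f swizzle_yzxw(simd4f v) {
#ifdef SKETCH_HAS_SHUFFLEVECTOR
    // Indices 0..3 select from the first operand, 4..7 from the second.
    return __builtin_shufflevector(v, v, 1, 2, 0, 3);
#else
    return simd4f{v[1], v[2], v[0], v[3]};
#endif
}
```

The indices have to be compile-time constants, which fits swizzle names that are fixed in the source.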

glm only utilizes simd intrinsics for aarch64 and x86_64, but other platforms will now benefit from the arch-agnostic simd constructors and arithmetic operators used in the c++20 vec implementation
only support this capability on clang for now, because gcc is not able to handle my creativity (issue w/ gcc segfaulting)
also fix an issue preventing compile time evaluation for one of the ctors
@sharkautarch changed the title from "wip: constexpr vec (constructors) when using simd" to "wip: constexpr vec (constructors & operators) when using simd" on Dec 28, 2024
@sharkautarch
Author

Major improvements to this PR:

  • All operators for this vec class are now able to be compile-time evaluated
  • Swizzling functions now support duplicated components (EX: v.xxyy())
  • Only available when compiling w/ clang: swizzling w/ don't-care components, where you can specify lanes that you won't use, which can sometimes allow the compiler to use either a lower latency or higher throughput swizzling instruction (EX: v.Xyzz())
  • New method added to this vec class for doing simd blends between two vectors: template <std::array mask> vec<L,T,Q> blend(vec<L,T,Q> rhs). Use it like: auto v3 = v1.blend<{0,1,0,1}>(v2); //v3 = {v1[0], v2[1], v1[2], v2[3]}. On x86_64 w/ sse or avx, this method should compile down to a single simd instruction
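To make the blend mask semantics concrete, here is a scalar reference model. It assumes that a mask entry of 0 keeps the lane from the left-hand vector and 1 takes it from rhs, which is inferred from the usage comment above. `Lanes4` and `blend_ref` are hypothetical, and the mask is passed as four int template parameters rather than a `std::array` so the sketch also builds before C++20; it is not the simd implementation.

```cpp
#include <cassert>

// Hypothetical 4-lane value type used only for this reference model.
struct Lanes4 {
    float e[4];
};

// Mk == 0 keeps lane k from lhs, Mk != 0 takes lane k from rhs.
template <int M0, int M1, int M2, int M3>
constexpr Lanes4 blend_ref(const Lanes4& lhs, const Lanes4& rhs) {
    return Lanes4{{M0 ? rhs.e[0] : lhs.e[0],
                   M1 ? rhs.e[1] : lhs.e[1],
                   M2 ? rhs.e[2] : lhs.e[2],
                   M3 ? rhs.e[3] : lhs.e[3]}};
}

constexpr Lanes4 kV1{{1, 2, 3, 4}};
constexpr Lanes4 kV2{{10, 20, 30, 40}};
constexpr Lanes4 kV3 = blend_ref<0, 1, 0, 1>(kV1, kV2);
static_assert(kV3.e[0] == 1.0f && kV3.e[1] == 20.0f &&
              kV3.e[2] == 3.0f && kV3.e[3] == 40.0f,
              "mask {0,1,0,1} yields {v1[0], v2[1], v1[2], v2[3]}");
```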

When simd is enabled, both our c++20 vec implementation & the original implementations put the x,y,z,w components in (an) anonymous struct(s). This means you can no longer access the x,y,z,w components during compile-time evaluation, since accessing members of an anonymous struct *at all* during compile-time evaluation causes a compile error. We fix that by using operator[] instead, which accesses elementArr, and that *is* possible at compile time.
Also make operator&& and operator|| constexpr.
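A sketch of why routing reads through operator[] can help, under the assumption that the named components and `elementArr` share storage through a union (the commit message does not spell out the layout). `Vec4Union` is hypothetical. In a constant expression only the active union member may be read, so here the array, which the constructor initializes, is readable, while the named components are not.

```cpp
#include <cassert>

// Hypothetical layout; anonymous structs are a gcc/clang extension in C++.
struct Vec4Union {
    union {
        struct { float x, y, z, w; };  // named components
        float elementArr[4];           // array view, made active by the ctor
    };

    constexpr Vec4Union(float a, float b, float c, float d)
        : elementArr{a, b, c, d} {}

    // Reads the active member, so it is usable in constant expressions.
    constexpr float operator[](int i) const { return elementArr[i]; }
};

constexpr Vec4Union kV(1.0f, 2.0f, 3.0f, 4.0f);
static_assert(kV[1] == 2.0f, "array member is readable at compile time");
// static_assert(kV.y == 2.0f, "");  // rejected: y is not the active union member
```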