Tensor Support - Kernels #742

Open
wants to merge 21 commits into
base: main


CPestka
Contributor

@CPestka CPestka commented Jun 3, 2024

Tensor Support - Kernels

This PR adds tensor support to all, or at least most, CPU kernels that are not 2D- or frame-specific and that do not just wire up a library such as Eigen or CBLAS (let me know if I missed one). It also adds some kernels that are specific to higher-dimensional data / tensors. I omitted the CUDA kernels for now, but if somebody is interested I can add those as well.

One kernel that differs a bit from the rest is the aggregate kernel, which is written to work with asynchronous I/O without the compiler needing to understand what is going on.

A couple of general points regarding the existing kernels, which I have not touched:

  1. Besides generally everything currently being 2D-specific (fair enough), a lot of things are named MatX or MatrixX even though they also operate on frames, and now on tensors as well. Imo they should be renamed to something like ObjX, as is already the case for some of the kernels.
  2. It is unclear to me what the "allowed state" of the res pointer passed into most of the kernels is supposed to be. Some kernels check for a nullptr and create a new, fitting object if they encounter one, but if the pointer is not a nullptr they do not check whether the object is actually "suitable" to be overwritten (e.g. large enough). Other kernels don't check at all and always create a new object, which is what I did here for the tensor kernels as well.
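
To make the second point concrete, here is a minimal sketch of one possible contract for the result pointer: reuse the object if it already has the required shape, otherwise replace it. Everything in it is an assumption made for illustration. `DenseBuffer`, `ensureResult` and `addScalar` are hypothetical names, not classes or kernels from this project, and `std::unique_ptr` stands in for the raw result pointer so that replacing an unsuitable object cannot leak.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a result object; not one of the project's classes.
struct DenseBuffer {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> values;

    DenseBuffer(std::size_t r, std::size_t c)
        : rows(r), cols(c), values(r * c, 0.0) {}
};

// Make sure `res` points to an object of exactly rows x cols.
// A null or wrongly shaped `res` is replaced; a suitable one is reused.
inline void ensureResult(std::unique_ptr<DenseBuffer> &res,
                         std::size_t rows, std::size_t cols) {
    const bool suitable = res && res->rows == rows && res->cols == cols;
    if (!suitable) {
        res.reset(new DenseBuffer(rows, cols));
    }
}

// Example kernel following that contract: res = arg + scalar, element-wise.
inline void addScalar(std::unique_ptr<DenseBuffer> &res,
                      const DenseBuffer &arg, double scalar) {
    ensureResult(res, arg.rows, arg.cols);
    for (std::size_t i = 0; i < arg.values.size(); ++i) {
        res->values[i] = arg.values[i] + scalar;
    }
}
```

Whether a mismatching result object should be replaced, resized, or treated as an error is exactly the open question; the sketch only shows that the check can live in one shared helper instead of being repeated, or skipped, in each kernel.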

@CPestka
Contributor Author

CPestka commented Jun 3, 2024

Just noticed that I used concepts in the quantize kernel, so I added C++20 in CMake. Not sure why it compiled for me without it; gcc-12 should have complained, as gcc-9 did on the CI machine.

@CPestka
Contributor Author

CPestka commented Jun 4, 2024

Just noticed that GCC only supports concepts from version 10 onward.
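
For context on the compiler issue, here is a hedged sketch of how a concept-constrained function can fall back to SFINAE on compilers without C++20 concepts. It is not the quantize kernel from this PR: `quantizeValue` and its mapping of a value range onto 0..255 are assumptions made for illustration. Only the standard feature-test macro `__cpp_concepts` and the standard concept `std::floating_point` are taken as given.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <type_traits>

#if defined(__cpp_concepts) && __cpp_concepts >= 201907L
#include <concepts>
// C++20 path: constrain the value type with a standard concept.
template <std::floating_point VT>
#else
// Pre-C++20 fallback: the same constraint expressed with SFINAE.
template <typename VT,
          typename = typename std::enable_if<
              std::is_floating_point<VT>::value>::type>
#endif
// Hypothetical helper: map `value` from [min, max] onto the range 0..255.
std::uint8_t quantizeValue(VT value, VT min, VT max) {
    if (!(max > min)) {
        return 0; // degenerate or invalid range
    }
    // Clamp first so out-of-range inputs saturate instead of wrapping.
    const VT clamped = std::min(std::max(value, min), max);
    const VT scaled = (clamped - min) / (max - min) * static_cast<VT>(255);
    return static_cast<std::uint8_t>(std::lround(scaled));
}
```

Raising the required standard in the build, as done in this PR, is the simpler route; a guard like this only pays off if older compilers have to stay supported.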

@CPestka CPestka mentioned this pull request Jun 7, 2024
@pdamme pdamme self-requested a review June 21, 2024 13:48