Performance issue possibly related to returning C-style adaptor from function #2780

Open
foolnotion opened this issue Apr 6, 2024 · 1 comment

@foolnotion

Hi,

I believe my issue is similar to #600, but that issue is 7 years old and the solution no longer applies.

I have a thin abstraction layer in my library which allows me to use different math backends (Xtensor, Eigen, Armadillo, etc.). This depends on being able to return views/maps from raw pointers. As per the documentation, the following function maps a C-style 1D array to a tensor:

template<typename T, std::size_t S>
inline auto Map(T* res) {
    auto a = xt::adapt(res, S, xt::no_ownership(), std::array{S});
    return a;
}

This is later used like this:

template<typename T, std::size_t S>
auto Add(T* res, auto const*... args) {
    Map<T, S>(res) = (Map<T const, S>(args) + ...);
}
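
For reference, the element-wise sum that the fold in `Add` is expected to compute can be written as a plain loop over the raw pointers. This is only a minimal sketch, useful as a correctness check or a timing baseline; `AddLoop` is a hypothetical name that is not part of the library or of xtensor, and the fold expression assumes a C++17 compiler.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical baseline: res[i] = args0[i] + args1[i] + ... for i in [0, S).
// No views or adaptors are involved, so no temporaries can be created.
template <typename T, std::size_t S, typename... Args>
void AddLoop(T* res, Args const*... args) {
    static_assert(sizeof...(Args) > 0, "at least one input array is required");
    assert(res != nullptr);
    for (std::size_t i = 0; i < S; ++i) {
        // Fold over the pack: sums the i-th element of every input array.
        res[i] = (args[i] + ...);
    }
}
```

If a loop like this performs on par with the Eigen version while the adaptor-based `Add` does not, the overhead would more likely sit in the adaptor or in expression evaluation than in the compiler flags.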

However, this code is about 2.5 times slower than the equivalent Eigen code, and I suspect that the data is actually being copied.
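
One way to test the copy suspicion, sketched here with a hypothetical stand-in rather than xtensor itself: a non-owning view should report the same address as the pointer it was built from, and writes through it should land in the caller's buffer. `RawView`, `MakeView` and `AliasesBuffer` are illustrative names only; the same pointer comparison could presumably be made on the real adaptor, assuming it exposes its underlying pointer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning view: stores the pointer and never allocates.
template <typename T, std::size_t S>
struct RawView {
    T* ptr;
    T& operator[](std::size_t i) const {
        assert(i < S);  // bounds check, active in debug builds only
        return ptr[i];
    }
    T* data() const { return ptr; }
};

// Builds the view and returns it by value; only the pointer is copied.
template <typename T, std::size_t S>
RawView<T, S> MakeView(T* p) {
    return RawView<T, S>{p};
}

// True when the view aliases the caller's buffer, i.e. no element copy was made.
template <typename T, std::size_t S>
bool AliasesBuffer(T* p) {
    auto v = MakeView<T, S>(p);
    return v.data() == p;
}
```

A check of this kind only shows whether the view itself aliases the buffer; it says nothing about temporaries created while the right-hand side of the assignment is evaluated.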

I have also looked into other possible causes, such as missing optimizations, but:

  • the code is compiled with -O3 -march=x86-64-v3 (which includes AVX2)
  • xsimd is installed and XTENSOR_USE_XSIMD is defined

Any help fixing the performance issue would be greatly appreciated, thanks.

@Arktische

You can try removing the unnecessary local variable in Map and returning xt::adapt(…) directly. That returns an xexpression, which makes it easy for the compiler to apply RVO, whereas NRVO sometimes can’t be applied.
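
The difference between the two return forms can be observed with a small counting type. This is a sketch, not xtensor code: `Tracked`, `make_named` and `make_direct` are hypothetical stand-ins for the adaptor and for the two versions of `Map`. Since C++17, returning a prvalue constructs the result in place with no copy or move; returning a named local permits NRVO but does not require it, and falls back to a move when it is not applied.

```cpp
#include <cassert>
#include <cstddef>

// Global counters so the sketch can report what the compiler actually did.
int g_copies = 0;
int g_moves = 0;

// Hypothetical stand-in for a non-owning adaptor: a pointer plus a size.
struct Tracked {
    double* data;
    std::size_t size;
    Tracked(double* d, std::size_t n) : data(d), size(n) {}
    Tracked(Tracked const& o) : data(o.data), size(o.size) { ++g_copies; }
    Tracked(Tracked&& o) noexcept : data(o.data), size(o.size) { ++g_moves; }
};

// Mirrors the original Map: a named local is returned.
// NRVO is allowed here but not guaranteed; otherwise the local is moved.
Tracked make_named(double* p, std::size_t n) {
    Tracked a(p, n);
    return a;
}

// Mirrors the suggested change: the prvalue is returned directly.
// Copy elision is guaranteed for this form since C++17.
Tracked make_direct(double* p, std::size_t n) {
    return Tracked(p, n);
}
```

In this sketch the worst case for `make_named` is a single move of the pointer-and-size handle; the elements it refers to are not touched by either form.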
