Add optimized crc32 for Power 8+ processors #750

Open: wants to merge 3 commits into base: master

mmatti-sw

This is a pull request to include all Power8 optimisations rebased to v1.2.13.
The reference PR is #478

Manjunath S Matti and others added 3 commits November 16, 2022 04:40
Optimized functions for Power will make use of GNU indirect functions,
an extension to support different implementations of the same function,
which can be selected during runtime. This will be used to provide
optimized functions for different processor versions.

Since this is a GNU extension, we placed the definition of the Z_IFUNC
macro under `contrib/gcc`. This can be reused by other archs as well.

Author: Matheus Castanho <[email protected]>
Author: Rogerio Alves <[email protected]>
Signed-off-by: Manjunath Matti <[email protected]>
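
For readers unfamiliar with GNU indirect functions, the following is a minimal sketch of the mechanism the commit message describes. It is not the code from this PR: the `demo_*` names are hypothetical, the actual `Z_IFUNC` macro in `contrib/gcc` is not reproduced, and the two byte-sum routines merely stand in for a generic and an optimized implementation. The sketch assumes GCC or Clang on an ELF/glibc system for the `ifunc` path and falls back to a function pointer elsewhere.

```c
#include <stddef.h>
#include <stdint.h>   /* on glibc this also defines __GLIBC__ */

typedef uint32_t (*demo_sum_fn)(const unsigned char *, size_t);

/* Hypothetical baseline implementation: sum the bytes one at a time. */
static uint32_t demo_sum_generic(const unsigned char *buf, size_t len)
{
    uint32_t s = 0;
    for (size_t i = 0; i < len; i++)
        s += buf[i];
    return s;
}

/* Hypothetical "optimized" variant: same result, four bytes per iteration. */
static uint32_t demo_sum_unrolled(const unsigned char *buf, size_t len)
{
    uint32_t s = 0;
    size_t i = 0;
    for (; i + 4 <= len; i += 4)
        s += (uint32_t)buf[i] + buf[i + 1] + buf[i + 2] + buf[i + 3];
    for (; i < len; i++)
        s += buf[i];
    return s;
}

/* Choose an implementation for the CPU we are running on. */
static demo_sum_fn demo_sum_select(void)
{
#if defined(__powerpc64__) && defined(__GLIBC__) && \
    defined(__GNUC__) && !defined(__clang__)
    /* GCC built-in on PowerPC; "arch_2_07" corresponds to POWER8. */
    if (__builtin_cpu_supports("arch_2_07"))
        return demo_sum_unrolled;
#endif
    (void)demo_sum_unrolled;  /* referenced so non-Power builds do not warn */
    return demo_sum_generic;
}

#if defined(__GNUC__) && defined(__ELF__) && defined(__GLIBC__)
/* GNU indirect function: the loader calls the resolver once and binds
 * demo_sum to whatever function pointer it returns. */
static demo_sum_fn demo_sum_resolver(void)
{
    return demo_sum_select();
}

uint32_t demo_sum(const unsigned char *buf, size_t len)
    __attribute__((ifunc("demo_sum_resolver")));
#else
/* Fallback (not ifunc): resolve through a pointer on the first call. */
uint32_t demo_sum(const unsigned char *buf, size_t len)
{
    static demo_sum_fn impl;
    if (!impl)
        impl = demo_sum_select();
    return impl(buf, len);
}
#endif
```

Callers simply call `demo_sum(buf, len)`; with `ifunc` the selection cost is paid once at load time rather than on every call.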
This commit adds an optimized version for the crc32 function based
on crc32-vpmsum from https://github.com/antonblanchard/crc32-vpmsum/

This is the C implementation created by Rogerio Alves
<[email protected]>

It makes use of vector instructions to speed up the CRC32 algorithm.

Author: Rogerio Alves <[email protected]>
Signed-off-by: Manjunath Matti <[email protected]>
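
As a point of comparison, here is a minimal bit-at-a-time CRC-32 sketch. It is not the vectorized code from this commit; it only illustrates the checksum that any optimized variant is expected to reproduce. The function name `demo_crc32_bitwise` is hypothetical, and its calling convention (pass 0 as the initial value, feed the result back in to continue) is assumed to mirror zlib's `crc32()`.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-32 using the reflected polynomial 0xEDB88320.
 * Pass crc = 0 to start; pass a previous result to continue a stream. */
uint32_t demo_crc32_bitwise(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;                       /* pre-condition the register */
    while (len--) {
        crc ^= *buf++;                /* fold in the next byte */
        for (int k = 0; k < 8; k++)   /* one shift per bit */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;                      /* post-condition the result */
}
```

A slow reference like this is useful in tests: an optimized implementation can be run over the same buffers, including odd lengths and split inputs, and the results compared.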
Clang 7 changed the behavior of vec_xxpermdi to match GCC's
behavior. As a result, code that worked on Clang 6 stopped
working on Clang >= 7.

Tested on Clang 6, 7, 8 and 9.

Reference: https://bugs.llvm.org/show_bug.cgi?id=38192

Signed-off-by: Tulio Magno Quites Machado Filho <[email protected]>
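
A compiler-version gate is one way code can support both behaviors. The sketch below shows only the gate, under the assumption that the split falls at Clang 7 as the commit message states; the macro and function names are hypothetical, and the actual `vec_xxpermdi` operand changes made by this commit are not reproduced here.

```c
/* 1 when compiled by a Clang older than 7, otherwise 0
 * (GCC, Clang >= 7, and any other compiler). */
#if defined(__clang__) && defined(__clang_major__) && (__clang_major__ < 7)
#define DEMO_LEGACY_XXPERMDI 1
#else
#define DEMO_LEGACY_XXPERMDI 0
#endif

/* Hypothetical helper reporting which code path was compiled in. */
int demo_uses_legacy_xxpermdi(void)
{
#if DEMO_LEGACY_XXPERMDI
    return 1;   /* path written for the pre-Clang-7 behavior */
#else
    return 0;   /* path written for the GCC-compatible behavior */
#endif
}
```

Because the choice is made by the preprocessor, only one variant is compiled into a given build.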
@Neustradamus

@madler: Can you look?

@ljavorsk

I've tried to apply your patch on top of the patch from #410, and it has some rejected hunks (configure.rej and Makefile.in.rej; I can send you the output if you want).

The previous patch (#478) was rebased on top of that patch, could you please preserve this order?

@nmoinvaz
Contributor

The patches for Power are also maintained and incorporated in zlib-ng if anybody is interested.

@ljavorsk

Okay, @iii-i has rebased his patch on top of yours and provided it to me.

I would like to agree on the order in which you'll have them applied, so I don't need to change it too often. Is that okay with you?

@mmatti-sw
Author

I am ok with any order you or @iii-i would like to follow.

@iii-i

iii-i commented Nov 29, 2022

I'd prefer POWER patches to go first, since they provide a foundation for adding optimized CRC32 implementations.

@ljavorsk

Okay, I agree with that. Thank you

@ljavorsk

Hi, could you please rebase your patches on top of zlib 1.3?

@Neustradamus

@ljavorsk

Hi, sorry for the inconvenience, @mmatti-sw. We've transitioned to zlib-ng as of Fedora 40, so we no longer plan to rebase zlib.

This means that you can fully focus on the zlib-ng PRs from now on.
