Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prov/efa: Cleanup ep and peer structures #10382

Merged
merged 3 commits into from
Sep 17, 2024

Conversation

shijin-aws
Copy link
Contributor

This PR contains 2 commits that cleans up efa_rdm_ep and efa_rdm_peer structs.

These dlists are now migrated to srx context.

Signed-off-by: Shi Jin <[email protected]>
Currently, extra_info is a 256*8 bytes array in
efa_rdm_ep and efa_rdm_peer, this is way too much
as it can cover 256*64 bits of extra features/requests
that RDM endpoint is likely to have.
This patch reduces it to 4*8 bytes array that
can cover 256 bits which should be enough for the foreseeable
future.

Signed-off-by: Shi Jin <[email protected]>
@shijin-aws shijin-aws requested a review from a team September 14, 2024 00:00
@shijin-aws shijin-aws changed the title prov/efa: Some cleanups prov/efa: Cleanup ep and peer structures Sep 14, 2024
* and efa_rdm_peer
* 4 means 64*4=256 bits of extra features or requests
*/
#define EFA_RDM_MAX_NUM_EXINFO (4)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm did this bug cause any issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code was actually correct: https://github.com/ofiwg/libfabric/blob/main/prov/efa/src/rdm/efa_rdm_pke_nonreq.c#L40 that it divided by 64 when calculating the number of extra_info. But I have no idea why we make it 256 as the array length. Likely an overlook

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The arrays were originally of length 1 ($4-4+1$): aa5fec1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was planned that RDM v4 will support only 64 extra features, v5 will support 128, and so on. So it was reasonable to have it as length 1 at that time. I think this PR aimed to make it support 256/64 = 4 so it can support 256 features.

@shijin-aws
Copy link
Contributor Author

bot:aws:retest

1 similar comment
@shijin-aws
Copy link
Contributor Author

bot:aws:retest

@shijin-aws shijin-aws merged commit 8f45c7a into ofiwg:main Sep 17, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants