[feature] Use Unified Sequence Parallel (USP) instead of Ring attention #226

Open
feifeibear opened this issue Sep 1, 2024 · 0 comments

Your roadmap mentions plans for sequence parallelism, specifically implementing ring attention. I suggest implementing Unified Sequence Parallel (USP) instead, which combines Ulysses and Ring attention into a 2D sequence parallelism approach. USP offers better performance than using Ring or Ulysses alone.
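To illustrate what "2D" means here, the sketch below splits the sequence-parallel ranks into a grid of Ulysses groups and Ring groups. It is only a minimal illustration under stated assumptions: the function name `usp_groups` is hypothetical, the contiguous-rank layout for Ulysses groups is one possible choice, and none of this is the actual API of long-context-attention.

```python
from typing import List, Tuple


def usp_groups(sp_degree: int, ulysses_degree: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Split `sp_degree` ranks into a 2D grid (hypothetical helper, not a library API).

    Assumption: Ulysses groups use contiguous ranks, and Ring groups connect
    the ranks that sit at the same position in each Ulysses group.
    """
    if sp_degree % ulysses_degree != 0:
        raise ValueError("sp_degree must be divisible by ulysses_degree")
    ring_degree = sp_degree // ulysses_degree

    # One Ulysses group per ring position: ulysses_degree contiguous ranks each.
    ulysses_groups = [
        list(range(r * ulysses_degree, (r + 1) * ulysses_degree))
        for r in range(ring_degree)
    ]
    # One Ring group per Ulysses position: ranks strided by ulysses_degree.
    ring_groups = [
        list(range(u, sp_degree, ulysses_degree))
        for u in range(ulysses_degree)
    ]
    return ulysses_groups, ring_groups


if __name__ == "__main__":
    ulysses, ring = usp_groups(sp_degree=8, ulysses_degree=2)
    print("Ulysses groups:", ulysses)
    print("Ring groups:", ring)
```

Setting `ulysses_degree` equal to `sp_degree` reduces this layout to a single Ulysses group, and setting it to 1 reduces it to a single Ring group, so both existing schemes are special cases of the grid.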

Our implementation has been widely used for long-sequence training and inference of large language models (LLMs) and DiTs. The code is available at the following link:

https://github.com/feifeibear/long-context-attention

For a detailed technical report, please refer to:

https://arxiv.org/abs/2405.07719

I hope this information is helpful, and I look forward to your team considering this suggestion.
