
[torch-frontend] add stablehlo IRs for Mixtral model. #254

Open · wants to merge 4 commits into main
Conversation

@Vremold (Collaborator) commented May 16, 2024

In this PR, we provide the stablehlo IR of a single Mixtral decoder layer, produced with the ByteIR stack. The IR is elided with the --mlir-elide-resource-strings-if-larger=1000 option, so not all of the dialect resources that store the model weights are displayed in the IR.
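As a rough sketch of what that option does (an assumption about the flag's threshold behavior, not ByteIR or MLIR source), any resource blob whose payload exceeds the size limit is printed as a placeholder instead of its full hex contents:

```python
# Sketch (assumption, not ByteIR/MLIR code): models the threshold behavior
# of --mlir-elide-resource-strings-if-larger=N when printing dialect resources.

def print_resource(name: str, blob: bytes, elide_if_larger: int = 1000) -> str:
    """Render one resource entry, eliding payloads larger than the threshold."""
    if len(blob) > elide_if_larger:
        # Large weight blobs: keep the key, drop the payload, as in elided IR.
        return f'{name}: "__elided__"  // {len(blob)} bytes omitted'
    # Small resources are still printed in full.
    return f'{name}: "0x{blob.hex()}"'

print(print_resource("torch_tensor_weight", b"\x00" * 4096))  # elided
print(print_resource("torch_tensor_scale", b"\x01" * 8))      # printed inline
```

With the PR's setting of 1000, every real weight tensor crosses the threshold, which is why the checked-in IR stays small enough to review.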

Note: we applied some local patches to make the compilation succeed.

  1. We eliminate torch.runtime.assert during the stablehlo conversion, as we have not yet decided how to handle it.
  2. We need the patches from PR 3322 and PR 3085 in torch-mlir.

@Vremold Vremold requested a review from qingyunqu May 16, 2024 17:01
@liwenchangbdbz liwenchangbdbz added the enhancement New feature or request label May 21, 2024
@Vremold (Collaborator, Author) commented May 30, 2024

Update at 2024.05.31:

We add the stablehlo IR of the whole Mixtral 8x7B model. Note that, to save compilation time and memory, we convert the large weights into splat DenseElementsAttrs. See frontends/torch-frontend/examples/inference/mixtral/infer_single_mixtral.py for how to run it.
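To see why splat constants help, compare storage costs. The sketch below is illustrative (the (4096, 14336) f32 shape is an assumption about one expert's FFN projection, not taken from the checked-in IR): a splat DenseElementsAttr keeps a single scalar plus shape metadata, while a dense attribute materializes every element.

```python
# Sketch (assumption): storage cost of dense vs. splat DenseElementsAttr.

def dense_bytes(shape, elem_bytes=4):
    """Bytes held by a dense attribute: every element is stored."""
    n = 1
    for dim in shape:
        n *= dim
    return n * elem_bytes

def splat_bytes(elem_bytes=4):
    """Bytes held by a splat attribute: one repeated scalar value."""
    return elem_bytes

# Hypothetical f32 expert weight; the shape is an assumption for illustration.
w_shape = (4096, 14336)
print(dense_bytes(w_shape))  # 234881024 bytes (224 MiB) in dense form
print(splat_bytes())         # 4 bytes as a splat
```

Multiplied across eight experts and 32 decoder layers, replacing dense weights with splats is what keeps the full-model IR compilable in reasonable time and memory.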

@Vremold Vremold changed the title [torch-frontend] add elided stablehlo IR for a single Mixtral decoder layer [torch-frontend] add stablehlo IRs for Mixtral model. May 30, 2024