Relocate quantized matmul reassociation flag (#2047)
* Remove the quantized matmul reassociation flag from the default CPU compile flags (it is re-added in the LLM API as a per-model extra_args entry)

This flag should be a model/use-case specific addition, not a default CPU compile flag.
monorimet committed Dec 20, 2023
1 parent 788cc91 commit fa95ed3
Showing 2 changed files with 1 addition and 1 deletion.
apps/shark_studio/api/llm.py (1 change: 1 addition & 0 deletions)

@@ -106,6 +106,7 @@ def compile(self) -> None:
             frontend="torch",
             external_weight_file=self.external_weight_file,
             write_to=self.vmfb_name,
+            extra_args=["--iree-global-opt-enable-quantized-matmul-reassociation"],
         )
         # TODO: delete the temp file
shark/iree_utils/compile_utils.py (1 change: 0 additions & 1 deletion)

@@ -43,7 +43,6 @@ def get_iree_device_args(device, extra_args=[]):
         get_iree_cpu_args()
         + u_kernel_flag
         + stack_size_flag
-        + ["--iree-global-opt-enable-quantized-matmul-reassociation"]
     )
     if device == "cuda":
         from shark.iree_utils.gpu_utils import get_iree_gpu_args
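With the flag gone from the CPU defaults, a use case that still wants reassociation opts in explicitly through extra_args, as the llm.py hunk above does. A minimal sketch of that pattern follows; only get_iree_device_args, its signature, and the flag string come from this diff, while combining the two flag lists by concatenation is an assumption about how callers assemble the final iree-compile arguments.

# Minimal sketch of the per-use-case opt-in, assuming a SHARK checkout
# where shark.iree_utils.compile_utils is importable.
# get_iree_device_args(device, extra_args=[]) and the flag string are taken
# from this diff; concatenating the lists below is an assumption.
from shark.iree_utils.compile_utils import get_iree_device_args

# No longer a default CPU flag: a quantized-LLM compile requests it explicitly.
QUANT_MATMUL_FLAGS = ["--iree-global-opt-enable-quantized-matmul-reassociation"]

device_args = get_iree_device_args("cpu", extra_args=QUANT_MATMUL_FLAGS)
compile_flags = device_args + QUANT_MATMUL_FLAGS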
