Commit
ggml-cuda : perform cublas mat mul of quantized types as f16 (ggerganov#3412)

* ggml-cuda : perform cublas matrix multiplication of quantized types as fp16
* rename CC_TURING to CC_VOLTA
* disable fp16 mat mul completely with multi GPU