Use GEMM kernel for KleidiAI to accelerate FP16Benchmark #3440

milpuz01 · 2024-12-03T17:27:43Z

This PR shows how to use kernels from KleidiAI to accelerate FP16Benchmark.

Makefile.FP16Benchmark.aarch64 compiles FP16Benchmark and FP16Test on AArch64 platforms with FBGEMM_ENABLE_KLEIDIAI enabled. It assumes that KleidiAI is checked out in the external directory on branch f32_f32_f16p (https://gitlab.arm.com/kleidi/kleidiai/-/tree/f32_f32_f16p?ref_type=heads), which provides the kernels implemented in KleidiAIFP16UKernelsNeon.cc (https://gitlab.arm.com/kleidi/kleidiai/-/blob/f32_f32_f16p/kai/ukernels/matmul/matmul_f32_f32_f16p/KleidiAIFP16UKernelsNeon.cc?ref_type=heads).
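
The setup described above can be sketched as a pair of shell functions. This is a sketch under stated assumptions, not the PR's documented procedure: the clone URL is inferred from the project page linked above, the target path `external/kleidiai` is a hypothetical location inside the external directory, and the Makefile is invoked with its default target because the PR does not name one.

```shell
#!/bin/sh
# Sketch only: values marked "assumed" are not stated in the PR.
KLEIDIAI_URL="https://gitlab.arm.com/kleidi/kleidiai.git"  # assumed: inferred from the linked project page
KLEIDIAI_BRANCH="f32_f32_f16p"                              # branch named in the PR description
KLEIDIAI_DIR="external/kleidiai"                            # assumed: exact path inside external/
FP16_MAKEFILE="Makefile.FP16Benchmark.aarch64"              # Makefile named in the PR description

# Check out only the branch that carries the FP16 kernels.
fetch_kleidiai() {
  git clone --branch "$KLEIDIAI_BRANCH" --single-branch "$KLEIDIAI_URL" "$KLEIDIAI_DIR"
}

# Build with the Makefile's default target (assumed; the PR does not list targets).
# Run this from the directory that contains the Makefile, on an AArch64 host.
build_fp16_benchmark() {
  make -f "$FP16_MAKEFILE"
}
```

Calling `fetch_kleidiai && build_fp16_benchmark` from an FBGEMM checkout would then fetch the branch and start the build. Whether the Makefile sits at the repository root, so that both steps run from the same directory, is also an assumption.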

netlify · 2024-12-03T17:28:02Z

Name	Link
🔨 Latest commit	`667ce9b`
🔍 Latest deploy log	https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/674f3f917af367000807cff3
😎 Deploy Preview	https://deploy-preview-3440--pytorch-fbgemm-docs.netlify.app

Use GEMM kernel for KleidiAI to accelerate FP16Benchmark

667ce9b

facebook-github-bot added the cla signed label Dec 3, 2024
