Fix BF16 group_index_select_2d on AMD GPU (pytorch#2321)
Summary:
Pull Request resolved: pytorch#2321

as title
```
[[email protected] /data/users/zhuoran/fbsource/fbcode (7932bb4ab|remote/fbsource/stable...)]$ HIP_VISIBLE_DEVICES=7 numactl --cpunodebind=1 --membind=1 buck2 run mode/{opt,amd-gpu} -c fbcode.triton_backend=amd -c fbcode.enable_gpu_sections=true //hammer/modules/sequential/encoders/tests:hstu_bench -- --enable-multi-stream=true --enable_profiler=true --num-streams=3 --num-workers=3
Watchman fresh instance: new mergebase, cleared graph state, cleared dep files
 ⚠  Python 3.8 is EOL, and is going away by the end of H1 2024. Upgrade //caffe2/tools/setup_helpers:gen_version_header to Python 3.10 now to avoid breakages. https://fburl.com/py38-sunsetting
 ⚠  Python 3.8 is EOL, and is going away by the end of H1 2024. Upgrade //caffe2:substitute to Python 3.10 now to avoid breakages. https://fburl.com/py38-sunsetting
 ⚠  Python 3.8 is EOL, and is going away by the end of H1 2024. Upgrade //caffe2/tools/amd_build:build_amd to Python 3.10 now to avoid breakages. https://fburl.com/py38-sunsetting
 ⚠  Python 3.8 is EOL, and is going away by the end of H1 2024. Upgrade //caffe2/torchgen:gen to Python 3.10 now to avoid breakages. https://fburl.com/py38-sunsetting
 ⚠  Python 3.8 is EOL, and is going away by the end of H1 2024. Upgrade //caffe2/tools/setup_helpers:generate_code to Python 3.10 now to avoid breakages. https://fburl.com/py38-sunsetting
Action failed: fbcode//deeplearning/fbgemm/fbgemm_gpu:sparse_ops_hip (hip_compile src/sparse_ops/sparse_group_index.hip (pic))
Remote command returned non-zero exit code 1
Reproduce locally: `frecli cas download-action f0569d85851723e287f08ed03c0bc831587c0a05f94c911fe0b204ddd7670d24:145`
stdout:
stderr:
buck-out/v2/gen/fbcode/2ab98e452e15a67d/deeplearning/fbgemm/fbgemm_gpu/__sparse_ops_hip_hipify_gen__/out/src/sparse_ops/sparse_group_index.hip:11:10: fatal error: 'cuda_bf16.h' file not found
#include <cuda_bf16.h>
         ^~~~~~~~~~~~~
1 error generated when compiling for gfx90a.
```

Reviewed By: nrsatish, sryap, htyu

Differential Revision: D53549323

fbshipit-source-id: 73753c91cbb4c327ff6952bfa7d889ef02b8a31f
zoranzhao authored and facebook-github-bot committed Feb 8, 2024
1 parent a816f8c commit 86ea895
Showing 1 changed file with 6 additions and 5 deletions.
11 changes: 6 additions & 5 deletions fbgemm_gpu/src/sparse_ops/sparse_group_index.cu
@@ -6,12 +6,13 @@
  * LICENSE file in the root directory of this source tree.
  */
 
-#ifdef USE_ROCM
-#include <hip/hip_bf16.h>
-#else
+#if (defined(USE_ROCM))
+#include <hip/hip_bfloat16.h>
+#elif (                                                \
+    (defined(CUDA_VERSION) && CUDA_VERSION < 11000) || \
+    (defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 800)))
 #include <cuda_bf16.h>
-#endif // USE_ROCM
-
+#endif
 #include "common.cuh"
 
 using Tensor = at::Tensor;