Fix aarch64 build break (pytorch#2055)

Summary:
Pull Request resolved: pytorch#2055

The aarch64 CUDA builds use D46213158 to disable F14 intrinsics for compilations driven by NVCC/CUDA, instead of the typical workaround that x86 uses (D34439017). However, some issue appears to prevent NVCC from parsing the `F14SetFallback.h` code. We likely never use this code from `.cu` sources, so this diff drops an umbrella header (`torch/script.h`) and uses fine-grained `#include`s to avoid F14.

Reviewed By: meyering

Differential Revision: D49792747

fbshipit-source-id: 8d2ef8cc68bcb2442a5b34e521d548cbb03a4c09
q10 · Sep 30, 2023 · 7b7ad61
1 parent 39914ef
Showing 1 changed file with 2 additions and 1 deletion.
diff --git a/fbgemm_gpu/include/fbgemm_gpu/permute_pooled_embedding_ops.h b/fbgemm_gpu/include/fbgemm_gpu/permute_pooled_embedding_ops.h
@@ -9,7 +9,8 @@
 #pragma once
 
 #include <ATen/ATen.h>
-#include <torch/script.h>
+#include <torch/csrc/api/include/torch/types.h>
+#include <torch/csrc/autograd/custom_function.h>
 #include "fbgemm_gpu/ops_utils.h"
 #include "fbgemm_gpu/sparse_ops_utils.h"