Disable CUDNN_SOFTMAX_FAST or use a separate math mode variable for softmax #506
Since #455 is merged, I want to point out that `CUDNN_SOFTMAX_FAST` can easily cause problems for the attention operation. In the masking scenario, we usually set the masked values to `-Inf` or some very small value, like `-1e9`. But if we want to use `CUDA.math_mode!(CUDA.FAST_MATH)` to accelerate the `gemm`, `softmax` would actually introduce many `NaN`s, presumably because `CUDNN_SOFTMAX_FAST` skips the max-subtraction step that `CUDNN_SOFTMAX_ACCURATE` performs before exponentiation. MWE:
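(The original MWE was not captured here; the following is a hypothetical sketch, not the author's code. It assumes `softmax` on a `CuArray` dispatches to the cuDNN softmax, and uses `-Inf32` masking of an attention-score matrix to illustrate where the `NaN`s can appear.)

```julia
# Hypothetical sketch: masked attention scores + fast math mode.
using CUDA, NNlib

CUDA.math_mode!(CUDA.FAST_MATH)            # fast math, e.g. to speed up gemm

scores = CUDA.randn(Float32, 8, 8)         # attention scores
mask   = CUDA.rand(Float32, 8, 8) .> 0.5f0 # keep roughly half the positions
masked = ifelse.(mask, scores, -Inf32)     # masked positions set to -Inf

y = softmax(masked; dims=1)
any(isnan, y)                              # with CUDNN_SOFTMAX_FAST this can be true
```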