
view(::CuArray, ...) is sometimes type-unstable: #2526

Open
charleskawczynski opened this issue Oct 21, 2024 · 6 comments
Labels
cuda array Stuff about CuArray. good first issue Good for newcomers

Comments

@charleskawczynski
Contributor

Here is a reproducer:

using CUDA, JET

a = CUDA.zeros(Float32, 2, 2);
view(a, 1, :); # make sure this doesn't error
view(a, :, 1); # make sure this doesn't error
@test_opt view(a, :, 1); # fails
@test_opt view(a, 1, :); # passes

Here is the error output:

julia> @test_opt view(a, :, 1); # fails
JET-test failed at REPL[61]:1
  Expression: #= REPL[61]:1 =# JET.@test_opt view(a, :, 1)
stmt = :($(Expr(:method, :(Base.getproperty(CUDA, :method_table)), %J15, CodeInfo(
1 ─      nothing
│   @ /home/charliek/.julia/packages/CUDA/2kjXI/src/device/array.jl:81 within `none`
└──      goto #3 if not $(Expr(:boundscheck))
2 ─      checkbounds(A, index)
    @ /home/charliek/.julia/packages/CUDA/2kjXI/src/device/array.jl:82 within `none`
3 ┄ %4 = Base.getproperty(Base, :isbitsunion)
│   %5 = (%4)($(Expr(:static_parameter, 1)))
└──      goto #5 if not %5
    @ /home/charliek/.julia/packages/CUDA/2kjXI/src/device/array.jl:83 within `none`
4 ─ %7 = arrayref_union(A, index)
└──      return %7
    @ /home/charliek/.julia/packages/CUDA/2kjXI/src/device/array.jl:85 within `none`
5 ─ %9 = arrayref_bits(A, index)
└──      return %9
))))
  ═════ 1 possible error found ═════
  ┌ view(::CuArray{Float32, 2, CUDA.DeviceMemory}, ::Colon, ::Int64) @ GPUArrays /home/charliek/.julia/packages/GPUArrays/qt4ax/src/host/base.jl:310
  │┌ unsafe_view(A::CuArray{Float32, 2, CUDA.DeviceMemory}, I::Tuple{Base.Slice{…}, Int64}, ::GPUArrays.Contiguous) @ GPUArrays /home/charliek/.julia/packages/GPUArrays/qt4ax/src/host/base.jl:314
  ││┌ unsafe_contiguous_view(a::CuArray{Float32, 2, CUDA.DeviceMemory}, I::Tuple{Base.Slice{…}, Int64}, dims::Tuple{Int64}) @ GPUArrays /home/charliek/.julia/packages/GPUArrays/qt4ax/src/host/base.jl:319
  │││┌ derive(::Type{Float32}, a::CuArray{Float32, 2, CUDA.DeviceMemory}, dims::Tuple{Int64}, offset::Int64) @ CUDA /home/charliek/.julia/packages/CUDA/2kjXI/src/array.jl:799
  ││││┌ kwcall(::@NamedTuple{}, ::Type{…}, data::GPUArrays.DataRef{…}, dims::Tuple{…}) @ CUDA /home/charliek/.julia/packages/CUDA/2kjXI/src/array.jl:79
  │││││┌ (CuArray{Float32, 1})(data::GPUArrays.DataRef{CUDA.Managed{…}}, dims::Tuple{Int64}; maxsize::Int64, offset::Int64) @ CUDA /home/charliek/.julia/packages/CUDA/2kjXI/src/array.jl:83
  ││││││┌ finalizer(f::typeof(CUDA.unsafe_free!), o::CuArray{Float32, 1, CUDA.DeviceMemory}) @ Base ./gcutils.jl:87
  │││││││┌ unsafe_free!(xs::CuArray{Float32, 1, CUDA.DeviceMemory}) @ CUDA /home/charliek/.julia/packages/CUDA/2kjXI/src/array.jl:94
  ││││││││┌ unsafe_free!(::GPUArrays.DataRef{CUDA.Managed{CUDA.DeviceMemory}}) @ GPUArrays /home/charliek/.julia/packages/GPUArrays/qt4ax/src/host/abstractarray.jl:91
  │││││││││┌ release(::GPUArrays.RefCounted{CUDA.Managed{CUDA.DeviceMemory}}) @ GPUArrays /home/charliek/.julia/packages/GPUArrays/qt4ax/src/host/abstractarray.jl:42
  ││││││││││ runtime dispatch detected: %24::Any(%25::CUDA.Managed{CUDA.DeviceMemory})::Any
  │││││││││└────────────────────
  
ERROR: There was an error during testing

I think this is because a view along the last dimension can (efficiently) return a regular CuArray instead of a SubArray, but it's not clear to me why that would lead to a JET failure.
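To illustrate the behavior described above (a sketch, not from the thread; it assumes a CUDA-capable device, and the exact type parameters may vary by CUDA.jl version):

```julia
using CUDA

a = CUDA.zeros(Float32, 2, 2)

# A view along the trailing dimension is contiguous in device memory, so
# GPUArrays can derive a plain CuArray that aliases the parent's buffer --
# this is the path that goes through `unsafe_contiguous_view` in the trace.
typeof(view(a, :, 1))  # a CuArray{Float32, 1, ...}

# A view along the leading dimension is strided, so the SubArray wrapper
# is kept, as for ordinary Base arrays.
typeof(view(a, 1, :))  # a SubArray{Float32, 1, ...}
```

Both calls return a valid view; only the wrapper type differs, which is why the two `@test_opt` cases in the reproducer exercise different code paths.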

@charleskawczynski charleskawczynski added the bug Something isn't working label Oct 21, 2024
@maleadt
Member

maleadt commented Oct 21, 2024

Is this a problem? It's not always worthwhile to trade excessive specialization for getting rid of a type instability. Especially in the scope of GPU operations, where everything is a multi-µs call anyway, dynamic dispatch is pretty fast.

@charleskawczynski
Contributor Author

I haven't measured it, and I suspect it's not. I was just surprised that it's not type stable. That's a fair point about specialization, but does that mean that it must be type unstable?

@maleadt
Member

maleadt commented Oct 22, 2024

No. With sufficient inlining, constant propagation, and effects annotations, the code could be made inferrable.
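As a small plain-Julia illustration of the constant-propagation mechanism mentioned above (hypothetical functions, unrelated to CUDA.jl's actual code):

```julia
# In general, `f` has return type Union{Int, Float64} -- a type instability.
f(flag::Bool) = flag ? 1 : 1.0

# But when the argument is a compile-time constant, constant propagation
# folds the branch away and the call site infers concretely.
g() = f(true)

# Base.return_types(g) reports a single concrete type (Int64) rather than
# the union, because the compiler propagated `true` into `f`.
```

The same idea applies to `view`: if the dispatch on contiguity can be resolved at compile time, the union of return types collapses.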

@maleadt maleadt added good first issue Good for newcomers cuda array Stuff about CuArray. and removed bug Something isn't working labels Oct 22, 2024
@huiyuxie
Contributor

It's not always worthwhile to trade excessive specialization to get rid of a type instability. Especially in the scope of GPU operations, where everything is a multi-us call anyway, dynamic dispatch is pretty fast.

True. Where (and how) do you draw the line, @maleadt, when trading off specialization against dynamic dispatch on the GPU 🤔? On the CPU, making everything type-stable seems good, but on the GPU I can only rely on benchmarks, which is sometimes tedious and time-consuming.

@maleadt
Member

maleadt commented Nov 18, 2024

Typically by relating things to the cost of a kernel launch (~20 µs); e.g., a few dynamic dispatches or allocations are much cheaper than that.
Note that I'm not necessarily opposed to making view type stable if anybody wants to tackle that.
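A quick way to put numbers on that comparison (a sketch; timings are machine-dependent and the `Ref{Any}` trick just forces a dynamic dispatch on purpose):

```julia
using BenchmarkTools

# Force a dynamic dispatch: the compiler can't know the element type of
# a Ref{Any}, so the call to `sin` dispatches at runtime.
boxed = Ref{Any}(1.0)
@btime sin($boxed[])   # typically tens of ns -- far below a ~20 µs launch
```

If the measured dispatch cost is a small fraction of the kernel-launch overhead, removing the instability is unlikely to show up in end-to-end GPU benchmarks.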

@huiyuxie
Contributor

Thanks!
