System of PDEs with CUDA? #410

Closed
killah-t-cell opened this issue Oct 3, 2021 · 25 comments

@killah-t-cell (Contributor)

I tried to adapt the https://neuralpde.sciml.ai/dev/pinn/2D/ GPU tutorial to a system of PDEs and unfortunately failed. I need to turn the initθ into a CuArray but I get a warning that Scalar indexing is disallowed. What is the performant/correct way to do the mapping I am doing here with CUDA?

using Flux, CUDA, DiffEqFlux
chain = [FastChain(FastDense(3, 16, Flux.σ), FastDense(16,16,Flux.σ), FastDense(16, 1)),
         FastChain(FastDense(2, 16, Flux.σ), FastDense(16,16,Flux.σ), FastDense(16, 1))]

initθ = map(c -> CuArray(Float64.(c)), DiffEqFlux.initial_params.(chain))
@ChrisRackauckas (Member)

That code works?

@killah-t-cell (Contributor, Author)

@ChrisRackauckas not when you run it on the GPU; there it throws a scalar indexing error. It only throws a warning when I run it in the REPL.

@ChrisRackauckas (Member)

using Flux, CUDA, DiffEqFlux
CUDA.allowscalar(false)
chain = [FastChain(FastDense(3, 16, Flux.σ), FastDense(16,16,Flux.σ), FastDense(16, 1)),
         FastChain(FastDense(2, 16, Flux.σ), FastDense(16,16,Flux.σ), FastDense(16, 1))]

initθ = map(c -> CuArray(Float64.(c)), DiffEqFlux.initial_params.(chain))

seems to work fine?

@killah-t-cell (Contributor, Author)

Oh great :)

@killah-t-cell (Contributor, Author)

I should have included a fuller example: the code works up to that point, and then fails once I call solve with:

ERROR: LoadError: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
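
For reference, the guard that produces this error can be reproduced in isolation. A minimal sketch, assuming a CUDA-capable GPU and CUDA.jl installed:

```julia
using CUDA
CUDA.allowscalar(false)          # disallow slow element-wise access to GPU arrays

xs = CuArray(rand(Float64, 4))
# xs[1]                          # would throw "Scalar indexing is disallowed."
x1 = CUDA.@allowscalar xs[1]     # explicit opt-in, acceptable for prototyping
s  = sum(xs)                     # reductions run as GPU kernels, no scalar indexing
```

In the PINN case the scalar indexing happens inside library code during the loss/gradient evaluation (see the stack traces later in this thread), so `@allowscalar` in user code is not a real fix.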

Is there anything I am doing wrong here? This is the full example:

@parameters t x v
@variables f(..) E(..) 
Dx = Differential(x)
Dt = Differential(t)
Dv = Differential(v)

# Constants
μ_0 = 1.25663706212e-6 # N A⁻²
ε_0 = 8.8541878128e-12 # F ms⁻¹
e   = 1.602176634e-19 # Coulombs
m_e = 9.10938188e-31 # Kg
v_th = sqrt(2)

# Space
domains = [t ∈ Interval(0.0, 1.0),
           x ∈ Interval(0.0, 1.0),
           v ∈ Interval(0.0, 1.0)]

# Integrals
Iv = Integral(v in DomainSets.ClosedInterval(-1, 1)) 

# Equations
eqs = [Dt(f(t,x,v)) ~ - v * Dx(f(t,x,v)) - e/m_e * E(t,x) * Dv(f(t,x,v))
       Dx(E(t,x)) ~ e/ε_0 * (Iv(f(t,x,v)) - 1)]


bcs = [f(0,x,v) ~ 1/(v_th * sqrt(2π)) * exp(-v^2/(2*v_th^2)),
       E(0,x) ~ e/ε_0 * (Iv(f(0,x,v)) - 1) * x,
       E(t,0) ~ 0]


# Neural Network
CUDA.allowscalar(false)
chain = [FastChain(FastDense(3, 16, Flux.σ), FastDense(16,16,Flux.σ), FastDense(16, 1)),
         FastChain(FastDense(2, 16, Flux.σ), FastDense(16,16,Flux.σ), FastDense(16, 1))]

GPU = true # note: `GPU` was not defined in the original snippet; assumed to be a use-CUDA flag
initθ = GPU ? map(c -> CuArray(Float64.(c)), DiffEqFlux.initial_params.(chain)) : map(c -> Float64.(c), DiffEqFlux.initial_params.(chain))

discretization = NeuralPDE.PhysicsInformedNN(chain, QuadratureTraining(), init_params= initθ)
@named pde_system = PDESystem(eqs, bcs, domains, [t,x,v], [f(t,x,v), E(t,x)])
prob = SciMLBase.symbolic_discretize(pde_system, discretization)
prob = SciMLBase.discretize(pde_system, discretization)

# cb
cb = function (p,l)
    println("Current loss is: $l")
    return false
end

# Solve
opt = Optim.BFGS()
res = GalacticOptim.solve(prob, opt, cb = cb, maxiters=1000) # the code errors here
phi = discretization.phi

@killah-t-cell reopened this Oct 7, 2021
@ChrisRackauckas (Member)

BFGS fails on GPU, and the PR needs help: JuliaNLSolvers/Optim.jl#946
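
Until that PR lands, a workaround consistent with the rest of this thread is to train with a first-order optimizer, which runs fine on the GPU (sketch; `prob` and `cb` as defined in the example above):

```julia
# ADAM is first-order and GPU-friendly; only the BFGS-family optimizers hit this bug.
res = GalacticOptim.solve(prob, ADAM(0.001); cb = cb, maxiters = 2000)
prob = remake(prob, u0 = res.minimizer)   # warm-start further training if desired
```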

@udemirezen

I think this problem still occurs. I just got the error below with NeuralPDE 4.0.1.
(When I set CUDA.allowscalar(true) it runs, but very slowly; with the flag set to false I get this error.)

┌ Warning: : no method matching get_unit for arguments (SciMLBase.NullParameters,).
└ @ ModelingToolkit ~/.julia/packages/ModelingToolkit/0dKHa/src/systems/validation.jl:135
Solving...
ERROR: LoadError: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.

Thank you

@ChrisRackauckas (Member)

Using the latest version of Optim?

@udemirezen

udemirezen commented Nov 1, 2021

I think yes. Optim.jl 1.4.1 is the latest version.

This is the status of my current packages:

  [fbb218c0] BSON v0.3.4
  [052768ef] CUDA v3.5.0
  [667455a9] Cubature v1.5.1
  [aae7a2af] DiffEqFlux v1.43.0
  [5b8099bc] DomainSets v0.5.9
  [587475ba] Flux v0.12.8
  [a75be94c] GalacticOptim v2.2.0
  [961ee093] ModelingToolkit v6.4.9
  [315f7962] NeuralPDE v4.0.1
  [429524aa] Optim v1.4.1
  [1dea7af3] OrdinaryDiffEq v5.65.0
  [91a5bcdd] Plots v1.23.1
  [67601950] Quadrature v1.12.0

ERROR: LoadError: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations do not execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.

println("Adam Training...")
res = GalacticOptim.solve(prob, ADAM(0.001); cb = cb, maxiters=5)
prob = remake(prob,u0=res.minimizer)

println("LBFGS Training...")
res = GalacticOptim.solve(prob, LBFGS(); cb = cb, maxiters=5)

ChrisRackauckas added a commit to ChrisRackauckas/Optim.jl that referenced this issue Nov 1, 2021
This will allow the version to be tagged JuliaRegistries/General#46769 and is required to finally get SciML/NeuralPDE.jl#410 (comment) solved.
@ChrisRackauckas (Member)

It needs Optim v1.5, which was blocked because of a compat bounds issue fixed in JuliaNLSolvers/Optim.jl#959 . It should be fine on master and we'll get that tag finished ASAP.

pkofod pushed a commit to JuliaNLSolvers/Optim.jl that referenced this issue Nov 1, 2021
This will allow the version to be tagged JuliaRegistries/General#46769 and is required to finally get SciML/NeuralPDE.jl#410 (comment) solved.
@udemirezen

I have updated Optim to 1.5.0

(@v1.6) pkg> st
      Status `~/.julia/environments/v1.6/Project.toml`
  [fbb218c0] BSON v0.3.4
  [052768ef] CUDA v3.5.0
  [667455a9] Cubature v1.5.1
  [aae7a2af] DiffEqFlux v1.44.0
  [5b8099bc] DomainSets v0.5.9
  [587475ba] Flux v0.12.8
  [a75be94c] GalacticOptim v2.2.0
  [961ee093] ModelingToolkit v6.4.9
  [315f7962] NeuralPDE v4.0.1
  [429524aa] Optim v1.5.0
  [1dea7af3] OrdinaryDiffEq v5.65.0
  [91a5bcdd] Plots v1.23.1
  [67601950] Quadrature v1.12.0

But the result is the same :( It still gives the same error :(

Program Start...
┌ Warning: : no method matching get_unit for arguments (SciMLBase.NullParameters,).
└ @ ModelingToolkit ~/.julia/packages/ModelingToolkit/0dKHa/src/systems/validation.jl:135
┌ Warning: : no method matching get_unit for arguments (SciMLBase.NullParameters,).
└ @ ModelingToolkit ~/.julia/packages/ModelingToolkit/0dKHa/src/systems/validation.jl:135
Solving...
ADAM Training...
ERROR: LoadError: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] assertscalar(op::String)
    @ GPUArrays ~/.julia/packages/GPUArrays/3sW6s/src/host/indexing.jl:53
  [3] getindex(xs::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, I::Int64)
    @ GPUArrays ~/.julia/packages/GPUArrays/3sW6s/src/host/indexing.jl:86
  [4] first
    @ ./abstractarray.jl:368 [inlined]
  [5] dot(x::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, y::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
    @ LinearAlgebra /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/generic.jl:914
  [6] #1287
    @ ~/.julia/packages/ChainRules/Tj6lu/src/rulesets/Base/arraymath.jl:102 [inlined]
  [7] unthunk
    @ ~/.julia/packages/ChainRulesCore/Kzv5O/src/tangent_types/thunks.jl:194 [inlined]
  [8] wrap_chainrules_output
    @ ~/.julia/packages/Zygote/rv6db/src/compiler/chainrules.jl:104 [inlined]
  [9] map
    @ ./tuple.jl:215 [inlined]
 [10] wrap_chainrules_output
    @ ~/.julia/packages/Zygote/rv6db/src/compiler/chainrules.jl:105 [inlined]
 [11] ZBack
    @ ~/.julia/packages/Zygote/rv6db/src/compiler/chainrules.jl:179 [inlined]
 [12] #3740#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [13] macro expansion
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:556 [inlined]
 [14] macro expansion
    @ ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:129 [inlined]
 [15] macro expansion
    @ ./none:0 [inlined]
 [16] Pullback
    @ ./none:0 [inlined]
 [17] (::Zygote.var"#203#204"{Tuple{Tuple{Nothing}, NTuple{7, Nothing}}, typeof(∂(generated_callfunc))})(Δ::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/lib/lib.jl:203
 [18] #1733#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [19] Pullback
    @ ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:117 [inlined]
 [20] Pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:595 [inlined]
 [21] (::typeof(∂(λ)))(Δ::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [22] Pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:875 [inlined]
 [23] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [24] Pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:1157 [inlined]
 [25] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [26] #553
    @ ~/.julia/packages/Zygote/rv6db/src/lib/array.jl:202 [inlined]
 [27] (::Base.var"#4#5"{Zygote.var"#553#558"})(a::Tuple{Tuple{Float64, typeof(∂(λ))}, Float64})
    @ Base ./generator.jl:36
 [28] iterate
    @ ./generator.jl:47 [inlined]
 [29] collect(itr::Base.Generator{Base.Iterators.Zip{Tuple{Vector{Tuple{Float64, Zygote.Pullback}}, Vector{Float64}}}, Base.var"#4#5"{Zygote.var"#553#558"}})
    @ Base ./array.jl:681
 [30] map
    @ ./abstractarray.jl:2383 [inlined]
 [31] (::Zygote.var"#550#555"{NeuralPDE.var"#349#365"{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, 1, Tuple{Vector{NeuralPDE.var"#297#298"{_A, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}} where _A}}, Vector{Tuple{Float64, Zygote.Pullback}}})(Δ::FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/lib/array.jl:202
 [32] (::Zygote.var"#2576#back#559"{Zygote.var"#550#555"{NeuralPDE.var"#349#365"{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, 1, Tuple{Vector{NeuralPDE.var"#297#298"{_A, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}} where _A}}, Vector{Tuple{Float64, Zygote.Pullback}}}})(Δ::FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}})
    @ Zygote ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67
 [33] Pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:1157 [inlined]
 [34] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [35] Pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:1159 [inlined]
 [36] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [37] Pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:1163 [inlined]
 [38] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [39] #203
    @ ~/.julia/packages/Zygote/rv6db/src/lib/lib.jl:203 [inlined]
 [40] #1733#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [41] Pullback
    @ ~/.julia/packages/SciMLBase/s5sVH/src/problems/basic_problems.jl:107 [inlined]
 [42] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [43] (::Zygote.var"#203#204"{Tuple{Tuple{Nothing, Nothing}, Tuple{}}, typeof(∂(λ))})(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/lib/lib.jl:203
 [44] #1733#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [45] Pullback
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:6 [inlined]
 [46] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [47] (::Zygote.var"#203#204"{Tuple{Tuple{Nothing}, Tuple{}}, typeof(∂(λ))})(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/lib/lib.jl:203
 [48] #1733#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [49] Pullback
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:8 [inlined]
 [50] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [51] (::Zygote.var"#50#51"{typeof(∂(λ))})(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface.jl:41
 [52] gradient(f::Function, args::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface.jl:76
 [53] (::GalacticOptim.var"#254#264"{GalacticOptim.var"#253#263"{OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#271#273"{UnionAll, Flux.var"#60#62"{Chain{Tuple{Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}}}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, SciMLBase.NullParameters}})(::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, ::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:8
 [54] macro expansion
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/solve/flux.jl:41 [inlined]
 [55] macro expansion
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/utils.jl:35 [inlined]
 [56] __solve(prob::OptimizationProblem{true, OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#271#273"{UnionAll, Flux.var"#60#62"{Chain{Tuple{Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}}}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, SciMLBase.NullParameters, Nothing, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, opt::ADAM, data::Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}; maxiters::Int64, cb::Function, progress::Bool, save_best::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/DHxE0/src/solve/flux.jl:39
 [57] #solve#476
    @ ~/.julia/packages/SciMLBase/s5sVH/src/solve.jl:3 [inlined]
 [58] top-level scope
    @ /workspace/notebooks/JuliaRep/diffeqs_PINN_PDE.jl:132
 [59] include(fname::String)
    @ Base.MainInclude ./client.jl:444
 [60] top-level scope
    @ REPL[8]:1
in expression starting at /workspace/notebooks/JuliaRep/diffeqs_PINN_PDE.jl:132

I am really looking forward to this issue being solved :) because training is too slow :(
Thank you.

@ChrisRackauckas (Member)

What code is that for? And the issue was only with BFGS anyway; ADAM has always worked fine.

@udemirezen

udemirezen commented Nov 4, 2021

What code is that for? And the issue was only with BFGS anyways, ADAM has always worked fine.

I am using the code below (from the Julia examples):

using NeuralPDE, Flux, ModelingToolkit, GalacticOptim, Optim, DiffEqFlux
using Plots
using Quadrature,Cubature
import ModelingToolkit: Interval, infimum, supremum
using CUDA
using Random
Random.seed!(100)
CUDA.allowscalar(false)

@parameters t, x
@variables u(..), w(..)
Dxx = Differential(x)^2
Dt = Differential(t)

# Constants
a  = 1
b1 = 4
b2 = 2
c1 = 3
c2 = 1
λ1 = (b1 + c2 + sqrt((b1 + c2)^2 + 4 * (b1 * c2 - b2 * c1))) / 2
λ2 = (b1 + c2 - sqrt((b1 + c2)^2 + 4 * (b1 * c2 - b2 * c1))) / 2

# Analytic solution
θ(t, x) = exp(-t) * cos(x / a)
u_analytic(t, x) = (b1 - λ2) / (b2 * (λ1 - λ2)) * exp(λ1 * t) * θ(t, x) - (b1 - λ1) / (b2 * (λ1 - λ2)) * exp(λ2 * t) * θ(t, x)
w_analytic(t, x) = 1 / (λ1 - λ2) * (exp(λ1 * t) * θ(t, x) - exp(λ2 * t) * θ(t, x))

# Second-order constant-coefficient linear parabolic system
eqs = [Dt(u(x, t)) ~ a * Dxx(u(x, t)) + b1 * u(x, t) + c1 * w(x, t),
       Dt(w(x, t)) ~ a * Dxx(w(x, t)) + b2 * u(x, t) + c2 * w(x, t)]

# Boundary conditions
bcs = [u(0, x) ~ u_analytic(0, x),
       w(0, x) ~ w_analytic(0, x),
       u(t, 0) ~ u_analytic(t, 0),
       w(t, 0) ~ w_analytic(t, 0),
       u(t, 1) ~ u_analytic(t, 1),
       w(t, 1) ~ w_analytic(t, 1)]

# Space and time domains
domains = [x ∈ Interval(0.0, 1.0),
           t ∈ Interval(0.0, 1.0)]

# Neural network
input_ = length(domains)
n = 15
chain = [Chain(Dense(input_, n, Flux.σ), Dense(n, n, Flux.σ), Dense(n, 1)) for _ in 1:2] |>  gpu
initθ = map(c -> Float64.(c), DiffEqFlux.initial_params.(chain)) |>  gpu

_strategy = QuadratureTraining()
discretization = PhysicsInformedNN(chain, _strategy, init_params=initθ)

@named pde_system = PDESystem(eqs, bcs, domains, [t,x], [u(t,x),w(t,x)])
prob = discretize(pde_system, discretization)
sym_prob = symbolic_discretize(pde_system, discretization)

pde_inner_loss_functions = prob.f.f.loss_function.pde_loss_function.pde_loss_functions.contents
bcs_inner_loss_functions = prob.f.f.loss_function.bcs_loss_function.bc_loss_functions.contents

cb = function (p, l)
    println("loss: ", l)
    println("pde_losses: ", map(l_ -> l_(p), pde_inner_loss_functions))
    println("bcs_losses: ", map(l_ -> l_(p), bcs_inner_loss_functions))
    return false
end

println("ADAM Training...")
flush(stdout)
res = GalacticOptim.solve(prob, ADAM(0.001); cb = cb, maxiters=3000)
prob = remake(prob,u0=res.minimizer)

println("LBFGS Training...")
flush(stdout)
res = GalacticOptim.solve(prob, LBFGS(); cb = cb, maxiters=3000)
phi = discretization.phi

When I run the code:

julia> include("ex.jl")
ADAM Training...
ERROR: LoadError: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
  [1] (::Cubature.var"#17#18"{Bool, Bool, Int64, Float64, Float64, Int64, Int32, Ptr{Nothing}, Cubature.IntegrandData{Quadrature.var"#83#95"{QuadratureProblem{false, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, NeuralPDE.var"#_loss#329"{NeuralPDE.var"#175#176"{Nothing, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, NeuralPDE.var"#276#277", NeuralPDE.var"#278#286"{QuadratureTraining, NeuralPDE.var"#278#279#287"{NeuralPDE.var"#276#277"}, Dict{Symbol, Int64}, Dict{Symbol, Int64}, Vector{Symbol}, Vector{Symbol}}, RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:cord, Symbol("##θ#257"), :phi, :derivative, :integral, :u, :p), NeuralPDE.var"#_RGF_ModTag", NeuralPDE.var"#_RGF_ModTag", (0xe62903fa, 0x7efa751b, 0xaae82815, 0x08c988b7, 0x4f82a206)}, NeuralPDE.var"#274#275"}, UnionAll}, Vector{Float64}, Vector{Float64}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}, Vector{Float64}, Vector{Float64}, Vector{Float64}, Vector{Float64}, Int64})()
    @ Cubature ~/.julia/packages/Cubature/5zwuu/src/Cubature.jl:215
  [2] disable_sigint(f::Cubature.var"#17#18"{Bool, Bool, Int64, Float64, Float64, Int64, Int32, Ptr{Nothing}, Cubature.IntegrandData{Quadrature.var"#83#95"{QuadratureProblem{false, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, NeuralPDE.var"#_loss#329"{NeuralPDE.var"#175#176"{Nothing, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, NeuralPDE.var"#276#277", NeuralPDE.var"#278#286"{QuadratureTraining, NeuralPDE.var"#278#279#287"{NeuralPDE.var"#276#277"}, Dict{Symbol, Int64}, Dict{Symbol, Int64}, Vector{Symbol}, Vector{Symbol}}, RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:cord, Symbol("##θ#257"), :phi, :derivative, :integral, :u, :p), NeuralPDE.var"#_RGF_ModTag", NeuralPDE.var"#_RGF_ModTag", (0xe62903fa, 0x7efa751b, 0xaae82815, 0x08c988b7, 0x4f82a206)}, NeuralPDE.var"#274#275"}, UnionAll}, Vector{Float64}, Vector{Float64}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}, Vector{Float64}, Vector{Float64}, Vector{Float64}, Vector{Float64}, Int64})
    @ Base ./c.jl:458
  [3] cubature(xscalar::Bool, fscalar::Bool, vectorized::Bool, padaptive::Bool, fdim::Int64, f::Quadrature.var"#83#95"{QuadratureProblem{false, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, NeuralPDE.var"#_loss#329"{NeuralPDE.var"#175#176"{Nothing, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, NeuralPDE.var"#276#277", NeuralPDE.var"#278#286"{QuadratureTraining, NeuralPDE.var"#278#279#287"{NeuralPDE.var"#276#277"}, Dict{Symbol, Int64}, Dict{Symbol, Int64}, Vector{Symbol}, Vector{Symbol}}, RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:cord, Symbol("##θ#257"), :phi, :derivative, :integral, :u, :p), NeuralPDE.var"#_RGF_ModTag", NeuralPDE.var"#_RGF_ModTag", (0xe62903fa, 0x7efa751b, 0xaae82815, 0x08c988b7, 0x4f82a206)}, NeuralPDE.var"#274#275"}, UnionAll}, Vector{Float64}, Vector{Float64}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, xmin_::Vector{Float64}, xmax_::Vector{Float64}, reqRelError::Float64, reqAbsError::Float64, maxEval::Int64, error_norm::Int32)
    @ Cubature ~/.julia/packages/Cubature/5zwuu/src/Cubature.jl:169
  [4] hcubature_v(f::Function, xmin::Vector{Float64}, xmax::Vector{Float64}; reltol::Float64, abstol::Float64, maxevals::Int64)
    @ Cubature ~/.julia/packages/Cubature/5zwuu/src/Cubature.jl:230
  [5] __solvebp_call(::QuadratureProblem{false, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, NeuralPDE.var"#_loss#329"{NeuralPDE.var"#175#176"{Nothing, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, NeuralPDE.var"#276#277", NeuralPDE.var"#278#286"{QuadratureTraining, NeuralPDE.var"#278#279#287"{NeuralPDE.var"#276#277"}, Dict{Symbol, Int64}, Dict{Symbol, Int64}, Vector{Symbol}, Vector{Symbol}}, RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:cord, Symbol("##θ#257"), :phi, :derivative, :integral, :u, :p), NeuralPDE.var"#_RGF_ModTag", NeuralPDE.var"#_RGF_ModTag", (0xe62903fa, 0x7efa751b, 0xaae82815, 0x08c988b7, 0x4f82a206)}, NeuralPDE.var"#274#275"}, UnionAll}, Vector{Float64}, Vector{Float64}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, ::CubatureJLh, ::Quadrature.ReCallVJP{Quadrature.ZygoteVJP}, ::Vector{Float64}, ::Vector{Float64}, ::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}; reltol::Float64, abstol::Float64, maxiters::Int64, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Quadrature ~/.julia/packages/Quadrature/UCzIP/src/Quadrature.jl:308
  [6] #rrule#43
    @ ~/.julia/packages/Quadrature/UCzIP/src/Quadrature.jl:494 [inlined]
  [7] chain_rrule_kw
    @ ~/.julia/packages/Zygote/rv6db/src/compiler/chainrules.jl:203 [inlined]
  [8] macro expansion
    @ ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0 [inlined]
  [9] _pullback(::Zygote.Context, ::Quadrature.var"#__solvebp##kw", ::NamedTuple{(:reltol, :abstol, :maxiters), Tuple{Float64, Float64, Int64}}, ::typeof(Quadrature.__solvebp), ::QuadratureProblem{false, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, NeuralPDE.var"#_loss#329"{NeuralPDE.var"#175#176"{Nothing, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, NeuralPDE.var"#276#277", NeuralPDE.var"#278#286"{QuadratureTraining, NeuralPDE.var"#278#279#287"{NeuralPDE.var"#276#277"}, Dict{Symbol, Int64}, Dict{Symbol, Int64}, Vector{Symbol}, Vector{Symbol}}, RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:cord, Symbol("##θ#257"), :phi, :derivative, :integral, :u, :p), NeuralPDE.var"#_RGF_ModTag", NeuralPDE.var"#_RGF_ModTag", (0xe62903fa, 0x7efa751b, 0xaae82815, 0x08c988b7, 0x4f82a206)}, NeuralPDE.var"#274#275"}, UnionAll}, Vector{Float64}, Vector{Float64}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, ::CubatureJLh, ::Quadrature.ReCallVJP{Quadrature.ZygoteVJP}, ::Vector{Float64}, ::Vector{Float64}, ::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:9
 [10] _apply(::Function, ::Vararg{Any, N} where N)
    @ Core ./boot.jl:804
 [11] adjoint
    @ ~/.julia/packages/Zygote/rv6db/src/lib/lib.jl:200 [inlined]
 [12] _pullback
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:65 [inlined]
 [13] _pullback
    @ ~/.julia/packages/Quadrature/UCzIP/src/Quadrature.jl:154 [inlined]
 [14] _pullback(::Zygote.Context, ::Quadrature.var"##solve#12", ::Quadrature.ReCallVJP{Quadrature.ZygoteVJP}, ::Base.Iterators.Pairs{Symbol, Real, Tuple{Symbol, Symbol, Symbol}, NamedTuple{(:reltol, :abstol, :maxiters), Tuple{Float64, Float64, Int64}}}, ::typeof(solve), ::QuadratureProblem{false, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, NeuralPDE.var"#_loss#329"{NeuralPDE.var"#175#176"{Nothing, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, NeuralPDE.var"#276#277", NeuralPDE.var"#278#286"{QuadratureTraining, NeuralPDE.var"#278#279#287"{NeuralPDE.var"#276#277"}, Dict{Symbol, Int64}, Dict{Symbol, Int64}, Vector{Symbol}, Vector{Symbol}}, RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:cord, Symbol("##θ#257"), :phi, :derivative, :integral, :u, :p), NeuralPDE.var"#_RGF_ModTag", NeuralPDE.var"#_RGF_ModTag", (0xe62903fa, 0x7efa751b, 0xaae82815, 0x08c988b7, 0x4f82a206)}, NeuralPDE.var"#274#275"}, UnionAll}, Vector{Float64}, Vector{Float64}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, ::CubatureJLh)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [15] _apply(::Function, ::Vararg{Any, N} where N)
    @ Core ./boot.jl:804
 [16] adjoint
    @ ~/.julia/packages/Zygote/rv6db/src/lib/lib.jl:200 [inlined]
 [17] _pullback
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:65 [inlined]
 [18] _pullback
    @ ~/.julia/packages/Quadrature/UCzIP/src/Quadrature.jl:153 [inlined]
 [19] _pullback(::Zygote.Context, ::CommonSolve.var"#solve##kw", ::NamedTuple{(:reltol, :abstol, :maxiters), Tuple{Float64, Float64, Int64}}, ::typeof(solve), ::QuadratureProblem{false, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, NeuralPDE.var"#_loss#329"{NeuralPDE.var"#175#176"{Nothing, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, NeuralPDE.var"#276#277", NeuralPDE.var"#278#286"{QuadratureTraining, NeuralPDE.var"#278#279#287"{NeuralPDE.var"#276#277"}, Dict{Symbol, Int64}, Dict{Symbol, Int64}, Vector{Symbol}, Vector{Symbol}}, RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:cord, Symbol("##θ#257"), :phi, :derivative, :integral, :u, :p), NeuralPDE.var"#_RGF_ModTag", NeuralPDE.var"#_RGF_ModTag", (0xe62903fa, 0x7efa751b, 0xaae82815, 0x08c988b7, 0x4f82a206)}, NeuralPDE.var"#274#275"}, UnionAll}, Vector{Float64}, Vector{Float64}, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, ::CubatureJLh)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [20] _pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:972 [inlined]
 [21] _pullback(::Zygote.Context, ::NeuralPDE.var"#325#328"{UnionAll, QuadratureTraining}, ::Vector{Float64}, ::Vector{Float64}, ::NeuralPDE.var"#175#176"{Nothing, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, NeuralPDE.var"#276#277", NeuralPDE.var"#278#286"{QuadratureTraining, NeuralPDE.var"#278#279#287"{NeuralPDE.var"#276#277"}, Dict{Symbol, Int64}, Dict{Symbol, Int64}, Vector{Symbol}, Vector{Symbol}}, RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:cord, Symbol("##θ#257"), :phi, :derivative, :integral, :u, :p), NeuralPDE.var"#_RGF_ModTag", NeuralPDE.var"#_RGF_ModTag", (0xe62903fa, 0x7efa751b, 0xaae82815, 0x08c988b7, 0x4f82a206)}, NeuralPDE.var"#274#275"}, ::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [22] _pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:978 [inlined]
 [23] _pullback(ctx::Zygote.Context, f::NeuralPDE.var"#326#330"{NeuralPDE.var"#175#176"{Nothing, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, NeuralPDE.var"#276#277", NeuralPDE.var"#278#286"{QuadratureTraining, NeuralPDE.var"#278#279#287"{NeuralPDE.var"#276#277"}, Dict{Symbol, Int64}, Dict{Symbol, Int64}, Vector{Symbol}, Vector{Symbol}}, RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:cord, Symbol("##θ#257"), :phi, :derivative, :integral, :u, :p), NeuralPDE.var"#_RGF_ModTag", NeuralPDE.var"#_RGF_ModTag", (0xe62903fa, 0x7efa751b, 0xaae82815, 0x08c988b7, 0x4f82a206)}, NeuralPDE.var"#274#275"}, Vector{Float64}, Vector{Float64}, NeuralPDE.var"#325#328"{UnionAll, QuadratureTraining}, Float32}, args::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [24] _pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:1157 [inlined]
 [25] _pullback(ctx::Zygote.Context, f::NeuralPDE.var"#349#365"{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, args::NeuralPDE.var"#326#330"{NeuralPDE.var"#175#176"{Nothing, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, NeuralPDE.var"#276#277", NeuralPDE.var"#278#286"{QuadratureTraining, NeuralPDE.var"#278#279#287"{NeuralPDE.var"#276#277"}, Dict{Symbol, Int64}, Dict{Symbol, Int64}, Vector{Symbol}, Vector{Symbol}}, RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:cord, Symbol("##θ#257"), :phi, :derivative, :integral, :u, :p), NeuralPDE.var"#_RGF_ModTag", NeuralPDE.var"#_RGF_ModTag", (0xe62903fa, 0x7efa751b, 0xaae82815, 0x08c988b7, 0x4f82a206)}, NeuralPDE.var"#274#275"}, Vector{Float64}, Vector{Float64}, NeuralPDE.var"#325#328"{UnionAll, QuadratureTraining}, Float32})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [26] (::Zygote.var"#549#554"{Zygote.Context, NeuralPDE.var"#349#365"{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}})(args::Function)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/lib/array.jl:190
 [27] iterate
    @ ./generator.jl:47 [inlined]
 [28] _collect(c::Vector{NeuralPDE.var"#326#330"{loss_function, Vector{Float64}, Vector{Float64}, NeuralPDE.var"#325#328"{UnionAll, QuadratureTraining}, Float32} where loss_function}, itr::Base.Generator{Vector{NeuralPDE.var"#326#330"{loss_function, Vector{Float64}, Vector{Float64}, NeuralPDE.var"#325#328"{UnionAll, QuadratureTraining}, Float32} where loss_function}, Zygote.var"#549#554"{Zygote.Context, NeuralPDE.var"#349#365"{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}, #unused#::Base.EltypeUnknown, isz::Base.HasShape{1})
    @ Base ./array.jl:695
 [29] collect_similar
    @ ./array.jl:606 [inlined]
 [30] map
    @ ./abstractarray.jl:2294 [inlined]
 [31] ∇map(cx::Zygote.Context, f::NeuralPDE.var"#349#365"{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, args::Vector{NeuralPDE.var"#326#330"{loss_function, Vector{Float64}, Vector{Float64}, NeuralPDE.var"#325#328"{UnionAll, QuadratureTraining}, Float32} where loss_function})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/lib/array.jl:190
 [32] adjoint
    @ ~/.julia/packages/Zygote/rv6db/src/lib/array.jl:211 [inlined]
 [33] _pullback(__context__::Zygote.Context, 522::typeof(map), f::Function, args::Vector{NeuralPDE.var"#326#330"{loss_function, Vector{Float64}, Vector{Float64}, NeuralPDE.var"#325#328"{UnionAll, QuadratureTraining}, Float32} where loss_function})
    @ Zygote ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:65
 [34] _pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:1157 [inlined]
 [35] _pullback(ctx::Zygote.Context, f::NeuralPDE.var"#348#364", args::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [36] _pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:1159 [inlined]
 [37] _pullback(ctx::Zygote.Context, f::NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, args::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [38] _pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:1163 [inlined]
 [39] _pullback(::Zygote.Context, ::NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, Nothing, Bool, Nothing}, ::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, ::SciMLBase.NullParameters)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [40] _apply
    @ ./boot.jl:804 [inlined]
 [41] adjoint
    @ ~/.julia/packages/Zygote/rv6db/src/lib/lib.jl:200 [inlined]
 [42] _pullback
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:65 [inlined]
 [43] _pullback
    @ ~/.julia/packages/SciMLBase/x3z0g/src/problems/basic_problems.jl:107 [inlined]
 [44] _pullback(::Zygote.Context, ::OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, ::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, ::SciMLBase.NullParameters)
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [45] _apply(::Function, ::Vararg{Any, N} where N)
    @ Core ./boot.jl:804
 [46] adjoint
    @ ~/.julia/packages/Zygote/rv6db/src/lib/lib.jl:200 [inlined]
 [47] _pullback
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:65 [inlined]
 [48] _pullback
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:6 [inlined]
 [49] _pullback(ctx::Zygote.Context, f::GalacticOptim.var"#253#263"{OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, SciMLBase.NullParameters}, args::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [50] _apply(::Function, ::Vararg{Any, N} where N)
    @ Core ./boot.jl:804
 [51] adjoint
    @ ~/.julia/packages/Zygote/rv6db/src/lib/lib.jl:200 [inlined]
 [52] _pullback
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:65 [inlined]
 [53] _pullback
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:8 [inlined]
 [54] _pullback(ctx::Zygote.Context, f::GalacticOptim.var"#256#266"{Tuple{}, GalacticOptim.var"#253#263"{OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, SciMLBase.NullParameters}}, args::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface2.jl:0
 [55] _pullback(f::Function, args::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface.jl:34
 [56] pullback(f::Function, args::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface.jl:40
 [57] gradient(f::Function, args::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/rv6db/src/compiler/interface.jl:75
 [58] (::GalacticOptim.var"#254#264"{GalacticOptim.var"#253#263"{OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, SciMLBase.NullParameters}})(::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, ::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:8
 [59] macro expansion
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/solve/flux.jl:41 [inlined]
 [60] macro expansion
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/utils.jl:35 [inlined]
 [61] __solve(prob::OptimizationProblem{true, OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#270#272"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, SciMLBase.NullParameters, Nothing, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, opt::ADAM, data::Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}; maxiters::Int64, cb::Function, progress::Bool, save_best::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/DHxE0/src/solve/flux.jl:39
 [62] #solve#476
    @ ~/.julia/packages/SciMLBase/x3z0g/src/solve.jl:3 [inlined]
 [63] top-level scope
    @ /workspace/notebooks/JuliaRep2/ex.jl:70
 [64] include(fname::String)
    @ Base.MainInclude ./client.jl:444
 [65] top-level scope
    @ REPL[4]:1
in expression starting at /workspace/notebooks/JuliaRep2/ex.jl:70

Might the problem be related to GalacticOptim? ("res = GalacticOptim.solve(prob, ADAM(0.001); cb = cb, maxiters=3000)")

Edit: If I use the LBFGS optimizer alone, without ADAM, the result is the same error as given above.

@udemirezen
Copy link

@ChrisRackauckas hi, any improvement or comment? 👆👆👆
Thank you

@ChrisRackauckas
Copy link
Member

That's a completely unrelated issue, #267. Did you try ADAM with QuasiRandomStrategy?

@udemirezen
Copy link

udemirezen commented Nov 8, 2021

QuasiRandomStrategy

@ChrisRackauckas yes, I have tried that:
_strategy = NeuralPDE.QuasiRandomTraining(500; sampling_alg = SobolSample(), resampling = false, minibatch = 30)
The result is the same :( I get exactly the same error message. Even the GridTraining strategy does not work, for either ADAM or LBFGS, when I set CUDA.allowscalar(false).

I want to emphasize that if I use the LBFGS optimizer alone, without ADAM, the result is the same error given above.
I cannot train anything with CUDA.allowscalar(false), with either ADAM or LBFGS.

@killah-t-cell killah-t-cell reopened this Nov 8, 2021
@ChrisRackauckas
Copy link
Member

What piece of code is using scalar indexing when not using quadrature?

@udemirezen
Copy link

@ChrisRackauckas
Hi, I pasted the code earlier. Here is the same code again:

using NeuralPDE, Flux, ModelingToolkit, GalacticOptim, Optim, DiffEqFlux
using Plots
using Quadrature,Cubature
import ModelingToolkit: Interval, infimum, supremum
using CUDA
using Random
Random.seed!(100)
CUDA.allowscalar(false)

@parameters t, x
@variables u(..), w(..)
Dxx = Differential(x)^2
Dt = Differential(t)

# Constants
a  = 1
b1 = 4
b2 = 2
c1 = 3
c2 = 1
λ1 = (b1 + c2 + sqrt((b1 + c2)^2 + 4 * (b1 * c2 - b2 * c1))) / 2
λ2 = (b1 + c2 - sqrt((b1 + c2)^2 + 4 * (b1 * c2 - b2 * c1))) / 2

# Analytic solution
θ(t, x) = exp(-t) * cos(x / a)
u_analytic(t, x) = (b1 - λ2) / (b2 * (λ1 - λ2)) * exp(λ1 * t) * θ(t, x) - (b1 - λ1) / (b2 * (λ1 - λ2)) * exp(λ2 * t) * θ(t, x)
w_analytic(t, x) = 1 / (λ1 - λ2) * (exp(λ1 * t) * θ(t, x) - exp(λ2 * t) * θ(t, x))

# Second-order constant-coefficient linear parabolic system
eqs = [Dt(u(x, t)) ~ a * Dxx(u(x, t)) + b1 * u(x, t) + c1 * w(x, t),
       Dt(w(x, t)) ~ a * Dxx(w(x, t)) + b2 * u(x, t) + c2 * w(x, t)]

# Boundary conditions
bcs = [u(0, x) ~ u_analytic(0, x),
       w(0, x) ~ w_analytic(0, x),
       u(t, 0) ~ u_analytic(t, 0),
       w(t, 0) ~ w_analytic(t, 0),
       u(t, 1) ~ u_analytic(t, 1),
       w(t, 1) ~ w_analytic(t, 1)]

# Space and time domains
domains = [x ∈ Interval(0.0, 1.0),
           t ∈ Interval(0.0, 1.0)]

# Neural network
input_ = length(domains)
n = 15
chain = [Chain(Dense(input_, n, Flux.σ), Dense(n, n, Flux.σ), Dense(n, 1)) for _ in 1:2] |>  gpu
initθ = map(c -> Float64.(c), DiffEqFlux.initial_params.(chain)) |>  gpu

_strategy = QuadratureTraining()
discretization = PhysicsInformedNN(chain, _strategy, init_params=initθ)

@named pde_system = PDESystem(eqs, bcs, domains, [t,x], [u(t,x),w(t,x)])
prob = discretize(pde_system, discretization)
sym_prob = symbolic_discretize(pde_system, discretization)

pde_inner_loss_functions = prob.f.f.loss_function.pde_loss_function.pde_loss_functions.contents
bcs_inner_loss_functions = prob.f.f.loss_function.bcs_loss_function.bc_loss_functions.contents

cb = function (p, l)
    println("loss: ", l)
    println("pde_losses: ", map(l_ -> l_(p), pde_inner_loss_functions))
    println("bcs_losses: ", map(l_ -> l_(p), bcs_inner_loss_functions))
    return false
end

println("ADAM Training...")
flush(stdout)
res = GalacticOptim.solve(prob, ADAM(0.001); cb = cb, maxiters=3000)
prob = remake(prob,u0=res.minimizer)

println("LBFGS Training...")
flush(stdout)
res = GalacticOptim.solve(prob, LBFGS(); cb = cb, maxiters=3000)
phi = discretization.phi

This is the code I have been using; it is a test case from your git repository.
Thank you

@ChrisRackauckas
Copy link
Member

@KirillZubov could you take a look?

@KirillZubov
Copy link
Member

@udemirezen
|> gpu converts the parameters back to Float32:

initθ = map(c -> Float64.(c), DiffEqFlux.initial_params.(chain)) |>  gpu
typeof(initθ)
Vector{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}} (alias for Array{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, 1})

It is required for the GPU in NeuralPDE that all calculations use one floating-point format: either Float32, as in the test

eq = Dθ(u(θ)) ~ θ^3 + 2.0f0*θ + (θ^2)*((1.0f0 + 3*(θ^2))/(1.0f0 + θ + (θ^3))) - u(θ)*(θ + ((1.0f0 + 3.0f0*(θ^2))/(1.0f0 + θ + θ^3)))

or Float64, in which case the initial parameters must also be Float64:

initθ = map(c -> CuArray(Float64.(c)), DiffEqFlux.initial_params.(chain))
Vector{CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}} (alias for Array{CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, 1})
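
The precision rule above can be sanity-checked with a small CPU-only sketch (illustrative variable names, not part of the NeuralPDE API):

```julia
# Stand-in for DiffEqFlux.initial_params.(chain): Flux initializes in Float32.
raw = [rand(Float32, 8), rand(Float32, 4)]

# Promote everything to Float64 once, up front.
initθ64 = map(c -> Float64.(c), raw)
@assert all(θ -> eltype(θ) === Float64, initθ64)

# Mixed precision silently promotes on the CPU, but on the GPU the generic
# mixed-eltype fallbacks iterate element-by-element, which CuArray forbids
# under CUDA.allowscalar(false).
@assert eltype(Float32.(rand(3)) .* rand(3)) === Float64
```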

@KirillZubov
Copy link
Member

It works with all strategies except QuadratureTraining:

using NeuralPDE, Flux, ModelingToolkit, GalacticOptim, Optim, DiffEqFlux
using Plots
using Quadrature,Cubature
import ModelingToolkit: Interval, infimum, supremum
using CUDA
using Random
Random.seed!(100)
CUDA.allowscalar(false)

@parameters t, x
@variables u(..), w(..)
Dxx = Differential(x)^2
Dt = Differential(t)

# Constants
a  = 1
b1 = 4
b2 = 2
c1 = 3
c2 = 1
λ1 = (b1 + c2 + sqrt((b1 + c2)^2 + 4 * (b1 * c2 - b2 * c1))) / 2
λ2 = (b1 + c2 - sqrt((b1 + c2)^2 + 4 * (b1 * c2 - b2 * c1))) / 2

# Analytic solution
θ(t, x) = exp(-t) * cos(x / a)
u_analytic(t, x) = (b1 - λ2) / (b2 * (λ1 - λ2)) * exp(λ1 * t) * θ(t, x) - (b1 - λ1) / (b2 * (λ1 - λ2)) * exp(λ2 * t) * θ(t, x)
w_analytic(t, x) = 1 / (λ1 - λ2) * (exp(λ1 * t) * θ(t, x) - exp(λ2 * t) * θ(t, x))

# Second-order constant-coefficient linear parabolic system
eqs = [Dt(u(x, t)) ~ a * Dxx(u(x, t)) + b1 * u(x, t) + c1 * w(x, t),
       Dt(w(x, t)) ~ a * Dxx(w(x, t)) + b2 * u(x, t) + c2 * w(x, t)]

# Boundary conditions
bcs = [u(0, x) ~ u_analytic(0, x),
       w(0, x) ~ w_analytic(0, x),
       u(t, 0) ~ u_analytic(t, 0),
       w(t, 0) ~ w_analytic(t, 0),
       u(t, 1) ~ u_analytic(t, 1),
       w(t, 1) ~ w_analytic(t, 1)]

# Space and time domains
domains = [x ∈ Interval(0.0, 1.0),
           t ∈ Interval(0.0, 1.0)]

# Neural network
input_ = length(domains)
n = 15
chain = [Chain(Dense(input_, n, Flux.σ), Dense(n, n, Flux.σ), Dense(n, 1)) for _ in 1:2] |>  gpu
initθ = map(c -> CuArray(Float64.(c)), DiffEqFlux.initial_params.(chain))

_strategy = GridTraining(0.1)
_strategy = QuasiRandomTraining(200)
_strategy = StochasticTraining(200)
#_strategy = QuadratureTraining() not support
discretization = PhysicsInformedNN(chain, _strategy, init_params=initθ)

@named pde_system = PDESystem(eqs, bcs, domains, [t,x], [u(t,x),w(t,x)])
prob = discretize(pde_system, discretization)
sym_prob = symbolic_discretize(pde_system, discretization)

pde_inner_loss_functions = prob.f.f.loss_function.pde_loss_function.pde_loss_functions.contents
bcs_inner_loss_functions = prob.f.f.loss_function.bcs_loss_function.bc_loss_functions.contents

cb = function (p, l)
    println("loss: ", l)
    println("pde_losses: ", map(l_ -> l_(p), pde_inner_loss_functions))
    println("bcs_losses: ", map(l_ -> l_(p), bcs_inner_loss_functions))
    return false
end


println("ADAM Training...")
#flush(stdout)
res = GalacticOptim.solve(prob, ADAM(0.01); cb = cb, maxiters=10)

prob = remake(prob,u0=res.minimizer)

println("LBFGS Training...")
flush(stdout)
res = GalacticOptim.solve(prob, LBFGS(); cb = cb, maxiters=10)

prob = remake(prob,u0=res.minimizer)

println("BFGS Training...")
flush(stdout)
res = GalacticOptim.solve(prob, BFGS(); cb = cb, maxiters=10)
phi = discretization.phi

@KirillZubov
Copy link
Member

but FastChain fails on the GPU:

chain = [FastChain(FastDense(input_, n, Flux.σ), FastDense(n, n, Flux.σ), FastDense(n, 1)) for _ in 1:2] 
initθ = map(c -> CuArray(Float64.(c)), DiffEqFlux.initial_params.(chain))
type Nothing has no field buffer

Stacktrace:
  [1] getproperty(x::Nothing, f::Symbol)
    @ Base ./Base.jl:33
  [2] unsafe_convert
    @ ~/.julia/packages/CUDA/YpW0k/src/array.jl:320 [inlined]
  [3] pointer
    @ ~/.julia/packages/CUDA/YpW0k/src/array.jl:275 [inlined]
  [4] mightalias(A::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, B::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer})
    @ CUDA ~/.julia/packages/CUDA/YpW0k/src/array.jl:113
  [5] unalias(dest::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, A::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer})
    @ Base ./abstractarray.jl:1349
  [6] broadcast_unalias
    @ ./broadcast.jl:957 [inlined]
  [7] preprocess
    @ ./broadcast.jl:964 [inlined]
  [8] preprocess_args
    @ ./broadcast.jl:967 [inlined]
  [9] preprocess_args
    @ ./broadcast.jl:966 [inlined]
 [10] preprocess
    @ ./broadcast.jl:963 [inlined]
 [11] copyto!
    @ ~/.julia/packages/GPUArrays/3sW6s/src/host/broadcast.jl:53 [inlined]
 [12] copyto!
    @ ./broadcast.jl:936 [inlined]
 [13] copy
    @ ~/.julia/packages/GPUArrays/3sW6s/src/host/broadcast.jl:47 [inlined]
 [14] materialize
    @ ./broadcast.jl:883 [inlined]
 [15] broadcast_preserving_zero_d
    @ ./broadcast.jl:872 [inlined]
 [16] *(A::Float64, B::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer})
    @ Base ./arraymath.jl:52
 [17] #1295
    @ ~/.julia/packages/ChainRules/Tj6lu/src/rulesets/Base/arraymath.jl:123 [inlined]
 [18] unthunk
    @ ~/.julia/packages/ChainRulesCore/7ZiwT/src/tangent_types/thunks.jl:194 [inlined]
 [19] unthunk
    @ ~/.julia/packages/ChainRulesCore/7ZiwT/src/tangent_types/thunks.jl:217 [inlined]
 [20] wrap_chainrules_output
    @ ~/.julia/packages/Zygote/AlLTp/src/compiler/chainrules.jl:104 [inlined]
 [21] map
    @ ./tuple.jl:215 [inlined]
 [22] wrap_chainrules_output
    @ ~/.julia/packages/Zygote/AlLTp/src/compiler/chainrules.jl:105 [inlined]
 [23] ZBack
    @ ~/.julia/packages/Zygote/AlLTp/src/compiler/chainrules.jl:204 [inlined]
 [24] (::Zygote.var"#3802#back#1032"{Zygote.ZBack{ChainRules.var"#times_pullback#1297"{CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, Float64, ChainRulesCore.ProjectTo{AbstractArray, NamedTuple{(:element, :axes), Tuple{ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}}}, ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}}}}})(Δ::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67
 [25] Pullback
    @ ~/.julia/packages/NeuralPDE/6acEl/src/pinns_pde_solve.jl:857 [inlined]
 [26] (::typeof((λ)))(Δ::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [27] macro expansion
    @ ~/.julia/packages/NeuralPDE/6acEl/src/pinns_pde_solve.jl:583 [inlined]
 [28] macro expansion
    @ ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:129 [inlined]
 [29] macro expansion
    @ ./none:0 [inlined]
 [30] Pullback
    @ ./none:0 [inlined]
 [31] (::Zygote.var"#208#209"{Tuple{Tuple{Nothing}, NTuple{7, Nothing}}, typeof((generated_callfunc))})(Δ::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/lib/lib.jl:203
 [32] #1734#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [33] Pullback
    @ ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:117 [inlined]
 [34] Pullback
    @ ~/.julia/packages/NeuralPDE/6acEl/src/pinns_pde_solve.jl:622 [inlined]
 [35] (::typeof((λ)))(Δ::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [36] Pullback
    @ ~/.julia/packages/NeuralPDE/6acEl/src/pinns_pde_solve.jl:909 [inlined]
 [37] (::typeof((λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [38] Pullback
    @ ~/.julia/packages/NeuralPDE/6acEl/src/pinns_pde_solve.jl:1191 [inlined]
 [39] (::typeof((λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [40] #559
    @ ~/.julia/packages/Zygote/AlLTp/src/lib/array.jl:211 [inlined]
 [41] (::Base.var"#4#5"{Zygote.var"#559#564"})(a::Tuple{Tuple{Float64, typeof(∂(λ))}, Float64})
    @ Base ./generator.jl:36
 [42] iterate
    @ ./generator.jl:47 [inlined]
 [43] collect(itr::Base.Generator{Base.Iterators.Zip{Tuple{Vector{Tuple{Float64, Zygote.Pullback}}, Vector{Float64}}}, Base.var"#4#5"{Zygote.var"#559#564"}})
    @ Base ./array.jl:678
 [44] map
    @ ./abstractarray.jl:2383 [inlined]
 [45] (::Zygote.var"#map_back#561"{NeuralPDE.var"#351#367"{CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}}, 1, Tuple{Vector{NeuralPDE.var"#299#300"{_A, CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}} where _A}}, Tuple{Tuple{Base.OneTo{Int64}}}, Vector{Tuple{Float64, Zygote.Pullback}}})(Δ::FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}})
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/lib/array.jl:211
 [46] (::Zygote.var"#2577#back#565"{Zygote.var"#map_back#561"{NeuralPDE.var"#351#367"{CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}}, 1, Tuple{Vector{NeuralPDE.var"#299#300"{_A, CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}} where _A}}, Tuple{Tuple{Base.OneTo{Int64}}}, Vector{Tuple{Float64, Zygote.Pullback}}}})(Δ::FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}})
    @ Zygote ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67
 [47] Pullback
    @ ~/.julia/packages/NeuralPDE/6acEl/src/pinns_pde_solve.jl:1191 [inlined]
 [48] (::typeof((λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [49] Pullback
    @ ~/.julia/packages/NeuralPDE/6acEl/src/pinns_pde_solve.jl:1193 [inlined]
 [50] (::typeof((λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [51] Pullback
    @ ~/.julia/packages/NeuralPDE/6acEl/src/pinns_pde_solve.jl:1197 [inlined]
 [52] (::typeof((λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [53] #208
    @ ~/.julia/packages/Zygote/AlLTp/src/lib/lib.jl:203 [inlined]
 [54] #1734#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [55] Pullback
    @ ~/.julia/packages/SciMLBase/x3z0g/src/problems/basic_problems.jl:107 [inlined]
 [56] (::typeof((λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [57] #208
    @ ~/.julia/packages/Zygote/AlLTp/src/lib/lib.jl:203 [inlined]
 [58] #1734#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [59] Pullback
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:6 [inlined]
 [60] (::typeof((λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [61] #208
    @ ~/.julia/packages/Zygote/AlLTp/src/lib/lib.jl:203 [inlined]
 [62] #1734#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [63] Pullback
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:8 [inlined]
 [64] (::typeof((λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [65] (::Zygote.var"#55#56"{typeof((λ))})(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface.jl:41
 [66] gradient(f::Function, args::CuArray{Float64, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface.jl:76
 [67] (::GalacticOptim.var"#231#241"{GalacticOptim.var"#230#240"{OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#371"{NeuralPDE.var"#354#370"{NeuralPDE.var"#352#368", NeuralPDE.var"#350#366"}, Vector{NeuralPDE.var"#274#276"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, SciMLBase.NullParameters}})(::CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, ::CuArray{Float64, 1, CUDA.Mem.DeviceBuffer})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:8
 [68] macro expansion
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/solve/flux.jl:41 [inlined]
 [69] macro expansion
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/utils.jl:35 [inlined]
 [70] __solve(prob::OptimizationProblem{true, OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#371"{NeuralPDE.var"#354#370"{NeuralPDE.var"#352#368", NeuralPDE.var"#350#366"}, Vector{NeuralPDE.var"#274#276"{FastChain{Tuple{FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(σ), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}, UnionAll}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, SciMLBase.NullParameters, Nothing, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, opt::ADAM, data::Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}; maxiters::Int64, cb::Function, progress::Bool, save_best::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/DHxE0/src/solve/flux.jl:39
 [71] #solve#476
    @ ~/.julia/packages/SciMLBase/x3z0g/src/solve.jl:3 [inlined]
 [72] top-level scope
    @ In[9]:80
 [73] eval
    @ ./boot.jl:360 [inlined]

@udemirezen
Copy link

@KirillZubov Sorry, but following your answer I checked what you said:

....
CUDA.allowscalar(false)
....

input_ = length(domains)
n = 15
chain = [Chain(Dense(input_, n, Flux.σ), Dense(n, n, Flux.σ), Dense(n, 1)) for _ in 1:2] |> gpu
initθ = map(c -> CuArray(Float32.(c)), DiffEqFlux.initial_params.(chain))

_strategy = NeuralPDE.GridTraining(0.1) #NeuralPDE.StochasticTraining(50)#QuadratureTraining()
discretization = PhysicsInformedNN(chain, _strategy, init_params=initθ)

@named pde_system = PDESystem(eqs, bcs, domains, [t,x], [u(t,x), w(t,x)])

prob = discretize(pde_system, discretization)
sym_prob = symbolic_discretize(pde_system, discretization)

When I inspect the objects, both are Float32, as far as I can tell from typeof:

julia> typeof(chain)
Vector{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}} (alias for Array{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}, 1})

julia> typeof(initθ)
Vector{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}} (alias for Array{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, 1})

But again it failed, even with GridTraining. The result is:

julia> include("ex.jl")
LBFGS Training...
ERROR: LoadError: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] assertscalar(op::String)
    @ GPUArrays ~/.julia/packages/GPUArrays/3sW6s/src/host/indexing.jl:53
  [3] getindex(xs::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, I::Int64)
    @ GPUArrays ~/.julia/packages/GPUArrays/3sW6s/src/host/indexing.jl:86
  [4] first
    @ ./abstractarray.jl:368 [inlined]
  [5] dot(x::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, y::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
    @ LinearAlgebra /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/generic.jl:914
  [6] #1287
    @ ~/.julia/packages/ChainRules/Tj6lu/src/rulesets/Base/arraymath.jl:102 [inlined]
  [7] unthunk
    @ ~/.julia/packages/ChainRulesCore/7ZiwT/src/tangent_types/thunks.jl:194 [inlined]
  [8] wrap_chainrules_output
    @ ~/.julia/packages/Zygote/AlLTp/src/compiler/chainrules.jl:104 [inlined]
  [9] map
    @ ./tuple.jl:215 [inlined]
 [10] wrap_chainrules_output
    @ ~/.julia/packages/Zygote/AlLTp/src/compiler/chainrules.jl:105 [inlined]
 [11] ZBack
    @ ~/.julia/packages/Zygote/AlLTp/src/compiler/chainrules.jl:204 [inlined]
 [12] (::Zygote.var"#3790#back#1030"{Zygote.ZBack{ChainRules.var"#times_pullback#1290"{Float64, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ChainRulesCore.ProjectTo{AbstractArray, NamedTuple{(:element, :axes), Tuple{ChainRulesCore.ProjectTo{Float32, NamedTuple{(), Tuple{}}}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}}}, ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}}}}})(Δ::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67
 [13] macro expansion
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:556 [inlined]
 [14] macro expansion
    @ ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:129 [inlined]
 [15] macro expansion
    @ ./none:0 [inlined]
 [16] Pullback
    @ ./none:0 [inlined]
 [17] (::Zygote.var"#208#209"{Tuple{Tuple{Nothing}, NTuple{7, Nothing}}, typeof(∂(generated_callfunc))})(Δ::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/lib/lib.jl:203
 [18] #1733#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [19] Pullback
    @ ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:117 [inlined]
 [20] Pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:595 [inlined]
 [21] (::typeof(∂(λ)))(Δ::CuArray{Float64, 2, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [22] Pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:875 [inlined]
 [23] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [24] Pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:1158 [inlined]
 [25] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [26] #559
    @ ~/.julia/packages/Zygote/AlLTp/src/lib/array.jl:211 [inlined]
 [27] (::Base.var"#4#5"{Zygote.var"#559#564"})(a::Tuple{Tuple{Float64, typeof(∂(λ))}, Float64})
    @ Base ./generator.jl:36
 [28] iterate
    @ ./generator.jl:47 [inlined]
 [29] collect(itr::Base.Generator{Base.Iterators.Zip{Tuple{Vector{Tuple{Float64, Zygote.Pullback}}, Vector{Float64}}}, Base.var"#4#5"{Zygote.var"#559#564"}})
    @ Base ./array.jl:681
 [30] map
    @ ./abstractarray.jl:2383 [inlined]
 [31] (::Zygote.var"#map_back#561"{NeuralPDE.var"#351#367"{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, 1, Tuple{Vector{NeuralPDE.var"#297#298"{_A, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}} where _A}}, Tuple{Tuple{Base.OneTo{Int64}}}, Vector{Tuple{Float64, Zygote.Pullback}}})(Δ::FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}})
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/lib/array.jl:211
 [32] (::Zygote.var"#2576#back#565"{Zygote.var"#map_back#561"{NeuralPDE.var"#351#367"{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, 1, Tuple{Vector{NeuralPDE.var"#297#298"{_A, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}} where _A}}, Tuple{Tuple{Base.OneTo{Int64}}}, Vector{Tuple{Float64, Zygote.Pullback}}}})(Δ::FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}})
    @ Zygote ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67
 [33] Pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:1158 [inlined]
 [34] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [35] Pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:1159 [inlined]
 [36] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [37] Pullback
    @ ~/.julia/packages/NeuralPDE/nxvU8/src/pinns_pde_solve.jl:1163 [inlined]
 [38] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [39] #208
    @ ~/.julia/packages/Zygote/AlLTp/src/lib/lib.jl:203 [inlined]
 [40] #1733#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [41] Pullback
    @ ~/.julia/packages/SciMLBase/x3z0g/src/problems/basic_problems.jl:107 [inlined]
 [42] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [43] #208
    @ ~/.julia/packages/Zygote/AlLTp/src/lib/lib.jl:203 [inlined]
 [44] #1733#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [45] Pullback
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:6 [inlined]
 [46] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [47] #208
    @ ~/.julia/packages/Zygote/AlLTp/src/lib/lib.jl:203 [inlined]
 [48] #1733#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [49] Pullback
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:8 [inlined]
 [50] (::typeof(∂(λ)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface2.jl:0
 [51] (::Zygote.var"#55#56"{typeof(∂(λ))})(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface.jl:41
 [52] gradient(f::Function, args::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Zygote ~/.julia/packages/Zygote/AlLTp/src/compiler/interface.jl:76
 [53] (::GalacticOptim.var"#254#264"{GalacticOptim.var"#253#263"{OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#271#273"{UnionAll, Flux.var"#60#62"{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}}}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, SciMLBase.NullParameters}})(::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, ::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:8
 [54] (::GalacticOptim.var"#130#138"{OptimizationProblem{true, OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#271#273"{UnionAll, Flux.var"#60#62"{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}}}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, SciMLBase.NullParameters, Nothing, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, GalacticOptim.var"#129#137"{OptimizationProblem{true, OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#271#273"{UnionAll, Flux.var"#60#62"{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}}}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, SciMLBase.NullParameters, Nothing, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, OptimizationFunction{false, GalacticOptim.AutoZygote, OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, 
Vector{NeuralPDE.var"#271#273"{UnionAll, Flux.var"#60#62"{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}}}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, GalacticOptim.var"#254#264"{GalacticOptim.var"#253#263"{OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#271#273"{UnionAll, Flux.var"#60#62"{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}}}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, SciMLBase.NullParameters}}, GalacticOptim.var"#257#267"{GalacticOptim.var"#253#263"{OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#271#273"{UnionAll, Flux.var"#60#62"{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}}}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, SciMLBase.NullParameters}}, GalacticOptim.var"#262#272", Nothing, Nothing, Nothing}}, 
OptimizationFunction{false, GalacticOptim.AutoZygote, OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#271#273"{UnionAll, Flux.var"#60#62"{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}}}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, GalacticOptim.var"#254#264"{GalacticOptim.var"#253#263"{OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#271#273"{UnionAll, Flux.var"#60#62"{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}}}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, SciMLBase.NullParameters}}, GalacticOptim.var"#257#267"{GalacticOptim.var"#253#263"{OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#271#273"{UnionAll, Flux.var"#60#62"{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, 
CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}}}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, SciMLBase.NullParameters}}, GalacticOptim.var"#262#272", Nothing, Nothing, Nothing}})(G::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, θ::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/DHxE0/src/solve/optim.jl:93
 [55] value_gradient!!(obj::TwiceDifferentiable{Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ NLSolversBase ~/.julia/packages/NLSolversBase/cfJrN/src/interface.jl:82
 [56] initial_state(method::LBFGS{Nothing, LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Optim.var"#17#19"}, options::Optim.Options{Float64, GalacticOptim.var"#_cb#136"{var"#15#18", LBFGS{Nothing, LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Optim.var"#17#19"}, Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}}}, d::TwiceDifferentiable{Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, initial_x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
    @ Optim ~/.julia/packages/Optim/rES57/src/multivariate/solvers/first_order/l_bfgs.jl:164
 [57] optimize(d::TwiceDifferentiable{Float32, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, initial_x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, method::LBFGS{Nothing, LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Optim.var"#17#19"}, options::Optim.Options{Float64, GalacticOptim.var"#_cb#136"{var"#15#18", LBFGS{Nothing, LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Optim.var"#17#19"}, Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}}})
    @ Optim ~/.julia/packages/Optim/rES57/src/multivariate/optimize/optimize.jl:35
 [58] ___solve(prob::OptimizationProblem{true, OptimizationFunction{true, GalacticOptim.AutoZygote, NeuralPDE.var"#loss_function_#369"{NeuralPDE.var"#352#368"{NeuralPDE.var"#350#366", NeuralPDE.var"#348#364"}, Vector{NeuralPDE.var"#271#273"{UnionAll, Flux.var"#60#62"{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}}}}, Nothing, Bool, Nothing}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, SciMLBase.NullParameters, Nothing, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, opt::LBFGS{Nothing, LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Optim.var"#17#19"}, data::Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}; cb::Function, maxiters::Int64, maxtime::Nothing, abstol::Nothing, reltol::Nothing, progress::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/DHxE0/src/solve/optim.jl:129
 [59] #__solve#127
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/solve/optim.jl:49 [inlined]
 [60] #solve#476
    @ ~/.julia/packages/SciMLBase/x3z0g/src/solve.jl:3 [inlined]
 [61] top-level scope
    @ /workspace/notebooks/JuliaRep2/ex.jl:77
 [62] include(fname::String)
    @ Base.MainInclude ./client.jl:444
 [63] top-level scope
    @ REPL[6]:1
in expression starting at /workspace/notebooks/JuliaRep2/ex.jl:77

So what is wrong with this? Am I doing something wrong?
Which package versions did you use in your trials?

(@v1.6) pkg> st
      Status `~/.julia/environments/v1.6/Project.toml`
  [fbb218c0] BSON v0.3.4
  [052768ef] CUDA v3.5.0
  [667455a9] Cubature v1.5.1
  [2445eb08] DataDrivenDiffEq v0.6.6
  [aae7a2af] DiffEqFlux v1.44.0
  [9fdde737] DiffEqOperators v4.34.0
  [0c46a032] DifferentialEquations v6.19.0
  [5b8099bc] DomainSets v0.5.9
  [587475ba] Flux v0.12.8
  [a75be94c] GalacticOptim v2.2.0
  [f67ccb44] HDF5 v0.15.7
  [7073ff75] IJulia v1.23.2
  [961ee093] ModelingToolkit v6.7.1
  [315f7962] NeuralPDE v4.0.1
  [429524aa] Optim v1.5.0
  [1dea7af3] OrdinaryDiffEq v5.65.5
  [91a5bcdd] Plots v1.23.3
  [67601950] Quadrature v1.12.0
  [8a4e6c94] QuasiMonteCarlo v0.2.3
  [9a3f8284] Random

How can I get rid of this problem?
Thank you for your support and help.

@KirillZubov
Copy link
Member

@udemirezen if you want to calculate in Float32, the description of the equations and boundary conditions should also be in Float32, e.g. `bcs = [u(0.) ~ 1.0f0]`,
like in the example I mentioned above:

eq = Dθ(u(θ)) ~ θ^3 + 2.f0*θ + (θ^2)*((1.f0+3*(θ^2))/(1.f0+θ+(θ^3))) - u(θ)*(θ + ((1.f0+3.f0*(θ^2))/(1.f0+θ+θ^3)))

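A minimal sketch of what "everything in Float32" means for a two-variable system like the one in this issue (hypothetical equations, chosen only to illustrate the literal types):

```julia
using ModelingToolkit

@parameters t x
@variables u(..) w(..)
Dt = Differential(t)

# Write every numeric literal as Float32 (1f0, 0.5f0, ...) so the generated
# loss functions never promote intermediate arrays to Float64 on the GPU:
eqs = [Dt(u(t, x)) ~ 1f0 * w(t, x)]
bcs = [u(0f0, x) ~ 1f0, w(t, 0f0) ~ 0f0]

# The training strategy step may also need to be Float32,
# e.g. GridTraining(0.1f0) instead of GridTraining(0.1).
```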
@ChrisRackauckas
Copy link
Member

All that's left is #267, so closing as a duplicate of that issue.
