coreneuron_modtests::datareturn_py_gpu test hangs using NMODL, OpenACC and OpenMP host threading #810

olupton · 2022-05-05T15:40:02Z

Describe the issue
In GPU-enabled builds that combine NMODL, GPU offload using OpenACC, and OpenMP host-side threading, the coreneuron_modtests::datareturn_py_gpu test hangs.

The hang occurs at

    void nrn_init_pas(NrnThread* nt, Memb_list* ml, int type) {
        nrn_pragma_acc(data present(nt, ml, pas_global) if(nt->compute_gpu))

apparently during initialisation of the second CoreNEURON run that has 2 threads:

(gdb) bt 10
#0  0x00007fffebc9241c in __pgi_uacc_event_synchronize (present_entry=0xa4b1e70, devid=1, async=-1, hostptr=0x5961ce8) at ../../src/event_mgmt.c:192
#1  0x00007fffebc8ef51 in __pgi_uacc_dataonb (filename=0x7bc5c0 <.F0003.22059> "/gpfs/bbp.cscs.ch/home/olupton/nrn2/build/test/nrnivmodl/0f75a099f800c237e43e24070e16322d263224e3a0f3b81e12c9c6aacb9933b1/x86_64/corenrn/mod2c/passive.cpp",
    funcname=0x7bc660 <.F0004.22061> "_ZN10coreneuron12nrn_init_pasEPNS_9NrnThreadEPNS_9Memb_listEi", pdevptr=0x7fffffff4a30, hostptr=0x5961ce8, hostptrptr=0x7fffffff4b18, poffset=0, dims=1, desc=0x7fffffff4ad0, elementsize=464, hostdescptr=0x0, hostdescsize=0,
    lineno=219, name=0x7bca77 <.S22085> "nt", pdtype=0x0, flags=512, async=-1, devid=1) at ../../src/dataonb.c:294
#2  0x0000000000489eb0 in coreneuron::nrn_init_pas (nt=0x5961ce8, ml=0x6280d00) at x86_64/corenrn/mod2c/passive.cpp:247
#3  0x00000000006f67e3 in coreneuron::allocate_data_in_mechanism_nrn_init () at /gpfs/bbp.cscs.ch/home/olupton/nrn2/external/coreneuron/coreneuron/sim/finitialize.cpp:33
#4  0x00000000005155a7 in coreneuron::nrn_init_and_load_data (checkPoints=<optimized out>, is_mapping_needed=0 '\000', run_setup_cleanup=<optimized out>) at /gpfs/bbp.cscs.ch/home/olupton/nrn2/external/coreneuron/coreneuron/apps/main1.cpp:351
#5  0x00000000004f81d8 in run_solve_core (argc=<optimized out>, argv=<optimized out>) at /gpfs/bbp.cscs.ch/home/olupton/nrn2/external/coreneuron/coreneuron/apps/main1.cpp:551
#6  0x00000000004566ce in corenrn_embedded_run (nthread=<optimized out>, have_gaps=<optimized out>, use_mpi=<optimized out>, use_fast_imem=<optimized out>, mpi_lib=<optimized out>, nrn_arg=<optimized out>)
    at /gpfs/bbp.cscs.ch/home/olupton/nrn2/build/share/coreneuron/enginemech.cpp:109
#7  0x00007fffed550429 in nrncore_psolve (tstop=<optimized out>, file_mode=<optimized out>) at /gpfs/bbp.cscs.ch/home/olupton/nrn2/src/nrniv/nrncore_write.cpp:291
#8  0x00007fffed579c91 in psolve (v=0x5834420) at /gpfs/bbp.cscs.ch/home/olupton/nrn2/src/nrniv/../parallel/ocbbs.cpp:684

Setting OMP_NUM_THREADS=1 avoids the problem, as does removing the OpenMP parallelism from this line:

CoreNeuron/coreneuron/sim/fadvance_core.cpp, line 99 in d07bae0:

    nrn_multithread_job(nrn_fixed_step_thread);

This datareturn test runs various combinations of psolve-direct mode, number of threads, and cell permutation scheme:
https://github.com/neuronsimulator/nrn/blob/3cc83cf7b09044a478665490d52acfc31f606036/test/coreneuron/test_datareturn.py#L184-L186

The current CI does not include this configuration. We have:

MOD2C + OpenACC + OpenMP host threading:

CoreNeuron/.gitlab-ci.yml, lines 84 to 89 in d07bae0:

    build:coreneuron:mod2c:nvhpc:acc:
      extends: [.build, .spack_nvhpc]
      variables:
        SPACK_PACKAGE: coreneuron
        # See https://github.com/BlueBrain/CoreNeuron/issues/518 re: build_type
        SPACK_PACKAGE_SPEC: +gpu+openmp+tests~legacy-unit build_type=RelWithDebInfo

MOD2C + OpenACC + OpenMP host threading + unified memory:

CoreNeuron/.gitlab-ci.yml, lines 91 to 97 in d07bae0:

    # Build CoreNEURON with Unified Memory on GPU
    build:coreneuron:mod2c:nvhpc:acc:unified:
      extends: [.build, .spack_nvhpc]
      variables:
        SPACK_PACKAGE: coreneuron
        # See https://github.com/BlueBrain/CoreNeuron/issues/518 re: build_type
        SPACK_PACKAGE_SPEC: +gpu+unified+openmp+tests~legacy-unit build_type=RelWithDebInfo

We do not run the NEURON tests against this configuration.

NMODL + OpenMP target offload and host threading:

CoreNeuron/.gitlab-ci.yml, lines 112 to 118 in d07bae0:

    build:coreneuron:nmodl:nvhpc:omp:
      extends: [.build_coreneuron_nmodl, .spack_nvhpc]
      variables:
        SPACK_PACKAGE: coreneuron
        # See https://github.com/BlueBrain/CoreNeuron/issues/518 re: build_type
        SPACK_PACKAGE_SPEC: +nmodl+openmp+gpu+tests~legacy-unit~sympy build_type=RelWithDebInfo
      needs: ["build:nmodl"]

NMODL + OpenACC without OpenMP host threading:

CoreNeuron/.gitlab-ci.yml, lines 120 to 127 in d07bae0:

    build:coreneuron:nmodl:nvhpc:acc:
      extends: [.build_coreneuron_nmodl, .spack_nvhpc]
      variables:
        SPACK_PACKAGE: coreneuron
        # See https://github.com/BlueBrain/CoreNeuron/issues/518 re: build_type
        # Sympy + OpenMP target offload does not currently work with NVHPC
        SPACK_PACKAGE_SPEC: +nmodl~openmp+gpu+tests~legacy-unit+sympy build_type=RelWithDebInfo
      needs: ["build:nmodl"]

To Reproduce
Configure + build NEURON + CoreNEURON + NMODL:

cmake .. -G Ninja \
    -DNRN_ENABLE_RX3D=OFF \
    -DNRN_ENABLE_TESTS=ON \
    -DNRN_ENABLE_CORENEURON=ON \
    -DNRN_ENABLE_INTERVIEWS=OFF \
    -DCORENRN_ENABLE_GPU=ON \
    -DCORENRN_ENABLE_NMODL=ON \
    -DCORENRN_ENABLE_OPENMP=ON \
    -DCORENRN_ENABLE_OPENMP_OFFLOAD=OFF \
    -DPYTHON_EXECUTABLE=$(command -v python3)
ninja
ctest -R datareturn_py_gpu -V
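
Because the failure mode is a hang rather than a crash, it may help to run the reproduction under a time limit so that a stuck test ends with a clear message instead of blocking the shell. The following is a minimal sketch, assuming bash and the GNU coreutils `timeout` command. `run_or_flag_hang` is a hypothetical helper name, not part of NEURON or CoreNEURON, and the 600-second limit in the usage comments is an arbitrary choice.

```shell
# Hypothetical helper (not part of NEURON/CoreNEURON): run a command under a
# time limit and print a message if the limit was reached.
# Usage: run_or_flag_hang <seconds> <command> [args...]
run_or_flag_hang() {
    local seconds="$1"
    shift
    local status=0
    # GNU timeout exits with status 124 when it had to stop the command.
    timeout "$seconds" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "HANG: '$*' did not finish within ${seconds} seconds" >&2
    fi
    return "$status"
}

# Example usage from the build directory (the limit is an arbitrary choice):
#   run_or_flag_hang 600 ctest -R datareturn_py_gpu -V
# Same test with the single-thread workaround described in this report:
#   OMP_NUM_THREADS=1 run_or_flag_hang 600 ctest -R datareturn_py_gpu -V
```

If the first invocation reports a hang while the `OMP_NUM_THREADS=1` variant completes, that would be consistent with the behaviour described above.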

Expected behaviour
This shouldn't hang. https://gitlab.com/QEF/q-e/-/issues/452 seems to show a similar issue, but the comments there indicate the issue was fixed in NVHPC 22.3, which is the same version I see the above hang with.

System (please complete the following information)

OS: BB5
Compiler: NVHPC 22.3
Version: master
Backend: GPU

olupton added bug gpu CI Jenkins CI tests labels May 5, 2022

olupton mentioned this issue Aug 19, 2022 in "Support for shared libraries in GPU execution (python launch support)" #795 (merged, 10 tasks)
