This repository has been archived by the owner on Mar 20, 2023. It is now read-only.

coreneuron_modtests::datareturn_py_gpu test hangs using NMODL, OpenACC and OpenMP host threading #810

Open
olupton opened this issue May 5, 2022 · 0 comments

Describe the issue
In GPU-enabled builds that combine NMODL, GPU offload using OpenACC, and OpenMP host-side threading, the coreneuron_modtests::datareturn_py_gpu test hangs.

The hang occurs at

    void nrn_init_pas(NrnThread* nt, Memb_list* ml, int type) {
        nrn_pragma_acc(data present(nt, ml, pas_global) if(nt->compute_gpu))

apparently during initialisation of the second CoreNEURON run that has 2 threads:

(gdb) bt 10
#0  0x00007fffebc9241c in __pgi_uacc_event_synchronize (present_entry=0xa4b1e70, devid=1, async=-1, hostptr=0x5961ce8) at ../../src/event_mgmt.c:192
#1  0x00007fffebc8ef51 in __pgi_uacc_dataonb (filename=0x7bc5c0 <.F0003.22059> "/gpfs/bbp.cscs.ch/home/olupton/nrn2/build/test/nrnivmodl/0f75a099f800c237e43e24070e16322d263224e3a0f3b81e12c9c6aacb9933b1/x86_64/corenrn/mod2c/passive.cpp",
    funcname=0x7bc660 <.F0004.22061> "_ZN10coreneuron12nrn_init_pasEPNS_9NrnThreadEPNS_9Memb_listEi", pdevptr=0x7fffffff4a30, hostptr=0x5961ce8, hostptrptr=0x7fffffff4b18, poffset=0, dims=1, desc=0x7fffffff4ad0, elementsize=464, hostdescptr=0x0, hostdescsize=0,
    lineno=219, name=0x7bca77 <.S22085> "nt", pdtype=0x0, flags=512, async=-1, devid=1) at ../../src/dataonb.c:294
#2  0x0000000000489eb0 in coreneuron::nrn_init_pas (nt=0x5961ce8, ml=0x6280d00) at x86_64/corenrn/mod2c/passive.cpp:247
#3  0x00000000006f67e3 in coreneuron::allocate_data_in_mechanism_nrn_init () at /gpfs/bbp.cscs.ch/home/olupton/nrn2/external/coreneuron/coreneuron/sim/finitialize.cpp:33
#4  0x00000000005155a7 in coreneuron::nrn_init_and_load_data (checkPoints=<optimized out>, is_mapping_needed=0 '\000', run_setup_cleanup=<optimized out>) at /gpfs/bbp.cscs.ch/home/olupton/nrn2/external/coreneuron/coreneuron/apps/main1.cpp:351
#5  0x00000000004f81d8 in run_solve_core (argc=<optimized out>, argv=<optimized out>) at /gpfs/bbp.cscs.ch/home/olupton/nrn2/external/coreneuron/coreneuron/apps/main1.cpp:551
#6  0x00000000004566ce in corenrn_embedded_run (nthread=<optimized out>, have_gaps=<optimized out>, use_mpi=<optimized out>, use_fast_imem=<optimized out>, mpi_lib=<optimized out>, nrn_arg=<optimized out>)
    at /gpfs/bbp.cscs.ch/home/olupton/nrn2/build/share/coreneuron/enginemech.cpp:109
#7  0x00007fffed550429 in nrncore_psolve (tstop=<optimized out>, file_mode=<optimized out>) at /gpfs/bbp.cscs.ch/home/olupton/nrn2/src/nrniv/nrncore_write.cpp:291
#8  0x00007fffed579c91 in psolve (v=0x5834420) at /gpfs/bbp.cscs.ch/home/olupton/nrn2/src/nrniv/../parallel/ocbbs.cpp:684

Setting OMP_NUM_THREADS=1 avoids the problem, as does removing the OpenMP parallelism from this line:

nrn_multithread_job(nrn_fixed_step_thread);
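
The first workaround can be tried without rebuilding, by setting the variable for a single invocation only. A minimal sketch: the helper name `run_single_threaded` is hypothetical, and the `ctest` command in the usage comment is the one from "To Reproduce", assumed to be run from the CMake build directory.

```shell
# Hypothetical helper: run any command with exactly one OpenMP host thread.
# The assignment applies only to the command that is launched.
run_single_threaded() {
    OMP_NUM_THREADS=1 "$@"
}

# Usage, from the CMake build directory (not executed here):
#   run_single_threaded ctest -R datareturn_py_gpu -V
```

Scoping the variable to one command keeps the rest of the shell session multi-threaded, so the hanging and non-hanging cases can be compared back to back.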

This datareturn test runs various combinations of psolve-direct mode, number of threads, and cell permutation scheme:
https://github.com/neuronsimulator/nrn/blob/3cc83cf7b09044a478665490d52acfc31f606036/test/coreneuron/test_datareturn.py#L184-L186

The current CI does not cover this configuration (NMODL + OpenACC + OpenMP host threading). It covers:

  • MOD2C + OpenACC + OpenMP host threading:
    build:coreneuron:mod2c:nvhpc:acc:
      extends: [.build, .spack_nvhpc]
      variables:
        SPACK_PACKAGE: coreneuron
        # See https://github.com/BlueBrain/CoreNeuron/issues/518 re: build_type
        SPACK_PACKAGE_SPEC: +gpu+openmp+tests~legacy-unit build_type=RelWithDebInfo
  • MOD2C + OpenACC + OpenMP host threading + unified memory:
    # Build CoreNEURON with Unified Memory on GPU
    build:coreneuron:mod2c:nvhpc:acc:unified:
      extends: [.build, .spack_nvhpc]
      variables:
        SPACK_PACKAGE: coreneuron
        # See https://github.com/BlueBrain/CoreNeuron/issues/518 re: build_type
        SPACK_PACKAGE_SPEC: +gpu+unified+openmp+tests~legacy-unit build_type=RelWithDebInfo

    which we do not run the NEURON tests against
  • NMODL + OpenMP target offload and host threading:

    CoreNeuron/.gitlab-ci.yml

    Lines 112 to 118 in d07bae0

    build:coreneuron:nmodl:nvhpc:omp:
      extends: [.build_coreneuron_nmodl, .spack_nvhpc]
      variables:
        SPACK_PACKAGE: coreneuron
        # See https://github.com/BlueBrain/CoreNeuron/issues/518 re: build_type
        SPACK_PACKAGE_SPEC: +nmodl+openmp+gpu+tests~legacy-unit~sympy build_type=RelWithDebInfo
      needs: ["build:nmodl"]
  • NMODL + OpenACC without OpenMP host threading:

    CoreNeuron/.gitlab-ci.yml

    Lines 120 to 127 in d07bae0

    build:coreneuron:nmodl:nvhpc:acc:
      extends: [.build_coreneuron_nmodl, .spack_nvhpc]
      variables:
        SPACK_PACKAGE: coreneuron
        # See https://github.com/BlueBrain/CoreNeuron/issues/518 re: build_type
        # Sympy + OpenMP target offload does not currently work with NVHPC
        SPACK_PACKAGE_SPEC: +nmodl~openmp+gpu+tests~legacy-unit+sympy build_type=RelWithDebInfo
      needs: ["build:nmodl"]

To Reproduce
Configure + build NEURON + CoreNEURON + NMODL:

cmake .. -G Ninja \
    -DNRN_ENABLE_RX3D=OFF \
    -DNRN_ENABLE_TESTS=ON \
    -DNRN_ENABLE_CORENEURON=ON \
    -DNRN_ENABLE_INTERVIEWS=OFF \
    -DCORENRN_ENABLE_GPU=ON \
    -DCORENRN_ENABLE_NMODL=ON \
    -DCORENRN_ENABLE_OPENMP=ON \
    -DCORENRN_ENABLE_OPENMP_OFFLOAD=OFF \
    -DPYTHON_EXECUTABLE=$(command -v python3)
ninja
ctest -R datareturn_py_gpu -V
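
Because the failure mode is a hang rather than an error, it may help to bound the run so that it fails instead of blocking the terminal. A minimal sketch, assuming GNU coreutils `timeout` is available (it exits with status 124 when the limit is reached); the helper name `run_with_hang_check` and the 600-second limit are illustrative assumptions, not part of the project.

```shell
# Hypothetical helper: run a command under a time limit and report a
# suspected hang if the limit is reached.
# Assumes GNU coreutils `timeout`, which exits with 124 on timeout.
run_with_hang_check() {
    limit="$1"
    shift
    timeout "$limit" "$@"
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "Suspected hang: no completion within ${limit} seconds: $*" >&2
    fi
    return "$rc"
}

# Usage, from the CMake build directory (not executed here; the limit
# of 600 seconds is an arbitrary assumption):
#   run_with_hang_check 600 ctest -R datareturn_py_gpu -V
```

The helper passes the command's own exit status through unchanged, so a genuine test failure stays distinguishable from a timeout.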

Expected behaviour
The test should run to completion without hanging. https://gitlab.com/QEF/q-e/-/issues/452 appears to describe a similar issue, but the comments there indicate that it was fixed in NVHPC 22.3, which is the version in which I see the hang above.

System (please complete the following information)

  • OS: BB5
  • Compiler: NVHPC 22.3
  • Version: master
  • Backend: GPU