# Data structure development
This page is intended to collect ideas and discussion points for the forthcoming development work on NEURON and CoreNEURON data structures, particularly in relation to host/device memory transfers and general memory management.
The existing code managing data transfer to and from an accelerator/GPU is rather explicit and error-prone, with a lot of hand-written code to ensure, for example, that structs containing pointers have those pointers updated to be valid on the device side too. Deallocation and cleanup are also managed by hand, and in some cases were only added rather recently. We would therefore like to improve these areas as part of a rewrite. For example:
- Automatic lifetime management
- Abstracting away the 2-part copy-data-and-then-manually-update-pointers-to-it pattern
- ...
Re-working this area of [Core]NEURON is also an opportunity to add support for new features and optimisations. Here is a list of possible improvements; please add to it as appropriate!
- Support for placing data in different types of device/GPU memory (e.g. constants -- this can include mechanism properties and matrix elements)
- Reducing device/GPU memory requirements by copying more selectively (there is data relating to artificial mechanisms that never needs to go onto the device/GPU)
- Finer-grained control over NUMA domains / locality (unclear how explicit this needs to be; it may be fine simply to allocate from pinned threads)
- Support for GPU programming models other than OpenACC (perhaps move to OpenMP if support is now better; would like Intel/AMD GPU support)
- Support for multiple GPUs per process (this is probably not important to us)
- Support for monitoring the evolution of mechanism properties on the GPU, probably by buffering per-time-step values on the device and doing batched copies to the host CPU (this is getting into implementation detail, but good to be aware of)
- ...
Here are some possible resources and libraries that could be used:
- Umpire is a fairly lightweight library providing a uniform interface for allocating host/device memory and moving data between different address spaces.
- CHAI is built on top of Umpire and provides an array type that tries to automate data migration between different address spaces.
## Useful resources
- https://github.com/BlueBrain/CoreNeuron/issues/201
- https://github.com/BlueBrain/CoreNeuron/issues/253
- CoreNEURON data structure redesign (restricted)
- CoreNEURON data structures (restricted, slides)
- CoreNEURON report (restricted)