This repository has been archived by the owner on Mar 20, 2023. It is now read-only.

Data structure development

Olli Lupton edited this page Apr 26, 2021 · 7 revisions

This page is intended to collect ideas and discussion points for the forthcoming development work on NEURON and CoreNEURON data structures, particularly in relation to host/device memory transfers and general memory management.

Goals

Improvements

The existing code managing data transfer to/from an accelerator/GPU is rather explicit and error-prone, with a lot of hand-written code to ensure -- for example -- that structs containing pointers have those pointers updated to be valid on the device side too. Deallocation/cleanup is also managed by hand, and in some cases was only added rather recently. We would, therefore, like to improve in these areas as part of a rewrite. For example:

  • Automatic lifetime management
  • Abstracting away the 2-part copy-data-and-then-manually-update-pointers-to-it pattern
  • ...
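The two bullets above can be combined in one abstraction: an RAII wrapper that performs the shallow struct copy, the deep copy of pointed-to data, and the pointer fixup in its constructor, and releases everything in its destructor. The sketch below is illustrative only: "device" memory is simulated with ordinary host `malloc` so it runs without a GPU, and all names (`Node`, `DeviceCopy`) are hypothetical, not existing NEURON types.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical host-side struct with an interior pointer that must be
// retargeted when the struct is copied to another address space.
struct Node {
    double* data;
    int n;
};

// RAII wrapper: the constructor does the two-part copy (shallow struct copy,
// then deep copy + pointer fixup); the destructor guarantees cleanup, so
// deallocation cannot be forgotten. Device allocation is simulated with
// std::malloc in this sketch.
class DeviceCopy {
public:
    explicit DeviceCopy(const Node& host) {
        dev_ = static_cast<Node*>(std::malloc(sizeof(Node)));
        std::memcpy(dev_, &host, sizeof(Node));  // part 1: shallow copy
        dev_->data = static_cast<double*>(std::malloc(host.n * sizeof(double)));
        std::memcpy(dev_->data, host.data,
                    host.n * sizeof(double));    // part 2: deep copy + fixup
    }
    ~DeviceCopy() {
        std::free(dev_->data);
        std::free(dev_);
    }
    DeviceCopy(const DeviceCopy&) = delete;
    DeviceCopy& operator=(const DeviceCopy&) = delete;
    Node* get() const { return dev_; }

private:
    Node* dev_;
};
```

With real device memory the same shape works; only the allocation and copy primitives change.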

New features

Re-working this area of [Core]NEURON is also an opportunity to add support for new features and optimisations. Here is a list of possible improvements; please add to it as appropriate!

  • Support for placing data in different types of device/GPU memory (e.g. constants -- this can include mechanism properties and matrix elements)
  • Reducing device/GPU memory requirements by copying more selectively (there is data relating to artificial mechanisms that never needs to go onto the device/GPU)
  • Finer grained control over NUMA domains / locality (unclear how explicit this needs to be; it may be fine to just allocate from pinned threads)
  • Support for GPU programming models other than OpenACC (perhaps move to OpenMP if support is now better; would like Intel/AMD GPU support)
  • Support for multiple GPUs per process (this is probably not important to us)
  • Support for monitoring mechanism properties' evolution on GPU, probably by buffering per-time-step on the device and doing batched copies to the host CPU (this is getting into implementation detail, but good to be aware of)
  • ...
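To make the monitoring bullet concrete, the buffering idea could look like the sketch below: values are recorded once per time step into a device-resident staging buffer and transferred to the host in one batched copy when the buffer fills. This is a plain-C++ simulation (the "device" buffer is an ordinary `std::vector`) and every name is hypothetical, not an existing NEURON API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of per-time-step buffering with batched host transfers. In a real
// implementation `staged_` would live in device memory and flush() would be
// one device-to-host copy instead of many small per-step copies.
class TraceBuffer {
public:
    explicit TraceBuffer(std::size_t capacity) : capacity_(capacity) {}

    // Called once per time step on the device side.
    void record(double value) {
        staged_.push_back(value);
        if (staged_.size() == capacity_) flush();
    }

    // One batched transfer covering `capacity_` time steps.
    void flush() {
        host_.insert(host_.end(), staged_.begin(), staged_.end());
        staged_.clear();
    }

    const std::vector<double>& host_trace() const { return host_; }

private:
    std::size_t capacity_;
    std::vector<double> staged_;  // stands in for a device-resident buffer
    std::vector<double> host_;    // host-side history
};
```

The trade-off is the usual one: a larger buffer means fewer, larger transfers at the cost of more device memory and higher latency before values are visible on the host.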

Possible libraries/resources

Here are some libraries and resources that could be used:

Umpire

Umpire is a fairly lightweight library providing a uniform interface for allocating host/device memory and moving data between different address spaces.

CHAI

CHAI is built on top of Umpire and provides an array type that tries to automate data migration between different address spaces.
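The core idea behind CHAI's array type can be mocked in a few lines: the array keeps a copy of its data per address space and migrates it when the active execution space changes, so kernels never see stale data. Both "spaces" below are host vectors, and the names (`ManagedArray`, `Space`, `move`) are illustrative, not CHAI's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mock of automated data migration between address spaces. A real managed
// array would pair host memory with device memory and issue transfers; here
// both spaces are host vectors so the migration logic is visible and testable.
enum class Space { Host, Device };

class ManagedArray {
public:
    explicit ManagedArray(std::size_t n) : host_(n), device_(n) {}

    // Called by the runtime before executing in space `s`: if the active
    // space changes, migrate the data so the new space sees current values.
    void move(Space s) {
        if (s == active_) return;
        if (s == Space::Device) device_ = host_;  // host -> device
        else host_ = device_;                     // device -> host
        active_ = s;
    }

    double& operator[](std::size_t i) {
        return active_ == Space::Host ? host_[i] : device_[i];
    }

private:
    Space active_ = Space::Host;
    std::vector<double> host_, device_;
};
```

This is exactly the bookkeeping that is currently hand-written in [Core]NEURON's transfer code; a managed array type centralises it behind one interface.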
