NebulaSEM

NebulaSEM is an experimental finite volume (FV) and discontinuous Galerkin spectral element (dGSEM) code for solving partial differential equations (PDEs) of fluid dynamics. It comes with solvers for compressible and incompressible flow, and also provides infrastructure for writing PDE solvers easily with support for adaptive mesh refinement (AMR). The primary focus of the software is research in a high-order non-hydrostatic dynamical core for atmospheric simulations. Several examples are provided to demonstrate this capability.

Build and install

Clone the repository

git clone https://github.com/dshawul/NebulaSEM.git

To build and install NebulaSEM

mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=release -DCMAKE_INSTALL_PREFIX=.. ..
make && make install

Additional options to enable fine-grained parallelization on CPUs and GPUs

-DUSE_ACC=ON
-DUSE_OMP=ON
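For example, to build with OpenACC offloading enabled, mirroring the commands above (a sketch only; a compiler with OpenACC support is assumed):

# same pattern as the default build, with OpenACC turned on
mkdir build-acc && cd build-acc
cmake -DCMAKE_BUILD_TYPE=release -DCMAKE_INSTALL_PREFIX=.. -DUSE_ACC=ON ..
make && make install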

This installs tools for pre-processing, solution, and post-processing. The mesh tool generates the grid, prepare handles various pre- and post-processing tasks, and several other binaries solve specific PDEs, e.g. euler, convection, etc.
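As a rough sketch of how these pieces fit together (the command-line arguments below are hypothetical placeholders rather than the documented syntax; the test.sh script described under Testing automates the real sequence):

# hypothetical workflow sketch; argument syntax is assumed, not verified
mesh <grid-file>      # generate the grid
prepare <grid-file>   # pre-processing steps
euler <grid-file>     # run one of the solver binaries
prepare <grid-file>   # post-processing steps, e.g. producing VTK output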

Requirements

  • METIS for domain decomposition
  • Open MPI or another MPI library compatible with your C++ compiler
  • CMake for makefile generation
  • GCC or another C++ compiler with at least C++17 support
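On Debian/Ubuntu, for example, these dependencies can typically be installed as shown below (the package names are assumptions based on common distribution packaging; use your platform's equivalents):

# Debian/Ubuntu package names (assumed); adapt to your distribution
sudo apt-get install g++ cmake libopenmpi-dev libmetis-dev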

Optional packages

Documentation

The API documentation built with Doxygen can be found here. READMEs are also provided for writing new solvers under the apps/ directory and for setting up test cases under the examples/ directory.

Testing

A testing script, test.sh, is provided. By default it runs the lid-driven cavity test case under examples/cavity using the binaries installed with the make install command.
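So the simplest invocation, relying on those defaults, is just:

./test.sh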

To run a test case, execute the test.sh script, specifying the number of MPI ranks (if greater than 1) and the test case name.

./test.sh -n 2 -c examples/atmo/advection-leveque/bubble

If the binaries are installed somewhere other than $PWD/bin, pass the installation location using the --bin-path option:

./test.sh -n 2 --bin-path /usr/local/bin -c examples/atmo/advection-leveque/bubble

The path to the test case should point to the grid file, in this case bubble, not to the directory itself.

Usage: ./test.sh [options]

   -n,--np       Number of MPI processes to launch.
   -c,--case     Path to grid file name that is under a test case directory.
   -b,--bin-path Path to binaries: mesh, prepare and solvers.
   -s,--steps    Number of time steps, which overrides the value in the control file.
   -h,--help     Display this help message.
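For example, to run the cavity case on 4 MPI ranks for only 10 time steps (the rank and step counts are arbitrary, chosen for illustration):

./test.sh -n 4 -s 10 -c examples/cavity/cavity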

Lid-driven cavity flow

This test case uses the Pressure-Implicit with Splitting of Operators (PISO) solver for incompressible flow at a low Reynolds number, i.e. without turbulence.

$ ./test.sh -n 1 -c examples/cavity-amr/cavity

This will generate a run-examples-cavity-amr directory in which you can find the results, including VTK files for visualization with ParaView.

Here are images of the domain decomposition using METIS with 12 MPI ranks, and of the velocity magnitude.

Pitz-Daily test case

Another test case, Pitz-Daily, solved using LES is shown below. You can see the formation of eddies at the backward-facing step and their later convection towards the outlet.

The same test case simulated with the k-epsilon turbulence model is shown below. It is a Reynolds-averaged turbulence model, so only the mean state is displayed.

Rising thermal bubble

This is a popular test case for numerical weather prediction models, which solve the Euler equations using explicit time-stepping, unlike other CFD applications that often use implicit solvers. Moreover, this test case uses the discontinuous Galerkin method (spectral-element version) on hexahedral grids, combined with adaptive mesh refinement. Thanks to my postdoc supervisor Francis X. Giraldo, from whom I learned this stuff!

A thermal bubble with a Gaussian distribution rises due to buoyancy, deforming along the way, and collides with the top boundary.
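For reference, a Gaussian potential-temperature perturbation of this kind generically takes the form below; this is only the textbook shape, an assumption on my part, and the actual amplitude, centre, and width are set in the corresponding case files under examples/:

$$\theta'(x,z) = \theta_c \exp\left(-\frac{(x-x_c)^2 + (z-z_c)^2}{2\sigma^2}\right)$$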

A note about MPI/OpenMP/OpenACC parallelization

NebulaSEM can exploit multi-core CPUs using either a pure MPI approach or a hybrid MPI+OpenMP approach. CFD codes often use pure MPI because it tends to scale better than OpenMP. However, when extreme scalability on supercomputers is required, hybrid MPI+OpenMP parallelization can become beneficial by reducing inter-process communication. To compile NebulaSEM for hybrid MPI+OpenMP execution:

mkdir build-omp && cd build-omp
cmake -DCMAKE_BUILD_TYPE=release -DCMAKE_INSTALL_PREFIX=.. -DUSE_OMP=ON ..
make && make install

The number of OpenMP threads is controlled by the OMP_NUM_THREADS environment variable. We can then run the lid-driven cavity flow as usual, specifying 1 MPI rank and 2 OpenMP threads for a pure OpenMP approach:

$ export OMP_NUM_THREADS=2
$ ./test.sh -n 1 -c examples/cavity/cavity

Or 2 MPI ranks with 2 OpenMP threads per rank for a hybrid MPI+OpenMP approach:

$ export OMP_NUM_THREADS=2
$ ./test.sh -n 2 -c examples/cavity/cavity

Or 2 MPI ranks with 1 thread per rank for a pure MPI approach:

$ export OMP_NUM_THREADS=1
$ ./test.sh -n 2 -c examples/cavity/cavity

Note that to obtain significant speedups from either MPI or OpenMP, the problem size should be large enough. For the lid-driven cavity flow we can increase the grid to 100x100 cells by editing the grid file examples/cavity/cavity:

-8{0 1 2 3 4 5 6 7} linear 3{20 20 1}
+8{0 1 2 3 4 5 6 7} linear 3{100 100 1}
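The edit can be made by hand or, assuming the cell-count specification appears exactly once in that file, with a one-liner such as:

# replace the 20x20x1 cell counts with 100x100x1 (assumes a single match)
sed -i 's/3{20 20 1}/3{100 100 1}/' examples/cavity/cavity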

Run with a single MPI rank:

$ export OMP_NUM_THREADS=1
$ ./test.sh -n 1 -c examples/cavity/cavity 
...
9018 [0] Time 5.000000
9025 [0] SYMM-FULL-SSOR-PCG :Iterations 1 Initial Residual 1.47888e-11 Final Residual 1.14971e-11
9027 [0] SYMM-FULL-SSOR-PCG :Iterations 1 Initial Residual 1.44703e-11 Final Residual 1.12991e-11
9073 [0] Exiting application run with 1 processes

The run takes about 9073 milliseconds.

Run with 2 MPI ranks:

$ ./test.sh -n 2 -c examples/cavity/cavity
....
4623 [0] Time 5.000000
4626 [0] SYMM-FULL-SSOR-PCG :Iterations 1 Initial Residual 3.55206e-11 Final Residual 3.16645e-11
4628 [0] SYMM-FULL-SSOR-PCG :Iterations 1 Initial Residual 3.38580e-11 Final Residual 3.00837e-11
4672 [0] Exiting application run with 2 processes

The run takes about 4672 milliseconds, a speedup of 1.94x out of an ideal 2x, which is good.


Let's do the same with the OpenMP implementation. Make sure to use the build-omp binaries by running:

cd build-omp && make install

We can now run the test case with 1 MPI rank and 2 OpenMP threads:

$ export OMP_NUM_THREADS=2
$ ./test.sh -n 1 -c examples/cavity/cavity
....
6686 [0] Time 5.000000
6691 [0] SYMM-FULL-SSOR-PCG :Iterations 1 Initial Residual 1.47888e-11 Final Residual 1.14971e-11
6693 [0] SYMM-FULL-SSOR-PCG :Iterations 1 Initial Residual 1.44703e-11 Final Residual 1.12991e-11
6739 [0] Exiting application run with 1 processes

It took about 6739 milliseconds. This is slower than the MPI run but still faster than the serial run, with a speedup of 1.34x.

Contribution

Users are welcome to suggest and implement new features such as new solvers, turbulence models, and test cases, as well as to report bugs or issues they encounter. Please feel free to open an issue on this repository describing your desired change or bug fix. Pull requests are also welcome!