Simulation deadlock or stuck during initialisation #819

Open
CharlesQiZhou opened this issue Jul 10, 2024 · 1 comment

We have been testing the main branch of hemelb for red blood cell (RBC) simulations (with @c-denham). With the updated main branch at 81a78e16e237aafab2d05ba4a82a0802affc76f7, the simulation appears to hit a parallel deadlock. The symptom is a silent failure during initialisation with no output: both stdout.txt and stderr.txt are empty. The deadlock occurs with builds on both ARCHER2 and a local server.

This may be related to an earlier issue with the rollback version b7dfb8879af22592928723f8e2061556ab6ee78d, where the simulation got stuck during initialisation and never entered the time loop. Example output is shown below (@rupertnash, this is an RBC simulation case, different from the fluid-only test case I shared with you in May):

![0.0s]Reading configuration from /mnt/lustre/a2fs-work3/work/e283/e283/qizhou/hemelb-main/results_main/YAZbifur-1b_FE10_Hct0.12_posNoise_global-Ks-Kb_Ks5e-6_timeNoise_REx100/config.xml
![0.0s]RBC insertion random seed: 0x17e0f879b5104f78
![0.0s]Beginning Initialisation.
![0.0s]Loading and decomposing geometry file /mnt/lustre/a2fs-work3/work/e283/e283/qizhou/hemelb-main/results_main/YAZbifur-1b_FE10_Hct0.12_posNoise_global-Ks-Kb_Ks5e-6_timeNoise_REx100/Bifur2_final.gmy.
![0.0s]Opened config file /mnt/lustre/a2fs-work3/work/e283/e283/qizhou/hemelb-main/results_main/YAZbifur-1b_FE10_Hct0.12_posNoise_global-Ks-Kb_Ks5e-6_timeNoise_REx100/Bifur2_final.gmy

NOTE: this "stuck" error was supposedly resolved by the debug-decomp branch that @rupertnash recently merged into main.

Both the "deadlock" and "stuck" errors reported above should be reproducible with the usual build steps below:

module load cmake/3.21.3
module load PrgEnv-gnu
module swap gcc gcc/11.2.0
module load boost/1.81.0
module load parmetis/4.0.3
module load cray-hdf5-parallel

cd dependencies
mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX=. -DHEMELB_BUILD_RBC=ON ..
make -j64

cd ../../Code
mkdir build && cd build

cmake -DCMAKE_INSTALL_PREFIX=. \
-DHEMELB_DEPENDENCIES_INSTALL_PREFIX=../../dependencies/build \
-DCMAKE_BUILD_TYPE=Debug \
-DHEMELB_WALL_BOUNDARY=BFL \
-DHEMELB_INLET_BOUNDARY=LADDIOLET \
-DHEMELB_OUTLET_BOUNDARY=NASHZEROTHORDERPRESSUREIOLET \
-DHEMELB_KERNEL:string=GuoForcingLBGK \
-DHEMELB_LATTICE:string=D3Q19 \
-DHEMELB_STENCIL:string=ThreePoint \
-DHEMELB_USE_SSE3:string=ON \
-DHEMELB_BUILD_RBC=ON \
-DCMAKE_VERBOSE_MAKEFILE:BOOL=ON ..

make -j64

The only change made to the code before compilation is the following:

diff --git a/Code/constants.h b/Code/constants.h
index 78e6dd2f..636f3c4e 100644
--- a/Code/constants.h
+++ b/Code/constants.h
@@ -19,7 +19,7 @@ namespace hemelb

 constexpr double mmHg_TO_PASCAL = 133.3223874;
 constexpr double DEFAULT_FLUID_DENSITY_Kg_per_m3 = 1000.0;

-constexpr double DEFAULT_FLUID_VISCOSITY_Pas = 0.004;
+constexpr double DEFAULT_FLUID_VISCOSITY_Pas = 0.001;
@CharlesQiZhou

The RBC test case in the example above previously ran successfully with a legacy version of the code, revision f85dac87900e082a6f5fd125a2b6366c94c752e5. @mobernabeu
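
Since there is a known-good revision (f85dac87...) and a known-bad one (81a78e16...), one way to narrow this down might be `git bisect run` with a time-limited launch, so that a hang counts as "bad". The helper below is only a sketch: `classify_run` is a hypothetical name, the time limit is arbitrary, and the launch command is left as a placeholder because it depends on the machine. A real step script would also need to rebuild at each revision before launching.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a time-limited run onto the exit codes that
# `git bisect run` understands (0 = good, 1 = bad, 125 = skip revision).
# Usage: classify_run <timeout-seconds> <command> [args...]
classify_run() {
    local limit="$1"
    shift
    local status=0
    # GNU `timeout` exits with 124 when it had to kill the command.
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        return 0      # run finished normally: good
    elif [ "$status" -eq 124 ]; then
        return 1      # killed by the time limit: treat the hang as bad
    else
        return 125    # any other failure: tell bisect to skip this revision
    fi
}

# Sketch of how this could be driven (not executed here):
#   git bisect start 81a78e16e237aafab2d05ba4a82a0802affc76f7 \
#                    f85dac87900e082a6f5fd125a2b6366c94c752e5
#   git bisect run ./bisect_step.sh   # hypothetical script: rebuild, then
#                                     # classify_run 600 <usual launch command>
```

Mapping non-timeout failures to 125 (skip) is a choice made here so that revisions that fail to build or crash for unrelated reasons do not get blamed for the deadlock; returning 1 for those instead would be equally valid if any failure should count as bad.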
