These codes were used to generate the results shown on NVIDIA's Parallel Forall blog.
🐝 The crystal structure of the metal-organic framework IRMOF-1 is stored in IRMOF-1.cssr. The unit cell contains 424 atoms and is a cube with an edge length of 25.832 Angstroms. In the .cssr file, the second column is the atom name, and the following three columns give the fractional coordinates of each atom along the a, b, and c crystal lattice directions, respectively.
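As a rough illustration of how those columns can be used, the sketch below reads one atom line and converts its fractional coordinates to Cartesian coordinates in Angstroms. It is not taken from henry.cu or henry_serial.cc. The names Atom, parse_atom_line, and frac_to_cart are hypothetical, the first column is assumed to be a label that can be skipped, and the simple scaling by the edge length is valid only because the cell is cubic.

```cpp
#include <array>
#include <cassert>
#include <sstream>
#include <string>

// Cubic cell edge length in Angstroms, as quoted in the README text.
constexpr double kCellLength = 25.832;

// Hypothetical container for one atom record.
struct Atom {
    std::string name;             // second column of the atom line
    std::array<double, 3> frac;   // fractional coordinates along a, b, c
};

// Parse one whitespace-separated atom line.
// Assumption: the first column is a label that is read and ignored;
// any columns after the three coordinates are left unread.
bool parse_atom_line(const std::string& line, Atom& atom) {
    std::istringstream in(line);
    std::string first_column;
    return static_cast<bool>(in >> first_column >> atom.name
                                >> atom.frac[0] >> atom.frac[1] >> atom.frac[2]);
}

// Convert fractional to Cartesian coordinates (Angstroms).
// Valid only for a cubic cell, where each axis is scaled by the same edge length.
std::array<double, 3> frac_to_cart(const std::array<double, 3>& frac,
                                   double edge = kCellLength) {
    assert(edge > 0.0);
    return {frac[0] * edge, frac[1] * edge, frac[2] * edge};
}
```

For example, an atom at fractional position (0.5, 0.5, 0.5) sits at the center of the cube, 12.916 Angstroms along each axis.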
🐝 The 'henry.cu' code is the CUDA code for GPUs.
🐝 The 'henry_serial.cc' code is the C++ code parallelized using OpenMP. See the #pragma omp parallel for directive that parallelizes the loop.
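For readers unfamiliar with the directive, here is a minimal sketch of the pattern. It does not reproduce the loop in henry_serial.cc: the function mean_of_squares and its workload are hypothetical stand-ins, and the reduction clause is an assumption about how a shared sum would typically be accumulated safely across threads.

```cpp
#include <cassert>
#include <vector>

// Hypothetical workload: average of squared values.
// Each iteration is independent, so the loop can be split across threads.
double mean_of_squares(const std::vector<double>& values) {
    assert(!values.empty());
    const long n = static_cast<long>(values.size());
    double sum = 0.0;

    // Each thread accumulates a private partial sum; OpenMP combines
    // the partial sums into `sum` when the loop finishes.
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += values[i] * values[i];
    }
    return sum / static_cast<double>(n);
}
```

With GCC or Clang, the pragma takes effect only when OpenMP is enabled at compile time (for example with -fopenmp); without it the pragma is ignored and the loop runs serially with the same result.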
🐝 To compile both codes, type make (see Makefile).
🐝 The Bash shell script run.sh runs the performance benchmarks of the CUDA and OpenMP-parallelized codes by varying the number of parallel elements (GPU threads per block / OpenMP threads) and stores the run times in .csv files. It also calls the Python script plot_performance.py to plot the results.
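The same kind of sweep can be sketched in C++ for the OpenMP case. This is not a translation of run.sh: the function benchmark_csv, the column names threads and seconds, and the single-run timing are all assumptions made for illustration, and the actual .csv layout produced by the script may differ.

```cpp
#include <chrono>
#include <functional>
#include <sstream>
#include <string>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Run `work` once per requested thread count and return the wall-clock
// times as CSV text. Column names are hypothetical.
std::string benchmark_csv(const std::vector<int>& thread_counts,
                          const std::function<void()>& work) {
    std::ostringstream csv;
    csv << "threads,seconds\n";
    for (int n : thread_counts) {
#ifdef _OPENMP
        // Only has an effect when compiled with OpenMP enabled.
        omp_set_num_threads(n);
#endif
        const auto start = std::chrono::steady_clock::now();
        work();
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        csv << n << ',' << elapsed.count() << '\n';
    }
    return csv.str();
}
```

A single timed run per setting is noisy; repeating each measurement and averaging would give steadier numbers.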