Release 1.8.0
The Ginkgo team is proud to announce the new Ginkgo minor release 1.8.0. This
release brings new features such as:
- A brand new file-based configuration for Ginkgo objects: you can now construct
Ginkgo objects (solvers, preconditioners, ...) from a JSON configuration file.
This simplifies interfacing to Ginkgo as well as exploring different settings
to solve a problem.
- Expand the batched feature set with: the Batched CSR Matrix format, batched CG
solver, batched (Block-)Jacobi preconditioner, usage example and other
features such as scaling,
- New Distributed Multigrid and the PGM coarsening method,
- New CUDA and HIP kernels for Reverse Cuthill McKee (RCM) reordering
- Better Ginkgo and Kokkos interaction thanks to a mapping from simple Ginkgo
types to native Kokkos types
and more!
If you face an issue, please first check our known issues page and the open issues list. If you do not find a solution, feel free to open a new issue or ask a question using the GitHub discussions.
Supported systems and requirements:
- For all platforms, CMake 3.16+
- C++14 compliant compiler
- Linux and macOS
  - GCC: 5.5+
  - clang: 3.9+
  - Intel compiler: 2019+
  - Apple Clang: 14.0 is tested. Earlier versions might also work.
  - NVHPC: 22.7+
  - Cray Compiler: 14.0.1+
  - CUDA module: CMake 3.18+, and CUDA 10.1+ or NVHPC 22.7+
  - HIP module: CMake 3.21+, and ROCm 4.5+
  - DPC++ module: Intel oneAPI 2023.1+ with oneMKL and oneDPL. Set the CXX compiler to dpcpp or icpx.
  - MPI: standard version 3.1+, ideally GPU Aware, for best performance
- Windows
  - MinGW: GCC 5.5+
  - Microsoft Visual Studio: VS 2019+
  - CUDA module: CUDA 10.1+, Microsoft Visual Studio
  - OpenMP module: MinGW.
Version support changes
- The Ginkgo license header now uses the SPDX format. #1404
- Ginkgo now requires oneAPI 2023.1+ for its oneAPI support #1396
- Ginkgo's HIP backend now requires CMake 3.21 #1334
Interface changes
- The gko::dim single-parameter constructor is now explicit to avoid accidental conversion from integers #1474
- The CMake option GINKGO_BUILD_HWLOC is now set to OFF by default, and if it is set to ON, then HWLOC is required to be available #1513
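To illustrate why an explicit single-parameter constructor helps, here is a minimal, self-contained sketch of the general C++ mechanism. The dim2 type and the num_entries and usage_example functions are hypothetical stand-ins written for this note; they are not Ginkgo's gko::dim and make no claim about its actual members.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a 2D size type; NOT Ginkgo's gko::dim.
struct dim2 {
    std::size_t rows;
    std::size_t cols;

    // explicit: an integer can no longer silently turn into a square size.
    explicit dim2(std::size_t size) : rows{size}, cols{size} {}
    dim2(std::size_t r, std::size_t c) : rows{r}, cols{c} {}
};

// Hypothetical consumer that takes a size by value.
inline std::size_t num_entries(dim2 size) { return size.rows * size.cols; }

// An implicit int -> dim2 conversion is rejected at compile time ...
static_assert(!std::is_convertible<int, dim2>::value,
              "integers must not convert implicitly");
// ... while explicit construction from an integer is still available.
static_assert(std::is_constructible<dim2, int>::value,
              "explicit construction must remain possible");

// Usage: every call spells out the construction of the size object.
inline void usage_example()
{
    assert(num_entries(dim2{3}) == 9);      // square 3 x 3
    assert(num_entries(dim2{2, 5}) == 10);  // rectangular 2 x 5
    // num_entries(3);  // would not compile: no implicit int -> dim2
}
```

Code that previously passed a bare integer where a size object was expected would, under this pattern, need to construct the size object explicitly.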
Behavior changes
- gko::write_raw now defaults to writing sparse output unless otherwise specified #1533
- Ginkgo now adheres to the --prefix option for cmake --install, instead of overwriting it #1534
Deprecations
- array::get_num_elems() has been renamed to get_size() #1400
- matrix_data::ensure_row_major_order() has been renamed to sort_row_major() #1400
- device_matrix_data::get_num_elems() has been renamed to get_num_stored_elements() #1400
- The CMake parameter GINKGO_COMPILER_FLAGS has been superseded by CMAKE_CXX_FLAGS, and GINKGO_CUDA_COMPILER_FLAGS has been superseded by CMAKE_CUDA_FLAGS #1535
- The std::initializer_list overloads of matrix create methods and constructors are deprecated in favor of explicit array parameters #1433
Summary of previous deprecations
- The device_reset parameter of CUDA and HIP executors no longer has an effect, and its allocation_mode parameters have been deprecated in favor of the Allocator interface.
- The CMake parameter GINKGO_BUILD_DPCPP has been deprecated in favor of GINKGO_BUILD_SYCL.
- The gko::reorder::Rcm interface has been deprecated in favor of gko::experimental::reorder::Rcm based on Permutation.
- The Permutation class' permute_mask functionality.
- Multiple functions with typos (set_complex_subpsace(), range functions such as conj_operaton etc).
- gko::lend() is not necessary anymore.
- The classes RelativeResidualNorm and AbsoluteResidualNorm are deprecated in favor of ResidualNorm.
- The class AmgxPgm is deprecated in favor of Pgm.
- Default constructors for the CSR load_balance and automatical strategies
- The PolymorphicObject's move-semantic copy_from variant
- The templated SolverBase class.
- The class MachineTopology is deprecated in favor of machine_topology.
- Logger constructors and create functions with the executor parameter.
- The virtual, protected, Dense functions compute_norm1_impl, add_scaled_impl, etc.
- Logger events for solvers and criterion without the additional implicit_tau_sq parameter.
- The global gko::solver::default_krylov_dim, use instead gko::solver::gmres_default_krylov_dim.
Added features
- Add a batched CG solver #1598, #1609
- Add a batched Jacobi (scalar/block) preconditioner, #1542, #1600
- Add an example for batched iterative solver #1553
- Add add_scaled_identity and scale_add for batch matrix formats. #1528
- Add scaling for batch objects (matrix formats and multi-vectors). #1527
- Add a batch::Csr matrix format class and core and support for batched spmv kernels on CUDA, HIP and SYCL. #1450
- Add a script for comparing benchmark JSON outputs #1467
- Add an example for reordered preconditioned linear solver #1465
- Add single-value access functions load_value and store_value to array #1485
- Add the BlockOperator format to represent block-matrices #1435
- Add CUDA and HIP kernels for Reverse Cuthill McKee (RCM) reordering #1503
- Add FileConfig #1389, #1392, #1395, #1479, #1480, #1607
- Add Distributed Multigrid #1269 and coarsening method PGM #1403
- Add a mapping from simple Ginkgo types to native Kokkos types #1358
- Add a segmented array class #1545
- Add a class for mapping between global and local indexing #1543
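As a rough illustration of what a batched CSR format and its SpMV compute, the following sketch assumes that all matrices in a batch are square, share one sparsity pattern, and differ only in their values. The batch_csr struct and the batch_spmv function are hypothetical names for this note; they do not reflect the batch::Csr interface or Ginkgo's kernels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical batched CSR container (NOT Ginkgo's batch::Csr class):
// one sparsity pattern shared by all items, values stored item after item.
struct batch_csr {
    std::size_t num_items;
    std::size_t num_rows;               // matrices are assumed square
    std::vector<std::size_t> row_ptrs;  // size num_rows + 1, shared
    std::vector<std::size_t> col_idxs;  // size nnz, shared
    std::vector<double> values;         // size num_items * nnz
};

// Computes y[b] = A[b] * x[b] for every batch item b. The vectors of all
// items are stored back to back in x and y.
inline void batch_spmv(const batch_csr& a, const std::vector<double>& x,
                       std::vector<double>& y)
{
    const std::size_t nnz = a.col_idxs.size();
    assert(a.row_ptrs.size() == a.num_rows + 1);
    assert(a.values.size() == a.num_items * nnz);
    assert(x.size() == a.num_items * a.num_rows);

    y.assign(a.num_items * a.num_rows, 0.0);
    for (std::size_t b = 0; b < a.num_items; ++b) {
        const std::size_t val_offset = b * nnz;         // values of item b
        const std::size_t vec_offset = b * a.num_rows;  // vectors of item b
        for (std::size_t row = 0; row < a.num_rows; ++row) {
            double sum = 0.0;
            for (std::size_t k = a.row_ptrs[row]; k < a.row_ptrs[row + 1];
                 ++k) {
                sum += a.values[val_offset + k] *
                       x[vec_offset + a.col_idxs[k]];
            }
            y[vec_offset + row] = sum;
        }
    }
}
```

Because the row pointers and column indices are stored once, only the values array grows with the number of batch items. Each item's product is independent of the others, which is what makes the operation a natural fit for parallel execution.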
Improvements
- Ginkgo installation now has separate Ginkgo_Runtime and Ginkgo_Development components for easier packaging #1502
- The HIP backend now supports complex number operations for sparse matrices based on hipSPARSE #1538
- The create functions are now documented explicitly instead of using the EnableCreateMethod mixin #1433
- The solver benchmark now supports Ginkgo's binary format for right-hand side vector inputs #1584
- The build system now uses native HIP support for CMake, which also provides support for ROCm 6.0 #1334
- The Multigrid solver generated from distributed::Matrix will use a global scalar Jacobi smoother and a GMRES solver as coarse grid solver #1612
Fixes
- Compilation with libc++ was fixed #1463
- Fix the __cplusplus check by using _MSVC_LANG in MSVC #1496
- Coo::read(const T&) and Csr::read(const T&) will no longer overwrite the locally stored arrays and instead copy directly into them #1476
- Fix the interaction of ProfilerHook::create(_nested)_summary, executors and GPU timers, which led to the summary not being printed #1509
- Fix compilation in environments where CPATH contains the current working directory #1531
- Fix read from matrix-market files with CR line endings #1557
- Fix undefined behavior that shows up with libstdc++ debug builds #1176
- Fix for CUDA 12.4 bug and METIS detection #1569
- Fix the pkgconfig installation with DESTDIR #1597
- Fix various issues causing build or test failures #1619
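For context on the _MSVC_LANG fix listed above: MSVC keeps __cplusplus at 199711L unless the /Zc:__cplusplus option is passed, while _MSVC_LANG reports the selected language standard. A common portable pattern is sketched below. The macro EXAMPLE_CPLUSPLUS and both helper functions are hypothetical names for this note, and the sketch is not necessarily how Ginkgo implements the check.

```cpp
// MSVC reports the active language standard in _MSVC_LANG; other compilers
// (and MSVC with /Zc:__cplusplus) report it in __cplusplus.
#if defined(_MSVC_LANG)
#define EXAMPLE_CPLUSPLUS _MSVC_LANG
#else
#define EXAMPLE_CPLUSPLUS __cplusplus
#endif

// Language standard value seen by this translation unit, e.g. 201402L.
constexpr long standard_version() { return EXAMPLE_CPLUSPLUS; }

// True when at least C++14 is enabled.
constexpr bool has_cxx14() { return standard_version() >= 201402L; }
```

Feature checks written against EXAMPLE_CPLUSPLUS then behave the same way on MSVC and on compilers that set __cplusplus accurately.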