Skip to content

Commit

Permalink
Merge pull request #1345 from ndellingwood/release-3.6.00
Browse files Browse the repository at this point in the history
Release 3.6.00
  • Loading branch information
ndellingwood authored Apr 26, 2022
2 parents 8381db0 + cc63e3d commit e09389a
Show file tree
Hide file tree
Showing 801 changed files with 148,219 additions and 122,835 deletions.
36 changes: 36 additions & 0 deletions .github/workflows/format.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
name: github-FORMAT

on:
pull_request:
branches:
- master
- develop

jobs:
clang-format-check:
runs-on: ubuntu-18.04
steps:
- uses: actions/checkout@v2

- name: Install Dependencies
run: sudo apt install clang-format-8

- name: check
run: |
# Fetch from the default remote (origin)
git fetch &> /dev/null
# For every file changed, apply clang-format
for file in $(git diff --name-only origin/$GITHUB_BASE_REF | egrep '.*\.cpp$|.*\.hpp$|.*\.h$'); do
if [ -e $file ]; then
clang-format-8 -i -style=file $file
git add $file
fi
done
# If any diffs exist, error out
if [[ ! -z $(git status -s -uno . -- ':!.github') ]]; then
echo "The following files require formatting changes:"
git status -s -uno . -- ':!.github'
exit 1
fi
86 changes: 86 additions & 0 deletions .github/workflows/osx.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,86 @@
name: github-OSX

on:
pull_request:
branches:
- master
- develop

jobs:
osxci:
name: osx-ci
runs-on: [macos-latest]

strategy:
matrix:
include:
- backend: "SERIAL"
cmake_build_type: "RelWithDebInfo"
- backend: "THREADS"
cmake_build_type: "RelWithDebInfo"
- backend: "SERIAL"
cmake_build_type: "Debug"
- backend: "SERIAL"
cmake_build_type: "Release"

steps:
- name: checkout_kokkos_kernels
uses: actions/checkout@v2
with:
path: kokkos-kernels

- name: checkout_kokkos
uses: actions/checkout@v2
with:
repository: kokkos/kokkos
ref: ${{ github.base_ref }}
path: kokkos

- name: configure_kokkos
run: |
ls -lat
mkdir -p kokkos/{build,install}
cd kokkos/build
cmake \
-DKokkos_ENABLE_${{ matrix.backend }}=ON \
-DCMAKE_CXX_FLAGS="-Werror" \
-DCMAKE_CXX_STANDARD=14 \
-DKokkos_ENABLE_COMPILER_WARNINGS=ON \
-DKokkos_ENABLE_DEPRECATED_CODE_3=OFF \
-DCMAKE_BUILD_TYPE=${{ matrix.cmake_build_type }} \
-DCMAKE_INSTALL_PREFIX=$PWD/../install \
..
- name: build_and_install_kokkos
working-directory: kokkos/build
run: make -j2 install

- name: configure_kokkos_kernels
run: |
ls -lat
mkdir -p kokkos-kernels/{build,install}
cd kokkos-kernels/build
cmake \
-DKokkos_DIR=$PWD/../../kokkos/install/lib/cmake/Kokkos \
-DCMAKE_BUILD_TYPE=${{ matrix.cmake_build_type }} \
-DCMAKE_CXX_FLAGS="-Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wignored-qualifiers -Wempty-body -Wuninitialized" \
-DCMAKE_INSTALL_PREFIX=$PWD/../install \
-DKokkosKernels_ENABLE_TESTS=ON \
-DKokkosKernels_ENABLE_EXAMPLES:BOOL=ON \
-DKokkosKernels_INST_COMPLEX_DOUBLE=ON \
-DKokkosKernels_INST_DOUBLE=ON \
-DKokkosKernels_INST_COMPLEX_FLOAT=ON \
-DKokkosKernels_INST_FLOAT=ON \
-DKokkosKernels_INST_LAYOUTLEFT:BOOL=ON \
-DKokkosKernels_INST_LAYOUTRIGHT:BOOL=ON \
-DKokkosKernels_ENABLE_TPL_CUSPARSE=OFF \
-DKokkosKernels_ENABLE_TPL_CUBLAS=OFF \
..
- name: build_kokkos_kernels
working-directory: kokkos-kernels/build
run: make -j2

- name: test
working-directory: kokkos-kernels/build
run: ctest -j2 --output-on-failure
110 changes: 78 additions & 32 deletions .jenkins/nightly.groovy
Original file line number Diff line number Diff line change
@@ -1,40 +1,86 @@
pipeline {
agent none

options {
timeout(time: 3, unit: 'HOURS')
}

stages {
stage('HIP-ROCm-4.2-C++14') {
agent {
dockerfile {
filename 'Dockerfile.hip'
dir 'scripts/docker'
additionalBuildArgs '--build-arg BASE=rocm/dev-ubuntu-20.04:4.2'
label 'rocm-docker && vega'
args '-v /tmp/ccache.kokkos:/tmp/ccache --device=/dev/kfd --device=/dev/dri --security-opt seccomp=unconfined --group-add video --env HIP_VISIBLE_DEVICES=$HIP_VISIBLE_DEVICES'
stage('Build & Run') {
parallel {
stage('SYCL-OneAPI') {
agent {
dockerfile {
filename 'Dockerfile.sycl'
dir 'scripts/docker'
label 'nvidia-docker && volta'
args '-v /tmp/ccache.kokkos:/tmp/ccache'
}
}
steps {
sh '''rm -rf kokkos &&
git clone -b develop https://github.com/kokkos/kokkos.git && cd kokkos && \
mkdir build && cd build && \
cmake \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_COMPILER=clang++ \
-DKokkos_ARCH_VOLTA70=ON \
-DKokkos_ENABLE_DEPRECATED_CODE_3=OFF \
-DKokkos_ENABLE_SYCL=ON \
-DKokkos_ENABLE_UNSUPPORTED_ARCHS=ON \
-DCMAKE_CXX_STANDARD=17 \
.. && \
make -j8 && make install && \
cd ../.. && rm -rf kokkos'''
sh '''rm -rf build && mkdir -p build && cd build && \
cmake \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_COMPILER=clang++ \
-DKokkosKernels_ENABLE_TESTS=ON \
-DKokkosKernels_ENABLE_EXAMPLES=ON \
-DKokkosKernels_INST_DOUBLE=ON \
-DKokkosKernels_INST_ORDINAL_INT=ON \
-DKokkosKernels_INST_OFFSET_INT=ON \
.. && \
make -j8'''
}
}

stage('HIP-ROCm-4.5-C++14') {
agent {
dockerfile {
filename 'Dockerfile.hip'
dir 'scripts/docker'
additionalBuildArgs '--build-arg BASE=rocm/dev-ubuntu-20.04:4.5'
label 'rocm-docker && vega'
args '-v /tmp/ccache.kokkos:/tmp/ccache --device=/dev/kfd --device=/dev/dri --security-opt seccomp=unconfined --group-add video --env HIP_VISIBLE_DEVICES=$HIP_VISIBLE_DEVICES'
}
}
steps {
sh '''rm -rf kokkos &&
git clone -b develop https://github.com/kokkos/kokkos.git && cd kokkos && \
mkdir build && cd build && \
cmake \
-DCMAKE_CXX_COMPILER=hipcc \
-DCMAKE_CXX_EXTENSIONS=OFF \
-DKokkos_ENABLE_HIP=ON \
.. && \
make -j8 && make install && \
cd ../.. && rm -rf kokkos'''
sh '''rm -rf build && mkdir -p build && cd build && \
cmake \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_CXX_COMPILER=hipcc \
-DCMAKE_CXX_EXTENSIONS=OFF \
-DKokkosKernels_ENABLE_TESTS=ON \
-DKokkosKernels_ENABLE_EXAMPLES=ON \
-DKokkosKernels_INST_DOUBLE=ON \
-DKokkosKernels_INST_ORDINAL_INT=ON \
-DKokkosKernels_INST_OFFSET_INT=ON \
.. && \
make -j8 && ctest --verbose'''
}
}
}
steps {
sh '''rm -rf kokkos &&
git clone -b develop https://github.com/kokkos/kokkos.git && cd kokkos && \
mkdir build && cd build && \
cmake \
-DCMAKE_CXX_COMPILER=hipcc \
-DCMAKE_CXX_EXTENSIONS=OFF \
-DKokkos_ENABLE_HIP=ON \
.. && \
make -j8 && make install && \
cd ../.. && rm -rf kokkos'''
sh '''rm -rf build && mkdir -p build && cd build && \
cmake \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_CXX_COMPILER=hipcc \
-DCMAKE_CXX_EXTENSIONS=OFF \
-DKokkosKernels_ENABLE_TESTS=ON \
-DKokkosKernels_ENABLE_EXAMPLES=ON \
-DKokkosKernels_INST_DOUBLE=ON \
-DKokkosKernels_INST_ORDINAL_INT=ON \
-DKokkosKernels_INST_OFFSET_INT=ON \
.. && \
make -j8 && ctest --verbose'''
}
}
}
Expand Down
124 changes: 124 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,129 @@
# Change Log

## [3.6.00](https://github.com/kokkos/kokkos-kernels/tree/3.6.00) (2022-02-18)
[Full Changelog](https://github.com/kokkos/kokkos-kernels/compare/3.5.00...3.6.00)

### Features:

#### Batched Sparse Linear algebra
- Kokkos Kernels is adding a new component to the library: batched sparse linear algebra.
- Similarly to the current dense batched algorithms, the new algorithms are called from
- the GPU and provide Team and TeamVector level of parallelism, SpMV also provides a Serial
- call on GPU.

- Add Batched CG and Batched GMRES [\#1155](https://github.com/kokkos/kokkos-kernels/pull/1155)
- Add Jacobi Batched preconditioner [\#1219](https://github.com/kokkos/kokkos-kernels/pull/1219)

#### Bsr and Tensor core algorithm for sparse linear algebra
- After introducing the BsrMatrix in release 3.5.0 new algorithms are now supporting this format.
- For release 3.6.0 we are adding matrix-vector (matvec) multiplication and Gauss-Seidel as well as an
- implementation of matvec that leverages tensor cores on Nvidia GPUs. More kernels are expected to
- support the Bsr format in future releases.

- Add Spmv for BsrMatrix [\#1255](https://github.com/kokkos/kokkos-kernels/pull/1255)
- Add BLAS to SpMV operations for BsrMatrix [\#1297](https://github.com/kokkos/kokkos-kernels/pull/1297)
- BSR format support in block Gauss-Seidel [\#1232](https://github.com/kokkos/kokkos-kernels/pull/1232)
- Experimental tensor-core SpMV for BsrMatrix [\#1090](https://github.com/kokkos/kokkos-kernels/pull/1090)

#### Improved AMD math libraries support
- rocBLAS and rocSPARSE TPLs are now officially supported, they can be enabled at configure time.
- Initial kernels that can call rocBLAS are GEMV, GEMM, IAMAX and SCAL, while rocSPARSE can be
- called for matrix-vector multiplication. Further support for TPL calls can be requested on slack
- and by GitHub issues.

- Tpl rocBLAS and rocSPARSE [\#1153](https://github.com/kokkos/kokkos-kernels/pull/1153)
- Add rocBLAS GEMV wrapper [\#1201](https://github.com/kokkos/kokkos-kernels/pull/1201)
- Add rocBLAS wrappers for GEMM, IAMAX, and SCAL [\#1230](https://github.com/kokkos/kokkos-kernels/pull/1230)
- SpMV: adding support for rocSPARSE TPL [\#1221](https://github.com/kokkos/kokkos-kernels/pull/1221)

#### Additional new features
- bhalf: Unit test Batched GEMM [\#1251](https://github.com/kokkos/kokkos-kernels/pull/1251)
- and demostrate GMRES example convergence with bhalf_t (https://github.com/kokkos/kokkos-kernels/pull/1300)
- Stream interface: adding stream support in GEMV and GEMM [\#1131](https://github.com/kokkos/kokkos-kernels/pull/1131)
- Improve double buffering batched gemm performance [\#1217](https://github.com/kokkos/kokkos-kernels/pull/1217)
- Allow choosing coloring algorithm in multicolor GS [\#1199](https://github.com/kokkos/kokkos-kernels/pull/1199)
- Batched: Add armpl dgemm support [\#1256](https://github.com/kokkos/kokkos-kernels/pull/1256)

### Deprecations:
- Deprecation warning: SpaceAccessibility move out of impl, see #1140 [\#1141](https://github.com/kokkos/kokkos-kernels/pull/1141)

### Backends and Archs Enhancements:

#### SYCL:
- Full Blas support on SYCL [\#1270](https://github.com/kokkos/kokkos-kernels/pull/1270)
- Get sparse tests enabled and working for SYCL [\#1269](https://github.com/kokkos/kokkos-kernels/pull/1269)
- Changes to make graph run on SYCL [\#1268](https://github.com/kokkos/kokkos-kernels/pull/1268)
- Allow querying free/total memory for SYCL [\#1225](https://github.com/kokkos/kokkos-kernels/pull/1225)
- Use KOKKOS_IMPL_DO_NOT_USE_PRINTF instead of printf in kernels [\#1162](https://github.com/kokkos/kokkos-kernels/pull/1162)

#### HIP:
- Work around hipcc size_t/int division with remainder bug [\#1262](https://github.com/kokkos/kokkos-kernels/pull/1262)

#### Other Improvements:
- Replace std::abs with ArithTraits::abs [\#1312](https://github.com/kokkos/kokkos-kernels/pull/1312)
- Batched/dense: Add Gemm_DblBuf LayoutLeft operator [\#1299](https://github.com/kokkos/kokkos-kernels/pull/1299)
- KokkosKernels: adding variable that returns version as a single number [\#1295](https://github.com/kokkos/kokkos-kernels/pull/1295)
- Add KOKKOSKERNELS_FORCE_SIMD macro (Fix #1040) [\#1290](https://github.com/kokkos/kokkos-kernels/pull/1290)
- Rename KOKKOS_IF_{HOST,DEVICE} -> KOKKOS_IF_ON_{HOST,DEVICE} [\#1278](https://github.com/kokkos/kokkos-kernels/pull/1278)
- Algo::Level{2,3}::Blocked::mb() [\#1265](https://github.com/kokkos/kokkos-kernels/pull/1265)
- Batched: Use SerialOpt2 for 33 to 39 square matrices [\#1261](https://github.com/kokkos/kokkos-kernels/pull/1261)
- Prune extra dependencies [\#1241](https://github.com/kokkos/kokkos-kernels/pull/1241)
- Improve double buffering batched gemm perf for matrix sizes >64x64 [\#1239](https://github.com/kokkos/kokkos-kernels/pull/1239)
- Improve graph color perf test [\#1229](https://github.com/kokkos/kokkos-kernels/pull/1229)
- Add custom implementation for strcasecmp [\#1227](https://github.com/kokkos/kokkos-kernels/pull/1227)
- Replace __restrict__ with KOKKOS_RESTRICT [\#1223](https://github.com/kokkos/kokkos-kernels/pull/1223)
- Replace array reductions in BLAS-1 MV reductions [\#1204](https://github.com/kokkos/kokkos-kernels/pull/1204)
- Update MIS-2 and aggregation [\#1143](https://github.com/kokkos/kokkos-kernels/pull/1143)
- perf_test/blas/blas3: Update SHAs for benchmarking [\#1139](https://github.com/kokkos/kokkos-kernels/pull/1139)

### Implemented enhancements BuildSystem
- Bump ROCm version 4.2 -> 4.5 in nightly Jenkins CI build [\#1279](https://github.com/kokkos/kokkos-kernels/pull/1279)
- scripts/cm_test_all_sandia: Add A64FX ci checks [\#1276](https://github.com/kokkos/kokkos-kernels/pull/1276)
- github/workflows: Add osx CI [\#1254](https://github.com/kokkos/kokkos-kernels/pull/1254)
- Update SYCL compiler version in CI [\#1247](https://github.com/kokkos/kokkos-kernels/pull/1247)
- Do not set Kokkos variables when exporting CMake configuration [\#1236](https://github.com/kokkos/kokkos-kernels/pull/1236)
- Add nightly CI check for SYCL [\#1190](https://github.com/kokkos/kokkos-kernels/pull/1190)
- Update cmake minimum version to 3.16 [\#866](https://github.com/kokkos/kokkos-kernels/pull/866)

### Incompatibilities:
- Kokkos::Impl: removing a few more instances of throw_runtime_exception [\#1320](https://github.com/kokkos/kokkos-kernels/pull/1320)
- Remove Kokkos::Impl::throw_runtime_exception from Kokkos Kernels [\#1294](https://github.com/kokkos/kokkos-kernels/pull/1294)
- Remove unused memory space utility [\#1283](https://github.com/kokkos/kokkos-kernels/pull/1283)
- Clean up Kokkos header includes [\#1282](https://github.com/kokkos/kokkos-kernels/pull/1282)
- Remove private Kokkos header include (Cuda/Kokkos_Cuda_Half.hpp) [\#1281](https://github.com/kokkos/kokkos-kernels/pull/1281)
- Avoid using #ifdef KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_* macro guards [\#1266](https://github.com/kokkos/kokkos-kernels/pull/1266)
- Rename enumerator Impl::Exec_{PTHREADS -> THREADS} [\#1253](https://github.com/kokkos/kokkos-kernels/pull/1253)
- Remove all references to the Kokkos QThreads backend [\#1238](https://github.com/kokkos/kokkos-kernels/pull/1238)
- Replace more occurences of Kokkos::Impl::is_view [\#1234](https://github.com/kokkos/kokkos-kernels/pull/1234)
- Do not use Kokkos::Impl::is_view [\#1214](https://github.com/kokkos/kokkos-kernels/pull/1214)
- Replace Kokkos::Impl::if_c -> std::conditional [\#1213](https://github.com/kokkos/kokkos-kernels/pull/1213)

### Bug Fixes:
- Fix bug in spmv_mv_bsrmatrix() for Ampere GPU arch [\#1315](https://github.com/kokkos/kokkos-kernels/pull/1315)
- Fix std::abs calls for rocBLAS/rocSparse [\#1310](https://github.com/kokkos/kokkos-kernels/pull/1310)
- cast literal 0 to fragment scalar type [\#1307](https://github.com/kokkos/kokkos-kernels/pull/1307)
- Fix 1303: maintain correct #cols on A in twostage [\#1304](https://github.com/kokkos/kokkos-kernels/pull/1304)
- Add dimension checking to generic spmv interface [\#1301](https://github.com/kokkos/kokkos-kernels/pull/1301)
- Add missing barriers to TeamGMRES, fix vector len [\#1285](https://github.com/kokkos/kokkos-kernels/pull/1285)
- Examples: fixing some issues related to type checking [\#1267](https://github.com/kokkos/kokkos-kernels/pull/1267)
- Restrict BsrMatrix specialization for AMPERE and VOLTA to CUDA [\#1242](https://github.com/kokkos/kokkos-kernels/pull/1242)
- Fix compilation errors for multi-vectors in kk_print_1Dview() [\#1231](https://github.com/kokkos/kokkos-kernels/pull/1231)
- src/batched: Fixes #1224 [\#1226](https://github.com/kokkos/kokkos-kernels/pull/1226)
- Fix SpGEMM crashing on empty rows [\#1220](https://github.com/kokkos/kokkos-kernels/pull/1220)
- Fix issue #1212 [\#1218](https://github.com/kokkos/kokkos-kernels/pull/1218)
- example/gmres: Specify half_t namespace [\#1208](https://github.com/kokkos/kokkos-kernels/pull/1208)
- Check that ordinal types are signed [\#1188](https://github.com/kokkos/kokkos-kernels/pull/1188)
- Fixing a couple of small issue with tensor core spmv [\#1185](https://github.com/kokkos/kokkos-kernels/pull/1185)
- Fix #threads setting in pcg for OpenMP [\#1182](https://github.com/kokkos/kokkos-kernels/pull/1182)
- SpMV: fix catch all case to avoid compiler warnings [\#1179](https://github.com/kokkos/kokkos-kernels/pull/1179)
- using namespace should be scoped to prevent name clashes [\#1177](https://github.com/kokkos/kokkos-kernels/pull/1177)
- using namespace should be scoped to prevent name clashes, see issue #1170 [\#1171](https://github.com/kokkos/kokkos-kernels/pull/1171)
- Fix bug with mkl impl of spgemm [\#1167](https://github.com/kokkos/kokkos-kernels/pull/1167)
- Add missing $ to KOKKOS_HAS_TRILINOS in sparse_sptrsv_superlu check [\#1160](https://github.com/kokkos/kokkos-kernels/pull/1160)
- Small fixes to spgemm, and plug gaps in testing [\#1159](https://github.com/kokkos/kokkos-kernels/pull/1159)
- SpMV: mismatch in #ifdef check and kernel specialization [\#1151](https://github.com/kokkos/kokkos-kernels/pull/1151)
- Fix values dimension for block sparse matrices [\#1147](https://github.com/kokkos/kokkos-kernels/pull/1147)

## [3.5.00](https://github.com/kokkos/kokkos-kernels/tree/3.5.00) (2021-10-19)
[Full Changelog](https://github.com/kokkos/kokkos-kernels/compare/3.4.01...3.5.00)

Expand Down
Loading

0 comments on commit e09389a

Please sign in to comment.