stdgpu 1.3.0

@stotko stotko released this 02 Jun 08:37
· 257 commits to master since this release

This release of stdgpu introduces a new experimental HIP backend that adds support for AMD GPUs, significantly improves the API documentation, adds many new code examples, integrates clang-tidy and cppcheck into the CI, and fixes a large number of warnings to enable clean builds at very high warning levels.

New Features & Enhancements

  • General: Add experimental HIP backend #121 #143
  • General: Add support for Compute Capability 3.0 in CUDA backend #153
  • General: Add clang-tidy support #129 #138
  • General: Add cppcheck support #149
  • General: Add CI job for documentation creation #109
  • General: Deprecate misleading/obsolete cmake options #103
  • atomic: Make all operations follow sequentially consistent ordering #176
  • atomic: Add backend documentation of template parameter #177
  • atomic: Clean up backend-specific internals of CUDA backend #152
  • bit: Add ceil2 and floor2 functions #105
  • bit: Rename functions to match most recent draft of C++20 #110
  • bitset: Remove dependency on cstdlib #145
  • cstddef: Hide initializers for clearer documentation #166
  • cstdlib: Deprecate sizedivPow2 #161
  • limits: Add implementation for non-specialized template and documentation for every type #167
  • memory: Add construct_at function #95
  • memory: Clean up global variables and simplify allocate/deallocate logic #104
  • memory: Improve construct* and destroy* unit tests #175
  • platform: Add automatic dispatching of backend-specific definitions #119
  • platform: Change detection of device code for OpenMP #174
  • ranges: Add size() and empty() functions as well as additional constructors #122
  • ranges: Add index64_t constructor and deprecate index_t version #102
  • unordered_map,unordered_set: Improve robustness of Fibonacci Hashing #111
  • README,doc: Significantly improve introduction, examples, and documentation #114 #116 #162 #165 #170 #171 #172 #181
  • doc: Group all class and function definitions into modules #169
  • doc: Clean up unnecessary documentation #168
  • examples: Add many new examples and improve existing ones #173
  • test: Disable unused GMock #160
  • cmake: Make installable package relocatable #180
  • cmake: Add option to treat warnings as errors #108
  • cmake: Generate compile flags more robustly #128
  • cmake: Simplify architecture flag generation in CUDA backend #154
  • cmake: Install backend-specific find modules in subdirectories #117
  • cmake: Update support for CMake 3.17+ #123
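
The ceil2 and floor2 functions listed under "bit" above share their names with functions from the C++20 draft of that time. As a rough illustration of those draft semantics (smallest power of two not less than n, and largest power of two not greater than n), a minimal host-only sketch could look as follows. The names floor2_sketch, ceil2_sketch, and bit_sketch_usage are hypothetical and not part of stdgpu, and the actual stdgpu implementation, signatures, and preconditions may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not stdgpu's API): largest power of two <= n, or 0 if n == 0.
constexpr std::uint32_t floor2_sketch(std::uint32_t n)
{
    std::uint32_t result = (n == 0u) ? 0u : 1u;
    // Double the candidate while the doubled value still does not exceed n.
    // Comparing against n / 2 avoids overflow for large n.
    while (result != 0u && result <= n / 2u)
    {
        result *= 2u;
    }
    return result;
}

// Hypothetical helper (not stdgpu's API): smallest power of two >= n.
// Assumed precondition: the result is representable, i.e. n <= 2^31.
constexpr std::uint32_t ceil2_sketch(std::uint32_t n)
{
    if (n <= 1u)
    {
        return 1u;
    }
    const std::uint32_t lower = floor2_sketch(n);
    // n is already a power of two exactly when it equals its own floor.
    return (lower == n) ? n : lower * 2u;
}

// Small usage example with values that are easy to verify by hand.
inline void bit_sketch_usage()
{
    assert(floor2_sketch(5u) == 4u);
    assert(ceil2_sketch(5u) == 8u);
    assert(floor2_sketch(8u) == 8u);
    assert(ceil2_sketch(8u) == 8u);
}
```

The loop-based version is chosen for readability only. Production code would more likely rely on bit tricks or compiler intrinsics.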

Bug Fixes

  • General: Increase warning level and fix conversion and float-equal warnings #98
  • General: Increase MSVC warning level and fix related warnings #107 #156
  • General: Fix Clang warnings #91 #147
  • General: Fix format warnings #101
  • General: Fix sign-conversion warnings #100
  • General: Fix shadow warnings #90
  • General: Fix numerous clang-tidy warnings #130 #131 #132 #133 #134 #135 #136 #137 #140 #141
  • examples: Pass containers by reference for OpenMP backend #182
  • src,test: Improve consistency and clean up includes #118
  • test: Fix missing namespace for uint8_t #142
  • test: Pass containers by const reference to functors #158
  • test: Fix double-promotion warnings in backend code #151
  • test: Fix conversion warning and missing namespace #124
  • test: Fix missing include in device_info cpp files #120
  • bit: Fix potential negative bit shift in unit test #159
  • bit,bitset: Fix missing post-conditions and remove unnecessary dependency #112
  • bitset: Fix deprecated-copy warning #144
  • compiler: Fix NVCC detection #155
  • compiler,platform: Use unique numbers as internal macro definitions #139
  • contract: Enforce user semicolon for all possible expansions #148 #150
  • limits: Suppress long double device code warning with MSVC #178
  • platform: Move STDGPU_HAS_CXX_17 to compiler #146
  • ranges: Fix compilation with 64-bit index type #157
  • ranges: Fix compilation error with select functor #125
  • deque,vector: Fix overflow in test #99
  • doc: Fix several minor documentation bugs #164
  • scripts: Use released thrust version #126
  • cmake: Fix error with unspecified build type #179
  • cmake: Fix parsing of thrust version #163
  • cmake: Work around bug in imported rocthrust target name #127
  • cmake: Properly handle CUDA toolkit dependency #96
  • cmake: Add missing dependency checks in package config #94
  • cmake: Fix selection of header files for installation #93
  • cmake: Fix inconsistent thrust detection across the backends #92
  • CI: Fix codecov task #113
  • CI: Fix potentially missing OpenMP runtime package #106
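
The "contract" entry above mentions enforcing a user-written semicolon for all macro expansions. How stdgpu does this is not stated here, so the following is only a sketch of the common idiom for this kind of fix: every expansion, including the disabled one, is a single statement that is incomplete without a trailing semicolon. The macro names SKETCH_EXPECTS and SKETCH_EXPECTS_DISABLED, the counter failed_checks, and the function checked_divide are all hypothetical.

```cpp
#include <cassert>

// Hypothetical bookkeeping for this sketch: counts violated conditions.
static int failed_checks = 0;

// Enabled variant: the do/while(false) wrapper turns the body into one
// statement that does not compile unless the caller appends ';'.
#define SKETCH_EXPECTS(condition)      \
    do                                 \
    {                                  \
        if (!(condition))              \
        {                              \
            ++failed_checks;           \
        }                              \
    } while (false)

// Disabled variant: expands to a no-op expression instead of nothing,
// so the caller still has to write ';' and both variants behave alike.
#define SKETCH_EXPECTS_DISABLED(condition) static_cast<void>(0)

// Hypothetical function using both variants like ordinary statements.
inline int checked_divide(int numerator, int denominator)
{
    SKETCH_EXPECTS(denominator != 0);
    SKETCH_EXPECTS_DISABLED(numerator >= 0);
    if (denominator == 0)
    {
        return 0;
    }
    return numerator / denominator;
}
```

With this pattern, omitting the semicolon after either macro is a compile error, so code cannot silently depend on which variant is active.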

Deprecated Features

  • bit: ispow2(), log2pow2(), mod2()
  • cstdlib: sizedivPow2(std::size_t, std::size_t), sizediv_t
  • memory: safe_pinned_host_allocator, default_allocator_traits
  • mutex: mutex_ref
  • ranges: device_range(T*, index_t), host_range(T*, index_t), non-const begin() and end() member functions
  • unordered_map,unordered_set: createDeviceObject(index_t, index_t), excess_count(), total_count()
  • CMake Configuration Options: STDGPU_ENABLE_AUXILIARY_ARRAY_WARNING, STDGPU_ENABLE_MANAGED_ARRAY_WARNING, STDGPU_USE_FAST_DESTROY, STDGPU_USE_FIBONACCI_HASHING
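
For readers unfamiliar with the Fibonacci Hashing named in the unordered_map/unordered_set entry and in the deprecated STDGPU_USE_FIBONACCI_HASHING option, the term usually refers to multiplicative hashing with a constant derived from the golden ratio. The sketch below shows that general technique for a table with a power-of-two number of buckets. The function name fibonacci_bucket_sketch and its parameters are hypothetical, and stdgpu's internal implementation may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not stdgpu's API): maps a 64-bit hash value to a
// bucket index in [0, 2^bucket_bits).
// Assumed precondition: 1 <= bucket_bits <= 63.
inline std::uint64_t fibonacci_bucket_sketch(std::uint64_t hash, unsigned bucket_bits)
{
    // 2^64 divided by the golden ratio, as an odd 64-bit integer.
    const std::uint64_t multiplier = 11400714819323198485ull;

    // The multiplication wraps modulo 2^64 and mixes the low input bits
    // into the high output bits. The shift keeps only the top bucket_bits
    // bits, which serve as the bucket index.
    return (hash * multiplier) >> (64u - bucket_bits);
}

// Usage example: neighbouring hash values land in different buckets
// of a table with 2^10 = 1024 buckets.
inline void fibonacci_sketch_usage()
{
    const std::uint64_t first = fibonacci_bucket_sketch(1u, 10u);
    const std::uint64_t second = fibonacci_bucket_sketch(2u, 10u);
    assert(first < 1024u);
    assert(second < 1024u);
    assert(first != second);
}
```

Compared with taking the hash modulo the table size, this approach uses all input bits and avoids an integer division, at the cost of requiring a power-of-two bucket count.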