Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RELEASE] rmm v24.10 #1687

Merged
merged 36 commits into from
Oct 9, 2024
Merged

[RELEASE] rmm v24.10 #1687

merged 36 commits into from
Oct 9, 2024

Conversation

raydouglass
Copy link
Member

❄️ Code freeze for branch-24.10 and v24.10 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-24.10 until release (merging of this PR).

What is the purpose of this PR?

  • Update documentation
  • Allow testing for the new release
  • Enable a means to merge branch-24.10 into main for the release

raydouglass and others added 30 commits July 19, 2024 14:50
Forward-merge branch-24.08 into branch-24.10
Forward-merge branch-24.08 into branch-24.10
Forward-merge branch-24.08 into branch-24.10
Forward-merge branch-24.08 into branch-24.10
Forward-merge branch-24.08 into branch-24.10
Instead of installing into a hard-coded `include` directory, install to `${CMAKE_INSTALL_INCLUDEDIR}`, which defaults to `include`.

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - Bradley Dice (https://github.com/bdice)

URL: #1633
Forward-merge branch-24.08 into branch-24.10
This PR deprecates adaptor factory functions, per issue #1616.

After some deprecation cycle, these functions can be removed in a later release.

Authors:
  - Bradley Dice (https://github.com/bdice)
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Paul Mattione (https://github.com/pmattione-nvidia)

URL: #1626
`cmake.minimum-version` has been deprecated since `scikit-build-core` 0.8, and is now causing conflicts in 0.10 due to its attempts to auto-detect `cmake.version` from `CMakeLists.txt`. Bump the minimum `scikit-build-core` to 0.10 and use the suggested `cmake.version`.

Contributes to rapidsai/build-planning#58

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - James Lamb (https://github.com/jameslamb)
  - https://github.com/jakirkham

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - https://github.com/jakirkham

URL: #1637
A few small tweaks to `update-version.sh` for alignment across RAPIDS.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - James Lamb (https://github.com/jameslamb)

URL: #1640
This PR updates pre-commit hooks to the latest versions that are supported without causing style check errors.

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #1643
Fixes #1645

Contributes to rapidsai/build-planning#33

Similar to rapidsai/cudf#15982

Proposes more tightly controlling the visibility of symbols in the shared libraries produces for the `rmm` Python library, via the following:

* compiling with `-fvisibility=hidden` by default
* marking intended-to-be-public parts of `rmm` *(everything in the `rmm::` namespace)* with `__attribute__((visibility("default")))`

## Benefits of this change

Reduces the risk of symbol conflicts when `rmm` is used alongside other libraries. For example, see this case in `cudf` where the `spdlog::` symbols in `rmm` are conflicting with the `spdlog::` symbols in `nvcomp`: rapidsai/cudf#15483 (comment)

Reduces library size by a bit (around 0.3 MB uncompressed), by reducing the size of symbol tables in DSOs.

## Notes for Reviewers

This is at the very edge of my C++ knowledge, apologies in advance if I've missed something obvious 😬 

#

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Mark Harris (https://github.com/harrism)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #1644
This PR removes the NumPy<2 pin which is expected to work for
RAPIDS projects once CuPy 13.3.0 is released (CuPy 13.2.0 had
some issues preventing the use with NumPy 2).

Authors:
  - Sebastian Berg (https://github.com/seberg)

Approvers:
  - https://github.com/jakirkham

URL: #1650
Contributes to rapidsai/build-planning#88

Finishes the work of dropping Python 3.9 support.

This project stopped building / testing against Python 3.9 as of rapidsai/shared-workflows#235.
This PR updates configuration and docs to reflect that.

## Notes for Reviewers

### How I tested this

Checked that there were no remaining uses like this:

```shell
git grep -E '3\.9'
git grep '39'
git grep 'py39'
```

And similar for variations on Python 3.8 (to catch things that were missed the last time this was done).

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - https://github.com/jakirkham

URL: #1659
Mostly identical to gh-1606, which allows the testing of oldest/latest dependencies.  What oldest/latest dependencies means exactly has to be set by each project in their `dependencies.yaml` file for the test env.

See rapidsai/shared-workflows#228 for the workflow changes.

This is part of preparing for 2.0 (which should just work, but in the end no need to include it in the same PR).

*Modifications from draft PR:*
- I renamed it to `oldest`, as suggested by Matthew.
- Noticed that the name is different between wheel and conda workflows, so modified both to include all matrix information.

(Draft, since the shared workflow should be pushed in first to revert the branch renaming probably.  Need to test that the wheel workflow is correct.)

Authors:
  - Sebastian Berg (https://github.com/seberg)
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - https://github.com/jakirkham

URL: #1613
While investigating cuml benchmarks, I found an issue with the current `system_memory_resource` that causes segfault. Roughly it's in code like this:
```cuda
void foo(...) {
  rmm::device_uvector<T> tmp(bufferSize, stream);
  // launch cuda kernels making use of tmp
}
```
When the function returns, the `device_uvector` would go out of scope and get deleted, while the cuda kernel might still be in flight. With `cudaFree`, the CUDA runtime would perform implicit synchronization to make sure the kernel finishes before actually freeing the memory, but with SAM we don't have that guarantee, thus causing use-after-free errors.

This is a rather simple fix. In the future we may want to use CUDA events to make this less blocking.

Authors:
  - Rong Ou (https://github.com/rongou)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Lawrence Mitchell (https://github.com/wence-)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #1655
This PR updates rapidsai/pre-commit-hooks to the version 0.4.0.

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #1663
Fixes #1597

Adds new` get_per_device_resource_ref()`, `set_per_device_resource_ref()` and `current_device` versions of these, intended to replace `get_per_device_resource()`. The new functions deal in `device_async_resource_ref`, while the old functions deal in `device_memory_resource` pointers. Tests are updated to use the new functions.
  
Note that I have also added `reset_per_device_resource_ref()` and `reset_current_device_resource_ref()` which are necessary because previously we implemented the resetting behavior by passing `nullptr` to `set_current_device_resource()`, which doesn't work with `resource_ref` because it can't refer to `nullptr` (no such thing as a null reference).

Updates tests to cover the new functionality, and re-enables tests of `set_current_device_resource_ref()` which were disabled in #1589.

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Rong Ou (https://github.com/rongou)
  - Lawrence Mitchell (https://github.com/wence-)

URL: #1598
Contributes to rapidsai/build-planning#40

This PR adds support for Python 3.12.

## Notes for Reviewers

This is part of ongoing work to add Python 3.12 support across RAPIDS.
It temporarily introduces a build/test matrix including Python 3.12, from rapidsai/shared-workflows#213.

A follow-up PR will revert back to pointing at the `branch-24.10` branch of `shared-workflows` once all
RAPIDS repos have added Python 3.12 support.

### This will fail until all dependencies have been updates to Python 3.12

CI here is expected to fail until all of this project's upstream dependencies support Python 3.12.

This can be merged whenever all CI jobs are passing.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #1666
…bility (#1653)

In #833, we gave `rmm::mr::detail::get_map` default visibility. However, there are a number of other functions that return static references that should also have this visibility so that the static reference is unique across multiple DSOs.

- Closes #1651

Authors:
  - Lawrence Mitchell (https://github.com/wence-)
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Mark Harris (https://github.com/harrism)

URL: #1653
The `logging_resource_adaptor` has a friend declaration (which is not a definition) preceded by an attribute, whereas C++ standard requires that such [declaration must be a definition](https://eel.is/c++draft/dcl.attr.grammar#5). gcc 11.4.0 used in the dev container does not correctly identify this, but gcc 12.1 and newer are able to, and will report a compile error (converted from a warning) when building RMM:

```
an attribute that appertains to a friend declaration that is not a definition is ignored
```
This simple PR removes the friend declaration from the `logging_resource_adaptor` class. This is valid since in recent releases of RMM the friended function no longer accesses any private (or protected) data or methods of `logging_resource_adaptor`.

closes #1668

Authors:
  - Tianyu Liu (https://github.com/kingcrimsontianyu)

Approvers:
  - Rong Ou (https://github.com/rongou)
  - Bradley Dice (https://github.com/bdice)
  - Mark Harris (https://github.com/harrism)

URL: #1669
Audit the existing MR tests and serialize those that make large allocations (specifically, a pool with 90% of the available device memory). This also allows us to remove serialization from some of the tests which don't make large allocations.

- Closes #1671

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Mark Harris (https://github.com/harrism)

URL: #1672
Closes #173

Authors:
  - Matthew Murray (https://github.com/Matt711)
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Mark Harris (https://github.com/harrism)

URL: #1670
…ce_resource_ref() (#1661)

Closes #1660.

This adds a constructor to each MR adaptor to take a resource_ref rather than an `Upstream*`. It also updates RMM to use `get_current_device_resource_ref()` everywhere: in containers, in tests, in adaptors, Thrust allocator, polymorphic allocator, execution_policy, etc.

Importantly, this PR also modifies `set_current_device_resource()` to basically call `set_current_device_resource_ref()`. This is necessary, because while RMM C++ uses `get_current_device_resource_ref()` everywhere, the Python API still uses the raw pointer API `set_current_device_resource()`. So we need the latter to update the state for the former.  This is a temporary bootstrap to help with the refactoring.

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Michael Schellenberger Costa (https://github.com/miscco)
  - Lawrence Mitchell (https://github.com/wence-)
  - Rong Ou (https://github.com/rongou)
  - Bradley Dice (https://github.com/bdice)

URL: #1661
miscco and others added 5 commits September 10, 2024 08:55
This has been found breaking CCCL CI when building cuDF

Authors:
  - Michael Schellenberger Costa (https://github.com/miscco)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Mark Harris (https://github.com/harrism)

URL: #1677
Recommending `miniforge` for conda install in installation docs.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Mark Harris (https://github.com/harrism)

URL: #1681
* Update fmt (to 11.0.2) and spdlog (to 1.14.1).

* simplify get_spdlog

* copyright

* Apply suggestions from code review

Co-authored-by: Bradley Dice <[email protected]>

* Apply suggestions from code review

* test with changes from rapidsai/rapids-cmake@d7671a3

* Update cmake/thirdparty/get_spdlog.cmake

* move rapids-cmake overrides [skip ci]

* try reverting get_spdlog export changes [skip ci]

* more fiddling with export sets [skip ci]

* more exporting [skip ci]

* more export set fiddling [skip ci]

* more [skip ci]

* exports [skip ci]

* run a build

* restore tests

* branch references

* remove testing-only changes [skip ci]

---------

Co-authored-by: Bradley Dice <[email protected]>
@raydouglass raydouglass requested review from a team as code owners September 27, 2024 14:35
@github-actions github-actions bot added CMake Python Related to RMM Python API conda cpp Pertains to C++ code ci labels Sep 27, 2024
@raydouglass raydouglass merged commit d403b76 into main Oct 9, 2024
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci CMake conda cpp Pertains to C++ code Python Related to RMM Python API
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.