Version 2.10

@HansKristian-Work HansKristian-Work released this 11 Sep 13:53
· 704 commits to master since this release

This release rolls up a ton of bug fixes, game and driver workarounds, and other improvements.

Features

DirectStorage MetaCommands

We can now make use of NV_memory_decompression to implement
GPU accelerated GDeflate decompression in DirectStorage.
This is demonstrated to work in Ratchet & Clank: Rift Apart.

We also worked around an NV driver bug when using the fallback GDeflate shader.
The fallback works on RADV.
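
As a rough illustration of the two paths described above, the choice between driver decompression and the fallback shader could be modelled as a lookup in the device's extension list. This is a minimal sketch, not vkd3d-proton's actual code: the enum and both function names are hypothetical, and only the extension string VK_NV_memory_decompression comes from Vulkan itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration only; not taken from vkd3d-proton. */
enum gdeflate_path
{
    GDEFLATE_PATH_DRIVER_DECOMPRESSION, /* driver decompresses via VK_NV_memory_decompression */
    GDEFLATE_PATH_FALLBACK_SHADER       /* compute shader fallback */
};

/* Returns true if `name` appears in the list of enabled device extensions. */
bool has_extension(const char *const *extensions, size_t count, const char *name)
{
    size_t i;

    for (i = 0; i < count; i++)
    {
        if (strcmp(extensions[i], name) == 0)
            return true;
    }
    return false;
}

/* Prefer the driver path when the extension is exposed, otherwise fall back. */
enum gdeflate_path select_gdeflate_path(const char *const *extensions, size_t count)
{
    if (has_extension(extensions, count, "VK_NV_memory_decompression"))
        return GDEFLATE_PATH_DRIVER_DECOMPRESSION;
    return GDEFLATE_PATH_FALLBACK_SHADER;
}
```

Under this sketch, a driver that does not expose the extension always ends up on the fallback shader path. A real implementation would also have to query feature bits and account for driver workarounds, which are left out here.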

Enhanced Barriers

NOTE: This isn't all that well tested because, to our knowledge, no games ship with this yet.

Device generated commands for compute

With NV_device_generated_commands_compute we can efficiently implement
Starfield's use of ExecuteIndirect, which hammers multi-dispatch COMPUTE + root parameter changes.
Previously, we had to rely on a very slow workaround.

NOTE: This feature is currently only enabled on RADV due to driver issues.
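
To make "multi-dispatch COMPUTE + root parameter changes" concrete, here is a CPU-side model of what such an ExecuteIndirect call asks for: each command in the argument buffer updates a root constant and then dispatches. It is a sketch of the semantics only, and it is not how vkd3d-proton implements either the old workaround or the new path. The dispatch struct mirrors the three 32-bit group counts of D3D12_DISPATCH_ARGUMENTS. The command layout with a single root constant and every function and struct name are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as D3D12_DISPATCH_ARGUMENTS: three 32-bit thread group counts. */
struct dispatch_arguments
{
    uint32_t thread_group_count_x;
    uint32_t thread_group_count_y;
    uint32_t thread_group_count_z;
};

/* Hypothetical command layout: one root constant update, then a dispatch. */
struct indirect_command
{
    uint32_t root_constant;
    struct dispatch_arguments dispatch;
};

/* Hypothetical bookkeeping so the replay can be observed. */
struct replay_stats
{
    uint64_t total_thread_groups;
    uint32_t last_root_constant;
};

typedef void (*dispatch_fn)(uint32_t root_constant,
        const struct dispatch_arguments *args, void *user);

/* Example callback: records the root constant and counts thread groups. */
void record_dispatch(uint32_t root_constant,
        const struct dispatch_arguments *args, void *user)
{
    struct replay_stats *stats = user;

    stats->last_root_constant = root_constant;
    stats->total_thread_groups += (uint64_t)args->thread_group_count_x *
            args->thread_group_count_y * args->thread_group_count_z;
}

/* Walks the argument buffer one command at a time, `stride` bytes apart. */
void replay_indirect_commands(const void *argument_buffer, size_t stride,
        uint32_t command_count, dispatch_fn dispatch, void *user)
{
    const uint8_t *cursor = argument_buffer;
    uint32_t i;

    for (i = 0; i < command_count; i++)
    {
        struct indirect_command cmd;

        /* memcpy avoids alignment assumptions about the raw buffer. */
        memcpy(&cmd, cursor + i * stride, sizeof(cmd));
        dispatch(cmd.root_constant, &cmd.dispatch, user);
    }
}
```

In a real application the argument buffer lives in GPU memory and is typically written by the GPU, so a loop like this cannot simply run on the CPU. That is the gap a device generated commands extension is meant to close.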

Misc

  • Support Root Signature version 1.2
  • Implement Shader Model 6.7
    • Includes all SM 6.7 features like AdvancedTextureOps, WaveOpsIncludeHelperLanes
    • Caveat: The implementation is technically not Vulkan spec compliant, but it works fine on at least NV and RADV. For now it is opt-in, in case some game relies on it to work
  • Implement CreateSampler2
  • Expose inverted viewport / height feature
  • Implement RelaxedFormatCasting feature from Enhanced Barriers
  • Implement support for adjacency topologies
  • Support A8_UNORM format properly by using VK_KHR_maintenance5, allowing A8_UNORM UAVs to work correctly
  • Handle range checked index buffers correctly with VK_KHR_maintenance5

New extension use

  • VK_EXT_dynamic_rendering_unused_attachments
  • VK_KHR_maintenance5
  • VK_NV_device_generated_commands_compute

Performance

  • Batch acceleration structure builds. Massively improves build performance on at least RADV.
  • Massively improve ExecuteIndirect performance for COMPUTE + root parameter changes when VK_NV_device_generated_commands_compute is enabled.

Fixes

  • Fix root signature creation from DXIL library target (DXR) blobs
  • Fix some dual source blending PSO scenarios. Fixes Star Wars Battlefront II
  • Implement wave operations in pixel shaders more strictly according to D3D12 rules
  • Fix spurious hangs in Ashes of the Singularity when using shared fences and wait-before-signal
  • Fix PSO caching bug in mesh shaders. Fixes mesh shaders in Unreal Engine 5
  • Fix udiv remainder in DXBC, which fixes some Xenia bugs
  • Fix query heap tracking bug that was exposed by NV Streamline
  • Various DXIL -> SPIR-V fixes as usual
  • Rewrote descriptor set layouts to be more robust against application bugs
    • Motivated by Armored Core VI bug (see below)
    • Native D3D12 drivers are also robust against these application bugs :(

Workarounds

  • Work around bad ReBAR performance in Age of Wonders 4
  • Remove workaround for KHR_present_wait on NV 535+ drivers
  • Work around Starfield memory corruption issue where it does not correctly query for 4 KiB alignment
  • Disable ReBAR usage on Halo Infinite to work around very poor CPU performance
  • Work around Street Fighter 6 bug causing spurious GPU hangs
    • Also appears to have worked around GPU hangs in Resident Evil 2
  • Work around Armored Core VI bug causing GPU hang on Balteus fight in chapter 1
  • Work around "firefly" glitches in Resident Evil 4 caused by dubious min16float usage
  • Work around "firefly" glitches in Monster Hunter Rise caused by dubious shader requiring particular precise math
  • Work around Unreal Engine 5 breaking if mesh shaders are exposed, but not barycentrics
  • Work around NV driver bug with TIMESTAMP query heaps that could cause spurious GPU hangs
  • Work around broken CFG code generation in Xenia's DXBC emitter