Simd v6.1.143

ermig1979 released this 04 Nov 15:26
· 18 commits to master since this release

Algorithms

New features
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of class SynetConvolution16bNhwcDepthwise.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w4 for class SynetConvolution32fNhwcDepthwise.
  • AMX-BF16 kernel DepthwiseConvolution_k7p3d1s1w4 for class SynetMergedConvolution16b.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w6 for class SynetConvolution32fNhwcDepthwise.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w8 for class SynetConvolution32fNhwcDepthwise.
  • AMX-BF16 kernel DepthwiseConvolution_k7p3d1s1w6 for class SynetMergedConvolution16b.
  • AMX-BF16 kernel DepthwiseConvolution_k7p3d1s1w8 for class SynetMergedConvolution16b.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w4 for framework SynetMergedConvolution32f.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w6 for framework SynetMergedConvolution32f.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w8 for framework SynetMergedConvolution32f.
  • AMX-BF16 kernel DepthwiseConvolution_k5p2d1s1w8 for class SynetMergedConvolution16b.
  • Base implementation of function SimdYuv444pToRgbaV2.
Improving
  • AVX-512BW optimizations of function Convolution32fNhwcDepthwiseDefault.
  • AMX-BF16 optimizations of function DepthwiseConvolutionLargePad.
Bug fixing
  • Error in Base implementation of class SynetDeconvolution16bNhwcGemm.
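
For orientation, the kernel names listed under New features appear to encode the convolution parameters: k7 for a 7x7 kernel, p3 for padding 3, d1 for dilation 1, s1 for stride 1. That reading is an assumption, and the meaning of the w4/w6/w8 suffix is not stated in these notes. Under that assumption, the operation such kernels accelerate can be sketched as a plain scalar depthwise convolution over NHWC data. The function name `DepthwiseConvNhwcRef` and its signature are hypothetical and are not part of the Simd API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical scalar reference, not the Simd implementation.
// Layouts: src is [H][W][C], weight is [K][K][C], result is [dstH][dstW][C].
// Stride and dilation are fixed to 1; out-of-range taps are treated as zero padding.
std::vector<float> DepthwiseConvNhwcRef(const std::vector<float>& src,
    size_t H, size_t W, size_t C,
    const std::vector<float>& weight, size_t K, size_t pad)
{
    const size_t dstH = H + 2 * pad - K + 1;
    const size_t dstW = W + 2 * pad - K + 1;
    std::vector<float> dst(dstH * dstW * C, 0.0f);
    for (size_t dy = 0; dy < dstH; ++dy)
    {
        for (size_t dx = 0; dx < dstW; ++dx)
        {
            for (size_t c = 0; c < C; ++c)
            {
                float sum = 0.0f;
                for (size_t ky = 0; ky < K; ++ky)
                {
                    // Source row for this tap; may fall into the padded border.
                    const std::ptrdiff_t sy = (std::ptrdiff_t)(dy + ky) - (std::ptrdiff_t)pad;
                    if (sy < 0 || sy >= (std::ptrdiff_t)H)
                        continue;
                    for (size_t kx = 0; kx < K; ++kx)
                    {
                        const std::ptrdiff_t sx = (std::ptrdiff_t)(dx + kx) - (std::ptrdiff_t)pad;
                        if (sx < 0 || sx >= (std::ptrdiff_t)W)
                            continue;
                        // Each channel is convolved with its own filter (depthwise).
                        sum += src[((size_t)sy * W + (size_t)sx) * C + c] *
                               weight[(ky * K + kx) * C + c];
                    }
                }
                dst[(dy * dstW + dx) * C + c] = sum;
            }
        }
    }
    return dst;
}
```

With K = 7 and pad = 3 the output has the same height and width as the input, which is the case the k7p3d1s1 names seem to describe.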

Test framework

New features
  • Tests for verifying functionality of function SimdYuv444pToRgbaV2.
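
As a rough illustration of the kind of scalar reference such a test can compare against, the sketch below converts planar YUV 4:4:4 to interleaved RGBA. The notes do not give the signature or the color matrix of SimdYuv444pToRgbaV2, so everything here is an assumption: the name `Yuv444pToRgbaRef`, the parameter list, and the BT.601 limited-range coefficients are chosen for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Rounds to nearest and clamps to the 0..255 range.
static uint8_t ClampToByte(float value)
{
    return (uint8_t)std::min(255.0f, std::max(0.0f, value + 0.5f));
}

// Hypothetical scalar reference, not the Simd implementation.
// Assumes BT.601 limited range: Y in [16, 235], U and V centered at 128.
void Yuv444pToRgbaRef(const uint8_t* y, size_t yStride,
    const uint8_t* u, size_t uStride,
    const uint8_t* v, size_t vStride,
    size_t width, size_t height,
    uint8_t* rgba, size_t rgbaStride, uint8_t alpha)
{
    for (size_t row = 0; row < height; ++row)
    {
        for (size_t col = 0; col < width; ++col)
        {
            // In 4:4:4 every pixel has its own Y, U and V sample.
            const float Y = 1.164f * ((float)y[col] - 16.0f);
            const float U = (float)u[col] - 128.0f;
            const float V = (float)v[col] - 128.0f;
            uint8_t* pixel = rgba + col * 4;
            pixel[0] = ClampToByte(Y + 1.596f * V);              // red
            pixel[1] = ClampToByte(Y - 0.813f * V - 0.391f * U); // green
            pixel[2] = ClampToByte(Y + 2.018f * U);              // blue
            pixel[3] = alpha;
        }
        // Strides are in bytes and may exceed the row width.
        y += yStride;
        u += uStride;
        v += vStride;
        rgba += rgbaStride;
    }
}
```

A test against an optimized implementation would typically run both on the same input planes and compare the output buffers, allowing a small per-channel difference if the optimized path uses fixed-point arithmetic.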