Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RDO support to the ASTC encoder. #588

Open
MarkCallow opened this issue Jun 10, 2022 · 11 comments
Open

Add RDO support to the ASTC encoder. #588

MarkCallow opened this issue Jun 10, 2022 · 11 comments

Comments

@MarkCallow
Copy link
Collaborator

The simplest way is to take the output of the ASTC encoder and pass it through the ERT (Entropy Reduction Transform) in this repo:

https://github.com/richgel999/bc7enc_rdo

See this class:
https://github.com/richgel999/bc7enc_rdo/blob/master/ert.h

The ERT is format agnostic. It just cares about block bytes, the original block's RGBA pixels, and a block unpack function.

You pass it an array of encoded blocks, an array of original block pixels, some parameters, and a pointer to a block unpack function. It does the rest. We need to find a good ASTC block unpacker. (There may be one with astc-encoder repo that is included in this repo. I haven't confirmed.) The main problem is the ASTC specification is so complex that some block unpackers wouldn't always catch invalid blocks, which the ERT will sometimes create as it injects trial matches into blocks. These blocks would fail to be unpacked by some unpackers, but would be unpacked by others, which is an issue.

The ERT is really simple and not much code, so it should be a good place to start.

A possible improvement is to encode a texture to ASTC multiple times, and use that as the source into the ERT. Each encode can use a different encoding that achieves more or less the same error.

@solidpixel
Copy link
Contributor

solidpixel commented Aug 9, 2022

The main problem is the ASTC specification is so complex that some block unpackers wouldn't always catch invalid blocks

The astcenc block unpack should be standard-conformant - returning either max-magenta (unorm output) or NaN (float output) for any invalid encodings. It's a bug if it isn't, so if anyone finds cases where it's being too lenient please raise a bug upstream on the astcenc project.

@MarkCallow
Copy link
Collaborator Author

Thanks @solidpixel that is good to know. The person who gave me the information I wrote above had tried various block unpackers and was reporting his observations. Not sure if he tried the one in astcenc. When we implement RDO, we'll be sure to use that one.

@solidpixel
Copy link
Contributor

solidpixel commented Aug 12, 2022

A few issues were fixed during the development of the 3.x series, so there have definitely historically been cases where it's not been doing the right thing here, but for 4.x these have all been fixed (at least the ones we know about;).

@YunHsiao
Copy link

Here's one possible way to do it:
ARM-software/astc-encoder@4.7.0...YunHsiao:astc-encoder:rdo

@MarkCallow
Copy link
Collaborator Author

MarkCallow commented Mar 26, 2024

Here's one possible way to do it:
ARM-software/astc-encoder@4.7.0...YunHsiao:astc-encoder:rdo

Thank you @YunHsiao. From what I can see it looks good. Since there does not appear to be a PR to the ASTC encoder source yet, I will post my question here. ktxBasisParams has the following parameters for UASTC RDO:

    float uastcRDOQualityScalar;
        /*!< UASTC RDO quality scalar (lambda). Lower values yield higher
             quality/larger LZ compressed files, higher values yield lower
             quality/smaller LZ compressed files. A good range to try is [.2,4].
             Full range is [.001,50.0]. Default is 1.0.
         */
    ktx_uint32_t uastcRDODictSize;
        /*!< UASTC RDO dictionary size in bytes. Default is 4096. Lower
             values=faster, but give less compression. Range is [64,65536].
         */
    float uastcRDOMaxSmoothBlockErrorScale;
        /*!< UASTC RDO max smooth block error scale. Range is [1,300].
             Default is 10.0, 1.0 is disabled. Larger values suppress more
             artifacts (and allocate more bits) on smooth blocks.
         */
    float uastcRDOMaxSmoothBlockStdDev;
        /*!< UASTC RDO max smooth block standard deviation. Range is
             [.01,65536.0]. Default is 18.0. Larger values expand the range of
             blocks considered smooth.

Do any of these correspond to the rdo_level or rdo_lookback parameters you have? I thought uastcRDODictSize and rdo_lookback might be the same thing but the values are vastly different. Would the SmoothBlock parameters be any help in your RDO?

It would be great if ASTC and UASTC RDO had the same set of parameters so users only have to understand one set.

@YunHsiao
Copy link

@MarkCallow

Thanks for the feedback, rdo-lookback is just uastcRDODictSize divided by ASTC block bytes (16) and rdo-level is exactly the same as uastcRDOQualityScalar.

Afaics the parameters can be completely the same since they all got routed to the ERT solver, I'll see what I can do here.

As for PRs the implementation is considered in-progress and may still have some rough edges to be ironed out, we're not there yet ;)

@MarkCallow
Copy link
Collaborator Author

rdo-lookback is just uastcRDODictSize divided by ASTC block bytes (16)

So the difference is one is specified in bytes and the other in blocks, correct? I think I prefer blocks. Unfortunately libktx has long been released so changing uastcRDODictSize is difficult. Some source-code compatibility path is needed.

As for PRs the implementation is considered in-progress and may still have some rough edges to be ironed out, we're not there yet ;)

Please post something here when you open a PR.

Please add a "licence" to ert.[ch]. We run reuse which complains if there is nothing. I suggest copying in the public domain no-license text from Rich's GitHub repo. Mind you I am not sure there is a SPDX-License-Identifier for no-license.

@YunHsiao
Copy link

YunHsiao commented Mar 27, 2024

So the difference is one is specified in bytes and the other in blocks, correct? I think I prefer blocks.

Yes but for compatability's sake I'd rather go with bytes.

Please post something here when you open a PR.

I'll keep here updated.

Note that I'm not affiliated in any way with the astc encoder dev team and I'm doing this on my own, so no promises! But I would love to see how far I can go with this so I'll try to do what's right here.

@MarkCallow
Copy link
Collaborator Author

Yes but for compatability's sake I'd rather go with bytes.

Okay. But the KTX documentation should explain more clearly what this parameter is. I like your name of rdo-lookback. So the doc should say something like "Buffer for recording previous blocks for look back. Should be a multiple of the blocksize."

Note that I'm not affiliated in any way with the astc encoder dev team and I'm doing this on my own, so no promises! But I would love to see how far I can go with this so I'll try to do what's right here.

I'd figured that out but thanks for letting me know. I am excited that you are working on it. It should bring a good reduction in the zstd-deflated sizes of ASTC textures in KTX v2 files.

@YunHsiao
Copy link

Okay. But the KTX documentation should explain more clearly what this parameter is. I like your name of rdo-lookback. So the doc should say something like "Buffer for recording previous blocks for look back. Should be a multiple of the blocksize."

Yeah... There's definitely room for improvements. Although technically it does not really need to be multiples of the block size, just name it "dictionary size" is indeed a bit misleading.

I'd figured that out but thanks for letting me know. I am excited that you are working on it. It should bring a good reduction in the zstd-deflated sizes of ASTC textures in KTX v2 files.

Glad to help, let's see how this goes.

I've updated my branch with following changes:

  • Souce licenses should be conformant now
  • Updated CLI interface to mirror KTX2 params
  • Refactored ert solver a bit with a more general & performant interface (30% speed up)
  • Introduced a new path for computing block differences, which specializes for constant blocks and blocks with different partitions & planes, most of which comes from existing infrastructure (another 30% speed up)

@YunHsiao
Copy link

A draft PR has been opened here:
ARM-software/astc-encoder#460

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants