[v4] Consider other RCTs #221

dwbuiten · 2020-07-10T16:28:01Z

In particular, the interesting ones are YCoCg-R[0] and one based on PLHaar[1].

YCoCg-R is well known, but may have not existed when FFV1 was originally written and chose JPEG2000-RCT.

PLHaar is interesting (at least to me) since, unlike JPEG2000-RCT and YCoCg-R. you can construct a RCT from it that, for an N-bit input, also has an N-bit output (as opposed to N+1 bits for the Co/Cg channels, for example):

int sum, diff;
plhaar_int(&sum, &diff, r, b, 1<<(bits-1));
int u = diff;
int tmp = sum;
plhaar_int(&sum, &diff, g, tmp, 1<<(bits-1));
int v = diff;
int y = sum;

Where plhaar_int is defined in [1] and y, u, and v are coded in the bitstream. This is particularily interesting for 16-bit content, where currently we require a 17-bit buffer.

CC: @michaelni - curious to hear your thoughts.

[0] https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/2008_ColorTransforms_MalvarSullivanSrinivasan.pdf
[1] Based on https://digital.library.unt.edu/ark:/67531/metadc1404896/m2/1/high_res_d/15014239.pdf

The text was updated successfully, but these errors were encountered:

retokromer · 2020-07-10T17:13:08Z

My preferred is YCoCg-R, but I’m not objective, because I regularly work in Y′CoCg for old additive colour systems (cinema from the 1920s and 1930s), which often don’t fit well in RGB. I will delve info PLHaar.

dwbuiten · 2020-07-10T17:26:29Z

My understanding of PLHaar from some tests is that it may lose some compression at lower bit depths, but work better at higher depths (this is just from some small tests - I would expect much larger testing before any is adopted.)

Aside: PLHaar has a problem, for example, on greyscale content, because it centres on 127.5, so your chroma (which should be all 128) is a mix of 127/128.

michaelni · 2020-07-11T22:39:00Z

I think the key questions are

how does it affect compression across a representative set of inputs
hoes does it affect speed across a representative set of inputs
are we missing some "better" choices. Has someone reviewed the current state of the art from other similar simple transforms to per image, per slice, or per pixel adapted transforms?

There is also the question of

what is the best compression achievable by modifying the RCT if one disregards speed.
This would give a bound to what can be gained in the best case with a transform considering speed and compression

dwbuiten · 2020-09-22T16:15:45Z

how does it affect compression across a representative set of inputs

I plan to run a lot of RCTs over various sets of RGB videos soon, and post the results here - stay tuned.

hoes does it affect speed across a representative set of inputs

I would expect most RCTs to be small (with some exceptions, see my reply to the last question)

are we missing some "better" choices. Has someone reviewed the current state of the art from other similar simple transforms to per image, per slice, or per pixel adapted transforms?

Yep, there has been a fair bit of research since j2k-rct, as you would expect.

what is the best compression achievable by modifying the RCT if one disregards speed.
This would give a bound to what can be gained in the best case with a transform considering speed and compression

This is interesting, specifically for RDLS modifed RCTs, where the lifting step in various RCTs is replaced by a denoised lifting step in order to. In the paper, they use a set of 11 weighted linear averaging filters, on two planes, and choose the one that minimizes codign cost, so it would require a little more work from the encoder. In their paper they take the memoryless entropy of the median predicted plane as an estimate of BPP, but I think we can do better both in cost estimation and filters. This also requires transmitting some sort of info on which filter was used in the bitstream - in the paper, they use two four-bit numbers (since there are 11 filters), meaning one byte. The rest on here, aside from RKLT, are all extremely simple operations (you can find some of their complexities listed in[1]).

I plan to test the following over a large set of images and videos:

The current J2K-RCT as it is in the spec. (requires N+1 bit chroma)
YCoCg-R (Paper: see original post) (requires N+1 bit chroma)
PLHaar (Paper: see original post) (does not expand dynamic range)
RKLT (based on converting the 3x3 KLT via [0]) (requires up to N+2 bits, and transmission of the matrix)
RDgDb & mRDgDb (Paper [1]) (Requires N+1 bits chroma for non-m variant only)
LDgDb & mLDgDb (Paper [1]) (Requries N+1 bits chroma for non-m variant only)
LDgEb & mLDgEb (Paper [1]) (Requries N+1 bits chroma for non-m variant only)
The RDLS variants of all the above (Paper [2]). Requires choosing filters (if any is chosen at all), and may expand chroma by 1 bit on some transforms.

I've already implemented the annoying parts of these, but it should be just a matter of finding time to modify the FFV1 encoder to test them. I could also be unaware of other developments in RCTs, of course - this is just what I've found after a little playing around.

[0] https://www.semanticscholar.org/paper/General-Reversible-Integer-Transform-Conversion-Pei-Ding/bf1c3c63cfca437ac697c2ed135cf5540df5a9da?p2df
[1] http://sun.aei.polsl.pl/~rstaros/papers/s2014-jvcir-AAM.pdf
[2] https://arxiv.org/abs/1508.06106

michaelni · 2020-09-30T13:28:11Z

@dwbuiten thats nice, thanks alot for working on this!

michaelni · 2024-09-24T22:57:11Z

I plan to test the following over a large set of images and videos:

The current J2K-RCT as it is in the spec. (requires N+1 bit chroma)

YCoCg-R (Paper: see original post) (requires N+1 bit chroma)

PLHaar (Paper: see original post) (does not expand dynamic range)

RKLT (based on converting the 3x3 KLT via [0]) (requires up to N+2 bits, and transmission of the matrix)

RDgDb & mRDgDb (Paper [1]) (Requires N+1 bits chroma for non-m variant only)

LDgDb & mLDgDb (Paper [1]) (Requries N+1 bits chroma for non-m variant only)

LDgEb & mLDgEb (Paper [1]) (Requries N+1 bits chroma for non-m variant only)

The RDLS variants of all the above (Paper [2]). Requires choosing filters (if any is chosen at all), and may expand chroma by 1 bit on some transforms.

I've already implemented the annoying parts of these, but it should be just a matter of finding time to modify the FFV1 encoder to test them. I could also be unaware of other developments in RCTs, of course - this is just what I've found after a little playing around.

How are these tests going ?
or you need more time ? ;)

michaelni added the v4 label Feb 28, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[v4] Consider other RCTs #221

[v4] Consider other RCTs #221

dwbuiten commented Jul 10, 2020

retokromer commented Jul 10, 2020

dwbuiten commented Jul 10, 2020

michaelni commented Jul 11, 2020

dwbuiten commented Sep 22, 2020 •

edited

Loading

michaelni commented Sep 30, 2020

michaelni commented Sep 24, 2024

[v4] Consider other RCTs #221

[v4] Consider other RCTs #221

Comments

dwbuiten commented Jul 10, 2020

retokromer commented Jul 10, 2020

dwbuiten commented Jul 10, 2020

michaelni commented Jul 11, 2020

dwbuiten commented Sep 22, 2020 • edited Loading

michaelni commented Sep 30, 2020

michaelni commented Sep 24, 2024

dwbuiten commented Sep 22, 2020 •

edited

Loading