Tweak lossy modular #3575

jonsneyers · 2024-05-13T10:02:29Z

(builds on top of #3563)

Better aligns lossy modular quality with lossy VarDCT quality. This also tweaks the quantization choices in lossy modular so that B quantization is actually applied to B, instead of X quantization being used for both X and B (which is what it was effectively doing). The allocation of bits between luma (Y) and chroma (X, B) is tweaked as well. Generation loss also improves slightly.

This alignment was needed because lossy modular and lossy VarDCT were producing quite divergent results for the same distance setting.

Before

These are lossy modular results for a range of distance settings:

Encoding       kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
-----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0.1:m       53140 41104085    6.1879922   1.418   9.494   0.30245872  94.98616634  54.18   0.12241306  0.757491042333   6.188      0
jxl:d0.25:m      53140 26531556    3.9941787   1.836  10.823   0.58799230  92.91519850  48.63   0.25248080  1.008453426525   3.994      0
jxl:d0.5:m       53140 18560100    2.7941202   2.136  12.386   0.97323717  89.90044713  44.91   0.41017167  1.146068966366   2.868      0
jxl:d1:m         53140 12578622    1.8936418   2.408  14.741   1.53306369  85.39651704  41.53   0.63845833  1.209011402375   2.815      0
jxl:d1.5:m       53140  9896064    1.4897976   2.500  16.109   1.94179268  81.78122165  39.68   0.81156088  1.209061451889   2.833      0
jxl:d2:m         53140  8257728    1.2431552   2.700  16.385   2.32909668  78.59395966  38.38   0.96437651  1.198869672755   2.834      0
jxl:d2.5:m       53140  7165938    1.0787923   2.737  17.683   2.64266847  75.72665512  37.40   1.09564058  1.181968573612   2.796      0
jxl:d3:m         53140  6379569    0.9604088   2.671  18.399   2.96591632  73.09188953  36.62   1.21856073  1.170316404959   2.810      0
jxl:d4:m         53140  5216736    0.7853507   2.739  18.571   3.53384208  68.19005368  35.41   1.44350533  1.133657922321   2.729      0
Aggregate:       53140 11947971    1.7987008   2.301  14.588   1.47595412  81.82200146  41.47   0.61292338  1.102465789383   3.191      0

These are VarDCT results (after #3563) for those same distance settings:

Encoding      kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0.1        53140 48950295    7.3691957   2.048  22.589   0.33223548  94.89087682  55.27   0.10573762  0.779201225712   7.369      0
jxl:d0.25       53140 29733459    4.4762075   2.194  28.727   0.48920441  93.31949572  50.14   0.20036277  0.896865340417   4.476      0
jxl:d0.5        53140 19001497    2.8605701   2.373  37.025   0.86379899  90.38862331  45.76   0.36794302  1.052526796888   2.873      0
jxl:d1          53140 11657700    1.7550021   2.216  38.838   1.58120143  84.82011959  41.70   0.64830139  1.137770330813   2.773      0
jxl:d1.5        53140  8552769    1.2875720   2.280  38.153   2.17524115  79.77756416  39.52   0.88292428  1.136828569085   2.806      0
jxl:d2          53140  6757501    1.0173043   2.357  40.917   2.74656156  75.21167304  38.03   1.09170924  1.110600543817   2.830      0
jxl:d2.5        53140  5566980    0.8380780   2.402  42.396   3.30428849  70.93466991  36.92   1.28268172  1.074987317940   2.798      0
jxl:d3          53140  4754330    0.7157380   2.333  44.111   3.71139568  67.14598243  36.09   1.45411408  1.040764762455   2.684      0
jxl:d4          53140  3705372    0.5578232   2.011  26.785   4.40002757  60.26093636  34.88   1.75191069  0.977256468691   2.491      0
Aggregate:      53140 10692833    1.6097468   2.242  34.701   1.60723056  78.78294653  41.55   0.63125830  1.016166008308   3.237      0

The alignment is not very good: at d > 1, lossy modular produces significantly larger files (and better quality) than vardct; at d < 1 it is the other way around.

After

VarDCT doesn't change, so see above.

Modular after this PR:

Encoding       kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
-----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0.1:m       53140 48111781    7.2429620   1.069   7.828   0.21292319  95.28249929  56.26   0.08998637  0.651767869044   7.243      0
jxl:d0.25:m      53140 29564885    4.4508296   1.490   9.735   0.50322435  93.47504297  49.55   0.22326671  0.993722073864   4.451      0
jxl:d0.5:m       53140 18828076    2.8344625   1.847  11.233   0.95157843  90.42968909  44.93   0.41369676  1.172607967515   2.861      0
jxl:d1:m         53140 11507797    1.7324351   2.214  13.562   1.67635978  84.92530351  40.80   0.70952776  1.229210779892   2.862      0
jxl:d1.5:m       53140  8460690    1.2737100   2.371  14.976   2.30979969  80.14119187  38.55   0.95482124  1.216165357971   2.901      0
jxl:d2:m         53140  6800773    1.0238187   2.447  15.010   2.82803248  75.87385531  37.02   1.17041047  1.198288118289   2.866      0
jxl:d2.5:m       53140  5658292    0.8518245   2.491  16.323   3.32178761  71.74054250  35.89   1.36691242  1.164369502693   2.801      0
jxl:d3:m         53140  4817544    0.7252546   2.430  16.394   3.86773750  67.82693527  34.99   1.55815686  1.130060363618   2.775      0
jxl:d4:m         53140  3707630    0.5581632   2.431  17.478   4.69135275  60.56442194  33.65   1.90488446  1.063236314116   2.620      0
Aggregate:       53140 10669775    1.6062755   2.018  13.205   1.59628018  79.19721799  40.73   0.66870874  1.074130443596   3.286      0

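Alignment is better this way: aggregate bpp is now 1.606 for modular versus 1.610 for VarDCT (it was 1.799 versus 1.610 before), and the file sizes at each distance are within about 2% of each other.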
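To make the alignment easier to see at a glance, the byte counts from the tables above can be turned into a modular/VarDCT size ratio per distance. This is only a small sketch for reading the tables; the names `size_ratios` and `worst_mismatch` are made up for illustration and are not part of benchmark_xl or libjxl.

```python
# Byte counts copied from the tables above, one entry per distance setting.
DISTANCES = [0.1, 0.25, 0.5, 1, 1.5, 2, 2.5, 3, 4]
VARDCT = [48950295, 29733459, 19001497, 11657700, 8552769,
          6757501, 5566980, 4754330, 3705372]
MODULAR_BEFORE = [41104085, 26531556, 18560100, 12578622, 9896064,
                  8257728, 7165938, 6379569, 5216736]
MODULAR_AFTER = [48111781, 29564885, 18828076, 11507797, 8460690,
                 6800773, 5658292, 4817544, 3707630]


def size_ratios(modular, vardct):
    """Return the modular/VarDCT file-size ratio for each distance."""
    return [m / v for m, v in zip(modular, vardct)]


def worst_mismatch(ratios):
    """Largest relative deviation from a 1:1 size ratio."""
    return max(abs(r - 1.0) for r in ratios)


ratios_before = size_ratios(MODULAR_BEFORE, VARDCT)
ratios_after = size_ratios(MODULAR_AFTER, VARDCT)

for d, b, a in zip(DISTANCES, ratios_before, ratios_after):
    print(f"d{d}: before {b:.3f}, after {a:.3f}")

print(f"worst mismatch before: {worst_mismatch(ratios_before):.3f}")
print(f"worst mismatch after:  {worst_mismatch(ratios_after):.3f}")
```

A ratio below 1 means modular produces the smaller file at that distance; a ratio above 1 means VarDCT does.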

To get a fair comparison of old modular vs tweaked modular, here is old modular at a similar (slightly higher) bpp to the tweaked modular:

Encoding         kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
-------------------------------------------------------------------------------------------------------------------------------------------
jxl:d0.069:m       53140 48180808    7.2533537   1.206   8.820   0.22820268  95.38013283  56.51   0.09049648  0.656402984315   7.253      0
jxl:d0.203:m       53140 29658725    4.4649567   1.669   9.923   0.50094956  93.61143051  49.93   0.21073243  0.940911173439   4.465      0
jxl:d0.4882:m      53140 18970003    2.8558288   1.977  11.721   0.95064904  90.07911212  45.11   0.39761395  1.135517385801   2.931      0
jxl:d1.162:m       53140 11523550    1.7348066   2.426  15.053   1.68274125  84.15604147  40.83   0.69924133  1.213048474585   2.827      0
jxl:d1.932:m       53140  8476849    1.2761426   2.591  16.431   2.27561788  79.06236950  38.54   0.94145716  1.201433632354   2.835      0
jxl:d2.7:m         53140  6816508    1.0261875   2.718  17.352   2.79109546  74.67277859  37.07   1.14766209  1.177716497413   2.816      0
jxl:d3.57:m        53140  5662131    0.8524024   2.586  18.187   3.24948584  70.22513623  35.89   1.34787635  1.148933104665   2.714      0
jxl:d4.47:m        53140  4825670    0.7264779   2.734  18.763   3.86094162  66.21285341  34.97   1.53812001  1.117410165510   2.745      0
jxl:d6.345:m       53140  3710753    0.5586333   2.722  19.958   4.64623423  58.86782552  33.62   1.88202128  1.051359763331   2.599      0
Aggregate:         53140 10694587    1.6100108   2.220  14.591   1.59736411  78.20160956  40.80   0.65552967  1.055409873294   3.259      0

It's not a big difference if you compare it this way. It depends a bit on the metric; generally the tweaked modular does get slightly better scores at slightly lower bpp, but it's not very clear.

Generation loss

Now if we take those same settings and compare old vs tweaked after 100 generations, the difference does become a bit clearer:

Before

Generation loss testing with 100 intermediate generations

Encoding         kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
-------------------------------------------------------------------------------------------------------------------------------------------
jxl:d0.069:m       53140 48186234    7.2541705   0.013   0.090   1.40738202  93.44942015  55.52   0.25663587  1.861680401155  10.225      0
jxl:d0.203:m       53140 29663358    4.4656542   0.017   0.103   3.61123805  88.26189759  48.82   0.68444072  3.056475544558  15.034      0
jxl:d0.4882:m      53140 18972609    2.8562212   0.021   0.122   5.44384979  79.64749328  43.94   1.17285873  3.349943912412  14.259      0
jxl:d1.162:m       53140 11523460    1.7347931   0.025   0.154   8.38648889  71.73642423  39.78   1.89421391  3.286069131823  13.222      0
jxl:d1.932:m       53140  8478808    1.2764376   0.026   0.165  10.54550605  65.17318913  37.52   2.42053516  3.089661998308  11.461      0
jxl:d2.7:m         53140  6818442    1.0264787   0.027   0.181  12.40544452  60.27040697  36.09   2.98276217  3.061741719823  12.049      0
jxl:d3.57:m        53140  5662131    0.8524024   0.026   0.185  13.98764483  48.52200023  34.74   3.42904134  2.922923233436  10.723      0
jxl:d4.47:m        53140  4825245    0.7264139   0.026   0.191  14.72906274  46.69023150  33.94   3.58541314  2.604493940186   9.910      0
jxl:d6.345:m       53140  3710950    0.5586630   0.026   0.202  16.75319376  41.62773131  32.79   4.06947191  2.273463227360   8.592      0
Aggregate:         53140 10695631    1.6101679   0.022   0.149   7.78285708  63.77708959  39.75   1.73296491  2.790364542227  11.549      0

After

Generation loss testing with 100 intermediate generations

Encoding       kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
-----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0.1:m       53140 48121249    7.2443874   0.012   0.089   1.56696973  93.06807012  55.24   0.27270687  1.975594218018  11.123      0
jxl:d0.25:m      53140 29568418    4.4513615   0.017   0.106   2.83078488  88.19197705  48.46   0.59107621  2.631093898289  11.945      0
jxl:d0.5:m       53140 18830462    2.8348217   0.021   0.124   5.17058480  81.94773331  43.98   1.08377486  3.072308512110  13.482      0
jxl:d1:m         53140 11509429    1.7326808   0.024   0.148   7.83642957  72.88840763  39.86   1.75882736  3.047486326788  11.667      0
jxl:d1.5:m       53140  8460357    1.2736599   0.025   0.159   9.38654162  68.61298269  37.68   2.21315694  2.818809178308  10.713      0
jxl:d2:m         53140  6801282    1.0238953   0.027   0.166  12.20334398  62.34515111  36.15   2.76856818  2.834724000347  11.049      0
jxl:d2.5:m       53140  5658290    0.8518242   0.027   0.179  11.54156491  58.36970144  35.02   2.98106722  2.539345222598   9.251      0
jxl:d3:m         53140  4815024    0.7248752   0.026   0.180  13.39350239  52.65983238  34.11   3.37138537  2.443833589105   9.049      0
jxl:d4:m         53140  3706018    0.5579205   0.026   0.191  16.35633845  40.39787424  32.68   4.06382649  2.267292006189   8.659      0
Aggregate:       53140 10669374    1.6062151   0.022   0.145   7.20085167  66.67487059  39.78   1.62010340  2.602234567530  10.670      0

So the tweaked modular overall produces better quality after 100 generations, at slightly lower bpp.
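As a rough cross-check of that statement, the two generation-loss tables can be compared row by row, since the rows were chosen to land at similar bpp. The sketch below only re-reads the BPP and SSIMULACRA2 columns from the tables above; the variable names are illustrative and pairing rows by position is an assumption of this sketch.

```python
# BPP and SSIMULACRA2 after 100 generations, copied from the two tables above.
# Rows are paired by position: old d0.069 vs new d0.1, old d0.203 vs new d0.25, ...
OLD_BPP = [7.2541705, 4.4656542, 2.8562212, 1.7347931, 1.2764376,
           1.0264787, 0.8524024, 0.7264139, 0.5586630]
NEW_BPP = [7.2443874, 4.4513615, 2.8348217, 1.7326808, 1.2736599,
           1.0238953, 0.8518242, 0.7248752, 0.5579205]
OLD_SSIMULACRA2 = [93.44942015, 88.26189759, 79.64749328, 71.73642423,
                   65.17318913, 60.27040697, 48.52200023, 46.69023150,
                   41.62773131]
NEW_SSIMULACRA2 = [93.06807012, 88.19197705, 81.94773331, 72.88840763,
                   68.61298269, 62.34515111, 58.36970144, 52.65983238,
                   40.39787424]

# Per-row score change (positive means the tweaked modular scores higher).
deltas = [new - old for old, new in zip(OLD_SSIMULACRA2, NEW_SSIMULACRA2)]
wins = sum(1 for d in deltas if d > 0)
mean_delta = sum(deltas) / len(deltas)

# The tweaked rows should never use more bits than their old counterparts.
always_smaller = all(new < old for old, new in zip(OLD_BPP, NEW_BPP))

print(f"tweaked modular scores higher in {wins} of {len(deltas)} rows")
print(f"mean SSIMULACRA2 change: {mean_delta:+.2f}")
print(f"tweaked bpp lower in every row: {always_smaller}")
```

The improvement is not uniform: the rows at the two lowest distances and at the highest distance score slightly lower after the tweak, while the middle of the range improves.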

For comparison, this is VarDCT after 100 generations (and after #3563, which improves generation loss at lower distances):

Generation loss testing with 100 intermediate generations

Encoding      kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0.1        53140 48054736    7.2343742   0.020   0.232   2.98395855  88.09270722  52.42   0.58493928  4.231669652969  20.454      0
jxl:d0.25       53140 29119972    4.3838504   0.021   0.303   6.33092032  79.91342609  46.15   1.22792087  5.383021438137  25.536      0
jxl:d0.5        53140 18804413    2.8309002   0.022   0.367  10.73691031  68.99695044  41.95   2.03063519  5.748525518387  27.893      0
jxl:d1          53140 11443245    1.7227171   0.021   0.367  27.88840310   1.80302525  34.28   6.38151745 10.993549416997  47.581      0
jxl:d1.5        53140  8337753    1.2552025   0.021   0.370  25.07149720  -4.07512149  33.40   6.06574820  7.613742392276  30.860      0
jxl:d2          53140  6580720    0.9906909   0.022   0.381  27.10828399 -12.06172369  32.38   6.55478743  6.493768468760  25.739      0
jxl:d2.5        53140  5417328    0.8155487   0.022   0.394  26.50101880 -17.21918124  31.69   6.92541770  5.648015497339  21.206      0
jxl:d3          53140  4612572    0.6943972   0.021   0.396  28.64547648 -25.46657559  30.87   7.61306438  5.286490260024  19.214      0
jxl:d4          53140  3562695    0.5363440   0.018   0.240  32.74205892 -35.48011391  29.78   8.97608814  4.814270944756  17.115      0
Aggregate:      53140 10438026    1.5713870   0.021   0.333  16.59701343  30.59129436  36.31   3.82469631  6.010078169944  25.018      0

(Ignore the aggregate for SSIMULACRA2; we need to fix the aggregation for the case where scores go negative, as they do here.)
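Some background on why that aggregate is off: the Aggregate rows appear to be geometric means over the rows (the BPP column of the first table reproduces that way), and a geometric mean is undefined once a value is zero or negative. In the table above, the SSIMULACRA2 aggregate of 30.59 appears to match a geometric mean over only the four positive rows. The sketch below checks both observations; `geomean` is a helper written for this illustration, not the actual benchmark_xl aggregation code.

```python
import math


def geomean(values):
    """Geometric mean via logarithms; raises ValueError for values <= 0."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# BPP column of the first (old lossy modular) table.
BPP_OLD_MODULAR = [6.1879922, 3.9941787, 2.7941202, 1.8936418, 1.4897976,
                   1.2431552, 1.0787923, 0.9604088, 0.7853507]

# SSIMULACRA2 column of the VarDCT table after 100 generations.
SSIMULACRA2_VARDCT_100 = [88.09270722, 79.91342609, 68.99695044, 1.80302525,
                          -4.07512149, -12.06172369, -17.21918124,
                          -25.46657559, -35.48011391]

# Compare with the table's BPP aggregate of 1.7987008.
bpp_aggregate = geomean(BPP_OLD_MODULAR)
print(f"geomean of BPP column: {bpp_aggregate:.4f}")

# A plain geometric mean cannot handle the negative scores.
try:
    geomean(SSIMULACRA2_VARDCT_100)
    negatives_ok = True
except ValueError:
    negatives_ok = False
print(f"geomean defined with negative scores: {negatives_ok}")

# Dropping the negative rows gives a number to compare with the
# table's SSIMULACRA2 aggregate of 30.59129436.
positive_only = geomean([s for s in SSIMULACRA2_VARDCT_100 if s > 0])
print(f"geomean of positive scores only: {positive_only:.4f}")
```

If the aggregate really does skip negative rows, it overstates quality exactly when generation loss is at its worst, so an arithmetic mean or a median would probably be a safer summary for this column.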

Before #3563 (i.e. at current git head), it was even worse, and it seems that d < 1 produces even worse artifacts than d1 after generation loss, presumably because the encoder-side gaborish compensates for d1 artifacts but overcompensates when a lower distance is used:

Generation loss testing with 100 intermediate generations

Encoding      kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0.1        53140 47087608    7.0887785   0.022   0.244  43.07154727 -37.04004591  24.87  14.44179314 102.37467239926 302.839      0
jxl:d0.25       53140 28933867    4.3558334   0.023   0.313  41.49435199 -20.28801989  28.38  12.19661991 53.126443848102 179.788      0
jxl:d0.5        53140 18648093    2.8073671   0.024   0.380  36.78454894  -3.27442274  32.51   9.26443044 26.008656965538 102.846      0
jxl:d1          53140 11443245    1.7227171   0.023   0.410  27.88840310   1.80302525  34.28   6.38151745 10.993549416997  47.581      0
jxl:d1.5        53140  8337753    1.2552025   0.023   0.414  25.07149720  -4.07512149  33.40   6.06574820  7.613742392276  30.860      0
jxl:d2          53140  6580720    0.9906909   0.024   0.425  27.10828399 -12.06172369  32.38   6.55478743  6.493768468760  25.739      0
jxl:d2.5        53140  5417328    0.8155487   0.024   0.438  26.50101880 -17.21918124  31.69   6.92541770  5.648015497339  21.206      0
jxl:d3          53140  4610257    0.6940486   0.023   0.443  28.43532502 -25.54085196  30.84   7.66117028  5.317224864580  19.050      0
jxl:d4          53140  3564574    0.5366269   0.020   0.270  32.76653380 -33.53512790  29.88   8.91190170  4.782365869574  17.144      0
Aggregate:      53140 10397438    1.5652767   0.023   0.363  31.52666354   1.80302525  30.79   8.34269599 13.058627536853  48.430      0

Conclusion

With these changes, the alignment between lossy modular and lossy vardct improves, making both modes behave more similarly when using the same distance setting.

For single-generation encoding (which is the bulk of the use cases), VarDCT mode is still clearly preferable in all aspects (compression density, encode speed, decode speed). However, for multi-generation encoding, lossy modular does have a benefit: it suffers significantly less from generation loss. Whether that benefit offsets the worse single-generation compression density and the slower encode/decode depends on the use case (I assume it usually doesn't).

jonsneyers requested review from veluca93 and jyrkialakuijala May 13, 2024 10:02

jyrkialakuijala reviewed May 13, 2024

Review comment on lib/jxl/enc_detect_dots.cc (outdated, resolved)

jonsneyers force-pushed the tweak_lossy_modular branch 3 times, most recently from 33b1660 to bd8aac6, May 13, 2024 15:54

improve precision and generation loss at very low distance

67c5b3d

jonsneyers force-pushed the tweak_lossy_modular branch from bd8aac6 to 77796b1, May 28, 2024 09:43

tweak lossy modular

db9382d

jonsneyers force-pushed the tweak_lossy_modular branch from 77796b1 to db9382d, May 28, 2024 12:30
