- This is just not part of spec, there is no enforcement for multiple of block size for width or height of s3tc compressed images.
- This restriction does indeed exist for ASTC and ETC but we're not using those formats.
Optimize CPU capability tests for arch-tuned builds.
Separate streaming and non-streaming utilities.
Rewritten copy_data_swap_u32(_cmp) with AVX2 path.
- Replace old format-based detection with proper aspect test.
Explicit image aspect has been available for a long time, but older
code was not updated.
- Specifically fixes a corner case where double transforms are required.
Technically this can be made more readable using transformation matrices:
* M1 = transform_virtual_to_physical()
* M2 = transform_image_to_virtual()
* M3 = M1 * M2
* Result = Input * M3
But we don't use a CPU-side matrix library and it is not reasonable to do this on the GPU.
- Just use RGB565 for all blit targets. Avoids really dumb transforms done by GPU hw.
- When X16 is used, all the channels get written to R channel alone. CmdBlit does perform format conversion!
- gl: Force image copy when blit is requested with compatible targets. Avoids format conversion issues.
- Partial clears either in active clear channels or scissor region must get barrier inserts to load previous data.
- Fixes some incorrectly discarded data during clear where data in untouched/uninitialized channels is lost.
- Mainly affects nvidia where x/w * w can sometimes return a value smaller than x.
In such conditions, floor(x) will return x-1 if x is an integer which is horribly wrong and exaggerates minor precision drift to great proportions.
Depending on the dpi settings, the debug overlay was almost unreadable.
I also took the liberty to refactor some redundant client size calls and to add some margin to the left of the debug text.
Rewritten the following global utility constants:
`umax` returns max number, restricted to unsigned.
`smax` returns max signed number, restricted to integrals.
`smin` returns min signed number, restricted to signed.
`amin` returns smin or zero, less restricted.
`amax` returns smax or umax, less restricted.
Fix operators == and <=> for synthesized rel-ops.
- Transfer writes are expected to clobber surface cache contents. Do NOT reload from CPU memory for writes.
- TODO: During transfer write to surface cache objects, lock memory if it was unlocked to avoid silly problems.