- While the mask for surface_a is at index 0, the surface cache expects the order to be maintained correctly!
Set the correct mask since surface store now checks each RTT individually
- Avoid silly broken tests due to queue_tag being called before pitch is initialized.
- Return actual memory range covered and exclude trailing padding.
- Coordinates in src are to be calculated with src_pitch, not required_pitch.
- This allows creating buffers with no MAP bits set which should ensure they are created for VRAM usage only
- TODO: Implement compute kernels to avoid software fallback mode for pack/unpack operations
- Fix 2D coordinate sampling of W coordinate.
W is actually HPOS.w and not 1. Z is however always 0.
- Optimize register usage a bit
Disassembling compiled SPV shows that global declaration results in less ops than using inout modifiers. Modifiers generate extra mov instructions.
- Fix reading of varying registers in FP
Different registers have different behavior
- Always write to varying registers. If a register is not written to, it is initialized to (0, 0, 0, 1)
- Reimplements two-sided lighting correctly without hacks
- Also bumps shader cache version
- Do not allow offloader to handle its own faults. Serialize them on RSX instead.
This approach introduces a GPU race condition that should be avoided with improved synchronization.
- TODO: Use proper GPU-side synchronization to avoid this situation
- Avoids memory appearing older when used for depth test without depth write
The write_barrier before the call will inherit new data but the tag will not update as no new information is added.
- Properly commit orphaned blocks not invalidating existing cache structures
- Do not ignore overwritten objects when commiting as unprotected fbo. Avoids stale references to invalidated surface objects.
- Load into memory as straightforward BGRA
- Fixes a bug in vulkan caused by byte shuffling in blit engine vs shader access
- Removes the need for memory shuffling when transferring into a rendertarget