- gl/vk: Fix subresource copy/blit
- gl/vk: Fix default_component_map reading
- vk: Reimplement cell readback path and improve software channel decoder
- Properly name the subresource layout field - its in blocks not bytes!
- Implement d24s8 upload from memory correctly
- Do not ignore DEPTH_FLOAT textures - they are depth textures and abide by the depth compare rules
- NOTE: Redirection of 16-bit textures is not implemented yet
- Properly detect inline array registers vs constant value registers
- Silence needless spam, 306E is 2D surface engiine, the assumption that y is multiplied by 306E pitch is not crazy
- Reimplements render target views used for sampling
- Optimizes access using an encoded control token
- Adds proper encoding for 24-bit textures (DRGB8 -> ORGB/OBGR)
- Adds proper encoding for ABGR textures (ABGR8 -> ARGB8)
- Silence some compiler warnings as well
- TODO: Real texture views for OGL current method is a hack
- Separate TXB from TXL: They are completely different!
- Properly perform TMU emulation in the fragment shader. Implemens SRGB conversion and alphakill at the moment
- Properly perform ROP emulation in the fragment shader. Implements FRAMEBUFFER_SRGB. While support on the chip looks to be incomplete (and wierd), it does work
- Document some more bits in SHADER_CONTROL register
- Export some debug information in the free texture register space components zw
Very useful when analysing renderdoc captures
- Enable shadow comparison on depth as long as compare function is active and texture is uploaded for depth read
Some engines (UE3) read all the components in the shader and use mul/mad with the result
- Track asynchronous operations in RSX core
- Add read barriers to force pending writes to finish.
Fixes zcull delay flicker in all UE3 titles without forcing hard stall
- Increase zcull latency as all writes should be synchronized now
- ZCULL unit emulation rewritten
- ZCULL reports are now deferred avoiding pipeline stalls
- Minor optimizations; replaced std::mutex with shared_mutex where contention is rare
- Silence unnecessary error message
- Small improvement to out of memory handling for vulkan and slightly bump vertex buffer heap
- Removes the duplicate local_transform_constants
- Resets the transform constants on every context reset
- Simplifies the code abit which should make it faster
- NOTE: Transform constants are persistent across context re-init events (VF5)
- TODO: Alot of work is still needed to execute draw commands out of order
Thats the only solution to games sending many draw calls with high frequency of state changes
- Do not bother rechecking the dirty sampler pool for hits. Its faster to create new sampler than to search the pool
- Reserve some memory on vertex layout struct to reduce reallocation penalty
- opengl driver optimization for nvidia. On nvidia glTextureBufferRange performance is horrendous
-- Initialize texture buffer to whole buffer at startup and use absolute offsets to read data instead
-- Over 2x performance in some cases (Resogun, TNT racers)
- gl/vk: Do not flip non-existent display buffers. Fixes spec violation at boot in TNT racers demo
- whitespace fixes for sys_rsx
- The scale offset matrix is fine but on real hardware the z results seem to be independent of near/far clipping distances
-- If depth falls within near/far, clamp depth value to [0,1]
- Track last working state and reset to it if RSX starts to desync
-- This is especially useful when running vulkan since the renderer will easily outpace the rest of the system when merely recording draw commands
- Ignore empty sets
-- Mark empty/invalid IB sets as having 0 element counts.
- Reorganize storage hash vs ucode hash
- Scan for actual fragment program start in case leading NOPed code precedes the actual instructions
-- e.g FEAR2 Demo has over 32k of padding before actual program code that messes up hashes
- Use AA mode to predict surface compression. Compression mode is useless without AA activated
- Rewrites most image subresource fetch routines to use the new heuristic
- Fix rsx:🧵:find_tile. FEED000(X) can be substituted for (X) in the code
-- Fixes alot of failures when looking for tiled regions
rsx: Fix antialiased unnormalized coords
- scaling factors are inverse to allow proper coordinates to be computed in fs
- Reimplement fragment program fetch and rewrite texture upload mechanism
-- All of these steps should only be done at most once per draw call
-- Eliminates continously checking the surface store for overlapping addresses as well
addenda - critical fixes
- gl: Bind TIU before starting texture operations as they will affect the currently bound texture
- vk: Reuse sampler objects if possible
- rsx: Support for depth resampling for depth textures obtained via blit engine
vk/rsx: Minor fixes
- Fix accidental imageview dereference when using WCB if texture memory occupies FB memory
- Invalidate dirty framebuffers (strict mode only)
- Normalize line endings because VS is dumb