- Apply dither to edges that almost fail the straight-up alpha test
- Significantly improves alpha tested geometry far from the camera
- Also removes blend factor overrides/hacks as they give incorrect results due to background bleeding
- Index offset is ignored anyway and only used to calculate vertex attribute divisor index
- Specialized optimization for untouched xfer without primitive restart
- Avoid tagging and rely on read/write barriers and the dirty flag mechanism. Testing is done with a weak 8-byte memory test
- Introducing new data when tagging breaks applications with race conditions where tags can overwrite flushed data
- gl: Include an execution state wrapper to ensure state changes are consistent. Also removes a lot of required 'cleanup' for helper methods
- texture_cache: Make execition context a mandatory field as it is required for all operations. Also removes a lot of situations where duplicate argument is added in for both fixed and vararg fields
- Explicit read/write barrier for framebuffer resources depending on
usage. Allows for operations like optional memory initialization before
reading
- If draw call resources consume memory that intersects with NA parts of the texture cache, we get a framebuffer test mismatch.
This mismatch is false and happens because the thread has not yet reached the point of relocking the pages
- Implicitly invoke a memory barrier if actively reading from an unsynchronized texture
- Simplify memory transfer operations
- Should allow more games to work without strict mode
- Do not bind companion framebuffer when clearing single aspect; let the
contest mechanism sort it out instead
- Do not prematurely tag framebuffers, instead only do so at
write-confirmation time. Should avoid false tagging if setup does not
allow a render to occur.
- Implements a mirror view of D24S8 data that accesses the stencil components.
Finishes the implementation of TEX2D_DEPTH_RGBA as the stencil component was previously missing from the reconstructed data
- Add a few missing destructors
Image classes are inherited a lot and I forgot to make the dtors virtual
- Per-channel conditional execution introduces RAW hazards all over the place
- Its cheaper to process both branches and select between the two
- Also improves ShaderVariable functionality to allow functionality such as match_size and taking complex variables as inputs
To avoid the need (and performance hit) of Read Color/Depth Buffers, we
may not invalidate overlapping fbos inside lock_memory_region unless
they are guaranteed to be superseded by the new one.
This avoids e.g. issues with overblooming, among others.
Fixes VRAM leaks and incorrect destruction of resources, which could
lead to drivers crashes.
Additionally, lock_memory_region is now able to flush superseded
sections. However, due to the potential performance impact of this
for little gain, a new debug setting ("Strict Flushing") has been
added to config.yaml
- Improve vertex attribute layout format. Allows for full 16-bit attribute divisor
- Use actual pitch when declaring framebuffer rsx pitch instead of register value in case of swizzle? rendering
- Also fix visual corruption when using disjoint indexed draws
- Refactor draw call emit again (vk)
- Improve execution barrier resolve
- Allow vertex/index rebase inside begin/end pair
- Add ALPHA_TEST to list of excluded methods [TODO: defer raster state]
- gl bringup
- Simplify
- using the simple_array gets back a few more fps :)
- Orders flushing to preserve memory at all cost
- Avoids false positive where flushing overlapping sections can falsely invalidate another with head/tail test
- Forcefully downloads and reuploads data from the CPU in case of unexpected overlaps
- Properly detect correct size of newly created blit targets
- Remember to clear any existing views when changing the default component map!
- NOTE: The address swizzle index is only for use as src. The address registers are only used one channel at a time.
- When the destination of ARL, the encoding is the same as the other temp registers
- Retag resources reprotected under flush_always rules
- Properly check for blit resource fitting taking into account format
mismatch, pitch mismatch and typeless transfers
- The x value contains the VP output value interpolated across primitive surface
- The y coordinate contains the fog fraction according to the selected fog formula