- Newer nvidia drivers are not exposing IMMEDIATE present mode unless you change options in nvidia control panel
This can cause severe performance degradation unless the vsync option is set to "off" in control panel
- A possible deadlock is still present if rsx is trying to get a super_ptr whilst the vm lock holder is in an access violation
This patch makes this scenario very unlikely since each block need only be touched once
- Implement forced reading when calling update method to sync partial lists
- Defer conditional render evaluation and use a read barrier to avoid extra work
- Fix HLE gcm library when binding tiles & zcull RAM
- Do not do a full sync on a texture read barrier
- Avoid calling zcull sync in FIFO spin wait
- Do not flush memory to cache from the renderer side; this method is now obsolete
- Defer compilation process to worker threads
- vulkan: Fixup for graphics_pipeline_state.
Never use struct assignment operator on vk** structs due to padding after sType member (4 bytes)
- Adds proper support for vertex textures, including dimensions other than 2D textures
- Minor analyser fixup, removes spurious 'analyser failed' errors
- Minor optimizations for program state tracking
- Adds dead code elimination
- Fix absolute branch target addresses to take base address into account
- Patch branch targets relative to base address to improve hash matching
- Bumps shader cache version
- Enables shader logging option to write out vertex program binary,
helpful when debugging problems.
- Avoid re-locking memory if there is no reason to do so (no draws issued)
- Actively bound regions should always get written to the backing cache
- Forcefully read memory during download if writes to the target have occured since last sync event
- Unroll main compute queue loop
- Do NOT run GPU cores on mappable memory! This has a dreadful impact on performance for obvious reasons
- Enable dynamic SSBO indexing (affects AMD)
- Make loop unrolling and loop length variable depending on hardware and find optimum
- Region pitch of 64 (disabled) can be used to indicate packed contents - do not assume it is the actual pitch!
- Also fixes interaction of AA factors with lockable_region size
- Use names for overlay command config and vertex data instead of std::pair.
- Make a couple of compiled_resource constructors explicitly named functions.
- Used to transfer D32S8 data where it makes sense to use this variant
- On nvidia cards, it is very slow to move aspects from D24S8 probably due to the format being faked.
For this reason, the unsafe variant is used for both D16 and D24S8 to avoid the heavy performance loss
- RADV does not keep a mapping ptr around for subsequent remap and falls back to heavy amdgpu methods every time
Explicitly manage pointer in the ring buffer structure to fix this