Commit graph

389 commits

Author SHA1 Message Date
eladash 56d553f10d Rsx: fix cmd jump over put register 2018-08-22 12:20:31 +03:00
kd-11 dd21e43ed5 rsx: Force disable draw reordering when capturing a frame 2018-08-18 16:14:30 +03:00
kd-11 0f36e87010 zcull: Improve the delay algorithm to be more consistent
- Use proper time checking; depending on what is being done one 'tick' can
  be almost a millisecond long or several nanoseconds
- Avoid spamming the system timer unless necessary
2018-08-18 16:14:30 +03:00
kd-11 38191c3013 rsx: Avoid acquiring the vm lock; deadlock evasion
- A possible deadlock is still present if rsx is trying to get a super_ptr whilst the vm lock holder is in an access violation
  This patch makes this scenario very unlikely since each block need only be touched once
2018-08-18 16:14:30 +03:00
kd-11 373e02e91c rsx: Timestamp accuracy workaround 2018-08-18 16:14:30 +03:00
kd-11 1200ca8172 rsx: Optimize hash_struct; vk cleanup 2018-08-18 16:14:30 +03:00
kd-11 d0165290b6 rsx: Refactor and fix framebuffer layout checks
- Refactors shared code back into rsx core
- Adds extra check to avoid contest confusion
2018-08-18 16:14:30 +03:00
kd-11 9f0fada17a ZCULL: lower notice severity to avoid spam 2018-08-18 16:14:30 +03:00
kd-11 8800c10476 zcull synchronization tweaks
- Implement forced reading when calling update method to sync partial lists
- Defer conditional render evaluation and use a read barrier to avoid extra work
- Fix HLE gcm library when binding tiles & zcull RAM
2018-08-18 16:14:30 +03:00
kd-11 3b47e43380 rsx: Synchronization rewritten
- Do not do a full sync on a texture read barrier
- Avoid calling zcull sync in FIFO spin wait
- Do not flush memory to cache from the renderer side; this method is now obsolete
2018-08-18 16:14:30 +03:00
eladash f349695a75 Rsx: rewrite address translation 2018-08-13 16:16:34 +03:00
eladash 9e380a4a4a Rsx: avoid invalid cmds execution 2018-08-12 23:37:24 +03:00
eladash c80eb1ba02 Rsx: fix CALL and RET cmd 2018-08-12 23:37:24 +03:00
Megamouse eecb984689 RSX/Qt: add more performance overlay options to the gui 2018-07-28 23:10:45 +02:00
kd-11 19d808d378 rsx/gl: Minor cleanup and optimization
- Track register change status
- Remove unused gl classes
2018-07-22 17:19:59 +03:00
kd-11 e7f30640ef rsx: Async shader compilation
- Defer compilation process to worker threads
- vulkan: Fixup for graphics_pipeline_state.
  Never use struct assignment operator on vk** structs due to padding after sType member (4 bytes)
2018-07-14 15:19:56 +03:00
kd-11 fa55a8072c rsx: Improve vertex textures support
- Adds proper support for vertex textures, including dimensions other than 2D textures
- Minor analyser fixup, removes spurious 'analyser failed' errors
- Minor optimizations for program state tracking
2018-07-12 18:02:28 +03:00
kd-11 4d40ed9dbd rsx: Silence harmless warning 2018-07-07 16:20:33 +03:00
kd-11 2ca935a26b vp: Improve vertex program analyser
- Adds dead code elimination
- Fix absolute branch target addresses to take base address into account
- Patch branch targets relative to base address to improve hash matching
- Bumps shader cache version
- Enables shader logging option to write out vertex program binary,
  helpful when debugging problems.
2018-07-07 16:20:33 +03:00
eladash 345f92ab0a rsx: more efficient command reading 2018-06-27 21:59:34 +03:00
kd-11 1730708f47 rsx: Rework memory protection management for framebuffer access
- Avoid re-locking memory if there is no reason to do so (no draws issued)
- Actively bound regions should always get written to the backing cache
- Forcefully read memory during download if writes to the target have occured since last sync event
2018-06-26 20:07:20 +03:00
eladash b456955688 rsx: fix hardcoded rsx allocation address 2018-06-24 10:57:30 +03:00
VelocityRa dd0684b58a overlays/perf_overlay: Make pos, font, opacity, margin configurable
- Also some perf overlay refactoring
2018-06-18 22:34:26 +03:00
kd-11 6362942928 rsx: Avoid semaphore acquire deadlock 2018-05-30 13:30:23 +03:00
VelocityRa c8d8a81ccd overlays: Performance Overlay 2018-05-30 12:35:41 +03:00
eladash 23b380eb41 allow deallocations to unmap rsx mapped memory 2018-05-29 19:57:28 +03:00
kd-11 83f9be2524 rsx: Promote FIFO optimizations outside of strict mode
- The benefits of FIFO optimizations are huge in some cases.
  The optimizations also do not break any tested applications so no need to disable with strict mode
- A debug option is provided to disable this behaviour for testing
2018-05-29 13:54:30 +03:00
kd-11 493d4e8613 fixup - Improve invalidated region checks for performance 2018-05-24 10:36:04 +03:00
kd-11 92b5a705d8 fixup - locking 2018-05-23 19:07:08 +03:00
kd-11 b957eac6e8 rsx: Avoid calling any blocking callbacks from threads that are not rsx::thread
- Defers on_notity_memory_unmapped to only run from within rsx context
- Avoids passive_lock + writer_lock deadlock
2018-05-23 19:07:08 +03:00
kd-11 8fcd5c1e5a rsx: Texture cache fixes
1. rsx: Rework section synchronization using the new memory mirrors
2. rsx: Tweaks
    - Simplify peeking into the current rsx::thread instance.
      Use a simple rsx::get_current_renderer instead of asking fxm for the same
    - Fix global rsx super memory shm block management
3. rsx: Improve memory validation. test_framebuffer() and
tag_framebuffer() are simplified due to mirror support
4. rsx: Only write back confirmed memory range to avoid overapproximation errors in blit engine
5. rsx: Explicitly mark clobbered flushable sections as dirty to have them
removed
6. rsx: Cumulative fixes
    - Reimplement rsx::buffered_section management routines
    - blit engine subsections are not hit-tested against confirmed/committed memory range
      Not all applications are 'honest' about region bounds, making the real cpu range useless for blit ops
2018-05-23 19:07:08 +03:00
kd-11 f6f45b8699
Native UI refactored (#4623)
Refactor and improve native overlays
2018-05-20 23:05:00 +03:00
scribam 04ad49de4d typos 2018-05-14 21:14:39 +04:00
kd-11 bff6060bd6 rsx: Improve puller state management
- Properly identify puller spin primitives
- Add a small wake delay after exiting a spin delay. Fixes desynchronization
  It seems real hw has a small delay between cell edits to commandbuffer memory at the GET address and the changes becoming visible to the DMA puller
  Simulated with a short busy_wait, large values will improve sync but degrade performance
2018-05-13 14:44:14 +03:00
kd-11 b7979d3f57 rsx/vk: Improvements and minor optimizations
- Improve dirty state tracking affecting program state
- vk: Refactor out transform constants upload into a separate channel to avoid if possible
  transform data uploads are quite expensive
2018-05-13 14:44:14 +03:00
kd-11 440a31ef18 rsx: Optimizations for program management 2018-05-13 14:44:14 +03:00
kd-11 a52ea7f870 rsx: Improve fragment and vertex program usage
- Introduces a gpu program analyser step to examine shader contents before attempting compilation or cache search
  - Avoids detecting shader as being different because of unused textures having state changes
  - Adds better program size detection for vertex programs
- Improved vertex program decompiler
  - Properly support CAL type instructions
  - Support jumping over instructions marked with a termination marker with BRA/CAL class opcodes
  - Fix SRC checks and abort
  - Fix CC register initialization
  - NOTE: Even unused SRC registers have to be valid (usually referencing in.POS)
2018-05-13 14:44:14 +03:00
Jake 75b40931fc rsx: initial capture/replay functionality (#4510)
* rsx: initial capture/replay functionality
2018-05-13 12:18:05 +03:00
kd-11 c5d1f30e82 rsx: Fix performance counters
- Detect jump-to-self type idling
2018-04-25 19:14:36 +03:00
kd-11 a42b00488d rsx: Texture fixes
- gl/vk: Fix subresource copy/blit
- gl/vk: Fix default_component_map reading
- vk: Reimplement cell readback path and improve software channel decoder
- Properly name the subresource layout field - its in blocks not bytes!
- Implement d24s8 upload from memory correctly
- Do not ignore DEPTH_FLOAT textures - they are depth textures and abide by the depth compare rules
- NOTE: Redirection of 16-bit textures is not implemented yet
2018-04-25 19:14:36 +03:00
kd-11 cfd0b8a975 rsx: Fix alphakill 2018-04-05 01:06:50 +03:00
kd-11 93b2776604 rsx: Fix vertex input detection
- Properly detect inline array registers vs constant value registers
- Silence needless spam, 306E is 2D surface engiine, the assumption that y is multiplied by 306E pitch is not crazy
2018-04-05 01:06:50 +03:00
Jake 6d6d6fa827 dx12/vk/gl: implement use of vertex_data_base_index when calculating index 2018-03-30 13:30:04 +03:00
kd-11 7627ad04f1 rsx: Disable gamma control on WZYX textures
- Gamma is seemingly used for (D/X/A)RGB only. Data textures are unaffected
2018-03-29 13:52:11 +03:00
kd-11 321c360dcb rsx: Overhaul rendertarget sampling/shuffles
- Reimplements render target views used for sampling
- Optimizes access using an encoded control token
- Adds proper encoding for 24-bit textures (DRGB8 -> ORGB/OBGR)
- Adds proper encoding for ABGR textures (ABGR8 -> ARGB8)
- Silence some compiler warnings as well
- TODO: Real texture views for OGL current method is a hack
2018-03-25 13:31:06 +03:00
kd-11 9fc1740608 rsx/fp: Fragment program overhaul
- Separate TXB from TXL: They are completely different!
- Properly perform TMU emulation in the fragment shader. Implemens SRGB conversion and alphakill at the moment
- Properly perform ROP emulation in the fragment shader. Implements FRAMEBUFFER_SRGB. While support on the chip looks to be incomplete (and wierd), it does work
- Document some more bits in SHADER_CONTROL register
2018-03-25 13:31:06 +03:00
kd-11 27552891ad rsx/fp: Improvements
- Export some debug information in the free texture register space components zw
  Very useful when analysing renderdoc captures
- Enable shadow comparison on depth as long as compare function is active and texture is uploaded for depth read
  Some engines (UE3) read all the components in the shader and use mul/mad with the result
2018-03-25 13:31:06 +03:00
kd-11 5f047034ae rsx: Disable async count verification to avoid lockup due to zombie reports in ZCULL 2018-03-13 18:55:03 +03:00
kd-11 f00d9a7c7f rssx" Halfplement alpha-to-coverage AA transparency 2018-03-13 18:55:03 +03:00
kd-11 2dce55d036 rsx: ZCULL synchronization fixes
- Track asynchronous operations in RSX core
- Add read barriers to force pending writes to finish.
  Fixes zcull delay flicker in all UE3 titles without forcing hard stall
- Increase zcull latency as all writes should be synchronized now
2018-03-13 18:55:03 +03:00