Commit graph

120 commits

Author SHA1 Message Date
Nekotekina 367f039523 Build transactions at runtime
Drop _xbegin family intrinsics due to bad codegen
Implemented `notifier` class, replacing vm::notify
Minor optimization: detach transactions from global mutex on TSX path
Minor optimization: don't acquire vm::passive_lock on PPU on TSX path
2018-05-16 17:31:58 +03:00
scribam 04ad49de4d typos 2018-05-14 21:14:39 +04:00
kd-11 b7979d3f57 rsx/vk: Improvements and minor optimizations
- Improve dirty state tracking affecting program state
- vk: Refactor out transform constants upload into a separate channel to avoid if possible
  transform data uploads are quite expensive
2018-05-13 14:44:14 +03:00
kd-11 440a31ef18 rsx: Optimizations for program management 2018-05-13 14:44:14 +03:00
Jake 75b40931fc rsx: initial capture/replay functionality (#4510)
* rsx: initial capture/replay functionality
2018-05-13 12:18:05 +03:00
kd-11 93b2776604 rsx: Fix vertex input detection
- Properly detect inline array registers vs constant value registers
- Silence needless spam, 306E is 2D surface engiine, the assumption that y is multiplied by 306E pitch is not crazy
2018-04-05 01:06:50 +03:00
kd-11 2dce55d036 rsx: ZCULL synchronization fixes
- Track asynchronous operations in RSX core
- Add read barriers to force pending writes to finish.
  Fixes zcull delay flicker in all UE3 titles without forcing hard stall
- Increase zcull latency as all writes should be synchronized now
2018-03-13 18:55:03 +03:00
kd-11 315798b1f4 rsx: ZCULL rewrite and other improvements
- ZCULL unit emulation rewritten
- ZCULL reports are now deferred avoiding pipeline stalls
- Minor optimizations; replaced std::mutex with shared_mutex where contention is rare
- Silence unnecessary error message
- Small improvement to out of memory handling for vulkan and slightly bump vertex buffer heap
2018-03-13 18:55:03 +03:00
kd-11 dece1e01f4 rsx: Improve transform constants management
- Removes the duplicate local_transform_constants
- Resets the transform constants on every context reset
- Simplifies the code abit which should make it faster
- NOTE: Transform constants are persistent across context re-init events (VF5)
2018-03-13 18:55:03 +03:00
kd-11 e230867492 rsx: Properly implement raster window offsets 2018-03-13 18:55:03 +03:00
kd-11 84b8a08d26 rsx: Basic performance counters 2018-03-13 18:55:03 +03:00
kd-11 a8ab408f64 rsx: Account for null blit ops (memcpy)
- Do not perform extra memory tasks if no actual image copy was performed
2018-02-16 16:14:54 +03:00
kd-11 bd297d079d rsx: Minor optimizations 2018-02-16 16:14:54 +03:00
kd-11 a5500ebfa4 rsx: Fix disjoint draw range splitting
- Fixes flickering and missing draws in R&C and other games such as Motorstorm Apocalypse and Okami HD when strict mode is disabled
2018-02-16 16:14:54 +03:00
kd-11 89c548b5d3 rsx: fbo fixes 2.5
- Implement flush-always behaviour to partially fix readback from a currently bound fbo
  - Without this, only the first read is correct, as more draws are added the results become 'wrong'
  - Fixes WCB and cpublit behviour
- Synchronize blit_dst surfaces to avoid data loss when gpu texture scaling is used
  - Its still faster in such cases to disable gpu texture scaling but some types cannot be disabled without force cpu blit (e.g framebuffer transfers)
- Memory management tuning
  - rsx: on-demand texture cache rescanning for unprotected sections
  - rsx: Only framebuffer resources are upscaled
  - Do not resize regular blit engine resources
  - Lazy initialize readback buffer when using opengl
  -- These measures should help minimize vram usage
2018-02-16 16:14:54 +03:00
Nekotekina cce0ad0c35 Clean vm::ps3 namespace use 2018-02-09 17:49:37 +03:00
Jake 7ca2c444cc rsx: Fix depth clipping 2018-01-14 20:50:55 +03:00
Jake ac53fc54dc rsx: fix image_in arg and swizzle fix 2018-01-14 20:50:55 +03:00
kd-11 ee009ec99c rsx: Robustness fixes
- Track last working state and reset to it if RSX starts to desync
-- This is especially useful when running vulkan since the renderer will easily outpace the rest of the system when merely recording draw commands
- Ignore empty sets
-- Mark empty/invalid IB sets as having 0 element counts.
2018-01-02 21:17:56 +03:00
kd-11 55c324e062 rsx: Invalidate surface configuration if stencil state is changed
- Stencil state afects validity of a depth/stencil surface same as depth state
2017-12-31 12:43:40 +03:00
kd-11 4819847c46 rsx: Modify semaphore_acquire timeout detection
- Take paused state into account
- Make timeout configurable
2017-12-22 20:08:14 +03:00
kd-11 de5dab35e0 rsx: Raise semaphore timeout duration bacause some games are very slow 2017-12-18 10:45:37 +03:00
kd-11 ff0f1510e5 rsx: Minor fixes
- Abort nv406e semaphore acquire if the rsx thread stalls/crashes
- Fix texture size approximation to take mipmaps into account. Fixes some games hanging with WCB
2017-12-18 10:45:37 +03:00
Jake d0013679c0 rsx: fix image_in swizzled texture crash 2017-12-08 15:19:17 +04:00
kd-11 970d2a06e0 rsx: Properly fix DATA3F_M register alignment 2017-12-04 18:22:18 +03:00
kd-11 17340c44cc rsx: method register fixes
- Fix VERTEX_DATA_3F_M element alignment (its 16 bytes per attribute)
- Fix DATA_2S_X interpretation type. Its signed 16-bit unnormalized (s32k) and not signed normalized (s1)
2017-12-01 21:00:50 +03:00
kd-11 da1e97618b rsx: Changes to surface pitch handling
- Zeta pitch is ignored by real HW for some reason
- Monitor ptch value changes as well since they may affect disabled surfaces
- TODO: Verify if MRT pitch is really taken into consideration
2017-12-01 21:00:50 +03:00
kd-11 63f261a66d rsx: Improve framebuffer check heuristics for contested memory buffers 2017-12-01 21:00:50 +03:00
Nekotekina d366823949 RSX: fix fix (406E semaphore release) 2017-11-27 23:15:28 +03:00
Nekotekina 1344f15efd RSX: improve nv406e::semaphore_release 2017-11-26 09:02:37 +03:00
kd-11 6d2dcbd164 rsx: Enable hw blit engine for local->main memory blit operations as well 2017-11-20 15:18:57 +03:00
Nekotekina 9ef00b4a12 RSX: Rewrite frame limit 2017-11-15 21:00:02 +03:00
kd-11 b2a7eee1ec rsx: Bump shader cache ver and fix blit engine crash
- Disables blit operations if the target will have a size of 0 in any dimension
- Bumps shader cache ver to 1.1
2017-11-09 14:39:50 +03:00
kd-11 173d05b54f rsx: Optimizations
- Reimplement fragment program fetch and rewrite texture upload mechanism
-- All of these steps should only be done at most once per draw call
-- Eliminates continously checking the surface store for overlapping addresses as well

addenda - critical fixes
- gl: Bind TIU before starting texture operations as they will affect the currently bound texture
- vk: Reuse sampler objects if possible
- rsx: Support for depth resampling for depth textures obtained via blit engine

vk/rsx: Minor fixes
- Fix accidental imageview dereference when using WCB if texture memory occupies FB memory
- Invalidate dirty framebuffers (strict mode only)
- Normalize line endings because VS is dumb
2017-11-08 13:15:34 +03:00
kd-11 e4ef85b6e0 rsx: Experimental; Try to calculate pixel offset using nv3062 pitch register since we know the target block is defined with 3062 registers 2017-11-02 14:35:19 +03:00
kd-11 479aa91368 rsx: Add a debug option to force full software emulation of blit engine 2017-10-14 14:19:14 +03:00
kd-11 3836b40bf7 rsx: Fixups 2017-09-21 16:17:06 +03:00
kd-11 571dbfb7b1 rsx: Texture cache improvements
- Limits buffer size to min 720 in the Y axis (1024 section causes conflicts in some cases - TODO)
rsx: Fixups to allow large textures for blit operation
- Also includes checks for both leaking sections and blit regions for vulkan
hotfix for hanging when using WCB
addendum - unlock both ro and no blocks before attempting to copy memory blocks
gl: Fixups for ARB_explicit_uniform_location
- Forces glsl v 430 to make use of the extension
rsx/vk: Rework texture cache to minimize recursive access violations
- Also modifies the vulkan commandbuffer begin/end/submit mechanism
gl: Fix cached_texture_section::is_flushable to take memory protection into account
rsx: Fix blit dst offset calculation
2017-09-21 16:17:06 +03:00
kd-11 e37a2a8f7d rsx: Texture cache fixes and improvments
gl/vk/rsx: Refactoring; unify texture cache code
gl: Fixups
- Removes rsx::gl::texture class and leave gl::texture intact
- Simplify texture create and upload mechanisms
- Re-enable texture uploads with the new texture cache mechanism
rsx: texture cache - check if bit region fits into dst texture before attempting to copy
gl/vk: Cleanup
- Set initial texture layout to DST_OPTIMAL since it has no data in it anyway at the start
- Move structs outside of classes to avoid clutter
2017-09-21 16:17:06 +03:00
kd-11 061824a7ec rsx: Add support for batched multidraw
gl: Fix multidraw [WIP]
rsx: Ignore vertex base when data source is generated using arithmetic
vk: Check pending flag before doing fence poke
vk/gl: Fix for inlined array and immediate draws
rsx: Collapse joined draws when batching
2017-09-21 16:17:06 +03:00
kd-11 2d0f1f27a8 esx: Fixes to the texture cache
rsx: Blit engine improvements
- Always handle blits to and from framebuffers through the GPU
- Handle depth surfaces properly when using GL
- Check for format mismatches when blitting to the surface store [WIP]
2017-09-21 16:17:06 +03:00
kd-11 73312fc363 rsx: Several fixes and improvements
- Do not ignore non-centered pixel blitting
- Register method ac00+16
- Bump texture memory heap to account for GPU texture scaling requirements (vulkan)
- Explicit MRT location index output to better convey intent (openGL)
2017-09-21 16:17:06 +03:00
kd-11 2033f3f7dc rsx/vk/gl: Refactoring and reimplementation of blit engine
Fix rsx offscreen-render-to-display-buffer-blit surface reads
- Also, properly scale display output height if reading from compressed tile

gl: Fix broken dst height computation
- The extra padding is only there to force power-of-2 sizes and isnt used

gl: Ignore compression scaling if output is rendered to in a renderpass

rsx/gl/vk: Cleanup for GPU texture scaling. Initial impl [WIP]
- TODO: Refactor more shared code into RSX/common
2017-09-21 16:17:06 +03:00
kd-11 8358bda133 gl/rsx: Fixes to zcull pixel counting 2017-08-26 21:53:54 +03:00
Jake 7ecf6cb014 rsx: Ignore sending system reserved semaphores to renderer 2017-08-19 12:27:53 +03:00
kd-11 d54c2dd39a gl: Move vertex processing to the GPU
- Significant gains from greatly reduced CPU work
- Also reorders command submission in end() to improve throughput

- Refactors most of the vertex buffer handling
- All vertex processing is moved GPU side
2017-08-16 23:58:30 +03:00
kd-11 4c019c55d2 rsx/gl: Fix zcull queries and log conditional render modes
- Fixes a situation where a query readback is requested while zcull render is still active
2017-08-10 00:16:20 +03:00
kd-11 fcb7072fee Implement hardware zcull emulation
rsx/gl: Support s1 immediate values; ogl minor refactoring
2017-08-06 14:29:42 +03:00
RipleyTom 2d7e91ba8a Yield instead of sleeping rsx thread. (#3158)
Another Yield
2017-08-06 01:46:01 +01:00
Jake 21dd715b42 sys_rsx: implement support for lle-gcm 2017-08-02 01:33:12 +03:00