Commit graph

1777 commits

Author SHA1 Message Date
kd-11 63f803b68a Refactoring
- Make the memory allocator a unique child of the render device.
  Fixes object lifetime issues with swapchain management due to cyclic dependencies
2018-06-08 22:17:50 +03:00
kd-11 c9e367befd rsx/debug: Fix rendering when FIFO reordering is disabled 2018-06-08 22:17:50 +03:00
kd-11 1c5667f0ce vk: Avoid use-after-free fence object 2018-06-08 22:17:50 +03:00
kd-11 e9c3ab7ae6 fix some linux issues
- Fix build
- Fix VMA incompatibility with swapchain_NATIVE
2018-06-08 22:17:50 +03:00
kd-11 1b9c9267f0 rsx: Update memory flags after memory transfer 2018-06-08 22:17:50 +03:00
kd-11 fc18e17ba6 vk: Implement depth scaling using hardware blit/copy engines
- Removes the old depth scaling using an overlay.
  It was never going to work properly due to per-pixel stencil writes being unavailable
- TODO: Preserve stencil buffer during ARGB8->D32S8 shader conversion pass
2018-06-08 22:17:50 +03:00
kd-11 3150619320 rsx: Preserve read AA state separate from write AA state
- Some applications (e.g Backbreaker) use an evil hack to resolve MSAA.
  The application respecifies a formerly AA region as a region with no AA then performs a framebuffer feedback lookup.
  The old memory keeps AA during read, but writes back to itself with AA resolved.
  This is evil on several levels but it just happens to work on PS3
2018-06-08 22:17:50 +03:00
kd-11 0f24379c0e rsx: Obey MSAA resolve during memory persistence transfer
- Ugh. This is a bandaid on a festering wound, AA badly needs a rewrite

 Also silence some warnings
2018-06-08 22:17:50 +03:00
Dravonic 400079a006 Parallel shader cache loading (#4677)
* Parallel shader cache loading
2018-06-01 19:49:29 +03:00
kd-11 87b510d5bf vulkan improvements
- Remove deprecated device layers
- Reimplement overlays resource management using real heap instead of using first_vertex hack
2018-05-31 11:22:40 +03:00
kd-11 9f9e1b5fe0 overlay; Fix leak 2018-05-31 11:22:40 +03:00
kd-11 cf2cb7978b facepalm 2018-05-30 13:30:23 +03:00
kd-11 f543fb0243 vk/gl: Fix flush synchronization to be kinder to weaker CPUs but not harm higher end CPUs 2018-05-30 13:30:23 +03:00
kd-11 6362942928 rsx: Avoid semaphore acquire deadlock 2018-05-30 13:30:23 +03:00
VelocityRa c8d8a81ccd overlays: Performance Overlay 2018-05-30 12:35:41 +03:00
VelocityRa c2e17d04e1 overlays: Font improvements
* Support for using arbitrary firmware fonts
* Support for specifying font extension in `font` constructor (useful for most firmware fonts that use .TTF)
2018-05-30 12:35:41 +03:00
VelocityRa 33b01d9306 overlays: Allow for non-interactable UI components
* Also fix a few warnings in overlay_controls
2018-05-30 12:35:41 +03:00
eladash 23b380eb41 allow deallocations to unmap rsx mapped memory 2018-05-29 19:57:28 +03:00
kd-11 83f9be2524 rsx: Promote FIFO optimizations outside of strict mode
- The benefits of FIFO optimizations are huge in some cases.
  The optimizations also do not break any tested applications so no need to disable with strict mode
- A debug option is provided to disable this behaviour for testing
2018-05-29 13:54:30 +03:00
kd-11 2adb2ebb00 overlays: Avoid race condition on remove-on-update views
- Improves cleanup code to consist of 2 parts, remove then dispose. Remove
  does not deallocate the item until dispose is called on it, allowing the
  backends to first deallocate external references.
- Caller is responsible for managing list locking and tracking disposable list
  of items when external references have been cleaned up before using
  dispose method.
2018-05-29 13:54:30 +03:00
kd-11 0fc67aa2f6 gl: fix wcb regression
- Partial framebuffers and blit targets are possible!
2018-05-24 10:36:04 +03:00
kd-11 493d4e8613 fixup - Improve invalidated region checks for performance 2018-05-24 10:36:04 +03:00
kd-11 b030d1900c rsx: Fixup - fix broken memory protection fail caused by region respec
- Some applications will alternate memory between framebuffer and texture data
2018-05-24 10:36:04 +03:00
kd-11 f38f61d110 vk: Clean up memory allocation and fix GPUOpen VMA for Radeon 2018-05-23 19:07:08 +03:00
kd-11 92b5a705d8 fixup - locking 2018-05-23 19:07:08 +03:00
kd-11 b957eac6e8 rsx: Avoid calling any blocking callbacks from threads that are not rsx::thread
- Defers on_notity_memory_unmapped to only run from within rsx context
- Avoids passive_lock + writer_lock deadlock
2018-05-23 19:07:08 +03:00
kd-11 d2bf04796f Optimized cached write-through
- Allows grabbing an unsynchronized memory block if overwriting contents
anyway
- Allows flushing only specified range of memory
2018-05-23 19:07:08 +03:00
kd-11 f8d999b384 fixup - range check 2018-05-23 19:07:08 +03:00
kd-11 fbf6581249 rsx: Fix segmented memory access for rsx::super_ptr 2018-05-23 19:07:08 +03:00
kd-11 d283200e13 vk: Do not do extension test if in a fast context (enum only) 2018-05-23 19:07:08 +03:00
kd-11 3f14bc6961 rsx: Silence some meaningless error 2018-05-23 19:07:08 +03:00
kd-11 f2a3167193 rsx: Lower format compatibility severity since it confuses some people 2018-05-23 19:07:08 +03:00
kd-11 8fcd5c1e5a rsx: Texture cache fixes
1. rsx: Rework section synchronization using the new memory mirrors
2. rsx: Tweaks
    - Simplify peeking into the current rsx::thread instance.
      Use a simple rsx::get_current_renderer instead of asking fxm for the same
    - Fix global rsx super memory shm block management
3. rsx: Improve memory validation. test_framebuffer() and
tag_framebuffer() are simplified due to mirror support
4. rsx: Only write back confirmed memory range to avoid overapproximation errors in blit engine
5. rsx: Explicitly mark clobbered flushable sections as dirty to have them
removed
6. rsx: Cumulative fixes
    - Reimplement rsx::buffered_section management routines
    - blit engine subsections are not hit-tested against confirmed/committed memory range
      Not all applications are 'honest' about region bounds, making the real cpu range useless for blit ops
2018-05-23 19:07:08 +03:00
pauls-gh f8a0be8c3e Performance enhancement - Vulkan memory allocator (#4635)
* Incorporates the vulkan memory allocator from the AMD GPUOpen project
2018-05-23 17:02:35 +03:00
scribam 2270b8d15c vulkan: link with vulkan-1.lib instead of VKstatic.1.lib 2018-05-23 13:54:27 +03:00
Nekotekina 72574b11ff SPU: use reservation spinlocks on writes (non-TSX)
This should decrease contention by avoiding global lock
2018-05-21 21:56:14 +03:00
kd-11 c9669818eb Facepalm
- overlays: Do not free self handle!!!!
2018-05-21 15:55:25 +03:00
kd-11 f6f45b8699
Native UI refactored (#4623)
Refactor and improve native overlays
2018-05-20 23:05:00 +03:00
Nekotekina 367f039523 Build transactions at runtime
Drop _xbegin family intrinsics due to bad codegen
Implemented `notifier` class, replacing vm::notify
Minor optimization: detach transactions from global mutex on TSX path
Minor optimization: don't acquire vm::passive_lock on PPU on TSX path
2018-05-16 17:31:58 +03:00
scribam 04ad49de4d typos 2018-05-14 21:14:39 +04:00
kd-11 4836a03a7d rsx: Fix build 2018-05-13 14:44:14 +03:00
kd-11 9d1f4a2538 vk: Include RADV POLARIS and RADV VEGA in the primitive restart
blacklist
2018-05-13 14:44:14 +03:00
kd-11 bff6060bd6 rsx: Improve puller state management
- Properly identify puller spin primitives
- Add a small wake delay after exiting a spin delay. Fixes desynchronization
  It seems real hw has a small delay between cell edits to commandbuffer memory at the GET address and the changes becoming visible to the DMA puller
  Simulated with a short busy_wait, large values will improve sync but degrade performance
2018-05-13 14:44:14 +03:00
kd-11 1aa44ede31 gl: Improve AMD multidraw workaround
- Reimplements the AMD workaround using an identity buffer to avoid the performance hit of doing multiple glDrawArrays for every single compiled set
- Reimplements first/count allocation using a scratch buffer to reduce allocation overhead when large number of draw calls is used
2018-05-13 14:44:14 +03:00
kd-11 eccb57d4b8 vk: AMD primitive restart bug workaround
- Emulate primitive restart with degenerate triangles
2018-05-13 14:44:14 +03:00
kd-11 b7979d3f57 rsx/vk: Improvements and minor optimizations
- Improve dirty state tracking affecting program state
- vk: Refactor out transform constants upload into a separate channel to avoid if possible
  transform data uploads are quite expensive
2018-05-13 14:44:14 +03:00
kd-11 440a31ef18 rsx: Optimizations for program management 2018-05-13 14:44:14 +03:00
kd-11 a52ea7f870 rsx: Improve fragment and vertex program usage
- Introduces a gpu program analyser step to examine shader contents before attempting compilation or cache search
  - Avoids detecting shader as being different because of unused textures having state changes
  - Adds better program size detection for vertex programs
- Improved vertex program decompiler
  - Properly support CAL type instructions
  - Support jumping over instructions marked with a termination marker with BRA/CAL class opcodes
  - Fix SRC checks and abort
  - Fix CC register initialization
  - NOTE: Even unused SRC registers have to be valid (usually referencing in.POS)
2018-05-13 14:44:14 +03:00
Jake 75b40931fc rsx: initial capture/replay functionality (#4510)
* rsx: initial capture/replay functionality
2018-05-13 12:18:05 +03:00
Nekotekina 5d15d64ec8 Memory mirror support
Implemented utils::memory_release (not used)
Implemented utils::shm class (handler for shared memory)
Improved sys_mmapper syscalls
Rewritten ppu_patch function
Implemented vm::get_super_ptr (ignores memory protection)
Minimal allocation alignment increased to 0x10000
2018-05-09 23:35:34 +03:00