Commit graph

1322 commits

Author SHA1 Message Date
kd-11 bc7ed8eaab rsx/vk: Rework MSAA implementation 2022-03-17 22:02:20 +03:00
kd-11 1943d9819f rsx: Clean up surface cache routines around RTT invalidate 2022-03-10 20:43:58 +03:00
kd-11 cfecbb24ca rsx: Avoid calling slow functions every draw call
- Use TSC for timing where interval duration matters.
- Use atomic counter for ordering timestamps otherwise.
2022-03-08 22:06:26 +03:00
kd-11 0df903090d rsx: Optimize metrics a bit
- For some reason this has a massive impact on performance above some arbitrary threshold of calls
  Shows up under surface_cache::get_merged_memory_region when doing gathers.
2022-03-08 22:06:26 +03:00
kd-11 6812fa4764 rsx: Fix surface write coherency when MSAA is active 2022-03-08 22:06:26 +03:00
kd-11 0dbfe314a3 vk: Encode image type when caching resources 2022-03-01 21:51:55 +03:00
kd-11 00a1864a95 Revert "rsx: Downgrade depth-1 3D images to 2D (#11593)"
This reverts commit 6c096b72b5.
2022-03-01 21:51:55 +03:00
kd-11 6c096b72b5 rsx: Downgrade depth-1 3D images to 2D (#11593)
- Fixes problems with implicit view types derived from dimensions.
2022-03-01 10:45:50 +03:00
kd-11 e035000864 vk: Do not enable passthrough DMA unconditionally (yet)
- There are still some kinks to work out. Host labels do not fix all the bugs which means I missed something.
2022-02-26 10:28:46 +03:00
kd-11 f3823232e0 Disable passthrough DMA for proprietary intel driver 2022-02-23 21:15:08 +03:00
kd-11 6b8b23c401 vk: Drain the label queue before using the CPU fallback to avoid out-of-order signals
- This avoids crashes in some game engines which expect RSX semaphores to signal in the order they are submitted.
2022-02-23 12:57:04 +03:00
kd-11 6fd2a9b677 rsx: Remove leftover dprints 2022-02-23 12:57:04 +03:00
kd-11 da559b5568 vk/rsx: Tuning and optimization for host labels 2022-02-23 12:57:04 +03:00
kd-11 c7e49b58a8 rsx: Implement host GPU sync labels 2022-02-23 12:57:04 +03:00
kd-11 10e6b43a2f Drop redundant declaration 2022-02-21 23:58:01 +03:00
kd-11 0809e7cf9f Fix build 2022-02-21 23:58:01 +03:00
kd-11 12fd43e1c6 vk: Remove unused variables 2022-02-21 23:58:01 +03:00
kd-11 397a795e75 vk: Remove hardcoded command buffer list length 2022-02-21 23:58:01 +03:00
kd-11 1f9ade0ab6 vk: Remove pointless function (VKGSRender::open_command_buffer)
A relic of the past, back before we wrote wrappers for raw handles.
2022-02-21 23:58:01 +03:00
kd-11 83407c386c vk: Move renderer types to a separate file
- Makes my life easier managing conflicts
2022-02-21 23:58:01 +03:00
kd-11 b791d90b35 vk: Rewrite command buffer chains 2022-02-21 23:58:01 +03:00
nastys 7801e8368b Add MoltenVK Semaphore setting 2022-02-20 08:47:16 +01:00
kd-11 254ddcad51 vk/dma: Initialize COW DMA block contents to avoid leaks
- It is possible to lose data when uploading since the result of map_dma can change types and handles.
- Consider sync-on-exit for inherited spans

Not a problem when using passthrough DMA, but this extension does not work properly on NVIDIA + Windows
2022-02-16 16:33:27 +03:00
kd-11 314b63eebf vk: Drop unused native format ABGR8 2022-02-13 15:31:39 +03:00
kd-11 df5295ae85 vk: Per work-queue scratch resources
- Avoids parallel tasks from trampling over each other's data
2022-02-13 14:39:42 +03:00
kd-11 c8ad8b18bb vk: Ignore queue transfer stuff when using 'fast' mode 2022-02-13 14:39:42 +03:00
kd-11 44cc254620 Fix linux build 2022-02-13 14:39:42 +03:00
kd-11 cef512a123 vk: Spec-compliant async compute 2022-02-13 14:39:42 +03:00
kd-11 f667b52cca vk: Rewrite resource management 2022-02-10 22:20:56 +03:00
kd-11 48b54131f6 vk: Fix up multiple resource allocation routines
- Originally part of async bringup. Imported to allow smoother transition.
2022-02-10 22:20:56 +03:00
kd-11 2d9f21a2ea rsx: Lower performance warnings to 'warn' level instead of 'error' level to avoid causing panic for users 2022-02-07 09:25:01 +03:00
kd-11 90d368ae30 vk: Speed up cached image search a bit 2022-02-06 15:49:50 +03:00
kd-11 a2d33a7d76 vk: Fix WCB crash 2022-02-06 15:49:50 +03:00
kd-11 51f9310b9f vk: Silence compiler warnings 2022-02-06 15:49:50 +03:00
kd-11 dca3d477c9 vk: Use image hot-cache for faster allocation times
- Creating new images is expensive.
- We can keep around a set of images that have been recently discarded and use them instead of creating new ones from scratch each time.
2022-02-06 15:49:50 +03:00
kd-11 86919ec0e1 rsx: Validate requested images before attempting to upload them
- Do not allow dimensions of 0 to reach the backend APIs
2022-01-30 14:58:51 +03:00
kd-11 0e320d17c1 vk: Fix 'grow' behavior when we reach the size limit
- Just swap out the current heap ptr and spawn a fresh one. Chances are, we can spare 1GB of host memory.
2022-01-30 10:56:15 +03:00
kd-11 d063f0b335 vk: Fix working buffer calculation for emulated D16F operations 2022-01-30 10:56:15 +03:00
kd-11 ffe00e8619 gl: Clean up format bitcast checks and register D32F type for FORMAT_CLASS16F
- Also hides a dangerous export for vulkan, same as GL
2022-01-26 12:08:36 +03:00
kd-11 3a1676e558 vk: Fix float16 requirement issue 2022-01-25 21:34:21 +03:00
kd-11 1fa82eec89 vk: Rework format feature validation
- Requirements have changed a lot over the years. We no longer blit Z formats around for example because they never support linear filtering
- Removing some unused requirements allows more hardware to be usable
2022-01-24 19:14:27 +03:00
kd-11 6ffd38c393 vk: Only enable DCC workaround if the format features allow it 2022-01-22 13:16:48 +03:00
nastys c7140df5f8 Initial support for Apple GPUs 2022-01-22 00:17:17 +01:00
nastys 6b5f0957ce Disable macOS swizzling workaround 2022-01-22 00:17:17 +01:00
kd-11 3942a464fe vk: Avoid leaking descriptor copies 2022-01-20 19:21:24 +03:00
kd-11 2331dc3256 vk: Keep the total number of allocated samplers under control 2022-01-20 19:21:24 +03:00
kd-11 000ec71629 Fix invalid descriptor setup if subdraw0 has broken vertex setup 2022-01-17 12:38:10 +03:00
kd-11 c38ca21a81 rsx: Round up 8-bit ROP output on NVIDIA cards
- NV GPUs have a tendency to be off by a very small margin, breaking rendering when greaterThan/lessThan checks are used.
- NOTE: Currently this setting is using the sRGB flag which indicates 8-bit output.
  Only one game is currently known to care about this behaviour so this is good enough for now.
2022-01-17 10:28:23 +03:00
kd-11 f923eaf09a rsx: Surface format remapping enhancements 2022-01-17 10:28:23 +03:00
Nekotekina 580bd2b25e Initial Linux Aarch64 support
* Update asmjit dependency (aarch64 branch)
* Disable USE_DISCORD_RPC by default
* Dump some JIT objects in rpcs3 cache dir
* Add SIGILL handler for all platforms
* Fix resetting zeroing denormals in thread pool
* Refactor most v128:: utils into global gv_** functions
* Refactor PPU interpreter (incomplete), remove "precise"
* - Instruction specializations with multiple accuracy flags
* - Adjust calling convention for speed
* - Removed precise/fast setting, replaced with static
* - Started refactoring interpreters for building at runtime JIT
*   (I got tired of poor compiler optimizations)
* - Expose some accuracy settings (SAT, NJ, VNAN, FPCC)
* - Add exec_bytes PPU thread variable (akin to cycle count)
* PPU LLVM: fix VCTUXS+VCTSXS instruction NaN results
* SPU interpreter: remove "precise" for now (extremely non-portable)
* - As with PPU, settings changed to static/dynamic for interpreters.
* - Precise options will be implemented later
* Fix termination after fatal error dialog
2022-01-15 06:48:04 +03:00