Commit graph

49 commits

Author SHA1 Message Date
DH cccfb89aa0 [Config] Use std::less<> for std::map<...>
Reduces amount of string copies
[Utilities] fmt::replace_all: avoid creation of temporary strings
2021-12-02 21:36:57 +03:00
kd-11 4752c4014b vk: Implement basic descriptor updates batching 2021-09-28 17:43:15 +03:00
kd-11 24642a4c18 vk: Refactor descriptors a bit 2021-09-28 17:43:15 +03:00
kd-11 69b34693f0 vk: Simplify compute job cleanup on exit
- Just call destroy automatically on object destruct
2021-08-06 17:18:48 +03:00
Megamouse cbd895a29c
Move code to cpp (#9938)
* GL: move GLOverlays code to cpp
* GL: move GLCompute code to cpp
* VK: move VKOverlays code to cpp
* VK: move VKCompute code to cpp
2021-03-10 00:58:08 +01:00
kd-11 c2cbc62be6 vk: Refactor some uber-headers
- VKHelpers was the rug everything was swept under for a long time.
  This commit essentially deprecates its usage across most of the backend.
2021-01-10 12:04:31 +03:00
Nekotekina a8e0d261b7 types.hpp: more cleanup
Also fix compilation.
2020-12-22 19:08:09 +03:00
Nekotekina eec11bfba9 Move align helpers to util/asm.hpp
Also add some files:
GLTextureCache.cpp
VKTextureCache.cpp
2020-12-18 18:07:42 +03:00
Nekotekina 77352a2a86 Replace uint32_t with u32 2020-12-18 12:23:53 +03:00
Nekotekina 36c8654fb8 Remove HERE macro
Some cleanup.
Add location to some functions.
2020-12-10 12:30:22 +03:00
Nekotekina e055d16b2c Replace verify() with ensure() with auto src location.
Expression ensure(x) returns x.
Using comma operator removed.
2020-12-09 15:43:38 +03:00
RipleyTom af8c661a64 Remove BOM markers 2020-12-06 15:30:12 +03:00
kd-11 3ddfa288cf rsx: Use multithreaded shader compiler backend 2020-11-21 20:43:15 +03:00
Megamouse 2cee26c3e7 Cleanup some includes 2020-10-31 11:53:46 +01:00
kd-11 85dd1b4ea9 vk: Fix fconvert job issues
- Fix compilation bug caused by typo
- Invert to/from for consistent declarations
- Fix dst_swap when From == 2
2020-09-07 18:25:54 +03:00
kd-11 af9e217fa4 vk: Improve D16F handling
- Adds upload and download routines. Mostly untested, which is why the error message exists
2020-08-30 09:26:37 +03:00
kd-11 f6c6c04648 vk: Implement transport for D24S8_FLOAT data 2020-08-27 12:52:28 +03:00
kd-11 b4bf48c33b vk: Integrate shader interpreter 2020-04-30 15:02:59 +03:00
kd-11 d25ba03e82 vk: Lazy evaluate renderpass scope
- Spamming the driver with renderpass open/close cycles is bad for performance.
2020-03-15 18:39:40 +03:00
Nekotekina 15391f45d0 Modernize RSX logging (rsx_log variable) 2020-02-01 11:52:22 +03:00
kd-11 74ad525566 vk: Fixup for cs_scatter job
- Access to the stencil output has to be atomic as each 'word' is shared among 4 adjacent texels
- TODO: Can be optimized using mirrored buffer views
2020-01-15 21:12:51 +03:00
kd-11 2984300385 vk: Fix invocation alignment to support non-power-of-2 alignment 2020-01-15 15:42:36 +03:00
kd-11 ac4cadf538 vk: Fix word index counting for shuffle tasks 2020-01-15 15:42:36 +03:00
kd-11 e1b734fd12 rsx: Fix linux build 2019-12-29 13:49:46 +03:00
kd-11 93895838c7 vk: Implement hw conditional rendering 2019-12-29 13:49:46 +03:00
Eladash db4041e079 Implement rounded_div
Round-to-nearest integral based division, optimized for unsigned integral.
Used in sceNpTrophyGetGameProgress.
Do not allow signed values for aligned_div(), align().
2019-12-20 14:47:04 +03:00
Nekotekina 377e7d2a73 C-style cast cleanup VI 2019-12-04 17:56:22 +03:00
kd-11 508ffcb775 vk: Compute kernel fixups
- Adhere to workgroup count limits as exposed by the GPU vendor.
  They already execute properly even when going beyond the limits but this removes validation noise.
- Fix invocation counts for deswizzle kernel. The count was incorrect if blocksize was not 4, causing a bunch of useless work to be done.
2019-11-05 22:07:22 +03:00
kd-11 99d71fdc2a vk: Implement layer batching for the GPU swizzle decoder
- Handles all LODs per layer meaning cubemaps are now fully handled in 6 passes instead of 6 * (log2(width)) passes.
- Handles all LODs of a 3D texture in one pass as well.
- The improvements do warrant dropping down the number of allowed compute invocations a bit
2019-11-05 22:07:22 +03:00
kd-11 7a0b94f343 vk: Minor compute optimizations
- Remove use of uniform buffers for compute static data. Use push
constants instead.
- Minor touchups to the deswizzle code to avoid redundant data copies.
2019-11-05 22:07:22 +03:00
kd-11 1266b63135 vk: Enable gpu deswizzling 2019-11-05 22:07:22 +03:00
kd-11 9cd3530c98 rsx: Set up framework for hw deswizzle 2019-11-05 22:07:22 +03:00
kd-11 858014b718 rsx: Experiments with nul sink 2019-09-12 23:32:21 +03:00
kd-11 61af2b7dfc vk: Workgroup tuning for different vendors 2019-08-30 21:46:19 +03:00
JohnHolmesII a124ec4a26 Remove braces around shader source strings (warnings) 2019-06-28 01:45:29 +03:00
scribam 185fd3d257 rsx: Minor cleanup after #6055 2019-06-17 00:31:38 +03:00
kd-11 4a5bbba277 rsx: Enable MSAA
- vk: Enable depth buffer resolve+unresolve
- vk: Add AMD stenciling extension support
- rsx: Temporarily disables MSAA-compatible hacks such as transparency AA
- TODO: Add paths to optionally disable MSAA
2019-06-14 16:19:52 +03:00
scribam c4667133c4 gl/vk: Add constexpr to varying_registers and sync functions between the two backends 2019-06-12 10:59:31 +01:00
kd-11 370b9e196d vk: Improve descriptor pool management
- Add double-buffered descriptor pools to avoid use-after-free situations
- Make descriptor pools more configurable
- Also adds in a hack to allow renderdoc to capture properly
2019-05-22 01:18:46 +03:00
kd-11 1c439f6198 vk: Fix some spec violations 2019-05-16 19:25:26 +03:00
kd-11 2bec304cca vk: Allow some drivers to bypass window polling if not needed 2019-05-05 13:37:55 +03:00
kd-11 0f7af391d7 vk: Implement copy-to-buffer and copy-from-buffer for depth_stencil
formats
- Allows D24S8 and D32S8 transport via typeless channels
- Allows uploading and downloading D24S8 data easily
- TODO: Implement optional byteswapping to fix flushed readbacks with
the same method
2019-04-09 13:40:54 +03:00
kd-11 6d932b042b vk: bump max number of compute jobs from 120 to 1024
- It is possible without bugs to have a very high number of compute invocations.
2019-01-06 10:44:40 +03:00
kd-11 42851a93d4 vk: Fixup 2018-06-26 20:07:20 +03:00
kd-11 df2137781d vk: Strip 'stencil' MSB when writing d24x8 data
- Seems to contains garbage in MSB when DEPTH aspect is read back
- TODO: Implement custom depth and stencil readback routine
2018-06-26 20:07:20 +03:00
kd-11 bda65f93a6 vk: Tuning [WIP]
- Unroll main compute queue loop
- Do NOT run GPU cores on mappable memory! This has a dreadful impact on performance for obvious reasons
- Enable dynamic SSBO indexing (affects AMD)
- Make loop unrolling and loop length variable depending on hardware and find optimum
2018-06-26 20:07:20 +03:00
kd-11 5fb4009a07 vk; Add more compute routines to handle texture format conversions
- Implement le D24x8 to le D32 upload routine
- Implement endianness swapping and depth format conversions routines (readback)
2018-06-26 20:07:20 +03:00
kd-11 c60f7b89ba vk: Implement safe typeless transfer
- Used to transfer D32S8 data where it makes sense to use this variant
 - On nvidia cards, it is very slow to move aspects from D24S8 probably due to the format being faked.
   For this reason, the unsafe variant is used for both D16 and D24S8 to avoid the heavy performance loss
2018-06-18 17:32:22 +03:00
kd-11 2afcf369ec vk: Add synchronous compute pipelines
- Compute is now used to assist in some parts of blit operations, since there are no format conversions with vulkan like OGL does
- TODO: Integrate this into all types of GPU memory conversion operations instead of downloading to CPU then converting
2018-06-18 17:32:22 +03:00