Commit graph

882 commits

Author SHA1 Message Date
Nekotekina 4704367382 Remove unnecessary asmjit::imm_ptr 2022-01-18 00:10:32 +03:00
Nekotekina 14cca55b50 PPU: refactor vector rounding instructions
Fix: nearbyint -> roundeven
2022-01-18 00:10:32 +03:00
Nekotekina 580bd2b25e Initial Linux Aarch64 support
* Update asmjit dependency (aarch64 branch)
* Disable USE_DISCORD_RPC by default
* Dump some JIT objects in rpcs3 cache dir
* Add SIGILL handler for all platforms
* Fix resetting zeroing denormals in thread pool
* Refactor most v128:: utils into global gv_** functions
* Refactor PPU interpreter (incomplete), remove "precise"
* - Instruction specializations with multiple accuracy flags
* - Adjust calling convention for speed
* - Removed precise/fast setting, replaced with static
* - Started refactoring interpreters for building at runtime JIT
*   (I got tired of poor compiler optimizations)
* - Expose some accuracy settings (SAT, NJ, VNAN, FPCC)
* - Add exec_bytes PPU thread variable (akin to cycle count)
* PPU LLVM: fix VCTUXS+VCTSXS instruction NaN results
* SPU interpreter: remove "precise" for now (extremely non-portable)
* - As with PPU, settings changed to static/dynamic for interpreters.
* - Precise options will be implemented later
* Fix termination after fatal error dialog
2022-01-15 06:48:04 +03:00
kd-11 6d737e61fd rsx: Use 32 bit integers for pitch
- RSX max pitch = 65536 which requires 17 bits
2022-01-10 12:27:30 +03:00
kd-11 83026fd263 rsx: use coverage ratio to determine when too much data is overlapping 2022-01-07 22:55:27 +03:00
kd-11 92824b6729 rsx: Rework invalidation tagging 2022-01-07 22:55:27 +03:00
kd-11 7563655221 rsx: Bump surface removal threshold values
- It is much slower to attempt surface removal than to render duplicates on the host GPU
2022-01-07 22:55:27 +03:00
kd-11 6889b48973 rsx: Add optimized version of section removal code 2022-01-07 22:55:27 +03:00
Nekotekina cb2748ae08 Update ASMJIT (new upstream API) 2021-12-29 02:45:00 +03:00
Nekotekina d836033212 LLVM: enable some JIT events (Intel, Perf)
Made some related adjustments.
Currently incomplete.
2021-12-26 16:41:37 +03:00
Nekotekina 8b4b6ba946 copy_data_swap_u32: build AVX-512 path 2021-12-26 14:40:21 +03:00
Nekotekina 599e00d6da BufferUtils: remove dead code (vertex streaming)
RIP. It won't be useful.
2021-12-26 14:40:21 +03:00
Nekotekina 3cd8891ab8 Re-refactor copy_data_swap_u32 again
Drop AVX2 path for now, since it usually operates on small data.
Rely on automatic SSE vectorization on recent compilers.
Side refactoring on JIT.h to workaround weird conflict issue.
2021-12-26 14:40:21 +03:00
nastys a0040e6fb1
macOS: Implement texture converter for Metal (2) (#11289)
* macOS: Implement texture converter for Metal (2)

* Fix texture conversion formatting
2021-12-24 15:46:37 +03:00
kd-11 39ef39aa4e rsx: Exercise caution when testing for overlaps in invalidated sections 2021-12-24 15:13:33 +03:00
Nekotekina 262ff01619 Use aligned stores in write_index_array_data_to_buffer
Ensure that target buffer is cache line aligned.
Improve stx::make_single to support alignment.
2021-12-21 23:28:09 +03:00
Nekotekina 76ccaf5e6f BufferUtils: refactoring
Optimize CPU capability tests for arch-tuned builds.
Separate streaming and non-streaming utilities.
Rewritten copy_data_swap_u32(_cmp) with AVX2 path.
2021-12-21 23:28:09 +03:00
Nekotekina d6420b8803 Put std::hash specialization out of std 2021-12-07 13:04:10 +03:00
kd-11 44fe6f6d39 rsx: Fix sloppy format matching test 2021-11-27 17:47:41 +03:00
kd-11 4df1a938b1 Unused var 2021-11-24 16:02:24 +03:00
kd-11 94a3b1cfe8 rsx: Roll back some optimizations
- Just use RGB565 for all blit targets. Avoids really dumb transforms done by GPU hw.
- When X16 is used, all the channels get written to R channel alone. CmdBlit does perform format conversion!
- gl: Force image copy when blit is requested with compatible targets. Avoids format conversion issues.
2021-11-24 16:02:24 +03:00
kd-11 a21c6c4628 rsx: Fix handling of scaling requests for packed formats
- One does not simply interpolate RGB565 components as U16 data!
2021-11-24 16:02:24 +03:00
kd-11 97bd8f7bc1 rsx: Update sampler format class when inheriting mipmap slices/sections 2021-11-24 16:02:24 +03:00
Megamouse 4d0330bf82 rsx: fix possible segfault 2021-11-16 09:31:16 +01:00
kd-11 ad00c44231 rsx: Configure pitch correctly for pitch-zero textures (1D) 2021-11-03 16:58:30 +03:00
kd-11 5b0ef401f7 rsx: Fix sampling in X when 0 pitch is given
- A pitch of 0 still allows 1-dimensional addressing.
2021-10-31 14:32:42 +03:00
kd-11 78bcb0fd53 rsx: Do not reuse/destroy sections that have references held
- Avoids a situation where blit-dst and blit-src have overlapping ranges. Uploading blit-dst destroys blit-src and vice-versa.
  This is not the end of the world, but blit-src should be kept around until the operation is completed to avoid stale references!
2021-10-27 12:30:43 +03:00
kd-11 479150b214 rsx: Fix decoding of linear cubemaps
- 128-byte boundary is not observed in linear tiling. Verified in hw.
2021-10-10 16:15:28 +03:00
kd-11 dc8fc9fc79 vk: Clean up around vkQueueSubmit handling
- Explicitly declare one version for CB flush and the other for Async flush
- Always flush descriptors on CB flush in case of page fault handling.
  Other threads other than offloader can also enter the method and require normal flow.
- Fix overlapping interrupt IDs.
- Minor formatting fixes
2021-09-28 23:18:26 +03:00
kd-11 7b9fb7ad9c rsx: refactor rsx_utils a bit
- Move obviously standalone things to their own utility files
2021-09-28 17:43:15 +03:00
kd-11 3c7ada8e83 rsx: Fix 3D texture decode
- 3D mipmaps are shrunk in all 3 axes, they are not 2D array textures.
- Fixes mip1-mipN for all situations
2021-09-21 19:53:46 +03:00
kd-11 3ab9e04db7 rsx: Fix surface access bit flags
- The previous enumeration was a holdover from older access management.
- A bitflag of 0 seriously messes up the mask tests
2021-08-29 11:10:30 +03:00
kd-11 f745971cc8
rsx: Fix coordinate scaling for shadow access (#10668)
- For shadow2DProj the 3rd coordinate is actually the depth value, do not scale
2021-08-06 22:49:50 +01:00
kd-11 da3c9948e6 rsx: Revert use of std::has_single_bit
- Zero is not a power of 2 in this situation, and we do not want to treat it as such
2021-08-04 20:28:25 +03:00
kd-11 daa8265a47 rsx: Fix interpreter texture fetch 2021-08-04 20:28:25 +03:00
kd-11 8aec943093 Use c++20 has_single_bit for POT test 2021-08-03 00:36:04 +03:00
kd-11 99b6963fab rsx: Improve unnormalized coordinate sampling
- Improve rounding when sampling nearest neighbour. This is mostly a problem with NVIDIA
- Implement unnormalized 3D sampling
2021-08-03 00:36:04 +03:00
kd-11 b3c65b7bca rsx: Implement vtc encoding for NVIDIA OpenGL support 2021-08-03 00:36:04 +03:00
kd-11 0ec526c5f1 rsx: Do not use VTC tiling on NPOT textures
- Seems to be ignored for 'normal' textures. Mostly verified through games.
2021-08-03 00:36:04 +03:00
Megamouse 0a7a12bbff RSX: fix 'Working buffer not big enough' 2021-07-27 23:59:12 +02:00
kd-11 92d1534917 rsx: Set composite images upload context based on their actual contents 2021-07-27 10:52:21 +03:00
kd-11 6a9d1edee1 vk: Fix use-after-free hazard by checking if we're faulting from within the texture cache
- If we're using the texture cache, DO NOT delete resources.
2021-07-25 20:55:09 +03:00
kd-11 0d87d909c6 vk: Fix double-spill for invalidated resources 2021-07-17 21:28:11 +03:00
kd-11 d53f2f10fb rsx/vk: Improve recovery during OOM situations
- Do not spill when running on IGP with only one heap as it will just crash anyway.
- Do not handle collapse operations when OOM. This will likely just crash and there are better ways to handle old surfaces.
- Spill or remove everything not in the current working set
- TODO: MSAA spill without VRAM allocations
2021-07-17 21:28:11 +03:00
kd-11 aaac4c1bde Clang workaround for c++20 non-compliance 2021-07-15 18:05:35 +03:00
kd-11 2524c35638 vk: Improve handling of texture cache temporary resources
- Temp resources from the texture cache are used to hold composite objects being sent to the GPU and can waste a lot of memory.
- Remove them if we run out of memory as they can linger around for a long time.
2021-07-15 18:05:35 +03:00
kd-11 a2f93b0696 rsx: Implement a simple cache eviction routine
- Can remove all non-essential textures from the cache except those passed as an exclusion list
2021-07-15 18:05:35 +03:00
kd-11 77c9dff054 vk: Minor whitespace fix
- Non-functional formatting and warning fixes
2021-07-15 18:05:35 +03:00
kd-11 c18e5e07cc vk: Implement VRAM spilling
- The idea is to shift memory to "shared graphics memory" when VRAM is running out
2021-07-15 18:05:35 +03:00
kd-11 000414c47d vk: Refactor surface cache by moving code to cpp file 2021-07-15 18:05:35 +03:00