Commit graph

907 commits

Author SHA1 Message Date
kd-11 bc7ed8eaab rsx/vk: Rework MSAA implementation 2022-03-17 22:02:20 +03:00
kd-11 78b8bd80e4 rsx: Unconditionally set MSAA flags if MSAA is active 2022-03-11 01:15:13 +03:00
kd-11 1943d9819f rsx: Clean up surface cache routines around RTT invalidate 2022-03-10 20:43:58 +03:00
kd-11 59a0cf94ab rsx: Fix msvc build 2022-03-08 22:06:26 +03:00
kd-11 3e4faf602a rsx: Fix clang build 2022-03-08 22:06:26 +03:00
kd-11 454a724f4e rsx: Reduce the performance impact of enabling the profiling timer
- Just use TSC if available
2022-03-08 22:06:26 +03:00
kd-11 cfecbb24ca rsx: Avoid calling slow functions every draw call
- Use TSC for timing where interval duration matters.
- Use atomic counter for ordering timestamps otherwise.
2022-03-08 22:06:26 +03:00
kd-11 762b594927 rsx: Fully process texture if surface cache configuration changed 2022-03-08 22:06:26 +03:00
kd-11 8d3d290e33 rsx: Fix build 2022-03-08 22:06:26 +03:00
kd-11 0df903090d rsx: Optimize metrics a bit
- For some reason this has a massive impact on performance above some arbitrary threshold of calls
  Shows up under surface_cache::get_merged_memory_region when doing gathers.
2022-03-08 22:06:26 +03:00
kd-11 6812fa4764 rsx: Fix surface write coherency when MSAA is active 2022-03-08 22:06:26 +03:00
kd-11 00a1864a95 Revert "rsx: Downgrade depth-1 3D images to 2D (#11593)"
This reverts commit 6c096b72b5.
2022-03-01 21:51:55 +03:00
kd-11 6c096b72b5
rsx: Downgrade depth-1 3D images to 2D (#11593)
- Fixes problems with implicit view types derived from dimensions.
2022-03-01 10:45:50 +03:00
kd-11 ec3e8de780 rsx: End the current frame before performing cache cleanup to release in-flight data 2022-02-10 22:20:56 +03:00
kd-11 2d9f21a2ea rsx: Lower performance warnings to 'warn' level instead of 'error' level to avoid causing panic for users 2022-02-07 09:25:01 +03:00
kd-11 247759b75b rsx: Fix memory tagging and add some security checks 2022-02-07 09:25:01 +03:00
kd-11 dca3d477c9 vk: Use image hot-cache for faster allocation times
- Creating new images is expensive.
- We can keep around a set of images that have been recently discarded and use them instead of creating new ones from scratch each time.
2022-02-06 15:49:50 +03:00
kd-11 86919ec0e1 rsx: Validate requested images before attempting to upload them
- Do not allow dimensions of 0 to reach the backend APIs
2022-01-30 14:58:51 +03:00
kd-11 3fa45ff994 Fix missing typeless info update 2022-01-26 12:08:36 +03:00
Nekotekina 0db9850a73 Add loop building utilities for ASMJIT
Refactor copy_data_swap_u32 a bit
2022-01-25 03:16:37 +03:00
Nekotekina 12c83b340d Remove built_function
With today's branch prediction techniques, it's hardly useful.
2022-01-24 22:21:41 +03:00
kd-11 2f7d38bb81 rsx: Improve coverage checking logic to handle 3D and cubemap resources 2022-01-23 00:03:03 +03:00
kd-11 4f8b5849b7 rsx: Take depth into account when calculating coverage 2022-01-23 00:03:03 +03:00
kd-11 7f216f2581 rsx: Fix local slice height calculation 2022-01-23 00:03:03 +03:00
nastys 801e7f3c2f macOS: Implement texture swizzling for 16-bit formats 2022-01-22 00:17:17 +01:00
Nekotekina 4704367382 Remove unnecessary asmjit::imm_ptr 2022-01-18 00:10:32 +03:00
Nekotekina 14cca55b50 PPU: refactor vector rounding instructions
Fix: nearbyint -> roundeven
2022-01-18 00:10:32 +03:00
Nekotekina 580bd2b25e Initial Linux Aarch64 support
* Update asmjit dependency (aarch64 branch)
* Disable USE_DISCORD_RPC by default
* Dump some JIT objects in rpcs3 cache dir
* Add SIGILL handler for all platforms
* Fix resetting zeroing denormals in thread pool
* Refactor most v128:: utils into global gv_** functions
* Refactor PPU interpreter (incomplete), remove "precise"
* - Instruction specializations with multiple accuracy flags
* - Adjust calling convention for speed
* - Removed precise/fast setting, replaced with static
* - Started refactoring interpreters for building at runtime JIT
*   (I got tired of poor compiler optimizations)
* - Expose some accuracy settings (SAT, NJ, VNAN, FPCC)
* - Add exec_bytes PPU thread variable (akin to cycle count)
* PPU LLVM: fix VCTUXS+VCTSXS instruction NaN results
* SPU interpreter: remove "precise" for now (extremely non-portable)
* - As with PPU, settings changed to static/dynamic for interpreters.
* - Precise options will be implemented later
* Fix termination after fatal error dialog
2022-01-15 06:48:04 +03:00
kd-11 6d737e61fd rsx: Use 32 bit integers for pitch
- RSX max pitch = 65536 which requires 17 bits
2022-01-10 12:27:30 +03:00
kd-11 83026fd263 rsx: use coverage ratio to determine when too much data is overlapping 2022-01-07 22:55:27 +03:00
kd-11 92824b6729 rsx: Rework invalidation tagging 2022-01-07 22:55:27 +03:00
kd-11 7563655221 rsx: Bump surface removal threshold values
- It is much slower to attempt surface removal than to render duplicates on the host GPU
2022-01-07 22:55:27 +03:00
kd-11 6889b48973 rsx: Add optimized version of section removal code 2022-01-07 22:55:27 +03:00
Nekotekina cb2748ae08 Update ASMJIT (new upstream API) 2021-12-29 02:45:00 +03:00
Nekotekina d836033212 LLVM: enable some JIT events (Intel, Perf)
Made some related adjustments.
Currently incomplete.
2021-12-26 16:41:37 +03:00
Nekotekina 8b4b6ba946 copy_data_swap_u32: build AVX-512 path 2021-12-26 14:40:21 +03:00
Nekotekina 599e00d6da BufferUtils: remove dead code (vertex streaming)
RIP. It won't be useful.
2021-12-26 14:40:21 +03:00
Nekotekina 3cd8891ab8 Re-refactor copy_data_swap_u32 again
Drop AVX2 path for now, since it usually operates on small data.
Rely on automatic SSE vectorization on recent compilers.
Side refactoring on JIT.h to workaround weird conflict issue.
2021-12-26 14:40:21 +03:00
nastys a0040e6fb1
macOS: Implement texture converter for Metal (2) (#11289)
* macOS: Implement texture converter for Metal (2)

* Fix texture conversion formatting
2021-12-24 15:46:37 +03:00
kd-11 39ef39aa4e rsx: Exercise caution when testing for overlaps in invalidated sections 2021-12-24 15:13:33 +03:00
Nekotekina 262ff01619 Use aligned stores in write_index_array_data_to_buffer
Ensure that target buffer is cache line aligned.
Improve stx::make_single to support alignment.
2021-12-21 23:28:09 +03:00
Nekotekina 76ccaf5e6f BufferUtils: refactoring
Optimize CPU capability tests for arch-tuned builds.
Separate streaming and non-streaming utilities.
Rewritten copy_data_swap_u32(_cmp) with AVX2 path.
2021-12-21 23:28:09 +03:00
Nekotekina d6420b8803 Put std::hash specialization out of std 2021-12-07 13:04:10 +03:00
kd-11 44fe6f6d39 rsx: Fix sloppy format matching test 2021-11-27 17:47:41 +03:00
kd-11 4df1a938b1 Unused var 2021-11-24 16:02:24 +03:00
kd-11 94a3b1cfe8 rsx: Roll back some optimizations
- Just use RGB565 for all blit targets. Avoids really dumb transforms done by GPU hw.
- When X16 is used, all the channels get written to R channel alone. CmdBlit does perform format conversion!
- gl: Force image copy when blit is requested with compatible targets. Avoids format conversion issues.
2021-11-24 16:02:24 +03:00
kd-11 a21c6c4628 rsx: Fix handling of scaling requests for packed formats
- One does not simply interpolate RGB565 components as U16 data!
2021-11-24 16:02:24 +03:00
kd-11 97bd8f7bc1 rsx: Update sampler format class when inheriting mipmap slices/sections 2021-11-24 16:02:24 +03:00
Megamouse 4d0330bf82 rsx: fix possible segfault 2021-11-16 09:31:16 +01:00
kd-11 ad00c44231 rsx: Configure pitch correctly for pitch-zero textures (1D) 2021-11-03 16:58:30 +03:00