Commit graph

3541 commits

Author SHA1 Message Date
kd-11 3942a464fe vk: Avoid leaking descriptor copies 2022-01-20 19:21:24 +03:00
kd-11 2331dc3256 vk: Keep the total number of allocated samplers under control 2022-01-20 19:21:24 +03:00
Nekotekina 4704367382 Remove unnecessary asmjit::imm_ptr 2022-01-18 00:10:32 +03:00
Nekotekina 14cca55b50 PPU: refactor vector rounding instructions
Fix: nearbyint -> roundeven
2022-01-18 00:10:32 +03:00
kd-11 000ec71629 Fix invalid descriptor setup if subdraw0 has broken vertex setup 2022-01-17 12:38:10 +03:00
kd-11 3e794e7fdb rsx: Optimize 8-bit rounding logic a bit
- NV hw does not like the raw use of round()
2022-01-17 10:28:23 +03:00
kd-11 c38ca21a81 rsx: Round up 8-bit ROP output on NVIDIA cards
- NV GPUs have a tendancy to be off by a very small margin, breaking rendering when greaterThan/lessThan checks are used.
- NOTE: Currently this setting is using the sRGB flag which indicates 8-bit output.
  Only one game is currently known to care about this behaviour so this is good enough for now.
2022-01-17 10:28:23 +03:00
kd-11 f923eaf09a rsx: Surface format remapping enhancements 2022-01-17 10:28:23 +03:00
Nekotekina 580bd2b25e Initial Linux Aarch64 support
* Update asmjit dependency (aarch64 branch)
* Disable USE_DISCORD_RPC by default
* Dump some JIT objects in rpcs3 cache dir
* Add SIGILL handler for all platforms
* Fix resetting zeroing denormals in thread pool
* Refactor most v128:: utils into global gv_** functions
* Refactor PPU interpreter (incomplete), remove "precise"
* - Instruction specializations with multiple accuracy flags
* - Adjust calling convention for speed
* - Removed precise/fast setting, replaced with static
* - Started refactoring interpreters for building at runtime JIT
*   (I got tired of poor compiler optimizations)
* - Expose some accuracy settings (SAT, NJ, VNAN, FPCC)
* - Add exec_bytes PPU thread variable (akin to cycle count)
* PPU LLVM: fix VCTUXS+VCTSXS instruction NaN results
* SPU interpreter: remove "precise" for now (extremely non-portable)
* - As with PPU, settings changed to static/dynamic for interpreters.
* - Precise options will be implemented later
* Fix termination after fatal error dialog
2022-01-15 06:48:04 +03:00
kd-11 d6aa834b5f vk: Enable shading rate hack for all GPUs
- This is a hack, ideally we should be using coverage-based masking when writing the exploded texture.
- We do not have access to the fragment coverage mask and it is non-trivial to integrate it in a competent manner.
2022-01-14 10:21:38 +03:00
kd-11 6d737e61fd rsx: Use 32 bit integers for pitch
- RSX max pitch = 65536 which requires 17 bits
2022-01-10 12:27:30 +03:00
kd-11 83026fd263 rsx: use coverage ratio to determine when too much data is overlapping 2022-01-07 22:55:27 +03:00
kd-11 92824b6729 rsx: Rework invalidation tagging 2022-01-07 22:55:27 +03:00
kd-11 7563655221 rsx: Bump surface removal threshold values
- It is much slower to attempt surface removal than to render duplicates on the host GPU
2022-01-07 22:55:27 +03:00
kd-11 6889b48973 rsx: Add optimized version of section removal code 2022-01-07 22:55:27 +03:00
Eladash bba528e2ae
rsx: Fix wrong fault report in initialization (#11323)
* rsx: Fix wrong fault report in initialization

* Ensure emu.isstopped() == true at RPCS3 startup

Based on zero initialization.
2022-01-05 20:41:01 +03:00
kd-11 7c47b0029c gl: Fully drop alignment restriction for compressed textures
- This is just not part of spec, there is no enforcement for multiple of block size for width or height of s3tc compressed images.
- This restriction does indeed exist for ASTC and ETC but we're not using those formats.
2022-01-02 14:29:38 +03:00
Nekotekina cb2748ae08 Update ASMJIT (new upstream API) 2021-12-29 02:45:00 +03:00
Nekotekina d836033212 LLVM: enable some JIT events (Intel, Perf)
Made some related adjustments.
Currently incomplete.
2021-12-26 16:41:37 +03:00
Nekotekina 510041a873 rsx_methods.cpp: optimize compile time (120s to 10s)
Untemplate NV308A_COLOR
2021-12-26 14:40:21 +03:00
Nekotekina 8b4b6ba946 copy_data_swap_u32: build AVX-512 path 2021-12-26 14:40:21 +03:00
Nekotekina 599e00d6da BufferUtils: remove dead code (vertex streaming)
RIP. It won't be useful.
2021-12-26 14:40:21 +03:00
Nekotekina 3cd8891ab8 Re-refactor copy_data_swap_u32 again
Drop AVX2 path for now, since it usually operates on small data.
Rely on automatic SSE vectorization on recent compilers.
Side refactoring on JIT.h to workaround weird conflict issue.
2021-12-26 14:40:21 +03:00
kd-11 a9303acfdf rsx: Fix zclip w scaling 2021-12-26 12:50:31 +03:00
nastys a0040e6fb1
macOS: Implement texture converter for Metal (2) (#11289)
* macOS: Implement texture converter for Metal (2)

* Fix texture conversion formatting
2021-12-24 15:46:37 +03:00
kd-11 28d7af313b rsx: Remove noisy debug print 2021-12-24 15:13:33 +03:00
kd-11 39ef39aa4e rsx: Exercise caution when testing for overlaps in invalidated sections 2021-12-24 15:13:33 +03:00
kd-11 56dd09f4fe rsx: Handle floating point shenanigans
- If near and far clip are too close together, the API will not distinguish between them leading to out of bounds values
2021-12-22 22:08:53 +03:00
kd-11 de495952fd rsx: Enable fallback for devices without wide integer Z buffers 2021-12-22 22:08:53 +03:00
kd-11 1ce5349199 rsx: Remove zclip hackery
- Calculates precise Z value as requested by the game
- Works properly if the underlying Z format matches the PS3 1:1 but may cause minor problems otherwise
2021-12-22 22:08:53 +03:00
Nekotekina 12e3c9e08b Use PAUSE in vk::query_pool_manager::get_query_result 2021-12-21 23:28:09 +03:00
Nekotekina 262ff01619 Use aligned stores in write_index_array_data_to_buffer
Ensure that target buffer is cache line aligned.
Improve stx::make_single to support alignment.
2021-12-21 23:28:09 +03:00
Nekotekina 76ccaf5e6f BufferUtils: refactoring
Optimize CPU capability tests for arch-tuned builds.
Separate streaming and non-streaming utilities.
Rewritten copy_data_swap_u32(_cmp) with AVX2 path.
2021-12-21 23:28:09 +03:00
nastys 47e4a95d8f
Fix remap_vector redefinition on macOS (#11271) 2021-12-21 10:36:09 +01:00
nastys 08333e0876
macOS moltenVK support and SIGBUS handling (#11252) 2021-12-12 21:35:56 +01:00
kd-11 d523f9cc6b rsx: Avoid skipping input mask checks due to static flow control 2021-12-08 23:58:32 +03:00
kd-11 7ca15c60bb rsx: Improve image aspect tests
- Replace old format-based detection with proper aspect test.
  Explicit image aspect has been available for a long time, but older
  code was not updated.
2021-12-08 23:58:32 +03:00
Nekotekina d6420b8803 Put std::hash specialization out of std 2021-12-07 13:04:10 +03:00
DH 49c02854f5 [rsx] reduce size of config structs 2021-12-02 21:36:57 +03:00
DH cccfb89aa0 [Config] Use std::less<> for std::map<...>
Reduces amount of string copies
[Utilities] fmt::replace_all: avoid creation of temporary strings
2021-12-02 21:36:57 +03:00
kd-11 02832d9623
rsx: Add some sensible fallbacks (#11219)
* rsx: Add some sensible fallbacks

* Update GLPresent.cpp

* Update VKPresent.cpp

* Update rsx_utils.h

* Update rsx_utils.cpp
2021-12-02 16:02:08 +03:00
kd-11 9bb46aa944 rsx: Simplify unconstrained aspect ratio conversion
- There is a reason resolutions are defined by only a height variable.
2021-12-01 21:55:53 +01:00
Megamouse aea1ec2594 avconf: Add const to fxo references 2021-12-01 21:55:53 +01:00
kd-11 22a7b026e7 rsx: Fix image scaling
- Specifically fixes a corner case where double transforms are required.
  Technically this can be made more readable using transformation matrices:
  * M1 = transform_virtual_to_physical()
  * M2 = transform_image_to_virtual()
  * M3 = M1 * M2
  * Result = Input * M3
  But we don't use a CPU-side matrix library and it is not reasonable to do this on the GPU.
2021-12-01 21:55:53 +01:00
Megamouse c8d4a0dcdc VK/GL: honor game's aspect ratio when scaling 2021-12-01 21:55:53 +01:00
kd-11 38bfefcdfa vk: Fix incorrect mixed transfer modes for mipmapped VTC 2021-11-28 01:44:21 +03:00
kd-11 44fe6f6d39 rsx: Fix sloppy format matching test 2021-11-27 17:47:41 +03:00
orbea a84223bdc6 rpcs3: Fix the DATADIR path for AppImage
Even when DATADIR is defined the other paths may still be correct.

Fixes: https://github.com/RPCS3/rpcs3/issues/11195
2021-11-24 19:14:06 +01:00
kd-11 4df1a938b1 Unused var 2021-11-24 16:02:24 +03:00
kd-11 94a3b1cfe8 rsx: Roll back some optimizations
- Just use RGB565 for all blit targets. Avoids really dumb transforms done by GPU hw.
- When X16 is used, all the channels get written to R channel alone. CmdBlit does perform format conversion!
- gl: Force image copy when blit is requested with compatible targets. Avoids format conversion issues.
2021-11-24 16:02:24 +03:00