151 commits

Author SHA1 Message Date
kd-11 d41fe80b8e Clamp MSAA sampling weights to avoid clipping 2023-07-05 02:51:04 +03:00
kd-11 465c421707 rsx: Wrap MSAA coordinates before texelFetch 2023-07-04 23:41:12 +03:00
kd-11 1671922f7e rsx: Fix shader interpreter compilation 2023-07-04 09:31:51 +03:00
kd-11 c9da795bf3 rsx: Fix vp codegen when unrestricted depth range extension is absent 2023-07-04 09:31:51 +03:00
kd-11 fac8bcc20c rsx: Formatting and tidying changes 2023-07-04 09:31:51 +03:00
kd-11 436ef1cff6 rsx: Fix shader compilation when texture ops are referenced 2023-07-04 09:31:51 +03:00
kd-11 d77a78cdf1 rsx: Rework texture coordinate handling to support clamping and a more sane scale-bias setup 2023-07-04 09:31:51 +03:00
kd-11 66cb855db0 rsx: Fix fragment program codegen 2023-07-04 09:31:51 +03:00
kd-11 fb3aa9628d rsx: Migrate vertex fetch out of the cpp file 2023-07-04 09:31:51 +03:00
kd-11 89c81d9f22 rsx: Switch common codegen to use the glsl scripts 2023-07-04 09:31:51 +03:00
kd-11 cffcfad42a rsx: Add the glsl files
- Generated from inline strings in GLSLCommon.cpp
2023-07-04 09:31:51 +03:00
kd-11 579a6c9311 rsx: Add a comment explaining the barycentric interpolation change 2023-05-02 20:46:39 +03:00
kd-11 08e7a23121 vk: Improved attribute interpolation for NVIDIA 2023-05-02 20:46:39 +03:00
kd-11 9ff6003dfc rsx: Add Ultra shader precision setting for costly accuracy settings 2023-04-18 16:25:16 +03:00
kd-11 f725ea7d0d vk: Promote barycentric interpolation to 64-bit 2023-04-18 16:25:16 +03:00
Margen67 5bb89328d0 Remove whitespace 2023-02-15 08:58:02 +01:00
Megamouse 5a63271f0e Fix OpenGL overlay colors 2023-02-07 13:40:47 +01:00
kd-11 9a35684507 rsx: Don't accept garbage shader input 2023-02-07 13:51:26 +03:00
kd-11 dc8652806e rsx/overlays: Support disabling vertex-snap on a per-draw-call basis 2023-02-05 01:30:20 +03:00
kd-11 64ec99be33 rsx: Unify UI rendering shaders 2023-02-05 01:30:20 +03:00
kd-11 eed9e56bf4 rsx: Allow vertex fetch from uninitialized register 2023-01-17 02:24:21 +03:00
kd-11 aa5097e0d4 glsl: Update fog enums in shaders 2023-01-11 16:48:53 +03:00
kd-11 f6027719d2 rsx: Fix vertex decode 2023-01-11 16:48:53 +03:00
kd-11 38402e78c0 rsx: Fixup vertex enums in shaders 2023-01-11 16:48:53 +03:00
kd-11 577b5ef2bd Support compiling with older SDK headers 2022-12-11 15:21:58 +03:00
kd-11 9c0b2338cf rsx: Fix shader compilation 2022-12-11 15:21:58 +03:00
kd-11 a0ef1a672c rsx: Implement interpolation using barycentrics 2022-12-11 15:21:58 +03:00
kd-11 e3b23822fd rsx: Pass on shader flags to the cache 2022-12-11 15:21:58 +03:00
kd-11 a97424d46c rsx: Fix low precision shader option 2022-11-22 12:15:18 +03:00
kd-11 c4b259e0f8 rsx: Always enable ROP output quantization on NV 2022-11-18 23:06:47 +03:00
kd-11 e04855a0da rsx: Improve ROP output handling
- Perform 8-bit quantization/rounding before emulated operations like ALPHA_TEST
2022-11-18 23:06:47 +03:00
Nekotekina ae809ad320 Unexpected bugfixes
Mostly unaligned memory access.
Also includes workarounds for ubsan execution.
2022-10-31 14:20:02 +03:00
kd-11 04f6302ecc Fix decode shader compilation 2022-10-16 19:58:30 +03:00
kd-11 6d43fcf8fb gl: Fall back to renderpass decoder on ATI drivers 2022-10-16 19:58:30 +03:00
kd-11 87411da95f gl: Explicitly declare gl_Position as invariant when using MESA 2022-10-06 06:41:24 +03:00
kd-11 362a26a404 gl: Fix D24X8 accelerated encode/decode
- PS3 D24X8 is swapped as a full word, unlike PC.
- Add missing paths to handle custom swap behavior.
2022-09-22 23:46:48 +03:00
kd-11 df36c44bc2 gl: Avoid UBO/SSBO binding index collisions
- Some drivers don't like this. Actually only RADV.
- Almost all GPUs going back 15 years have a large number of UBO slots but limited SSBO slots.
  Move UBO slots up as we have tons more headroom there.
2022-09-19 01:37:10 +03:00
Nekotekina b49a1f27eb Warning fixes 2022-09-17 16:35:02 +03:00
Eladash 4464a6c3f6 CG-Disasm: Name input/output vertex arrays 2022-08-12 15:20:48 +03:00
kd-11 3e923b4993 rsx: Optimize VTX_FMT_SNORM16 decoding
- Cuts down SNORM16 overhead by ~65%
2022-08-03 23:33:31 +03:00
Eladash b3162bd41c rsx/vp: Fix SNORM16 vertex decoding 2022-08-03 18:11:46 +03:00
kd-11 ab3cde1939 gl: Do some macro patching for intel driver 2022-07-21 22:29:40 +03:00
kd-11 82439327fa gl: Support loading data from SSBO using compute shaders
- Gives better performance than using raw draw calls.
- Does not work with all formats. The draw call version is still used when needed.
2022-07-13 02:09:58 +03:00
kd-11 9fc6382909 gl: Finalize BGRA storage format internals
- Performance is terrible but it works properly now
2022-07-13 02:09:58 +03:00
kd-11 f948ce399e gl: Implement CopyBufferToImage in software
- Overrides the drivers CopyBufferToImage handling where possible
2022-07-13 02:09:58 +03:00
Ani 2512e958fa glsl: Avoid implicit int->uint conversions (#12220) 2022-06-12 18:05:43 +01:00
kd-11 167161d8ce rsx: Restore some accidentally removed depth-format conversion macros 2022-06-03 11:54:09 +03:00
kd-11 a6e6df1445 gl: Implement fast texture readback for D24X8 and RGBA8/BGRA8 2022-06-03 11:54:09 +03:00
kd-11 eb52ac55a7 gl: Fix AMD buffer decode 2022-05-31 23:34:14 +03:00
kd-11 d167582f6b gl: Implement on-chip buffer-to-d24x8 conversion 2022-05-31 23:34:14 +03:00
kd-11 60a2a39e88 gl: Deswizzle textures on the GPU 2022-05-31 23:34:14 +03:00
kd-11 7a434d19a6 rsx/vp: Zero-initialize temporary registers 2022-04-28 01:31:07 +03:00
kd-11 95ac7724a6 Fix typos 2022-04-28 01:31:07 +03:00
kd-11 e236ba4daf rsx: Improve lowered precision comparison emulation 2022-04-28 01:31:07 +03:00
kd-11 60cbd7a88c Automatically determine the epsilon value programmatically 2022-04-13 15:48:28 +03:00
kd-11 2db68acab9 rsx: Implement Z value snapping to account for precision errors 2022-04-13 15:48:28 +03:00
kd-11 fc05511354 rsx: Optimize software sampling further for the 6-tap kernel 2022-04-04 16:51:03 +03:00
kd-11 ca35a75a7d rework weighting scheme 2022-04-04 16:51:03 +03:00
kd-11 15b7e4f05e 6-tap experiment 2022-04-04 16:51:03 +03:00
kd-11 49c84f099a rsx/glsl: Fixup 2022-04-04 16:51:03 +03:00
kd-11 43b267ea51 glsl: Rewrite MS sampling implementation 2022-04-04 16:51:03 +03:00
kd-11 a8441b28e8 rsx: Implement basic 2D bilinear filtering for MSAA images 2022-04-04 16:51:03 +03:00
kd-11 d057ffe80f rsx: Fix program generation and compact referenced data blocks 2022-03-26 16:10:18 +03:00
kd-11 9a2d4fe46b rsx: Relocatable transform constants 2022-03-26 16:10:18 +03:00
kd-11 bc7ed8eaab rsx/vk: Rework MSAA implementation 2022-03-17 22:02:20 +03:00
nastys 6b5f0957ce Disable macOS swizzling workaround 2022-01-22 00:17:17 +01:00
kd-11 3e794e7fdb rsx: Optimize 8-bit rounding logic a bit
- NV hw does not like the raw use of round()
2022-01-17 10:28:23 +03:00
kd-11 c38ca21a81 rsx: Round up 8-bit ROP output on NVIDIA cards
- NV GPUs have a tendency to be off by a very small margin, breaking rendering when greaterThan/lessThan checks are used.
- NOTE: Currently this setting is using the sRGB flag which indicates 8-bit output.
  Only one game is currently known to care about this behaviour so this is good enough for now.
2022-01-17 10:28:23 +03:00
Nekotekina 580bd2b25e Initial Linux Aarch64 support
* Update asmjit dependency (aarch64 branch)
* Disable USE_DISCORD_RPC by default
* Dump some JIT objects in rpcs3 cache dir
* Add SIGILL handler for all platforms
* Fix resetting zeroing denormals in thread pool
* Refactor most v128:: utils into global gv_** functions
* Refactor PPU interpreter (incomplete), remove "precise"
* - Instruction specializations with multiple accuracy flags
* - Adjust calling convention for speed
* - Removed precise/fast setting, replaced with static
* - Started refactoring interpreters for building at runtime JIT
*   (I got tired of poor compiler optimizations)
* - Expose some accuracy settings (SAT, NJ, VNAN, FPCC)
* - Add exec_bytes PPU thread variable (akin to cycle count)
* PPU LLVM: fix VCTUXS+VCTSXS instruction NaN results
* SPU interpreter: remove "precise" for now (extremely non-portable)
* - As with PPU, settings changed to static/dynamic for interpreters.
* - Precise options will be implemented later
* Fix termination after fatal error dialog
2022-01-15 06:48:04 +03:00
kd-11 a9303acfdf rsx: Fix zclip w scaling 2021-12-26 12:50:31 +03:00
kd-11 28d7af313b rsx: Remove noisy debug print 2021-12-24 15:13:33 +03:00
kd-11 56dd09f4fe rsx: Handle floating point shenanigans
- If near and far clip are too close together, the API will not distinguish between them leading to out of bounds values
2021-12-22 22:08:53 +03:00
kd-11 de495952fd rsx: Enable fallback for devices without wide integer Z buffers 2021-12-22 22:08:53 +03:00
kd-11 1ce5349199 rsx: Remove zclip hackery
- Calculates precise Z value as requested by the game
- Works properly if the underlying Z format matches the PS3 1:1 but may cause minor problems otherwise
2021-12-22 22:08:53 +03:00
nastys 47e4a95d8f Fix remap_vector redefinition on macOS (#11271) 2021-12-21 10:36:09 +01:00
nastys 08333e0876 macOS moltenVK support and SIGBUS handling (#11252) 2021-12-12 21:35:56 +01:00
kd-11 d523f9cc6b rsx: Avoid skipping input mask checks due to static flow control 2021-12-08 23:58:32 +03:00
DH 49c02854f5 [rsx] reduce size of config structs 2021-12-02 21:36:57 +03:00
DH cccfb89aa0 [Config] Use std::less<> for std::map<...>
Reduces amount of string copies
[Utilities] fmt::replace_all: avoid creation of temporary strings
2021-12-02 21:36:57 +03:00
kd-11 f7eacf70ec rsx: Restore shader disassembler to working state 2021-11-05 23:55:07 +03:00
kd-11 d58df667b9 rsx: Fix some texture decode instructions
- Fix TEX1D_PROJ definition
- Make TEX3D_PROJ cubemap-compatible
2021-10-12 13:47:08 +03:00
kd-11 b3725baf5a rsx: Rewrite shader decompiler texture dispatch 2021-10-09 15:10:36 +03:00
kd-11 3e09b97f58 rsx: Minor optimization; avoid preparing unused vertex streams
- Also discards unused program state variables
2021-09-28 17:43:15 +03:00
kd-11 b5dcfb3431 rsx: Rework gamma override mask from RGBA to ARGB to match other per-channel mask registers 2021-08-30 11:41:19 +03:00
kd-11 b0e352c44e Add missing const 2021-08-26 13:55:00 +03:00
kd-11 2ff407ac6a rsx/fp: Fix perspective correction handling
- Perspective correction flag multiplies VP output by HPOS.w.
  NOTE: Not the same as division by w when it comes to NaN/Inf problems!!
- Restructure indexed loads a bit to avoid re-initializing registers unnecessarily
2021-08-26 13:55:00 +03:00
kd-11 b0e5de4c9c rsx: Texcoord control mask affects decompiler output! 2021-08-26 13:55:00 +03:00
kd-11 57b9acec62 rsx: Implement indexed dynamic attribute load 2021-08-24 16:52:18 +03:00
kd-11 3eb37344cd rsx/fp: Fix indexed TEX[n] loads 2021-08-20 11:59:05 +03:00
kd-11 f745971cc8 rsx: Fix coordinate scaling for shadow access (#10668)
- For shadow2DProj the 3rd coordinate is actually the depth value, do not scale
2021-08-06 22:49:50 +01:00
kd-11 99b6963fab rsx: Improve unnormalized coordinate sampling
- Improve rounding when sampling nearest neighbour. This is mostly a problem with NVIDIA
- Implement unnormalized 3D sampling
2021-08-03 00:36:04 +03:00
Nekotekina 658b4f70ef Fix some warnings 2021-07-30 09:31:36 +03:00
Megamouse 50354253c8 replace some random Emu.Pause with fatal errors 2021-07-20 19:47:00 +02:00
kd-11 2c7c1c501d rsx: Implement support for extended vertex programs
- Some games are kinda pushing it with RSX register space and spilling VP data into adjacent unused space.
2021-06-28 10:52:05 +03:00
kd-11 d3ff67ffb5 rsx: Pass vertex attributes streamed via register write in PS3-correct format
- TODO: Optimize this, we can avoid the double bswap in FIFO and then in attribute push
  Not very important since nobody is doing register push in high-performance path.
2021-06-14 10:24:03 +03:00
kd-11 20bd723e7c rsx: Add floor workaround for GPUs with rounding issues
- Mainly affects nvidia where x/w * w can sometimes return a value smaller than x.
  In such conditions, floor(x) will return x-1 if x is an integer which is horribly wrong and exaggerates minor precision drift to great proportions.
2021-06-09 10:55:55 +03:00
kd-11 39815801aa rsx: Implement proper decoding for some obscure fragment instructions
PK4UBG and UP4UBG were dropped from the NV_fragment_program spec in 2002.
Not much information about them remains but seems pretty straightforward.
2021-06-05 21:02:14 +03:00
Ani a49446c9e9 Replace gsl::span for std::span (c++20) (#7531)
* Replace gsl::span for std::span (c++20)
* Replace gsl::byte with std::byte

Co-authored-by: Bevan Weiss <bevan.weiss@gmail.com>
2021-05-30 17:10:46 +03:00
Nekotekina 2491aad6f2 types.hpp: implement min_v<>, max_v<>, SignedInt, UnsignedInt, FPInt concepts
Restrict smax to only work with signed values for consistency.
Cleanup <climits> includes.
Cleanup <limits> includes.
2021-05-23 19:43:51 +03:00
kd-11 a84cf030bb Fixup
FreeBSD + concepts = fail
2021-05-15 23:51:12 +03:00