rpcsx

mirror of https://github.com/RPCSX/rpcsx.git synced 2025-12-06 07:12:14 +01:00

Author	SHA1	Message	Date
kd-11	3e28e4b1e0	rsx/decompiler: Restructure program register behavior - Fix reading of varying registers in FP Different registers have different behavior - Always write to varying registers. If a register is not written to, it is initialized to (0, 0, 0, 1) - Reimplements two-sided lighting correctly without hacks - Also bumps shader cache version	2019-08-26 20:03:31 +03:00
kd-11	fe6ff8622a	rsx: Decompiler fixups for conditional execution - Cond actually obeys vector mask	2019-08-26 20:03:31 +03:00
kd-11	67dac94704	rsx/fp: Zero-initialize FragDepth register to match hw	2019-08-21 21:17:15 +03:00
kd-11	c9501b60ab	rsx: Use explicit fma for MAD emulation	2019-06-25 20:50:54 +03:00
kd-11	c655036920	rsx/fp: Ease pressure on fragment shaders when emulating clamp16 - TODO: Option to completely skip clamping in some architectures as it is not needed in most games - Mostly affects older GPUs that do not have access to native fp16	2019-06-14 16:19:52 +03:00
scribam	8f2647555a	rsx: Apply Clang-Tidy fix "readability-redundant-string-init"	2019-06-12 15:11:52 +03:00
scribam	db926ee671	rsx: Apply Clang-Tidy fix "performance-unnecessary-value-param"	2019-06-12 15:11:52 +03:00
Nekotekina	dfd50d0185	Implement std::bit_cast<> Partial implementation of std::bit_cast from C++20. Also fix most strict-aliasing rule break warnings (gcc).	2019-06-02 23:22:16 +03:00
scribam	6c5ea068c9	Remove redundant semicolons Fix "-Wextra-semi" warnings	2019-05-12 18:32:11 +03:00
kd-11	3cbccdd760	rsx: Fragment shader decompiler cleanup TODO: Investigate the _s input modifier behaviour further, in case it can avoid generating zeroes from a MAD instruction. x = MAD(+ve, -ve, -ve) with _s input modifier in BFBC expects result to be Non-zero	2019-04-25 16:23:05 +03:00
kd-11	4cd1c25729	"rsx: Ignore argument sign for SQRT operations"	2019-04-25 16:23:05 +03:00
kd-11	32396ba366	rsx: Simplify use of some mixed input functions using OPFLAGS to avoid implicit conversions	2019-04-25 16:23:05 +03:00
kd-11	f12bd8068c	rsx: Fragment decompiler fixups - Properly test for NaN and Inf when clamping down to fp16 - Optimize divsq a bit; mix(vec, vec, bvec) emits OpSelect which is what we want here, instead of component-wise selection which is much slower.	2019-04-25 16:23:05 +03:00
kd-11	abe7188acf	rsx: Proper workaround for broken DIVSQ instruction on realhw - While mul(0, nan) = nan and 0 / 0 = nan, 0 / sqrt(0) = 0 because of hw gremlins. normalize(0) is also nan so this behaviour does not work around that particular case either which makes it even more baffling.	2019-04-25 16:23:05 +03:00
kd-11	60f3059d22	rsx: Compensate for nvidia's low precision attribute interpolation - The hw generates inaccurate values when doing perspective-correct interpolation of vertex output attributes and makes the comparison (a == b) fail even when they are a fixed constant value. - Increase equality tolerance when doing comparisons in fragment shaders for NV cards only to work around this issue. - Teepo fix	2019-04-25 16:23:05 +03:00
kd-11	463b1b220d	rsx: Improve accuracy of shadow compare Ops when non-integer depth formats are used - The fixed-point D24S8 format does special Z clamping during compare which matches PS3 behaviour - D32S8 is a floating point format and comparison with Dref > 1 always fails causing black edges/borders	2019-04-25 16:23:05 +03:00
kd-11	06a85f00d1	rsx: Shader decompiler cleanup and improvements - Improve support for float16_t by minimizing mixed inputs to functions (ambiguous overloads) - Minimize amount of downcasts in code by using opcode flags - Re-enable float16_t support for vulkan	2019-04-25 16:23:05 +03:00
kd-11	a668560c68	rsx: Use native half float types if available - Emulating f16 with f32 is not ideal and requires a lot of value clamping - Using native data type can significantly improve performance and accuracy - With openGL, check for the compatible extensions NV_gpu_shader5 and AMD_gpu_shader_half_float - With Vulkan, enable this functionality in the deviceFeatures if applicable. (VK_KHR_shader_float16_int8 extension) - Temporarily disable hw fp16 for vulkan	2019-04-25 16:23:05 +03:00
kd-11	ee319f7c13	rsx: Implement strict clamp16 operation needed for NVIDIA cards	2019-04-25 16:23:05 +03:00
kd-11	736415fcd9	rsx/fp: Detect broken/NOP shaders automatically - Do not compile body if the shader is of no consequence, leave as a passthrough shader	2019-01-25 14:34:22 +03:00
kd-11	4b79ef1ad9	rsx: Implement stencil mirror views - Implements a mirror view of D24S8 data that accesses the stencil components. Finishes the implementation of TEX2D_DEPTH_RGBA as the stencil component was previously missing from the reconstructed data - Add a few missing destructors Image classes are inherited a lot and I forgot to make the dtors virtual	2018-12-24 09:05:19 +03:00
kd-11	696b91cb9b	rsx: Reimplement conditional execution in shaders - Per-channel conditional execution introduces RAW hazards all over the place - Its cheaper to process both branches and select between the two - Also improves ShaderVariable functionality to allow functionality such as match_size and taking complex variables as inputs	2018-12-24 09:05:19 +03:00
scribam	d7bb59cd99	c++17: use std::size	2018-09-06 13:15:59 +03:00
Nekotekina	ce4c4696dd	Try to get rid of SIZE_32 macro	2018-09-03 21:40:36 +03:00
kd-11	dea5193fd7	rsx: Fix FP temp register count	2018-09-03 21:39:18 +03:00
eladash	f349695a75	Rsx: rewrite address translation	2018-08-13 16:16:34 +03:00
scribam	04ad49de4d	typos	2018-05-14 21:14:39 +04:00
kd-11	9fc1740608	rsx/fp: Fragment program overhaul - Separate TXB from TXL: They are completely different! - Properly perform TMU emulation in the fragment shader. Implemens SRGB conversion and alphakill at the moment - Properly perform ROP emulation in the fragment shader. Implements FRAMEBUFFER_SRGB. While support on the chip looks to be incomplete (and wierd), it does work - Document some more bits in SHADER_CONTROL register	2018-03-25 13:31:06 +03:00
kd-11	27552891ad	rsx/fp: Improvements - Export some debug information in the free texture register space components zw Very useful when analysing renderdoc captures - Enable shadow comparison on depth as long as compare function is active and texture is uploaded for depth read Some engines (UE3) read all the components in the shader and use mul/mad with the result	2018-03-25 13:31:06 +03:00
kd-11	d41b49d8b4	rsx/fp: Color output registers are always present and zero initialized - According to NV_fragment_program spec, registers are zero initialized always - A program even without writing to these registers will have black (0, 0, 0, 0) output Confirmed behaviour with MotorStorm games. Their engine uses this quirk to clear color buffers when doing depth replace Might be an unfixed game bug	2018-03-13 18:55:03 +03:00
kd-11	68b3229756	rsx/fp: Improve rgister component gather detection - Also avoids clobbering register data by keeping gathered bits in a temp var	2018-03-13 18:55:03 +03:00
kd-11	a64bea1286	rsx/fp: Discard shaders with undefined (non-existent) writes. On nvidia+vulkan, undefined writes autofill with blue color	2018-02-16 16:14:54 +03:00
kd-11	89c548b5d3	rsx: fbo fixes 2.5 - Implement flush-always behaviour to partially fix readback from a currently bound fbo - Without this, only the first read is correct, as more draws are added the results become 'wrong' - Fixes WCB and cpublit behviour - Synchronize blit_dst surfaces to avoid data loss when gpu texture scaling is used - Its still faster in such cases to disable gpu texture scaling but some types cannot be disabled without force cpu blit (e.g framebuffer transfers) - Memory management tuning - rsx: on-demand texture cache rescanning for unprotected sections - rsx: Only framebuffer resources are upscaled - Do not resize regular blit engine resources - Lazy initialize readback buffer when using opengl -- These measures should help minimize vram usage	2018-02-16 16:14:54 +03:00
kd-11	33bcdd476c	glsl/fp/vp: Avoid shader clutter - Do not add unused subroutines in shaders unless necessary -- makes shaders easier to read and disassembled spir-v has less clutter - glsl: Replace switch block with lookup table	2018-01-30 21:16:43 +03:00
kd-11	648fc92184	rsx/fp/vp: Epsilon value is too large! - Original epsilon value was 1.E-10 which nvidia linux driver could not read properly -- Restores the original value represented in decimal notation	2018-01-30 21:16:43 +03:00
kd-11	1ea5e7404a	rsx: Workaround for nvidia linux - For some reason, using 1.E-x notation does not work on nvidia linux. Could be a bug in spir-v generator or the driver itself	2017-12-31 12:43:40 +03:00
kd-11	47060cdc5f	rsx/fp: Fix typo	2017-12-18 10:45:37 +03:00
kd-11	7dd349ae8e	Update FragmentProgramDecompiler.cpp	2017-12-18 10:45:37 +03:00
kd-11	4e80858bed	rsx/fp: Hotfix for TEXBEM/TXPBEM	2017-12-18 10:45:37 +03:00
kd-11	e89a035e8b	rsx/fp: Implement TXPBEM	2017-12-18 10:45:37 +03:00
kd-11	f7c52d3bb7	rsx/fp: Implement TEXBEM (untested)	2017-12-18 10:45:37 +03:00
kd-11	6f8dd20f03	rsx/fp: Stuff - Implement BEM - Add LG2 to special instructions	2017-12-18 10:45:37 +03:00
kd-11	cdd4fd9867	rsx/fp: Explicitly insert global functions. - Functions such as pack/unpack ops must exist before the shared gather functions are declared	2017-12-04 18:22:18 +03:00
kd-11	896c8991de	rsx/fp: Properly implement PK/UP instructions based on NV_fragment_program documentation	2017-12-01 21:00:50 +03:00
kd-11	fe9090bd39	rsx/fp: Implement register gather (only for UP(X) instructions) - Workaround for temp register aliasing between H and R variants - TODO: Implement temp regs as 128 bit-blocks with r/w as pack/unpack	2017-12-01 21:00:50 +03:00
kd-11	a18ae0f6ac	rsx/fp: Reimplement PK(X) and UP(X) opcodes. The read back values are obviously in normalized range - Confirmed with a GOW shader which writes result of UP8 to BGRA8 output	2017-12-01 21:00:50 +03:00
kd-11	4d75e98647	rsx/fp: Do not apply input mods to all types of inputs - Temp registers are confirmed to be affected - Const registers are confirmed to be unaffected - Varying inputs are not confirmed yet	2017-12-01 21:00:50 +03:00
kd-11	df7d52b177	rsx/fp: Give abs higher prio as it invalidates any precision checks	2017-11-20 15:18:57 +03:00
kd-11	f5addbf751	rsx/fp: improve SRC modifier order - Neg modifier is applied after clamping. Abs has not been tested/proven so precision clamp goes first now, not last	2017-11-20 15:18:57 +03:00
kd-11	a8c0dd649e	rsx/fp: RE work on precision modifier bits - Testing DS2 has revealed clamping bits in SRC1 that were not respected and left negative values reaching the framebuffer	2017-11-20 15:18:57 +03:00

1 2 3

106 commits