rpcsx

mirror of https://github.com/RPCSX/rpcsx.git synced 2026-02-15 12:14:38 +01:00

Author	SHA1	Message	Date
kd-11	27aeaf66bc	gl: Restructure buffer objects to give more control over usage - This allows creating buffers with no MAP bits set which should ensure they are created for VRAM usage only - TODO: Implement compute kernels to avoid software fallback mode for pack/unpack operations	2019-08-27 21:59:02 +03:00
kd-11	eed32cf3a4	rsx: Decompiler fixups and improvements - Fix 2D coordinate sampling of W coordinate. W is actually HPOS.w and not 1. Z is however always 0. - Optimize register usage a bit Disassembling compiled SPV shows that global declaration results in less ops than using inout modifiers. Modifiers generate extra mov instructions.	2019-08-26 20:03:31 +03:00
kd-11	3e28e4b1e0	rsx/decompiler: Restructure program register behavior - Fix reading of varying registers in FP Different registers have different behavior - Always write to varying registers. If a register is not written to, it is initialized to (0, 0, 0, 1) - Reimplements two-sided lighting correctly without hacks - Also bumps shader cache version	2019-08-26 20:03:31 +03:00
kd-11	fe6ff8622a	rsx: Decompiler fixups for conditional execution - Cond actually obeys vector mask	2019-08-26 20:03:31 +03:00
kd-11	f9aea076ae	rsx: Implement depth_buffer_float support. - Since this is transparent to the application at all time, it only becomes a problem when doing memory transfer or DEPTH->RGBA conversion in shaders.	2019-08-26 20:03:31 +03:00
kd-11	c67c97844e	rsx: Fixup for blit engine range calculations	2019-08-21 21:17:15 +03:00
kd-11	5d1b7eb945	rsx: Fix reference leaks in texture_cache<->surface_cache communication - Properly commit orphaned blocks not invalidating existing cache structures - Do not ignore overwritten objects when committing as unprotected fbo. Avoids stale references to invalidated surface objects.	2019-08-21 21:17:15 +03:00
kd-11	141072023b	rsx: Fix handling of ARGB8 memory - Load into memory as straightforward BGRA - Fixes a bug in vulkan caused by byte shuffling in blit engine vs shader access - Removes the need for memory shuffling when transferring into a rendertarget	2019-08-21 21:17:15 +03:00
kd-11	9cd5325962	rsx: Free memory 'held hostage' by storage sections in the surface cache - Once the memory has been captured by another surface, release the allocation	2019-08-21 21:17:15 +03:00
kd-11	be98554b40	rsx: Fix surface split logic - Calculations are supposed to be done based on the properties of the outgoing surface	2019-08-21 21:17:15 +03:00
kd-11	67dac94704	rsx/fp: Zero-initialize FragDepth register to match hw	2019-08-21 21:17:15 +03:00
kd-11	dca29def5e	rsx: Temporary workaround for race condition in blit engine	2019-08-18 20:45:48 +03:00
kd-11	5e299111cc	rsx/vk: Restructure surface access barriers and implement RCB/RDB - Implements render target data load (aka Read Color Buffer/Read Depth Buffer) - Refactors vulkan surface barrier to be much cleaner. - Removes redundant surface barrier invocations after doing a merged load from surface cache. - Adds explicit access modes when gathering surfaces from cache.	2019-08-18 20:45:48 +03:00
kd-11	dfe709d464	rsx: Surface cache restructuring - Further improve aliased data preservation by unconditionally scanning. It is possible for cache aliasing to occur when doing memory split. - Also sets up for RCB/RDB implementation	2019-08-18 20:45:48 +03:00
kd-11	1de90bdb1f	rsx: Improve aliased data preservation - Carve out inherited region if any - Perform pitch compatibility test before assigning old_surface	2019-07-27 16:09:21 +03:00
kd-11	e2574ff100	rsx: Support CSAA transparency without multiple rasterization samples enabled	2019-07-19 15:49:08 +03:00
kd-11	ea2f4d57fa	rsx: Fixups	2019-07-17 13:29:42 +03:00
kd-11	113a49e00c	rsx: Handle cyclic references when doing memory inheritance	2019-07-17 13:29:42 +03:00
kd-11	34b06453f9	rsx: Handle lost data due to unused data sections - After splitting, the sections may not be referenced at all for anything other than just pixel storage - In such cases, either merge down or sample from the upstream source instead	2019-07-17 13:29:42 +03:00
kd-11	009e01a347	rsx: Set up for multi-section inheritance	2019-07-17 13:29:42 +03:00
kd-11	fc09572648	rsx: Implement texel border decode - Texel borders are no longer actually supported in modern APIs - Removes the border texels and uses border color instead which is incorrect but should work fine	2019-07-11 13:22:13 +03:00
kd-11	d8f753f1e8	rsx: Do not allow framebuffer surfaces that exceed their allocated pitch dimensions - Truncate surfaces to forcefully fit inside the declared region	2019-07-11 13:22:13 +03:00
kd-11	c072c511a1	rsx: Add support for slice padding rows when gathering slices for cubemap/3d	2019-07-09 16:27:59 +03:00
kd-11	ad10eb391e	vk: Reuse discarded memory whenever possible instead of recreating new objects - Memory allocations are surprisingly expensive when spammed	2019-07-03 15:52:16 +03:00
kd-11	71e809a78b	rsx: Implement dma abort in case of a reset after misprediction	2019-07-03 15:52:16 +03:00
Eladash	43f919c04b	Fixup after #6143 (#6146) vm::spu max address was overflowing resulting in issues, so cast to u64 where needed. Fixes #6145. Use vm::get_addr instead of manually subtracting vm::base(0) from pointer in texture cache code. Prefer std::atomic_thread_fence over _mm_?fence(), adjust usage to be more correct. Used sequentially consistent ordering in semaphore_release for TSX path as well. Improved memory ordering for sys_rsx_context_iounmap/map. Fixed sync bugs in HLE gcm because of not using atomic instructions. Use release memory barrier in lwsync for PPU LLVM, according to this xbox360 programming guide lwsync is a hw release memory barrier. Also use release barrier where lwsync was originally used in liblv2 sys_lwmutex and cellSync. Use acquire barrier for isync instruction, see https://devblogs.microsoft.com/oldnewthing/20180814-00/?p=99485	2019-06-29 18:48:42 +03:00
Eladash	1ee7b91646	Refactoring (#6143) Prefer vm::ptr<>::ptr over vm::get_addr. Prefer vm::_ptr/base over vm::g_base_addr with offset. Added methods atomic_t<>::bts and atomic_t<>::btr . Removed obsolete rsx::thread::Read/WriteIO32 methods. Removed wrong check in semaphore_release. Added handling for PUTRx commands for RawSPU MFC proxy. Prefer overloaded methods of v128 instead of _mm_... in VPKSHUS ppu interpreter precise. Fixed more potential overflows that may result in wrong behaviour. Added io/size alignment check for sys_rsx_context_iounmap. Added rsx::constants::local_mem_base which represents RSX local memory base address. Removed obsolete rsx::thread::main_mem_addr/ioSize/ioAddress members.	2019-06-29 01:27:49 +03:00
JohnHolmesII	ebb1ae6408	Properly ignore SIMD macros to avoid warning	2019-06-28 01:40:52 +03:00
JohnHolmesII	be521ff0ab	Fix warnings related to parentheses	2019-06-25 20:36:32 -07:00
kd-11	6a32f716db	rsx: Reimplement vertex layout streaming - Remove string comparisons from the hot-path! - Use attribute streaming and push constants to avoid forcing a descriptor block copy every other draw call/pass. While this isn't so bad on nvidia cards, it makes AMD cards a slideshow.	2019-06-25 20:50:54 +03:00
kd-11	358169507c	rsx: Use SSE to accelerate index buffer uploads	2019-06-25 20:50:54 +03:00
kd-11	c9501b60ab	rsx: Use explicit fma for MAD emulation	2019-06-25 20:50:54 +03:00
kd-11	6be7c58fa4	glsl: Refactoring, cleanup and optimizations - Avoid generating unused code - Reduce GPR usage in emitted code	2019-06-25 20:50:54 +03:00
Lassi Hämäläinen	c963c51a60	Remove unnecessary header includes - Manually removed a lot of unneeded #includes to clean code and reduce compilation time - Reordered some of the #includes to be in a more logical order	2019-06-25 17:11:10 +03:00
Lassi Hämäläinen	e9e87b8bd9	Add missing #includes to header files - Multiple header files were missing #includes to other headers that were used in the header. The correct header was included in the correct order in source files, which caused everything to compile. - Added missing #includes so header files correctly include all their dependencies and fixes problems with IDEs being unable to parse headers correctly due to missing symbols	2019-06-25 17:11:10 +03:00
kd-11	86119f58d6	rsx: Typo fix	2019-06-14 16:19:52 +03:00
kd-11	9d166c5bed	rsx: Force invalidate of children by issuing a resolve notification whenever the parent is written to - Fixes successive reads of an antialiased surface that is still bound between reads	2019-06-14 16:19:52 +03:00
kd-11	8a1cf2c913	rsx: Attempt to reduce stencil load overhead for nvidia cards	2019-06-14 16:19:52 +03:00
kd-11	c655036920	rsx/fp: Ease pressure on fragment shaders when emulating clamp16 - TODO: Option to completely skip clamping in some architectures as it is not needed in most games - Mostly affects older GPUs that do not have access to native fp16	2019-06-14 16:19:52 +03:00
kd-11	bca5f94b3f	rsx: Add option to toggle MSAA	2019-06-14 16:19:52 +03:00
kd-11	ea8409dcfd	rsx: Re-enable optional sample-to-pixel transformation	2019-06-14 16:19:52 +03:00
kd-11	acb14320da	rsx: Fixup for resolution scaling support	2019-06-14 16:19:52 +03:00
kd-11	4a5bbba277	rsx: Enable MSAA - vk: Enable depth buffer resolve+unresolve - vk: Add AMD stenciling extension support - rsx: Temporarily disables MSAA-compatible hacks such as transparency AA - TODO: Add paths to optionally disable MSAA	2019-06-14 16:19:52 +03:00
kd-11	f6f3b40ecc	rsx: Fix AA coordinate transforms - Requires native_pitch value to take samples into account	2019-06-14 16:19:52 +03:00
kd-11	655eff29e8	rsx: Refactoring and cleanup after d3d12 separation - Remove deprecated functionality - Refactor to share code between common routines	2019-06-14 16:19:52 +03:00
kd-11	0d906d6974	rsx: Remove surface aa_mode hacks	2019-06-14 16:19:52 +03:00
scribam	13671d9684	rsx: Apply Clang-Tidy fix "modernize-loop-convert" + const when relevant	2019-06-12 15:11:52 +03:00
scribam	1e327ad31b	rsx: Apply Clang-Tidy fix "readability-avoid-const-params-in-decls"	2019-06-12 15:11:52 +03:00
scribam	44265aa27d	rsx: Apply Clang-Tidy fix "modernize-use-equals-default"	2019-06-12 15:11:52 +03:00
scribam	635695ac78	rsx: Apply Clang-Tidy fix "modernize-use-emplace"	2019-06-12 15:11:52 +03:00
scribam	cba828384d	rsx: Apply Clang-Tidy fix "modernize-pass-by-value"	2019-06-12 15:11:52 +03:00
scribam	b91bcdbbca	rsx: Apply Clang-Tidy fix "modernize-use-bool-literals"	2019-06-12 15:11:52 +03:00
scribam	35dc98be06	rsx: Apply Clang-Tidy fix "readability-string-compare"	2019-06-12 15:11:52 +03:00
scribam	801fa0113f	rsx: Apply Clang-Tidy fix "readability-inconsistent-declaration-parameter-name"	2019-06-12 15:11:52 +03:00
scribam	8f2647555a	rsx: Apply Clang-Tidy fix "readability-redundant-string-init"	2019-06-12 15:11:52 +03:00
scribam	db926ee671	rsx: Apply Clang-Tidy fix "performance-unnecessary-value-param"	2019-06-12 15:11:52 +03:00
scribam	81a3b49c2f	rsx: Apply Clang-Tidy fix "readability-container-size-empty"	2019-06-12 15:11:52 +03:00
scribam	f9ad635856	rsx: TextGlyphs optimizations	2019-06-09 23:09:11 +01:00
Nekotekina	dfd50d0185	Implement std::bit_cast<> Partial implementation of std::bit_cast from C++20. Also fix most strict-aliasing rule break warnings (gcc).	2019-06-02 23:22:16 +03:00
scribam	09c9996f31	Use empty() instead of comparing size() with 0 Recommendation from Clang-Tidy: https://clang.llvm.org/extra/clang-tidy/checks/readability-container-size-empty.html	2019-06-01 22:59:23 +03:00
scribam	bf557ea6e6	Use the more efficient character literal overload for find_first_of/find_last_of Recommendation from Clang-Tidy: https://clang.llvm.org/extra/clang-tidy/checks/performance-faster-string-find.html	2019-06-01 22:59:23 +03:00
scribam	78c7ef3039	rsx: Use clear() instead of resize(0) The result is the same but clear [1] has slightly less code than resize [2] and signals better the intent IMHO. [1] `fb7fb646fa/libstdc%2B%2B-v3/include/bits/stl_vector.h (L1495)` [2] `fb7fb646fa/libstdc%2B%2B-v3/include/bits/stl_vector.h (L934)`	2019-06-01 22:59:23 +03:00
kd-11	f2cac26154	rsx: Refactor out GLSLTypes from GLSLCommon to avoid warning spam due to unused functions when included in settings dialog code	2019-05-31 13:27:43 +03:00
kd-11	507ec8252b	vk: Refactor renderpass management - Ensures the current renderpass matches the image properties even when a cyclic reference is detected - Solves SDK debug output error spam due to mismatching layouts and renderpasses	2019-05-25 14:07:29 +03:00
kd-11	4037225e98	vk: Workaround for cyclic feedback loops - Transition attachments to LAYOUT_GENERAL in case of a feedback loop - Fixes appearance of garbage along polygon edges in some post-processing passes. - Also reverse this transition when rendering goes back to normal	2019-05-17 16:41:17 +03:00
kd-11	cb78522620	rsx: Fixup for uninitialized surface antialiasing mode	2019-05-16 19:25:26 +03:00
kd-11	45a13d0319	rsx: Fixup for lost aliased surfaces - Intersection routines were changed and require explicit identification of the "old surface"	2019-05-16 19:25:26 +03:00
kd-11	05eb1e9193	rsx: Fix zombie image references from inside the texture cache - Do not add locked orphans to the flush_always cache! They will not remove their cache entries as they are not bound	2019-05-16 19:25:26 +03:00
kd-11	214bb3ec87	rsx: Always initialize memory unless it is guaranteed to be wiped	2019-05-16 19:25:26 +03:00
kd-11	88290d9fab	rsx: Hack around using data regions as transfer targets	2019-05-16 19:25:26 +03:00
kd-11	4182f9984d	rsx: Propagate split section information back to the texture cache	2019-05-16 19:25:26 +03:00
kd-11	3c7d8a1099	rsx: Minor texture/surface scanning optimization - Also re-enable optimization in blit engine accidentally disabled during debugging	2019-05-16 19:25:26 +03:00
kd-11	9f0090772a	rsx: Fix write tagging when comments are transferred in by blit engine	2019-05-16 19:25:26 +03:00
kd-11	4b443be881	rsx: Fix self-intersection with previous occupant of the address being replaced	2019-05-16 19:25:26 +03:00
kd-11	b840f6da28	[WIP] rsx: Use a sane reference counting model	2019-05-16 19:25:26 +03:00
kd-11	e3cf3ab6b8	rsx: Minor fixes - Fix transfer scaling (inverted) - Fix under-estimated typeless acquisition when doing depth format scaling	2019-05-16 19:25:26 +03:00
kd-11	e02e27b2b3	rsx: Prevent out-of-bounds writes when resolving shader input textures - The target area can also have padding!	2019-05-16 19:25:26 +03:00
kd-11	88c20afd3a	rsx: Implement unaligned surface inheritance with hierarchical contribution - Allows render targets to behave like stacked 3D views same as shader inputs are resolved - Basically implements most of 'Read Color/Depth Buffers' option for 'free'. - Allows splitting RTV/DSV resources if they are superseded by a partial surface - Also allows intersecting new resources through the surface cache for proper inheritance from other scattered data - TODO: Refactor bind_surface_as_rtt and bind_surface_as_ds to reduce asinine code duplication	2019-05-16 19:25:26 +03:00
scribam	6c5ea068c9	Remove redundant semicolons Fix "-Wextra-semi" warnings	2019-05-12 18:32:11 +03:00
kd-11	6b7cd458e3	rsx: Silence some diagnostics unless compiled with debugging options	2019-05-01 15:36:21 +03:00
kd-11	48cb265c2c	rsx: Bounds check on local resource for atlas merge. - Local resources can also have padded pitch dimensions and false-positives on range overlap tests	2019-05-01 15:36:21 +03:00
kd-11	ec9aa74008	rsx: Fix section base offset calculation for blit_dst targets which affects confirmed memory range - Fixes flushes only writing partially to target memory	2019-05-01 15:36:21 +03:00
kd-11	243df38360	rsx: Fix VP writes to CC with a MOV instruction - When moving to CC, the operation has VEC flag disabled and also temp regs disabled. Looks to be the catch-all ELSE in the selection logic.	2019-04-25 16:23:05 +03:00
kd-11	3cbccdd760	rsx: Fragment shader decompiler cleanup TODO: Investigate the _s input modifier behaviour further, in case it can avoid generating zeroes from a MAD instruction. x = MAD(+ve, -ve, -ve) with _s input modifier in BFBC expects result to be Non-zero	2019-04-25 16:23:05 +03:00
kd-11	4cd1c25729	rsx: Ignore argument sign for SQRT operations	2019-04-25 16:23:05 +03:00
kd-11	32396ba366	rsx: Simplify use of some mixed input functions using OPFLAGS to avoid implicit conversions	2019-04-25 16:23:05 +03:00
kd-11	f12bd8068c	rsx: Fragment decompiler fixups - Properly test for NaN and Inf when clamping down to fp16 - Optimize divsq a bit; mix(vec, vec, bvec) emits OpSelect which is what we want here, instead of component-wise selection which is much slower.	2019-04-25 16:23:05 +03:00
kd-11	abe7188acf	rsx: Proper workaround for broken DIVSQ instruction on realhw - While mul(0, nan) = nan and 0 / 0 = nan, 0 / sqrt(0) = 0 because of hw gremlins. normalize(0) is also nan so this behaviour does not work around that particular case either which makes it even more baffling.	2019-04-25 16:23:05 +03:00
kd-11	60f3059d22	rsx: Compensate for nvidia's low precision attribute interpolation - The hw generates inaccurate values when doing perspective-correct interpolation of vertex output attributes and makes the comparison (a == b) fail even when they are a fixed constant value. - Increase equality tolerance when doing comparisons in fragment shaders for NV cards only to work around this issue. - Teepo fix	2019-04-25 16:23:05 +03:00
kd-11	463b1b220d	rsx: Improve accuracy of shadow compare Ops when non-integer depth formats are used - The fixed-point D24S8 format does special Z clamping during compare which matches PS3 behaviour - D32S8 is a floating point format and comparison with Dref > 1 always fails causing black edges/borders	2019-04-25 16:23:05 +03:00
kd-11	06a85f00d1	rsx: Shader decompiler cleanup and improvements - Improve support for float16_t by minimizing mixed inputs to functions (ambiguous overloads) - Minimize amount of downcasts in code by using opcode flags - Re-enable float16_t support for vulkan	2019-04-25 16:23:05 +03:00
kd-11	a668560c68	rsx: Use native half float types if available - Emulating f16 with f32 is not ideal and requires a lot of value clamping - Using native data type can significantly improve performance and accuracy - With openGL, check for the compatible extensions NV_gpu_shader5 and AMD_gpu_shader_half_float - With Vulkan, enable this functionality in the deviceFeatures if applicable. (VK_KHR_shader_float16_int8 extension) - Temporarily disable hw fp16 for vulkan	2019-04-25 16:23:05 +03:00
kd-11	ee319f7c13	rsx: Implement strict clamp16 operation needed for NVIDIA cards	2019-04-25 16:23:05 +03:00
kd-11	df3b46a611	rsx: Improve texture sourcing and clipping when reverse scanning is enabled - When reverse scanning, offsets are inverted and offset value of 0 is logically equivalent to an offset of -1 - Add an explicit message if clipping happens to avoid silent errors/bugs	2019-04-12 15:36:21 +03:00
kd-11	12dc3c1872	vk: Dynamic heap management to potentially fix ring buffer overflows - Allows checking one heap type at a time, on demand - Should avoid OOM situations unless inside an uninterruptible block	2019-04-09 13:40:54 +03:00
kd-11	a4495c35b7	rsx: Fixups for swizzled texture scanning - Revert to using block metrics, but with optional per-channel decode stage for the final transfer. Much cleaner than hacking in the width to be in channels instead of blocks.	2019-04-09 13:40:54 +03:00
kd-11	0a604e39f1	rsx: Implement RGB655 decode	2019-04-09 13:40:54 +03:00
kd-11	e4e86455f2	rsx: Fix temporary subresource caching behaviour - Do not cache if a gathered subresource contains a bound RTT - Change op to dynamic copy if parent is still bound	2019-04-09 13:40:54 +03:00
kd-11	3249000511	rsx: Improvements to texture scanning - Removes CPU-only transforms that broke GPU-side code. -- Channels in GPU compute are laid out in cell-order, but CPU was uploading in favorable order and compensating with swizzles. -- This leads to 2 different layouts depending on the location of the data (CPU vs GPU) - Implement R8G8_R8B8 interleaved format decode - General improvements	2019-04-09 13:40:54 +03:00
kd-11	366e4c2422	rsx: Preliminary support for format conversions using typeless resolve	2019-04-09 13:40:54 +03:00
kd-11	b7470cfc1a	rsx: Tighten format checks in cache hit tests	2019-04-09 13:40:54 +03:00
kd-11	443fde760f	rsx: Blit engine clipping fixes - Do not round up sub-pixel offsets, round down instead - Do not allow incomplete sources for hw blit transfer - Reimplement src clipping (slice_h) - Check 'area' of incoming texels and correct for them before RTT lookup/transfer - Filter out incomplete targets when performing RTT lookup (1 texel or less contribution)	2019-04-09 13:40:54 +03:00
kd-11	41b87cf577	rsx: Blit engine fixes - If a transfer writes to a RTT and depth mismatch happens, create a local target and the upload function will likely resolve between the two - If a surface is rejected, reset the target region!	2019-03-22 21:27:15 +03:00
kd-11	86ad204636	rsx: Rebase output region when using upload-fallback path	2019-03-22 21:27:15 +03:00
kd-11	dbc8e70ddd	rsx: Silence some compiler noise	2019-03-22 21:27:15 +03:00
kd-11	adc59f9810	rsx: Fix blit transfers when texel sizes mismatch - Also refactors some bpp handling code - Simplify texture intersection test to use a normalized/uniform coordinate space - Fix broken bounds checking as well	2019-03-22 21:27:15 +03:00
kd-11	03fca73cf4	rsx: Fix blit intersection falling outside the available texture - Just because we have a hit inside the tile of interest does not guarantee that it sits inside the texture!	2019-03-20 10:05:54 +03:00
kd-11	3ef16bee47	rsx: Fix texture lookups and avoid out-of-bounds copies/transfers	2019-03-17 21:50:11 +03:00
kd-11	bb65e45614	rsx: Implement GPU acceleration for rotated images	2019-03-17 21:50:11 +03:00
kd-11	5260f4b47d	rsx: Improvements to memory flush mechanism - Batch dma transfers whenever possible and do them in one go - vk: Always ensure that queued dma transfers are visible to the GPU before they are needed by the host Requires a little refactoring to allow proper communication of the commandbuffer state - vk: Code cleanup, the simplified mechanism makes it so that it's not necessary to pass tons of args to methods - vk: Fixup - do not forcefully do dma transfers on sections in an invalidation zone! They may have been speculated correctly already	2019-03-17 21:50:11 +03:00
kd-11	385485204b	vk/gl: Omit unlocked data when grabbing flip sources from texture cache	2019-03-17 21:50:11 +03:00
kd-11	74eeacd091	vk/gl: Improve memory tag sync and test - Properly pass parameters such as rsx-pitch to the surface store - Do not crash if a surface fails verification in flip, use fall-back instead	2019-03-17 21:50:11 +03:00
kd-11	1a44446250	rsx: Fix dst upload block region - The section needed starts at image origin, not transfer origin!	2019-03-17 21:50:11 +03:00
kd-11	a49a0f2a86	vk/gl: Synchronization improvements - Properly wait for the buffer transfer operation to finish before map/readback! - Change vkFence to vkEvent which works more like a GL fence which is what is needed. - Implement supporting methods and functions - Do not destroy fence by immediately waiting after copying to dma buffer	2019-03-17 21:50:11 +03:00
kd-11	85cb703633	rsx/cache: Debugging bugs introduced by the atlas coverage check - Figured out why it breaks things, ofc can't actually check for coverage when there is no proper fbo data persistence	2019-03-17 21:50:11 +03:00
kd-11	3a4083263e	rsx: Fix texture transfer when pitch does not match exactly	2019-03-17 21:50:11 +03:00
kd-11	612160a8ff	rsx: Fix zero-pitch textures - Assumption here is that only texel (0, 0) is accessible. Inline with other pitch 0 operations. - TODO: Verify pitch 0 does not advance in Y either	2019-03-17 21:50:11 +03:00
kd-11	17c49d21a5	rsx/blit: Remove workarounds/hacks added for master. Start implementation/stubs for blit engine rotations in GPU	2019-03-17 21:50:11 +03:00
kd-11	745f8f9627	rsx: Remove pointless assert	2019-03-17 21:50:11 +03:00
kd-11	358558aaa7	cleanup and fixups	2019-03-10 16:09:05 +03:00
kd-11	04dda44225	rsx: Properly generate render target data with all parameters provided - Build-up to variable-sized framebuffers and AA implementation - Also allows accurate range calculation for our hit testing	2019-03-10 16:09:05 +03:00
kd-11	21bc6c7a87	rsx: Properly resolve data for upload when needed. - Avoids blindly reusing blit dst sections as they may contain garbage. If a section was unlocked for a flush, just discard it as its reuse introduces potential data corruption. Since the data needs to be reuploaded anyway (for now), it's better to start afresh - In case of format mismatch, reset the calculated dst block - Add a bounds check to determine if data contained in an atlas is good enough for sampling the cache. If not enough data is provided, fall back to full upload	2019-03-10 16:09:05 +03:00
kd-11	9d4d3d9443	rsx: Reimplement render target intersection tests when using hw accelerated blit engine - Properly collapse memory tree when scanning in case of overlaps!	2019-03-10 16:09:05 +03:00
kd-11	7c379432dd	rsx: Implement proper pitch compatibility lookup - When a single row is required or is all that is available, pitch has no meaning as the coordinate space changed to 1D	2019-03-10 16:09:05 +03:00
kd-11	dccb4a4888	rsx/texture_cache: fixes to commit_framebuffer_memory	2019-03-10 16:09:05 +03:00
kd-11	b9e7b085fe	rsx/texture_cache: Fixups for local resource hit and fast-path added	2019-03-10 16:09:05 +03:00
kd-11	10dc3dadee	rsx/texture_cache: Improve framebuffer memory locking when WCB/WDB is not enabled - Adds a new mode that removes non-framebuffer stuff inside framebuffer range	2019-03-10 16:09:05 +03:00
kd-11	563e205a72	rsx/texture_cache: Fix 'AA' scaling hack and restore collection template selection	2019-03-10 16:09:05 +03:00
kd-11	fa628f0ac4	rsx/surface_store: More aggressive tag sampling - Use a 5-point tap with an X pattern across the target's memory space to reduce chances of false positives - TODO: Potential false positives identified, requires some minor restructuring of surface_store	2019-03-10 16:09:05 +03:00
kd-11	3a071a9c07	rsx: Texture search rewrite - Perform a full search across all resource types as needed without taking too many shortcuts/hacks	2019-03-10 16:09:05 +03:00
kd-11	6ef9dcd62e	rsx: Handle mismatched/invalidated framebuffer sections when WCB is enabled	2019-03-10 16:09:05 +03:00
kd-11	ef071ebb6b	rsx: Synchronize surface cache and texture cache data - TODO: The whole upload_texture thing is a big hack, fix it properly	2019-03-10 16:09:05 +03:00
kd-11	2163a59649	rsx: Typo fix	2019-01-25 14:34:22 +03:00
kd-11	fb778e4821	rsx: Reimplement attrib divisor	2019-01-25 14:34:22 +03:00
kd-11	736415fcd9	rsx/fp: Detect broken/NOP shaders automatically - Do not compile body if the shader is of no consequence, leave as a passthrough shader	2019-01-25 14:34:22 +03:00
kd-11	6fdc0fd7f0	rsx: Reimplement MSAA transparency - Apply dither to edges that almost fail the straight-up alpha test - Significantly improves alpha tested geometry far from the camera - Also removes blend factor overrides/hacks as they give incorrect results due to background bleeding	2019-01-25 14:34:22 +03:00
kd-11	417a2e6731	rsx: Refactor index buffers - Index offset is ignored anyway and only used to calculate vertex attribute divisor index - Specialized optimization for untouched xfer without primitive restart	2019-01-25 14:34:22 +03:00
Nekotekina	bd9131ae1c	Implement fs::get_cache_dir Win32: equal to config dir for now Linux: respect XDG_CACHE_HOME if specified OSX: possibly incomplete	2019-01-13 14:45:36 +03:00
kd-11	52ac0a901a	rsx: improve memory coherency - Avoid tagging and rely on read/write barriers and the dirty flag mechanism. Testing is done with a weak 8-byte memory test - Introducing new data when tagging breaks applications with race conditions where tags can overwrite flushed data	2019-01-06 10:44:40 +03:00
kd-11	89c9c54743	rsx: Minor hot-fix - Pitch 0 makes sense if width == 1 and height == 1	2019-01-06 10:44:40 +03:00
kd-11	2a62fa892b	rsx: Texture cache refactor - gl: Include an execution state wrapper to ensure state changes are consistent. Also removes a lot of required 'cleanup' for helper methods - texture_cache: Make execution context a mandatory field as it is required for all operations. Also removes a lot of situations where duplicate argument is added in for both fixed and vararg fields - Explicit read/write barrier for framebuffer resources depending on usage. Allows for operations like optional memory initialization before reading	2019-01-06 10:44:40 +03:00
kd-11	3be4b474d9	rsx: Handle rsx-self-tripping in draw call and triggering invalid invalidation - If draw call resources consume memory that intersects with NA parts of the texture cache, we get a framebuffer test mismatch. This mismatch is false and happens because the thread has not yet reached the point of relocking the pages	2019-01-06 10:44:40 +03:00
kd-11	a95a44cf66	rsx: Strictness cleanups - Also account for variable pitch textures (swizzled scan)	2019-01-06 10:44:40 +03:00
kd-11	362eea09a1	whitespace fix only	2019-01-06 10:44:40 +03:00
kd-11	15d5507154	rsx: Rewrite memory inheritance transfers - Implicitly invoke a memory barrier if actively reading from an unsynchronized texture - Simplify memory transfer operations - Should allow more games to work without strict mode	2019-01-06 10:44:40 +03:00
kd-11	97704d1396	rsx: Fix texture size calculations	2019-01-06 10:44:40 +03:00
kd-11	50c07833e4	rsx: Do not force upload for missing data - TODO: Finish implementing GPU RCB for mem-sync - TODO: Refactor mem-sync	2019-01-06 10:44:40 +03:00
kd-11	15488eb247	rsx: Avoid unnecessarily touching framebuffer memory - Do not bind companion framebuffer when clearing single aspect; let the contest mechanism sort it out instead - Do not prematurely tag framebuffers, instead only do so at write-confirmation time. Should avoid false tagging if setup does not allow a render to occur.	2019-01-06 10:44:40 +03:00
Megamouse	bb464b0b64	fix some warnings	2019-01-05 04:03:18 +01:00
kd-11	7555be232f	rsx/vp: Fix double dst commands - Test the vec_result mask before assigning to actual output Sometimes, VEC op is used to write to Rx, and SCA op is used to write to o[x]!	2018-12-24 09:05:19 +03:00
kd-11	4b79ef1ad9	rsx: Implement stencil mirror views - Implements a mirror view of D24S8 data that accesses the stencil components. Finishes the implementation of TEX2D_DEPTH_RGBA as the stencil component was previously missing from the reconstructed data - Add a few missing destructors Image classes are inherited a lot and I forgot to make the dtors virtual	2018-12-24 09:05:19 +03:00
kd-11	696b91cb9b	rsx: Reimplement conditional execution in shaders - Per-channel conditional execution introduces RAW hazards all over the place - It's cheaper to process both branches and select between the two - Also improves ShaderVariable functionality to allow functionality such as match_size and taking complex variables as inputs	2018-12-24 09:05:19 +03:00
Rui Pinheiro	54bfe2e102	Add log warning on slow flush path	2018-12-11 22:37:10 +03:00
Rui Pinheiro	18b9ee4541	Reimplement overlapping fbo "hack" To avoid the need (and performance hit) of Read Color/Depth Buffers, we may not invalidate overlapping fbos inside lock_memory_region unless they are guaranteed to be superseded by the new one. This avoids e.g. issues with overblooming, among others.	2018-12-11 22:37:10 +03:00
Rui Pinheiro	5ab7296665	Fix xcode build	2018-12-11 22:37:10 +03:00
Rui Pinheiro	bcdf91edbb	Misc. Texture Cache fixes	2018-12-11 22:37:10 +03:00
Rui Pinheiro	9d1cdccb1a	Implement dedicated texture cache predictor	2018-12-11 22:37:10 +03:00
Rui Pinheiro	af360b78f2	Texture cache section management fixups Fixes VRAM leaks and incorrect destruction of resources, which could lead to driver crashes. Additionally, lock_memory_region is now able to flush superseded sections. However, due to the potential performance impact of this for little gain, a new debug setting ("Strict Flushing") has been added to config.yaml	2018-12-11 22:37:10 +03:00
eladash	4ddafc481e	remove unreachable code	2018-12-04 13:01:29 +03:00
kd-11	504ab5a6d4	rsx: Minor cleanup to silence stupid compiler warnings	2018-12-03 20:01:23 +03:00
kd-11	7b065d7781	rsx: Fixup; input attributes blob decoding - Use an unstructured blob and index into the vec4 structures to extract the real data	2018-11-30 23:51:25 +03:00
kd-11	846daadd5d	rsx: Fixups - Improve vertex attribute layout format. Allows for full 16-bit attribute divisor - Use actual pitch when declaring framebuffer rsx pitch instead of register value in case of swizzle? rendering	2018-11-30 23:51:25 +03:00
kd-11	1ad76ad331	rsx: Restructure programs - Also re-enable pipeline optimizations	2018-11-30 23:51:25 +03:00
kd-11	677b16f5c6	rsx: Fixups - Also fix visual corruption when using disjoint indexed draws - Refactor draw call emit again (vk) - Improve execution barrier resolve - Allow vertex/index rebase inside begin/end pair - Add ALPHA_TEST to list of excluded methods [TODO: defer raster state] - gl bringup - Simplify - using the simple_array gets back a few more fps :)	2018-11-30 23:51:25 +03:00
kd-11	e01d2f08c9	rsx: Refactor FIFO - Removes fifo structures from common RSXThread - Sets up a dedicated FIFO controller - Allows for configurable queue optimizations	2018-11-30 23:51:25 +03:00
eladash	83b6c98563	rsx: Fix u16 index arrays overflow Force u32 index array destinations to avoid overflows when adding vertex base index.	2018-10-08 16:39:47 +03:00
eladash	e361e0daa6	rsx: Fix restart index check for u16 index arrays Don't ignore upper bits of the restart index with u16 types	2018-10-08 16:39:47 +03:00
eladash	348db050ae	rsx: Fix texture height read	2018-10-03 20:57:46 +03:00
eladash	fa723f6dc4	rsx: Fix texture depth read	2018-10-03 20:57:46 +03:00
eladash	6586090307	rsx: Remove texture size hack	2018-10-03 20:57:46 +03:00
eladash	eacd1b8f13	rsx: Remove texture address hack	2018-10-03 20:57:46 +03:00
Nekotekina	da6ce80f4f	Make vm::get_super_ptr return contiguous memory Cleanup RSX code complexity	2018-09-27 23:37:13 +03:00
kd-11	dab30c0051	rsx: Disable predictions if 50% of predictions are wrong - This happens often in loading screens where the memory usage pattern is often randomized by loading in of assets	2018-09-24 21:19:38 +03:00
Rui Pinheiro	35139ebf5d	Texture cache cleanup, refactoring and fixes	2018-09-24 15:26:40 +03:00
kd-11	dafc914bcc	rsx: temporary hack - Removes all use of valid_count as a metric until the new refactor is merged	2018-09-21 16:32:23 +03:00
kd-11	fc486a1bac	rsx: Preserve memory order when doing flush - Orders flushing to preserve memory at all cost - Avoids false positive where flushing overlapping sections can falsely invalidate another with head/tail test	2018-09-21 16:32:23 +03:00
kd-11	a21bdb9f45	rsx: blit engine fixes - Forcefully downloads and reuploads data from the CPU in case of unexpected overlaps - Properly detect correct size of newly created blit targets - Remember to clear any existing views when changing the default component map!	2018-09-21 16:32:23 +03:00
kd-11	16dcbe8c74	rsx/vp: Fix ARL opcode properly - NOTE: The address swizzle index is only for use as src. The address registers are only used one channel at a time. - When the destination of ARL, the encoding is the same as the other temp registers	2018-09-15 11:57:06 +03:00
kd-11	f413996362	rsx: Minor texture cache fixes - Retag resources reprotected under flush_always rules - Properly check for blit resource fitting taking into account format mismatch, pitch mismatch and typeless transfers	2018-09-10 15:43:28 +03:00
Dzmitry Malyshau	27474316fd	Add missing virtual destructors (#5094)	2018-09-07 14:35:40 +03:00
kd-11	66610a28af	rsx/common: Clean up shared glsl header to minimize string concat operations	2018-09-06 21:11:11 +03:00
kd-11	346b97f871	rsx: Preserve fog coordinate across shader stages - The x value contains the VP output value interpolated across primitive surface - The y coordinate contains the fog fraction according to the selected fog formula	2018-09-06 21:11:11 +03:00
scribam	d7bb59cd99	c++17: use std::size	2018-09-06 13:15:59 +03:00
Nekotekina	ca5158a03e	Cleanup semaphore<> (sema.h) and mutex.h (shared_mutex) Remove semaphore_lock and writer_lock classes, replace with std::lock_guard Change semaphore<> interface to Lockable (+ exotic try_unlock method)	2018-09-03 23:00:36 +03:00
Nekotekina	ce4c4696dd	Try to get rid of SIZE_32 macro	2018-09-03 21:40:36 +03:00
kd-11	dea5193fd7	rsx: Fix FP temp register count	2018-09-03 21:39:18 +03:00
kd-11	6399833182	rsx: Fix endianness order when immediate mode register is updated, but used as register lookup - Simplify the code by unifying all the register-backed memory	2018-09-03 18:24:20 +03:00
Nekotekina	a93a40e8d9	Disable texture_cache::emit_once (MSVC crash)	2018-08-25 01:58:28 +03:00
Nekotekina	1c6c24f8ac	Update GSL and yaml-cpp submodules	2018-08-25 01:15:47 +03:00
Nekotekina	923314aef5	Rewrite texture_cache::emit_once Also trying to workaround MSVC bug	2018-08-25 01:15:47 +03:00
kd-11	c6e35706a3	vk: Support sw component swizzle decode because metal sucks	2018-08-23 22:54:56 +03:00
kd-11	f3d3a1a4a5	rsx: Rework section reuse logic	2018-08-22 17:22:54 +03:00
kd-11	937f1e8cd0	fix gcc build	2018-08-18 16:14:30 +03:00
kd-11	4b2b662c3a	rsx: Followup to the memory inheritance hierarchy patch - Tags framebuffer resources on first use (when on_write is called to verify memory) - Texture cache now selects the best match and even sorts atlas writes with memory write order to avoid older data showing over newer one	2018-08-18 16:14:30 +03:00
kd-11	cca488d0cf	rsx: Enable swizzled decode for all formats unless proven otherwise - Some formats are proven to ignore swizzle flag - DXT compressed textures - COMPRESSED_BG_GB class textures - Some applications are using swizzled wide integer formats so those are confirmed to swizzle	2018-08-18 16:14:30 +03:00
kd-11	f8a9b1fa30	[WIP] rsx: Improve memory inheritance hierarchy - Cascade memory writes by invalidating 'downstream' subsurfaces - Fixup; always resolve for overlapping surfaces before sampling (force atlas gather test)	2018-08-18 16:14:30 +03:00
kd-11	cc7848b3ef	vulkan: Fix blit engine transfer to ARGB8 render target memory	2018-08-18 16:14:30 +03:00
kd-11	0267221586	Minor optimizations and fixes - FIFO: avoid multiline spam - VK: Fix program setup counter - FS: Precalculate fragment constants buffer size during analysis step	2018-08-18 16:14:30 +03:00
Rui Pinheiro	23b52e1b1c	Mark unsync textures dirty when deferred flushing invalidate_range_impl_base does not mark all textures that will only be unprotected as dirty when doing a deferred flush, since that is done by flush_all. However, if there are no sections to flush, the deferred flush will use the same code path as non-deferred flushes for unprotecting textures and forget to mark them as dirty. This commit fixes this bug.	2018-08-16 15:38:36 +03:00
Rui Pinheiro	fa6a5761b3	Refactor get_intersecting_set The existing implementation restarts the loop immediately after finding a range_data instance that updates the trampled_range. This commit refactors this method to continue the loop with the updated trampled_range, and then repeat only those range_data instances that were iterated through before the trampled_range was last updated. As a result, the number of total iterations required is reduced.	2018-08-16 15:38:36 +03:00
Rui Pinheiro	b534d49e48	Fix off-by-one error in get_intersecting_set When the trampled range changes, get_intersecting_set restarts the outer loop. However, due to an off-by-one error, it skips the first cache entry when doing so. This can cause a texture not to be correctly unlocked, which could lead to issues or even deadlocks. This commit fixes this off-by-one error.	2018-08-16 15:38:36 +03:00
eladash	f349695a75	Rsx: rewrite address translation	2018-08-13 16:16:34 +03:00
kd-11	19d808d378	rsx/gl: Minor cleanup and optimization - Track register change status - Remove unused gl classes	2018-07-22 17:19:59 +03:00
kd-11	8695f95267	rsx: Reimplement cached textures and their views	2018-07-22 17:19:59 +03:00
scribam	65d270e5d8	clang-tidy: performance-faster-string-find https://clang.llvm.org/extra/clang-tidy/checks/performance-faster-string-find.html	2018-07-15 12:51:09 +04:00
kd-11	e7f30640ef	rsx: Async shader compilation - Defer compilation process to worker threads - vulkan: Fixup for graphics_pipeline_state. Never use struct assignment operator on vk** structs due to padding after sType member (4 bytes)	2018-07-14 15:19:56 +03:00
kd-11	fa55a8072c	rsx: Improve vertex textures support - Adds proper support for vertex textures, including dimensions other than 2D textures - Minor analyser fixup, removes spurious 'analyser failed' errors - Minor optimizations for program state tracking	2018-07-12 18:02:28 +03:00
kd-11	d78957d1cf	rsx/vp: CodeGen improvements - Fix double destination writes on conditional write masking - Fix codegen to simplify simple scalar comparisons vs vector functions	2018-07-07 16:20:33 +03:00
kd-11	2c34195954	rsx/vp: Discard broken vertex programs with no writes to POS register	2018-07-07 16:20:33 +03:00
kd-11	2ca935a26b	vp: Improve vertex program analyser - Adds dead code elimination - Fix absolute branch target addresses to take base address into account - Patch branch targets relative to base address to improve hash matching - Bumps shader cache version - Enables shader logging option to write out vertex program binary, helpful when debugging problems.	2018-07-07 16:20:33 +03:00
kd-11	bd915bfebd	rsx: vp decompiler fixes - Fix program abort logic to never abort before resolving later label addresses Fixes jumping over broken code and jumping over END markers - TEXTURE_CONTROL2 has indexing range of [0..15] without stride skipping! This register does not have interleaving with other texture registers - Track shader address poke as it seems to invalidate programs as well	2018-07-07 16:20:33 +03:00
kd-11	24f4c92759	rsx: Improve texture cache read speculation	2018-06-26 20:07:20 +03:00
kd-11	1730708f47	rsx: Rework memory protection management for framebuffer access - Avoid re-locking memory if there is no reason to do so (no draws issued) - Actively bound regions should always get written to the backing cache - Forcefully read memory during download if writes to the target have occurred since last sync event	2018-06-26 20:07:20 +03:00
kd-11	d77e62c94e	rsx: Improve GPU resource read prediction	2018-06-18 17:32:22 +03:00
kd-11	2afcf369ec	vk: Add synchronous compute pipelines - Compute is now used to assist in some parts of blit operations, since there are no format conversions with vulkan like OGL does - TODO: Integrate this into all types of GPU memory conversion operations instead of downloading to CPU then converting	2018-06-18 17:32:22 +03:00
kd-11	dd4c13b625	rsx: Avoid race conditions in unsynchronized unprotect	2018-06-18 17:32:22 +03:00
kd-11	3150619320	rsx: Preserve read AA state separate from write AA state - Some applications (e.g Backbreaker) use an evil hack to resolve MSAA. The application respecifies a formerly AA region as a region with no AA then performs a framebuffer feedback lookup. The old memory keeps AA during read, but writes back to itself with AA resolved. This is evil on several levels but it just happens to work on PS3	2018-06-08 22:17:50 +03:00
kd-11	0f24379c0e	rsx: Obey MSAA resolve during memory persistence transfer - Ugh. This is a bandaid on a festering wound, AA badly needs a rewrite Also silence some warnings	2018-06-08 22:17:50 +03:00
Dravonic	400079a006	Parallel shader cache loading (#4677)	2018-06-01 19:49:29 +03:00
kd-11	b030d1900c	rsx: Fixup - fix broken memory protection fail caused by region respec - Some applications will alternate memory between framebuffer and texture data	2018-05-24 10:36:04 +03:00
kd-11	f8d999b384	fixup - range check	2018-05-23 19:07:08 +03:00
kd-11	3f14bc6961	rsx: Silence some meaningless error	2018-05-23 19:07:08 +03:00
kd-11	8fcd5c1e5a	rsx: Texture cache fixes 1. rsx: Rework section synchronization using the new memory mirrors 2. rsx: Tweaks - Simplify peeking into the current rsx::thread instance. Use a simple rsx::get_current_renderer instead of asking fxm for the same - Fix global rsx super memory shm block management 3. rsx: Improve memory validation. test_framebuffer() and tag_framebuffer() are simplified due to mirror support 4. rsx: Only write back confirmed memory range to avoid overapproximation errors in blit engine 5. rsx: Explicitly mark clobbered flushable sections as dirty to have them removed 6. rsx: Cumulative fixes - Reimplement rsx::buffered_section management routines - blit engine subsections are not hit-tested against confirmed/committed memory range Not all applications are 'honest' about region bounds, making the real cpu range useless for blit ops	2018-05-23 19:07:08 +03:00
scribam	04ad49de4d	typos	2018-05-14 21:14:39 +04:00
kd-11	4836a03a7d	rsx: Fix build	2018-05-13 14:44:14 +03:00
kd-11	b7979d3f57	rsx/vk: Improvements and minor optimizations - Improve dirty state tracking affecting program state - vk: Refactor out transform constants upload into a separate channel to avoid it if possible; transform data uploads are quite expensive	2018-05-13 14:44:14 +03:00
kd-11	a52ea7f870	rsx: Improve fragment and vertex program usage - Introduces a gpu program analyser step to examine shader contents before attempting compilation or cache search - Avoids detecting shader as being different because of unused textures having state changes - Adds better program size detection for vertex programs - Improved vertex program decompiler - Properly support CAL type instructions - Support jumping over instructions marked with a termination marker with BRA/CAL class opcodes - Fix SRC checks and abort - Fix CC register initialization - NOTE: Even unused SRC registers have to be valid (usually referencing in.POS)	2018-05-13 14:44:14 +03:00
kd-11	f3210a9a33	rsx: Workaround for lost memory sections - TODO: surface_cache and texture_cache need a better method of persisting partial framebuffer resources	2018-04-25 19:14:36 +03:00
kd-11	291a828217	fixups	2018-04-25 19:14:36 +03:00
kd-11	da99f3cb9a	rsx: Critical fixes - texture cache: Avoid leaking memory sections - Avoid double ref increment on flush-always reprotection - Detect invalidated_resources entries in surface cache when protecting fbo memory - vk: Copypasta bugfix, properly initialize aspect mask	2018-04-25 19:14:36 +03:00
kd-11	a42b00488d	rsx: Texture fixes - gl/vk: Fix subresource copy/blit - gl/vk: Fix default_component_map reading - vk: Reimplement cell readback path and improve software channel decoder - Properly name the subresource layout field - it's in blocks not bytes! - Implement d24s8 upload from memory correctly - Do not ignore DEPTH_FLOAT textures - they are depth textures and abide by the depth compare rules - NOTE: Redirection of 16-bit textures is not implemented yet	2018-04-25 19:14:36 +03:00
kd-11	9abbbb79ae	rsx: Blit engine fixes - Ignore unlocked blit sections [TODO] - Do not attempt blit on hw if bytesize is unsupported - gl: Implement typeless memory transfers Uses pbo to handle type-agnostic memory transfer	2018-04-25 19:14:36 +03:00
kd-11	cf1b700ebc	rsx: Improve format mismatch detection hack	2018-04-25 19:14:36 +03:00
kd-11	cfd0b8a975	rsx: Fix alphakill	2018-04-05 01:06:50 +03:00
kd-11	53f2533a08	rsx: Implement proper Z-order curve in 3 dimensions - Should fix garbage palette textures getting uploaded (LSD graphics)	2018-04-05 01:06:50 +03:00
kd-11	e291494282	rsx: Texture cache updates - Properly implement section gather for 3d and cubemaps Implements render-to-3d and fixes some corner cases for render-to-cubemap	2018-04-05 01:06:50 +03:00
Jake	6d6d6fa827	dx12/vk/gl: implement use of vertex_data_base_index when calculating index	2018-03-30 13:30:04 +03:00
kd-11	ee0fe28ddc	rsx: Fix copypasta	2018-03-29 13:52:11 +03:00
kd-11	5aac8aa424	rsx: Clamp negative fog distance	2018-03-25 16:02:47 +03:00
kd-11	887ea43e39	rsx: Fix some texture cache problems - gl/vk: Properly handle remapping temporary resources	2018-03-25 13:31:06 +03:00
kd-11	321c360dcb	rsx: Overhaul rendertarget sampling/shuffles - Reimplements render target views used for sampling - Optimizes access using an encoded control token - Adds proper encoding for 24-bit textures (DRGB8 -> ORGB/OBGR) - Adds proper encoding for ABGR textures (ABGR8 -> ARGB8) - Silence some compiler warnings as well - TODO: Real texture views for OGL current method is a hack	2018-03-25 13:31:06 +03:00
kd-11	9fc1740608	rsx/fp: Fragment program overhaul - Separate TXB from TXL: They are completely different! - Properly perform TMU emulation in the fragment shader. Implements SRGB conversion and alphakill at the moment - Properly perform ROP emulation in the fragment shader. Implements FRAMEBUFFER_SRGB. While support on the chip looks to be incomplete (and weird), it does work - Document some more bits in SHADER_CONTROL register	2018-03-25 13:31:06 +03:00
kd-11	9f416e5ce1	rsx/gl/vk: Obey channel remapping on framebuffer resources if requested	2018-03-25 13:31:06 +03:00
kd-11	27552891ad	rsx/fp: Improvements - Export some debug information in the free texture register space components zw Very useful when analysing renderdoc captures - Enable shadow comparison on depth as long as compare function is active and texture is uploaded for depth read Some engines (UE3) read all the components in the shader and use mul/mad with the result	2018-03-25 13:31:06 +03:00
kd-11	5817f9fe3f	rsx: Texture format fixes - Implement SRGB (gamma corrected) textures (DXT1, DXT3, DXT5, RGBA8 only) - Fix channel map decode for XY data texture formats - Fix remap layout for X16 textures (verified with Mass Effect 3)	2018-03-25 13:31:06 +03:00
pauls-gh	44cddda4b4	Fix VTC source index increment	2018-03-23 12:01:30 +03:00
pauls-gh	d79a544320	VTC tiling - fix source offset increment.	2018-03-23 12:01:30 +03:00
pauls-gh	e5b4710471	Add end condition for VTC copy. This handles the case when depth is not a multiple of 4.	2018-03-23 12:01:30 +03:00
pauls-gh	e6010ba2ca	Fix code formatting	2018-03-23 12:01:30 +03:00
pauls-gh	fd8d2ecbf4	Remove Volume Texture Compression (VTC) tiling for Vulkan, DX12 and ATI (OpenGL).	2018-03-23 12:01:30 +03:00
kd-11	ffe6c9ba5a	fix linux builds	2018-03-13 18:55:03 +03:00
kd-11	e230867492	rsx: Properly implement raster window offsets	2018-03-13 18:55:03 +03:00
kd-11	d41b49d8b4	rsx/fp: Color output registers are always present and zero initialized - According to NV_fragment_program spec, registers are zero initialized always - A program even without writing to these registers will have black (0, 0, 0, 0) output Confirmed behaviour with MotorStorm games. Their engine uses this quirk to clear color buffers when doing depth replace Might be an unfixed game bug	2018-03-13 18:55:03 +03:00
kd-11	20d4c09a1c	rsx/vk/gl: Enforce format matching for render target resources. Fall back to raw data copy if match fails - Forces Bitcast of texture data if input format cannot possibly be the same as the existing texture format - rsx: Other minor improvements to texture cache :- - remove obsolete blit engine incompatibility warning. The texture will be re-uploaded if it is indeed incompatible - Implement warn_once and err_once to avoid spamming the log with systemic errors - Track mispredicted flushes - Reswizzle bitcasted texture data to native layout TODO: Also needs reshuffle according to input remap vector	2018-03-13 18:55:03 +03:00
kd-11	68b3229756	rsx/fp: Improve register component gather detection - Also avoids clobbering register data by keeping gathered bits in a temp var	2018-03-13 18:55:03 +03:00
kd-11	87741141f1	rsx/vulkan: Add post-compilation key validation and dynamically determine attachment write masks based on decompiled shader - A new step is added between decompilation and pipeline object creation allowing for properties to be updated based on shader contents - Allows masking off attachment writes that are unmodified in the shader	2018-03-13 18:55:03 +03:00
kd-11	705820c430	rsx: Nvidia driver compatibility workarounds - Sanitize NaN values before they reach the driver. On nvidia (X * NaN = X)	2018-03-13 18:55:03 +03:00
kd-11	07cbf3da48	rsx/gl: Minor fixes - Identify depth textures reaching the gpu via shader_read upload path - Use correct timestamp counter for opengl - inline draw_state::test_property because msvc doesn't do it for us	2018-03-13 18:55:03 +03:00
kd-11	8ccaabb502	vulkan: Optimize vertex data upload - Reuse buffer views as much as possible, vkCreateBufferView is slow on NV Implemented as a large sliding window, reuseable until it is filled	2018-03-13 18:55:03 +03:00
kd-11	01349b8cee	rsx: Texture cache fixes - Optionally attempt to merge framebuffers into an atlas if partial resources are missing - Support for data update requests to the temporary subresource handler This is useful for framebuffer feedback loops where a new copy is needed after every draw call (resource is always dirty)	2018-03-13 18:55:03 +03:00
kd-11	4487cc8e7a	Remove an ugly hack pertaining to partial framebuffer-resident texture data - It's better to fill in the missing information with a wrap or clamp than to fake the texture reads in valid regions - Texture coordinate scaling is used to fill in for the cropped dimension available	2018-03-13 18:55:03 +03:00
kd-11	661b8b006f	rsx: Add texture readback statistics to the texture cache and debug overlay	2018-02-16 16:14:54 +03:00
kd-11	1bd77c2f51	rsx: Add cache pattern checking to blit engine resources - Feature was implemented long ago but was not functional due to bugs	2018-02-16 16:14:54 +03:00
kd-11	c191a98ec3	vulkan API fixes - Fix for texture barriers - vulkan: Rework texture cache handling of depth surfaces - Support for scaled depth blit using overlay pass - Support proper readback of D24S8 in both D32F_S8 and D24U_S8 variants - Optimize the depth conversion routines with SSE - vulkan: Replace slow single element copy with std::memcpy - Check heap status before attempting blit operations - Bump guard size on upload buffer as well	2018-02-16 16:14:54 +03:00
kd-11	3bbecd998a	infinitesimal fixes	2018-02-16 16:14:54 +03:00
kd-11	a64bea1286	rsx/fp: Discard shaders with undefined (non-existent) writes. On nvidia+vulkan, undefined writes autofill with blue color	2018-02-16 16:14:54 +03:00
kd-11	89c548b5d3	rsx: fbo fixes 2.5 - Implement flush-always behaviour to partially fix readback from a currently bound fbo - Without this, only the first read is correct, as more draws are added the results become 'wrong' - Fixes WCB and cpublit behaviour - Synchronize blit_dst surfaces to avoid data loss when gpu texture scaling is used - It's still faster in such cases to disable gpu texture scaling but some types cannot be disabled without force cpu blit (e.g framebuffer transfers) - Memory management tuning - rsx: on-demand texture cache rescanning for unprotected sections - rsx: Only framebuffer resources are upscaled - Do not resize regular blit engine resources - Lazy initialize readback buffer when using opengl -- These measures should help minimize vram usage	2018-02-16 16:14:54 +03:00
Nekotekina	cce0ad0c35	Clean vm::ps3 namespace use	2018-02-09 17:49:37 +03:00
kd-11	eeb6e29e39	vulkan: implement proper texture read barriers	2018-02-02 10:07:55 +03:00
kd-11	b9cca71c47	gl: API compliance fixes - Do not assume texture2D when creating new textures - Flag invalid texture cache if readonly texture is trampled by fbo memory. Avoids binding a stale handle to the pipeline and is rare enough that it should not hurt performance	2018-02-02 10:07:55 +03:00
kd-11	33bcdd476c	glsl/fp/vp: Avoid shader clutter - Do not add unused subroutines in shaders unless necessary -- makes shaders easier to read and disassembled spir-v has less clutter - glsl: Replace switch block with lookup table	2018-01-30 21:16:43 +03:00
kd-11	2e04dceaf0	rsx: misc fixes - Supply explicit options for spv emit allowing optimizations (not yet compiled into the backend) - Add epsilon fix to glslcommon - Fix shader dialog crash when using qt (race condition)	2018-01-30 21:16:43 +03:00
kd-11	648fc92184	rsx/fp/vp: Epsilon value is too large! - Original epsilon value was 1.E-10 which nvidia linux driver could not read properly -- Restores the original value represented in decimal notation	2018-01-30 21:16:43 +03:00
kd-11	743928b379	vk/gl: Preserve clamped z precision to some extent - Use edges of depth range to map clamped stuff Disable range compression on regular draws vs extended range draws - Some applications require full 0-1 usage without compromises. -- TODO: This leaves the extended range z values to fight with regular draws in the .99 - 1.0 range	2018-01-22 11:43:35 +03:00
kd-11	6828fbf658	rsx/texture_cache: Remove hacks; it has been proven that in offsets are in x16 fixed point	2018-01-19 12:03:57 +03:00
kd-11	0a2992839b	rsx/gl/vk: Simulate z clipping with selective depth clamp - The scale offset matrix is fine but on real hardware the z results seem to be independent of near/far clipping distances -- If depth falls within near/far, clamp depth value to [0,1]	2018-01-19 12:03:57 +03:00
lewmpk	d64e79bd9f	fix clang warning: logical-op-parentheses	2017-12-31 22:08:17 +03:00
kd-11	1ea5e7404a	rsx: Workaround for nvidia linux - For some reason, using 1.E-x notation does not work on nvidia linux. Could be a bug in spir-v generator or the driver itself	2017-12-31 12:43:40 +03:00
kd-11	d6bc6ec2c1	rsx: fix initial swizzle ordering for render target data	2017-12-22 20:08:14 +03:00
kd-11	4a0c4259f0	c++ is hard - Remove unnecessary const definitions	2017-12-22 20:08:14 +03:00
Nekotekina	61de20a633	RSX: remove SSSE3 dependency	2017-12-20 00:04:08 +03:00
kd-11	47060cdc5f	rsx/fp: Fix typo	2017-12-18 10:45:37 +03:00
kd-11	7dd349ae8e	Update FragmentProgramDecompiler.cpp	2017-12-18 10:45:37 +03:00
kd-11	4e80858bed	rsx/fp: Hotfix for TEXBEM/TXPBEM	2017-12-18 10:45:37 +03:00
kd-11	e89a035e8b	rsx/fp: Implement TXPBEM	2017-12-18 10:45:37 +03:00
kd-11	f7c52d3bb7	rsx/fp: Implement TEXBEM (untested)	2017-12-18 10:45:37 +03:00
kd-11	6f8dd20f03	rsx/fp: Stuff - Implement BEM - Add LG2 to special instructions	2017-12-18 10:45:37 +03:00
kd-11	6891323c18	rsx: framebuffer textures do not have mipmaps! - Force mipmap count to 1 if sampling from an RTV/DSV - TODO: Better wcb flush detection, it should be better to re-upload the texture after it has been downloaded if expected mipmaps are > 1	2017-12-18 10:45:37 +03:00
kd-11	7c7cd4153e	rsx: Framebuffer setup fixes - Sometimes square renders are done to surfaces with pitch=64 and re-uploaded with swizzle scanning -- This setup avoids discarding targets if they are square and pitch == 64	2017-12-18 10:45:37 +03:00
kd-11	ff0f1510e5	rsx: Minor fixes - Abort nv406e semaphore acquire if the rsx thread stalls/crashes - Fix texture size approximation to take mipmaps into account. Fixes some games hanging with WCB	2017-12-18 10:45:37 +03:00
kd-11	3338fdb936	rsx: Fix RGB565 blits. Data is byteswapped on input - Fixes messed up BG on retroarch glyphs	2017-12-18 10:45:37 +03:00
kd-11	6dfe32c6d2	fix linux builds	2017-12-18 10:45:37 +03:00
kd-11	95966a467e	rsx: Texture cache fixes - Handle blit resources in a more consistent way - TODO: Handle some corner cases (piyotama)	2017-12-18 10:45:37 +03:00
kd-11	ac0022483a	rsx: Implement delayed swizzle remap for blit engine resources - Fixes remap vectors for memory copied via blit engine as it has no context	2017-12-18 10:45:37 +03:00
kd-11	0b3fbf1d4c	rsx: Narrow the race condition window further - Needs aliased paging to be implemented to fix properly or a re-entrant global IO lock	2017-12-06 12:55:49 +03:00
kd-11	a2b4cf22b5	rsx: Reimplement invalidate_range_base_impl - Avoid unprotecting memory until just before we have to write the data - Avoids race conditions where the caller thread takes too long to enter the second phase and another thread accesses the "bad" memory	2017-12-06 12:55:49 +03:00
kd-11	9853027f72	rsx/vp: Decide default return values in case of undefined attributes based on location ID - Different default values should be returned for different attributes	2017-12-04 18:22:18 +03:00
kd-11	90c2324e47	rsx: Program cache fixes - Reorganize storage hash vs ucode hash - Scan for actual fragment program start in case leading NOPed code precedes the actual instructions -- e.g FEAR2 Demo has over 32k of padding before actual program code that messes up hashes	2017-12-04 18:22:18 +03:00
kd-11	cdd4fd9867	rsx/fp: Explicitly insert global functions. - Functions such as pack/unpack ops must exist before the shared gather functions are declared	2017-12-04 18:22:18 +03:00
kd-11	896c8991de	rsx/fp: Properly implement PK/UP instructions based on NV_fragment_program documentation	2017-12-01 21:00:50 +03:00
kd-11	fe9090bd39	rsx/fp: Implement register gather (only for UP(X) instructions) - Workaround for temp register aliasing between H and R variants - TODO: Implement temp regs as 128 bit-blocks with r/w as pack/unpack	2017-12-01 21:00:50 +03:00
kd-11	a18ae0f6ac	rsx/fp: Reimplement PK(X) and UP(X) opcodes. The read back values are obviously in normalized range - Confirmed with a GOW shader which writes result of UP8 to BGRA8 output	2017-12-01 21:00:50 +03:00
kd-11	6c9c300fe0	rsx: Fix texture cache memory usage statistics	2017-12-01 21:00:50 +03:00
kd-11	ddebc334bf	rsx: Fixes - Discard intentionally invalidated framebuffer resources. These are created after a flush has happened, forcing reupload since contents cannot be guaranteed (strict mode only) - Fix for blits using vulkan; don't use the copy method if formats do not match, use generic blit instead	2017-12-01 21:00:50 +03:00
kd-11	145ecb00fc	rsx: Texture cache hotfixes	2017-12-01 21:00:50 +03:00
kd-11	4d75e98647	rsx/fp: Do not apply input mods to all types of inputs - Temp registers are confirmed to be affected - Const registers are confirmed to be unaffected - Varying inputs are not confirmed yet	2017-12-01 21:00:50 +03:00
kd-11	de5a4fe083	rsx: Reimplement depth <-> RGBA reinterpretation code - Implements proper channel order for fp24-ARGB8 conversion - Takes swizzle remap into account when reconstructing source bytes	2017-12-01 21:00:50 +03:00
kd-11	33f3a3e014	rsx: Major fixes - Handle aliased depth + color target by disabling depth writes. This looks to be the correct way - Add support for generic passes that cannot be done using general imaging operations. Lays the framework for tons of features and effects - Implement RGBA->D24D8 casting. Sometimes games will split depth texture into RGBA8 then use the new RGBA8 as a depth texture directly -- This happens a lot in ps3 games and I'm not sure why. It's likely the ps3 did not sample fp values with linear filtering so this is a workaround -- Only implemented for openGL at the moment -- Requires a workaround for an AMD driver bug	2017-12-01 21:00:50 +03:00
kd-11	0aaae000b3	rsx: Minor improvements	2017-12-01 21:00:50 +03:00
kd-11	db58cd7513	rsx: Invalidate both depth and color surfaces when binding a new surface	2017-12-01 21:00:50 +03:00
Zion Nimchuk	3a9ae2df9e	silence warnings in RSX stuff	2017-11-30 18:07:19 +03:00
kd-11	df7d52b177	rsx/fp: Give abs higher prio as it invalidates any precision checks	2017-11-20 15:18:57 +03:00
kd-11	f5addbf751	rsx/fp: improve SRC modifier order - Neg modifier is applied after clamping. Abs has not been tested/proven so precision clamp goes first now, not last	2017-11-20 15:18:57 +03:00
kd-11	a8c0dd649e	rsx/fp: RE work on precision modifier bits - Testing DS2 has revealed clamping bits in SRC1 that were not respected and left negative values reaching the framebuffer	2017-11-20 15:18:57 +03:00
kd-11	6d2dcbd164	rsx: Enable hw blit engine for local->main memory blit operations as well	2017-11-20 15:18:57 +03:00
kd-11	be6b5922dd	rsx: research native texel byte order on cpu readback (WCB) [WIP]	2017-11-20 15:18:57 +03:00
kd-11	3c9126d91f	rsx: Ignore FENCE instruction as it seems to be ignored on real hardware - This is likely a compiler hint for performance reasons and not a mandate	2017-11-09 14:39:50 +03:00
kd-11	ed21bb309f	rsx: Minor fixups - Fix texture cache blit behaviour when src has AA enabled and dst is a blit dst texture with or without AA -- This requires handling AA resolve by removing a half downscale on multisampled axes - Return all ones when a vertex attribute is disabled. -- Some games forget to enable vertex attributes actually needed by the fs	2017-11-08 13:15:34 +03:00
kd-11	4e9160104a	rsx/vk/gl: Cleanup and refactor glsl::getFunctionImpl - Both backends now generate very similar code	2017-11-08 13:15:34 +03:00
kd-11	8733505d0a	rsx: Minor fixes - texture_cache: Fix internal size calculation for subresources - vk: Delay dynamic state updates until just about to draw to ensure no flush has discarded the cb state	2017-11-08 13:15:34 +03:00
kd-11	baa5a261a5	rsx: Rewrite invalidate_range_impl_base in a way that makes sense. Fixes wcb hanging	2017-11-08 13:15:34 +03:00
kd-11	3730b9d1da	rsx: More fixes - Support for raster offsets in surface descriptors (looks to be unused) - Do not tag disabled render targets when using MRT (pitch = 64) - Add missing notify_surface_changed() call for openGL	2017-11-08 13:15:34 +03:00
kd-11	4ca98e53a6	rsx: Fix for unnormalized texture access	2017-11-08 13:15:34 +03:00
kd-11	300a36d3d6	rsx: Fixes for cubemap reconstruction - Do not abort generation if sides are missing, replace with blank surfaces instead - Make cubemaps scale with res scaling	2017-11-08 13:15:34 +03:00
kd-11	60c7a508a7	rsx: Refactor create_subresource_view(deferred_subresource&) and implement a subresource cache - This limits the number of times an image is copied and improves performance	2017-11-08 13:15:34 +03:00
kd-11	1fa18757fc	rsx: Implement render-to-cubemap; Also simplify unnormalized samplers [WIP, DELETE SHADER CACHE, VERY SLOW] - Enables real-time cubemap reflections - TODO: Vulkan is broke; rsx is very slow with this feature	2017-11-08 13:15:34 +03:00
kd-11	fbb7186e66	rsx/gl: Addendum - Fix fragment shader to consume texture scale parameters	2017-11-08 13:15:34 +03:00
kd-11	0961a43997	rsx: Implement 1D<->2D image type casts	2017-11-08 13:15:34 +03:00
kd-11	7037504dcf	rsx: Workaround for missing AA flags on some surfaces - This just doesnt work right yet. It looks like AA is being used dynamically? (RDR) - TODO: Try to locate flags to set AA if AA mode is not changed	2017-11-08 13:15:34 +03:00
kd-11	eed55a446c	rsx: Minor optimization - Defer resolving image copy operations to the binding step	2017-11-08 13:15:34 +03:00
kd-11	bbcb6b6851	rsx: Fbo fixes 2 - Use AA mode to predict surface compression. Compression mode is useless without AA activated - Rewrites most image subresource fetch routines to use the new heuristic - Fix rsx::thread::find_tile. FEED000(X) can be substituted for (X) in the code -- Fixes alot of failures when looking for tiled regions rsx: Fix antialiased unnormalized coords - scaling factors are inverse to allow proper coordinates to be computed in fs	2017-11-08 13:15:34 +03:00
kd-11	b95630d84a	rsx: Minor fixups - Optimize framebuffer memory invalidate conditions - Fix texture sampling of AA textures (wider by 2x surfaces)	2017-11-08 13:15:34 +03:00
kd-11	af1d3c2aa6	rsx: Improve surface store resource management - vk: Use frame testing to determine invalidated resources that can be safely deleted	2017-11-08 13:15:34 +03:00
kd-11	ec3e5c547f	rsx: More fixes - Tag surface store to help determine when contents have been invalidated - Crop framebuffer textures if they are not the requested dimensions!	2017-11-08 13:15:34 +03:00
kd-11	173d05b54f	rsx: Optimizations - Reimplement fragment program fetch and rewrite texture upload mechanism -- All of these steps should only be done at most once per draw call -- Eliminates continously checking the surface store for overlapping addresses as well addenda - critical fixes - gl: Bind TIU before starting texture operations as they will affect the currently bound texture - vk: Reuse sampler objects if possible - rsx: Support for depth resampling for depth textures obtained via blit engine vk/rsx: Minor fixes - Fix accidental imageview dereference when using WCB if texture memory occupies FB memory - Invalidate dirty framebuffers (strict mode only) - Normalize line endings because VS is dumb	2017-11-08 13:15:34 +03:00
scribam	4600094829	[RSX] Fix uninitialized value before usage	2017-11-04 01:28:53 +03:00
kd-11	daaa83b9ca	rsx: Fix for framebuffer validation	2017-11-04 00:08:30 +03:00
kd-11	30bba09fed	disable fb testing for partial framebuffer resources	2017-11-02 14:35:19 +03:00
kd-11	31b07f2c5c	rsx: Tweaks - Optimize get_surface_subresource - Add check_program_status time to draw call setup statistics. It can slow down games significantly	2017-11-02 14:35:19 +03:00
kd-11	c2ac05f734	rsx: Fix for rsx thread lockup due to nested access violations when WCB is enabled	2017-10-29 15:25:17 +03:00
kd-11	361e80f7dc	rsx: Tag cache blocks returned on access violation to validate data passed to flush_all is up to date. Should prevent recursive exceptions Partially revert Jarves' fix to invalidate cache on tile unbind. This will need alot more work. Fixes hangs	2017-10-29 15:25:17 +03:00
kd-11	7abf755a57	rsx: Avoid false positives by early rejection. Should keep cache thrashing to a minimum	2017-10-28 13:26:16 +03:00
kd-11	055f0e2e4a	rsx: Export more information about affected cache sections when handling violations - This allows for better handling of deferred flushes. -- There's still no guarantee that cache contents will have changed between the set acquisition and following flush operation -- Hopefully this is rare enough that it doesnt cause serious issues for now	2017-10-28 13:26:16 +03:00
kd-11	49f4da3016	rsx: Fixes - vk: Always reopen primary command buffers. They should only be closed in flush_command_queue - If uploading a texture and there are collisions with protected buffers, do not rebuild the cache - Perform writes via flush before reprotecting pages that were not trampled - Only flush no pages once	2017-10-28 13:26:16 +03:00
kd-11	bf234dc668	rsx: Implement memory tags for strict mode to validate render target memory	2017-10-28 13:26:16 +03:00
Jake	e0d1ac676e	rsx: invalidate surface store address when tile is unbound	2017-10-28 12:46:20 +03:00
kd-11	e6849a59a2	rsx: Better detection of framebuffer memory copy operations - Still requires texture stitching to work correctly, but matching dimensions works well for now	2017-10-24 22:59:09 +03:00
kd-11	6918e265ec	rsx/vk: Be a little more frugal with texture memory to avoid running out of VRAM on 1GB cards	2017-10-24 22:59:09 +03:00
kd-11	e9f293f522	rsx: Improve separate treatment of write exceptions vs read exceptions - Optimizes search functionality and avoids thrashing valid sections	2017-10-24 22:59:09 +03:00
kd-11	5fc36d64b6	fix build	2017-10-24 22:59:09 +03:00
kd-11	95e6d78689	rsx: Workaround for 0 pitch textures. - Should these be ignored? Needs investigation	2017-10-24 22:59:09 +03:00
kd-11	f4a666345a	rsx: Even more texture cache fixes - Fix subresource sampling - Invalidate memory range before uploading textures to prevent hangs	2017-10-24 22:59:09 +03:00
kd-11	0de0dded53	rsx: Texture fixes continued - Fix buffer invalidate behaviour (wcb) - Disable auto rebuild with only framebuffer storage getting rebuilt - Fix vulkan subresource sampling	2017-10-24 22:59:09 +03:00
kd-11	5e58cf6079	rsx: Restructuring [WIP] - Refactor invalidate memory functions into one function - Add cached object rebuilding functionality to avoid throwing away useful memory on an invalidate - Added debug monitoring of texture unit VRAM usage	2017-10-24 22:59:09 +03:00
kd-11	ddcacb8258	general fixes; Force u32 return type for index_count and add RX Vega to primitive restart blacklist	2017-10-19 12:22:52 +03:00
kd-11	89dcafbe41	rsx: Reimplement index buffer generation - Emulate primitive restart in software whenever we get the chance - Ensure PRIMITIVE_RESTART is never active when LIST topologies are active - Reimplement TRIANGLE_FAN, POLYGON and QUAD expansion	2017-10-19 12:22:52 +03:00
kd-11	86bf61ad35	rsx: Fix memory protection - Fixes hanging when wcb is enabled	2017-10-14 14:19:14 +03:00
kd-11	eab9d06981	rsx: Texture cache fixes - Fix src/dst framebuffer detection - Silence some warnings	2017-10-13 15:23:48 +03:00
kd-11	12ab03b0b5	rsx/gl: Implement resolution scaling rsx: Revise wpos calculation to take resolution scale into account	2017-10-09 20:25:41 +03:00
kd-11	47202d5839	rsx: Set up patch functionality for program coefficients	2017-10-09 20:25:41 +03:00
kd-11	393e3b702f	rsx: Clean up debug overlays. Add unreleased textures metric to track texture memory	2017-09-23 16:46:41 +03:00
kd-11	9ee21af524	vulkan: Optimize memory allocation	2017-09-23 16:46:41 +03:00
kd-11	b74cdcde00	rsx: Make the 3rd texture dimension matter - Affects cube maps and texture3D surfaces	2017-09-23 16:46:41 +03:00
kd-11	4d83d749a0	rsx: Texture cache fixes - Update section flags when requested - Fix nullptr dereference: cached_dest will be null if dst_is_render_target is true	2017-09-23 16:46:41 +03:00
kd-11	d0148728c6	rsx: Fixes - Fix section scanning range for early reject - Specify IMAGE_ASPECT_STENCIL when uploading image_from_cpu	2017-09-21 20:05:07 +03:00
kd-11	3499d089e7	rsx: Texture cache fixes and improvements rsx: Conditional lock hack removed vulkan - Fixes - Remove unused texture class - Fix native pitch calculation (WCB) rsx: Catch hanging begin/end pairs when flushing deferred draw calls vulkan: Register DXT compressed formats vulkan: Register depth formats gl: Workaround for 'texture stitching' when gathering flip surface - TODO: Add a proper flip hack option rsx: Fix texture memory size calculation - DXT textures dont have real pitch. Since pitch is used to calculate memory size, make sure it always evaluates to rsx_size rsx: Fix cpu copy detection rsx: Validate blit dst surface and dont make assumptions about region blit order - Also relax restrictions on memory owned by the blit engine if strict rendering is not enabled rsx: Fix depth texture detection rsx: Do not manually offset into dst. The overlapped range check does so automatically rsx: Minor optimizations rsx: Minor fixes - Fix to detect incompatible formats when using GPU texture scaling and show message - Better 'is_depth_texture' algorithm to eliminate false positives	2017-09-21 16:17:06 +03:00
kd-11	6b96a2022a	rsx: Add support for non-projective shadow sampling - Fixes missing shadows in persona 5 vk: Enable polygon depth bias a.k.a polygonOffset - Fixes shadow acne in persona 5	2017-09-21 16:17:06 +03:00
kd-11	3836b40bf7	rsx: Fixups	2017-09-21 16:17:06 +03:00
kd-11	571dbfb7b1	rsx: Texture cache improvements - Limits buffer size to min 720 in the Y axis (1024 section causes conflicts in some cases - TODO) rsx: Fixups to allow large textures for blit operation - Also includes checks for both leaking sections and blit regions for vulkan hotfix for hanging when using WCB addendum - unlock both ro and no blocks before attempting to copy memory blocks gl: Fixups for ARB_explicit_uniform_location - Forces glsl v 430 to make use of the extension rsx/vk: Rework texture cache to minimize recursive access violations - Also modifies the vulkan commandbuffer begin/end/submit mechanism gl: Fix cached_texture_section::is_flushable to take memory protection into account rsx: Fix blit dst offset calculation	2017-09-21 16:17:06 +03:00
kd-11	45d0e821dc	gl: Minor optimizations rsx: Texture cache - improvements to locking rsx: Minor optimizations to get_current_vertex_program and begin-end batch flushes rsx: Optimize texture cache storage - Manages storage in blocks of 16MB rsx/vk/gl: Fix swizzled texture input gl: Hotfix for compressed texture formats	2017-09-21 16:17:06 +03:00
kd-11	e37a2a8f7d	rsx: Texture cache fixes and improvments gl/vk/rsx: Refactoring; unify texture cache code gl: Fixups - Removes rsx::gl::texture class and leave gl::texture intact - Simplify texture create and upload mechanisms - Re-enable texture uploads with the new texture cache mechanism rsx: texture cache - check if bit region fits into dst texture before attempting to copy gl/vk: Cleanup - Set initial texture layout to DST_OPTIMAL since it has no data in it anyway at the start - Move structs outside of classes to avoid clutter	2017-09-21 16:17:06 +03:00
kd-11	07c83f6e44	gl: cleanup; fix program linkage on mesa using GL_ARB_explicit_uniform_location, also make use of ARB_multidraw	2017-09-21 16:17:06 +03:00
kd-11	9359b8c170	rsx/fp: Shader decompiler fixes - Requires proper 2-pass impl rsx/fp: Catch hanging code blocks rsx/fp: Don't pause on scaling error	2017-09-21 16:17:06 +03:00
kd-11	2d0f1f27a8	rsx: Fixes to the texture cache rsx: Blit engine improvements - Always handle blits to and from framebuffers through the GPU - Handle depth surfaces properly when using GL - Check for format mismatches when blitting to the surface store [WIP]	2017-09-21 16:17:06 +03:00
kd-11	2033f3f7dc	rsx/vk/gl: Refactoring and reimplementation of blit engine Fix rsx offscreen-render-to-display-buffer-blit surface reads - Also, properly scale display output height if reading from compressed tile gl: Fix broken dst height computation - The extra padding is only there to force power-of-2 sizes and isnt used gl: Ignore compression scaling if output is rendered to in a renderpass rsx/gl/vk: Cleanup for GPU texture scaling. Initial impl [WIP] - TODO: Refactor more shared code into RSX/common	2017-09-21 16:17:06 +03:00
kd-11	2e9405db4c	rsx: Remove index expansion for quad strips	2017-08-26 21:53:54 +03:00
kd-11	fe5828cb47	rsx: Implement QUAD_STRIP - QUAD_STRIP evaluates to TRIANGLE_STRIP in memory. The memory layout is identical. - The only difference between the two modes would be the primitive_ID but that doesnt matter on RSX - Its worth noting that results will be different between the two modes if input vertices are non-coplanar for every set of N verts	2017-08-26 21:53:54 +03:00
kd-11	9a7ce2fd29	rsx/vp: ARL fix	2017-08-26 21:53:54 +03:00
kd-11	f71f67c4ff	rsx: Make fragment state dynamic to reduce shader permutations	2017-08-26 21:53:54 +03:00
Jake	5d7c454e52	rsx: Vertex Decompiler, fix sca register assignment	2017-08-19 12:27:53 +03:00
kd-11	334327df67	rsx: Add a success message on program compile completion - Should help users wondering if rpcs3 'froze' during shader compile	2017-08-16 23:58:30 +03:00
kd-11	650c1c64f1	gl: Workarounds for intel GPUs which dont seem to be truly GL4 compliant	2017-08-16 23:58:30 +03:00
kd-11	c04aa05398	rsx: Shader pipeline fixes and improvements - Do not set zfunc if alphakill is not enabled. This is because at the moment alphakill requires a different shader to be built - use glsl loop-unroll friendly comparison; skip vertex input compare if either key requests it - Minor tweaks to fp key generation	2017-08-16 23:58:30 +03:00
kd-11	00c6a589a5	rsx/util: Add simple consistent hash function rsx/vk/shaders_cache: Move vp control mask to dynamic state rsx/vk/gl: adds a shader cache for GL. Also Separates pipeline storage for each backend rsx: Add more texture state variables to the cache	2017-08-16 23:58:30 +03:00
kd-11	c7dca1dbef	rsx/vk: Implement shaders cache and fix broken pipeline_storage comparison and hash - Fix pipeline state storage to uniquely store each pipeline variant - Adds a progress bar to indicate loading is taking place	2017-08-16 23:58:30 +03:00
kd-11	00b0311c86	rsx/gl/vulkan: Refactoring and partial vulkan rewrite - Updates vulkan to use GPU vertex processing - Rewrites vulkan to buffer entire frames and present when first available to avoid stalls - Move more state into dynamic descriptors to reduce progam cache misses; Fix render pass conflicts before texture access - Discards incomplete cb at destruction to avoid refs to destroyed objects - Move set_viewport to the uninterruptible block before drawing in case cb is switched before we're ready - Manage frame contexts separately for easier async frame management - Avoid wasteful create-destroy cycles when sampling rtts	2017-08-16 23:58:30 +03:00
kd-11	6a707f515e	vk/gl: Factorize shared GLSL code - prep vulkan for shared glsl backend	2017-08-16 23:58:30 +03:00
kd-11	d54c2dd39a	gl: Move vertex processing to the GPU - Significant gains from greatly reduced CPU work - Also reorders command submission in end() to improve throughput - Refactors most of the vertex buffer handling - All vertex processing is moved GPU side	2017-08-16 23:58:30 +03:00
mp-t	607d2486ea	Code review (#3114 ) * Fix always-true conditions in sceNp module * gl_render_targets: useless check on unsigned variable, possible bug * fixed UB in crypto utility functions * copy-paste error in vk::init_default_resources * pass strings by const ref * Dont copy vectors. Make sure copies are not needed because functions are used in a multi-threaded context.	2017-08-01 20:22:33 +03:00
kd-11	a7c28f5827	rsx: Fix remainder/iteration computations in BufferUtils	2017-07-24 16:52:42 +03:00
Jake	fde6099769	rsx: Fix vertex decompiler to support 2 arg destination	2017-07-22 09:41:00 +03:00
scribam	9747ab61f9	Missing function names (HLE) and small fixes (#3038 ) * Add sceNpScoreGetFriendsRanking and sceNpScoreGetFriendsRankingAsync functions * Add sceNpSnsFbGetLongAccessToken function * Add new functions for the sceNpTus module * Add new functions for the cellSailRec module * Stub cellCrossControllerInitialize * Add sceNpAuth* functions for the sceNp2 module * Remove unnecessary call to c_str() * Add missing module id "CELL_SYSMODULE_ADEC_AT3MULTI" * Add Turkish keyboard mapping constant * Add cellOskDialogExtRegisterKeyboardEventHookCallbackEx function * Update cellSubDisplay * Update cotire version to 1.7.10 * Replace cellSubdisplay by cellSubDisplay * Update cellSysutil.cpp with new functions stubbed	2017-07-21 18:41:11 +03:00
kd-11	2526626646	rsx: Surface cache bug fixes - Properly handle data 'transfer' when recycling frame buffer images - Clear 'recycled' surfaces before use	2017-07-19 23:28:33 +03:00
kd-11	94c1b74a17	fix build; restore asmjit reader_lock for now	2017-07-19 23:28:33 +03:00
kd-11	f69121116a	rsx/vk: Optimize framebuffer lifetime management - Significant gains due to avoiding aggressive create-delete cycles every frame	2017-07-19 23:28:33 +03:00
kd-11	9e7a42d057	rsx: Minor bug fixes - vk: Do not select first available format when choosing a swapchain format - gl/vk: Ignore rendering zero sized framebuffers/scissors - fp: Re-enable range clamp on fp16 registers; fix fx12 clamping [-2, 2]	2017-07-08 14:52:16 +03:00
kd-11	d43e06c0ea	rsx: Fix some fp bugs rsx/fp: Properly fix RCP - Input is always scalar, output is a vector rsx/fp: Ignore forced unit for SIP and TEX instructions	2017-07-08 14:52:16 +03:00
kd-11	3d935b64f2	rsx/gl/vk: Enable contents transfer when a new framebuffer is created and not cleared	2017-07-08 14:52:16 +03:00
kd-11	d7662e54cc	rsx/fp: Do not swizzle shadow lookups	2017-06-29 13:13:19 +03:00
kd-11	459a7ba5a2	rsx: Avoid using push_back/emplace_back on empty STL containers - Reckless management of STL containers causes significant slowdown - Also reorders vertex compare steps to fail quickly on simpler checks	2017-06-29 13:13:19 +03:00

924 commits