rpcsx

mirror of https://github.com/RPCSX/rpcsx.git synced 2026-03-17 18:54:51 +01:00

Author	SHA1	Message	Date
Anuskuss	7e31c30133	Intel iGPU needs workaround on Windows	2019-11-15 12:08:16 +03:00
Nick Renieris	cc59d319e1	overlay: Performance graphs	2019-11-12 20:43:09 +01:00
kd-11	8234bdb8f0	vk: Check for heap change events after a grow to avoid spec violations - Avoid referencing the old buffer in stale views. Status can be set globally if requested during heap creation.	2019-11-10 17:53:12 +03:00
kd-11	5968427a2f	vk: Initialize queries before use - The spec does not guarantee that queries are initialized. In fact, it now says all queries must be reset before they are used for the first time.	2019-11-10 17:53:12 +03:00
kd-11	8ea9bc9874	vk: Reduce memory allocation sizes of default heaps - The heaps will grow as desired, no need to overallocate to cater to the most resource-hungry games	2019-11-10 17:53:12 +03:00
kd-11	0a32d478df	vk: Enable auto-growing of the data heaps for the performance case	2019-11-10 17:53:12 +03:00
kd-11	357e0d2097	vk: Implement explicit runtime flags to manage events like heap sync	2019-11-10 17:53:12 +03:00
kd-11	f359342721	rsx: Implement mutable ring buffers with grow support	2019-11-10 17:53:12 +03:00
kd-11	5f39a594ac	rsx: Clean up some unused legacy methods unnecessary after d3d removal	2019-11-10 17:53:12 +03:00
Emmanuel Gil Peyrot	56f82d2701	rsx: Wrap gsl::span definition into Utilities/span.h	2019-11-09 20:00:50 +01:00
Emmanuel Gil Peyrot	f76720ceb0	Remove extraneous ::narrow<int>() calls. GSL’s gsl::span didn’t use the correct type for its index_type, which is why they were needed.	2019-11-09 19:30:06 +01:00
Emmanuel Gil Peyrot	72cdf0b04c	Replace gsl::span’s implementation with tcbrindle’s. This implementation optimises correctly on all relevant compilers, unlike GSL’s which gave extremely slow code on any compiler other than MSVC. Supersedes #6948.	2019-11-09 19:30:06 +01:00
Emmanuel Gil Peyrot	ef368c5171	rsx: Replace gsl::byte with C++17’s std::byte	2019-11-09 19:30:05 +01:00
kd-11	7072489a6e	rsx: Implement point sprite coordinate generation - When the point sprite flag is set, overrides the input similar to the 2D mask. The returned X and Y values are always the gl_PointCoord values for the fragment. - Stacks with the 2D mask to override the z and w coordinates.	2019-11-09 12:50:53 +03:00
kd-11	63673b1a9f	rsx: Implement full color remap for the D24S8->ARGB8 converter	2019-11-08 19:11:59 +03:00
kd-11	8d1505752f	rsx: Validate depth test setup to avoid address contention	2019-11-07 11:32:44 +03:00
kd-11	508ffcb775	vk: Compute kernel fixups - Adhere to workgroup count limits as exposed by the GPU vendor. They already execute properly even when going beyond the limits but this removes validation noise. - Fix invocation counts for deswizzle kernel. The count was incorrect if blocksize was not 4, causing a bunch of useless work to be done.	2019-11-05 22:07:22 +03:00
kd-11	99d71fdc2a	vk: Implement layer batching for the GPU swizzle decoder - Handles all LODs per layer meaning cubemaps are now fully handled in 6 passes instead of 6 * (log2(width)) passes. - Handles all LODs of a 3D texture in one pass as well. - The improvements do warrant dropping down the number of allowed compute invocations a bit	2019-11-05 22:07:22 +03:00
kd-11	7a0b94f343	vk: Minor compute optimizations - Remove use of uniform buffers for compute static data. Use push constants instead. - Minor touchups to the deswizzle code to avoid redundant data copies.	2019-11-05 22:07:22 +03:00
kd-11	1266b63135	vk: Enable gpu deswizzling	2019-11-05 22:07:22 +03:00
kd-11	9cd3530c98	rsx: Set up framework for hw deswizzle	2019-11-05 22:07:22 +03:00
kd-11	57d3c9e171	rsx: Take empty queries into account for engines that spam report reads. - Some games will spam the report queue with requests but have zpass statistics enabled.	2019-11-04 18:48:41 +03:00
kd-11	2a8f2c64d2	rsx: Implement report transfer deferring - Allow delaying report flushes triggered by image_in or buffer_notify - When the report is ready, all the delayed transfers will automatically be done. - TODO: Make this configurable?	2019-11-04 18:48:41 +03:00
kd-11	3e0f9dff4d	vk: Improve zcull synchronization - Use zcull sync hints more aggressively	2019-11-04 18:48:41 +03:00
kd-11	fe3c290d03	vk: Reimplement occlusion result reading - Implement partial result reads	2019-11-04 18:48:41 +03:00
kd-11	51e0eaaddc	rsx: Implement backend notification for upcoming zcull reads	2019-11-04 18:48:41 +03:00
kd-11	df63de8f16	rsx: Allow u32 restart index with full index width	2019-11-04 16:56:34 +03:00
kd-11	6b3af09fa5	vk: Improved crash message for missing MSAA features	2019-11-04 16:56:34 +03:00
kd-11	bbed791ee0	vk: Add explicit support for identity image views - Allows bypassing all remap shenanigans to make some operations that rely on the raw image to work correctly.	2019-11-01 19:35:46 +03:00
kd-11	63bbf11a76	vk: Add video out calibration pass - Adds gamma correction and RGB range filters to output to match PS3	2019-10-31 14:43:24 +03:00
kd-11	78aefe5b5e	rsx/overlays: Add support for primitive types other than triangle_strips	2019-10-31 14:43:24 +03:00
Nekotekina	e3e7051ed3	Minor optimization in BufferUtils.cpp. Don't use PSHUFB for horizontal operations. Utilize PHMINPOSUW to compute max as well: + sse41_hmin_epu16 + sse41_hmax_epu16	2019-10-30 18:52:34 +03:00
Nekotekina	b1968769b7	Minor cleanup in BufferUtils.cpp. Replace inline asm with intrinsic using target attribute trick.	2019-10-30 17:53:51 +03:00
linkmauve	cfd5cf6bdb	Optimise primitive_restart::upload_untouched() (#6881) * rsx: Optimise primitive_restart::upload_untouched() with SSE4.1 This optimisation is only applied when skip_restart is false. I’ve only tested the u16 codepath, as it is the one used in NieR. In some very unscientific profiling, this function used to take 2.76% of the total frame time at the save point of the port town, it now takes about 0.40%. * rsx: Mark all SSE4.1 functions with attributes on gcc and clang This assures the compiler we will take care of only calling these functions after having checked that the CPU does support these instructions. * rsx: Add an AVX2 implementation of primitive restart ibo upload * rsx: Remove redefinition of SSE4.1 instructions Now that clang is aware that our functions are compiled with SSE4.1, it lets us generate this code using its intrinsics. * rsx: Optimise vector to scalar conversion This is done using minpos and srli intrinsics and generates less code than before. Thanks Nekotekina for the suggestion!	2019-10-30 16:42:44 +03:00
kd-11	35794dc3f2	vk: Add checks for alphaToOne support - This feature is very rarely used, as alphaToCoverage is commonly used as a replacement for blending, not in addition to it.	2019-10-30 01:06:28 +03:00
kd-11	eda09489b2	vk: Optionally ignore depth bounds testing on hardware that does not support it.	2019-10-29 20:03:54 +03:00
kd-11	7a5c20ef85	vk: Minor spec touchups - Simplify active instance management. While multicontext support will be required in future, this is better done with multiple logical devices rather than multiple instances. - Destroy the WSI surface on exit - Enable depthBoundsTest explicitly. TODO: Properly check for supported features.	2019-10-29 20:03:54 +03:00
kd-11	aa3eeaa417	rsx: Separate subresource_layout::dim_in_block and subresource_layout::dim_in_texel - These two are not always linked when working with compressed textures. The actual texels extend past the actual size of the image if the size is not aligned. e.g. if height is 1, the real height is 4, but it's not possible to determine this from the aligned size. It could be 1, 2, 3 or 4 for example. - Fixes image out-of-bounds writes when uploading from CPU	2019-10-29 20:03:54 +03:00
Eladash	42fc698186	rsx: Enable primitive restart index only when needed (#6889) * rsx: Enable primitive restart index only when needed * rsx: Use if with initializer in read_put()	2019-10-28 23:16:27 +03:00
kd-11	479d92d075	vk: Fix uninitialized (and wrong) variable access	2019-10-28 15:20:45 +03:00
kd-11	b0708367c2	vk: Round lod bias to the nearest 0.5 to lower number of permutations when nearest mipmap sampling is used - The lambda values will be rounded to the nearest integer anyway	2019-10-28 15:20:45 +03:00
kd-11	3e8dfede1c	vk: Modify sampler cache to uniquely identify all the input parameters - Avoids iteration when variable mipmap counts or lod bias parameters change	2019-10-28 15:20:45 +03:00
kd-11	ad2add9574	rsx: Use fcmp correctly	2019-10-28 15:20:45 +03:00
kd-11	d04241ad25	rsx: Allow compressed textures to be unaligned in size - Align based on row length but let the texture itself be of arbitrary dimensions	2019-10-28 15:20:45 +03:00
Emmanuel Gil Peyrot	69e9ee26f6	rsx: Make input_is_swizzled a template parameter. This lowers the relative cost of this function from ~2.25% to ~1.80% on gcc 9, which I found quite surprising; some of it probably gets inlined better in the callers, but I haven’t been able to isolate which parts.	2019-10-28 13:28:51 +03:00
kd-11	d53d7bb598	vk: Restore vega native use of FP16 in shaders - AMD proprietary drivers should work fine	2019-10-23 12:20:06 +03:00
Emmanuel Gil Peyrot	54d95373d0	Support fullscreen properly on Wayland. The current behaviour when going fullscreen from windowed was to keep the previous size of the swapchain, with black borders on all sides, which looks quite ugly. The root of this issue is that rpcs3 only checks for frame resize if vkQueuePresent() returns VK_SUBOPTIMAL_KHR, which drivers can’t do on Wayland, see https://gitlab.freedesktop.org/mesa/mesa/issues/1979	2019-10-23 12:19:46 +03:00
kd-11	e04b6cd7c0	rsx: Copypasta fix - r1 is always float4 never half4. It's a full-width register unlike the other outputs which are optionally half-width.	2019-10-23 00:50:24 +03:00
kd-11	00bc3fe658	Drop d3d12 backend	2019-10-22 21:45:14 +03:00
Emmanuel Gil Peyrot	14c63ec014	Fix misleading indent.	2019-10-22 16:11:43 +03:00

2582 commits