This implementation optimises correctly on all relevant compilers,
unlike GSL's, which produced extremely slow code on any compiler other
than MSVC.
Supersedes #6948.
- When the point sprite flag is set, overrides the input similarly to the
2D mask. The returned X and Y values are always the gl_PointCoord values
for the fragment.
- Stacks with the 2D mask to override the z and w coordinates.
* rsx: Optimise primitive_restart::upload_untouched() with SSE4.1
This optimisation is only applied when skip_restart is false.
I’ve only tested the u16 codepath, as it is the one used in NieR.
In some very unscientific profiling, this function used to take 2.76% of
the total frame time at the save point of the port town; it now takes
about 0.40%.
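
A minimal sketch of the idea for the u16 path, assuming the restart index is 0xFFFF and that only the maximum index needs tracking; this is not the actual rsx routine and the helper's name and signature are hypothetical:

```cpp
#include <immintrin.h>
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: scan 8 u16 indices per iteration, ignoring the
// primitive-restart value when computing the maximum index.
static uint16_t max_index_skipping_restart_sse41(const uint16_t* src, size_t count)
{
    const __m128i restart = _mm_set1_epi16(-1); // all lanes = 0xFFFF
    __m128i vmax = _mm_setzero_si128();

    size_t i = 0;
    for (; i + 8 <= count; i += 8)
    {
        const __m128i raw = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
        const __m128i is_restart = _mm_cmpeq_epi16(raw, restart); // 0xFFFF lanes
        const __m128i values = _mm_andnot_si128(is_restart, raw); // zero them out
        vmax = _mm_max_epu16(vmax, values); // _mm_max_epu16 requires SSE4.1
    }

    // Reduce the 8 lanes to a scalar maximum.
    alignas(16) uint16_t lanes[8];
    _mm_store_si128(reinterpret_cast<__m128i*>(lanes), vmax);
    uint16_t result = *std::max_element(lanes, lanes + 8);

    // Scalar tail for the remaining count % 8 indices.
    for (; i < count; ++i)
        if (src[i] != 0xFFFF)
            result = std::max(result, src[i]);

    return result;
}
```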
* rsx: Mark all SSE4.1 functions with attributes on gcc and clang
This assures the compiler we will take care of only calling these
functions after having checked that the CPU does support these
instructions.
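
A minimal sketch of what such an attribute looks like on gcc/clang, assuming runtime dispatch via __builtin_cpu_supports; the function names are illustrative:

```cpp
#include <immintrin.h>

#if defined(__GNUC__) || defined(__clang__)
// Lets the compiler emit SSE4.1 instructions for this one function even when
// the translation unit itself is not built with -msse4.1.
#define SSE4_1_FUNC __attribute__((target("sse4.1")))
#else
#define SSE4_1_FUNC
#endif

SSE4_1_FUNC
static int first_lane_of_max(__m128i a, __m128i b)
{
    // _mm_max_epu16 is an SSE4.1 instruction; legal here thanks to the attribute.
    return _mm_extract_epi16(_mm_max_epu16(a, b), 0);
}

static bool cpu_supports_sse41()
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_cpu_supports("sse4.1"); // the caller-side check the text refers to
#else
    return false;
#endif
}
```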
* rsx: Add an AVX2 implementation of primitive restart ibo upload
* rsx: Remove redefinition of SSE4.1 instructions
Now that clang is aware that our functions are compiled with SSE4.1, it
lets us generate this code using its intrinsics.
* rsx: Optimise vector to scalar conversion
This is done using minpos and srli intrinsics and generates less code
than before.
Thanks Nekotekina for the suggestion!
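
A minimal sketch of the minpos/srli reduction, assuming the goal is to collapse eight u16 lanes to a scalar maximum (SSE4.1 only provides a horizontal minimum, so the complement is used; the helper name is illustrative):

```cpp
#include <immintrin.h>
#include <cstdint>

// Horizontal maximum of 8 u16 lanes via _mm_minpos_epu16 on the complement.
static uint16_t horizontal_max_u16(__m128i v)
{
    const __m128i inverted = _mm_xor_si128(v, _mm_set1_epi32(-1)); // 0xFFFF - lane
    const __m128i minpos = _mm_minpos_epu16(inverted); // min value in bits [15:0]

    const uint16_t max_value = uint16_t(~_mm_cvtsi128_si32(minpos)); // flip back to the max
    // The winning lane index sits in bits [18:16]; a shift (srli) extracts it
    // if the position is also needed.
    const int max_lane = _mm_cvtsi128_si32(_mm_srli_epi32(minpos, 16));
    (void)max_lane;

    return max_value;
}
```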
subresource_layout::dim_in_texel
- These two are not always linked when working with compressed textures.
The actual texels extend past the stated size of the image if the size
is not aligned, e.g. if the height is 1, the real height is 4, but it's not
possible to determine this from the aligned size alone; it could be 1, 2, 3
or 4, for example (see the sketch after this list).
- Fixes image out-of-bounds writes when uploading from CPU
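
A small illustration of the ambiguity, assuming a 4x4 block-compressed format; the helper is hypothetical:

```cpp
// For a 4x4 block-compressed format the stored data always covers whole
// blocks, so texel heights 1 through 4 all map to a single block row; the
// aligned size alone cannot recover the real texel height, which is why
// dim_in_texel is carried separately.
constexpr unsigned block_dim = 4;

constexpr unsigned blocks_for(unsigned texels)
{
    return (texels + block_dim - 1) / block_dim;
}

static_assert(blocks_for(1) == 1 && blocks_for(2) == 1 &&
              blocks_for(3) == 1 && blocks_for(4) == 1,
              "1 to 4 texels all occupy one block row");
```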
- Noticed a glitch on AMD hw and Windows drivers where discard seems to affect entire 4x4 cells.
- Dead fragments (outside the primitive boundary) could have their discards trigger as they do not have proper access to variables.
- This introduces dead fragments along triangle edges, causing a diagonal line pattern across the screen that is very annoying.
- Renormalizes arbitrary N-bit values as 8-bit normalized.
- NV hardware performs integer normalization at 8 bits if the size is less than 8.
- This can cause significant arithmetic drift because the error is multiplied by a huge number when sampling.
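
A hedged illustration of the renormalization being described, assuming the stored value is rescaled so that hardware division by 255 still yields the correct normalized result; the helper name and rounding are illustrative:

```cpp
#include <cstdint>

// Example: a 2-bit field with value 3 should normalize to 3 / 3 = 1.0.
// Hardware that normalizes at 8 bits would instead compute 3 / 255, and the
// shader-side correction factor then amplifies any rounding error.
// Rescaling the value into the 8-bit range up front (v * 255 / (2^N - 1))
// makes the 8-bit normalization exact again: 255 / 255 = 1.0.
static uint8_t renormalize_to_u8(uint32_t value, uint32_t bits)
{
    const uint32_t src_max = (1u << bits) - 1;               // e.g. 3 for N = 2
    return uint8_t((value * 255u + src_max / 2) / src_max);  // round to nearest
}
```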
- If either source data or dest is a render target, do image operations on the GPU, the same as before
- If swizzle is desired, use CPU fallback
- If no scaling and no format conversion is required, use CPU fallback
- If scaling is desired and the transfer target is in local memory, use the GPU (see the routing sketch after this list)
- When doing trivial copies, use the routine in rsx_methods instead of
duplicating code. Also has the benefit of better range checking.
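
A hypothetical recap of the routing rules above; the struct and names are illustrative rather than the actual rsx types, and the final fallback is an assumption:

```cpp
struct blit_request
{
    bool src_is_rtt;     // source data is a render target
    bool dst_is_rtt;     // destination is a render target
    bool swizzled;       // swizzle requested
    bool scaled;         // scaling requested
    bool format_convert; // format conversion requested
    bool dst_in_local;   // transfer target lives in local (GPU) memory
};

enum class blit_route { gpu, cpu };

static blit_route pick_route(const blit_request& r)
{
    if (r.src_is_rtt || r.dst_is_rtt)
        return blit_route::gpu;  // image operations on the GPU, same as before
    if (r.swizzled)
        return blit_route::cpu;  // CPU fallback
    if (!r.scaled && !r.format_convert)
        return blit_route::cpu;  // CPU fallback
    if (r.scaled && r.dst_in_local)
        return blit_route::gpu;
    return blit_route::cpu;      // assumed default
}
```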
- When committing a block as fbo, keep blit_dst data as well.
- Avoids removing (and losing data from) blit targets that just happen to share a page with a framebuffer.
- Uncacheable resources can be reused as soon as they're made visible to the draw call.
- Since they're likely to be reused every draw call until the shader changes, it is important to reuse as much as possible
- Calculate exact sizes when doing hit tests to avoid false negatives
- Defer page checking until actually required to do memory setup
- Introduce align2 helper to do non-pow2 alignments
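
A minimal sketch of what such a helper might look like; the exact signature in the codebase is not shown here:

```cpp
#include <cstdint>

// Rounds value up to the next multiple of align; works for any non-zero
// alignment, not just powers of two (e.g. align2(10, 6) == 12).
template <typename T, typename U>
constexpr T align2(T value, U align)
{
    return align ? static_cast<T>((value + align - 1) / align * align) : value;
}
```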
- Allow use of intrinsics when SSSE3 and SSE4.1 are not available in the build target environment
- Properly separate SSE4.1 code from SSSE3 code for some older processors without SSE4.1
- With harmonization between all texture types implemented, there is no difference between blit_engine_src and shader_read for supported formats
- Adds extra format filtering to ensure no conflicts when copying data
- While the mask for surface_a is at index 0, the surface cache expects the order to be maintained correctly!
Set the correct mask since surface store now checks each RTT individually
- Avoid silly broken tests due to queue_tag being called before pitch is initialized.
- Return actual memory range covered and exclude trailing padding.
- Coordinates in src are to be calculated with src_pitch, not required_pitch.
- This allows creating buffers with no MAP bits set, which should ensure they are created for VRAM usage only
- TODO: Implement compute kernels to avoid software fallback mode for pack/unpack operations
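
Assuming the "MAP bits" refer to the flags argument of glBufferStorage (GL 4.4), a minimal sketch looks like the following; it requires an active GL context and a loader such as glad, and the function name is illustrative:

```cpp
#include <glad/glad.h>

// Immutable buffer storage with no GL_MAP_*_BIT (and no client-storage hint):
// the client can never map it, so the driver is free to keep it in VRAM.
static GLuint make_device_local_buffer(GLsizeiptr size)
{
    GLuint buffer = 0;
    glGenBuffers(1, &buffer);
    glBindBuffer(GL_ARRAY_BUFFER, buffer);
    glBufferStorage(GL_ARRAY_BUFFER, size, nullptr, 0); // flags = 0: no MAP bits
    return buffer;
}
```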
- Fix 2D coordinate sampling of the W coordinate.
W is actually HPOS.w and not 1. Z, however, is always 0.
- Optimize register usage a bit
Disassembling compiled SPV shows that global declaration results in fewer ops than using inout modifiers. Modifiers generate extra mov instructions.
- Fix reading of varying registers in FP
Different registers have different behavior
- Always write to varying registers. If a register is not written to, it is initialized to (0, 0, 0, 1)
- Reimplements two-sided lighting correctly without hacks
- Also bumps shader cache version
- Properly commit orphaned blocks without invalidating existing cache structures
- Do not ignore overwritten objects when committing as unprotected fbo. Avoids stale references to invalidated surface objects.
- Load into memory as straightforward BGRA
- Fixes a bug in vulkan caused by byte shuffling in blit engine vs shader access
- Removes the need for memory shuffling when transferring into a rendertarget
- Implements render target data load (aka Read Color Buffer/Read Depth Buffer)
- Refactors vulkan surface barrier to be much cleaner.
- Removes redundant surface barrier invocations after doing a merged load
from surface cache.
- Adds explicit access modes when gathering surfaces from cache.
- Further improve aliased data preservation by unconditionally scanning.
It is possible for cache aliasing to occur when doing memory split.
- Also sets up for RCB/RDB implementation
- After splitting, the sections may not be referenced at all for anything other than just pixel storage
- In such cases, either merge down or sample from the upstream source instead
- Texel borders are no longer actually supported in modern APIs
- Removes the border texels and uses border color instead, which is incorrect but should work fine
vm::spu max address was overflowing, resulting in issues, so cast to u64 where needed. Fixes #6145.
Use vm::get_addr instead of manually subtracting vm::base(0) from the pointer in texture cache code.
Prefer std::atomic_thread_fence over _mm_?fence(), and adjust usage to be more correct.
Used sequentially consistent ordering in semaphore_release for the TSX path as well.
Improved memory ordering for sys_rsx_context_iounmap/map.
Fixed sync bugs in HLE gcm because of not using atomic instructions.
Use a release memory barrier in lwsync for PPU LLVM; according to this Xbox 360 programming guide, lwsync is a hardware release memory barrier.
Also use a release barrier where lwsync was originally used in liblv2 sys_lwmutex and cellSync.
Use an acquire barrier for the isync instruction, see https://devblogs.microsoft.com/oldnewthing/20180814-00/?p=99485
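
A hedged sketch of the mapping described above, following the commit's reasoning; these are illustrative helpers, not the recompiler code itself:

```cpp
#include <atomic>

// lwsync is treated as a hardware release barrier, so on the host side it
// becomes a C++ release fence rather than _mm_?fence().
inline void emit_lwsync()
{
    std::atomic_thread_fence(std::memory_order_release);
}

// isync in the lock-acquire idiom acts as an acquire barrier (see the
// linked article), so it maps to an acquire fence.
inline void emit_isync_acquire()
{
    std::atomic_thread_fence(std::memory_order_acquire);
}
```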