rpcsx

mirror of https://github.com/RPCSX/rpcsx.git synced 2026-02-12 02:34:29 +01:00

Author	SHA1	Message	Date
kd-11	035a76f26d	Fix build	2020-12-16 10:10:06 +03:00
kd-11	d3686dbb75	rsx: Add some texture upload statistics to the texture cache	2020-12-16 10:10:06 +03:00
kd-11	0ef5743261	rsx: Fix sampler descriptor updates for framebuffer resources - Each desc manages its own lifetime now instead of relying on global timestamp check - Fixes situation where same object remains active without update for long	2020-12-16 10:10:06 +03:00
Nekotekina	e321765c54	Split BEType.h to util/v128.hpp and util/to_endian.hpp	2020-12-13 16:34:45 +03:00
kd-11	f83c2f0b6b	rsx: Restructure and simplify some header include chains	2020-12-13 15:38:35 +03:00
Nekotekina	b59f142d4e	Move types.h to util/types.hpp	2020-12-12 15:12:01 +03:00
Nekotekina	6e05dcadb6	Reduce std::numeric_limits dependency Please, stop pretending... You need these templates for generic code. In other words, in another templates. Stop increasing compilation time for no reason.	2020-12-12 12:35:18 +03:00
Nekotekina	b382d3b3e9	Remove ASSUME macro It's dangerous and sometimes bluntly misused feature. Its optimization potential is near-zero.	2020-12-10 14:08:02 +03:00
Nekotekina	36c8654fb8	Remove HERE macro Some cleanup. Add location to some functions.	2020-12-10 12:30:22 +03:00
Nekotekina	5d934c8759	Improve narrow() and size32() with src_loc detection	2020-12-09 16:26:20 +03:00
Nekotekina	e055d16b2c	Replace verify() with ensure() with auto src location. Expression ensure(x) returns x. Using comma operator removed.	2020-12-09 15:43:38 +03:00
Nekotekina	eb66302907	atomic.hpp: replace std::atomic with atomic_t Dual dependency is nothing good.	2020-12-07 17:13:12 +03:00
kd-11	3a0b3a85a5	rsx: Separate program environment state from program ucode state - Allows for conservative texture uploads - Allows to update a program object without running full ucode analysis for no reason	2020-12-07 00:45:27 +03:00
RipleyTom	af8c661a64	Remove BOM markers	2020-12-06 15:30:12 +03:00
kd-11	2aa5c437e8	rsx: Fix upscaled image reconstruction - Base the upscaling on the real source and not the "attr" parameter. - In case of reconstruction, the source is much larger than the subslice in "attr"	2020-11-30 01:20:17 +03:00
kd-11	3ddfa288cf	rsx: Use multithreaded shader compiler backend	2020-11-21 20:43:15 +03:00
kd-11	0e7a705254	rsx: Resolution scaling overhaul - Enforce square pixels instead of per-axis scaling	2020-11-18 09:29:34 +03:00
Eladash	fefab50e06	Fix vm::range_lock, improve vm::check_addr	2020-11-11 10:30:09 +03:00
Megamouse	a3eb5c2d63	More Header cleanup	2020-11-06 22:14:05 +01:00
kd-11	b32eecb5a7	rsx: Driver compatibility improvements (#9131 ) * rsx: Refactor vertex clip emit to avoid using f64 unnecessarily - Fixes driver crash on intel * vk: Add NVIDIA driver version check - Warn if user has outdated drivers with known problems	2020-10-27 13:22:15 +03:00
kd-11	04ff7913b4	rsx/codegen: Workaround for borked hardware - Bitwise or does not evaluate correctly for some hardware. Substitute with subtraction instead.	2020-09-28 22:18:36 +03:00
kd-11	9baef8c705	rsx: Emit simpler fragment program code - Optimize clamp16 - Use bfe instead of shift-and	2020-09-27 18:56:04 +03:00
kd-11	a14a358b73	rsx: Optimize vertex decoder to generate simpler code - Significantly improves compilation speed by simplifying most of the code and doing something similar to LICM. * Actual decoding is now vectorized and performed in one step rather than in a loop. * Switches inside loops are removed and replaced with simple comparison. Generates much nicer (and smaller) GCN bytecode.	2020-09-27 18:56:04 +03:00
kd-11	a50ea09053	rsx: Properly pass format_class information during RTV/DSV resource barrier - Also takes the opportunity to remove repeating code in a minor refactor.	2020-09-22 12:19:54 +03:00
kd-11	7ed82c0791	rsx: Always force typeless copy if memory is crossing aspect boundary	2020-09-22 12:19:54 +03:00
kd-11	9db97278f3	rsx: Lower error message to warning - Mismatched texture handling is a TODO that will be handled with texturing rewrites	2020-09-19 01:55:59 +03:00
kd-11	d3898fda57	rsx: Release misconfigured texture memory before attempting reupload	2020-09-19 01:55:59 +03:00
kd-11	92d65ff3c2	rsx: Add support for mixed data types when sampling shadow coordinates	2020-09-15 17:37:52 +03:00
kd-11	da6760ed98	vk: Simplify shadow comparison operations for non-integer formats - Just use hardware PCF, it makes everyone's life easier.	2020-09-09 22:11:12 +03:00
kd-11	6380e67af9	rsx: Fix depth clipping - Fix special case where n=f making (f-n) = 0 - Dynamically update depth range by setting dirty bits - Fix depth bounds when n=f and bounds test is disabled	2020-09-08 15:33:08 +03:00
kd-11	dc465df3bc	rsx: Enable support for extended range in depth buffer - Software clipping emulation is used here as OpenGL does not have explicit clip control. - Hardware clip control for vulkan to be enabled after this.	2020-09-08 15:33:08 +03:00
kd-11	e9cdb248a0	glsl: Properly implement shadow filtering when running emulated shadow compare - Previous code was completely borked	2020-08-29 02:03:09 +01:00
kd-11	9828d6146b	rsx: Fix format matching when aggregating textures - When copying depth-depth, prefer own format over depth int format	2020-08-27 12:52:28 +03:00
kd-11	65ead08880	rsx: Refactor and improve image memory manipulation routines	2020-08-27 12:52:28 +03:00
kd-11	794378d5e9	rsx: Do not create depth textures as blit engine targets.	2020-08-27 12:52:28 +03:00
kd-11	a5ac5a9861	rsx: Separate uint depth formats from float depth formats	2020-08-27 12:52:28 +03:00
kd-11	b41349546c	rsx: Proper support for typeless transform of ABGR framebuffers using the RGBA8 format	2020-08-12 20:19:19 +03:00
kd-11	7109fe9889	rsx: Improve swizzled layout detection - Reset swizzle flag to false automatically on section reset. - Detect render target payload and extract swizzle information from it.	2020-08-05 23:23:38 +03:00
kd-11	4df933275b	rsx: Propagate raster type of fbo sourced data throughout the pipeline. - Tracks which kind of raster was done (Z-ordered vs linear) throughout the application. - This allows to identify if data is in the expected format or not.	2020-08-02 16:14:11 +03:00
kd-11	b0c7ca6d1f	vk: Improve video memory manager to attempt recovery in out of memory situations	2020-07-25 14:48:11 +03:00
kd-11	42a9ac9e6c	rsx: Brute-force removal of superseded surfaces	2020-07-16 19:11:26 +03:00
kd-11	632af8d723	rsx: Support partial texture descriptors - It is safe to declare w > pitch and it works as long as sampling inside the legal 2D area is obeyed.	2020-07-10 15:26:07 +03:00
kd-11	acf51f0ead	rsx: Fix transfer descriptors for partially overlapping slices in head - Height must be corrected to skip the piece that exists before the current slice	2020-07-03 14:29:54 +03:00
kd-11	83d818d96f	rsx: Improve mipmap gathering - Account for source offsets when grabbing subregions - Scale input accordingly when sourcing from fbo in all paths	2020-06-16 19:12:03 +03:00
sampletext32	0ad4e91001	Avoid string reallocation in swizzle CgBinaryProgram	2020-06-15 22:26:49 +03:00
kd-11	f4ec28d932	rsx: Merge instruction expand flag with the other sign expand flags - Avoids double expansion when both the exp_tex flag is set AND the texture also is sampled as signed - Should fix missing eyeballs in Mass Effect 1 with the previous sign expansion fix	2020-06-12 20:19:20 +03:00
kd-11	ce587f43a0	rsx: Implement signed normalized texture formats - Already partially supported via EXP option in the shader opcode, but format decoding was disabled. - Noticed in some UE3 games which use _SNORM variants on PC but _UNORM on rpcs3	2020-06-12 20:19:20 +03:00
kd-11	87cc937d4e	rsx/fp: Separate SRC precision modifiers - SRC0, SRC1 and SRC2 have different bits for precision modifiers all stored inside SRC1 - This explains the strange observed behavior of the MAD instruction which has 3 inputs	2020-06-07 12:07:27 +03:00
kd-11	73fe9b51de	rsx/fp: Ignore self-referencing register writes. - Sometimes, usually with shaders that do pack/unpack operations, there is a write-to-self operation. For example r0.xy = r0.xy Obviously no new data was introduced into "r0" by this, so we should not mark the register as having new data. - TODO: Investigate on realhw if self-reference is needed to "cast" the overlapping half registers to their full register counterparts.	2020-06-03 09:45:02 +03:00
kd-11	26b2e4253d	rsx: Properly account for memory sizes of reused surfaces	2020-06-02 21:37:57 +03:00
kd-11	b353bf6c56	rsx: Improve surface cache resource management - Do not allocate too many objects. This is a problem in games using dynamic memory allocators that can make it rare for a surface to fall on the same address twice, keeping zombie RTVs and DSVs alive much longer than needed. - Current limit used is 256M of virtual VRAM which is impossible on retail PS3	2020-06-01 22:24:27 +03:00
kd-11	542a6aed51	rsx: Add stippled rendering support to interpreters	2020-05-30 14:47:10 +03:00
kd-11	1677618c75	rsx: Implement stippled rendering	2020-05-30 14:47:10 +03:00
kd-11	bd41a108d8	nv3089: Account for subpixel addressing - Those strange offsets noted in some games seem to match to subpixel addressing. For example, when scaling down by a factor of 4, a pixel offset of 2 will end up inside pixel 0 of the output	2020-05-24 11:31:37 +03:00
Ani	581176fb1a	gl: Restrict insert_vertex_input_fetch workaround to Intel proprietary It works fine on Mesa iris Fixes detection of Mesa as recent Mesa does not have "x.org" on vendor string, allowing vendor_MESA to become true instead of vendor_INTEL on Mesa Intel	2020-05-17 17:49:14 +03:00
kd-11	37df3c6f96	rsx/fp: Fix precision clamping on MAD instruction	2020-05-17 09:11:26 +03:00
AniLeo	99f5145aab	glsl: Avoid implicit int->uint conversions Silences debug output regarding implicit int -> uint conversions	2020-05-16 11:45:59 +01:00
Mrlinkwii	c22d778143	Spelling fix in texture_cache.h (#8219 ) heurestic_end -> heuristic_end	2020-05-14 21:42:21 +03:00
kd-11	310f367fb1	rsx: Improve blit engine memory validation (#8215 ) - In blit engine logic there is a tendency to over-allocate so as to avoid having to stitch together textures later - Sometimes this can lead to out of bounds access and crash applications, so memory must be validated	2020-05-14 12:57:58 +01:00
kd-11	ed82288c1b	rsx/fp: Support more types of texture access - Allows more instructions to correctly decode depth textures	2020-05-13 22:20:43 +03:00
kd-11	b6e8560532	rsx/fp: Fix PK2/UP2 instruction - These variants take unsigned scalar inputs, not signed. - Fixes ARGB8->X16Y16 in SR: Gat out of Hell	2020-05-11 09:37:00 +01:00
kd-11	79e2a87bc5	rsx: Fix NOP shader passing - NOP shaders are used to stub rendering when a pass is supposed to be disabled	2020-05-10 21:54:34 +03:00
kd-11	14969cd8d0	rsx: Disable SCA writes to output register if vec result flag is set. - Noticed when debugging X-men origins: wolverine which has a bogus SCA op whilst writing vector to output - It makes no sense for both SCA and VEC to both write to the same register in the same instruction as memory ordering becomes an issue	2020-05-08 14:35:07 +03:00
kd-11	79c54aeba9	rsx: Move analyser dump to its own config option	2020-05-08 14:35:07 +03:00
kd-11	a3f25bc7c7	rsx/interpreter: Fix DIVSQ instruction	2020-05-05 13:18:03 +03:00
kd-11	4f7c020e63	glsl: Improve VGPR usage - VGPR usage lowered from 159 -> 127 for texturing. Occupancy doubled from 1 to 2 - Eliminate most temporary registers	2020-05-05 13:18:03 +03:00
kd-11	2ed50ba263	rsx/interpreter: Improve instruction accuracy - Fix DIV instruction - Add EXP_TEX modifier - Implement WPOS register read - Swap 3D and Cubemap enums to match RSX ids - Adds two extra instruction classes: flow control and packing control - Implement remaining FP instructions with exception of the rare projected texture lookups - Fix typo causing output color index > 0 to not work - Fix KIL instruction - Implement conditional vertex program writes	2020-04-30 15:02:59 +03:00
kd-11	bc5c4c9205	rsx/gl: Implement variable path interpreter for optimal performance	2020-04-30 15:02:59 +03:00
kd-11	930bc9179d	rsx/interpreter: Improve instructions support - Must statically write the gl_ClipDistance registers else you get uninitialized trash. This problem is more readily apparent on NVIDIA technology but even AMD is not completely immune.	2020-04-30 15:02:59 +03:00
kd-11	b4bf48c33b	vk: Integrate shader interpreter	2020-04-30 15:02:59 +03:00
kd-11	0072df7f20	rsx/gl: Add basic interpreter support to OGL - Adds basic interpreter functionality. - Flow control and other instructions not yet implemented.	2020-04-30 15:02:59 +03:00
scribam	2e397e38a4	Typos	2020-04-14 17:06:58 +03:00
scribam	f37adc4188	Add fallthrough attribute	2020-04-14 17:06:58 +03:00
Eladash	cb14805d78	rsx fp/vp analyzers: Fix strict type aliasing and improve codegen	2020-04-12 16:48:43 +03:00
Eladash	ff74c241c7	rsx: Fix get_optimal_blit_target_properties for local memory	2020-04-11 21:21:15 +03:00
Eladash	36fd1d0f0d	rsx: Optimize transform constants load methods (#7992 )	2020-04-09 15:53:43 +03:00
Megamouse	078c31c1da	Qt: fix lupdate warnings (used for translation)	2020-04-06 20:59:58 +02:00
kd-11	0b6e2b26fa	rsx: Fix DST instruction - It's the old-school distance vector, not the more modern distance() function - There is seemingly no glsl function that maps to it directly.	2020-04-05 16:35:20 +03:00
Eladash	d97e9f7b4a	rsx: Batch vertex program load methods	2020-04-02 20:42:12 +03:00
unknown	049825812e	RSX: Restrict analyser loop error	2020-03-28 09:42:13 +03:00
xddxd	d96dabcd60	rsx: Rename current_instrution to current_instruction (#7883 )	2020-03-28 02:46:48 +00:00
kd-11	7025985c0d	rsx: Improve section scanning when updating surface cache resources in blit engine.	2020-03-15 16:51:23 +03:00
kd-11	a756c0679e	rsx: Implement cross-aspect slice gathering - Fixes a data leak that can happen when a surface is rejected due to aspect mismatch. - Mismatch can lead to rejection due to area covered excluding the RTT and inevitable upload a texture from CPU at the same location. - Overlapping fbo/shader_read resources are not allowed.	2020-03-15 16:51:23 +03:00
kd-11	12b73c8bdc	rsx: Fix copypasta	2020-03-09 17:20:24 +03:00
kd-11	2985a39d2e	rsx: Rewrite async decompiler	2020-03-09 14:59:25 +03:00
kd-11	84a542fbce	rsx: Blit engine improvements - Detect writes to the display output memory and handle it specially. It already defines a known 2D region. - Try and detect situations where raw transfers would be of benefit.	2020-03-08 10:30:13 +03:00
Nekotekina	e4a81b1d13	Move Log.h to util/logs.hpp	2020-03-07 12:29:23 +03:00
Nekotekina	7a8772dafa	Replace std::string::npos with umax	2020-03-05 14:05:23 +03:00
Nekotekina Aux1	250736ece5	Fix warnings in emucore	2020-03-04 21:23:34 +03:00
kd-11	54775d91dc	rsx/blit-engine: Account for a rare corner case - It is possible to have a RTV<->DSV transfer with compatible-sized formats. Mark the depth size as typeless in such a situation to avoid crossing the aspect barrier with the API.	2020-03-04 21:21:59 +03:00
gamerforEA	c0fbf3091e	Remove unnamed namespaces from headers	2020-02-27 00:38:55 +03:00
kd-11	569e1c2df6	rsx: Fix typo. Noted by github user @gamerforEA	2020-02-26 19:40:35 +03:00
Nekotekina	5e75a0c497	Disable cotire on travis Make some workarounds for clang because it poorly supports -Wold-style-cast	2020-02-21 17:03:54 +03:00
Nekotekina	972e0ab31d	Remove -Wno-reorder and make it an error	2020-02-21 15:20:34 +03:00
Nekotekina	92e3eaf3ff	Fix signed-unsigned comparisons and mark warning as error (part 2).	2020-02-19 22:54:58 +03:00
Megamouse	fe75311be2	move config structs to own files and clean up some headers	2020-02-17 15:08:17 +03:00
kd-11	f47333997f	rsx: Validate memory blocks before checking for overlap	2020-02-10 21:48:35 +03:00
kd-11	3787108ee7	rsx: Typo fix in audit condition	2020-02-10 21:48:35 +03:00
Eladash	b7043ce000	Make rsx::get_address report caller location	2020-02-08 22:18:56 +03:00
kd-11	b6422c9a33	rsx: Fixup - Destination Y coordinate must be 'rebased' onto the current slice by subtracting its offset. Only the local path was affected this time	2020-02-05 18:18:09 +03:00
Nekotekina	c0f80cfe7a	Use attributes for LIKELY/UNLIKELY Remove LIKELY/UNLIKELY macro.	2020-02-05 10:42:34 +03:00
Nekotekina	1a78e0e80c	Make RPCS3 compile in C++2a mode	2020-02-04 23:43:55 +03:00
kd-11	9d9b5c4d66	rsx: Rewrite coverage test to take sum of areas into account. - TODO: A proper sweep algorithm to calculate sum of overlapping rectangles	2020-02-04 16:20:52 +03:00
kd-11	b9ec012922	rsx: Allow for proper data checks when WCB/WDB is enabled	2020-02-04 16:20:52 +03:00
Silent	7f4e546f19	Protect m_storage.find(key) to fix a race	2020-02-02 22:28:14 +03:00
kd-11	7d2ed9200d	rsx: Remove sections that are wholly inherited by new blocks - Allows sections reclaimed by the surface store due to overlap/inheritance to be identified and removed. - Additionally, potentially lowers the number of flushes required per block with multiple overlaps improving efficiency and theoretically performance.	2020-02-01 15:14:29 +03:00
Nekotekina	15391f45d0	Modernize RSX logging (rsx_log variable)	2020-02-01 11:52:22 +03:00
kd-11	36d5db7f30	rsx: Plug texture data leak in the 'exact match' path. - Followup to previous texture data leak fix for the replaced section path.	2020-01-31 14:56:53 +03:00
kd-11	c9e35926f5	rsx: Preserve pixel data when splitting sections - Ironically this data leak is caused by trying to fix another type of data leak	2020-01-30 21:07:36 +03:00
kd-11	1206a5d4b7	rsx: Tweak blit engine heuristics a bit - Reject writes to RTT if the source data is of unknown origin. non-RTT data and only 1 line in length is suspicious and often GPU data like programs or other rendering inputs.	2020-01-29 12:54:06 +03:00
kd-11	79216917b3	rsx: Workaround for broken rtt resampling - Avoids WCB requirement for now to keep res scaling working correctly. - TODO: Fix this properly	2020-01-26 13:58:48 +03:00
kd-11	44f2cacf7b	rsx: Blit engine tuning - Attempt to identify blit operations that will be flushed immediately after and just do them on CPU instead if the transformation is trivial. - If only a single blit section is contributing to an atlas merge op, the threshold should be 100%. The only acceptable result here is a truncation.	2020-01-26 13:58:48 +03:00
kd-11	7a275eaa3a	rsx: Fix incomplete blit operations getting used as texture inputs - Raise passing 'score' from 50% to 90% to filter out very incomplete merge operations. - Catch unfit sections passing the match test; possible for blit_dst data but will likely be always harmless. Disabled in release builds by default.	2020-01-26 13:58:48 +03:00
Maksim Derbasov	1abdee242a	small improvement (#7288 ) * small improvement * comments addressed Co-authored-by: kd-11 <15904127+kd-11@users.noreply.github.com>	2020-01-22 12:28:48 +00:00
kd-11	db014d8a58	rsx: Fix section length calculations when generating new blit targets.	2020-01-16 17:57:31 +03:00
kd-11	309251ce7a	rsx: Touch locked dst memory after blit transfer operations in case it is locked by WCB/WDB	2020-01-16 11:12:08 +03:00
Dravonic	94d2f97f27	Multithreaded shader compilation follow-up (#7190 ) * Multithreaded load pipeline entries shader compilation stage Co-authored-by: kd-11 <15904127+kd-11@users.noreply.github.com>	2020-01-06 21:59:59 +03:00
kd-11	7f09def94e	rsx/vp: Properly initialize output registers. - All registers tested on hw show contents to be 0, 0, 0, 1. Make default output registers match this pattern.	2020-01-05 18:06:08 +03:00
Megamouse	c9aee27d48	VK: remove unused init function declaration	2020-01-03 14:22:40 +01:00
Eladash	9690854e58	Some cleanup * Prefer default initializer over std::memset 0 when possible and more readable. * Use std::format in trophy files name obtaining. * Use vm::ptr<>::operator bool() instead of comparing vm::ptr to vm::null or using addr(). * Add a few std::memset calls in hle where it matters (or in some places just to document an actual firmware memcpy call).	2019-12-31 22:27:27 +03:00
Megamouse	ef6f565dbd	silence some annoying warnings	2019-12-28 15:40:57 +01:00
Emmanuel Gil Peyrot	9b77febd10	RSX: Remove two empty cpp files	2019-12-23 00:02:57 +03:00
Eladash	db4041e079	Implement rounded_div Round-to-nearest integral based division, optimized for unsigned integral. Used in sceNpTrophyGetGameProgress. Do not allow signed values for aligned_div(), align().	2019-12-20 14:47:04 +03:00
Nekotekina	377e7d2a73	C-style cast cleanup VI	2019-12-04 17:56:22 +03:00
Nekotekina	185c067d5b	C-style cast cleanup V	2019-12-03 17:23:00 +03:00
Nekotekina	28eacc616a	C-style cast cleanup III	2019-12-01 00:32:44 +03:00
kd-11	8ca53f9c84	rsx: Remember to min-max the anchor indices of a polygon or triangle fan	2019-11-24 19:01:57 +03:00
kd-11	429a76a140	rsx: Remove redundant check	2019-11-23 16:11:18 +03:00
kd-11	41e7d2aa0a	rsx: Select correct image aspect for blit engine targets.	2019-11-19 13:18:15 +03:00
kd-11	41c3180276	rsx: Fix invalid format checks for DMA sections which are typeless	2019-11-19 13:18:15 +03:00
kd-11	9dab0575fa	rsx: Add missing format check for the RTV<->DSV transfer case - TODO: Rewrite resource handling routines	2019-11-18 13:17:00 +03:00
kd-11	4a0e1c79ed	rsx: Improve format validation for blit engine - Check all possible cases where format mismatch is possible. - Warn if a slow path is going to be taken. Should help with future optimizations.	2019-11-18 13:17:00 +03:00
kd-11	2408922806	rsx: Do not ignore clamping for some routines that do not have implied range	2019-11-18 13:17:00 +03:00
kd-11	0a32d478df	vk: Enable auto-growing of the data heaps for the performance case	2019-11-10 17:53:12 +03:00
kd-11	f359342721	rsx: Implement mutable ring buffers with grow support	2019-11-10 17:53:12 +03:00
Emmanuel Gil Peyrot	56f82d2701	rsx: Wrap gsl::span definition into Utilities/span.h	2019-11-09 20:00:50 +01:00
Emmanuel Gil Peyrot	f76720ceb0	Remove extraneous ::narrow<int>() calls GSL’s gsl::span didn’t use the correct type for its index_type, which is why they were needed.	2019-11-09 19:30:06 +01:00
Emmanuel Gil Peyrot	72cdf0b04c	Replace gsl::span’s implementation with tcbrindle’s This implementation optimises correctly on all relevant compilers, unlike GSL’s which gave extremely slow code on any compiler other than MSVC. Supersedes #6948.	2019-11-09 19:30:06 +01:00
Emmanuel Gil Peyrot	ef368c5171	rsx: Replace gsl::byte with C++17’s std::byte	2019-11-09 19:30:05 +01:00
kd-11	7072489a6e	rsx: Implement point sprite coordinate generation - When the point sprite flag is set, overrides the input similar to the 2D mask. The returned X and Y values are always the gl_PointCoord values for the fragment. - Stacks with the 2D mask to override the z and w coordinates.	2019-11-09 12:50:53 +03:00
kd-11	63673b1a9f	rsx: Implement full color remap for the D24S8->ARGB8 converter	2019-11-08 19:11:59 +03:00
kd-11	1266b63135	vk: Enable gpu deswizzling	2019-11-05 22:07:22 +03:00
kd-11	9cd3530c98	rsx: Set up framework for hw deswizzle	2019-11-05 22:07:22 +03:00
Nekotekina	e3e7051ed3	Minor optimization in BufferUtils.cpp Don't use PSHUFB for horizontal operations. Utilize PHMINPOSUW to compute max as well: + sse41_hmin_epu16 + sse41_hmax_epu16	2019-10-30 18:52:34 +03:00
Nekotekina	b1968769b7	Minor cleanup in BufferUtils.cpp Replace inline asm with intrinsic using target attribute trick.	2019-10-30 17:53:51 +03:00
linkmauve	cfd5cf6bdb	Optimise primitive_restart::upload_untouched() (#6881 ) * rsx: Optimise primitive_restart::upload_untouched() with SSE4.1 This optimisation is only applied when skip_restart is false. I’ve only tested the u16 codepath, as it is the one used in NieR. In some very unscientific profiling, this function used to take 2.76% of the total frame time at the save point of the port town, it now takes about 0.40%. * rsx: Mark all SSE4.1 functions with attributes on gcc and clang This assures the compiler we will take care of only calling these functions after having checked that the CPU does support these instructions. * rsx: Add an AVX2 implementation of primitive restart ibo upload * rsx: Remove redefinition of SSE4.1 instructions Now that clang is aware that our functions are compiled with SSE4.1, it lets us generate this code using its intrinsics. * rsx: Optimise vector to scalar conversion This is done using minpos and srli intrinsics and generate less code than before. Thanks Nekotekina for the suggestion!	2019-10-30 16:42:44 +03:00
kd-11	aa3eeaa417	rsx: Separate subresource_layout:dim_in_block and subresource_layout::dim_in_texel - These two are not always linked when working with compressed textures. The actual texels extend past the actual size of the image if the size is not aligned. e.g if height is 1, the real height is 4, but its not possible to determine this from the aligned size. It could be 1, 2, 3 or 4 for example. - Fixes image out-of-bounds writes when uploading from CPU	2019-10-29 20:03:54 +03:00
kd-11	d04241ad25	rsx: Allow compressed textures to be unaligned in size - Align based on row length but let the texture itself be of arbitrary dimensions	2019-10-28 15:20:45 +03:00
kd-11	e04b6cd7c0	rsx: Copypasta fix - r1 is always float4 never half4. Its a full-width register unlike the other outputs which are optionally half-width.	2019-10-23 00:50:24 +03:00
Eladash	945abcc6cd	rsx: Align down index array offset * Also use improved to_be_t<> template (recently ignoring one byte long types) for vm gsl::byte referencing, remove redundant narrow<> cast (same type)	2019-10-22 13:45:09 +03:00
kd-11	0b2f9f0f17	rsx: Add support for delayed shader discard. - Noticed a glitch on AMD hw and windows drivers where discard seems to affect entire 4x4 cells. - Dead fragments (outside the primitive boundary) could have their discards trigger as they do not have proper access to variables. - This introduces dead fragments along triangle edges, causing a diagonal line pattern across the screen that is very annoying.	2019-10-22 13:44:49 +03:00
kd-11	901942f24a	rsx: Replace pointless f32[4] restriction on texture parameters. - Use a struct instead to improve readability and remove pointless OpBitCast	2019-10-22 13:44:49 +03:00
kd-11	f7842b765f	rsx: Implement packed format renormalization - Renormalizes arbitrary N-bit values as 8-bit normalized. - NV hardware performs integer normalization at 8 bits if the size is less than 8. - This can cause significant arithmetic drift because the error is multiplied by a huge number when sampling.	2019-10-22 13:44:49 +03:00
kd-11	09de3b7974	rsx: Tweak behaviour of the "Use GPU texture scaling" option - If either source data or dest is a render target, do image operations on the GPU same as before - If swizzle is desired, use CPU fallback - If no scaling and no format conversion is required, use CPU fallback - If scaling is desired and the transfer target is in local memory, use the GPU - When doing trivial copies, use the routine in rsx_methods instead of duplicating code. Also has the benefit of better range checking.	2019-10-20 21:38:40 +03:00
kd-11	868547aec8	rsx: Minor improvement to fbo region invalidation - When committing a block as fbo, keep blit_dst data as well. - Avoids removing (and losing data from) blit targets that just happen to share a page with a framebuffer.	2019-10-20 21:38:40 +03:00
kd-11	996534c559	rsx: Fixup for aspect mismatch	2019-10-20 15:25:07 +03:00
kd-11	404073c74a	rsx: Force-align compressed formats to 4x4 texel blocks and disable 1D compressed textures. - The PS3 allows defining 1D compressed images but this obviously doesn't work well on desktop.	2019-10-18 14:46:37 +03:00
kd-11	eff4e95c99	rsx: Minor cache fixup for cyclic references. - Logic was broken by mipmaps PR. Do not issue a texture barrier if a temp copy is being done.	2019-10-18 14:46:37 +03:00
kd-11	eee2237e19	rsx: Track uncached cache resources - Uncacheable resources can be reused as soon as they're made visible to the draw call. - Since they're likely to be reused every draw call until the shader changes, it is important to reuse as much as possible	2019-10-18 14:46:37 +03:00
kd-11	decf9cfcf6	rsx: Notify the backend to release or delete temporary surfaces after we're done with them.	2019-10-18 14:46:37 +03:00
kd-11	a936e43ff6	rsx: Fixup for slice gathering for structures with multiple mipmap levels - TODO: Proper multi-level assembly for non-2D structures	2019-10-17 18:18:00 +03:00
kd-11	e166dbccc8	rsx: Fix visibility of blit destination targets	2019-10-17 18:18:00 +03:00
kd-11	0c35595ce2	rsx: Remove the alpha-to-coverage hack that was added to hide the missing mipmaps in games - Moves to a purely stochastic function using dithering to simulate coverage	2019-10-17 18:18:00 +03:00
kd-11	f0ed0285f3	rsx: Implement range-based subresource descriptor cache - The previous address-based approach was pretty awful when it comes to invalidating	2019-10-17 18:18:00 +03:00
kd-11	fbb9ed4e25	rsx: Add explicit range to cached subresource descriptors	2019-10-17 18:18:00 +03:00
kd-11	c9e3a321b2	rsx: Fixup for surface cache scanning - Fix regression when gathering cubemaps	2019-10-17 18:18:00 +03:00
kd-11	1ac976771c	rsx: Add some texture search options for the cache - Potentially optimizes texture cache searching using explicit options	2019-10-17 18:18:00 +03:00
kd-11	840b52fe80	rsx: Implement mipmap gathering from texture cache	2019-10-17 18:18:00 +03:00
kd-11	d6d8766f8d	rsx: Refactoring - Move some helper routines out of the cache core - Prep for multi-layered image search	2019-10-17 18:18:00 +03:00
kd-11	4a19a2dd24	rsx: Explicitly describe transfer regions for both source and destination blocks	2019-10-04 18:10:46 +03:00
kd-11	ef5b56bc48	rsx: Align width properly when normalizing to avoid fractional results being lowered to 0	2019-09-29 11:39:22 +03:00
kd-11	c59cb1bdd3	rsx: Allow only sse4.1 capable CPUs to take the accelerated index path - Older sets lack the required min/max functionality	2019-09-13 12:28:52 +03:00
kd-11	cc313b052f	rsx: Improve hit testing when scanning for overlapping surfaces - Calculate exact sizes when doing hit tests to avoid false negatives - Defer page checking until actually require to do memory setup - Introduce align2 helper to do non-pow2 alignments	2019-09-12 23:32:21 +03:00
kd-11	9842823a8c	rsx: Check if memory actually exists when overallocating blit targets	2019-09-12 23:32:21 +03:00
kd-11	cd1345b6bb	rsx: Do not use nul section if resolution scaling is active on a surface	2019-09-12 23:32:21 +03:00
kd-11	858014b718	rsx: Experiments with nul sink	2019-09-12 23:32:21 +03:00
kd-11	212ac19c11	vk: Reimplement DMA synchronization	2019-09-12 23:32:21 +03:00
kd-11	60845daf45	rsx: Improve use of CPU vector extensions - Allow use of intrinsics when SSSE3 and SSE4.1 are not available in the build target environment - Properly separate SSE4.1 code from SSSE3 code for some older processors without SSE4.1	2019-09-12 14:08:21 +03:00
kd-11	27af75fe71	rsx: Fixup for blit engine when moving inverted regions - Properly calculate overlap range when sections are inverted - Simplify transfer logic for inverted regions	2019-09-11 23:30:55 +03:00
kd-11	412c620b9d	rsx: Allow sampling from shader_read resources for blit engine - With harmonization between all texture types implemented, there is no difference between blit_engine_src and shader_read for supported formats - Adds extra format filtering to ensure no conflicts when copying data	2019-09-10 16:54:02 +03:00
kd-11	75fcfac00e	rsx: Modify find_cached_texture to respect gcm_format. Can pass 0 for "dont care"	2019-09-10 16:54:02 +03:00
kd-11	f53361b966	rsx: Fix fast texture copy when src_pitch != width * block_size - Happens on mipmapped linear images	2019-09-08 18:22:27 +03:00
kd-11	0af9685381	rsx: Deprecate surface_transform::argb_to_bgra which is no longer required. - vulkan now uses native swizzle mapping for both surface and texture	2019-09-08 13:56:41 +03:00
kd-11	6aa0b49dbc	vk: Prefer using native alignment when uploading. - Allows using fast copy paths and reduces memory and compute footprint	2019-09-07 16:23:20 +03:00
kd-11	a3a0cb8c17	rsx: Minor texture optimizations	2019-09-07 16:23:20 +03:00
kd-11	efa501dac6	rsx/vp: Set default inputs to (0, 0, 0, 1) - From some hw tests, it seems this is the default.	2019-09-06 17:08:28 +03:00
kd-11	f8dbe281a5	glsl: Explicitly declare const inputs as such - Avoids copying the values to temp variables before invoking function calls - Generates shorter, cleaner AST and SPV bytecode	2019-09-06 17:08:28 +03:00
kd-11	9dc06cef7f	rsx: Do not include ro data when attempting to do section merge - Avoids crazy situations like trying to merge from a 3d or cubemap in memory	2019-09-02 16:49:04 +03:00
kd-11	e99e8460fe	rsx/texture_cache_utils: Warnings cleanup	2019-09-01 18:59:50 +03:00
kd-11	27fabd7607	rsx/ring_buffer: Warnings cleanup	2019-09-01 18:59:50 +03:00
kd-11	0158a88c88	rsx/textures: Warnings cleanup	2019-09-01 18:59:50 +03:00
kd-11	401bd9112a	rsx/prog: Warnings cleanup	2019-09-01 18:59:50 +03:00
kd-11	652f18ebaa	rsx/buffers: Warnings cleanup	2019-09-01 18:59:50 +03:00
kd-11	94656ac1e3	rsx/vp: Warnings cleanup	2019-09-01 18:59:50 +03:00
kd-11	0ee9d7b46d	rsx/fp: Warnings cleanup	2019-09-01 18:59:50 +03:00
kd-11	7f99de36c1	rsx: Fixup for surface_target_a flag being broken - While the mask for surface_a is at index 0, the surface cache expects the order to be maintained correctly! Set the correct mask since surface store now checks each RTT individually	2019-08-30 21:46:19 +03:00
kd-11	99fb6d6a5d	rsx: Allow GPU-accelerated stream manipulation when doing texture uploads	2019-08-30 21:46:19 +03:00
kd-11	e55d216619	rsx: Workarounds for some buggy games - Replace assert with log message until hardware testing confirms findings	2019-08-28 14:54:51 +03:00
kd-11	e334a43169	rsx: Fix surface cache hit tests - Avoid silly broken tests due to queue_tag being called before pitch is initialized. - Return actual memory range covered and exclude trailing padding. - Coordinates in src are to be calculated with src_pitch, not required_pitch.	2019-08-28 14:54:51 +03:00
kd-11	2962e05f26	rsx: Implement per-RTT color masks - Also refactors and simplifies some common code in surface store and rsx core	2019-08-27 21:59:02 +03:00
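
Two entries in the log above mention small integer helpers: db4041e079 ("Implement rounded_div", round-to-nearest division optimized for unsigned integrals) and cc313b052f ("Introduce align2 helper to do non-pow2 alignments"). The following is a minimal sketch of what such helpers could look like. The names `rounded_div_sketch` and `align2_sketch`, their signatures, and the tie-rounds-up behaviour are assumptions made for illustration; this is not the code from the repository.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical sketch: round-to-nearest division for unsigned integers.
// Adding half the divisor before dividing rounds ties upward.
// Assumes divisor != 0 and that value + divisor / 2 does not overflow.
template <typename T>
constexpr T rounded_div_sketch(T value, T divisor)
{
    static_assert(std::is_unsigned_v<T>, "sketch only covers unsigned types");
    return (value + divisor / 2) / divisor;
}

// Hypothetical sketch: align a value up to a multiple of an arbitrary
// (not necessarily power-of-two) alignment using division, not bit masks.
// Assumes alignment != 0 and no overflow in value + alignment - 1.
template <typename T>
constexpr T align2_sketch(T value, T alignment)
{
    static_assert(std::is_unsigned_v<T>, "sketch only covers unsigned types");
    return ((value + alignment - 1) / alignment) * alignment;
}

// Small self-check exercising both helpers.
inline void check_examples()
{
    assert(rounded_div_sketch<std::uint32_t>(7, 2) == 4);  // 3.5 rounds up
    assert(rounded_div_sketch<std::uint32_t>(9, 4) == 2);  // 2.25 rounds down
    assert(align2_sketch<std::uint32_t>(10, 6) == 12);     // next multiple of 6
    assert(align2_sketch<std::uint32_t>(12, 6) == 12);     // already aligned
}
```

A mask-based alignment such as `(value + a - 1) & ~(a - 1)` only works when `a` is a power of two, which is presumably why a separate division-based helper is useful for non-pow2 pitches.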

924 commits