Commit graph

924 commits

Author SHA1 Message Date
Nekotekina cb2748ae08 Update ASMJIT (new upstream API) 2021-12-29 02:45:00 +03:00
Nekotekina d836033212 LLVM: enable some JIT events (Intel, Perf)
Made some related adjustments.
Currently incomplete.
2021-12-26 16:41:37 +03:00
Nekotekina 8b4b6ba946 copy_data_swap_u32: build AVX-512 path 2021-12-26 14:40:21 +03:00
Nekotekina 599e00d6da BufferUtils: remove dead code (vertex streaming)
RIP. It won't be useful.
2021-12-26 14:40:21 +03:00
Nekotekina 3cd8891ab8 Re-refactor copy_data_swap_u32 again
Drop AVX2 path for now, since it usually operates on small data.
Rely on automatic SSE vectorization on recent compilers.
Side refactoring on JIT.h to workaround weird conflict issue.
2021-12-26 14:40:21 +03:00
nastys a0040e6fb1
macOS: Implement texture converter for Metal (2) (#11289)
* macOS: Implement texture converter for Metal (2)

* Fix texture conversion formatting
2021-12-24 15:46:37 +03:00
kd-11 39ef39aa4e rsx: Exercise caution when testing for overlaps in invalidated sections 2021-12-24 15:13:33 +03:00
Nekotekina 262ff01619 Use aligned stores in write_index_array_data_to_buffer
Ensure that target buffer is cache line aligned.
Improve stx::make_single to support alignment.
2021-12-21 23:28:09 +03:00
Nekotekina 76ccaf5e6f BufferUtils: refactoring
Optimize CPU capability tests for arch-tuned builds.
Separate streaming and non-streaming utilities.
Rewritten copy_data_swap_u32(_cmp) with AVX2 path.
2021-12-21 23:28:09 +03:00
Nekotekina d6420b8803 Put std::hash specialization out of std 2021-12-07 13:04:10 +03:00
kd-11 44fe6f6d39 rsx: Fix sloppy format matching test 2021-11-27 17:47:41 +03:00
kd-11 4df1a938b1 Unused var 2021-11-24 16:02:24 +03:00
kd-11 94a3b1cfe8 rsx: Roll back some optimizations
- Just use RGB565 for all blit targets. Avoids really dumb transforms done by GPU hw.
- When X16 is used, all the channels get written to R channel alone. CmdBlit does perform format conversion!
- gl: Force image copy when blit is requested with compatible targets. Avoids format conversion issues.
2021-11-24 16:02:24 +03:00
kd-11 a21c6c4628 rsx: Fix handling of scaling requests for packed formats
- One does not simply interpolate RGB565 components as U16 data!
2021-11-24 16:02:24 +03:00
kd-11 97bd8f7bc1 rsx: Update sampler format class when inheriting mipmap slices/sections 2021-11-24 16:02:24 +03:00
Megamouse 4d0330bf82 rsx: fix possible segfault 2021-11-16 09:31:16 +01:00
kd-11 ad00c44231 rsx: Configure pitch correctly for pitch-zero textures (1D) 2021-11-03 16:58:30 +03:00
kd-11 5b0ef401f7 rsx: Fix sampling in X when 0 pitch is given
- A pitch of 0 still allows 1-dimensional addressing.
2021-10-31 14:32:42 +03:00
kd-11 78bcb0fd53 rsx: Do not reuse/destroy sections that have references held
- Avoids a situation where blit-dst and blit-src have overlapping ranges. Uploading blit-dst destroys blit-src and vice-versa.
  This is not the end of the world, but blit-src should be kept around until the operation is completed to avoid stale references!
2021-10-27 12:30:43 +03:00
kd-11 479150b214 rsx: Fix decoding of linear cubemaps
- 128-byte boundary is not observed in linear tiling. Verified in hw.
2021-10-10 16:15:28 +03:00
kd-11 dc8fc9fc79 vk: Clean up around vkQueueSubmit handling
- Explicitly declare one version for CB flush and the other for Async flush
- Always flush descriptors on CB flush in case of page fault handling.
  Other threads other than offloader can also enter the method and require normal flow.
- Fix overlapping interrupt IDs.
- Minor formatting fixes
2021-09-28 23:18:26 +03:00
kd-11 7b9fb7ad9c rsx: refactor rsx_utils a bit
- Move obviously standalone things to their own utility files
2021-09-28 17:43:15 +03:00
kd-11 3c7ada8e83 rsx: Fix 3D texture decode
- 3D mipmaps are shrunk in all 3 axes, they are not 2D array textures.
- Fixes mip1-mipN for all situations
2021-09-21 19:53:46 +03:00
kd-11 3ab9e04db7 rsx: Fix surface access bit flags
- The previous enumeration was a holdover from older access management.
- A bitflag of 0 seriously messes up the mask tests
2021-08-29 11:10:30 +03:00
kd-11 f745971cc8
rsx: Fix coordinate scaling for shadow access (#10668)
- For shadow2DProj the 3rd coordinate is actually the depth value, do not scale
2021-08-06 22:49:50 +01:00
kd-11 da3c9948e6 rsx: Revert use of std::has_single_bit
- Zero is not a power of 2 in this situation, and we do not want to treat it as such
2021-08-04 20:28:25 +03:00
kd-11 daa8265a47 rsx: Fix interpreter texture fetch 2021-08-04 20:28:25 +03:00
kd-11 8aec943093 Use c++20 has_single_bit for POT test 2021-08-03 00:36:04 +03:00
kd-11 99b6963fab rsx: Improve unnormalized coordinate sampling
- Improve rounding when sampling nearest neighbour. This is mostly a problem with NVIDIA
- Implement unnormalized 3D sampling
2021-08-03 00:36:04 +03:00
kd-11 b3c65b7bca rsx: Implement vtc encoding for NVIDIA OpenGL support 2021-08-03 00:36:04 +03:00
kd-11 0ec526c5f1 rsx: Do not use VTC tiling on NPOT textures
- Seems to be ignored for 'normal' textures. Mostly verified through games.
2021-08-03 00:36:04 +03:00
Megamouse 0a7a12bbff RSX: fix 'Working buffer not big enough' 2021-07-27 23:59:12 +02:00
kd-11 92d1534917 rsx: Set composite images upload context based on their actual contents 2021-07-27 10:52:21 +03:00
kd-11 6a9d1edee1 vk: Fix use-after-free hazard by checking if we're faulting from within the texture cache
- If we're using the texture cache, DO NOT delete resources.
2021-07-25 20:55:09 +03:00
kd-11 0d87d909c6 vk: Fix double-spill for invalidated resources 2021-07-17 21:28:11 +03:00
kd-11 d53f2f10fb rsx/vk: Improve recovery during OOM situations
- Do not spill when running on IGP with only one heap as it will just crash anyway.
- Do not handle collapse operations when OOM. This will likely just crash and there are better ways to handle old surfaces.
- Spill or remove everything not in the current working set
- TODO: MSAA spill without VRAM allocations
2021-07-17 21:28:11 +03:00
kd-11 aaac4c1bde Clang workaround for c++20 non-compliance 2021-07-15 18:05:35 +03:00
kd-11 2524c35638 vk: Improve handling of texture cache temporary resources
- Temp resources from the texture cache are used to hold composite objects being sent to the GPU and can waste a lot of memory.
- Remove them if we run out of memory as they can linger around for a long time.
2021-07-15 18:05:35 +03:00
kd-11 a2f93b0696 rsx: Implement a simple cache eviction routine
- Can remove all non-essential textures from the cache except those passed as an exclusion list
2021-07-15 18:05:35 +03:00
kd-11 77c9dff054 vk: Minor whitespace fix
- Non-functional formatting and warning fixes
2021-07-15 18:05:35 +03:00
kd-11 c18e5e07cc vk: Implement VRAM spilling
- The idea is to shift memory to "shared graphics memory" when VRAM is running out
2021-07-15 18:05:35 +03:00
kd-11 000414c47d vk: Refactor surface cache by moving code to cpp file 2021-07-15 18:05:35 +03:00
kd-11 71a5e5333a rsx: Fix invalid reference when purging unlocked sections 2021-07-15 18:05:35 +03:00
kd-11 4bf9700562 rsx: Remove unused variable leftover from refactoring 2021-06-17 00:43:20 +03:00
kd-11 966aec7ad7 rsx: Resync excluded memory regions to avoid memory tests failing after flush events
- This is a mostly correct fix, but a corner case exists that can leak old data to the surface cache
2021-06-15 15:42:16 +03:00
kd-11 78972cd611 rsx: Refactor surface inheritance logic 2021-06-15 15:42:16 +03:00
kd-11 ddbe496097 rsx: Fix depth/color mismatch resolve in texture cache
- Sometimes we need a depth texture but only a color texture is available.
2021-06-07 01:03:49 +03:00
kd-11 3f80d0b7d8 rsx: Fix surface deduplication crash 2021-06-07 01:03:49 +03:00
kd-11 568af756cc rsx: Fix expired sampler descriptors
- Rebuilding when strict mode is enabled was incomplete.
  The copy has to be redone if the source has been updated.
2021-06-06 15:37:47 +03:00
Nekotekina f2d6b52561 Fix span copy after refactoring
- Add range check at fast path.
- Fix typo in element by element copying.
Should fix #10385
2021-06-01 21:18:25 +03:00
Nekotekina a1608b636f span: implement as_span workarounds as utils::bless
Minor cleanup.
2021-05-31 15:46:34 +03:00
Ani a49446c9e9
Replace gsl::span for std::span (c++20) (#7531)
* Replace gsl::span for std::span (c++20)
* Replace gsl::byte with std::byte

Co-authored-by: Bevan Weiss <bevan.weiss@gmail.com>
2021-05-30 17:10:46 +03:00
kd-11 9e62e98f79
rsx: Minor refactoring (#10358)
- Fix some misnomers.
- Allow finer grained control over texture section creation routines.
2021-05-27 23:44:07 +01:00
Nekotekina 2491aad6f2 types.hpp: implement min_v<>, max_v<>, SignedInt, UnsignedInt, FPInt concepts
Restrict smax to only work with signed values for consistency.
Cleanup <climits> includes.
Cleanup <limits> includes.
2021-05-23 19:43:51 +03:00
Nekotekina 160b131de3 types.hpp: implement smin, smax, amin, amax
Rewritten the following global utility constants:
`umax` returns max number, restricted to unsigned.
`smax` returns max signed number, restricted to integrals.
`smin` returns min signed number, restricted to signed.
`amin` returns smin or zero, less restricted.
`amax` returns smax or umax, less restricted.

Fix operators == and <=> for synthesized rel-ops.
2021-05-22 12:10:57 +03:00
kd-11 c5a06dab0a rsx: Refactor program texture state handling to be persistent across shader swaps 2021-05-15 23:51:12 +03:00
octopoulo fe17c83020 reverted comment 2021-05-12 15:28:30 +03:00
octopoulo b8928d230a gl: Intel GPU shader fix 2021-05-12 15:28:30 +03:00
kd-11 1a73b0a0da rsx: Fix transfer barriers not triggering resolve target initialization 2021-05-12 12:32:24 +03:00
kd-11 e3944bc67f rsx: Handle transfer_read differently from transfer_write
- Transfer writes are expected to clobber surface cache contents. Do NOT reload from CPU memory for writes.
- TODO: During transfer write to surface cache objects, lock memory if it was unlocked to avoid silly problems.
2021-05-09 13:07:47 +03:00
Nekotekina f8e05f8e3c Remove redundant operators != 2021-04-29 22:57:40 +03:00
Megamouse 1caf81811a Move unspecific Emulator code out of System.cpp 2021-04-24 11:21:22 +03:00
kd-11 14a64e2529 rsx: Handle rare rounding issue where position.w is very close to zero 2021-04-13 21:26:23 +03:00
kd-11 06dc99ab85 rsx: Fix decompression of RB_RG textures.
- Removes several subtle hacks that hid the real issue.
  A compressed texture has more than one texel per 'block'.
2021-04-11 21:36:36 +03:00
Megamouse a16d8ba3ea More random changes 2021-04-11 14:01:51 +03:00
Nekotekina b3fb6d7d18 Add and fix -Wredundant-decls (GCC) 2021-03-23 22:48:57 +03:00
Nekotekina c22e1e71f0 Continue fixing strict aliasing warnings 2021-03-13 18:02:37 +03:00
Nekotekina 4adf412049 Fix std::bit_cast misuse 2021-03-10 16:11:30 +03:00
Nekotekina 9cbe77904d Revert changes in BufferUtils.cpp
Should fix #9933
2021-03-09 19:19:24 +03:00
Nekotekina a4fdbf0a88 Enable -Wstrict-aliasing=1 (GCC)
Fixed partially.
2021-03-09 03:10:15 +03:00
Nekotekina 53af2dbb3f Add/fix warning -Wignored-qualifiers (GCC/clang)
Fix simple_array::const_iterator as a part of it.
2021-03-09 03:09:50 +03:00
Nekotekina 87af905018 Enable -Wunused-parameter 2021-03-06 18:07:08 +03:00
Nekotekina ea5e837bd6 fixed_typemap.hpp: return reference 2021-03-02 16:08:14 +03:00
Nekotekina 2c18d67769 Fix -Wsometimed-uninitialized (Clang) 2021-02-18 14:15:52 +03:00
Nekotekina 038148bf06 Fix almost all GCC warnings 2021-02-17 22:59:04 +03:00
Nekotekina 8e6e57de86 Enable -Wunused-function warning 2021-02-15 14:39:53 +03:00
kd-11 eba7d3b172 rsx: Add duplicate section detection when there are too many sections in the surface cache
- Check for useless sections.
  Helps in games that create a bunch of sections randomly for one-time use
2021-02-14 20:42:34 +03:00
Alex Saveau 48296c2ba6
Fixup for multi-thread shader compilation (loading stage) (#9762)
* Multi-thread shader compilation

This offers a huge improvement in startup performance. With around 13,000 shaders we go from ~1:30 to under 10 seconds. It looks like this was the original intention of the author given the outer scope recompile variable.

Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
2021-02-12 14:39:35 +03:00
kd-11 195fb1cf66 rsx: Improve texture cache invalidate
- Bunch of improvements
- Properly signal renderer to rebind textures!
- TODO: Range checks, should be pretty easy
2021-02-10 11:37:14 +03:00
kd-11 52acc23ecf rsx: Fix protection bug 2021-02-10 11:37:14 +03:00
kd-11 bf66c36ba4 rsx/texture_cache: Add support for reusing dirty images if possible
- Avoids a silly situation where a texture is discarded and an identical copy created immediately afterward.
  Unfortunately allocating memory blocks is really slow so avoid it as much as possible.
2021-02-10 11:37:14 +03:00
kd-11 0c10f47e85 rsx: Lower cache block length to 256 pages
- Drastically lowers time wasted iterating blocks when many small objects
  are present
2021-02-10 11:37:14 +03:00
kd-11 1bad9a939f rsx: Refactor texture cache utils
- Also lays groundwork for optional hashed sections
2021-02-10 11:37:14 +03:00
kd-11 ddac4686a7 rsx: Clear vertex output register if nothing is written to it
- On NVIDIA GPUs, gl_Position is not initialized. Always clear to 0 to avoid on-screen crap
2021-02-05 22:22:07 +03:00
kd-11 a1ab6c28c1 vk/rsx: Fix some more bugs 2021-01-24 14:24:55 +03:00
kd-11 7766076042 rsx/vk: DMA stuff 2021-01-24 14:24:55 +03:00
kd-11 eb086b0e3f rsx: Add support shadow1D and shadowCube 2021-01-21 10:24:49 +03:00
kd-11 b6b9085773 rsx: Use unsigned variables to avoid sign problems when calculating stipple bits 2021-01-21 10:24:49 +03:00
Nekotekina 8a2a76da1e texture_cache: fix some warnings in AUDIT 2021-01-18 13:49:59 +03:00
Nekotekina 0af452720e RSX: Fix possible bug in memory streaming utils 2021-01-12 15:06:31 +03:00
Nekotekina db8e6fe7a7 Enable -Wunused-variable 2021-01-12 14:34:14 +03:00
kd-11 3f9b699eef rsx: Fix ambiguous call to min(float16_t, float) 2021-01-04 02:28:24 +03:00
kd-11 f87dd91b52 rsx: Allow attempted fetch of non-existent surface 2020-12-28 21:49:11 +03:00
kd-11 a96b4412d3 rsx: Do not rely on program env state, instead, always use program ucode analysis results when doing codegen
- Some things can be present in program env but not ucode state
  e.g A texture can be active and bound in a redirected manner but not actually be used in ucode
  In such a case, only the ucode analysis or decompilation can decide whether to inject decoding routines
2020-12-25 02:39:08 +03:00
kd-11 bee76fc8d1 rsx: Refactor shader codegen and fix shadow sampling on depth-float 2020-12-25 02:39:08 +03:00
Nekotekina a8e0d261b7 types.hpp: more cleanup
Also fix compilation.
2020-12-22 19:08:09 +03:00
Nekotekina bd269bccaf types.hpp: remove intrinsic includes
Replace v128 with u128 in some places.
Removed some unused files.
2020-12-21 21:11:25 +03:00
Nekotekina eec11bfba9 Move align helpers to util/asm.hpp
Also add some files:
GLTextureCache.cpp
VKTextureCache.cpp
2020-12-18 18:07:42 +03:00
Nekotekina db9b7db531 Cleanup and move sysinfo.h -> util/sysinfo.hpp 2020-12-18 12:55:54 +03:00
Nekotekina fb29933d3d Add usz alias for std::size_t 2020-12-18 12:23:53 +03:00