Commit graph

3700 commits

Author SHA1 Message Date
Eladash 21d725daa5 rsx: Fix shader cache of 2 or less pipelines 2021-01-03 01:47:07 +03:00
Eladash 247e90b3d0 rsx: Shaders cache loading and saving bugfixes
* Fixed crash whenever files are missing from the cache.
* Fixed crash whenever files are empty.
* Fixed crash whenever file creation/overwrite of cache files failed. (handled by fs::write_file)
* Fixed crash whenever there are any subdirectories inside the pipelines cache directories.
* Overwrite invalid shader cache files if encountered such.
* Optimizations have been added.
2021-01-03 01:47:07 +03:00
Megamouse 7262f3d0b7 VK: make static chip_family_tables const 2020-12-31 22:57:17 +03:00
Megamouse 6f80fd0063 VK: move static chip_family_tables to cpp 2020-12-31 22:57:17 +03:00
Megamouse 7a51b7a019 VK: move helpers to vkutils 2020-12-31 22:57:17 +03:00
Megamouse d9eb31000d VK: refactoring part 1 2020-12-31 22:57:17 +03:00
Eladash 7fc26b1fab rsx: Implement Texture LOD Bias addend setting 2020-12-30 15:37:21 +03:00
Eladash c0e121abef rsx: Fix RSXTexture.h spacing 2020-12-30 15:37:21 +03:00
Eladash 7db13fdeff rsx: Move Anisotropic Filter Override to RSX state 2020-12-30 15:37:21 +03:00
kd-11 18c120ab9f rsx: Revert an accidental deletion 2020-12-28 21:49:11 +03:00
kd-11 f87dd91b52 rsx: Allow attempted fetch of non-existent surface 2020-12-28 21:49:11 +03:00
Eladash 66581d115b vm: Fix access violations on super memory, support super memory in vm::get_addr 2020-12-26 17:56:49 +03:00
kd-11 a96b4412d3 rsx: Do not rely on program env state, instead, always use program ucode analysis results when doing codegen
- Some things can be present in program env but not ucode state
  e.g A texture can be active and bound in a redirected manner but not actually be used in ucode
  In such a case, only the ucode analysis or decompilation can decide whether to inject decoding routines
2020-12-25 02:39:08 +03:00
kd-11 bee76fc8d1 rsx: Refactor shader codegen and fix shadow sampling on depth-float 2020-12-25 02:39:08 +03:00
kd-11 d9cb1a6319 vk: Fix more spec violations 2020-12-25 02:39:08 +03:00
Nekotekina a8e0d261b7 types.hpp: more cleanup
Also fix compilation.
2020-12-22 19:08:09 +03:00
Nekotekina b7bf316c1a Don't randomly include "stdafx.h"
It's file for precompiled headers.
Include what is used, don't rely on transitive includes.
2020-12-22 14:32:30 +03:00
Nekotekina 41ee792f95 MSVC: remove MemLeak build support
There are better memleak detection tools.
1) Requires to guard placement new and external libs
2) Doesn't work thoroughly
2020-12-22 14:32:30 +03:00
Nekotekina bd269bccaf types.hpp: remove intrinsic includes
Replace v128 with u128 in some places.
Removed some unused files.
2020-12-21 21:11:25 +03:00
kd-11 e449111c33 vk: Fixup for renderpass issues 2020-12-19 12:58:44 +03:00
Megamouse 066e53da55 minor cleanup 2020-12-19 08:33:53 +01:00
Nekotekina eec11bfba9 Move align helpers to util/asm.hpp
Also add some files:
GLTextureCache.cpp
VKTextureCache.cpp
2020-12-18 18:07:42 +03:00
Nekotekina db9b7db531 Cleanup and move sysinfo.h -> util/sysinfo.hpp 2020-12-18 12:55:54 +03:00
Nekotekina 05099e2ae1 Replace uint64_t with u64 2020-12-18 12:23:53 +03:00
Nekotekina 77352a2a86 Replace uint32_t with u32 2020-12-18 12:23:53 +03:00
Nekotekina ae633292c0 Replace int32_t with s32 2020-12-18 12:23:53 +03:00
Nekotekina d6042cf891 Replace uint16_t with u16 2020-12-18 12:23:53 +03:00
Nekotekina 534c63bf57 Replace uint8_t with u8 2020-12-18 12:23:53 +03:00
Nekotekina fb29933d3d Add usz alias for std::size_t 2020-12-18 12:23:53 +03:00
Nekotekina 4cfa9b11f3 Move busy_wait() to asm.hpp 2020-12-18 12:23:53 +03:00
kd-11 fb47d1f788 vk: Register ampere GPU PCI IDs 2020-12-17 19:56:48 +03:00
kd-11 235db57f0e rsx: Do not reset vertex program texture mask when updating ucode analysis
- Fixes incorrect texture type detection in some games after program env/ucode separation
2020-12-17 09:36:20 +03:00
kd-11 cfbde005fb vk: Force ampere GPUs to use the slower but spec-compliant depth-color resize route
- TODO: More investigation and optimizations
2020-12-17 09:36:20 +03:00
kd-11 0a865bd9dc vk: Workaround for validation layers bug 2020-12-17 09:36:20 +03:00
Megamouse 0bfec59af8 Fix stop during shader compilation 2020-12-16 11:01:51 +03:00
kd-11 035a76f26d Fix build 2020-12-16 10:10:06 +03:00
kd-11 42f4e831a2 vk: Clean up some leftovers from shader decompiler rewrites 2020-12-16 10:10:06 +03:00
kd-11 d3686dbb75 rsx: Add some texture upload statistics to the texture cache 2020-12-16 10:10:06 +03:00
kd-11 fb1c790350 rsx: Make debug overlay dynamic 2020-12-16 10:10:06 +03:00
kd-11 0ef5743261 rsx: Fix sampler descriptor updates for framebuffer resources
- Each desc manages its own lifetime now instead of relying on global timestamp check
- Fixes situation where same object remains active without update for long
2020-12-16 10:10:06 +03:00
Nekotekina e321765c54 Split BEType.h to util/v128.hpp and util/to_endian.hpp 2020-12-13 16:34:45 +03:00
kd-11 f83c2f0b6b rsx: Restructure and simplify some header include chains 2020-12-13 15:38:35 +03:00
kd-11 d775c8dc73 rsx: Move shader analysis+prefetch to the end of the draw call 2020-12-13 15:38:35 +03:00
Nekotekina a6a5292cd7 Use uptr (std::uintptr_t alias) 2020-12-12 16:29:55 +03:00
Nekotekina b59f142d4e Move types.h to util/types.hpp 2020-12-12 15:12:01 +03:00
Nekotekina 666a18f5e5 Remove ceil2/floor2 from types.h 2020-12-12 15:12:01 +03:00
Nekotekina b09b7c1184 Remove any_pod<> from types.h
Add simplified any32 to GCM.h
Add simplified cmd64 to PPUThread.h
2020-12-12 13:12:39 +03:00
Nekotekina 6e05dcadb6 Reduce std::numeric_limits dependency
Please, stop pretending...
You need these templates for generic code.
In other words, in another templates.
Stop increasing compilation time for no reason.
2020-12-12 12:35:18 +03:00
Nekotekina bc7acf9f7a RSX: remove overly long integer sequence (opcode_list)
Convert to constexpr array and move to gcm_printing.cpp
2020-12-12 11:38:35 +03:00
Nekotekina 7a015b6fc0 VKMemAlloc.cpp: use shared_mutex in vk_mem_alloc.h
Because it allows to use custom implementation.
Also fix compilation.
2020-12-11 19:05:11 +03:00
Nekotekina aa3aef4beb std::chrono cleanup: always use steady_clock 2020-12-11 19:01:56 +03:00
Nekotekina 72284b4530 Use atomic_t<> in VKMemAlloc 2020-12-10 18:58:11 +03:00
Nekotekina b382d3b3e9 Remove ASSUME macro
It's dangerous and sometimes bluntly misused feature.
Its optimization potential is near-zero.
2020-12-10 14:08:02 +03:00
Nekotekina 36c8654fb8 Remove HERE macro
Some cleanup.
Add location to some functions.
2020-12-10 12:30:22 +03:00
kd-11 d25c401aec vk: Validate image creation inputs
- Should avoid things like res scaling breaking when very large scaling is in use
2020-12-09 21:30:08 +03:00
kd-11 aac874a842 vk: Add even more D32_SFLOAT missing locations 2020-12-09 21:30:08 +03:00
Nekotekina 5d934c8759 Improve narrow() and size32() with src_loc detection 2020-12-09 16:26:20 +03:00
Nekotekina e055d16b2c Replace verify() with ensure() with auto src location.
Expression ensure(x) returns x.
Using comma operator removed.
2020-12-09 15:43:38 +03:00
kd-11 f8d2830ac7
vk: Properly register D32_SFLOAT as a depth-stencil format (#9396) 2020-12-09 02:06:27 +00:00
Eladash 2602be426f
Allow emulation to work without firmware (#9367)
* Allow emulation to work without firmware
* Fix HLE prx path detection.
* Fix manual list loading bugs.
* Fix HLE gcm
* GUI: Fix fonts search
* GUI: Hardcode sprx list
Do not depend on /dev_flash/sys/external/ contents.
2020-12-07 20:10:34 +03:00
Nekotekina 24e4e329ed atomic.hpp: add atomic_t<bool> specialization
May be required in future, plus adds/hides some methods.
2020-12-07 17:13:12 +03:00
Nekotekina eb66302907 atomic.hpp: replace std::atomic with atomic_t
Dual dependency is nothing good.
2020-12-07 17:13:12 +03:00
Nekotekina b16cc618b5 atomic.hpp: add some features and optimizations
Add atomic_t<>::observe() (relaxed load)
Add atomic_fence_XXX() (barrier functions)
Get rid of MFENCE instruction, replace with no-op LOCK OR on stack.
Remove <atomic> dependence from stdafx.h and relevant headers.
2020-12-07 17:13:12 +03:00
kd-11 3a0b3a85a5 rsx: Separate program environment state from program ucode state
- Allows for conservative texture uploads
- Allows to update a program object without running full ucode analysis for no reason
2020-12-07 00:45:27 +03:00
RipleyTom af8c661a64 Remove BOM markers 2020-12-06 15:30:12 +03:00
kd-11 2aa5c437e8 rsx: Fix upscaled image reconstruction
- Base the upscaling on the real source and not the "attr" parameter.
- In case of reconstruction, the source is much larger than the subslice in "attr"
2020-11-30 01:20:17 +03:00
kd-11 67f48ce21c rsx: Fix uncaught depth-func changes
- Depth func of always or never usually disqualifies depth testing.
  Invalidate contested surfaces when depth func is changed.
2020-11-30 00:46:36 +03:00
kd-11 845a7d9968 Fix RSX replay thread lifecycle 2020-11-29 22:39:52 +03:00
kd-11 8228a4adcd gl: Disable depth test before rendering text to the backbuffer which does have a Z buffer 2020-11-24 11:10:43 +03:00
kd-11 f1c65dcefc vk: Improve support for RDNA
- Add support for RDNA2
- Add RDNA MSAA workaround
2020-11-23 19:20:00 +03:00
kd-11 cab4c78b7b rsx: Some shader compiler threads tuning
- Allow more threads for wide CPUs
- Simplify 'auto' selection a bit
2020-11-21 20:43:15 +03:00
kd-11 7553429130 gl: Thread shader source compilation dispatch
- glCompileShader is in itself much slower than anticipated
2020-11-21 20:43:15 +03:00
kd-11 3ddfa288cf rsx: Use multithreaded shader compiler backend 2020-11-21 20:43:15 +03:00
Nekotekina 71f1021648 Fix thread pool entry point and get_cycles()
Fix possible race between thread handle availability.
Don't treat zero thread as invalid one.
Now entry point is full is assembly.
Attempt to fix #9282
Also fix some TLS.
2020-11-21 17:18:42 +03:00
kd-11 0e7a705254 rsx: Resolution scaling overhaul
- Enforce square pixels instead of per-axis scaling
2020-11-18 09:29:34 +03:00
Eladash fefab50e06 Fix vm::range_lock, imporve vm::check_addr 2020-11-11 10:30:09 +03:00
Nekotekina 21ec32b465 vm: implement g_shmem for range locks
Renamed from g_shareable. Contains pointers instead of bits.
Used in range locks to prevent any "collision" between memory.
2020-11-08 16:43:15 +03:00
Nekotekina 1c99a2e7fb vm: add map_self() method to utils::shm
Add complementary unmap_self() method.
Move VirtualMemory to util/vm.hpp
Minor associated include cleanup.
Move asm.h to util/asm.hpp
2020-11-08 16:43:15 +03:00
Megamouse a3eb5c2d63 More Header cleanup 2020-11-06 22:14:05 +01:00
Nekotekina ba5ed5f380 Fix vm::lock_range wrong check
Minor header refactoring.
2020-11-04 14:59:26 +03:00
Megamouse 027eba2b59 Add Quadro to fast version of VK get_driver_vendor 2020-11-04 12:13:56 +01:00
Megamouse 91b8e7504e Set some things to log level always 2020-11-04 12:13:56 +01:00
Megamouse 36149fd986
overlays: kinda fix performance graph margins (#9181) 2020-10-31 16:32:31 +00:00
Megamouse 2cee26c3e7 Cleanup some includes 2020-10-31 11:53:46 +01:00
Megamouse 4984e87776 implement interception for cellKb and cellMouse
this needs to be tested
2020-10-31 02:11:27 +03:00
Eladash b5014d56ab rsx: Fix transform contants load 2020-10-31 02:08:03 +03:00
Nekotekina 13c564f2af sys_memory: add cpu_flag::wait 2020-10-30 18:09:30 +03:00
Nekotekina 95aeebe4b5 sys_rsx: add cpu_flag::wait 2020-10-30 18:00:25 +03:00
Nekotekina 605d57c541 sys_event: cleanup (replace vm::temporary_unlock)
Also made minor changes in sys_rsx.cpp.
Removed unused exception std headers.
2020-10-30 17:49:07 +03:00
Nekotekina f972fa26a4 Derive RSX Replay thread from cpu_thread
Its id is set to 0, so fix some id_type() usages.
2020-10-30 17:36:11 +03:00
Nekotekina 13de773486 Remove some vm::reservation_lock instances 2020-10-27 17:56:19 +03:00
kd-11 b32eecb5a7
rsx: Driver compatibility improvements (#9131)
* rsx: Refactor vertex clip emit to avoid using f64 unnecessarily

- Fixes driver crash on intel

* vk: Add NVIDIA driver version check

- Warn if user has outdated drivers with known problems
2020-10-27 13:22:15 +03:00
kd-11 18ca3ed449 rsx: Block-level reservation access 2020-10-25 20:21:04 +03:00
Nekotekina 306593a0c5 Revert "atomic.cpp: fixup for WaitOnAddress path"
This reverts commit 3b8bce1bed.
2020-10-21 09:54:22 +03:00
Nekotekina 3b8bce1bed atomic.cpp: fixup for WaitOnAddress path
Also fix wait quantum.
2020-10-21 08:18:27 +03:00
kd-11 a90801e2aa vk: Add VK_FORMAT_D32_SFLOAT to format conversion table
- This format is required to emulate RSX_FORMAT_CLASS_D16_FLOAT
2020-10-18 19:30:40 +03:00
Nekotekina 492ed27495 RSX: fix rsx::nv406e::semaphore_release partially
Properly release reservation (non-TSX path).
At least update and notify reservation (TSX).
2020-10-15 20:58:59 +03:00
Nekotekina dcff8c2637 Fix remaining vm::reservation_lock usages (for now)
Optimization can be restored later.
2020-10-13 12:04:59 +03:00
Nekotekina 346a1d4433 vm: rewrite reservation bits
Implement classic unique/shared locking concept.
Implement vm::reservation_light_op.
2020-10-10 13:58:48 +03:00
kd-11 bca3a3f4ed
vk: Open CB before doing frame cleanup so that callbacks work (#9041) 2020-10-10 10:02:55 +01:00
kd-11 d5f7e7b179 vk: Reimplement GPU query management 2020-10-06 12:02:53 +03:00
Eladash f4ca6f02a1 PPU: Implement support for 128-byte reservations coherency 2020-09-28 22:34:42 +03:00
kd-11 04ff7913b4 rsx/codegen: Workaround for borked hardware
- Bitwise or does not evaluate correctly for some hardware.
  Substitute with subtraction instead.
2020-09-28 22:18:36 +03:00
kd-11 9baef8c705 rsx: Emit simpler fragment program code
- Optimize clamp16
- Use bfe instead of shift-and
2020-09-27 18:56:04 +03:00
kd-11 a14a358b73 rsx: Optimize vertex decoder to generate simpler code
- Significantly improves compilation speed by simplifying most of the code and doing something similar to LICM.
  * Actual decoding is now vectorized and performed in one step rather than in a loop.
  * Switches inside loops are removed and replaced with simple comparison. Generates much nicer (and smaller) GCN bytecode.
2020-09-27 18:56:04 +03:00
kd-11 259844f4f3 vk: Disable spirv optimizer
- I've not found it to be very useful and it just breaks good code right now.
  TODO: Re-enable when things improve.
2020-09-27 18:56:04 +03:00
kd-11 9ea478008c vk: Properly initialize float64 support for SPIRV 2020-09-24 01:30:21 +03:00
Morgan Creekmore 4fe2951509 Fixed formatting 2020-09-23 20:27:30 +03:00
Morgan Creekmore ff8a94714f Add additional NV4097 methods to gcm_printing 2020-09-23 20:27:30 +03:00
Morgan Creekmore b45d6fee2d Fix methods_name indentation 2020-09-23 20:27:30 +03:00
Morgan Creekmore f44e696edf Added missing NV methods for gcm_printing 2020-09-23 20:27:30 +03:00
kd-11 a50ea09053 rsx: Properly pass format_class information during RTV/DSV resource barrier
- Also takes the opportunity to remove repeating code in a minor refactor.
2020-09-22 12:19:54 +03:00
kd-11 d012abd924 vk: Improve image transfer and scaling
- Handle typeless src and dst with aliased typeless format
- Optimize typeless transfers by only dealing with affected texels.
  * Eliminates redundant dst->typeless transfer of full image (very expensive)
  * Eliminates full src->typeless transfer of full image and replaces with only affected region
  * Requires significantly smaller output buffers, saving on VRAM cost
2020-09-22 12:19:54 +03:00
kd-11 7ed82c0791 rsx: Always force typeless copy if memory is crossing aspect boundary 2020-09-22 12:19:54 +03:00
kd-11 9db97278f3 rsx: Lower error message to warning
- Mismatched texture handling is a TODO that will be handled with texturing rewrites
2020-09-19 01:55:59 +03:00
kd-11 d3898fda57 rsx: Release misconfigured texture memory before attempting reupload 2020-09-19 01:55:59 +03:00
kd-11 7900780cea vk: Fix nul section crash due to unexpected format (B8) 2020-09-16 20:14:44 +03:00
kd-11 92d65ff3c2 rsx: Add support for mixed data types when sampling shadow coordinates 2020-09-15 17:37:52 +03:00
Megamouse a2da187615 HLE: localize most - if not all - exposed strings 2020-09-14 18:24:18 +02:00
Megamouse d0ffbbfc4d Qt/overlays: use Argument list for translatable strings
This is somewhat crippled for now. It only takes a single argument in the callback
2020-09-14 18:24:18 +02:00
Megamouse 460a933267 Qt/overlays: Localize most rsx overlays 2020-09-14 18:24:18 +02:00
kd-11 da6760ed98 vk: Simplify shadow comparison operations for non-integer formats
- Just use hardware PCF, it makes everyone's life easier.
2020-09-09 22:11:12 +03:00
kd-11 6380e67af9 rsx: Fix depth clipping
- Fix special case where n=f making (f-n) = 0
- Dynamically update depth range by setting dirty bits
- Fix depth bounds when n=f and bounds test is disabled
2020-09-08 15:33:08 +03:00
kd-11 dc465df3bc rsx: Enable support for extended range in depth buffer
- Software clipping emulation is used here as OpenGL does not have explicit clip control.
- Hardware clip control for vulkan to be enabled after this.
2020-09-08 15:33:08 +03:00
kd-11 2e88924cb9 rsx/gl: Refactoring and cleanup
- Fix incorrect memory requirement calculation for D32FS8X24_PACK64 data type on GL
- Removes a lot of spaghetti code in GL backend from years of accumulation
- Retires several now-useless methods from RSX util toolbox
2020-09-08 13:53:06 +03:00
kd-11 6d2cb94e3e gl/vk: Support swizzled data for RCB/RDB 2020-09-07 22:31:57 +03:00
kd-11 85e5b077f7 gl: Overhaul upload and download routines for textures to go through shared image_to_buffer and buffer_to_image routines.
- This automatically adds support for depth float textures as well
2020-09-07 18:25:54 +03:00
kd-11 85dd1b4ea9 vk: Fix fconvert job issues
- Fix compilation bug caused by typo
- Invert to/from for consistent declarations
- Fix dst_swap when From == 2
2020-09-07 18:25:54 +03:00
kd-11 220e86bbd1 gl: Accelerate D24X8_UINT operations
- Adds compute decoding for D24X8_UINT on both download and upload routines
- Adds support for D24X8_UINT operations for typeless copy
2020-09-07 18:25:54 +03:00
Bevan Weiss baf96b3eb6 RSX: Update manual string creation -> std::string()
Replace manual string creation with call to std::string() constructor passing in char*
This appears to drastically reduce the cache impact here
2020-09-05 10:38:32 +03:00
kd-11 3c43d8fe05 rsx: Fix execution barrier insertion
- In case of element re-arrangement, the barrier should obey the current insertion pointer
2020-09-04 09:34:13 +03:00
Eladash 73d23eb6e6
SPU: Implement Accurate DMA (#8822) 2020-09-02 23:58:29 +02:00
kd-11 a917f55ef8
vk/sdk: Sync with vulkan SDK v148 (#8814)
- Sync with vulkan SDK 148
- glslang library was split into several smaller libraries
- HLSL is no longer needed
2020-09-01 00:57:38 +03:00
kd-11 af9e217fa4 vk: Improve D16F handling
- Adds upload and download routines. Mostly untested, which is why the error message exists
2020-08-30 09:26:37 +03:00
kd-11 e9cdb248a0 glsl: Properly implement shadow filtering when running emulated shadow compare
- Previous code was completely borked
2020-08-29 02:03:09 +01:00
kd-11 e8274d5a59 vk: Fix depth format mismatch detection in copy_image 2020-08-29 02:03:09 +01:00
kd-11 d000d648b0 vk: Fix some minor spec violation
- Stencil clear pass does not consume an image, do not bind one.
- Add push_barrier to allow push-pop semantics for texture barrier insert.
2020-08-27 12:52:28 +03:00
kd-11 d257ba5156 vk: Add some more diagnostic messages for unoptimized image transfer setups 2020-08-27 12:52:28 +03:00
kd-11 9828d6146b rsx: Fix format matching when aggregating textures
- When copying depth-depth, prefer own format over depth int format
2020-08-27 12:52:28 +03:00
kd-11 9e4bec8cec vk: Fix some missing render target declarations 2020-08-27 12:52:28 +03:00
kd-11 65ead08880 rsx: Refactor and improve image memory manipulation routines 2020-08-27 12:52:28 +03:00
kd-11 f6c6c04648 vk: Implement transport for D24S8_FLOAT data 2020-08-27 12:52:28 +03:00
kd-11 794378d5e9 rsx: Do not create depth textures as blit engine targets. 2020-08-27 12:52:28 +03:00
kd-11 a5ac5a9861 rsx: Separate uint depth formats from float depth formats 2020-08-27 12:52:28 +03:00
kd-11 faaf28b41d rsx: Basic support for creating depth float formats 2020-08-27 12:52:28 +03:00
Eladash 853e2b90a3 rsx: Minor rsx::ceil_log2 bugfix 2020-08-15 20:39:21 +03:00
kd-11 fd2607ad52 rsx: Fix XBGR vs XRGB screenshots 2020-08-12 20:19:19 +03:00
kd-11 7e1b24224d rsx: Support XBGR flip image load from Cell memory 2020-08-12 20:19:19 +03:00
kd-11 56c63170b9 vk: Warn if GPU does not support RGBA8 natively 2020-08-12 20:19:19 +03:00
kd-11 b41349546c rsx: Proper support for typeless transform of ABGR framebuffers using the RGBA8 format 2020-08-12 20:19:19 +03:00
kd-11 6850533b50 rsx: Unify composite texture creation and management
- Some texture accesses require image compositing steps to assemble the requested image from existing subresources.
  Handle all the common routines in a unified manner to avoid having one broken path (e.g mipmap gather not supporting bitcast operations)
2020-08-10 13:31:22 +03:00
kd-11 22f5e7a9be vk: Generate valid image+image_view combinations for placeholder texture descriptors 2020-08-06 20:34:49 +03:00
kd-11 7109fe9889 rsx: Improve swizzled layout detection
- Reset swizzle flag to false automatically on section reset.
- Detect render target payload and extract swizzle information from it.
2020-08-05 23:23:38 +03:00
kd-11 bd21930d1a rsx: Decode swizzled GPU data on CPU readback
- Currently this conversion is being done on the CPU to reuse as much code as possible.
  The expectation is that this almost never happens, so there is not point in increasing maintenance burden by adding compute paths
2020-08-02 16:14:11 +03:00
kd-11 4df933275b rsx: Propagate raster type of fbo sourced data throughout the pipeline.
- Tracks which kind of raster was done (Z-ordered vs linear) throughout the application.
- This allows to identify if data is in the expected format or not.
2020-08-02 16:14:11 +03:00
Megamouse f073ff8fe8 overlays: fix minor warning 2020-07-29 13:18:33 +02:00
Bevan Weiss 7898ae6fe6 VK_REMAP enum is signed.. but later case comparison is unsigned
another clang directed fix up... might be involved with swizzle..
2020-07-26 15:27:51 +03:00
Ani 74c8a44d84
rsx: Fix cache skipping shaders on load+compile (#8633)
When on single worker mode (OpenGL or an hypothetical scenario of a 
single core PC with Vulkan), load and compile would skip 10% of the 
shaders on queue each stage.

Co-authored-by: kd-11 <karokidii@gmail.com>
2020-07-26 08:47:50 +01:00
kd-11 be4b71b805 vk: Fixup for PR #8590
- This change was lost during rebase
2020-07-25 14:48:11 +03:00
kd-11 b0c7ca6d1f vk: Improve video memory manager to attempt recovery in out of memory situations 2020-07-25 14:48:11 +03:00
kd-11 4d8de282f9 vk: Improve typeless texture succession
- Ensure incoming texture is large enough that the original one fits inside it to avoid back-and-forth succession.
- Make use of the resource manager to remove the obsolete textures to avoid holding on to the them which "leaks" VRAM.
  The memory isn't leaking, it's just wasting space in temporary pool until renderer is closed.
2020-07-25 14:48:11 +03:00
Megamouse de80a4b6c7 overlays: try to fix unexpected font crop 2020-07-25 10:21:52 +03:00
Eladash 268bcd1c7b rsx: Fix false desync events 2020-07-16 19:26:10 +02:00
kd-11 42a9ac9e6c rsx: Brute-force removal of superseded surfaces 2020-07-16 19:11:26 +03:00
kd-11 182b20c33d rsx: Fix draw count append when draw ranges are out of order
- It is common for draw counts to truncate at 256 even when it makes no sense to do so.
- e.g 256 is not a multiple of 3 so triangles will glitch out
2020-07-14 16:04:44 +03:00
kd-11 ab3d36f0f3 rsx: Fix depth bounds test
- Allow depth bounds test to access the Z buffer even when depth test is disabled.
2020-07-10 15:59:15 +03:00
kd-11 632af8d723 rsx: Support partial texture descriptors
- It is safe to declare w > pitch and it works as long as sampling inside the legal 2D area is obeyed.
2020-07-10 15:26:07 +03:00
kd-11 987ede2e6c vk: Inject memory barrier upon conclusion of a framebuffer feedback loop
- Do not write to the texture until previous draw call is completed using it.
- This is usually not much of a problem until blending operations come into play.
2020-07-08 19:23:29 +03:00
kd-11 05dc6ad610 gl: Silence warnings 2020-07-05 16:58:44 +03:00
kd-11 3fe8499956 rsx: Improve ZCULL queued requests finalization
- Unifies the code
- Allows conditionals to be evaluated with a forwarder present
2020-07-05 16:58:44 +03:00
kd-11 acf51f0ead rsx: Fix transfer descriptors for partially overlapping slices in head
- Height must be corrected to skip the piece that exists before the current slice
2020-07-03 14:29:54 +03:00
kd-11 c9c0d7361d rsx: Implement fast ZCULL barrier when query object is already known 2020-07-02 20:11:57 +03:00
kd-11 d7ffc8b4ac rsx: Fix leaking ZCULL queries after a barrier
- Multiple queries can be queued up, process them all before completing the barrier
2020-07-02 20:11:57 +03:00
Bird Egop eea12bad07
Implement Caret upwards and downwards move in overlay_edit_text (#8342)
* Implement caret upwards and downwards move in overlay_edit_text

* Optimize caret up and down movement

Co-authored-by: Megamouse <studienricky89@googlemail.com>
2020-07-02 09:06:37 +02:00
kd-11 bd14429f20 vk: Disable primitive restart for old GCN cards
- Also adds more Navi 14 chips to detection table
2020-06-30 15:00:07 +03:00
kd-11 5e29bdbe22 vk: Add more GPUs to nvidia chip table
- Registers more TU116 and TU117 variants
- Registers GA100
2020-06-28 22:54:58 +03:00
kd-11 b437794e92 vk: Improve nvidia speedhack for non-turing cards
- Inverts the chip family check to skip any unidentified GPUs altogether
2020-06-28 22:54:58 +03:00
kd-11 5ea6535fd5 rsx: Force flushing of NaN/INF to zero
- This option was always enabled for NVIDIA cards, but it seems some games would benefit from the option on other GPUs as well.
- TODO: Hwtest to verify correct behavior and plan how to safely implement in hw
2020-06-26 09:24:15 +03:00
kd-11 628cb1c779 rsx: Validate blend factors according to hardware testing 2020-06-23 12:15:02 +03:00
kd-11 a14e0a0104 rsx: Validate stencil op to match realhw behavior 2020-06-23 12:15:02 +03:00
kd-11 f3637cdfdb rsx: Fix surface options hint mechanism
- Silly typo
2020-06-23 12:15:02 +03:00
kd-11 c6a9a5d5d7 rsx/fixup: Fix color clear logic
- Enable fast clears on ABGR formats in vulkan
- Fix disabling color clears for unsupported formats in GL
2020-06-23 12:15:02 +03:00
kd-11 7f917c8ba5 rsx: Fix ABGR decoding for colormask and clear color
- The bytes in these values are based on the format according to hw tests
- G8B8 is unaffected as the first two bytes are already G8B8 for A8R8G8B8 standard layout (BGRA)
- A8B8G8R8 and its derivatives have words 0 and 2 exchanged.
2020-06-22 20:12:41 +03:00
kd-11 e992cbe01b rsx: Support DRGB8 sampling of render targets 2020-06-22 20:12:41 +03:00
kd-11 2086e7f2e8 rsx: Account for subpixel precision when converting DST coordinates to
SRC coordinates

- When extracting a 1x1 texture from another texture of a different
  format, width conversion can result in a dimension of 0 if the
extracted texel is not a full texel in SRC
2020-06-17 22:18:47 +03:00
kd-11 c764925b4d rsx: Properly handle conversion of G8B8 and related formats
- These formats are 16-bit packed, not separate 8-bit channels. Conversion requires byteswap for them.
2020-06-16 22:36:38 +03:00
kd-11 83d818d96f rsx: Improve mipmap gathering
- Account for source offsets when grabbing subregions
- Scale input accordingly when sourcing from fbo in all paths
2020-06-16 19:12:03 +03:00
sampletext32 0ad4e91001 Avoid string reallocation in swizzle CgBinaryProgram 2020-06-15 22:26:49 +03:00
kd-11 8d8fb4a2e4 rsx: Remove ARGB->D24S8 conversion shader which has been deprecated for years since compute capabilities were added to the emulator 2020-06-15 14:18:12 +03:00
kd-11 2e737ad483 rsx: Fix surface option invalidation
- Depth buffers can be in special "read" state when writes are disabled. Account for this.
2020-06-14 23:30:03 +03:00
Eladash e2248332ae rsx: Make "Disable ZCull Occlusion" setting dynamic 2020-06-14 18:45:46 +01:00
kd-11 3663a8ab4d rsx: Improve surface options invalidation 2020-06-14 20:13:12 +03:00
Eladash bd6fdf3f2d rsx: Optimize rsx::rsx_iomap_table construction 2020-06-12 22:40:58 +03:00
kd-11 e1183f6919 gl: Fix depth buffer byteswap hint
- uint24_8 is not actually swapped, it is decoded in a special way
2020-06-12 20:49:47 +03:00
kd-11 f4ec28d932 rsx: Merge instruction expand flag with the other sign expand flags
- Avoids double expansion when both the exp_tex flag is set AND the texture also is sampled as signed
- Should fix missing eyeballs in Mass Effect 1 with the previous sign expansion fix
2020-06-12 20:19:20 +03:00
kd-11 ce587f43a0 rsx: Implement signed normalized texture formats
- Already partially supported via EXP option in the shader opcode, but format decoding was disabled.
- Noticed in some UE3 games which use _SNORM variants on PC but _UNORM on rpcs3
2020-06-12 20:19:20 +03:00
kd-11 ebbf329b6a gl: Improve async compiler synchronization with initialization
- On multithreaded mesa, the program initialization routine was not
  being flushed correctly. Set up synchronization fence after initialization
is complete.
2020-06-07 12:54:34 +03:00
kd-11 87cc937d4e rsx/fp: Separate SRC precision modifiers
- SRC0, SRC1 and SRC2 have different bits for precision modifiers all stored inside SRC1
- This explains the strange observed behavior of the MAD instruction which has 3 inputs
2020-06-07 12:07:27 +03:00
kd-11 d47d597b34 vk: Make the depth-convert pass multisample aware 2020-06-05 17:19:24 +03:00
kd-11 69c2150fbd vk: Fix query reset when renderpass is active
- Performs delayed query reset on-demand
2020-06-05 17:19:24 +03:00
kd-11 650152e05f rsx: Fix fragment state updates
- Fix copypasta for POLYGON_STIPPLE_PATTERN vs SET_POLYGON_STIPPLE method binding
- Use proper enums for ROP_control bits to avoid confusion
2020-06-03 22:05:33 +03:00
kd-11 73fe9b51de rsx/fp: Ignore self-referencing register writes.
- Sometimes, usually with shaders that do pack/unpack operations, there is a write-to-self operation.
  For example r0.xy = r0.xy
  Obviously no new data was introduced into "r0" by this, so we should not mark the register as having new data.

- TODO: Investigate on realhw if self-reference is needed to "cast" the overlapping half registers to their full register counterparts.
2020-06-03 09:45:02 +03:00
kd-11 26b2e4253d rsx: Properly account for memory sizes of reused surfaces 2020-06-02 21:37:57 +03:00
kd-11 b353bf6c56 rsx: Improve surface cache resource management
- Do not allocate too many objects. This is a problem in games using dynamic memory allocators that can make it rare for a surface to fall on the same address twice, keeping zombie RTVs and DSVs alive much longer than needed.
- Current limit used is 256M of virtual VRAM which is impossible on retail PS3
2020-06-01 22:24:27 +03:00
kd-11 c9a978a03e Typo fix
Co-authored-by: Megamouse <studienricky89@googlemail.com>
2020-05-30 14:47:10 +03:00
kd-11 59d44cd1cc gl: Fix shader logging 2020-05-30 14:47:10 +03:00
kd-11 542a6aed51 rsx: Add stippled rendering support to interpreters 2020-05-30 14:47:10 +03:00
kd-11 1677618c75 rsx: Implement stippled rendering 2020-05-30 14:47:10 +03:00
Eladash 3d20ce70f5 rsx: Fix possible case NULL zcull_ctrl in on_exit() 2020-05-28 11:56:02 +02:00
xddxd f56b362769
rsx: Copypasta fix (#8289) 2020-05-25 20:07:11 +01:00
kd-11 224a0e4b1a rsx: Fix data format remapping
- Includes missing 0xEE and 0x44 variants of the 2-component data format remapper
2020-05-24 13:51:19 +03:00
kd-11 bd41a108d8 nv3089: Account for subpixel addressing
- Those strange offsets noted in some games seem to match to subpixel addressing.
  For example, when scaling down by a factor of 4, a pixel offset of 2 will end up inside pixel 0 of the output
2020-05-24 11:31:37 +03:00
kd-11 7080305d82 vk: Implement masked stencil buffer clears
- Partial stencil buffer clears were not implemented. This is for example where a game can choose to clear only some bits from the stencil buffer.
- Vulkan does not support masked stencil clears natively, it has to be implemented as a graphics operation.
- Also refactors vulkan overlay passes to use global resource system instead of forcing the render backend to own all of them and manage lifetimes.
2020-05-21 19:27:23 +03:00
Nekotekina 72fedccaba rsx_utils.h: fix signed/unsigned comparison 2020-05-18 00:51:57 +03:00
Ani 581176fb1a gl: Restrict insert_vertex_input_fetch workaround to Intel proprietary
It works fine on Mesa iris
Fixes detection of Mesa as recent Mesa does not have "x.org" on vendor string, allowing vendor_MESA to become true instead of vendor_INTEL on Mesa Intel
2020-05-17 17:49:14 +03:00
Eladash 377e2ce3e8
rsx: Write 4-byte long data to all semaphores (#8246)
* rsx: Write 4-byte long data to all semaphores
2020-05-17 17:48:35 +03:00
kd-11 37df3c6f96 rsx/fp: Fix precision clamping on MAD instruction 2020-05-17 09:11:26 +03:00
AniLeo a8bca8b2ed gl: Only log shaders if g_cfg.video.log_programs is enabled 2020-05-16 16:16:17 +01:00
AniLeo b0d3c4d75e gl: Refactor shader type usage
Use Common/GLSLTypes.h program_domain instead of duplicated own internal
type
2020-05-16 16:16:17 +01:00
AniLeo 3db2f23e02 gl: Refactor shader compilation 2020-05-16 16:16:17 +01:00
Ani 661636efef gl: Check for EXT_depth_bounds_test
Avoid glEnable/glDisable GL_DEPTH_BOUNDS_TEST_EXT flood that returns 
GL_INVALID_ENUM if the feature isn't supported
2020-05-16 14:50:05 +01:00
AniLeo 99f5145aab glsl: Avoid implicit int->uint conversions
Silences debug output regarding implicit int -> uint conversions
2020-05-16 11:45:59 +01:00
Eladash 8a0425570c
rsx: Fix data written to RSX semaphores and the initial data of them (#8235) 2020-05-16 09:55:56 +01:00
AniLeo eecb22e749 gl: Remove older debug code
KHR_DEBUG path makes this obsolete, usage of this older path has been 
removed a long time ago
2020-05-16 08:29:00 +01:00
AniLeo 3ad12cf5f8 gl: Avoid issuing glDelete calls with m_id == GL_NONE
Applies GL_NONE macro usage where it makes sense
2020-05-16 08:29:00 +01:00
AniLeo 308cdfac35 gl: Rewrite Debug Output
Remove Windows restriction, enable Debug Output for every supported OS
Filter our severity by Khronos severity
Handle and log source and type enums
2020-05-16 08:29:00 +01:00
Nekotekina 7824419bbf Remove std::rotr usage for now
It seems to be missing on some std implementations.
2020-05-14 21:42:44 +03:00
Mrlinkwii c22d778143
Spelling fix in texture_cache.h (#8219)
heurestic_end -> heuristic_end
2020-05-14 21:42:21 +03:00
kd-11 310f367fb1
rsx: Improve blit engine memory validation (#8215)
- In blit engine logic there is a tendancy to over-allocate so as to avoid having to sticth together textures later
- Sometimes this can lead to out of bounds access and crash applications, so memory must be validated
2020-05-14 12:57:58 +01:00
kd-11 cc723ed45c vk: Properly fix dynamic state descriptors 2020-05-13 22:20:43 +03:00
kd-11 ed82288c1b rsx/fp: Support more types of texture access
- Allows more instructions to correctly decode depth textures
2020-05-13 22:20:43 +03:00
kd-11 118bfbbe98 rsx: Rewrite data texture remap expansion
- 16-bit channel formats have special 0x4|0xE encoding for only 2 channels, not 4
- float textures do not take any remapping and crash if you try it.
  Depth float textures work fine though.
2020-05-13 14:33:22 +03:00
JohnHolmesII 69ea573b0d
vk: Remove more deprecated VK_DYNAMIC_STATE_RANGE_SIZE usage (#8206) 2020-05-13 02:52:01 +01:00
kd-11 dd9397473a
vk: Remove deprecated enum VK_DYNAMIC_STATE_RANGE_SIZE (#8202) 2020-05-12 22:36:55 +01:00
kd-11 b6e8560532 rsx/fp: Fix PK2/UP2 instruction
- These variants take unsigned scalar inputs, not signed.
- Fixes ARGB8->X16Y16 in SR: Gat out of Hell
2020-05-11 09:37:00 +01:00
kd-11 79e2a87bc5 rsx: Fix NOP shader passing
- NOP shaders are used to stub rendering when a pass is supposed to be disabled
2020-05-10 21:54:34 +03:00
kd-11 14969cd8d0 rsx: Disable SCA writes to output register if vec result flag is set.
- Noticed when debugging X-men origins: wolverine which has a bogus SCA op whilst writing vector to output
- It makes no sense for both SCA and VEC to both write to the same register in the same instruction as memory ordering becomes an issue
2020-05-08 14:35:07 +03:00
kd-11 79c54aeba9 rsx: Move analyser dump to its own config option 2020-05-08 14:35:07 +03:00
kd-11 a1b6415c5a rsx: Fix alpha ref
- The alpha ref register is compared directly to the ROP output register in realhw
- alpha ref content must match bit-width of ROP register, which means fp16 values are possible
2020-05-08 00:02:47 +03:00
kd-11 a3f25bc7c7 rsx/interpreter: Fix DIVSQ instruction 2020-05-05 13:18:03 +03:00
kd-11 a0f63a31e3 vk: Enable optimization passes for generated SPIRV 2020-05-05 13:18:03 +03:00
kd-11 4f7c020e63 glsl: Improve VGPR usage
- VGPR usage lowered from 159 -> 127 for texturing. Occupancy doubled from 1 to 2
- Eliminate most temporary registers
2020-05-05 13:18:03 +03:00
Nekotekina cda8b3a59e RSX: fix new warnings 2020-05-01 22:00:57 +03:00
Megamouse 8f0af6a6fe rsx/interpreter: merge shader settings
- merge disable_asynchronous_shader_compiler and interpreter_mode
- removes disable_asynchronous_shader_compiler setting
- Adds the resulting settings as radio buttons to the gui tab
2020-04-30 15:02:59 +03:00
kd-11 2281c4f662 Fix build 2020-04-30 15:02:59 +03:00
kd-11 2ed50ba263 rsx/interpreter: Improve instruction accuracy
- Fix DIV instruction
- Add EXP_TEX modifier
- Implement WPOS register read
- Swap 3D and Cubemap enums to match RSX ids
- Adds two extra instruction classes: flow control and packing control
- Implement remaining FP instructions with exception of the rare projected texture lookups
- Fix typo causing output color index > 0 to not work
- Fix KIL instruction
- Implement conditional vertex program writes
2020-04-30 15:02:59 +03:00
kd-11 fc5b4026e1 vk: Implement optimized pipeline cache 2020-04-30 15:02:59 +03:00
kd-11 bc5c4c9205 rsx/gl: Implement variable path interpreter for optimal performance 2020-04-30 15:02:59 +03:00
kd-11 930bc9179d rsx/interpreter: Improve instructions support
- Must statically write the gl_ClipDistance registers else you get uninitialized trash.
  This problem is more readily apparent on NVIDIA technology but even AMD is not completely immune.
2020-04-30 15:02:59 +03:00
kd-11 b4bf48c33b vk: Integrate shader interpreter 2020-04-30 15:02:59 +03:00
kd-11 0072df7f20 rsx/gl: Add basic interpreter support to OGL
- Adds basic interpreter functionality.
- Flow control and other instructions not yet implemented.
2020-04-30 15:02:59 +03:00
Eladash 833ace1190 rsx: Fix zcull time to not time travel to the future 2020-04-28 21:07:15 +03:00
Megamouse 18219afbf7 Qt: move rsx capture to Utilities menu 2020-04-22 21:43:03 +02:00
Eladash b94e4247cc rsx: More strict zcull stats enabling 2020-04-21 16:18:32 +01:00
rexys 8f3b04cbd6 rsx: Fix is_fifo_idle with hle gcm 2020-04-16 12:59:19 +03:00
Megamouse cf229a8e9f some more dynamic settings 2020-04-15 18:25:25 +02:00
scribam 2e397e38a4 Typos 2020-04-14 17:06:58 +03:00
scribam f37adc4188 Add fallthrough attribute 2020-04-14 17:06:58 +03:00
Nekotekina 4d8bfe328b Replace rotate utils with std::rotl
More include cleanup.
2020-04-14 16:05:58 +03:00
Nekotekina 032e7c0491 Replace utils::cntlz{32,64} with std::countl_zero 2020-04-14 16:05:58 +03:00
Nekotekina d0c199d455 Replace utils::cnttz{32,64} with std::countr_{zero,one}
Make #include <bit> mandatory.
2020-04-14 16:05:58 +03:00
Eladash cb14805d78 rsx fp/vp analyzers: Fix strict type aliasing and improve codegen 2020-04-12 16:48:43 +03:00
Eladash e407018bb5 rsx: Write ref+get atomically
May contribute to better FIFO synchronization in some cases.
2020-04-11 21:21:15 +03:00
Eladash ff74c241c7 rsx: Fix get_optimal_blit_target_properties for local memory 2020-04-11 21:21:15 +03:00
Eladash 504ba8d824 rsx: Fix grammer issue (binded -> bound) 2020-04-11 21:21:15 +03:00
Eladash 8228fa1ece sys_rsx: Warn if RSX is not idle during crucial points 2020-04-11 21:21:15 +03:00
Eladash 36fd1d0f0d
rsx: Optimize transform constants load methods (#7992) 2020-04-09 15:53:43 +03:00
Eladash f7536bbce0 sys_rsx: Fix gcm events spam
In realhw the events are only sent if they are masked in driver_info->handlers as well.
2020-04-07 20:43:28 +03:00
Eladash 3f48450408 sys_rsx: Minor atomicity fixes 2020-04-07 20:43:28 +03:00
Megamouse 078c31c1da Qt: fix lupdate warnings (used for translation) 2020-04-06 20:59:58 +02:00
Megamouse b1fdbc7fcc Move some format functions 2020-04-06 20:59:58 +02:00
Eladash e7d5d17fd8 rsx: Adjust FIFO recovery to be a bit more merciful 2020-04-05 17:40:23 +03:00
kd-11 0b6e2b26fa rsx: Fix DST instruction
- It's the old-school distance vector, not the more modern distance() function
- There is seemingly no glsl function that maps to it directly.
2020-04-05 16:35:20 +03:00
kd-11 b301fecfd8 gl: Fix async shader compiler
- Removes glFinish hack.
- Adds proper server-side synchronization.
- Adds primary context detection to allow worker threads to be identified.
2020-04-05 16:35:20 +03:00
Eladash 72d1efa383 rsx: Batch transform contants load methods 2020-04-05 15:21:56 +03:00
Eladash 72c0aed4c1 rsx: Reset vertex program/constants at each boot 2020-04-02 20:42:12 +03:00
Eladash c2c5005278 rsx: Fix and improve fp program data invalidation 2020-04-02 20:42:12 +03:00
Eladash 2ed370093e rsx: Get rid of invalid_command_interrupt_raised 2020-04-02 20:42:12 +03:00
Eladash d97e9f7b4a rsx: Batch vertex program load methods 2020-04-02 20:42:12 +03:00
kd-11 69d90f6fec vk: Remove NVIDIA workaround for broken partial occlusion queries
- This bug has been fixed in the latest drivers.
2020-03-31 20:53:12 +03:00
kd-11 8c847d3a4b vk: Remove RADV workaround regarding renderpass barriers
- The situation was clarified in the official vulkan spec to allow this
  behavior.
  Barriers are now only inserted by the driver when layout
transitions are requested.
2020-03-31 20:53:12 +03:00
kd-11 b327e329d6 vk: Avoid query log spam if no program is loaded 2020-03-31 20:53:12 +03:00
Eladash 4215499b7f rsx: Fix typo in NV4097_SET_TRANSFORM_PROGRAM range 2020-03-28 11:07:34 +03:00
unknown 049825812e RSX: Restrict analyser loop error 2020-03-28 09:42:13 +03:00
xddxd d96dabcd60
rsx: Rename current_instrution to current_instruction (#7883) 2020-03-28 02:46:48 +00:00
Eladash 9d971e3b07 rsx: More strict infinite desync detection
6 desyncs per second for 1.5 seconds is pretty bad already.
2020-03-26 17:52:45 +03:00
Eladash 38c8dd98b4 rsx: Implement basic infinite FIFO desync detection 2020-03-26 15:22:45 +03:00
Eladash 158e34faca rsx: Reset all method registers at rsx_state::init() 2020-03-25 17:51:59 +03:00
Eladash 768b4f8c65 rsx: Improve NV308A_COLOR
* Fix NV308A_COLOR methods range.
* Batch NV308A_COLOR methods execution together.
* Fix termination of bind_range<> in rsx methods binding.
2020-03-25 17:51:59 +03:00
Megamouse fd3522436a overlays/osk: add more panels 2020-03-25 03:54:49 +01:00
Eladash 08e66ab14c Minor warning fixes 2020-03-23 21:37:37 +03:00
kd-11 4965bf7d7a gl/vk: Refactor draw call handling and stub shader interpreter
- Refactors backend draw call management to make it easier to extend
  functionality.
- Stubs shader interpreter functionality.
2020-03-23 14:47:28 +03:00
Eladash cccc32fa9d
sys_lwmutex/lwcond: track lwcond waiters (#7826)
In lwmutex destroy syscall, wait for pending waiters.
2020-03-23 10:30:17 +03:00
kd-11 12044bd8b0 rsx: Properly calculate vertex range when divisor is active
- The upper bound is to be rounded up, not down.
2020-03-22 10:57:47 +03:00
Eladash 9acf8e283d Fix OSK thread exit condition 2020-03-21 12:37:29 +03:00
Nekotekina c577bd2111 Implement thread_state::errored
State after calling thread emergency_exit() function.
Also default-construct thread result in this case.
2020-03-20 21:31:27 +03:00
Megamouse fd8cda0f2b overlays/osk: fix selection after changing panels
We now try to keep the current x and y selected after panel changes.
Also change some copy to ref
2020-03-19 21:10:08 +01:00
Megamouse c63f77e3b0 overlays/osk: fix full width characters 2020-03-19 21:10:08 +01:00
Megamouse a1f70bf96e overlays/osk: do not change the preview text on empty input
This prevents that the placeholder disappears
2020-03-19 21:10:08 +01:00
Megamouse f1127f1894 overlays: implement osk panels 2020-03-19 21:10:08 +01:00
kd-11 d25ba03e82 vk: Lazy evaluate renderpass scope
- Spamming the driver with renderpass open/close cycles is bad for performance.
2020-03-15 18:39:40 +03:00
kd-11 7025985c0d rsx: Improve section scanning when updating surface cache resources in blit engine. 2020-03-15 16:51:23 +03:00
kd-11 a756c0679e rsx: Implement cross-aspect slice gathering
- Fixes a data leak that can happen when a surface is rejected due to aspect mismatch.
- Mismatch can lead to rejection due to area covered excluding the RTT and inevitable upload a texture from CPU at the same location.
- Overlapping fbo/shader_read resources are not allowed.
2020-03-15 16:51:23 +03:00
Eladash 377e06a4a2 rsx: Fix unknown Blend equation 2020-03-15 09:53:15 +03:00
kd-11 2ae83782e1 vk: Fix potential MTRSX deadlock in case of a race condition 2020-03-13 22:06:04 +03:00
Eladash f3877d11e8 rsx: Fix initial boolean state of m_textures_dirty and m_vertex_textures_dirty 2020-03-12 21:36:43 +01:00
Eladash c04abac630 rsx capture: Fix exceptions handler, fix tiny race condition on capture new capture 2020-03-12 21:36:43 +01:00
kd-11 7e9dbeff7b
vk: Fix MTRSX deadlock (#7766) 2020-03-12 22:29:58 +03:00
Nekotekina 04dedb17eb Disable exception handling.
Use -fno-exceptions in cmake.
On MSVC, enable _HAS_EXCEPTION=0.
Cleanup throw/catch from the source.
Create yaml.cpp enclave because it needs exception to work.
Disable thread_local optimizations in logs.cpp (TODO).
Implement cpu_counter for cpu_threads (moved globals).
2020-03-12 16:03:08 +03:00
kd-11 47bbfdd2aa vk: Change texture cache memory management for disposed textures
- Use global resource manager instead of using the 2-frame hold behavior.
- Fixes high VRAM usage in some games
2020-03-11 16:29:34 +03:00
kd-11 7989de9d16 vk: Properly release dma resources. 2020-03-10 22:02:02 +03:00
Megamouse 3ea94c286b input/overlays: fix premature pad interception removal
shader compilation and trophy notifications shouldn't cancel the pad interception during proper dialogs
2020-03-10 19:04:32 +01:00
kd-11 12b73c8bdc rsx: Fix copypasta 2020-03-09 17:20:24 +03:00
Eladash 636ed4a48b HLE cellGcmSys: Avoid calling sys_rsx syscalls in rsx code 2020-03-09 16:07:14 +03:00
kd-11 2985a39d2e rsx: Rewrite async decompiler 2020-03-09 14:59:25 +03:00
Nekotekina 9dca2887d8 Fixup for Emu.Pause()
Remove some reduntant calls.
Don't pause on unknown sys_fs_fcntl operation.
2020-03-08 22:03:16 +03:00
kd-11 8214425a3c rsx: Fix framebuffer native layout for X32_FLOAT
- It was not matching the order laid out for normal textures uploaded from CPU.
2020-03-08 11:43:49 +03:00
kd-11 84a542fbce rsx: Blit engine improvements
- Detect writes to the display output memory and handle it specially.
  It already defines a known 2D region.
- Try and detect situations where raw transfers would be of benefit.
2020-03-08 10:30:13 +03:00
Eladash 892f74d762
rsx: Improve frame-limiter (#7723)
* rsx: Improve frame-limiter accuracy

* lv2: Improve lv2_obj::wait_timeout response time for aborting threads

* rsx: Make stretch to display area setting dynamic

* rsx: Redefine 'auto' frame limiter to obey vblank rate

* rsx: Make frame limiter setting dynamic

* rsx: Make frame-limiter compatible with dynamic changes
2020-03-08 01:11:35 +03:00
kd-11 149d550f7e gl: Restore commented out line
- Byte order step was disabled for debugging and not restored
2020-03-07 17:23:25 +03:00
kd-11 70f2577b9e vk/gl: Use best-fit semantics when scanning texture cache for flippable images
- Allows sourcing flip data from the blit engine resources which avoids expensive flush and re-upload
2020-03-07 16:58:35 +03:00
kd-11 93295f7f50 vk: Fix image properties for flip temporary images to be samplable.
- In case of gamma correction or other effects, they may require shader access.
- BGRA8_UNORM is usually safe to use directly without staging memory.
2020-03-07 16:58:35 +03:00
kd-11 1725f7a34b rsx: Add anaglyph 3D filter 2020-03-07 16:58:35 +03:00
kd-11 6e3406b3f5 video: Allow selection of 3D stereo resolutions 2020-03-07 16:58:35 +03:00
Nekotekina e4a81b1d13 Move Log.h to util/logs.hpp 2020-03-07 12:29:23 +03:00
Nekotekina 7a8772dafa Replace std::string::npos with umax 2020-03-05 14:05:23 +03:00
Nekotekina Aux1 250736ece5 Fix warnings in emucore 2020-03-04 21:23:34 +03:00
Nekotekina Aux1 f2f3321952 Fix warnings in VKGSRender 2020-03-04 21:23:34 +03:00
Nekotekina Aux1 c3f3451269 Fix warnings in GLGSRender 2020-03-04 21:23:34 +03:00
kd-11 54775d91dc rsx/blit-engine: Account for a rare corner case
- It is possible to have a RTV<->DSV transfer with compatible-sized formats.
  Mark the depth size as typeless in such a situation to avoid crossing the aspect barrier with the API.
2020-03-04 21:21:59 +03:00
kd-11 7fe9802f87 vk: Properly use declared pitch when loading simple images 2020-03-01 00:16:52 +03:00
kd-11 14aebeac58 video-out: Allow applications to successfully change display resolution
- Avoids a situation where a game configures output correctly but gets back bogus information later when querying.
- Should fix games being broken at some resolutions but not others.
2020-03-01 00:16:52 +03:00
kd-11 76bbbe27f1 vk: Fix dma resource leak
- Fix broken check; a relic of the past where flush method would reset the fence
2020-03-01 00:16:02 +03:00
kd-11 9af52d12a8 vk: Improve events
- Make events properly managed objects.
- Add a workaround for AMD's broken event status query
2020-03-01 00:16:02 +03:00
kd-11 5eb314fbbb vk: Add execution barriers.
- Useful for debugging
2020-03-01 00:16:02 +03:00
Nekotekina 8e5a03f171 Use named_thread_group in rsx_cache.h 2020-02-29 16:55:25 +03:00
kd-11 eb140c52a4 rsx: Reset ZCULL statistics at the end of a frame
- Workaround for games that leak zpass/zstats.
  The information is useless anyway without a clear op so it should be fine.
2020-02-29 14:23:52 +03:00
kd-11 198c84cabf rsx: Fix zcull clear command; do not clear ZPASS when ZSTATS is cleared. 2020-02-29 14:23:52 +03:00
Eladash 8762f2a588 Use more starts_with 2020-02-29 13:06:14 +03:00
kd-11 08f3460365 vk: Fixup for RCB/RDB in special cases
- Images must be in TRANSFER_DST_OPTIMAL or GENERAL layouts to call the image upload routines.
2020-02-29 12:13:11 +03:00
kd-11 cb047fcc75
rsx: Disable zstat checks to avoid unnecessary stream splitting (#7624) 2020-02-28 20:27:31 +03:00
Nekotekina ac2581659a RSXOffload: fix dma_manager::sync() freeze on exit
Its logic was completely broken.
2020-02-28 19:55:43 +03:00
Nekotekina f335d034fc Fix RSX Offloader thread exit (MTRSX fix)
Hangs on exit if MTRSX is enabled.
2020-02-28 19:43:42 +03:00
Nekotekina ecd68dfc70 overlays: add "thread bits" to wait on and avoid lockup
Add TLS variable to store its own bit.
2020-02-27 19:14:08 +03:00
Megamouse ee46ad1ca9 move overlays code to headers 2020-02-26 23:43:18 +01:00
gamerforEA 93552a5958 Apply some Clang-Tidy fixes 2020-02-27 00:38:55 +03:00
gamerforEA c0fbf3091e Remove unnamed namespaces from headers 2020-02-27 00:38:55 +03:00
gamerforEA 49294a3dd2 Add missing include guards 2020-02-27 00:38:55 +03:00
Nekotekina 5094ab8283 Fix RSX Offloader thread name 2020-02-26 21:57:01 +03:00
Nekotekina b35a5982e8 Fix one bug with MsgDialog thread (freeze on exit)
Forgot to check thread state
2020-02-26 21:23:30 +03:00
kd-11 569e1c2df6 rsx: Fix typo. Noted by github user @gamerforEA 2020-02-26 19:40:35 +03:00
kd-11 6e9392fb45 rsx: Restructure ZCULL query triggers
- Both ZCULL stats and ZPASS stats require hardware queries, but
  ZCULL stats should not contribute to ZPASS stats and vice versa!

- Disables hardware queries for ZCULL stats by themselves, we cannot
  generate them correctly anyway and no game so far has been found to
  actually use them. Should lessen the load on the backend for games
  that do not actually require it.
2020-02-26 19:40:35 +03:00
Nekotekina df1813b4e2 overlays: hotfix for waiting on thread_count 2020-02-25 23:43:05 +03:00
Nekotekina ff16e678a5 Add thread_count instead of former thread pool 2020-02-25 23:16:55 +03:00
Nekotekina 982856e70d overlays: remove unused threadpool 2020-02-25 22:56:50 +03:00
Nekotekina 144c20649f Try to fix msg dialog breakage 2020-02-25 22:50:44 +03:00
Megamouse e719bcf338 overlays: add layer modes to osk 2020-02-25 21:57:49 +03:00
Megamouse 3f4226b70e overlays: Fix find and replace regression 2020-02-25 21:57:49 +03:00
Megamouse 620cfd5063 overlays: move code to overlay_utils.cpp 2020-02-25 21:57:49 +03:00
Megamouse 2341749485 overlays: add overlay_osk.h 2020-02-25 21:57:49 +03:00
Nekotekina 9c9c2eb2c9 Fix wrong g_fxo->init_crtp name, use just init<> 2020-02-25 14:07:50 +03:00
Nekotekina 318a364d09 Try to fix OSK 2020-02-25 14:03:13 +03:00
Nekotekina fa02a04baa Add g_fxo->init_crtp to simplify thread construction 2020-02-25 11:51:41 +03:00
kd-11 cd40bc8c61 overlays: Avoid race condition between rendering and layout operations for system widgets
- System widgets are callable from outside RSX code.
- Responding to draw requests while setup is in progress can cause malformed cached output
- Fixes glitched layouts for system message dialogs
2020-02-24 23:33:47 +03:00
kd-11 f6ebd88687 overlays: Ditch wstring for u32string
- Turns out wstring is not the same as u32string on windows.
2020-02-24 23:33:47 +03:00
Megamouse f7666f44da Untangle GUI and input includes 2020-02-24 16:31:01 +01:00
Eladash 522daf5eac rsx: Fix NULL renderer 2020-02-23 19:57:55 +03:00
Nekotekina 8b4b859091 Remove "thread_ctrl::spawn" 2020-02-23 15:03:38 +03:00
Nekotekina 18db020b93 Fix warning in RSXOffload.cpp (rewrite thread) 2020-02-23 14:19:23 +03:00
Nekotekina 7069e7265f RSX: move g_dma_manager to g_fxo 2020-02-23 13:12:50 +03:00
kd-11 fa41297b27 overlays/trophy: Migrate to multibyte strings 2020-02-22 15:07:14 +03:00
kd-11 b8f51398b7 overlays/save_dialog: Migrate to multibyte strings 2020-02-22 15:07:14 +03:00
kd-11 cb2129c7e4 overlays/osk: Migrate to multibyte encoding 2020-02-22 15:07:14 +03:00
kd-11 703ec9f896 overlays: More unicode utilities 2020-02-22 15:07:14 +03:00
kd-11 19350d024b overlays: Font system improvements
- Add support for Hangul blocks (korean)
- Restructure font fallback system to allow the user to 'install' fonts if missing.
  Should allow fonts to work with no firmware on open systems like linux
2020-02-22 15:07:14 +03:00
kd-11 8e68427daf overlays: Add basic font substitution system and separate JPN from Latin-1 set
- Gets JP glyphs to render correctly, but the generalization may negatively affect other CJK glyph sets.
  PS3 doesn't seem to use other glyph sets much however.
2020-02-22 15:07:14 +03:00
kd-11 6220206cbc vk: Implement 2D array textures required for new font subsystem 2020-02-22 15:07:14 +03:00
kd-11 1df1ceb4ea gl: Support new glyph format with array textures 2020-02-22 15:07:14 +03:00
kd-11 6178a0ab25 overlays: Migrate to wide-char strings 2020-02-22 15:07:14 +03:00
Nekotekina 5e75a0c497 Disable cotire on travis
Make some workarounds for clang because it poorly supports -Wold-style-cast
2020-02-21 17:03:54 +03:00
Nekotekina 972e0ab31d Remove -Wno-reorder and make it an error 2020-02-21 15:20:34 +03:00
Nekotekina 92e3eaf3ff Fix signed-unsigned comparisons and mark warning as error (part 2). 2020-02-19 22:54:58 +03:00
Nekotekina 771eff273b First part of fixing sign-compare warning (inside be_t). 2020-02-19 22:54:58 +03:00
Eladash df8d0cde4a RSX/SPU: Accurate reservation access 2020-02-19 18:11:30 +00:00
Megamouse fe75311be2 move config structs to own files and clean up some headers 2020-02-17 15:08:17 +03:00
kd-11 5e6b1003ec vk: Only declare explicit subpass dependencies for RADV 2020-02-16 18:00:06 +03:00
kd-11 23f1515448 vk: Explicitly declare null subpass dependencies
- We do not want any actual dependencies, but it turns out removing them
  entirely makes the driver add even worse dependencies.
2020-02-15 21:45:25 +03:00
Eladash 9344b21484 rsx: Unify FIFO recovery methods
TODO: Maybe consider fifo stack content when recovering.
2020-02-14 17:11:26 +03:00
Eladash 07f300a14e rsx: ZCULL typo fix 2020-02-14 17:11:26 +03:00
Nekotekina bcbe324534 geometry.h: make conversion operators explicit
It requires static_cast<> to call them.
2020-02-11 13:21:45 +03:00
Eladash dcb30df7c8 rsx capture: Fix capture recovery after a crash 2020-02-10 21:39:39 +00:00
Eladash bdab26ec09 rsx: rewrite io mappings
Along with some with fixes to cellGcmSys HLE.
2020-02-10 21:39:39 +00:00
kd-11 f47333997f rsx: Validate memory blocks before checking for overlap 2020-02-10 21:48:35 +03:00
kd-11 3787108ee7 rsx: Typo fix in audit condition 2020-02-10 21:48:35 +03:00
Nekotekina 034267adb2 Compilation fix 2020-02-10 16:57:56 +03:00
kd-11 efc8c3f4a9 vk: Fixup for VK_ERROR_SUBOPTIMAL_KHR
- break from a switch does not break out of the external scope!
2020-02-09 13:45:30 +03:00
kd-11 792c481f6d rsx/overlays: Fix clipped rendering of UI elements
- Take viewport offset into account when applying window transforms.
  This is necessary because gl_FragCoord is based on the framebuffer and not the viewport.
2020-02-09 12:55:56 +03:00
Eladash 9d1bb60ad7 cellGcm HLE: fix cellGcmMapMainMemory
Fix arguments order, softcode RsxReports::report offset.
2020-02-08 22:18:56 +03:00
Eladash b7043ce000 Make rsx::get_address report caller location 2020-02-08 22:18:56 +03:00
kd-11 c64935f9dd rsx: Clean up graphics state notifications and add notification for change in point size
- Adds a backend notification when point size changes.
- Refactors all those separate notifiers into one reusable template.
2020-02-08 18:13:05 +03:00
kd-11 54da9ac7e5 overlays: Fixup
- Avoid calling join on self thread.
- Avoid use-after-free.
2020-02-07 19:28:41 +03:00
kd-11 e45360de2b overlays: Fix use after free
- Overlay can be closed when secondary thread is asleep!
  Wait for it to wake before proceeding with deletion.
2020-02-07 16:15:02 +03:00
kd-11 d59c449ff6 vk: Remove an overzealous assert 2020-02-07 16:15:02 +03:00
kd-11 0bba04ef8d vk: Fix a bug in RCB/RDB when MSAA is set to disabled.
- Initially MSAA option was hardcoded to be always enabled, this bug is a remnant of that time.
2020-02-06 17:54:05 +03:00
kd-11 43dae6c14d gl: Implement RCB/RDB 2020-02-06 17:54:05 +03:00
kd-11 2b5c24b304 gl: Fix memory barrier implementation and stub for RCB/RDB
- It's a miracle it even compiled
2020-02-06 17:54:05 +03:00
kd-11 50b1e26b17 gl: Fix a long-standing regression with typeless transfer caused by a typo.
- The parameters for the final upload should be 'unpack_info' not 'pack_info'!
2020-02-06 12:44:46 +03:00
kd-11 18e0559438 gl: Fix per-level sub-image sizes to comply with OpenGL guidelines for compressed textures 2020-02-06 12:44:46 +03:00
kd-11 3cc42c1bf8 gl: Fix broken image transfer operations 2020-02-05 18:18:09 +03:00
kd-11 b6422c9a33 rsx: Fixup
- Destination Y coordinate must be 'rebased' onto the current slice by subtracting its offset.
  Only the local path was affected this time
2020-02-05 18:18:09 +03:00
Nekotekina c0f80cfe7a Use attributes for LIKELY/UNLIKELY
Remove LIKELY/UNLIKELY macro.
2020-02-05 10:42:34 +03:00
Nekotekina 1a78e0e80c Make RPCS3 compile in C++2a mode 2020-02-04 23:43:55 +03:00
kd-11 9d9b5c4d66 rsx: Rewrite coverage test to take sum of areas into account.
- TODO: A proper sweep algorithm to calculate sum of overlapping rectangles
2020-02-04 16:20:52 +03:00
kd-11 b9ec012922 rsx: Allow for proper data checks when WCB/WDB is enabled 2020-02-04 16:20:52 +03:00
Nekotekina c4a01875d0 Space fix commit 2020-02-03 11:16:26 +03:00
Silent 7f4e546f19 Protect m_storage.find(key) to fix a race 2020-02-02 22:28:14 +03:00
kd-11 7d2ed9200d rsx: Remove sections that are wholly inherited by new blocks
- Allows sections reclaimed by the surface store due to overlap/inheritance to be identified and removed.
- Additionally, potentially lowers the number of flushes required per block with multiple overlaps improving efficiency and theoretically performance.
2020-02-01 15:14:29 +03:00
Nekotekina 15391f45d0 Modernize RSX logging (rsx_log variable) 2020-02-01 11:52:22 +03:00
Nekotekina 1d0f359406 logs: add more log channels instead of GENERAL 2020-01-31 16:44:48 +03:00
kd-11 36d5db7f30 rsx: Plug texture data leak in the 'exact match' path.
- Followup to previous texture data leak fix for the replaced section path.
2020-01-31 14:56:53 +03:00
kd-11 c9e35926f5 rsx: Preserve pixel data when splitting sections
- Ironically rhis data leak is caused by trying to fix another type of data leak
2020-01-30 21:07:36 +03:00
Eladash 92466165f6
Increase Maximum Vblank Rate and Clocks Scale
Allow x30 times the speed of vblank rate + clocks scale of original PS3.
In theory a 60 fps limit game which scales frame limit perfectly with vblank rate can be played at up to 1800 fps with this change.

And:
* Fixed lv2 sleep with Clocks Scaling
* Make these settings dynamicaly adjustable.
* Avoid code duplication
2020-01-29 21:42:41 +01:00
kd-11 1206a5d4b7 rsx: Tweak blit engine heurestics a bit
- Reject writes to RTT if the source data is of unknown origin.
  non-RTT data and only 1 line in length is suspicious and often GPU data like programs or other rendering inputs.
2020-01-29 12:54:06 +03:00
Nick Renieris 1e69de1205 overlays/perf: Graph label tune-up
Place graph text on top, split in 2 lines, center it horizontally.
Also if it's wider than the graph, match up graph's width to it.
2020-01-26 17:55:11 +01:00
kd-11 79216917b3 rsx: Workaround for broken rtt resampling
- Avoids WCB requirement for now to keep res scaling working correctly.
- TODO: Fix this properly
2020-01-26 13:58:48 +03:00
kd-11 698702cd4a vk: Fix DMA data leak
- There still does not exist a ranged flush implementation which is required.
- TODO: Implement this properly
2020-01-26 13:58:48 +03:00
kd-11 1166ae19bb vk: Use appropriate layouts depending on use case when creating new textures to avoid needless barriers 2020-01-26 13:58:48 +03:00
kd-11 44f2cacf7b rsx: Blit engine tuning
- Attempt to identify blit operations that will be flushed immediately
after and just do them on CPU instead if the transformation is trivial.
- If only a single blit section is contributing to an atlas merge op, the
threshold should be 100%. The only acceptable result here is a
truncation.
2020-01-26 13:58:48 +03:00
kd-11 7a275eaa3a rsx: Fix incomplete blit operations getting used as texture inputs
- Raise passing 'score' from 50% to 90% to filter out very incomplete
merge operations.
- Catch unfit sections passing the match test; possible for blit_dst
data but will likely be always harmless. Disabled in release builds by default.
2020-01-26 13:58:48 +03:00
Maksim Derbasov 1abdee242a small improvement (#7288)
* small improvement

* comments addressed

Co-authored-by: kd-11 <15904127+kd-11@users.noreply.github.com>
2020-01-22 12:28:48 +00:00
kd-11 adcc3e9c4b rsx: Optionally sync on texture read semaphore
- Some games use texture semaphore for zcull sync which is rather bizzare.
  However, it works on realhw as the depth test happens before fragment shader completion
- Due to the high performance penalty incurred by this act, this
behavior is only enabled by the "strict rendering mode" option.
2020-01-21 22:21:51 +03:00
Megamouse 4dbad6cce6 fix some random warnings 2020-01-19 16:38:17 +01:00
kd-11 22ca2827de rsx: Improve window border detection and clearing
- Improves logic to detect if the frame requires letterboxing and
properly clears the background appropriately.
2020-01-18 19:52:52 +03:00
kd-11 5e0ca4c0c4 rsx: Fixup for missing visuals when framebuffer is larger than requested
display dimensions.
2020-01-18 19:52:52 +03:00
kd-11 48407752a6 formatting: Unify indentation type in the newly added files to tabs 2020-01-18 19:52:52 +03:00
kd-11 bad4d1ff05 rsx: Improve present image scanning
- Adds support for partial (letterboxed) source images by taking insets
into account.
- Bugfix for potential access violation when capturing screenshot on
vulkan
2020-01-18 19:52:52 +03:00
kd-11 7453e46a7c rsx: Refactor out complex present code into separate files
- Also restructures present code to have image lookup in a separate
re-usable function.
2020-01-18 19:52:52 +03:00
kd-11 b36b9e4822 vk: Fixup for total number of combined samplers using the dynamic binding structure 2020-01-18 11:17:19 +03:00
kd-11 0a2b6a290d vk: Fixup
- Scaling is not needed for a direct typeless transfer!
2020-01-17 14:31:14 +03:00
Megamouse 449cbb7281 Qt: use persistent_settings for playtimes 2020-01-17 07:43:10 +01:00
kd-11 9b34f00241 vk: Optimize image transfers
- Adds the same optimization/simplification steps to complex image
transfer routines. Whenever possible, multi-step transfers are collapsed
into a single operation.
2020-01-16 22:29:26 +03:00
kd-11 82af17beb1 gl: Optimize image operations
- Avoid double transfers where a transfer to a temp image is done
without scaling and then a secondary transfer follows. Combines the two
steps into one whenever possible which can significantly alleviate
bandwidth problems at higher resolutions. Significant speedup, upto 90%
in some cases (PDF, PDF2)
2020-01-16 22:29:26 +03:00
kd-11 47b196e9d0 rsx: Fix uninitialized variable 2020-01-16 17:57:31 +03:00
kd-11 db014d8a58 rsx: Fix section length calculations when generating new blit targets. 2020-01-16 17:57:31 +03:00
kd-11 621fab2ad9 vk: Fix D32S8 interpolation by using integer interpolation instead of floating point
- Interpolating floats is not the same as interpolating their bits!
  Use integer format to interpolate linearly for D32F formats instead of using R32F as intermediary
2020-01-16 11:12:08 +03:00
kd-11 086ecf4ba6 vk: Add some missing image memory barriers causing artifacting on AMD cards
- There needs to be a memory barrier after each step.
- TODO: Optimize scale_typeless_safe function
2020-01-16 11:12:08 +03:00
kd-11 309251ce7a rsx: Touch locked dst memory after blit transfer operations in case it is locked by WCB/WDB 2020-01-16 11:12:08 +03:00
kd-11 74ad525566 vk: Fixup for cs_scatter job
- Access to the stencil output has to be atomic as each 'word' is shared among 4 adjacent texels
- TODO: Can be optimized using mirrored buffer views
2020-01-15 21:12:51 +03:00
Eladash 85695c8bac rsx: FIFO wake-up pause control 2020-01-15 19:54:23 +03:00
kd-11 2984300385 vk: Fix invocation alignment to support non-power-of-2 alignment 2020-01-15 15:42:36 +03:00
kd-11 ac4cadf538 vk: Fix word index counting for shuffle tasks 2020-01-15 15:42:36 +03:00
kd-11 175f78f5b3 vk: Lower default compute heap size to 64M
- There is no need to guess and use a large memory footprint as the heap is
now dynamic.
2020-01-15 15:42:36 +03:00
kd-11 3d96fe79cc vk: Implement dynamic sized compute heap
- Implements a dynamically sized compute heap to allow growing up the
size if it is too small.
2020-01-15 15:42:36 +03:00
Eladash 1ccb3c4492 rsx: Verify local memory offset 2020-01-15 13:23:56 +03:00
kd-11 8bbda3dedb vk: Restructure command queue flushing behavior to avoid deadlock
- Queueing commands on the offloader is a good idea but unfortunately
page faults can still happen causing a cyclic dependency and eventual
deadlock. Characterized by a vk::wait_for_event timed out error
accompanied by severe hitching.

- Drain the fault-able commands before pushing a submit operation to the
queue. If a fault is in progress, bypass the queue system and submit
raw. Technically this is incorrect but there isn't much that can be
done about it right now.
2020-01-14 14:32:40 +03:00
kd-11 db5d03c340 vk: Generate dynamic binding table based on the capability of the drivers
- This alleviates constraints imposed on shaders to allow running on some not-so-great platforms.
2020-01-09 15:38:23 +03:00
kd-11 ef3b0db7d8 vk: Workaround for NVIDIA occlusion query failure
- When using partial results on NVIDIA, a non-zero result is returned even when the draw is fully occluded.
  This, I believe, violates spec which says the partial result shall be between 0 and the final result.
2020-01-08 19:02:45 +03:00
kd-11 3f34a0196c overlays/osk: Add linear fade-in/out effect to OSK 2020-01-07 21:31:19 +03:00
kd-11 ecf00be155 rsx: Add color interpolation animation
- Adds color interpolation and modulation pass and refactors the code a
bit. Elements with this pass applied have their color modulated by the
animated color from the pass. Modulation transform is multiplicative.
2020-01-07 21:31:19 +03:00
Nick Renieris 5bace118a7 overlays: Redesign animation system (add easing functions, fix bugs)
Instead of speed, direction and distance, the user now specifies
start/end offsets and how much time the transition should take.

Fixes:
- Stuttering caused from framerate estimation.
- An edge case where animations would go over their supposed limit.

Adds:
- The ability to specify arbitrary easing functions for the animations
  - Implemented quadratic ease in and ease out and cubic ease in/out.
- Usage of cubic ease in/out in the trophy notification
2020-01-06 22:42:07 +03:00
Nick Renieris 28770c1580 overlays: Move vertex & vector utility classes to new file 2020-01-06 22:42:07 +03:00
Nick Renieris 192912131e rsx: Update vblank count in LLE mode 2020-01-06 22:42:07 +03:00
Dravonic 94d2f97f27 Multithreaded shader compliation follow-up (#7190)
* Multithreaded load pipeline entries shader compliation stage

Co-authored-by: kd-11 <15904127+kd-11@users.noreply.github.com>
2020-01-06 21:59:59 +03:00
kd-11 7f09def94e rsx/vp: Properly initialize output registers.
- All registers tested on hw show contents to be 0, 0, 0, 1.
  Make default output registers match this pattern.
2020-01-05 18:06:08 +03:00
kd-11 bdb5115c7f rsx/overlays: Improve space usage on trophy dialog
- Slightly increases the size of the trophy dialog and the font size.
  The old dimensions did not work with some libre fonts causing
alignment errors and other problems.
2020-01-04 16:36:49 +03:00
kd-11 3ada97d2d3 rsx/overlays: Implement trophy notification queue
- Allows to display more than one trophy at a time. Trophy notifications
will simply get queued up and displayed at appropriate time.
2020-01-04 16:36:49 +03:00
kd-11 31b07fece5 rsx/overlays: Add support for animations
- Adds animation support. This commit adds the base framework and
implements a translate animation used to slide elements around the
screen. This is then used to implement the sliding animation for the
trophy notification.
2020-01-03 20:33:32 +03:00
Megamouse 5e7d25ad35 overlays: refactor shader loading dialogs 2020-01-03 14:22:40 +01:00
Megamouse d94d094a7e overlays: fix non-interactive dialog loops 2020-01-03 14:22:40 +01:00
Megamouse c9aee27d48 VK: remove unused init function declaration 2020-01-03 14:22:40 +01:00
kd-11 d12762414a vk: Change default vertex output value
- Prefer w!=0 to avoid a situation where xyz/w = nan. More of a
theoretical problem, but some calculations break down in such a
situation.
2020-01-03 10:35:53 +03:00
kd-11 7786681954 rsx: Improve MTRSX synchronization
- Properly synchronize DMA transfers when handling RSX pipeline
barriers. Texture read barrier is used to signify completion of DMA
routines and is often used to signal that Cell can overwrite vertex
data!
2020-01-03 10:35:53 +03:00
kd-11 c4e59b5115 vk: Clamp depth export in FS
- PS3 matches OGL behavior where writing to the depth export
register results in clamping.
2020-01-01 22:39:20 +03:00
Eladash 9690854e58 Some cleanup
* Prefer default initializer over std::memset 0 when possible and more readable.
* Use std::format in trophy files name obtaining.
* Use vm::ptr<>::operator bool() instead of comparing vm::ptr to vm::null or using addr().
* Add a few std::memset calls in hle where it matters (or in some places just to document an actual firmware memcpy call).
2019-12-31 22:27:27 +03:00
kd-11 915cf0bae8 vk: Do not leak mapped memory 2019-12-31 13:56:14 +03:00
Megamouse c4b4ce46b8 cellSaveData: don't pause apps during dialogs 2019-12-29 14:22:58 +01:00
kd-11 24cb48971e vk: Fix cb chunk synchronization deadlock 2019-12-29 13:49:46 +03:00
kd-11 e1b734fd12 rsx: Fix linux build 2019-12-29 13:49:46 +03:00
kd-11 ed2bdb8e0c rsx: Zcull synchronization tuning
- Also fixes a bug where sync_hint would erroneously update the sync tag
  even for old lookups (e.g conditional render using older query)
2019-12-29 13:49:46 +03:00
kd-11 fdb638436f rsx: Add toggle for zcull sync behaviour - Adds a relaxed sync mode where ZCULL reports are lazily nudged into flushing and the main core does not actually wait for the event to finish before proceeding - Can drastically improve performance in cases where the game actually does not utilize the report data 2019-12-29 13:49:46 +03:00
kd-11 9f94a6dc11 vk: Refactoring and optimizations to query handling
- Caches query results when looking up report availability to avoid
  entering driver code twice.
- Minor code restructuring
2019-12-29 13:49:46 +03:00
kd-11 55ad9244c0 vk: Switch occlusion pool to FIFO rather than LIFO to avoid hard stall 2019-12-29 13:49:46 +03:00
kd-11 cdd9c12132 vk: Emulate conditional rendering for AMD 2019-12-29 13:49:46 +03:00
kd-11 93895838c7 vk: Implement hw conditional rendering 2019-12-29 13:49:46 +03:00
kd-11 a51395370e vk: Implement multithreaded command submission
- A few nagging issues remain, specifically that partial command stream
  largely caused by poor synchronization structures for partial CS flush
  and also the fact that occlusion map entries wait on a command buffer
  and not an EID!
2019-12-29 13:49:46 +03:00
kd-11 5be7f08965 rsx: Restructure ZCULL report retirement
- Prefer lazy retire model. Sync commands are sent out and the reports will be
  retired when they are available without forcing.

- To make this work with conditional rendering, hardware support is
  required where the backend will automatically determine visibility by
  itself during rendering.
2019-12-29 13:49:46 +03:00
kd-11 8dfea032f2 rsx: Remove deprecated do_method path that has been superceded by c++ inheritance for many years 2019-12-29 13:49:46 +03:00
Megamouse ef6f565dbd silence some annoying warnings 2019-12-28 15:40:57 +01:00
Emmanuel Gil Peyrot 9b77febd10 RSX: Remove two empty cpp files 2019-12-23 00:02:57 +03:00
linkmauve e9c5c6e6bf Move input to its own directory (#7126) 2019-12-22 17:39:42 +01:00
Eladash db4041e079 Implement rounded_div
Round-to-nearest integral based division, optimized for unsigned integral.
Used in sceNpTrophyGetGameProgress.
Do not allow signed values for aligned_div(), align().
2019-12-20 14:47:04 +03:00
Emmanuel Gil Peyrot e30173a835 rsx: Make X11 optional on Linux
This makes it possible to build rpcs3 on a pure Wayland system, without
the Xlib installed.
2019-12-20 10:48:03 +00:00
Nekotekina 321f7e7197 Fix missing-braces warnings 2019-12-13 03:21:43 +03:00
kd-11 73236efe58
vk: Remove some outdated code (#7060) 2019-12-12 16:29:55 +03:00
Eladash 6a926daee7 rsx: Delay FIFO recovery point creation if is in in_begin_end scope (#7080) 2019-12-12 15:38:56 +03:00
Eladash 7260af032e rsx: Ignore or recover from unknown primitives
This also fixes a bug when recovering FIFO or creating such recovery point inside in_begin_end == true scope.
2019-12-11 00:11:12 +03:00
Nekotekina 835892aa51 C-style cast cleanup VII 2019-12-05 02:10:15 +03:00
Nekotekina d2fd3c6bc4 Commit 377e7d2a73 2019-12-04 21:32:08 +03:00
Nekotekina 377e7d2a73 C-style cast cleanup VI 2019-12-04 17:56:22 +03:00
scribam 2eaaf5b132 vk: Add sampleRateShading to the list of device enabled features 2019-12-04 12:59:38 +03:00
Nekotekina 185c067d5b C-style cast cleanup V 2019-12-03 17:23:00 +03:00
Nekotekina 28eacc616a C-style cast cleanup III 2019-12-01 00:32:44 +03:00
Nekotekina 5b9df53c13 C-style cast cleanup (partial)
Replace C-style casts with C++ casts.
2019-11-29 00:35:23 +03:00
Megamouse f2b530823b overlays: add dynamic switch for perf overlay 2019-11-27 10:34:03 +01:00
kd-11 8ca53f9c84 rsx: Remember to min-max the anchor indices of a polygon or triangle fan 2019-11-24 19:01:57 +03:00
kd-11 429a76a140 rsx: Remove redundant check 2019-11-23 16:11:18 +03:00
kd-11 41e7d2aa0a rsx: Select correct image aspect for blit engine targets. 2019-11-19 13:18:15 +03:00
kd-11 fd751e3e7b rsx: Improve blit format mismatch detection 2019-11-19 13:18:15 +03:00
kd-11 41c3180276 rsx: Fix invalid format checks for DMA sections which are typeless 2019-11-19 13:18:15 +03:00
kd-11 9dab0575fa rsx: Add missing format check for the RTV<->DSV transfer case
- TODO: Rewrite resource handling routines
2019-11-18 13:17:00 +03:00
kd-11 4a0e1c79ed rsx: Improve format validation for blit engine
- Check all possible cases where format mismatch is possible.
- Warn if a slow path is going to be taken. Should help with future
optimizations.
2019-11-18 13:17:00 +03:00
kd-11 c415578e79 vk: Clamp buffer row length to never be less than declared width
- Fixes some games with broken textures
2019-11-18 13:17:00 +03:00
kd-11 2408922806 rsx: Do not ignore clamping for some routines that do not have implied range 2019-11-18 13:17:00 +03:00
kd-11 c10aa360b1 rsx: Remove more deprecated methods 2019-11-18 13:17:00 +03:00
Megamouse a17a5a76a0 overlays: avoid division by zero 2019-11-15 14:53:18 +01:00
Megamouse fb96047d2f overlays: add settings for overlay graphs 2019-11-15 14:53:18 +01:00
Megamouse dd1707bd46 overlays: fix center options when graphs are shown 2019-11-15 14:53:18 +01:00
Megamouse d6b0361a02 overlays: perf_metrics_overlay to seperate header
this is done to prevent severe conflicts with upcoming changes
2019-11-15 14:53:18 +01:00
Anuskuss 7e31c30133 Intel iGPU needs workaround on Windows 2019-11-15 12:08:16 +03:00
Nick Renieris cc59d319e1 overlay: Performance graphs 2019-11-12 20:43:09 +01:00
kd-11 8234bdb8f0 vk: Check for heap change events after a grow to avoid spec violations
- Avoid referencing the old buffer in stale views. Status can be set
globally if requested during heap creation.
2019-11-10 17:53:12 +03:00
kd-11 5968427a2f vk: Initialize queries before use
- The spec does not guarantee that queries are initialized. In fact, it
now says all queries must be reset before they are used for the first
time.
2019-11-10 17:53:12 +03:00
kd-11 8ea9bc9874 vk: Reduce memory allocation sizes of default heaps
- The heaps will grow as desired, no need to overallocate to cater to
the most resource-hungry games
2019-11-10 17:53:12 +03:00
kd-11 0a32d478df vk: Enable auto-growing of the data heaps for the performance case 2019-11-10 17:53:12 +03:00
kd-11 357e0d2097 vk: Implement explicit runtime flags to manage events like heap sync 2019-11-10 17:53:12 +03:00
kd-11 f359342721 rsx: Implement mutable ring buffers with grow support 2019-11-10 17:53:12 +03:00
kd-11 5f39a594ac rsx: Clean up some unused legacy methods unnecessary after d3d removal 2019-11-10 17:53:12 +03:00
Emmanuel Gil Peyrot 56f82d2701 rsx: Wrap gsl::span definition into Utilities/span.h 2019-11-09 20:00:50 +01:00
Emmanuel Gil Peyrot f76720ceb0 Remove extraneous ::narrow<int>() calls
GSL’s gsl::span didn’t use the correct type for its index_type, which is
why they were needed.
2019-11-09 19:30:06 +01:00
Emmanuel Gil Peyrot 72cdf0b04c Replace gsl::span’s implementation with tcbrindle’s
This implementation optimises correctly on all relevant compilers,
unlike GSL’s which gave extremely slow code on any compiler other than
MSVC.

Supersedes #6948.
2019-11-09 19:30:06 +01:00
Emmanuel Gil Peyrot ef368c5171 rsx: Replace gsl::byte with C++17’s std::byte 2019-11-09 19:30:05 +01:00
kd-11 7072489a6e rsx: Implement point sprite coordinate generation
- When the point sprite flag is set, overrides the input similar to the
2D mask. The returned X and Y values are always the gl_PointCoord values
for the fragment.
- Stacks with the 2D mask to override the z and w coordinates.
2019-11-09 12:50:53 +03:00
kd-11 63673b1a9f rsx: Implement full color remap for the D24S8->ARGB8 converter 2019-11-08 19:11:59 +03:00
kd-11 8d1505752f rsx: Validate depth test setup to avoid address contention 2019-11-07 11:32:44 +03:00
kd-11 508ffcb775 vk: Compute kernel fixups
- Adhere to workgroup count limits as exposed by the GPU vendor.
  They already execute properly even when going beyond the limits but this removes validation noise.
- Fix invocation counts for deswizzle kernel. The count was incorrect if blocksize was not 4, causing a bunch of useless work to be done.
2019-11-05 22:07:22 +03:00
kd-11 99d71fdc2a vk: Implement layer batching for the GPU swizzle decoder
- Handles all LODs per layer meaning cubemaps are now fully handled in 6 passes instead of 6 * (log2(width)) passes.
- Handles all LODs of a 3D texture in one pass as well.
- The improvements do warrant dropping down the number of allowed compute invocations a bit
2019-11-05 22:07:22 +03:00
kd-11 7a0b94f343 vk: Minor compute optimizations
- Remove use of uniform buffers for compute static data. Use push
constants instead.
- Minor touchups to the deswizzle code to avoid redundant data copies.
2019-11-05 22:07:22 +03:00
kd-11 1266b63135 vk: Enable gpu deswizzling 2019-11-05 22:07:22 +03:00
kd-11 9cd3530c98 rsx: Set up framework for hw deswizzle 2019-11-05 22:07:22 +03:00
kd-11 57d3c9e171 rsx: Take empty queries into account for engines that spam report reads.
- Some games will spam the report queue with requests but have zpass
statistics enabled.
2019-11-04 18:48:41 +03:00
kd-11 2a8f2c64d2 rsx: Implement report transfer deferring
- Allow delaying report flushes triggered by image_in or buffer_notify
- When the report is ready, all the delayed transfers will automatically
be done.
- TODO: Make this configurable?
2019-11-04 18:48:41 +03:00
kd-11 3e0f9dff4d vk: Improve zcull synchronization
- Use zcull sync hints more aggressively
2019-11-04 18:48:41 +03:00
kd-11 fe3c290d03 vk: Reimplement occlusion result reading
- Implement partial result reads
2019-11-04 18:48:41 +03:00
kd-11 51e0eaaddc rsx: Implement backend notification for upcoming zcull reads 2019-11-04 18:48:41 +03:00
kd-11 df63de8f16 rsx: Allow u32 restart index with full index width 2019-11-04 16:56:34 +03:00
kd-11 6b3af09fa5 vk: Improved crash message for missing MSAA features 2019-11-04 16:56:34 +03:00
kd-11 bbed791ee0 vk: Add explicit support for identity image views
- Allows bypassing all remap shenanigans to make some operations that
rely on the raw image to work correctly.
2019-11-01 19:35:46 +03:00
kd-11 63bbf11a76 vk: Add video out calibration pass
- Adds gamma correction and RGB range filters to output to match PS3
2019-10-31 14:43:24 +03:00
kd-11 78aefe5b5e rsx/overlays: Add support for other primitive types other than triangle_strips 2019-10-31 14:43:24 +03:00
Nekotekina e3e7051ed3 Minor optimization in BufferUtils.cpp
Don't use PSHUFB for horizontal operations.
Utilize PHMINPOSUW to compute max as well:
 + sse41_hmin_epu16
 + sse41_hmax_epu16
2019-10-30 18:52:34 +03:00
Nekotekina b1968769b7 Minor cleanup in BufferUtils.cpp
Replace inline asm with intrinsic using target attribute trick.
2019-10-30 17:53:51 +03:00
linkmauve cfd5cf6bdb Optimise primitive_restart::upload_untouched() (#6881)
* rsx: Optimise primitive_restart::upload_untouched() with SSE4.1

This optimisation is only applied when skip_restart is false.

I’ve only tested the u16 codepath, as it is the one used in NieR.

In some very unscientific profiling, this function used to take 2.76% of
the total frame time at the save point of the port town, it now takes
about 0.40%.

* rsx: Mark all SSE4.1 functions with attributes on gcc and clang

This assures the compiler we will take care of only calling these
functions after having checked that the CPU does support these
instructions.

* rsx: Add an AVX2 implementation of primitive restart ibo upload

* rsx: Remove redefinition of SSE4.1 instructions

Now that clang is aware that our functions are compiled with SSE4.1, it
lets us generate this code using its intrinsics.

* rsx: Optimise vector to scalar conversion

This is done using minpos and srli intrinsics and generate less code
than before.

Thanks Nekotekina for the suggestion!
2019-10-30 16:42:44 +03:00
kd-11 35794dc3f2 vk: Add checks for alphaToOne support
- This feature is very rarely used, as alphaToCoverage is commonly used as a replacement for blending, not in addition to it.
2019-10-30 01:06:28 +03:00
kd-11 eda09489b2 vk: Optionally ignore depth bounds testing on hardware that does not
support it.
2019-10-29 20:03:54 +03:00
kd-11 7a5c20ef85 vk: Minor spec touchups
- Simplify active instance management. While multicontext support will
be required in future, this is better done with multiple logical devices
rather than multiple instances.
- Destroy the WSI surface on exit
- Enable depthBoundsTest explicitly. TODO: Properly check for supported
features.
2019-10-29 20:03:54 +03:00
kd-11 aa3eeaa417 rsx: Separate subresource_layout:dim_in_block and
subresource_layout::dim_in_texel

- These two are not always linked when working with compressed textures.
The actual texels extend past the actual size of the image if the size
is not aligned. e.g if height is 1, the real height is 4, but its not
possible to determine this from the aligned size. It could be 1, 2, 3 or
4 for example.
- Fixes image out-of-bounds writes when uploading from CPU
2019-10-29 20:03:54 +03:00
Eladash 42fc698186 rsx: Enable primitive restart index only when needed (#6889)
* rsx: Enable primitive restart index only when needed

* rsx: Use if with initializer in read_put()
2019-10-28 23:16:27 +03:00
kd-11 479d92d075 vk: Fix uninitialized (and wrong) variable access 2019-10-28 15:20:45 +03:00
kd-11 b0708367c2 vk: Round lod bias to the nearest 0.5 to lower number of permutations when nearest mipmap sampling is used
- The lambda values will be rounded to the nearest integer anyway
2019-10-28 15:20:45 +03:00
kd-11 3e8dfede1c vk: Modify sampler cache to uniquely identify all the input parameters
- Avoids iteration when variable mipmap counts or lod bias parameters change
2019-10-28 15:20:45 +03:00
kd-11 ad2add9574 rsx:: Use fcmp correctly 2019-10-28 15:20:45 +03:00
kd-11 d04241ad25 rsx: Allow compressed textures to be unaligned in size
- Align based on row length but let the texture itself be of arbitrary dimensions
2019-10-28 15:20:45 +03:00
Emmanuel Gil Peyrot 69e9ee26f6 rsx: Make input_is_swizzled a template parameter
This lowers the relative cost of this function from ~2.25% to ~1.80% on
gcc 9 which I found quite surprising, some of it probably gets inlined
better in the callers, but I haven’t been able to isolate which parts.
2019-10-28 13:28:51 +03:00
kd-11 d53d7bb598 vk: Restore vega native use of FP16 in shaders
- AMD proprietary drivers should work fine
2019-10-23 12:20:06 +03:00
Emmanuel Gil Peyrot 54d95373d0 Support fullscreen properly on Wayland
The current behaviour when going fullscreen from windowed was to keep
the previous size of the swapchain, with black borders on all sides,
which looks quite ugly.

The root of this issue is that rpcs3 only checks for frame resize if
vkQueuePresent() returns VK_SUBOPTIMAL_KHR, which drivers can’t do on
Wayland, see https://gitlab.freedesktop.org/mesa/mesa/issues/1979
2019-10-23 12:19:46 +03:00
kd-11 e04b6cd7c0 rsx: Copypasta fix
- r1 is always float4 never half4. Its a full-width register unlike the
other outputs which are optionally half-width.
2019-10-23 00:50:24 +03:00
kd-11 00bc3fe658 Drop d3d12 backend 2019-10-22 21:45:14 +03:00
Emmanuel Gil Peyrot 14c63ec014 Fix misleading indent. 2019-10-22 16:11:43 +03:00
Eladash 586fe11e22 Fix cellGcm HLE regression
Also correct flags.
2019-10-22 13:45:09 +03:00
Eladash 945abcc6cd rsx: Align down index array offset
* Also use improved to_be_t<> template (recetly ignoring one byte long types) for vm gsl::byte referencing, remove redundent narrow<> cast (same type)
2019-10-22 13:45:09 +03:00
kd-11 3bb70e837a vk: Silly copypasta 2019-10-22 13:44:49 +03:00
kd-11 0b2f9f0f17 rsx: Add support for delayed shader discard.
- Noticed a glitch on AMD hw and windows drivers where discard seems to affect entire 4x4 cells.
- Dead fragments (outside the primitive boundary) could have their discards trigger as they do not have proper access to variables.
- This introduces dead fragments along triangle edges, causing a diagonal line pattern across the screen that is very annoying.
2019-10-22 13:44:49 +03:00
kd-11 901942f24a rsx: Replace pointless f32[4] restriction on texture parameters.
- Use a struct instead to improve readability and remove pointless OpBitCast
2019-10-22 13:44:49 +03:00
kd-11 f7842b765f rsx: Implement packed format renormalization
- Renormalizes arbitrary N-bit values as 8-bit normalized.
- NV hardware performs integer normalization at 8 bits if the size is less than 8.
- This can cause significant arithmetic drift because the error is multiplied by a huge number when sampling.
2019-10-22 13:44:49 +03:00
Eladash 29cddc30f0 rsx: Fix vblank signals flood after Emu.Resume() 2019-10-21 15:31:45 +03:00
Eladash 5de0005f5a rsx: Report full method range on invalid methods
Also report full command on fifo desync event for the first time
2019-10-21 15:31:45 +03:00
eladash 730e9cde84 sys_rsx: Improve allocations and error checks
* allow sys_rsx_device_map to be called twice: in this case the DEVICE address retrived from the previous call returned
* Add ENOMEM checks for sys_rsx_memory_allocate and sys_rsx_context_allocate
* add EINVAL check for sys_rsx_context_allocate if memory handle is not found
* Separate sys_rsx_device_map allocation from sys_rsx_context_allocate's
* Implement sys_rsx_memory_free; used by cellGcmInit upon failure
* Added context_id checks
* Throw if sys_rsx_context_allocate was called twice.
2019-10-21 15:31:45 +03:00
kd-11 3c44065684 gl: Fix copypasta
- MSAA is still unimplemented in OGL
2019-10-20 21:38:40 +03:00
kd-11 f40f2c6215 vk: Fix minification filter description for NEAREST_MIPMAP_NEAREST. Just a typo.
- Also remove mipmap filter for CONVOLUTION
2019-10-20 21:38:40 +03:00
kd-11 09de3b7974 rsx: Tweak behaviour of the "Use GPU texture scaling" option
- If either source data or dest is a render target, do image operations on the GPU same as before
- If swizzle is desired, use CPU fallback
- If no scaling and no format conversion is required, use CPU fallback
- If scaling is desired and the transfer target is in local memory, use the GPU
- When doing trivial copies, use the routine in rsx_methods instead of
  duplicating code. Also has the benefit of better range checking.
2019-10-20 21:38:40 +03:00
kd-11 868547aec8 rsx: Minor improvement to fbo region invalidation
- When commiting a block as fbo, keep blit_dst data as well.
- Avoids removing (and losing data from) blit targets that just happen to share a page with a framebuffer.
2019-10-20 21:38:40 +03:00
kd-11 996534c559 rsx: Fixup for aspect mismatch 2019-10-20 15:25:07 +03:00
Eladash d4ba7f37b6 rsx util: Implement decode_fxp<> 2019-10-18 15:41:39 +03:00
kd-11 299b98b30a vk: Disable mipmap sampling if sampling mode is does not have a mipmap filtering mode.
- GL_LINEAR and GL_NEAREST always sample LOD0 so make vulkan behave the same way
2019-10-18 14:46:37 +03:00
kd-11 404073c74a rsx: Force-align compressed formats to 4x4 texel blocks and disable 1D compressed textures.
- The PS3 allows defining 1D compressed images but this obviously doesn't work well on desktop.
2019-10-18 14:46:37 +03:00
kd-11 eff4e95c99 rsx: Minor cache fixup for cyclic references.
- Logic was broken by mipmaps PR. Do not issue a texture barrier if a temp copy is being done.
2019-10-18 14:46:37 +03:00
kd-11 bd1bcc6be7 vk: Remove a redundant memory barrier 2019-10-18 14:46:37 +03:00
kd-11 70642484cd vk: Check for cyclic references if sampler is marked as do-not-cache.
- Usually an indication of surface/texture cache interaction.
2019-10-18 14:46:37 +03:00
kd-11 eee2237e19 rsx: Track uncached cache resources
- Uncacheable resources can be reused as soon as they're made visible to the draw call.
- Since they're likely to be reused every draw call until the shader changes, it is important to reuse as much as possible
2019-10-18 14:46:37 +03:00
kd-11 decf9cfcf6 rsx: Notify the backend to release or delete temporary surfaces after we're done with them. 2019-10-18 14:46:37 +03:00
kd-11 97ed95d21b vk: Add video memory manager to monitor VRAM usage 2019-10-18 14:46:37 +03:00
kd-11 1046184dd0 rsx: Fix some uninitialized variables flagged by valgrind 2019-10-18 00:32:38 +03:00
kd-11 5af8a9fbbc rsx: Fix decoding of some fixed point texture parameters
- Checked envydocs and found the correct format as fixed-point 4.8 with optional sign bit
2019-10-17 18:18:00 +03:00
kd-11 a936e43ff6 rsx: Fixup for slice gathering for structures with multiple mipmap levels
- TODO: Proper multi-level assembly for non-2D structures
2019-10-17 18:18:00 +03:00
kd-11 e47b4ffb8f rsx: Fix rsx capture crash.
- Pixel coordinates are top-left not bottom-right
- Solves out of bounds access
2019-10-17 18:18:00 +03:00
kd-11 e166dbccc8 rsx: Fix visibility of blit destination targets 2019-10-17 18:18:00 +03:00
kd-11 0c35595ce2 rsx: Remove the alpha-to-coverage hack that was added to hide the missing mipmaps in games
- Moves to a purely stochastic function using dithering to simlulate coverage
2019-10-17 18:18:00 +03:00
kd-11 f0ed0285f3 rsx: Implement range-based subresource descriptor cache
- The previous address-based approach was pretty awful when it comes to invalidating
2019-10-17 18:18:00 +03:00
kd-11 fbb9ed4e25 rsx: Add explicit range to cached subresource descriptors 2019-10-17 18:18:00 +03:00
kd-11 c9e3a321b2 rsx: Fixup for surface cache scanning
- Fix regression when gathering cubemaps
2019-10-17 18:18:00 +03:00
kd-11 1ac976771c rsx: Add some texture search options for the cache
- Potentially optimizes texture cache searching using explicit options
2019-10-17 18:18:00 +03:00
kd-11 840b52fe80 rsx: Implement mipmap gathering from texture cache 2019-10-17 18:18:00 +03:00
kd-11 d6d8766f8d rsx: Refactoring
- Move some helper routines out of the cache core
- Prep for multi-layered image search
2019-10-17 18:18:00 +03:00
kd-11 cb362b4085 rsx: Runtime check on RTT cast 2019-10-17 02:30:03 +03:00
kd-11 5c7bbb3354 vk: Fixup
- Removes incorrect line writing stencil flags to a regular texture.
2019-10-17 02:30:03 +03:00
kd-11 d29b6cdb59 vk: Proper workaround for VEGA float16_t bugs 2019-10-16 22:40:50 +03:00
kd-11 a6e143254a vk: Add workaround for broken format conversion in older GeForce cards 2019-10-16 22:40:50 +03:00
kd-11 4f088a102c vk: Add kepler and maxwell tables 2019-10-16 22:40:50 +03:00
plappermaul 2171ffdab2 minor optimization for FIFO_control::read_put() (#6768) 2019-10-14 21:26:31 +03:00
kd-11 42aa4c5000 gl: Vendor-specific tuning 2019-10-13 19:00:05 +03:00
kd-11 776fa54d22 gl: Fix missing case 2019-10-13 19:00:05 +03:00
kd-11 27f48fbc06 gl: Rewrite image transfer operations to support image subregions
- Working exclusively with full sized images is very expensive
2019-10-13 19:00:05 +03:00
kd-11 d9a9766e41 gl: Refactoring and fallback support for compute acceleration 2019-10-13 19:00:05 +03:00
kd-11 b39bfa02a6 gl: Windows bringup 2019-10-13 19:00:05 +03:00
kd-11 105d4b51e6 gl: Use compute shaders for typeless texture decode 2019-10-13 19:00:05 +03:00
kd-11 7a6e2e716f gl: Add a framework for compute shaders 2019-10-13 19:00:05 +03:00
Markus Stockhausen 4d99169d51 Patch v2 for vkCreateInstance()
as requested
2019-10-11 21:16:36 +03:00
Markus Stockhausen 8adcb8046b Patch for vkCreateInstance()
patch as requested
2019-10-11 21:16:36 +03:00
Markus Stockhausen f5817cb430 Error handling for vkCreateInstance()
Cry in log if initialization failed.
2019-10-11 21:16:36 +03:00
Eladash 397007cf8b rsx: Fix FIFO_DRAW_BARRIER substituation 2019-10-11 12:34:53 +03:00
Eladash 9242f16560 rsx: Improve FIFO recovery from flip 2019-10-10 19:34:23 +03:00
Eladash 06017cb14e rsx: Recover from invalid writes to CELL_GCM_NV4097_SET_INDEX_ARRAY_DMA
Also: Trigger a FIFO recovery when encountering an invalid method.
2019-10-10 19:34:23 +03:00
Eladash 2eaf5df60b rsx: Register some more methods 2019-10-10 19:34:23 +03:00
kd-11 305a5bd717 typo fix 2019-10-05 12:01:46 +03:00
kd-11 4a19a2dd24 rsx: Explicity describe transfer regions for both source and destination blocks 2019-10-04 18:10:46 +03:00
kd-11 7aed9c3f13 gl: Add missing input declarations for 2-sided lighting 2019-09-30 21:52:43 +03:00
kd-11 88229f4716 gl: Remember to unbind attachments from active framebuffer after clear
- If a stale reference is left lying around (e.g the texture bound to
depth has been deleted and we attach a color image) no operations
actually take place. glCheckFramebufferStatus also does not catch this
problem.
2019-09-30 21:52:43 +03:00
Eladash 0b2fa6ffdc rsx: Flush FIFO GET before smeaphore_acquire 2019-09-30 17:30:15 +03:00
Eladash 70b4ae6bd6 rsx: Optimize FIFO PUT masking 2019-09-30 17:30:15 +03:00
kd-11 bcf8799079 rsx: Fix missing point size export
- Sometimes program-point-size is enabled, but the vs does not actually
write to the point size register. In this case, pass the incoming point
size along instead of the default register init.
2019-09-30 01:40:04 +03:00
Eladash 319fc8c55d rsx: Mask FIFO PUT on rsx execution 2019-09-29 13:05:24 +03:00
Eladash 822287b418 rsx: Avoid unsigned/signed mismatch with fifo ret addr 2019-09-29 13:05:24 +03:00
kd-11 8cfd3b56d6 vk: Increase wait timeout in case of problematic GPU loads causing heavy stutter
- When compiling LLVM objects, it is possible to starve the driver thread and cause the timeouts to trigger
- Observed in RE6 when using SPU LLVM since the game generates a very large number of objects "infinitely"
2019-09-29 11:39:22 +03:00
kd-11 ef5b56bc48 rsx: Align width properly when normalizing to avoid fractional results being lowered to 0 2019-09-29 11:39:22 +03:00
kd-11 69c090b14a vk: Check frame descriptors before rendering in case of a flip request between begin() and end()
- There is no reason to delay async flip requests since most of the work can be handled during rendering anyway
2019-09-29 11:39:22 +03:00
kd-11 1464069476 rsx: Restructure deferred flip queue handling
- Allows frameskipping to occur naturally if RSX thread is bombarded with flip requests but just jumping to the last one if possible
- See request_emu_flip() for async frame submission and implicit skipping
- Also allows display queue to fill faster than the flip thread can drain the queue
2019-09-28 21:13:56 +03:00
Nekotekina bd1a24b894 Tidy endianness support (se_t) implementation
Move se_t and se_storage to util/endian.hpp
Use single template instead of two specializations.
Add minor optimization for MSVC.
Remove v128 dependency.
Try to enable intrinsics for unaligned data.
Fix minor bug in u16/u32/u64 specializations.
2019-09-28 15:39:50 +03:00
kd-11 2275259bf5 rsx: Properly scale overlay passes to match drawable area 2019-09-28 13:24:14 +03:00
kd-11 28534e8833 gl: Remove a debug print 2019-09-28 13:24:14 +03:00
kd-11 e53e98749f rsx: Add missing initialization 2019-09-27 21:07:56 +03:00
Nekotekina 3c72069ae6 cellOskDialog: use g_fxo 2019-09-26 23:26:36 +03:00
Nekotekina 5f9c5e8765 Use g_fxo for rsx::thread 2019-09-26 23:26:36 +03:00
kd-11 ee0633f43a vk: Add turing workaround
- Turing crashes if using the depth->color transfer hack
2019-09-26 20:12:25 +03:00
kd-11 acc986be3f vk: Add chip family detection 2019-09-26 20:12:25 +03:00
Jan Beich 5ec35c7daa rsx: unbreak build with Clang 9
ld: error: rpcs3/CMakeFiles/rpcs3.dir/main_application.cpp.o: unable to find library from dependent library specifier: opengl32.lib
ld: error: rpcs3/Emu/librpcs3_emu.a(GLGSRender.cpp.o): unable to find library from dependent library specifier: opengl32.lib
ld: error: rpcs3/Emu/librpcs3_emu.a(GLRenderTargets.cpp.o): unable to find library from dependent library specifier: opengl32.lib
ld: error: rpcs3/Emu/librpcs3_emu.a(GLVertexBuffers.cpp.o): unable to find library from dependent library specifier: opengl32.lib
2019-09-24 01:00:45 +03:00
Megamouse 7193d407b9 Input: Remove unused flush member 2019-09-20 22:12:40 +02:00
kd-11 1a892c6b1b rsx: Avoid recursion in flip handler 2019-09-20 15:08:41 +03:00
Megamouse aa7eb1536a overlays: fix enter button assignment in osk 2019-09-20 10:53:09 +02:00
kd-11 e0005ec347 rsx: Refactoring and improvement
- Separate displayed statistics from actual backend statistics.
  Allows asynchronous flipping to work correctly as it just uses display stats.
  The real stats are used by the frame scope marker to determine behavior like engaging the FIFO optimizer or skipping draw calls correctly.
2019-09-19 23:10:09 +03:00
kd-11 2c76f47eec rsx: Restructure flip code and frame scoping
- Add an explicit frame scope marker tied in with the queue_prepare command
  Since queue_prepare is emitted at the end of a frame, it can be used as end-of-frame in games that emit this
- If this command is not emitted, fifo flatenner and frameskip will not work
2019-09-19 23:10:09 +03:00