Commit graph

4256 commits

Author SHA1 Message Date
Megamouse e0baad417a perfoverlay: fix minimal graph min/max calculation 2023-01-27 00:13:20 +01:00
kd-11 5f0467b084 rsx: Remove framebuffer_status_valid flag and move to state 2023-01-26 11:42:39 +03:00
kd-11 6adcabda29 rsx: Fix graphics state foot-gun 2023-01-26 11:42:39 +03:00
Megamouse 44771150b7 overlays: add simple home menu 2023-01-21 09:11:53 +01:00
Megamouse ac2b2d82d2 overlays/osk: move pointer variables to fxo 2023-01-20 23:41:56 +01:00
Megamouse 11c42eb8d4 overlays/osk: add analog movement if CELL_OSKDIALOG_NO_INPUT_ANALOG is unset 2023-01-20 23:41:56 +01:00
Megamouse dc0230c476 overlays/osk: Fix layout and positioning 2023-01-20 23:41:56 +01:00
Megamouse f659338e5e overlays/osk: implement first osk pointer 2023-01-20 23:41:56 +01:00
Megamouse 4a82d81efe overlays/osk: implement scaling 2023-01-20 23:41:56 +01:00
Megamouse 6b4208be9b overlays/osk: align osk position 2023-01-20 23:41:56 +01:00
Megamouse 34df4509af overlays/osk: implement "support languages"
Some languages/panels in the osk need to be activated by the developer.
They are not available otherwise.
So let's check if they were pre-configured and only add the panels if they are supported.
2023-01-20 23:41:56 +01:00
Megamouse 709305df0e overlays: fix indentation 2023-01-18 00:24:00 +01:00
kd-11 719e7a9d56 rsx: Fix inadvertent signal override for MSAA 2023-01-17 02:24:21 +03:00
kd-11 eed9e56bf4 rsx: Allow vertex fetch from uninitialized register 2023-01-17 02:24:21 +03:00
xperia64 240cb2d627 Add output scaling filtering options, migrate FSR checkbox to these options 2023-01-16 13:52:51 +01:00
kd-11 bd69466e94 rsx: Fix some pipe state signal propagation routines 2023-01-16 15:20:53 +03:00
kd-11 6809d84a00 vk: Bump max number of supported inline draw calls to 32k
- Surprisingly some games actually exhaust the entire 16k pool causing slowdown
2023-01-11 16:48:53 +03:00
kd-11 2752cd1390 rsx/vk: Fix some problems with dynamic state updates 2023-01-11 16:48:53 +03:00
kd-11 10b56415e8 vk: Avoid loading the whole dynamic state properties if only the shader changed
- Handles a common case where a game engine switches materials but uses the same configuration
  e.g. rendering two types of wall or ground may need different shaders but similar state properties
2023-01-11 16:48:53 +03:00
kd-11 bd87c80943 rsx: Simplify the debug overlay print text routines.
- Greatly simplifies adding text
2023-01-11 16:48:53 +03:00
kd-11 f71e7ef1cc vk: Switch programs if the primitive type changed
- This will change when EXT_dynamic_state is integrated
2023-01-11 16:48:53 +03:00
kd-11 756ad17c2c Fix GCC11 compilation 2023-01-11 16:48:53 +03:00
kd-11 29c1b20b41 Fix compilation 2023-01-11 16:48:53 +03:00
kd-11 aa5097e0d4 glsl: Update fog enums in shaders 2023-01-11 16:48:53 +03:00
kd-11 2ccfee2e45 rsx: Propagate decode failures up the chain.
- Dumping invalid data should not crash
2023-01-11 16:48:53 +03:00
kd-11 bf1311b902 Fix GCC compilation 2023-01-11 16:48:53 +03:00
kd-11 71efb3bc84 rsx: Use gcm cast to handle input enum validation 2023-01-11 16:48:53 +03:00
kd-11 439bdde849 rsx: Fix printing of expected values 2023-01-11 16:48:53 +03:00
kd-11 c7fed20f3c vk: Short-circuit program load if state did not change
- TODO: Incorporate VK_EXT_extended_dynamic_state
2023-01-11 16:48:53 +03:00
kd-11 3dd6e5664c rsx: Do not call a dynamic function to simply test-and-set. Do it inline. 2023-01-11 16:48:53 +03:00
kd-11 d4ee308ffd vk: Fix rare crash when handling mixed depth format types 2023-01-11 16:48:53 +03:00
kd-11 a272f3e3b9 rsx: Improve performance by using an integral type to indicate error 2023-01-11 16:48:53 +03:00
kd-11 f6027719d2 rsx: Fix vertex decode 2023-01-11 16:48:53 +03:00
kd-11 38402e78c0 rsx: Fixup vertex enums in shaders 2023-01-11 16:48:53 +03:00
kd-11 eae1ac6558 refactor: Fix build 2023-01-11 16:48:53 +03:00
kd-11 0b019401bd Refactor gcm enums 2023-01-11 16:48:53 +03:00
kd-11 73cda2324a rsx/lv2: Refactor DMA control stuff after VSH work 2023-01-11 16:48:53 +03:00
kd-11 3dba894369 rsx: Minor refactoring RSXThread
- Part 1 of many
2023-01-11 16:48:53 +03:00
Elad Ashkenazi 36a55660bf
Unbreak BSD 2023-01-09 20:20:13 +01:00
Elad Ashkenazi 0946e5945f
VSH Improvements (#13172)
* sys_prx: Implement PRX LIB register syscall

* VSH: partial log spam fix

* sys_process reboot fix

* Implement sys_memory_container_destroy_parent_with_childs

* sys_net: Implement SO_RCVTIMEO/SO_SENDTIMEO

* VSH: Implement sys_rsx_context_free

* PPU LLVM: distinguish PPU cache exec also by address

Fixes referencing multiple PRX.

* UI: Do not report size of apps inside /dev_flash
2023-01-09 20:03:01 +03:00
kd-11 7423abb136 rsx: Remove incorrect hack 2023-01-02 23:03:39 +03:00
kd-11 9d432187aa vk: Fix bug that made fall-out barriers never get triggered 2022-12-28 17:37:50 +03:00
kd-11 b13165f95a vk/rtts: Account for corner case where the same texture can be bound to more than 1 slot 2022-12-28 17:37:50 +03:00
kd-11 110c20d25f vk: Restructure framebuffer loop barrier management 2022-12-28 17:37:50 +03:00
kd-11 4def7f143c rsx: Fix logicOp behavior when blending is also active 2022-12-27 02:56:43 +03:00
kd-11 908d524631 vk: Add some missing PCI IDs 2022-12-27 02:00:28 +03:00
kd-11 41e9e0b965 rsx: Restructure color format enum to clearly separate float from int formats 2022-12-19 23:13:25 +03:00
kd-11 388d090b91 rsx: Propagate surface format changes to shader ROP control 2022-12-19 23:13:25 +03:00
kd-11 04fb86556a rsx: Fix surface metadata life-cycle
- Beware of clone operations. Blindly inheriting the parent's metadata is wrong.
- It is possible, especially when reusing a pre-existing slice, that the parent and child info has diverged
2022-12-17 20:16:58 +03:00
kd-11 90cf47cdce rsx: Handle some corner cases in surface locking 2022-12-17 20:16:58 +03:00
kd-11 bf96cbe980 rsx: Fix const RTV/DSV cast from texture cache 2022-12-17 20:16:58 +03:00
kd-11 66dc1cc15d rsx: Conditionally skip flush if no new data was introduced 2022-12-17 20:16:58 +03:00
kd-11 a05e3f02b8 rsx: Avoid expensive protection scan by sharing some data between surface and texture cache 2022-12-17 20:16:58 +03:00
Eladash 8980fc5524 rsx: Fix exceptions 2022-12-17 14:27:20 +01:00
kd-11 cebc0ec4a1 vk: Add missing memory barrier 2022-12-17 13:10:32 +03:00
kd-11 7e35679ec2 vk: Revise some TRANSFER->TRANSFER barriers that introduced RAW hazards when copying images 2022-12-14 03:24:37 +03:00
kd-11 b39f457363 vk: Zero-initialize scratch VRAM allocations 2022-12-14 03:24:37 +03:00
kd-11 2d5a427bd4 gl: Throw exception if we cannot initialize critical requirements 2022-12-12 14:23:06 +03:00
kd-11 26021e11f7 gl: Require GLSL 450 when using barycentric extension 2022-12-11 15:21:58 +03:00
kd-11 55886b0a50 gl: Fix shader extension requirements 2022-12-11 15:21:58 +03:00
kd-11 577b5ef2bd Support compiling with older SDK headers 2022-12-11 15:21:58 +03:00
kd-11 780c38a5e5 gl: Silence compiler warning spam 2022-12-11 15:21:58 +03:00
kd-11 6756bf7d4b rsx: Only request attribute interpolation if the GPU requires it and the driver supports it 2022-12-11 15:21:58 +03:00
kd-11 9c0b2338cf rsx: Fix shader compilation 2022-12-11 15:21:58 +03:00
kd-11 a0ef1a672c rsx: Implement interpolation using barycentrics 2022-12-11 15:21:58 +03:00
kd-11 1fd265d316 rsx: Properly flag the program control if needed 2022-12-11 15:21:58 +03:00
kd-11 e3b23822fd rsx: Pass on shader flags to the cache 2022-12-11 15:21:58 +03:00
Eladash 151a0955cf rsx: Implement draw call stepping 2022-12-10 15:09:42 +01:00
Eladash 40406bd3fe RSX debugger: Implement Texture Dumper
Also fix many bugs in textures display.
2022-12-10 15:09:42 +01:00
shinra-electric 809e880bd1
[3rdParty] Update MoltenVK to 1.3.236 & set MSL Fastmath to On Demand (#13035)
* Update MoltenVK to 1.3.236

* Change mvk_config.fastMathEnabled from a bool to Int

fastMathEnabled now has three options:
NEVER = 0
ALWAYS = 1 
ON_DEMAND = 2

On demand seems better, since it will use fast math except for shaders that are incompatible.
2022-12-09 20:49:56 +01:00
Megamouse b0e376ae76 rsx/qt: add recording to game window 2022-12-08 21:08:37 +01:00
Nekotekina 7c15001042 Implement read_from_ptr<>() util
Doing std::bit_cast on a "span".
Should be usable in constexpr.
2022-11-26 09:30:11 +03:00
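The read_from_ptr<>() entry above describes reading a typed value out of a span of bytes. A minimal sketch of that idea follows, under stated assumptions: the name read_value_at and its signature are hypothetical and not the RPCS3 API, and std::memcpy stands in for std::bit_cast, so this version is not usable in constexpr.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper (not the RPCS3 signature): copy sizeof(T) bytes that
// start at byte offset `pos` in a raw buffer and return them as a T.
// std::memcpy is used in place of std::bit_cast, so the read is safe for
// unaligned offsets but cannot be evaluated at compile time.
template <typename T>
T read_value_at(const unsigned char* data, std::size_t pos = 0)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    T result;
    std::memcpy(&result, data + pos, sizeof(T));
    return result;
}
```

The caller is responsible for making sure that pos + sizeof(T) stays within the buffer; the sketch does no bounds checking.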
kd-11 8be4ac6869 gl: Fix rotation operations in blit engine 2022-11-22 12:15:18 +03:00
kd-11 81f9259063 gl: Add support for capture debug markers 2022-11-22 12:15:18 +03:00
kd-11 a97424d46c rsx: Fix low precision shader option 2022-11-22 12:15:18 +03:00
kd-11 c4b259e0f8 rsx: Always enable ROP output quantization on NV 2022-11-18 23:06:47 +03:00
kd-11 e04855a0da rsx: Improve ROP output handling
- Perform 8-bit quantization/rounding before emulated operations like ALPHA_TEST
2022-11-18 23:06:47 +03:00
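The ROP entry above mentions 8-bit quantization before emulated operations such as ALPHA_TEST. A rough sketch of what that ordering can look like follows. The function names, the round-to-nearest mode, and the choice to quantize the reference value as well are all assumptions made for illustration, not taken from the RPCS3 shaders.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: clamp a normalized float to [0, 1] and round it to the
// nearest 8-bit step, the way an 8-bit-per-channel target would store it.
inline std::uint8_t quantize_unorm8(float v)
{
    const float clamped = std::min(std::max(v, 0.0f), 1.0f);
    return static_cast<std::uint8_t>(std::lround(clamped * 255.0f));
}

// Hypothetical emulated alpha test (GREATER): both operands are compared
// after quantization, so differences smaller than one 8-bit step vanish.
inline bool alpha_test_greater(float alpha, float ref)
{
    return quantize_unorm8(alpha) > quantize_unorm8(ref);
}
```

With this ordering, an alpha of 0.501 against a reference of 0.5 fails the test because both land on the same 8-bit value, whereas a plain float comparison would pass it.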
MSuih 3f8421fc17 Add enable exclusive fullscreen mode setting 2022-11-14 17:50:13 +01:00
kd-11 5943b802d7 grammar 2022-11-11 12:09:23 +03:00
kd-11 e98b07de03 vk: Set line width when rasterizing points (workaround)
- Fixes point rendering when using AMD drivers.
2022-11-07 23:12:31 +03:00
kd-11 de5217745c gl: Fix point size export 2022-11-07 23:12:31 +03:00
Nekotekina ae809ad320 Unexpected bugfixes
Mostly unaligned memory access.
Also includes workarounds for ubsan execution.
2022-10-31 14:20:02 +03:00
kd-11 b156b40f8f rsx: Fix clear color for formats with less than 32-bit width 2022-10-31 13:39:37 +03:00
Megamouse a38e144320 overlays: use the system keyboard layout for osk 2022-10-29 22:56:08 +02:00
Megamouse 059c45f202 overlays: implement osk keyboard cursor actions 2022-10-29 22:56:08 +02:00
Megamouse eccceea7fb overlays: implement osk delete action 2022-10-29 22:56:08 +02:00
Megamouse ad340c3007 overlays/osk: Implement fallback for unknown keys
Note that those keys won't be passed to the cellOsk event hook callback
2022-10-29 22:56:08 +02:00
Elad Ashkenazi c214f45e14
Savestates/rsx/IO: Resume emulation on long START press, enable "Start Paused" by default (#12881)
* Savestates: Enable "Start Paused" by default
* Emu/rsx/IO: Resume emulation on long START press
* rsx: fix missing graphics with savestates' "Start Paused" setting
* rsx/overlays: Add simple reference counting for messages to hide them manually
* Move some code in Emulator::Pause() so thread pausing is the first thing done by this function
2022-10-29 19:53:00 +02:00
shinra-electric edb7991979 Remove MVK Semaphore Support Style options
This line is no longer needed as MVK will select the appropriate support style automatically. 

See https://github.com/KhronosGroup/MoltenVK/pull/1738
2022-10-25 07:22:44 +02:00
Eladash 18e30c7e44 rsx: Implement custom fractional frame limit 2022-10-24 00:10:37 +02:00
kd-11 2c41eecdb1 rsx: Force position invariance on GPUs where it matters 2022-10-24 00:49:44 +03:00
kd-11 fcc7a7452a vk: Fix scratch buf size calculation when uploading DSVs 2022-10-22 15:11:40 +03:00
kd-11 1bb0caed6f gl: Add missing memory barrier after texture decode 2022-10-22 15:11:40 +03:00
Megamouse ddd261c943 Input: refactor vibration
There's no need to deal with vibration levels outside of the handlers.
All we need to know is the 0-255 DS3 range which is given by the u8 type.
2022-10-21 23:42:01 +02:00
kd-11 bd9c876e36 gl: Handle clip plane switching using API calls and the state tracker 2022-10-21 13:45:45 +03:00
kd-11 04f6302ecc Fix decode shader compilation 2022-10-16 19:58:30 +03:00
kd-11 1df977fae2 gl: Avoid including unnecessary headers 2022-10-16 19:58:30 +03:00
kd-11 9105c2cf4a gl: Refactor capabilities and add GLSL version detection support. 2022-10-16 19:58:30 +03:00
kd-11 6d43fcf8fb gl: Fall back to renderpass decoder on ATI drivers 2022-10-16 19:58:30 +03:00
kd-11 0737c788fc rsx: Fix parsing of broken command streams with hanging begin/end commands without a pair.
- While these are game bugs, the parser shouldn't break on encountering them.
2022-10-12 11:19:52 +03:00
kd-11 3fe9aea5b5 rsx/overlays: Allow some basic communication from the UI components to the backend renderers 2022-10-11 23:13:12 +02:00
Megamouse ab6ba848b8 overlays: simplify overlay_media_list_dialog 2022-10-11 23:13:12 +02:00
kd-11 65d20f2d08 gl: Add mesa support for polygon offset 2022-10-11 14:00:34 +03:00
kd-11 a229e30b08 rsx: Implement RSX-compliant polygon offset 2022-10-11 14:00:34 +03:00
kd-11 d246a37b11 rsx: Move fp16 toggle to a global shader precision option 2022-10-11 14:00:34 +03:00
Elad Ashkenazi 92b08a4faf
rsx: Fixup a bug after mfc list optimization (#12782) 2022-10-10 04:04:41 +03:00
Eladash a6dfc3be2f SPU: Enable the MFC list optimization for Atomic RSX FIFO 2022-10-09 19:27:46 +03:00
kd-11 d6d7ade6e3 vk: Reload state on dynamic state changed 2022-10-09 03:00:39 +03:00
Elad Ashkenazi e0df2c584f rsx: Attempt to fix frame limiter 2022-10-09 01:33:40 +03:00
kd-11 3c88477270 Fixup for scissor/viewport invalidation rules 2022-10-07 15:27:54 +03:00
kd-11 df46e5137c gl: Fix texture reconstruction logic
- Use correct target types
- Fix key generation to apply differently for each target type
2022-10-07 11:53:34 +03:00
kd-11 ffe8133865 vk: Avoid unnecessary dynamic state updates 2022-10-07 11:53:34 +03:00
kd-11 7140e82189 rsx: Fix program invalidation rules 2022-10-07 11:53:34 +03:00
kd-11 87411da95f gl: Explicitly declare gl_Position as invariant when using MESA 2022-10-06 06:41:24 +03:00
Eladash 9b5cc7cda7 System.cpp: Fix RSX thread abort 2022-10-04 14:14:38 +03:00
kd-11 73784b9e12 Fix GCC build 2022-10-03 12:57:16 +03:00
kd-11 533f960854 rsx: Handle some more corner cases 2022-10-03 12:57:16 +03:00
kd-11 765208a181 rsx: Avoid clobbering CELL memory when splitting fbos 2022-10-03 12:57:16 +03:00
kd-11 4417701ea7 rsx: Track orphaned surfaces' parent addresses 2022-10-03 12:57:16 +03:00
kd-11 f66eaf8f44 rsx: Add some handy util functions to simple_array 2022-10-03 12:57:16 +03:00
kd-11 a0e2a3db1d Fix underflow in ZCULL sync 2022-09-30 23:44:37 +03:00
kd-11 102d30db2d vk: Update support for framebuffer loops to comply with current spec 2022-09-28 12:55:31 +03:00
kd-11 5281a85b67 rsx: Fix compiler warnings 2022-09-28 12:55:31 +03:00
kd-11 de28c812e8 rsx: Re-evaluate color MRT setup when the surface target type changes 2022-09-28 12:55:31 +03:00
kd-11 67c02e3522 vk: Bump compute descriptor pool size to 8k
- TODO: This should be dynamic.
2022-09-27 14:58:47 +03:00
kd-11 19dd2a693b gl: Fix transform job assert 2022-09-27 14:58:47 +03:00
Nekotekina 6ff6a4989a Implement at32() util
Works like .at() but uses source location for "exception".
2022-09-26 18:04:15 +03:00
kd-11 dd8a337b14 rsx: Fix some more warnings 2022-09-22 23:46:48 +03:00
kd-11 0572d44996 gl: Fix enum collision 2022-09-22 23:46:48 +03:00
kd-11 38aa116c59 Fix build 2022-09-22 23:46:48 +03:00
kd-11 61666bae69 rsx: Fix hardware deswizzle not getting used when hardware deswizzle flag is not set 2022-09-22 23:46:48 +03:00
kd-11 362a26a404 gl: Fix D24X8 accelerated encode/decode
- PS3 D24X8 is swapped as a full word, unlike PC.
- Add missing paths to handle custom swap behavior.
2022-09-22 23:46:48 +03:00
kd-11 81fa3da101 gl: Minor optimization around test..set patterns in the state tracker 2022-09-22 23:46:48 +03:00
nastys acc2fea7e3
Update MoltenVK to 250e1f9 and single queue (#12620) 2022-09-20 11:12:27 +03:00
kd-11 3dc7b64fa1 rsx: Fix initialization of null cubemap resources 2022-09-19 19:13:46 +03:00
kd-11 79f2c21dfb gl: Restrict compute image bindings to [0-8]
NVIDIA only supports 8 compute image slots even on modern GPUs.
2022-09-19 01:37:10 +03:00
kd-11 df36c44bc2 gl: Avoid UBO/SSBO binding index collisions
- Some drivers don't like this. Actually only RADV.
- Almost all GPUs going back 15 years have a large number of UBO slots but limited SSBO slots.
  Move UBO slots up as we have tons more headroom there.
2022-09-19 01:37:10 +03:00
Nekotekina c4db65cc08 Fix one more warning 2022-09-18 18:35:17 +03:00
Nekotekina b49a1f27eb Warning fixes 2022-09-17 16:35:02 +03:00
Eladash c8199de188 CPU preemption control: Improve stutter elimination 2022-09-16 18:57:55 +03:00
Eladash 2e9ee81dcd CPU preemption control: Improve analysis 2022-09-16 18:57:55 +03:00
Eladash cf4da5c4d1 CPU preemption control: bugfixes 2022-09-16 18:57:55 +03:00
Eladash 9c5108c1ca CPU preemption control: Add one more debug variable 2022-09-16 18:57:55 +03:00
Eladash ec7b18dab5 Implement independent CPU preemptions 2022-09-13 19:28:20 +03:00
kd-11 572a2a06d1 rsx: Properly reset occlusion counters even when the register is not in use. 2022-09-12 17:15:06 +03:00
kd-11 d686b48f65 rsx: Simplify FIFO concurrent access. 2022-09-09 23:17:27 +03:00
kd-11 f319362e35 vk: Fix queue concurrency behavior for images 2022-09-09 23:17:27 +03:00
kd-11 940e726754 rsx: Minor FIFO cleanup 2022-09-09 23:17:27 +03:00
kd-11 f43824762a rsx: Get rid of an allocation in analyse_vertex_data that adds about 5% overhead.
This method is called many thousands of times per frame and that single allocation introduces a small perf hit.
Just get rid of it, it doesn't improve anything to have it there.
2022-09-09 23:17:27 +03:00
kd-11 cd53bb7eff rsx: Avoid on-the-fly ZCULL allocations with unordered_map 2022-09-09 23:17:27 +03:00
Eladash 274386a078 rsx: Add some debugging information 2022-09-07 18:39:32 +03:00
Nekotekina 5985f0eefa BufferUtils: cleanup regarding ARM64 2022-09-07 17:59:07 +03:00
Nekotekina 82258915da BufferUtils: rewrite remaining intrinsic code with simd_builder 2022-09-07 17:59:07 +03:00
Nekotekina 11a1f090d3 BufferUtils: simd_builder refactoring
Some simplifications implemented.
2022-09-07 17:59:07 +03:00
Elad Ashkenazi 290226539f
Fix ARM build (#12606) 2022-09-04 21:11:04 +03:00
Eladash 11a197a387 Savestates/RSX: fix unintentional vblank thread spin after abort 2022-09-01 20:09:28 +03:00
Eladash ee1384341e rsx: Implement atomic vertex upload (with Strict Rendering Mode) 2022-09-01 20:09:28 +03:00
Nekotekina 58e3232710 BufferUtils: Fix regression in upload_untouched 2022-09-01 17:39:04 +03:00
Nekotekina e28707055b Implement simd_builder for x86
ASMJIT-based tool for building vectorized loops (such as ones in BufferUtils.cpp)
2022-08-28 18:38:52 +03:00
kd-11 1fc0191311 Fix build 2022-08-23 23:49:46 +03:00
kd-11 1f9e04f72d rsx/vk: Implement flushing surface cache blocks to linear mem 2022-08-23 23:49:46 +03:00
kd-11 bca833dad7 Fix surface reuse 2022-08-20 01:23:15 +03:00
kd-11 f981e05908 rsx: Do not lie about surface details 2022-08-20 01:23:15 +03:00
kd-11 b5abd777b0 rsx: Allow longer dispatch queues to accommodate games with high draw call count 2022-08-19 20:29:32 +03:00
Elad Ashkenazi b2c9add47e rsx: Fix semaphore timeout on boot
Allow semaphore timeout to be disabled again.
2022-08-19 15:40:20 +03:00
kd-11 a401a192b8 Fixup for dst_stage 2022-08-19 14:29:20 +03:00
kd-11 ad1b007dd1 Fix whitespace 2022-08-19 14:29:20 +03:00
kd-11 71e35c8b4d vk: Implement support for VK_EXT_attachment_feedback_loop_layout 2022-08-19 14:29:20 +03:00
kd-11 2e504b2dac rsx: Silence some warnings 2022-08-19 14:29:20 +03:00
kd-11 bacf518189 rsx: Fix 2D intersection tests 2022-08-14 23:53:50 +03:00
kd-11 b960ce1426 vk: Align write length when pre-filling buffers with constant patterns 2022-08-14 23:53:50 +03:00
kd-11 c55a889c23 vk: Initialize buffer info blocks to avoid null descriptors 2022-08-14 23:53:50 +03:00
Eladash 4464a6c3f6 CG-Disasm: Name input/output vertex arrays 2022-08-12 15:20:48 +03:00
Elad Ashkenazi c4cc0154be LV2: Optimizations and fixes
Fix and optimize sys_ppu_thread_yield

Fix LV2 syscalls with timeout bug. (use ppu_thread::cancel_sleep instead)

Move timeout notification out of mutex scope

Allow g_waiting timeouts to be awaked in scope
2022-08-11 11:42:16 +03:00
kd-11 c51d3b5465 Workaround for msvc weirdness 2022-08-09 18:32:54 +03:00
kd-11 e179adc4a0 rsx: Refactor surface cache storage 2022-08-09 18:32:54 +03:00
kd-11 61a055a1c6 Tuning 2022-08-07 22:14:49 +03:00
kd-11 64b4cfa59f rsx: Erase surface background when reloading after a pitch mismatch 2022-08-07 22:14:49 +03:00
kd-11 c799ffd223 rsx: Stubs for pitch conversion 2022-08-07 22:14:49 +03:00
kd-11 2445ab8d8e Fix RSX capture playback 2022-08-04 19:01:45 +03:00
kd-11 3e923b4993 rsx: Optimize VTX_FMT_SNORM16 decoding
- Cuts down SNORM16 overhead by ~65%
2022-08-03 23:33:31 +03:00
kd-11 8181498d86 gl: Alias UBO/SSBO slots to avoid exceeding the available number of binding slots.
- The sets are different anyway and should not overwrite each other in a proper driver.
2022-08-03 23:33:31 +03:00
kd-11 57dd611111 gl: Fix incomplete stencil view of depth-stencil texture
- Samplers must use point sampling for stencil views
2022-08-03 23:33:31 +03:00
Eladash b3162bd41c rsx/vp: Fix SNORM16 vertex decoding 2022-08-03 18:11:46 +03:00
Elad Ashkenazi cd2adbad9a Update rsx_methods.cpp 2022-08-03 17:15:59 +03:00
Elad Ashkenazi 99730ac4f9 Update rsx_methods.cpp 2022-08-03 17:15:59 +03:00
Elad Ashkenazi d2ab3383ad Update rsx_methods.cpp 2022-08-03 17:15:59 +03:00
Elad Ashkenazi 3b15a6b39e Update rsx_methods.cpp 2022-08-03 17:15:59 +03:00
Elad Ashkenazi 651e58f443 rsx: Trivial optimization 2022-08-03 17:15:59 +03:00
Eladash 769f9e33e9 Savestates/RSX: Fix fifo_control::restore_state 2022-08-03 15:35:41 +03:00
kd-11 052725fdc7 rsx: Do not require ZCULL buffer binding to enable ZPASS counting
- ZPASS data is still accessible in unbuffered mode.
  The only thing that buffered ZCULL enables is something closer to early-Z where large blocks of pixels can be discarded earlier.
  It is strictly a performance optimization and not required for ZPASS to work.
- Update ZCULL stat calculations to take into account unbuffered Z
2022-08-01 00:23:54 +03:00
Megamouse f90b79791f HLE: fix file not found errors in media functions 2022-07-31 16:45:05 +02:00
Megamouse 228844c017 overlays: fix line wrapping and position of lines
- Fix off-by-one issue when wrapping a line, caused by unnecessary zeroed whitespace.
- Fix centering of lines that end with carriage return caused by overzealous reset of counters.
- Remove fabs where there shouldn't be any
2022-07-29 09:26:45 +02:00
Megamouse 577f379a12 implement cellPhotoImport 2022-07-26 17:27:35 +02:00
kd-11 c9058280e0 vk: Fix a potential deadlock 2022-07-25 21:05:31 +03:00
kd-11 5af50cfd55 vk: Handle corner cases
- Fix up flush sequence in DMA handling (WCB)
- Do not request resource sharing if queue family is not different!
2022-07-25 21:05:31 +03:00
kd-11 d846142f0c vk: Reimplement compliant async texture streaming
- Use CONCURRENT queue access instead of fighting with queue acquire/release via submit chains.
  The minor benefits of forcing EXCLUSIVE mode are buried under the huge penalty of multiple vkQueueSubmit.
  Batching submits does not help alleviate this situation. We simply must avoid interrupting execution.
2022-07-25 21:05:31 +03:00
Megamouse c40439ae6b cellMusic/Decode: implement playlist shuffle and repeat 2022-07-22 08:42:43 +02:00
kd-11 246bf1df64 Use C++17 ctor for string_view 2022-07-21 22:29:40 +03:00
kd-11 9a868e9239 gl: Silence compiler warning 2022-07-21 22:29:40 +03:00
kd-11 ab3cde1939 gl: Do some macro patching for intel driver 2022-07-21 22:29:40 +03:00
kd-11 bec3e156fb vk: Disable robust buffer access for ANV
- Robust access is nice, but we don't actually need it
2022-07-21 22:29:40 +03:00
Megamouse 086afbbaa5 overlays: implement back and focus in media_list_dialog 2022-07-21 01:36:33 +02:00
kd-11 680f08c2b8 gl: Destroy barrier signals correctly 2022-07-18 18:58:22 +03:00
kd-11 82bac4173e gl: Reuse scratch images 2022-07-18 18:58:22 +03:00
kd-11 8a8fda3e02 gl: Combine RGBA8/D24S8 readback and byteswap into one operation 2022-07-18 18:58:22 +03:00
kd-11 1c5b685398 gl: Only toggle state settings that are relevant to the current RSX state 2022-07-18 18:58:22 +03:00
kd-11 e95084f138 gl: Use DSA for imageview configuration and avoid needless bind operations 2022-07-18 18:58:22 +03:00
kd-11 e12d268662 gl: Implement support for texture1D decode 2022-07-18 18:58:22 +03:00
kd-11 6a3f17cd36 gl: Fix compute invocation counts for format handling code 2022-07-18 18:58:22 +03:00
Eladash 3e51426379 Savestates/SPU: Kill emulation when it's safe to save SPU state 2022-07-15 09:30:53 +03:00
Megamouse 105781fa76 overlays: properly align lines with leading or trailing whitespace 2022-07-14 23:32:20 +02:00
Megamouse d2be12bb07 overlays: find missing characters lost during wrapped rendering 2022-07-14 23:32:20 +02:00
Megamouse fdc15e12c4 overlays: properly calculate offsets for wrapped text 2022-07-14 23:32:20 +02:00
Eladash e548743cbf Fixup rsx captures 2022-07-14 18:50:31 +03:00
kd-11 cdef752a9c gl: Fix 2D->3D splat in CopyBufferToImage 2022-07-13 02:09:58 +03:00
kd-11 1483941bea gl: Implement row alignment in CopyBufferToImage routines 2022-07-13 02:09:58 +03:00
kd-11 453e1bfaec gl: Silence compiler warning 2022-07-13 02:09:58 +03:00
kd-11 82439327fa gl: Support loading data from SSBO using compute shaders
- Gives better performance than using raw draw calls.
- Does not work with all formats. The draw call version is still used when needed.
2022-07-13 02:09:58 +03:00
kd-11 f60002e87d gl: Optimize memory barriers a bit
- Move waits to server side
- Increase the scratch buffer size to avoid waiting on barriers
2022-07-13 02:09:58 +03:00
kd-11 9fc6382909 gl: Finalize BGRA storage format internals
- Performance is terrible but it works properly now
2022-07-13 02:09:58 +03:00
kd-11 ebad08aa97 gl: Fix image creation for virtual formats 2022-07-13 02:09:58 +03:00
kd-11 599f1dd157 gl: Properly match BGRA RTT formats 2022-07-13 02:09:58 +03:00
kd-11 bb5ce67d57 gl: Handle corner cases for CopyBufferToImage
- Handle 3D textures and cubemaps
- Handle writing to mip > 0
2022-07-13 02:09:58 +03:00
kd-11 f948ce399e gl: Implement CopyBufferToImage in software
- Overrides the driver's CopyBufferToImage handling where possible
2022-07-13 02:09:58 +03:00
kd-11 954c60947d gl: Avoid calling gl functions without a context even if the object is GL_NONE
- While calling glDestroyXXXX with GL_NONE is a no-op, calling it without a context will crash some drivers.
2022-07-13 02:09:58 +03:00
kd-11 98b6783c05 gl: Fix image views broken after refactor 2022-07-13 02:09:58 +03:00
kd-11 0894d2886a Fix build 2022-07-13 02:09:58 +03:00
kd-11 4995b4abe3 gl: Do not use raw GL image copy command for RSX data 2022-07-13 02:09:58 +03:00
kd-11 35ef19cfc8 gl: Refactor the rest of GLHelpers 2022-07-13 02:09:58 +03:00
kd-11 09824a718f gl: Separate BGRA8 storage from RGBA8 2022-07-13 02:09:58 +03:00
Eladash ab27ee4cf4 Savestates/RSX: Save NV406E semaphore waiting 2022-07-12 15:15:42 +03:00
Eladash 24fddf1ded rsx: Fix emu stopping crash when using multi-threaded rsx
FXO signaled abort before it completed its work, leading to unsignalled vk::fence and deadlock. Fix it by deregistering it from FXO.
2022-07-10 14:19:59 +03:00
Eladash 87cd65ff03 Savestates: support game collections 2022-07-10 14:19:59 +03:00
Eladash 4ade06f36f Savestates/RSX: Restore the ZCULL control state
And fix the ZCULL control state at the initial state of RSX.
2022-07-10 14:19:59 +03:00
Nekotekina 4b787b22c8 Implement FN (lambda shortener)
Useful for some higher order functions.
Allows making short lambdas even shorter.
2022-07-08 14:47:41 +03:00
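The FN entry above describes a macro that shortens lambdas passed to higher-order functions. The sketch below shows the general idea only: the macro name SHORT_FN and the implicit parameter name x are assumptions for illustration, and the actual FN macro in the codebase may be defined differently.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical macro (not RPCS3's FN): expands an expression into a generic
// lambda that captures by reference and exposes its single argument as `x`.
#define SHORT_FN(...) [&](const auto& x) { return (__VA_ARGS__); }

// Example use with a standard algorithm: count the values greater than `limit`.
inline int count_above(const std::vector<int>& values, int limit)
{
    return static_cast<int>(
        std::count_if(values.begin(), values.end(), SHORT_FN(x > limit)));
}
```

The equivalent hand-written predicate would be `[&](const auto& x) { return x > limit; }`; the macro only removes the boilerplate around the expression.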
Eladash 4ac88fa8d3 Savestates/RSX: Save drawing context 2022-07-08 12:57:43 +03:00
Eladash 5f8f9e33f1 RSX/Savestates: Replace GCM hack with a proper fix 2022-07-08 12:57:43 +03:00
Megamouse b683110e72 cellGem/overlays: show cursor if necessary 2022-07-07 12:40:23 +02:00
Megamouse 4823d4c32a input: add background input option
Adds an option to disable background input to the IO tab in the settings dialog.
This will disable pad input as well as ps move and overlays input when the window is unfocused.
2022-07-06 21:49:31 +02:00
Eladash bd9ba7ef1f Remove incorrect Emu.IsStopped() checks 2022-07-05 08:25:36 +02:00
kd-11 fddb6a31a7 Use utils::c_page_size 2022-07-04 22:35:05 +03:00
kd-11 5cafaef0a9 Aarch64 fixes for RSX 2022-07-04 22:35:05 +03:00
Elad Ashkenazi fcd297ffb2
Savestates Support For PS3 Emulation (#10478) 2022-07-04 16:02:17 +03:00
Nekotekina 69912ba3c7 Partial revert for cf0fcf5a2a 2022-06-30 14:38:14 +03:00
Eladash cf0fcf5a2a SPU: Implement execution wake-up delay 2022-06-28 19:54:25 +03:00
Eladash f5a55b3024 rsx: Fixup after #12052 for frame limiter off 2022-06-25 17:39:07 +03:00
Eladash 7422ab9e55 rsx: Do not discard flip notifications 2022-06-25 15:30:41 +02:00
Eladash f66256cc13 rsx: PS3 Native frame limiter improvements, add Infinite frame limiter
* Do not wait on DEVICE 0x30 semaphore, it seems like it is something to do with queue command synchronization.
 - This also fixes cellGcmSetFlipWithWaitLabel which is built specifically to enable accurate RSX flipping time, its waiting command is confirmed to be placed **AFTER** DEVICE 0x30 waiting.
* Fix default vsync state to be enabled. (and set it to enabled in cellGcmSetVBlankFrequency as well)
* Add experimental "Infinite" frame limiter mode.
* Fix spurious enabling of second vblank.
2022-06-25 15:30:41 +02:00
Eladash 5e01ffdfd8 Debugger: Optimize cpu_thread::dump_regs()
Reuse string buffer. Copies and reallocations are expensive with such large strings.
2022-06-23 22:41:32 +02:00
Eladash 3899248305 RSX Debugger: Stable NOP skipping
Allow addresses of NOP blocks to remain consistent in between debugger position changes except for the first which can shrink or grow.
2022-06-21 16:59:45 +03:00
Jeff Guo cefc37a553
PPU LLVM arm64+macOS port (#12115)
* BufferUtils: use naive function pointer on Apple arm64

Use naive function pointer on Apple arm64 because ASLR breaks asmjit.
See BufferUtils.cpp comment for explanation on why this happens and how
to fix if you want to use asmjit.

* build-macos: fix source maps for Mac

Tell Qt not to strip debug symbols when we're in debug or relwithdebinfo
modes.

* LLVM PPU: fix aarch64 on macOS

Force MachO on macOS to fix LLVM being unable to patch relocations
during codegen. Adds Aarch64 NEON intrinsics for x86 intrinsics used by
PPUTranslator/Recompiler.

* virtual memory: use 16k pages on aarch64 macOS

Temporary hack to get things working by using 16k pages instead of 4k
pages in VM emulation.

* PPU/SPU: fix NEON intrinsics and compilation for arm64 macOS

Fixes some intrinsics usage and patches usages of asmjit to properly
emit absolute jmps so ASLR doesn't cause out of bounds rel jumps. Also
patches the SPU recompiler to properly work on arm64 by telling LLVM to
target arm64.

* virtual memory: fix W^X toggles on macOS aarch64

Fixes W^X on macOS aarch64 by setting all JIT mmap'd regions to default
to RW mode. For both SPU and PPU execution threads, when initialization
finishes we toggle to RX mode. This exploits Apple's per-thread setting
for RW/RX to let us be technically compliant with the OS's W^X
enforcement while not needing to actually separate the memory
allocated for code/data.

* PPU: implement aarch64 specific functions

Implements ppu_gateway for arm64 and patches LLVM initialization to use
the correct triple. Adds some fixes for macOS W^X JIT restrictions when
entering/exiting JITed code.

* PPU: Mark rpcs3 calls as non-tail

Strictly speaking, rpcs3 JIT -> C++ calls are not tail calls. If you
call a function inside e.g. an L2 syscall, it will clobber LR on arm64
and subtly break returns in emulated code. Only JIT -> JIT "calls"
should be tail.

* macOS/arm64: compatibility fixes

* vm: patch virtual memory for arm64 macOS

Tag mmap calls with MAP_JIT to allow W^X on macOS. Fix mmap calls to
existing mmap'd addresses that were tagged with MAP_JIT on macOS. Fix
memory unmapping on 16K page machines with a hack to mark "unmapped"
pages as RW.

* PPU: remove wrong comment

* PPU: fix a merge regression

* vm: remove 16k page hacks

* PPU: formatting fixes

* PPU: fix arm64 null function assembly

* ppu: clean up arch-specific instructions
2022-06-14 15:28:38 +03:00
Eladash 264253757c rsx: Improve Null Renderer 2022-06-12 20:54:42 +03:00
Ani 2512e958fa
glsl: Avoid implicit int->uint conversions (#12220) 2022-06-12 18:05:43 +01:00
Elad Ashkenazi 280aa6da91
rsx: Fix NV406E semaphore_acquire timeout detection (#12205) 2022-06-12 12:34:29 +03:00
Malcolm Jestadt 0d022d420b RSX: Add more wide paths for upload_untouched
- Adds AVX512 path for upload_untouched u16 with primitive restart, and
  AVX2 and AVX512 paths for upload_untouched without restart
- The AVX512 paths handle the remainder in SIMD code with masking, which
  provided a large speedup
- On my i5-1135G7 in Demon's Souls, benchmarking a scene in Boletaria with
  a lot of geometry on screen via perf:
SSE4_1                      0.64%
AVX2                        0.59%
AVX512                      0.56%
AVX512 w/ remainder masking 0.51%
2022-06-12 06:23:55 +03:00
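
The remainder masking mentioned in the upload_untouched commit above can be sketched as follows. This is an illustration under assumptions, not the code from the commit: `copy_swap_u16` is a hypothetical name, and the 16-bit byte swap only stands in for whatever per-element work the real routine does. The AVX-512 path is compiled only when AVX512F and AVX512BW are enabled; otherwise the scalar loop handles everything.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__AVX512F__) && defined(__AVX512BW__)
#include <immintrin.h>
#endif

// Hypothetical example: copy u16 values while swapping their byte order.
inline void copy_swap_u16(std::uint16_t* dst, const std::uint16_t* src, std::size_t count)
{
    std::size_t i = 0;
#if defined(__AVX512F__) && defined(__AVX512BW__)
    // Full blocks of 32 elements (512 bits).
    for (; i + 32 <= count; i += 32)
    {
        const __m512i v = _mm512_loadu_si512(src + i);
        const __m512i s = _mm512_or_si512(_mm512_slli_epi16(v, 8), _mm512_srli_epi16(v, 8));
        _mm512_storeu_si512(dst + i, s);
    }

    // Remainder of 1..31 elements: same SIMD code, but only the low `rem` lanes
    // are loaded and stored, so no scalar tail loop is needed.
    if (const std::size_t rem = count - i)
    {
        const __mmask32 mask = static_cast<__mmask32>((1u << rem) - 1u);
        const __m512i v = _mm512_maskz_loadu_epi16(mask, src + i);
        const __m512i s = _mm512_or_si512(_mm512_slli_epi16(v, 8), _mm512_srli_epi16(v, 8));
        _mm512_mask_storeu_epi16(dst + i, mask, s);
        i = count;
    }
#endif
    // Scalar path, used for all elements when AVX-512 is not enabled at compile time.
    for (; i < count; ++i)
    {
        dst[i] = static_cast<std::uint16_t>((src[i] << 8) | (src[i] >> 8));
    }
}
```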
Elad Ashkenazi ec530a2c91 rsx: Suggest to try setting RSX FIFO Accuracy to a higher mode of accuracy on crash (#12204) 2022-06-11 23:26:12 +02:00
kd-11 7530b3c971 vk: Fix image view search and destroy 2022-06-09 02:13:55 +03:00
Eladash f9bc7458d4 rsx: Resurgence of HLE GCM 2022-06-06 12:56:25 +02:00
kd-11 6c315e8aee gl: Disallow overlapping binding points 2022-06-05 10:13:41 +03:00
Elad Ashkenazi 88faac7bbc rsx: Minor fixup (#12165) 2022-06-04 15:04:27 +01:00
Elad Ashkenazi 9bb7e8d614 rsx: Implement atomic FIFO fetching (stability improvement) (non-default setting) (#12107) 2022-06-04 15:35:06 +03:00
kd-11 286f97fad0 rsx: Reduce some error spam 2022-06-04 14:02:33 +03:00
kd-11 f0a02e0d9d gl: Fix leaking texture views 2022-06-04 14:02:33 +03:00
kd-11 8185bfe893 gl: Track image destruction and remove handles from state tracker
- Handles are reused for different resources which can cause problems
2022-06-04 14:02:33 +03:00
kd-11 d577cebd89 gl: Refactor image and command-context handling
- Move texture object code out of the monolithic header
- All texture binds go through the shared state
- Transient texture binds use a dedicated temp image slot shared with native UI
2022-06-04 14:02:33 +03:00
kd-11 167161d8ce rsx: Restore some accidentally removed depth-format conversion macros 2022-06-03 11:54:09 +03:00
kd-11 b8b0ecabd8 gl: Fix data pointer on the optimized AMD path 2022-06-03 11:54:09 +03:00
kd-11 bb05de2e80 gl: Fix copypasta 2022-06-03 11:54:09 +03:00
kd-11 7890e87234 gl: Fix warning 2022-06-03 11:54:09 +03:00
kd-11 25c05867d6 gl: Fix ring buffer remove() function
- Fixes crash on running a second game in the same session
2022-06-03 11:54:09 +03:00
kd-11 a421270c19 gl: Use new scratch buffer system 2022-06-03 11:54:09 +03:00
kd-11 764fb57fdc gl: Implement scratch ring buffer with memory barriers 2022-06-03 11:54:09 +03:00
kd-11 3fd846687e gl: Refactor buffer object code 2022-06-03 11:54:09 +03:00
kd-11 ff9c939720 gl: Assume decode buffer is to be used as SSBO as this seems to be a hint to the driver about where to put the buffer
Part of OpenGL's Achilles' heel - the API does not distinguish between VRAM and SYSTEM memory at all and relies on developers wrestling with the driver's heuristic algorithm for this.
2022-06-03 11:54:09 +03:00
kd-11 234db2be3f gl: Fix texture binding in overlay renderer 2022-06-03 11:54:09 +03:00
kd-11 fc44d53bb0 gl: Reset buffer size on destroying the GPU handle 2022-06-03 11:54:09 +03:00
kd-11 555a4b5f5c gl: Suggest readback buffer as ssbo if it is not provided
- We're likely to jump into a compute or readback pass anyway.
2022-06-03 11:54:09 +03:00
kd-11 a6e6df1445 gl: Implement fast texture readback for D24X8 and RGBA8/BGRA8 2022-06-03 11:54:09 +03:00
Nekotekina 76c72351a5 rsx_methods: fix warning 2022-06-02 12:56:49 +03:00
kd-11 eb52ac55a7 gl: Fix AMD buffer decode 2022-05-31 23:34:14 +03:00
kd-11 d167582f6b gl: Implement on-chip buffer-to-d24x8 conversion 2022-05-31 23:34:14 +03:00
kd-11 dd6cb054a7 gl: Add missing viewport save 2022-05-31 23:34:14 +03:00
kd-11 b97557ce7b gl: Use DSA for compressed texture upload 2022-05-31 23:34:14 +03:00
kd-11 964fd1095e gl: Properly preserve texture state
- Remove rogue glBindTexture calls and use gl commandstate object instead
2022-05-31 23:34:14 +03:00
kd-11 fcc6c2384b Fix linux build 2022-05-31 23:34:14 +03:00
kd-11 a5d73f41b5 gl: Remove debug message 2022-05-31 23:34:14 +03:00
kd-11 1b305bf789 gl: Workaround for poor AMD OpenGL performance
- Turns out the AMD driver really hates it if you render with a mapped index buffer.
  The driver internally seems to make a copy of the consumed indices and uses that. Very slow.
  I was able to isolate this after observing that glDrawArrays is not entirely shit, but glDrawElements duration scaled linearly with the number of vertices.
2022-05-31 23:34:14 +03:00
kd-11 943752db30 gl: Compute optimizations
- Keep buffers around longer to allow driver heuristics to work
- Properly initialize the shaders to allow optimal workgroup dispatch size
2022-05-31 23:34:14 +03:00
kd-11 60a2a39e88 gl: Deswizzle textures on the GPU 2022-05-31 23:34:14 +03:00
kd-11 532563e861 gl: Update some more buffer-object functions 2022-05-31 23:34:14 +03:00
kd-11 3ee27bd434 gl: Optimize consumption of buffer objects when uploading textures 2022-05-31 23:34:14 +03:00
kd-11 55e68441cb gl: Commit to bindless framebuffer object management 2022-05-31 23:34:14 +03:00
kd-11 7ec481d99b rsx: Allocate scratch memory using simple array with no default initialize
- This cuts down processing time significantly by eliminating calls to memset_stosb
2022-05-31 23:34:14 +03:00
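
The idea behind the scratch-memory commit above can be illustrated in standard C++. This is a sketch, not the container used in the codebase, and `alloc_scratch` is a hypothetical name. For trivial element types, `new T[n]` default-initializes and leaves the bytes untouched, whereas `std::make_unique<T[]>(n)` or `std::vector<T>(n)` value-initializes, which zero-fills the whole buffer first.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: allocate scratch bytes without zero-filling them.
// The contents are indeterminate until written, so callers must fill the
// buffer before reading from it.
inline std::unique_ptr<unsigned char[]> alloc_scratch(std::size_t size)
{
    // Default-initialization: no memset is emitted for trivial types.
    return std::unique_ptr<unsigned char[]>(new unsigned char[size]);
}
```

This only pays off when the buffer is fully overwritten before use, which is the usual case for a scratch area that receives decoded or uploaded data.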
kd-11 129e947720 gl: Improve CS throughput
- Avoids making too many invocations, especially given the 1D nature of some GPU dispatch handlers
2022-05-31 23:34:14 +03:00
kd-11 e964060a6a gl: Handle texture binding using the global state tracker 2022-05-31 23:34:14 +03:00
kd-11 74696d2e44 gl: Commit to a consistent global state 2022-05-31 23:34:14 +03:00
kd-11 78746fdb6f gl: Commit to using DSA for internal buffer management
- Gets rid of spammy BindBuffer calls on every draw
2022-05-31 23:34:14 +03:00
kd-11 ed2068fb03 gl: Rewrite buffer mapping 2022-05-31 23:34:14 +03:00
kd-11 b61c4d3693 gl: Fix stat counters 2022-05-31 23:34:14 +03:00