4256 commits

Author SHA1 Message Date
Eladash 274386a078 rsx: Add some debugging information 2022-09-07 18:39:32 +03:00
Nekotekina 5985f0eefa BufferUtils: cleanup regarding ARM64 2022-09-07 17:59:07 +03:00
Nekotekina 82258915da BufferUtils: rewrite remaining intrinsic code with simd_builder 2022-09-07 17:59:07 +03:00
Nekotekina 11a1f090d3 BufferUtils: simd_builder refactoring
Some simplifications implemented.
2022-09-07 17:59:07 +03:00
Elad Ashkenazi 290226539f Fix ARM build (#12606) 2022-09-04 21:11:04 +03:00
Eladash 11a197a387 Savestates/RSX: fix unintentional vblank thread spin after abort 2022-09-01 20:09:28 +03:00
Eladash ee1384341e rsx: Implement atomic vertex upload (with Strict Rendering Mode) 2022-09-01 20:09:28 +03:00
Nekotekina 58e3232710 BufferUtils: Fix regression in upload_untouched 2022-09-01 17:39:04 +03:00
Nekotekina e28707055b Implement simd_builder for x86
ASMJIT-based tool for building vectorized loops (such as ones in BufferUtils.cpp)
2022-08-28 18:38:52 +03:00
kd-11 1fc0191311 Fix build 2022-08-23 23:49:46 +03:00
kd-11 1f9e04f72d rsx/vk: Implement flushing surface cache blocks to linear mem 2022-08-23 23:49:46 +03:00
kd-11 bca833dad7 Fix surface reuse 2022-08-20 01:23:15 +03:00
kd-11 f981e05908 rsx: Do not lie about surface details 2022-08-20 01:23:15 +03:00
kd-11 b5abd777b0 rsx: Allow longer dispatch queues to accommodate games with high draw call count 2022-08-19 20:29:32 +03:00
Elad Ashkenazi b2c9add47e rsx: Fix semaphore timeout on boot
Allow semaphore timeout to be disabled again.
2022-08-19 15:40:20 +03:00
kd-11 a401a192b8 Fixup for dst_stage 2022-08-19 14:29:20 +03:00
kd-11 ad1b007dd1 Fix whitespace 2022-08-19 14:29:20 +03:00
kd-11 71e35c8b4d vk: Implement support for VK_EXT_attachment_feedback_loop_layout 2022-08-19 14:29:20 +03:00
kd-11 2e504b2dac rsx: Silence some warnings 2022-08-19 14:29:20 +03:00
kd-11 bacf518189 rsx: Fix 2D intersection tests 2022-08-14 23:53:50 +03:00
kd-11 b960ce1426 vk: Align write length when pre-filling buffers with constant patterns 2022-08-14 23:53:50 +03:00
kd-11 c55a889c23 vk: Initialize buffer info blocks to avoid null descriptors 2022-08-14 23:53:50 +03:00
Eladash 4464a6c3f6 CG-Disasm: Name input/output vertex arrays 2022-08-12 15:20:48 +03:00
Elad Ashkenazi c4cc0154be LV2: Optimizations and fixes
Fix and optimize sys_ppu_thread_yield

Fix LV2 syscalls with timeout bug. (use ppu_thread::cancel_sleep instead)

Move timeout notification out of mutex scope

Allow g_waiting timeouts to be awaked in scope
2022-08-11 11:42:16 +03:00
kd-11 c51d3b5465 Workaround for msvc weirdness 2022-08-09 18:32:54 +03:00
kd-11 e179adc4a0 rsx: Refactor surface cache storage 2022-08-09 18:32:54 +03:00
kd-11 61a055a1c6 Tuning 2022-08-07 22:14:49 +03:00
kd-11 64b4cfa59f rsx: Erase surface background when reloading after a pitch mismatch 2022-08-07 22:14:49 +03:00
kd-11 c799ffd223 rsx: Stubs for pitch conversion 2022-08-07 22:14:49 +03:00
kd-11 2445ab8d8e Fix RSX capture playback 2022-08-04 19:01:45 +03:00
kd-11 3e923b4993 rsx: Optimize VTX_FMT_SNORM16 decoding
- Cuts down SNORM16 overhead by ~65%
2022-08-03 23:33:31 +03:00
kd-11 8181498d86 gl: Alias UBO/SSBO slots to avoid exceeding the available number of binding slots.
- The sets are different anyway and should not overwrite each other in a proper driver.
2022-08-03 23:33:31 +03:00
kd-11 57dd611111 gl: Fix incomplete stencil view of depth-stencil texture
- Samplers must use point sampling for stencil views
2022-08-03 23:33:31 +03:00
Eladash b3162bd41c rsx/vp: Fix SNORM16 vertex decoding 2022-08-03 18:11:46 +03:00
Elad Ashkenazi cd2adbad9a Update rsx_methods.cpp 2022-08-03 17:15:59 +03:00
Elad Ashkenazi 99730ac4f9 Update rsx_methods.cpp 2022-08-03 17:15:59 +03:00
Elad Ashkenazi d2ab3383ad Update rsx_methods.cpp 2022-08-03 17:15:59 +03:00
Elad Ashkenazi 3b15a6b39e Update rsx_methods.cpp 2022-08-03 17:15:59 +03:00
Elad Ashkenazi 651e58f443 rsx: Trivial optimization 2022-08-03 17:15:59 +03:00
Eladash 769f9e33e9 Savestates/RSX: Fix fifo_control::restore_state 2022-08-03 15:35:41 +03:00
kd-11 052725fdc7 rsx: Do not require ZCULL buffer binding to enable ZPASS counting
- ZPASS data is still accessible in unbuffered mode.
  The only thing that buffered ZCULL enables is something closer to early-Z where large blocks of pixels can be discarded earlier.
  It is strictly a performance optimization and not required for ZPASS to work.
- Update ZCULL stat calculations to take into account unbuffered Z
2022-08-01 00:23:54 +03:00
Megamouse f90b79791f HLE: fix file not found errors in media functions 2022-07-31 16:45:05 +02:00
Megamouse 228844c017 overlays: fix line wrapping and position of lines
- Fix off-by-one issue when wrapping a line, caused by unnecessary zeroed whitespace.
- Fix centering of lines that end with carriage return caused by overzealous reset of counters.
- Remove fabs where there shouldn't be any
2022-07-29 09:26:45 +02:00
Megamouse 577f379a12 implement cellPhotoImport 2022-07-26 17:27:35 +02:00
kd-11 c9058280e0 vk: Fix a potential deadlock 2022-07-25 21:05:31 +03:00
kd-11 5af50cfd55 vk: Handle corner cases
- Fix up flush sequence in DMA handling (WCB)
- Do not request resource sharing if queue family is not different!
2022-07-25 21:05:31 +03:00
kd-11 d846142f0c vk: Reimplement compliant async texture streaming
- Use CONCURRENT queue access instead of fighting with queue acquire/release via submit chains.
  The minor benefits of forcing EXCLUSIVE mode are buried under the huge penalty of multiple vkQueueSubmit.
  Batching submits does not help alleviate this situation. We simply must avoid interrupting execution.
2022-07-25 21:05:31 +03:00
Megamouse c40439ae6b cellMusic/Decode: implement playlist shuffle and repeat 2022-07-22 08:42:43 +02:00
kd-11 246bf1df64 Use C++17 ctor for string_view 2022-07-21 22:29:40 +03:00
kd-11 9a868e9239 gl: Silence compiler warning 2022-07-21 22:29:40 +03:00
kd-11 ab3cde1939 gl: Do some macro patching for intel driver 2022-07-21 22:29:40 +03:00
kd-11 bec3e156fb vk: Disable robust buffer access for ANV
- Robust access is nice, but we don't actually need it
2022-07-21 22:29:40 +03:00
Megamouse 086afbbaa5 overlays: implement back and focus in media_list_dialog 2022-07-21 01:36:33 +02:00
kd-11 680f08c2b8 gl: Destroy barrier signals correctly 2022-07-18 18:58:22 +03:00
kd-11 82bac4173e gl: Reuse scratch images 2022-07-18 18:58:22 +03:00
kd-11 8a8fda3e02 gl: Combine RGBA8/D24S8 readback and byteswap into one operation 2022-07-18 18:58:22 +03:00
kd-11 1c5b685398 gl: Only toggle state settings that are relevant to the current RSX state 2022-07-18 18:58:22 +03:00
kd-11 e95084f138 gl: Use DSA for imageview configuration and avoid needless bind operations 2022-07-18 18:58:22 +03:00
kd-11 e12d268662 gl: Implement support for texture1D decode 2022-07-18 18:58:22 +03:00
kd-11 6a3f17cd36 gl: Fix compute invocation counts for format handling code 2022-07-18 18:58:22 +03:00
Eladash 3e51426379 Savestates/SPU: Kill emulation when it's safe to save SPU state 2022-07-15 09:30:53 +03:00
Megamouse 105781fa76 overlays: properly align lines with leading or trailing whitespace 2022-07-14 23:32:20 +02:00
Megamouse d2be12bb07 overlays: find missing characters lost during wrapped rendering 2022-07-14 23:32:20 +02:00
Megamouse fdc15e12c4 overlays: properly calculate offsets for wrapped text 2022-07-14 23:32:20 +02:00
Eladash e548743cbf Fixup rsx captures 2022-07-14 18:50:31 +03:00
kd-11 cdef752a9c gl: Fix 2D->3D splat in CopyBufferToImage 2022-07-13 02:09:58 +03:00
kd-11 1483941bea gl: Implement row alignment in CopyBufferToImage routines 2022-07-13 02:09:58 +03:00
kd-11 453e1bfaec gl: Silence compiler warning 2022-07-13 02:09:58 +03:00
kd-11 82439327fa gl: Support loading data from SSBO using compute shaders
- Gives better performance than using raw draw calls.
- Does not work with all formats. The draw call version is still used when needed.
2022-07-13 02:09:58 +03:00
kd-11 f60002e87d gl: Optimize memory barriers a bit
- Move waits to server side
- Increase the scratch buffer size to avoid waiting on barriers
2022-07-13 02:09:58 +03:00
kd-11 9fc6382909 gl: Finalize BGRA storage format internals
- Performance is terrible but it works properly now
2022-07-13 02:09:58 +03:00
kd-11 ebad08aa97 gl: Fix image creation for virtual formats 2022-07-13 02:09:58 +03:00
kd-11 599f1dd157 gl: Properly match BGRA RTT formats 2022-07-13 02:09:58 +03:00
kd-11 bb5ce67d57 gl: Handle corner cases for CopyBufferToImage
- Handle 3D textures and cubemaps
- Handle writing to mip > 0
2022-07-13 02:09:58 +03:00
kd-11 f948ce399e gl: Implement CopyBufferToImage in software
- Overrides the driver's CopyBufferToImage handling where possible
2022-07-13 02:09:58 +03:00
kd-11 954c60947d gl: Avoid calling gl functions without a context even if the object is GL_NONE
- While calling glDestroyXXXX with GL_NONE is a no-op, calling it without a context will crash some drivers.
2022-07-13 02:09:58 +03:00
kd-11 98b6783c05 gl: Fix image views broken after refactor 2022-07-13 02:09:58 +03:00
kd-11 0894d2886a Fix build 2022-07-13 02:09:58 +03:00
kd-11 4995b4abe3 gl: Do not use raw GL image copy command for RSX data 2022-07-13 02:09:58 +03:00
kd-11 35ef19cfc8 gl: Refactor the rest of GLHelpers 2022-07-13 02:09:58 +03:00
kd-11 09824a718f gl: Separate BGRA8 storage from RGBA8 2022-07-13 02:09:58 +03:00
Eladash ab27ee4cf4 Savestates/RSX: Save NV406E semaphore waiting 2022-07-12 15:15:42 +03:00
Eladash 24fddf1ded rsx: Fix emu stopping crash when using multi-threaded rsx
FXO signaled abort before it completed its work, leading to unsignalled vk::fence and deadlock. Fix it by deregistering it from FXO.
2022-07-10 14:19:59 +03:00
Eladash 87cd65ff03 Savestates: support game collections 2022-07-10 14:19:59 +03:00
Eladash 4ade06f36f Savestates/RSX: Restore the ZCULL control state
And fix the ZCULL control state at the initial state of RSX.
2022-07-10 14:19:59 +03:00
Nekotekina 4b787b22c8 Implement FN (lambda shortener)
Useful for some higher order functions.
Allows to make short lambdas even shorter.
2022-07-08 14:47:41 +03:00
Eladash 4ac88fa8d3 Savestates/RSX: Save drawing context 2022-07-08 12:57:43 +03:00
Eladash 5f8f9e33f1 RSX/Savestates: Replace GCM hack with a proper fix 2022-07-08 12:57:43 +03:00
Megamouse b683110e72 cellGem/overlays: show cursor if necessary 2022-07-07 12:40:23 +02:00
Megamouse 4823d4c32a input: add background input option
Adds an option to disable background input to the IO tab in the settings dialog.
This will disable pad input as well as ps move and overlays input when the window is unfocused.
2022-07-06 21:49:31 +02:00
Eladash bd9ba7ef1f Remove incorrect Emu.IsStopped() checks 2022-07-05 08:25:36 +02:00
kd-11 fddb6a31a7 Use utils::c_page_size 2022-07-04 22:35:05 +03:00
kd-11 5cafaef0a9 Aarch64 fixes for RSX 2022-07-04 22:35:05 +03:00
Elad Ashkenazi fcd297ffb2 Savestates Support For PS3 Emulation (#10478) 2022-07-04 16:02:17 +03:00
Nekotekina 69912ba3c7 Partial revert for cf0fcf5a2a 2022-06-30 14:38:14 +03:00
Eladash cf0fcf5a2a SPU: Implement execution wake-up delay 2022-06-28 19:54:25 +03:00
Eladash f5a55b3024 rsx: Fixup after #12052 for frame limiter off 2022-06-25 17:39:07 +03:00
Eladash 7422ab9e55 rsx: Do not discard flip notifications 2022-06-25 15:30:41 +02:00
Eladash f66256cc13 rsx: PS3 Native frame limiter improvements, add Infinite frame limiter
* Do not wait on DEVICE 0x30 semaphore, it seems like it is something to do with queue command synchronization.
 - This also fixes cellGcmSetFlipWithWaitLabel which is built specifically to enable accurate RSX flipping time, its waiting command is confirmed to be placed **AFTER** DEVICE 0x30 waiting.
* Fix default vsync state to be enabled. (and set it to enabled in cellGcmSetVBlankFrequency as well)
* Add experimental "Infinite" frame limiter mode.
* Fix spurious enabling of second vblank.
2022-06-25 15:30:41 +02:00
Eladash 5e01ffdfd8 Debugger: Optimize cpu_thread::dump_regs()
Reuse string buffer. Copies and reallocations are expensive with such large strings.
2022-06-23 22:41:32 +02:00
Eladash 3899248305 RSX Debugger: Stable NOP skipping
Allow addresses of NOP blocks to remain consistent in between debugger position changes except for the first which can shrink or grow.
2022-06-21 16:59:45 +03:00
Jeff Guo cefc37a553 PPU LLVM arm64+macOS port (#12115)
* BufferUtils: use naive function pointer on Apple arm64

Use naive function pointer on Apple arm64 because ASLR breaks asmjit.
See BufferUtils.cpp comment for explanation on why this happens and how
to fix if you want to use asmjit.

* build-macos: fix source maps for Mac

Tell Qt not to strip debug symbols when we're in debug or relwithdebinfo
modes.

* LLVM PPU: fix aarch64 on macOS

Force MachO on macOS to fix LLVM being unable to patch relocations
during codegen. Adds Aarch64 NEON intrinsics for x86 intrinsics used by
PPUTranslator/Recompiler.

* virtual memory: use 16k pages on aarch64 macOS

Temporary hack to get things working by using 16k pages instead of 4k
pages in VM emulation.

* PPU/SPU: fix NEON intrinsics and compilation for arm64 macOS

Fixes some intrinsics usage and patches usages of asmjit to properly
emit absolute jmps so ASLR doesn't cause out of bounds rel jumps. Also
patches the SPU recompiler to properly work on arm64 by telling LLVM to
target arm64.

* virtual memory: fix W^X toggles on macOS aarch64

Fixes W^X on macOS aarch64 by setting all JIT mmap'd regions to default
to RW mode. For both SPU and PPU execution threads, when initialization
finishes we toggle to RX mode. This exploits Apple's per-thread setting
for RW/RX to let us be technically compliant with the OS's W^X
enforcement while not needing to actually separate the memory
allocated for code/data.

* PPU: implement aarch64 specific functions

Implements ppu_gateway for arm64 and patches LLVM initialization to use
the correct triple. Adds some fixes for macOS W^X JIT restrictions when
entering/exiting JITed code.

* PPU: Mark rpcs3 calls as non-tail

Strictly speaking, rpcs3 JIT -> C++ calls are not tail calls. If you
call a function inside e.g. an L2 syscall, it will clobber LR on arm64
and subtly break returns in emulated code. Only JIT -> JIT "calls"
should be tail.

* macOS/arm64: compatibility fixes

* vm: patch virtual memory for arm64 macOS

Tag mmap calls with MAP_JIT to allow W^X on macOS. Fix mmap calls to
existing mmap'd addresses that were tagged with MAP_JIT on macOS. Fix
memory unmapping on 16K page machines with a hack to mark "unmapped"
pages as RW.

* PPU: remove wrong comment

* PPU: fix a merge regression

* vm: remove 16k page hacks

* PPU: formatting fixes

* PPU: fix arm64 null function assembly

* ppu: clean up arch-specific instructions
2022-06-14 15:28:38 +03:00
Eladash 264253757c rsx: Improve Null Renderer 2022-06-12 20:54:42 +03:00
Ani 2512e958fa glsl: Avoid implicit int->uint conversions (#12220) 2022-06-12 18:05:43 +01:00
Elad Ashkenazi 280aa6da91 rsx: Fix NV406E semaphore_acquire timeout detection (#12205) 2022-06-12 12:34:29 +03:00
Malcolm Jestadt 0d022d420b RSX: Add more wide paths for upload_untouched
- Adds AVX512 path for upload_untouched u16 with primitive restart, and
  AVX2 and AVX512 paths for upload_untouched without restart
- The AVX512 paths handle the remainder in simd code with masking, which
  provided a large speedup
- On my i5-1135G7 in Demon's Souls, benchmarking a scene in Boletaria with
  a lot of geometry on screen via perf:
SSE4_1                      0.64%
AVX2                        0.59%
AVX512                      0.56%
AVX512 w/ remainder masking 0.51%
2022-06-12 06:23:55 +03:00
Elad Ashkenazi ec530a2c91 rsx: Suggest to try setting RSX FIFO Accuracy to a higher mode of accuracy on crash (#12204) 2022-06-11 23:26:12 +02:00
kd-11 7530b3c971 vk: Fix image view search and destroy 2022-06-09 02:13:55 +03:00
Eladash f9bc7458d4 rsx: Resurgence of HLE GCM 2022-06-06 12:56:25 +02:00
kd-11 6c315e8aee gl: Disallow overlapping binding points 2022-06-05 10:13:41 +03:00
Elad Ashkenazi 88faac7bbc rsx: Minor fixup (#12165) 2022-06-04 15:04:27 +01:00
Elad Ashkenazi 9bb7e8d614 rsx: Implement atomic FIFO fetching (stability improvement) (non-default setting) (#12107) 2022-06-04 15:35:06 +03:00
kd-11 286f97fad0 rsx: Reduce some error spam 2022-06-04 14:02:33 +03:00
kd-11 f0a02e0d9d gl: Fix leaking texture views 2022-06-04 14:02:33 +03:00
kd-11 8185bfe893 gl: Track image destruction and remove handles from state tracker
- Handles are reused for different resources which can cause problems
2022-06-04 14:02:33 +03:00
kd-11 d577cebd89 gl: Refactor image and command-context handling
- Move texture object code out of the monolithic header
- All texture binds go through the shared state
- Transient texture binds use a dedicated temp image slot shared with native UI
2022-06-04 14:02:33 +03:00
kd-11 167161d8ce rsx: Restore some accidentally removed depth-format conversion macros 2022-06-03 11:54:09 +03:00
kd-11 b8b0ecabd8 gl: Fix data pointer on the optimized AMD path 2022-06-03 11:54:09 +03:00
kd-11 bb05de2e80 gl: Fix copypasta 2022-06-03 11:54:09 +03:00
kd-11 7890e87234 gl: Fix warning 2022-06-03 11:54:09 +03:00
kd-11 25c05867d6 gl: Fix ring buffer remove() function
- Fixes crash on running a second game in the same session
2022-06-03 11:54:09 +03:00
kd-11 a421270c19 gl: Use new scratch buffer system 2022-06-03 11:54:09 +03:00
kd-11 764fb57fdc gl: Implement scratch ring buffer with memory barriers 2022-06-03 11:54:09 +03:00
kd-11 3fd846687e gl: Refactor buffer object code 2022-06-03 11:54:09 +03:00
kd-11 ff9c939720 gl: Assume decode buffer is to be used as SSBO as this seems to be a hint to the driver about where to put the buffer
Part of OpenGL's Achilles' heel - the API does not distinguish between VRAM and SYSTEM memory at all and relies on developers wrestling with the driver's heuristic algorithm for this.
2022-06-03 11:54:09 +03:00
kd-11 234db2be3f gl: Fix texture binding in overlay renderer 2022-06-03 11:54:09 +03:00
kd-11 fc44d53bb0 gl: Reset buffer size on destroying the GPU handle 2022-06-03 11:54:09 +03:00
kd-11 555a4b5f5c gl: Suggest readback buffer as ssbo if it is not provided
- We're likely to jump into a compute or readback pass anyway.
2022-06-03 11:54:09 +03:00
kd-11 a6e6df1445 gl: Implement fast texture readback for D24X8 and RGBA8/BGRA8 2022-06-03 11:54:09 +03:00
Nekotekina 76c72351a5 rsx_methods: fix warning 2022-06-02 12:56:49 +03:00
kd-11 eb52ac55a7 gl: Fix AMD buffer decode 2022-05-31 23:34:14 +03:00
kd-11 d167582f6b gl: Implement on-chip buffer-to-d24x8 conversion 2022-05-31 23:34:14 +03:00
kd-11 dd6cb054a7 gl: Add missing viewport save 2022-05-31 23:34:14 +03:00
kd-11 b97557ce7b gl: Use DSA for compressed texture upload 2022-05-31 23:34:14 +03:00
kd-11 964fd1095e gl: Properly preserve texture state
- Remove rogue glBindTexture calls and use gl commandstate object instead
2022-05-31 23:34:14 +03:00
kd-11 fcc6c2384b Fix linux build 2022-05-31 23:34:14 +03:00
kd-11 a5d73f41b5 gl: Remove debug message 2022-05-31 23:34:14 +03:00
kd-11 1b305bf789 gl: Workaround for poor AMD OpenGL performance
- Turns out the AMD driver really hates it if you render with a mapped index buffer.
  The driver internally seems to make a copy of the consumed indices and uses that. Very slow.
  I was able to isolate this after observing that glDrawArrays is not entirely shit, but glDrawElements duration scaled linearly with the number of vertices.
2022-05-31 23:34:14 +03:00
kd-11 943752db30 gl: Compute optimizations
- Keep buffers around longer to allow driver heuristics to work
- Properly initialize the shaders to allow optimal workgroup dispatch size
2022-05-31 23:34:14 +03:00
kd-11 60a2a39e88 gl: Deswizzle textures on the GPU 2022-05-31 23:34:14 +03:00
kd-11 532563e861 gl: Update some more buffer-object functions 2022-05-31 23:34:14 +03:00
kd-11 3ee27bd434 gl: Optimize consumption of buffer objects when uploading textures 2022-05-31 23:34:14 +03:00
kd-11 55e68441cb gl: Commit to bindless framebuffer object management 2022-05-31 23:34:14 +03:00
kd-11 7ec481d99b rsx: Allocate scratch memory using simple array with no default initialize
- This cuts down processing time significantly by eliminating calls to memset_stosb
2022-05-31 23:34:14 +03:00
kd-11 129e947720 gl: Improve CS throughput
- Avoids making too many invocations, especially given the 1D nature of some GPU dispatch handlers
2022-05-31 23:34:14 +03:00
kd-11 e964060a6a gl: Handle texture binding using the global state tracker 2022-05-31 23:34:14 +03:00
kd-11 74696d2e44 gl: Commit to a consistent global state 2022-05-31 23:34:14 +03:00
kd-11 78746fdb6f gl: Commit to using DSA for internal buffer management
- Gets rid of spammy BindBuffer calls on every draw
2022-05-31 23:34:14 +03:00
kd-11 ed2068fb03 gl: Rewrite buffer mapping 2022-05-31 23:34:14 +03:00
kd-11 b61c4d3693 gl: Fix stat counters 2022-05-31 23:34:14 +03:00
kd-11 81b9952e34 gl: Do not allow cross-aspect bitcasts
- There is special handling for some cross-aspect bitcasts in vulkan, but this is not possible using OpenGL
2022-05-31 23:34:14 +03:00
Elad Ashkenazi 95233b5299 rsx: Fix deadlock in vm::_page_unmap 2022-05-30 11:53:34 +03:00
Elad Ashkenazi 610d29dab0 rsx: Fix VBLANK time 2022-05-28 13:00:42 +02:00
Megamouse 345bda69ec Overlays: Add screenshot message to queue 2022-05-26 08:52:12 +02:00
kd-11 9c824aa0b5 vk: Enable event scope hack for INTEL proprietary drivers 2022-05-24 20:11:31 +03:00
kd-11 efff2a78c8 vk: Restructure how the conditional render evaluation is done (#12071)
Fixes conditional render fast-path
2022-05-24 11:11:21 +03:00
RipleyTom e68ffdbc81 Add a message overlay 2022-05-23 08:38:02 +02:00
kd-11 7c8fbc35bc rsx: Move PS3-compliant behavior to a new option 2022-05-21 16:35:35 +03:00
kd-11 b637429e44 Fix display flickering 2022-05-21 16:35:35 +03:00
kd-11 d52bb78d2c rsx: Trivial non-blocking display synchronization 2022-05-21 16:35:35 +03:00
kd-11 4e6be9172a rsx: Asynchronously flush the pipelines when handling ZCULL memory access violations 2022-05-21 10:06:32 +03:00
kd-11 0e1333ed5f rsx: Deadlock avoidance of accurate RSX reservations 2022-05-21 10:06:32 +03:00
Eladash cd74fb6a6d rsx: Implement HW accurate frame limiter 2022-05-20 22:40:48 +02:00
kd-11 ec2d529832 rsx: Separate loop interrupts from graphics state
- The interrupts are for multithreaded signals and make the main loop run more aggressively for the next cycle
2022-05-20 16:29:27 +03:00
kd-11 257556bbf5 rsx: Add eng lock before flagging memory unmap
- This is much better than polling on atomics every cycle for something that happens a few times during gameplay
2022-05-20 16:29:27 +03:00
kd-11 93d93b4805 rsx: Fix typo 2022-05-20 16:29:27 +03:00
kd-11 e368453751 rsx: Rework loop interrupts a bit
- Reset backend interrupt in core handler
- Separate memory config interrupt from regular backend interrupt
2022-05-20 16:29:27 +03:00
kd-11 d0dc095c84 rsx: Silence some log spam 2022-05-20 16:29:27 +03:00
kd-11 360fdca5ac vk: Avoid multimap when handling image views 2022-05-20 16:29:27 +03:00
kd-11 e1b95913ea rsx/zcull: Improve deadlock avoidance
- Do not acquire eng lock while holding the page lock
  RSXThread may be waiting on the page lock and will never ack the pause request
2022-05-20 16:29:27 +03:00
kd-11 a3ea9e2985 rsx/zcull: Less aggressive disabling of optimizations 2022-05-20 16:29:27 +03:00
kd-11 e9bf3e13d0 rsx/zcull: Pause the main thread before flushing reports 2022-05-20 16:29:27 +03:00
kd-11 094fda0e73 Crash fix 2022-05-20 16:29:27 +03:00
kd-11 d2de560060 rsx: Improve sync_hint callback interface 2022-05-20 16:29:27 +03:00
kd-11 5315eb546f rsx: Stop spamming ZCULL update method
- This has a negative impact when ZCULL is active due to spamming __rdtsc
- While the method is fast, it is not free and some checks are done before the instruction can be emitted
  Let's use the saved time to actually get something useful done
2022-05-20 16:29:27 +03:00
kd-11 7fa521a046 rsx/vk: Redesign how conditional rendering hints work
- Pass a sync address to the backend
- Ignore the hint if the query is running in lazy mode
- Do not submit CBs too close to each other. Submits are expensive
2022-05-20 16:29:27 +03:00
kd-11 0244c4046e rsx: Lower performance hit due to frequency fetch 2022-05-20 16:29:27 +03:00
kd-11 7e8c93bea2 Random optimization 2022-05-20 16:29:27 +03:00
kd-11 9a1e6cc3e8 rsx: Implement RSX reports area access detection and optimize around it
- If nobody is reading RSX reports, do not be in a hurry to write them
- Requires HLE of some methods (cellGcmGetTimestamp) to function correctly
2022-05-20 16:29:27 +03:00
kd-11 f0135a02f5 vk: Unconditionally enable hw acceleration for conditional evaluation 2022-05-20 16:29:27 +03:00
kd-11 0b7e013fbe rsx: Simplify ZCULL logic a bit 2022-05-20 16:29:27 +03:00
kd-11 850eef0c1a rsx: Move ZCULL logic to its own file
- It's over 1k lines of code in its own namespace; it really should be in its own file
2022-05-20 16:29:27 +03:00
Nekotekina a2bfd5fcfc Minor AArch64 support changes 2022-05-04 16:12:32 +03:00
RipleyTom 8316469cfc Update libusb to v1.0.26 2022-04-29 02:04:52 +02:00
kd-11 7a434d19a6 rsx/vp: Zero-initialize temporary registers 2022-04-28 01:31:07 +03:00
kd-11 95ac7724a6 Fix typos 2022-04-28 01:31:07 +03:00
kd-11 e236ba4daf rsx: Improve lowered precision comparison emulation 2022-04-28 01:31:07 +03:00
Megamouse 3183d73e4d OSK/overlays: fix initial input interception
Don't use default interception if we already intercept with custom params.
2022-04-26 00:51:38 +02:00
Eladash 7329fa9cf5 TRPLoader: Use std::string_view 2022-04-25 20:15:10 +02:00
Megamouse 8d662e9327 overlays: enable key repeat by default 2022-04-25 19:44:56 +02:00
Megamouse ff7636ea01 OSK/overlays: handle keyboard enter and escape 2022-04-25 19:44:56 +02:00
Megamouse 8f14f392fd overlays: ignore input if kb pad handler is active 2022-04-25 19:44:56 +02:00
Megamouse 5fad7e1b87 OSK: flush key input to prevent key event spam 2022-04-25 19:44:56 +02:00
Megamouse 8864f944e2 cellOskDialog: implement dimmer_enabled 2022-04-25 19:44:56 +02:00
Megamouse 918984ee64 overlays: only log actual input loop errors 2022-04-25 19:44:56 +02:00
Megamouse b29f106c51 cellOskDialog: implement base_color 2022-04-25 19:44:56 +02:00
Megamouse 71f8280c5e cellOskDialog: implement KeyboardEventHookCallback 2022-04-25 19:44:56 +02:00
Megamouse 0ff293707a OSK: allow device input during interception 2022-04-25 19:44:56 +02:00
Megamouse 9adab801ac cellOskDialog: implement device mask and lock 2022-04-25 19:44:56 +02:00
Megamouse aee91b4f6f OSK: Ignore gamepad input if a key was pressed 2022-04-25 19:44:56 +02:00
Megamouse ffd36ea662 OSK: handle keyboard input 2022-04-25 19:44:56 +02:00
nastys f21b298e5e Make MSL Fast Math and software vkSemaphore optional 2022-04-24 09:25:13 +02:00
Eladash f92b487947 rsx: Allow NV0039 0x2100 2022-04-22 18:20:23 +03:00
kd-11 bca7b02ae9 Fix compressed pitch calculation 2022-04-19 22:58:29 +03:00
sguo35 e761b3235c macos: fix build for arm64
Adds arm64 branches to some x86 specific code and modifies some casting
logic to make Clang happy
2022-04-18 17:53:54 +03:00
Eladash 6783bcd273 Log a snippet of guest thread code at crash 2022-04-15 22:34:51 +03:00
Eladash 1d51f3af0c RSX-Debugger: Implement backwards scrolling
* Use 2 points of known true RSX code roots and follow them in order to peek at the current section of valid RSX code:
These roots are: current RSX instruction address and the last targeted address by a branch instruction.
2022-04-15 22:34:51 +03:00
kd-11 57aee92bfe rsx: Separate guest flip timer from host timing operations 2022-04-13 23:39:01 +03:00
kd-11 89de1a8cf6 overlays: Fix frame timing 2022-04-13 23:39:01 +03:00
kd-11 60cbd7a88c Automatically determine the epsilon value programmatically 2022-04-13 15:48:28 +03:00
kd-11 2db68acab9 rsx: Implement Z value snapping to account for precision errors 2022-04-13 15:48:28 +03:00
kd-11 e53bbd668b rsx: Fix surface cache scanning and removal 2022-04-05 14:07:05 +03:00
kd-11 fc05511354 rsx: Optimize software sampling further for the 6-tap kernel 2022-04-04 16:51:03 +03:00
kd-11 ca35a75a7d rework weighting scheme 2022-04-04 16:51:03 +03:00
kd-11 15b7e4f05e 6-tap experiment 2022-04-04 16:51:03 +03:00
kd-11 49c84f099a rsx/glsl: Fixup 2022-04-04 16:51:03 +03:00
kd-11 43b267ea51 glsl: Rewrite MS sampling implementation 2022-04-04 16:51:03 +03:00
kd-11 a8441b28e8 rsx: Implement basic 2D bilinear filtering for MSAA images 2022-04-04 16:51:03 +03:00
kd-11 4a86638ce8 rsx: Avoid unnecessary memprotect syscalls 2022-03-29 12:35:32 +03:00
kd-11 e037b5c438 rsx: Handle in-place image swaps when locking data for WCB/WDB
- Rare, but possible if a surface address is switched from color to depth usage
- In such a case, deref the old image and ref the new one to avoid leaks
2022-03-29 12:35:32 +03:00
kd-11 f45343a345 rsx: Handle DMA block init where empty pages exist in the range 2022-03-29 12:35:32 +03:00
kd-11 94a7e52c1f rsx: Disable ref count on exit 2022-03-28 19:55:34 +03:00
kd-11 2b42895bc7 rsx: Reduce log spam a bit 2022-03-28 19:55:34 +03:00
kd-11 d98d152d23 rsx: Fix leaking surface cache refs from texture cache
- Lock surfaces in use by texture cache to prevent complete deletion
- Remove discarded surfaces from the reprotect cache to avoid uaf
2022-03-28 19:55:34 +03:00
kd-11 b645a7faf5 vk: Rebuild swapchain in case of unexpected errors during present 2022-03-28 19:55:34 +03:00
kd-11 ffa841e7c1 vk: Force resolve explicitly for transfer operations 2022-03-28 19:55:34 +03:00
kd-11 e66d6a9399 Fix interpreter 2022-03-26 16:10:18 +03:00
kd-11 ef65c47592 vk: Restore UBO alignment
- NV requires some very large alignment thresholds
2022-03-26 16:10:18 +03:00
kd-11 1592ecdc55 rsx: Invalidate transform block on program change
- Since each program now does a remap of the outputs, we need to reupload the constants
- This is not a loss, constants are almost always changing between draw calls anyway
2022-03-26 16:10:18 +03:00
kd-11 96742852eb Fix OGL 2022-03-26 16:10:18 +03:00
kd-11 de0e660d28 rsx: Handle vertex shaders with no constant references
- If no vc[] refs exist, do not upload anything!
2022-03-26 16:10:18 +03:00
kd-11 d057ffe80f rsx: Fix program generation and compact referenced data blocks 2022-03-26 16:10:18 +03:00
kd-11 9a2d4fe46b rsx: Relocatable transform constants 2022-03-26 16:10:18 +03:00
RipleyTom a4d715e25d Warning Fixes 2022-03-23 19:35:10 +01:00
kd-11 af0e1f609e Fix vulkan compilation warnings 2022-03-23 11:26:06 +03:00
kd-11 1ab5b481ff Fix ambiguous comparison operator warning 2022-03-23 11:26:06 +03:00
kd-11 26ee1246ae rsx: Block size back down to 4MB
- 4M is a good compromise, a 720p surface occupies just under 4MB
2022-03-23 11:26:06 +03:00
kd-11 d0402332f7 rsx: Bump surface cache block size to 16M 2022-03-23 11:26:06 +03:00
kd-11 43c7417906 rsx: Rework ranged map
- Adds metadata lookup for intersecting range calculations
- Make fetch/put methods more explicit
2022-03-23 11:26:06 +03:00
kd-11 56540a55ec Fix linux 2022-03-23 11:26:06 +03:00
kd-11 35ec4de776 rsx: Optimize surface store for faster scanning 2022-03-23 11:26:06 +03:00
kd-11 bc7ed8eaab rsx/vk: Rework MSAA implementation 2022-03-17 22:02:20 +03:00
Megamouse 04df392866 Log cpu usage periodically 2022-03-16 19:42:06 +01:00
kd-11 78b8bd80e4 rsx: Unconditionally set MSAA flags if MSAA is active 2022-03-11 01:15:13 +03:00
kd-11 1943d9819f rsx: Clean up surface cache routines around RTT invalidate 2022-03-10 20:43:58 +03:00
kd-11 59a0cf94ab rsx: Fix msvc build 2022-03-08 22:06:26 +03:00
kd-11 3e4faf602a rsx: Fix clang build 2022-03-08 22:06:26 +03:00
kd-11 454a724f4e rsx: Reduce the performance impact of enabling the profiling timer
- Just use TSC if available
2022-03-08 22:06:26 +03:00
kd-11 cfecbb24ca rsx: Avoid calling slow functions every draw call
- Use TSC for timing where interval duration matters.
- Use atomic counter for ordering timestamps otherwise.
2022-03-08 22:06:26 +03:00
kd-11 762b594927 rsx: Fully process texture if surface cache configuration changed 2022-03-08 22:06:26 +03:00
kd-11 8d3d290e33 rsx: Fix build 2022-03-08 22:06:26 +03:00
kd-11 0df903090d rsx: Optimize metrics a bit
- For some reason this has a massive impact on performance above some arbitrary threshold of calls
  Shows up under surface_cache::get_merged_memory_region when doing gathers.
2022-03-08 22:06:26 +03:00
kd-11 6812fa4764 rsx: Fix surface write coherency when MSAA is active 2022-03-08 22:06:26 +03:00
Megamouse cd97d74f0f cellMusic/Decode: add SelectContents functions 2022-03-08 09:02:59 +01:00
Megamouse aafd74f9ea cellMusicDecode: initial implementation
Implements the basic functionality of cellMusicDecode.
Works with Space Invaders (if you add the list selection from the other PR).
Probably fixes SSX custom music.
2022-03-05 18:34:27 +01:00
kd-11 0dbfe314a3 vk: Encode image type when caching resources 2022-03-01 21:51:55 +03:00
kd-11 00a1864a95 Revert "rsx: Downgrade depth-1 3D images to 2D (#11593)"
This reverts commit 6c096b72b5.
2022-03-01 21:51:55 +03:00
kd-11 6c096b72b5
rsx: Downgrade depth-1 3D images to 2D (#11593)
- Fixes problems with implicit view types derived from dimensions.
2022-03-01 10:45:50 +03:00
kd-11 e035000864 vk: Do not enable passthrough DMA unconditionally (yet)
- There are still some kinks to work out. Host labels do not fix all the bugs which means I missed something.
2022-02-26 10:28:46 +03:00
kd-11 6db5d83615 Flush dma offloader on texture read sema 2022-02-25 10:53:55 +03:00
kd-11 f3823232e0 Disable passthrough DMA for proprietary intel driver 2022-02-23 21:15:08 +03:00
kd-11 6b8b23c401 vk: Drain the label queue before using the CPU fallback to avoid out-of-order signals
- This avoids crashes in some game engines which expect RSX semaphores to signal in the order they are submitted.
2022-02-23 12:57:04 +03:00
kd-11 6fd2a9b677 rsx: Remove leftover dprints 2022-02-23 12:57:04 +03:00
kd-11 da559b5568 vk/rsx: Tuning and optimization for host labels 2022-02-23 12:57:04 +03:00
kd-11 24587ab459 rsx: Add the option to the advanced tab 2022-02-23 12:57:04 +03:00
kd-11 c7e49b58a8 rsx: Implement host GPU sync labels 2022-02-23 12:57:04 +03:00
kd-11 10e6b43a2f Drop redundant declaration 2022-02-21 23:58:01 +03:00
kd-11 0809e7cf9f Fix build 2022-02-21 23:58:01 +03:00
kd-11 12fd43e1c6 vk: Remove unused variables 2022-02-21 23:58:01 +03:00
kd-11 397a795e75 vk: Remove hardcoded command buffer list length 2022-02-21 23:58:01 +03:00
kd-11 1f9ade0ab6 vk: Remove pointless function (VKGSRender::open_command_buffer)
A relic of the past, back before we wrote wrappers for raw handles.
2022-02-21 23:58:01 +03:00
kd-11 83407c386c vk: Move renderer types to a separate file
- Makes my life easier managing conflicts
2022-02-21 23:58:01 +03:00
kd-11 b791d90b35 vk: Rewrite command buffer chains 2022-02-21 23:58:01 +03:00
Megamouse 93e7988df7 rsx: add boost mode shortcut 2022-02-20 11:56:11 +01:00
nastys 7801e8368b Add MoltenVK Semaphore setting 2022-02-20 08:47:16 +01:00
kd-11 254ddcad51 vk/dma: Initialize COW DMA block contents to avoid leaks
- It is possible to lose data when uploading since the result of map_dma can change types and handles.
- Consider sync-on-exit for inherited spans

Not a problem when using passthrough DMA, but this extension does not work properly on NVIDIA + windows
2022-02-16 16:33:27 +03:00
kd-11 2d5d5746d1 gl: Harmonize format conversion values
- Return values that are true to the PS3, not the host.
2022-02-13 15:31:39 +03:00
kd-11 314b63eebf vk: Drop unused native format ABGR8 2022-02-13 15:31:39 +03:00
kd-11 f382d54e9a gl: Remove pointless assert 2022-02-13 15:31:39 +03:00
kd-11 df5295ae85 vk: Per work-queue scratch resources
- Avoids parallel tasks from trampling over each other's data
2022-02-13 14:39:42 +03:00
kd-11 c8ad8b18bb vk: Ignore queue transfer stuff when using 'fast' mode 2022-02-13 14:39:42 +03:00
kd-11 44cc254620 Fix linux build 2022-02-13 14:39:42 +03:00
kd-11 cef512a123 vk: Spec-compliant async compute 2022-02-13 14:39:42 +03:00
kd-11 ec3e8de780 rsx: End the current frame before performing cache cleanup to release in-flight data 2022-02-10 22:20:56 +03:00
kd-11 f667b52cca vk: Rewrite resource management 2022-02-10 22:20:56 +03:00
kd-11 48b54131f6 vk: Fix up multiple resource allocation routines
- Originally part of async bringup. Imported to allow smoother transition.
2022-02-10 22:20:56 +03:00
Megamouse d172b9add6 Rename CallAfter to CallFromMainThread 2022-02-07 19:42:08 +01:00
kd-11 2d9f21a2ea rsx: Lower performance warnings to 'warn' level instead of 'error' level to avoid causing panic for users 2022-02-07 09:25:01 +03:00
kd-11 247759b75b rsx: Fix memory tagging and add some security checks 2022-02-07 09:25:01 +03:00
kd-11 90d368ae30 vk: Speed up cached image search a bit 2022-02-06 15:49:50 +03:00
kd-11 a2d33a7d76 vk: Fix WCB crash 2022-02-06 15:49:50 +03:00
kd-11 51f9310b9f vk: Silence compiler warnings 2022-02-06 15:49:50 +03:00
kd-11 dca3d477c9 vk: Use image hot-cache for faster allocation times
- Creating new images is expensive.
- We can keep around a set of images that have been recently discarded and use them instead of creating new ones from scratch each time.
2022-02-06 15:49:50 +03:00
nastys 6b370e85d5 Add overlay animations 2022-02-06 12:26:34 +01:00
Eladash e951c619c5
Implement Emulator::GracefulShutdown() 2022-02-05 11:49:29 +01:00
kd-11 86919ec0e1 rsx: Validate requested images before attempting to upload them
- Do not allow dimensions of 0 to reach the backend APIs
2022-01-30 14:58:51 +03:00
kd-11 0e320d17c1 vk: Fix 'grow' behavior when we reach the size limit
- Just swap out the current heap ptr and spawn a fresh one. Chances are, we can spare 1GB of host memory.
2022-01-30 10:56:15 +03:00
kd-11 d063f0b335 vk: Fix working buffer calculation for emulated D16F operations 2022-01-30 10:56:15 +03:00
Eladash 781b2b4548
Implement fs::isfile (#11447) 2022-01-29 22:10:48 +03:00
Nekotekina 16aae4eb77 Fixup creating image path 2022-01-26 15:46:16 +03:00
Nekotekina 3a1082fe0d Fix overlays::image_info constructor 2022-01-26 15:46:16 +03:00
kd-11 ffe00e8619 gl: Clean up format bitcast checks and register D32F type for FORMAT_CLASS16F
- Also hides a dangerous export for vulkan, same as GL
2022-01-26 12:08:36 +03:00
kd-11 3fa45ff994 Fix missing typeless info update 2022-01-26 12:08:36 +03:00
Eladash 73ff506b88 overlay_controls.cpp: Improve image_info ctor withstandability 2022-01-26 10:35:52 +03:00
kd-11 3a1676e558 vk: Fix float16 requirement issue 2022-01-25 21:34:21 +03:00
Nekotekina 0db9850a73 Add loop building utilities for ASMJIT
Refactor copy_data_swap_u32 a bit
2022-01-25 03:16:37 +03:00
Nekotekina 12c83b340d Remove built_function
With today's branch prediction techniques, it's hardly useful.
2022-01-24 22:21:41 +03:00
kd-11 1fa82eec89 vk: Rework format feature validation
- Requirements have changed a lot over the years. We no longer blit Z formats around for example because they never support linear filtering
- Removing some unused requirements allows more hardware to be usable
2022-01-24 19:14:27 +03:00
kd-11 2f7d38bb81 rsx: Improve coverage checking logic to handle 3D and cubemap resources 2022-01-23 00:03:03 +03:00
kd-11 4f8b5849b7 rsx: Take depth into account when calculating coverage 2022-01-23 00:03:03 +03:00
kd-11 7f216f2581 rsx: Fix local slice height calculation 2022-01-23 00:03:03 +03:00
kd-11 6ffd38c393 vk: Only enable DCC workaround if the format features allow it 2022-01-22 13:16:48 +03:00
nastys 801e7f3c2f macOS: Implement texture swizzling for 16-bit formats 2022-01-22 00:17:17 +01:00
nastys c7140df5f8 Initial support for Apple GPUs 2022-01-22 00:17:17 +01:00
nastys 6b5f0957ce Disable macOS swizzling workaround 2022-01-22 00:17:17 +01:00
kd-11 3942a464fe vk: Avoid leaking descriptor copies 2022-01-20 19:21:24 +03:00
kd-11 2331dc3256 vk: Keep the total number of allocated samplers under control 2022-01-20 19:21:24 +03:00
Nekotekina 4704367382 Remove unnecessary asmjit::imm_ptr 2022-01-18 00:10:32 +03:00
Nekotekina 14cca55b50 PPU: refactor vector rounding instructions
Fix: nearbyint -> roundeven
2022-01-18 00:10:32 +03:00
kd-11 000ec71629 Fix invalid descriptor setup if subdraw0 has broken vertex setup 2022-01-17 12:38:10 +03:00
kd-11 3e794e7fdb rsx: Optimize 8-bit rounding logic a bit
- NV hw does not like the raw use of round()
2022-01-17 10:28:23 +03:00
kd-11 c38ca21a81 rsx: Round up 8-bit ROP output on NVIDIA cards
- NV GPUs have a tendency to be off by a very small margin, breaking rendering when greaterThan/lessThan checks are used.
- NOTE: Currently this setting is using the sRGB flag which indicates 8-bit output.
  Only one game is currently known to care about this behaviour so this is good enough for now.
2022-01-17 10:28:23 +03:00
kd-11 f923eaf09a rsx: Surface format remapping enhancements 2022-01-17 10:28:23 +03:00
Nekotekina 580bd2b25e Initial Linux Aarch64 support
* Update asmjit dependency (aarch64 branch)
* Disable USE_DISCORD_RPC by default
* Dump some JIT objects in rpcs3 cache dir
* Add SIGILL handler for all platforms
* Fix resetting zeroing denormals in thread pool
* Refactor most v128:: utils into global gv_** functions
* Refactor PPU interpreter (incomplete), remove "precise"
* - Instruction specializations with multiple accuracy flags
* - Adjust calling convention for speed
* - Removed precise/fast setting, replaced with static
* - Started refactoring interpreters for building at runtime JIT
*   (I got tired of poor compiler optimizations)
* - Expose some accuracy settings (SAT, NJ, VNAN, FPCC)
* - Add exec_bytes PPU thread variable (akin to cycle count)
* PPU LLVM: fix VCTUXS+VCTSXS instruction NaN results
* SPU interpreter: remove "precise" for now (extremely non-portable)
* - As with PPU, settings changed to static/dynamic for interpreters.
* - Precise options will be implemented later
* Fix termination after fatal error dialog
2022-01-15 06:48:04 +03:00
kd-11 d6aa834b5f vk: Enable shading rate hack for all GPUs
- This is a hack, ideally we should be using coverage-based masking when writing the exploded texture.
- We do not have access to the fragment coverage mask and it is non-trivial to integrate it in a competent manner.
2022-01-14 10:21:38 +03:00
kd-11 6d737e61fd rsx: Use 32 bit integers for pitch
- RSX max pitch = 65536 which requires 17 bits
2022-01-10 12:27:30 +03:00
kd-11 83026fd263 rsx: use coverage ratio to determine when too much data is overlapping 2022-01-07 22:55:27 +03:00
kd-11 92824b6729 rsx: Rework invalidation tagging 2022-01-07 22:55:27 +03:00
kd-11 7563655221 rsx: Bump surface removal threshold values
- It is much slower to attempt surface removal than to render duplicates on the host GPU
2022-01-07 22:55:27 +03:00
kd-11 6889b48973 rsx: Add optimized version of section removal code 2022-01-07 22:55:27 +03:00
Eladash bba528e2ae
rsx: Fix wrong fault report in initialization (#11323)
* rsx: Fix wrong fault report in initialization

* Ensure emu.isstopped() == true at RPCS3 startup

Based on zero initialization.
2022-01-05 20:41:01 +03:00
kd-11 7c47b0029c gl: Fully drop alignment restriction for compressed textures
- This is just not part of spec, there is no enforcement for multiple of block size for width or height of s3tc compressed images.
- This restriction does indeed exist for ASTC and ETC but we're not using those formats.
2022-01-02 14:29:38 +03:00
Nekotekina cb2748ae08 Update ASMJIT (new upstream API) 2021-12-29 02:45:00 +03:00
Nekotekina d836033212 LLVM: enable some JIT events (Intel, Perf)
Made some related adjustments.
Currently incomplete.
2021-12-26 16:41:37 +03:00
Nekotekina 510041a873 rsx_methods.cpp: optimize compile time (120s to 10s)
Untemplate NV308A_COLOR
2021-12-26 14:40:21 +03:00
Nekotekina 8b4b6ba946 copy_data_swap_u32: build AVX-512 path 2021-12-26 14:40:21 +03:00
Nekotekina 599e00d6da BufferUtils: remove dead code (vertex streaming)
RIP. It won't be useful.
2021-12-26 14:40:21 +03:00
Nekotekina 3cd8891ab8 Re-refactor copy_data_swap_u32 again
Drop AVX2 path for now, since it usually operates on small data.
Rely on automatic SSE vectorization on recent compilers.
Side refactoring on JIT.h to work around weird conflict issue.
2021-12-26 14:40:21 +03:00
kd-11 a9303acfdf rsx: Fix zclip w scaling 2021-12-26 12:50:31 +03:00
nastys a0040e6fb1
macOS: Implement texture converter for Metal (2) (#11289)
* macOS: Implement texture converter for Metal (2)

* Fix texture conversion formatting
2021-12-24 15:46:37 +03:00
kd-11 28d7af313b rsx: Remove noisy debug print 2021-12-24 15:13:33 +03:00
kd-11 39ef39aa4e rsx: Exercise caution when testing for overlaps in invalidated sections 2021-12-24 15:13:33 +03:00
kd-11 56dd09f4fe rsx: Handle floating point shenanigans
- If near and far clip are too close together, the API will not distinguish between them leading to out of bounds values
2021-12-22 22:08:53 +03:00
kd-11 de495952fd rsx: Enable fallback for devices without wide integer Z buffers 2021-12-22 22:08:53 +03:00
kd-11 1ce5349199 rsx: Remove zclip hackery
- Calculates precise Z value as requested by the game
- Works properly if the underlying Z format matches the PS3 1:1 but may cause minor problems otherwise
2021-12-22 22:08:53 +03:00
Nekotekina 12e3c9e08b Use PAUSE in vk::query_pool_manager::get_query_result 2021-12-21 23:28:09 +03:00
Nekotekina 262ff01619 Use aligned stores in write_index_array_data_to_buffer
Ensure that target buffer is cache line aligned.
Improve stx::make_single to support alignment.
2021-12-21 23:28:09 +03:00
Nekotekina 76ccaf5e6f BufferUtils: refactoring
Optimize CPU capability tests for arch-tuned builds.
Separate streaming and non-streaming utilities.
Rewritten copy_data_swap_u32(_cmp) with AVX2 path.
2021-12-21 23:28:09 +03:00
nastys 47e4a95d8f
Fix remap_vector redefinition on macOS (#11271) 2021-12-21 10:36:09 +01:00
nastys 08333e0876
macOS moltenVK support and SIGBUS handling (#11252) 2021-12-12 21:35:56 +01:00
kd-11 d523f9cc6b rsx: Avoid skipping input mask checks due to static flow control 2021-12-08 23:58:32 +03:00
kd-11 7ca15c60bb rsx: Improve image aspect tests
- Replace old format-based detection with proper aspect test.
  Explicit image aspect has been available for a long time, but older
  code was not updated.
2021-12-08 23:58:32 +03:00
Nekotekina d6420b8803 Put std::hash specialization out of std 2021-12-07 13:04:10 +03:00
DH 49c02854f5 [rsx] reduce size of config structs 2021-12-02 21:36:57 +03:00
DH cccfb89aa0 [Config] Use std::less<> for std::map<...>
Reduces amount of string copies
[Utilities] fmt::replace_all: avoid creation of temporary strings
2021-12-02 21:36:57 +03:00
kd-11 02832d9623
rsx: Add some sensible fallbacks (#11219)
* rsx: Add some sensible fallbacks

* Update GLPresent.cpp

* Update VKPresent.cpp

* Update rsx_utils.h

* Update rsx_utils.cpp
2021-12-02 16:02:08 +03:00
kd-11 9bb46aa944 rsx: Simplify unconstrained aspect ratio conversion
- There is a reason resolutions are defined by only a height variable.
2021-12-01 21:55:53 +01:00
Megamouse aea1ec2594 avconf: Add const to fxo references 2021-12-01 21:55:53 +01:00
kd-11 22a7b026e7 rsx: Fix image scaling
- Specifically fixes a corner case where double transforms are required.
  Technically this can be made more readable using transformation matrices:
  * M1 = transform_virtual_to_physical()
  * M2 = transform_image_to_virtual()
  * M3 = M1 * M2
  * Result = Input * M3
  But we don't use a CPU-side matrix library and it is not reasonable to do this on the GPU.
2021-12-01 21:55:53 +01:00
Megamouse c8d4a0dcdc VK/GL: honor game's aspect ratio when scaling 2021-12-01 21:55:53 +01:00
kd-11 38bfefcdfa vk: Fix incorrect mixed transfer modes for mipmapped VTC 2021-11-28 01:44:21 +03:00
kd-11 44fe6f6d39 rsx: Fix sloppy format matching test 2021-11-27 17:47:41 +03:00
orbea a84223bdc6 rpcs3: Fix the DATADIR path for AppImage
Even when DATADIR is defined the other paths may still be correct.

Fixes: https://github.com/RPCS3/rpcs3/issues/11195
2021-11-24 19:14:06 +01:00
kd-11 4df1a938b1 Unused var 2021-11-24 16:02:24 +03:00
kd-11 94a3b1cfe8 rsx: Roll back some optimizations
- Just use RGB565 for all blit targets. Avoids really dumb transforms done by GPU hw.
- When X16 is used, all the channels get written to R channel alone. CmdBlit does perform format conversion!
- gl: Force image copy when blit is requested with compatible targets. Avoids format conversion issues.
2021-11-24 16:02:24 +03:00
kd-11 a21c6c4628 rsx: Fix handling of scaling requests for packed formats
- One does not simply interpolate RGB565 components as U16 data!
2021-11-24 16:02:24 +03:00
kd-11 58f0fa3ca5 gl: Enable handling of X16 blit targets 2021-11-24 16:02:24 +03:00
kd-11 97bd8f7bc1 rsx: Update sampler format class when inheriting mipmap slices/sections 2021-11-24 16:02:24 +03:00
AniLeo 1df8f52a9f vk: Remove lavapipe workaround
Current lavapipe version now has support for 
shaderStorageBufferArrayDynamicIndexing
2021-11-23 22:48:46 +01:00
orbea 59f253ba24 cmake: Use GNUInstalldirs 2021-11-22 21:45:55 +01:00
Megamouse 7eee9e7b05 overlays: simplify backup icon copy procedure 2021-11-20 08:43:46 +01:00
Megamouse 0f7534c755 VK: fix NVIDIA driverVersion check 2021-11-16 09:31:16 +01:00
Megamouse 4d0330bf82 rsx: fix possible segfault 2021-11-16 09:31:16 +01:00
Megamouse f6e04ffdd2 overlays: add stick input to native dialogs 2021-11-16 01:38:33 +01:00
Megamouse 44b42f68fd overlays: add R3, L3 and PS buttons
Unused at the moment
2021-11-16 01:38:33 +01:00
Megamouse ff5e31f396 overlays: add system sounds 2021-11-15 23:03:30 +01:00
kd-11 59b1c324a9 rsx: Properly implement immediate mode rendering
- Treat the draw commands as being consumed on-the-fly with ATTR0 as provoking attribute
- Analysing streams sent to RSX and the results implies they are consumed fully inline.
  This only makes sense if a provoking attribute is present. The 'static' register is truly the immediate register for the draw.
2021-11-15 18:14:15 +03:00
kd-11 1f627caa81 rsx: Clear some leaking register state between runs 2021-11-15 18:14:15 +03:00
kd-11 7e3eab9915 rsx: Fix texture state propagation between unrelated draw calls
- Older games can load all textures before a draw sequence and then swap shaders for different draws.
- Optimizations in texture state streaming make it so that only referenced data is carried forward.
2021-11-09 12:39:49 +03:00
Megamouse 88bb26afb4 vk: make upscaler dynamic
The config option was marked as dynamic, but was never actually changed in-game
2021-11-06 01:02:54 +01:00
kd-11 f7eacf70ec rsx: Restore shader disassembler to working state 2021-11-05 23:55:07 +03:00
kd-11 933d96af5f vk: Do not clip region using renderpass renderarea, we have scissor for that 2021-11-04 21:05:15 +03:00
kd-11 ad00c44231 rsx: Configure pitch correctly for pitch-zero textures (1D) 2021-11-03 16:58:30 +03:00
Eladash b84e95d768 rsx: Fixate time stamp of VBLANK 2021-11-01 10:04:53 +01:00
Eladash 4369fb234e rsx: Fix typo in VBLANK processing regarding emulation pause 2021-11-01 10:04:53 +01:00
Eladash 58040d478a rsx: Implement NTSC fixup mode, improve VBLANK accuracy 2021-11-01 10:04:53 +01:00
kd-11 5b0ef401f7 rsx: Fix sampling in X when 0 pitch is given
- A pitch of 0 still allows 1-dimensional addressing.
2021-10-31 14:32:42 +03:00
Megamouse 1650dd1c7d overlays: fix graph offset error after applying new config
I already had this figured out last time but forgot the dynamic config use case.
2021-10-31 10:14:08 +01:00
Megamouse 84f123041a overlays: fix offset of right edge oriented graphs when detail level is none 2021-10-31 10:14:08 +01:00
Megamouse f258ae795c Add more logging for Emulator Stop events
This should give us more insight into the conditions that cause emulation stops.
This may also help find false issue reports.
2021-10-31 04:12:47 +01:00
Megamouse 33e80a733d overlays: fix trophy notification sound in queue 2021-10-30 22:44:30 +02:00
Megamouse 0e20acdf55 overlays: add optional sound to trophy popup 2021-10-30 17:16:45 +02:00
Megamouse f262e77fbd overlays: add fade to trophy notification pop-ups 2021-10-30 17:16:45 +02:00
Megamouse 244aa6879a overlays: fix trophy notification pop-up locations 2021-10-30 17:16:45 +02:00
kd-11 78bcb0fd53 rsx: Do not reuse/destroy sections that have references held
- Avoids a situation where blit-dst and blit-src have overlapping ranges. Uploading blit-dst destroys blit-src and vice-versa.
  This is not the end of the world, but blit-src should be kept around until the operation is completed to avoid stale references!
2021-10-27 12:30:43 +03:00
kd-11 c733e794de gl: Use real image dimensions when decoding compressed textures
- Image size is already correctly calculated using block dimensions
2021-10-27 12:30:43 +03:00
kd-11 99fc90648b gl: Disable shader interpreter if hardware does not support bindless textures 2021-10-27 12:30:43 +03:00
kd-11 2587545eed gl: Fix decoding of wide, swizzled textures
- Handle pre-byteswapped data (swizzled usually) in the compute-safe path
2021-10-27 12:30:43 +03:00
kd-11 4ed92f4155 vk: Fully allow CB change in emit_geometry
- upload_vertex_data can trigger a flush to CELL which will result in CB flush.
  Ensure CB state is correctly reloaded in such a situation.
2021-10-20 12:05:39 +03:00
Eladash ab50e5483e
GUI Utilities: Implement instruction search, PPU/SPU disasm improvements (#10968)
* GUI Utilities: Implement instruction search in PS3 memory
* String Searcher: Case insensitive search
* PPU DisAsm: Comment constants with ORI
* PPU DisAsm: Add 64-bit constant support
* SPU/PPU DisAsm: Print CELL errors in disasm
* PPU DisAsm: Constant comparison support
2021-10-12 23:12:30 +03:00
kd-11 d58df667b9 rsx: Fix some texture decode instructions
- Fix TEX1D_PROJ definition
- Make TEX3D_PROJ cubemap-compatible
2021-10-12 13:47:08 +03:00
kd-11 479150b214 rsx: Fix decoding of linear cubemaps
- 128-byte boundary is not observed in linear tiling. Verified in hw.
2021-10-10 16:15:28 +03:00
kd-11 e1d1d16227 gl: Alias register binding points a bit
- While aliasing is easy to break, it allows outdated hw to run
2021-10-10 16:15:28 +03:00
kd-11 b3725baf5a rsx: Rewrite shader decompiler texture dispatch 2021-10-09 15:10:36 +03:00
kd-11 f1d9a014c0 vk: Silence compiler warning 2021-10-09 15:10:36 +03:00
Megamouse af11546b1e Overlays: fix small performance overlay font sizes 2021-10-04 19:57:57 +02:00
kd-11 f90bf2dd40 vk: Use a dynamic number of descriptor allocations 2021-09-29 01:20:32 +03:00
kd-11 dc8fc9fc79 vk: Clean up around vkQueueSubmit handling
- Explicitly declare one version for CB flush and the other for Async flush
- Always flush descriptors on CB flush in case of page fault handling.
  Threads other than the offloader can also enter the method and require normal flow.
- Fix overlapping interrupt IDs.
- Minor formatting fixes
2021-09-28 23:18:26 +03:00
kd-11 3d49976b3c vk: Add deregister event for sets
- Unused in practice, but this is more for peace of mind.
2021-09-28 17:43:15 +03:00
kd-11 eed38e1bbc vk: Make the new descriptor system spec compliant 2021-09-28 17:43:15 +03:00
kd-11 9595297a3a Whitespace fix 2021-09-28 17:43:15 +03:00
kd-11 7c5b5d25e3 vk: Implement descriptor allocation batching 2021-09-28 17:43:15 +03:00
kd-11 2e22a0d9bb rsx: Optimize thread self-tests 2021-09-28 17:43:15 +03:00
kd-11 ba2a8ebf2e vk: Enable deferred descriptor updates via descriptor-indexing 2021-09-28 17:43:15 +03:00
kd-11 381c7544fa Optimize basic descriptor batching 2021-09-28 17:43:15 +03:00
kd-11 4752c4014b vk: Implement basic descriptor updates batching 2021-09-28 17:43:15 +03:00
kd-11 24642a4c18 vk: Refactor descriptors a bit 2021-09-28 17:43:15 +03:00
kd-11 62979c7bd9 vk: Enable descriptor indexing extension if supported 2021-09-28 17:43:15 +03:00
kd-11 7b9fb7ad9c rsx: refactor rsx_utils a bit
- Move obviously standalone things to their own utility files
2021-09-28 17:43:15 +03:00
kd-11 7f830d555d vk: Simplify texture cache OOM tracking a bit 2021-09-28 17:43:15 +03:00
kd-11 9aafd8c09f rsx: Avoid get_system_time for simple draw ordering 2021-09-28 17:43:15 +03:00
kd-11 6781eb7c76 rsx: Avoid calling get_system_time() every draw call 2021-09-28 17:43:15 +03:00
kd-11 3e09b97f58 rsx: Minor optimization; avoid preparing unused vertex streams
- Also discards unused program state variables
2021-09-28 17:43:15 +03:00
Megamouse 269c4604aa VFS: move VFS settings to separate file 2021-09-25 19:21:59 +03:00
kd-11 e4aff539b0 vk: Fix scanning for upload heap types.
- HOST_CACHED support must be prioritized, but is not a mandate.
- Scan for that flag explicitly and fall back to uncached if it is not supported.
- Uncached memory is too slow for our requirements to contend with cached memory.
2021-09-23 01:45:37 +03:00
Megamouse f1037f75d9 perf_overlay: fix initial graph positions with detail level none 2021-09-22 08:06:58 +02:00
Megamouse 81a01134bb cellOsk: fix dialog abort w/o user interaction 2021-09-21 23:22:26 +02:00
kd-11 3c7ada8e83 rsx: Fix 3D texture decode
- 3D mipmaps are shrunk in all 3 axes, they are not 2D array textures.
- Fixes mip1-mipN for all situations
2021-09-21 19:53:46 +03:00
kd-11 46b3027981 rsx: Invariably clear the texture state if referenced. 2021-09-21 19:53:46 +03:00
kd-11 334999f639 vk: Enable sampler mirror-clamped-to-edge as an extension 2021-09-21 19:53:46 +03:00
kd-11 dabfce5c82 rsx: Rework how depth/stencil initialization+clear works 2021-09-21 19:53:46 +03:00
kd-11 0a8d9a12ab vk: Rewrite memory initialization 2021-09-21 19:53:46 +03:00
kd-11 19b2da2590 Enable stencil export extension when required 2021-09-21 19:53:46 +03:00
Megamouse a50e22a11f Overlays: Fix position of centered perf-overlay 2021-09-19 20:30:02 +02:00
Megamouse 14a425e487 rsx: wait when emulation is paused
This decreases my CPU usage to <1% during Emu.Pause()
2021-09-17 23:13:24 +02:00
kd-11 c2ab3c664c rsx: Fix stupid overflow 2021-09-17 20:12:08 +03:00
xddxd bcda172ae7 Switch from r16ui to r16 2021-09-16 14:09:21 +03:00
xddxd d511e76a63 Enable the precise occlusion query feature 2021-09-16 14:09:21 +03:00
Eladash 5600430a05 Fix user_interface::alloc_thread_bit() usage 2021-09-13 22:36:53 +03:00
kd-11 53457262d4 rsx: Implement ZPASS results scaling for precise stats 2021-09-06 20:04:03 +03:00
kd-11 472efc08eb rsx: Implement precise ZCULL stats 2021-09-06 20:04:03 +03:00
Megamouse 0debcfed0a Silence some warnings 2021-09-02 19:39:42 +02:00
kd-11 b5dcfb3431 rsx: Rework gamma override mask from RGBA to ARGB to match other per-channel mask registers 2021-08-30 11:41:19 +03:00
kd-11 a5e455d8ed rsx/fp: Handle signed operator precedence
This was marked TODO for a long time
- Unsigned remap seems to be overridden by gamma mask (Resistance 3)
- We already know sign mask overrides gamma mask from UE3 titles
2021-08-30 11:41:19 +03:00
kd-11 3ab9e04db7 rsx: Fix surface access bit flags
- The previous enumeration was a holdover from older access management.
- A bitflag of 0 seriously messes up the mask tests
2021-08-29 11:10:30 +03:00
kd-11 b0e352c44e Add missing const 2021-08-26 13:55:00 +03:00
kd-11 2ff407ac6a rsx/fp: Fix perspective correction handling
- Perspective correction flag multiplies VP output by HPOS.w.
  NOTE: Not the same as division by w when it comes to NaN/Inf problems!!
- Restructure indexed loads a bit to avoid re-initializing registers unnecessarily
2021-08-26 13:55:00 +03:00
kd-11 b0e5de4c9c rsx: Texcoord control mask affects decompiler output! 2021-08-26 13:55:00 +03:00
kd-11 57b9acec62 rsx: Implement indexed dynamic attribute load 2021-08-24 16:52:18 +03:00
kd-11 c1f31d37f5 fsr: Mark output images explicitly as nonreadable 2021-08-24 15:30:46 +03:00