467 commits

Author SHA1 Message Date
Eladash 9fc5f6271b Update SPU reservation notifier mask 2023-07-23 17:58:54 +03:00
Eladash c44cddabfa CPUThread.cpp: Fix use of cpu_counter::add
This also fixes a crash when saving a savestate, because the main thread uses cpu_counter::suspend_all, which adds concurrency.
2023-07-23 17:58:54 +03:00
Eladash c0280b43f2 PPU/Debugger: View the currently used CR field content in register panel 2023-07-12 13:22:06 +03:00
Eladash 16579e0b1f Fix spu_thread::cleanup() 2023-06-06 09:48:27 +02:00
Elad Ashkenazi 23c710cf53 CPUThread.cpp: Fix an emulator crash on game exit 2023-05-22 20:04:49 +03:00
Ivan Chikish 45fecf0059 SPU LLVM: disable AVX2 shift intrinsics
Was incorrectly checked.
2023-04-23 18:36:45 +03:00
Ivan Chikish 22bd7dcc42 PPU LLVM: disable DSE pass and use volatile store/loads 2023-04-14 07:26:30 +03:00
Ivan Chikish 06b0e35fb9 Update to LLVM 16.0.1
Fix Zen4+ AVX-512 detection
2023-04-11 12:13:09 +03:00
Ivan Chikish fb88e1c1c9 Update to LLVM 16.0.0, switch to upstream LLVM 2023-04-06 10:19:31 +03:00
oltolm 520524285a
llvm: update code to new API (#13500)
* llvm: update code to new API

* llvm: remove OLDLLVM define
2023-03-11 01:57:21 +03:00
Eladash 0da81d22d3 SPU Profiler: fix CPU usage when emulation is paused
Avoid collecting samples when the thread is paused.
2022-10-20 11:22:33 +03:00
Eladash 52b993095d SPU Profiler: nearly always print on Emu.Pause() 2022-10-20 11:22:33 +03:00
Eladash d25d1ecb3a LV2: Avoid using multi-variable atomic waiting on cpu_thread::state wait 2022-10-04 16:28:34 +03:00
Eladash 58dd2bff41 Savestates: Fix unintentional pause when saving with HLE VDEC contexts 2022-10-04 14:14:38 +03:00
Malcolm Jestadt d8897c585d PPU/SPU LLVM: Allow Zen4 cpus to use VPERMI2B/VPERMT2B instead of the vperm2b256to128 path
- Zen4-based CPUs can process VPERMI2B/VPERMT2B in a single uop, unlike Intel, where it takes 3 uops.
2022-10-01 15:38:29 +03:00
Eladash 194f7375da SPU/LV2: Fix tiny race conditions 2022-09-21 20:35:34 +03:00
Nekotekina b49a1f27eb Warning fixes 2022-09-17 16:35:02 +03:00
Eladash 9d9e18f614 CPU preemption control: don't yield if we can't stop 2022-09-16 18:57:55 +03:00
Eladash fc331da883 CPU preemption control: remove yield before thread stop 2022-09-16 18:57:55 +03:00
Eladash b6d3fa8c66 CPU preemption control: avoidance in reservation operations 2022-09-16 18:57:55 +03:00
Eladash cf4da5c4d1 CPU preemption control: bugfixes 2022-09-16 18:57:55 +03:00
Eladash 9d1ec0b319 CPU preemption control: try to minimize sleep time gaps between setups 2022-09-16 18:57:55 +03:00
Eladash ec7b18dab5 Implement independent CPU preemptions 2022-09-13 19:28:20 +03:00
Eladash cfdc852f03 SPU: Power consumption reduction when using SPU inaccurate reservations 2022-09-13 11:21:01 +03:00
Eladash daf43989fc Thread.h: Improve thread abort performance 2022-08-25 23:54:56 +03:00
Eladash 133e9d4705 CPUThread.cpp: Fix cpu_flag::pending reset 2022-08-11 11:42:16 +03:00
Elad Ashkenazi c4cc0154be LV2: Optimizations and fixes
Fix and optimize sys_ppu_thread_yield

Fix a bug in LV2 syscalls with timeouts. (use ppu_thread::cancel_sleep instead)

Move timeout notification out of mutex scope

Allow g_waiting timeouts to be awakened in scope
2022-08-11 11:42:16 +03:00
Eladash 73aaff1b29 LV2: allocation-free synchronization syscalls
* Show waiters' ID in kernel explorer.
* Remove deque dependency from sys_sync.h
2022-08-07 20:23:54 +03:00
sguo35 84a785ea67 arm64: implement pshufb intrinsic 2022-08-05 22:53:11 +02:00
sguo35 b02e6e222f arm64: enable fma and "avx" on Apple and Cortex CPUs 2022-07-15 12:37:33 +03:00
sguo35 488982f424 spu: external function calls should be marked non-tail
Mark external function calls as non-tail, since they aren't tail calls
and assuming they are will cause returns to fail in Arm64 GHC CC.
2022-07-15 12:37:33 +03:00
Eladash 3e51426379 Savestates/SPU: Kill emulation when it's safe to save SPU state 2022-07-15 09:30:53 +03:00
Elad Ashkenazi fcd297ffb2
Savestates Support For PS3 Emulation (#10478) 2022-07-04 16:02:17 +03:00
Eladash 5e01ffdfd8 Debugger: Optimize cpu_thread::dump_regs()
Reuse string buffer. Copies and reallocations are expensive with such large strings.
2022-06-23 22:41:32 +02:00
Nekotekina 653a9e6e7f Debugger: always print cpu_thread::dump_misc()
Was removed for some reason.
2022-06-22 18:53:29 +03:00
Eladash ccb2724fc4 Debugger: Implement SPU breakpoints 2022-06-21 16:59:45 +03:00
Jeff Guo cefc37a553
PPU LLVM arm64+macOS port (#12115)
* BufferUtils: use naive function pointer on Apple arm64

Use naive function pointer on Apple arm64 because ASLR breaks asmjit.
See BufferUtils.cpp comment for explanation on why this happens and how
to fix if you want to use asmjit.

* build-macos: fix source maps for Mac

Tell Qt not to strip debug symbols when we're in debug or relwithdebinfo
modes.

* LLVM PPU: fix aarch64 on macOS

Force MachO on macOS to fix LLVM being unable to patch relocations
during codegen. Adds Aarch64 NEON intrinsics for x86 intrinsics used by
PPUTranslator/Recompiler.

* virtual memory: use 16k pages on aarch64 macOS

Temporary hack to get things working by using 16k pages instead of 4k
pages in VM emulation.

* PPU/SPU: fix NEON intrinsics and compilation for arm64 macOS

Fixes some intrinsics usage and patches usages of asmjit to properly
emit absolute jmps so ASLR doesn't cause out of bounds rel jumps. Also
patches the SPU recompiler to properly work on arm64 by telling LLVM to
target arm64.

* virtual memory: fix W^X toggles on macOS aarch64

Fixes W^X on macOS aarch64 by setting all JIT mmap'd regions to default
to RW mode. For both SPU and PPU execution threads, when initialization
finishes we toggle to RX mode. This exploits Apple's per-thread setting
for RW/RX to let us be technically compliant with the OS's W^X
enforcement while not needing to actually separate the memory
allocated for code/data.

* PPU: implement aarch64 specific functions

Implements ppu_gateway for arm64 and patches LLVM initialization to use
the correct triple. Adds some fixes for macOS W^X JIT restrictions when
entering/exiting JITed code.

* PPU: Mark rpcs3 calls as non-tail

Strictly speaking, rpcs3 JIT -> C++ calls are not tail calls. If you
call a function inside e.g. an LV2 syscall, it will clobber LR on arm64
and subtly break returns in emulated code. Only JIT -> JIT "calls"
should be tail.

* macOS/arm64: compatibility fixes

* vm: patch virtual memory for arm64 macOS

Tag mmap calls with MAP_JIT to allow W^X on macOS. Fix mmap calls to
existing mmap'd addresses that were tagged with MAP_JIT on macOS. Fix
memory unmapping on 16K page machines with a hack to mark "unmapped"
pages as RW.

* PPU: remove wrong comment

* PPU: fix a merge regression

* vm: remove 16k page hacks

* PPU: formatting fixes

* PPU: fix arm64 null function assembly

* ppu: clean up arch-specific instructions
2022-06-14 15:28:38 +03:00
Malcolm Jestadt ebeeafc94f SPU LLVM: Use vrangeps in clamp_smax
- This instruction can clamp a value to a range, something which previously needed 2 instructions.
- With the immediate byte set to 0x2, it computes the minimum of the absolute values of the two inputs, then copies the sign of the first input to the result.
2022-06-11 18:25:31 +03:00
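The 0x2 behaviour described in the commit above can be modelled per lane as a rough scalar sketch. This is not the RPCS3 code: the function names below are hypothetical, NaN and denormal handling of the real instruction is not modelled, and the two-step variant only illustrates what a min/max clamp to a symmetric range looks like.

```cpp
#include <cassert>
#include <cmath>

// Scalar model of one VRANGEPS lane with imm8 = 0x2 (hypothetical helper name):
// imm8[1:0] = 10b selects the smaller of the two absolute values,
// imm8[3:2] = 00b takes the sign from the first source.
// NaN and denormal behaviour of the real instruction is not modelled here.
inline float range_min_abs_sign_src1(float a, float b)
{
    const float mag = std::fmin(std::fabs(a), std::fabs(b));
    return std::copysign(mag, a);
}

// Illustrative two-step clamp of a into [-limit, limit] (hypothetical helper name).
// Assumes limit is non-negative.
inline float clamp_two_step(float a, float limit)
{
    assert(limit >= 0.0f);
    return std::fmax(std::fmin(a, limit), -limit);
}
```

For a non-negative limit, both helpers return the same value for ordinary finite inputs, which is why a single range operation can stand in for a min/max pair.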
Elad Ashkenazi 004d9b09b8 LLVM: Fix 0 vector constant observation 2022-06-08 19:31:39 +03:00
Eladash 1cab99b3ca Make CPU Profiler able to print stats which sum up the records of all SPU threads
Hitherto the statistics have been exclusively thread-specific.

Other improvements:
* Fixed container management so that a new element can no longer collide with an older element of the record.
* Added thread name to thread-specific information printing.
* Fixed condition to abort SPU block statistics collection, now matches SPU LLVM Profiler's.
* Fix possible division by 0 by checking `samples`.
2022-05-07 12:57:54 +03:00
Nekotekina 6d3052c5dd Optimization: disable atomic_wait_engine notify callback for SPU
Disable placebo callback calls in notify_all.
Don't use callback at all if TSX.
Based on kd-11 findings.
2022-04-24 13:15:54 +03:00
sguo35 e761b3235c macos: fix build for arm64
Adds arm64 branches to some x86 specific code and modifies some casting
logic to make Clang happy
2022-04-18 17:53:54 +03:00
Eladash 6783bcd273 Log a snippet of guest thread code at crash 2022-04-15 22:34:51 +03:00
Eladash 1d51f3af0c RSX-Debugger: Implement backwards scrolling
* Use two known-valid RSX code roots and follow them to peek at the current section of valid RSX code.
These roots are the current RSX instruction address and the last address targeted by a branch instruction.
2022-04-15 22:34:51 +03:00
Eladash e951c619c5
Implement Emulator::GracefulShutdown() 2022-02-05 11:49:29 +01:00
Nekotekina 580bd2b25e Initial Linux Aarch64 support
* Update asmjit dependency (aarch64 branch)
* Disable USE_DISCORD_RPC by default
* Dump some JIT objects in rpcs3 cache dir
* Add SIGILL handler for all platforms
* Fix resetting zeroing denormals in thread pool
* Refactor most v128:: utils into global gv_** functions
* Refactor PPU interpreter (incomplete), remove "precise"
* - Instruction specializations with multiple accuracy flags
* - Adjust calling convention for speed
* - Removed precise/fast setting, replaced with static
* - Started refactoring interpreters for building at runtime JIT
*   (I got tired of poor compiler optimizations)
* - Expose some accuracy settings (SAT, NJ, VNAN, FPCC)
* - Add exec_bytes PPU thread variable (akin to cycle count)
* PPU LLVM: fix VCTUXS+VCTSXS instruction NaN results
* SPU interpreter: remove "precise" for now (extremely non-portable)
* - As with PPU, settings changed to static/dynamic for interpreters.
* - Precise options will be implemented later
* Fix termination after fatal error dialog
2022-01-15 06:48:04 +03:00
Nekotekina 3cd8891ab8 Re-refactor copy_data_swap_u32 again
Drop AVX2 path for now, since it usually operates on small data.
Rely on automatic SSE vectorization on recent compilers.
Side refactoring of JIT.h to work around a weird conflict issue.
2021-12-26 14:40:21 +03:00
Nekotekina 6730dc1dc4 LLVM DSL: print some debug info in get_const_vector<v128> 2021-12-07 13:21:24 +03:00
Nekotekina 04c9d01390 PPU LLVM: modernize most vector instructions
Rewritten VSUM instructions:
VSUMSWS, VSUM2SWS, VSUM4SBS, VSUM4SHS, VSUM4UBS
2021-12-03 00:14:06 +03:00
Malcolm Jestadt 7573d7289b SPU LLVM: Hook up 128 bit spu verification
- Also fix FMA enablement for sapphirerapids
2021-11-06 21:12:12 +03:00